wc -l empty_searches_7d.log
12914 empty_searches_7d.log
cat empty_searches_7d.log | jq -c '[ .q, .lang ]' | sort | uniq | wc -l
7670
So 7670/12914 = 0.5939, i.e. about 59% of the logged empty searches are distinct (query, language) pairs.
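The two counts and the ratio can be had in one pass. A minimal sketch, assuming the same jq extraction as above; `unique_ratio` is a helper name made up for these notes, not an existing tool.

```shell
# unique_ratio (hypothetical helper): read one record per line on stdin,
# print "<distinct> <total> <distinct/total>".
unique_ratio() {
  LC_ALL=C sort | uniq -c | awk '
    { distinct++; total += $1 }   # $1 is the per-record count from uniq -c
    END {
      if (total > 0) printf "%d %d %.4f\n", distinct, total, distinct / total
    }
  '
}

# Usage on the log from these notes:
#   jq -c '[ .q, .lang ]' empty_searches_7d.log | unique_ratio
```

With the counts above this should print the same 7670, 12914 and 0.5939.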
cat empty_searches_7d.log | jq -c '[ .q, .lang ]' | sort | uniq -c | sort -nr | head -n 50
1046 ["the","de"]
237 ["the","en"]
87 ["the","da"]
52 ["undefined","en"]
47 ["street art","en"]
45 ["56084474","en"]
34 ["sssieddqxsx","en"]
34 ["harley joker","de"]
34 ["array","en"]
22 ["mittlerer osten","de"]
22 ["ISUZU","en"]
20 ["road","en"]
20 ["killing","en"]
17 ["hütte hugo","en"]
16 ["../../../../../../../../../../../../../../windows/win.ini","en"]
16 ["theesi:include src=http://bxss.me/rpb.png/","en"]
16 ["fsssiedxa'sssiedx","en"]
16 ["fsssiedxa"sssiedx","en"]
16 ["fsssiedxa\"sssiedx","en"]
16 ["fsssiedxa'sssiedx","en"]
16 ["fsssiedxafdsaxbx</title>asddsssiedx","en"]
16 ["file:///etc/passwd","en"]
16 ["../../../../../../../../../../../../../../etc/passwd","en"]
16 ["12345\\\\);|]*\u0000{ \u0000?💡","en"]
15 ["xfs.bxss.me","en"]
15 ["xfs.bxss.me?colourbox.com","en"]
15 ["/\\xfs.bxss.me?colourbox.com","en"]
15 ["//xfs.bxss.me?colourbox.com","en"]
15 ["the????%27%22\\\\","en"]
15 [".print(md5(31337)).","en"]
15 ["http://xfs.bxss.me?colourbox.com","en"]
15 ["http://dicrpdbjmemujemfyopp.zzz/yrphmgdpgulaszriylqiipemefmacafkxycjaxjs?.jpg","en"]
15 ["http://bxss.me/t/xss.html?","en"]
15 ["http://bxss.me/t/fit.txt?.jpg","en"]
15 ["http://bxss.me/t/fit.txt","en"]
15 ["/etc/shells","en"]
15 ["Engelsflügel","de"]
15 [")))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))","en"]
15 [")","en"]
15 ["()","en"]
15 ["!--","en"]
15 ["!(()&&!|*|*|","en"]
15 ["c:/windows/win.ini","en"]
15 ["bxss.me/t/xss.html?","en"]
15 ["bxss.me","en"]
15 [";assert(base64 decode(chjpbnqobwq1kdmxmzm3ksk7));","en"]
15 ["1yrphmgdpgulaszriylqiipemefmacafkxycjaxjs\u0000.jpg","en"]
15 ["${@print(md5(31337))}\\","en"]
15 ["${@print(md5(31337))}","en"]
15 ["^(#$!@#$)(()))******","en"]
This thing has a long tail.
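To put a number on the tail, a count-of-counts helps: how many distinct searches occurred once, twice, and so on. A rough sketch, assuming the input is one record per line (e.g. the `jq -c` output); `count_histogram` is a made-up helper name.

```shell
# count_histogram (hypothetical helper): read one record per line on stdin,
# print "<occurrences> <number of distinct records seen that many times>".
count_histogram() {
  LC_ALL=C sort | uniq -c |      # occurrences per distinct record
    awk '{ print $1 }' |         # keep only the occurrence count
    sort -n | uniq -c |          # how often each occurrence count appears
    awk '{ print $2, $1 }'       # flip to "<occurrences> <records>"
}

# Usage:
#   jq -c '[ .q, .lang ]' empty_searches_7d.log | count_histogram | head
```

If the first row (occurrences = 1) holds most of the 7670 distinct pairs, that is the long tail.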
It would be cool to classify these searches by hand, but
7670/3600 = 2.13,
so that is a little more than 2 hours if we assume I can categorize one search per second without a break.
Skimming the logs, it's hard to see a single pattern. There are misspellings, searches in the wrong language (we had language detection to help people at some point, is it gone?) and hackerman stuff (path traversal, XSS and code injection probes).
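The hackerman part at least can be split off mechanically before any manual classification. A crude sketch: the pattern list is my own guess, built only from strings visible in the top 50 above, so it will miss probes and may catch a few honest searches. `probe_queries` and `other_queries` are made-up names.

```shell
# Assumed pattern list, taken from strings seen in the top-50 output.
probe_pattern='\.\./|bxss\.me|/etc/|win\.ini|md5\(|file://'

# probe_queries (hypothetical): print only lines that look like scanner probes.
# "|| true" keeps the exit status at 0 when nothing matches.
probe_queries() { grep -E -- "$probe_pattern" || true; }

# other_queries (hypothetical): print everything else.
other_queries() { grep -E -v -- "$probe_pattern" || true; }

# Usage:
#   jq -r '.q' empty_searches_7d.log | probe_queries | wc -l
#   jq -r '.q' empty_searches_7d.log | other_queries | sort | uniq -c | sort -nr | head
```

Whatever is left after this still needs eyeballing, but the per-second estimate then applies to a smaller pile.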
A surprising thing is what looks like autocomplete:
425 ["Änderu","de"]
426 ["Änderun","de"]
427 ["Änderungs","de"]
428 ["Änderungssc","de"]
429 ["Änderungssch","de"]
430 ["Änderungsschn","de"]
431 ["Änderungsschne","de"]
432 ["Änderungsschnei","de"]
433 ["Änderungsschneider","de"]
Maybe a Colourbox API customer has no clue we have actual autocomplete and just searches on every keystroke?
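One way to see how widespread this keystroke pattern is: sort the distinct queries and flag every query that is a proper prefix of the next one. A sketch under assumptions: it only looks at `.q` (language is ignored), it only compares neighbours in sorted order, and `prefix_steps` is a made-up helper name.

```shell
# prefix_steps (hypothetical helper): read query strings, one per line, on
# stdin and print "shorter<TAB>longer" whenever a query is a proper prefix
# of the next query in byte-sorted order.
prefix_steps() {
  LC_ALL=C sort -u | awk '
    length(prev) > 0 && length($0) > length(prev) && index($0, prev) == 1 {
      print prev "\t" $0
    }
    { prev = $0 }
  '
}

# Usage:
#   jq -r '.q' empty_searches_7d.log | prefix_steps | wc -l
```

Because only neighbours are compared, a chain can be interrupted by an unrelated query that sorts in between, so the count is a lower bound rather than an exact figure.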