I have bad news about the bot problem.
We've seen a huge influx of slow, massively distributed Chinese AI scraper traffic, apparently coming from real devices in a click farm.
They're convincing enough that they've been throwing off our Google Analytics metrics lately.
This is hard to defend against, and it's the kind of crap big sites have to deal with.
They have managed to double our apparent traffic over the natural baseline.

During the previous bot battle a few months ago, the volume was almost 10x higher, so this isn't an immediate danger like before.
Meanwhile, the XenForo forum, with roughly the same amount of content as us, is seeing 4.5x our bot count while relying on Cloudflare for protection. So at least our homebrew protection is outperforming the state of the art.
Nonetheless, it looks like within a week or so we'll need to implement some new controls on these out-of-control scrapers.
Actions taken now:
[X] Ban the worst 90% of AWS Singapore ranges abusing the site; the China network will reveal itself once that noise is filtered out
[X] Rate-limit each IP address to 200 hits per half-day period <-- will be adjusted upward if we get false positives
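For the curious, the 200-hits-per-half-day limit could look something like the sliding-window counter below. This is a hypothetical sketch, not our actual implementation; the names and the in-memory storage are assumptions.

```python
import time
from collections import defaultdict, deque

LIMIT = 200              # max hits per IP per window (number from the post)
WINDOW = 12 * 60 * 60    # half a day, in seconds

# ip -> deque of recent request timestamps (hypothetical in-memory store)
hits = defaultdict(deque)

def allow(ip, now=None):
    """Return True if this request is under the limit, else False."""
    now = time.time() if now is None else now
    q = hits[ip]
    # Drop timestamps that have aged out of the rolling window.
    while q and now - q[0] > WINDOW:
        q.popleft()
    if len(q) >= LIMIT:
        return False   # over the limit -> block / tarpit this request
    q.append(now)
    return True
```

A rolling window like this avoids the burst-at-midnight problem of fixed daily buckets, at the cost of keeping one timestamp per recent hit.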
Next round:
[ ] Finish commercial providers banlist generator
[ ] Allow everyone 10 page reads per day; signing up lifts the cap. ( Will they bother signing up their scrapers? If they do, they have an account and we can monitor them even more closely. )
[ ] Ban China ( I don't want to do this if I can avoid it )
[ ] Force new visitors to submit 5 seconds of CPU time ( aka "checking if you are a bot" ), or get banned if their client doesn't comply, to help jam up their operations
@amberwolf I know you're the most likely to run into the 200-hit limit. If the protection mechanism gives you a love tap ( ~5 minutes of no response ), let me know. Also, if you don't have my email on file, PM me to get it.