Brazil mass IP addresses AI scraping attack 11/02/2025

Got bad news about this.

Over the weekend, i was able to compile a good banlist against 2 of 3 Chinese networks hammering our site.
I thought this would get us ahead.

Then i saw the Brazilian traffic flood over last weekend..
Xenforo's site ( using cloudflare ) got hit first and i saw it rise to 16k bots.
Ours got hit next and went up to 20k bots and the flood lasted 2x longer.

Even though my system performs about as well as cloudflare on average.. neither technology provides enough protection against what the internet can dish out lately.

The bots are not doing anything productive and are just slamming 2 URLs that don't provide them information.
The problem: the IP addresses are too numerous ( tens of thousands of them ) and each one makes only 2-4 very slow hits.. so this is impossible to defend against, even with fairly sophisticated rate limiting.
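
To illustrate why per-IP limits miss this kind of flood, here is a minimal Python sketch. It is not the protection system discussed in this thread; the thresholds, the ( ip, path ) tuple layout, and the function names are all assumptions made up for illustration. It shows that no single address crosses a per-IP limit, while counting distinct IPs per URL is one possible signal that still stands out.

```python
from collections import Counter, defaultdict

# Hypothetical thresholds, chosen only for illustration.
PER_IP_LIMIT = 30          # hits per window a classic per-IP limiter would allow
DISTINCT_IP_LIMIT = 1000   # distinct IPs on one URL per window that looks abnormal


def per_ip_offenders(hits, limit=PER_IP_LIMIT):
    """Return the IPs whose hit count in the window exceeds the per-IP limit."""
    counts = Counter(ip for ip, _path in hits)
    return {ip for ip, n in counts.items() if n > limit}


def flooded_paths(hits, limit=DISTINCT_IP_LIMIT):
    """Return the URL paths requested by more distinct IPs than the limit."""
    ips_by_path = defaultdict(set)
    for ip, path in hits:
        ips_by_path[path].add(ip)
    return {path for path, ips in ips_by_path.items() if len(ips) > limit}


def simulate_flood(n_ips=10_000, hits_per_ip=3, path="/some/useless/url"):
    """Build a fake window of (ip, path) hits: many IPs, a few hits each."""
    hits = []
    for i in range(n_ips):
        ip = f"10.{(i >> 16) & 255}.{(i >> 8) & 255}.{i & 255}"
        hits.extend((ip, path) for _ in range(hits_per_ip))
    return hits


if __name__ == "__main__":
    window = simulate_flood()
    print(len(per_ip_offenders(window)))  # per-IP view of the flood
    print(flooded_paths(window))          # per-URL view of the same flood
```

A per-URL signal like this only says that something is wrong with a URL; deciding what to do with the individual addresses behind it is a separate problem.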

hate this hacker crap.jpg

Our site held up fine and i think the reason was the recent round of TCP/IP tuning.
The only damage was to my weekend peace.

A major cost problem exists: we're on AWS, and bandwidth is expensive at a couple TB per month.
95% of this bandwidth cost is bots.

I looked over things and it appears the AI scraper bots are increasing our monthly cost of operation by 275 USD/month in bandwidth costs alone.
That's on top of the time i spend checking the logs, updating permanent banlists, etc.

Which means this site is ~4x more expensive to run than it was before the AI bot armageddon began.

Thought process on what to do:

Option: Move from AWS to Hetzner Dedicated
Pros:
- We can shave server costs by 80%. No longer need to worry about the Bandwidth cost, just the problem that our data is being heisted.
- Good long term move if the hell is going to persist / ramp up ( most likely )
Cons:
- Can no longer simply export a Virtual Machine to create a perfectly accurate development environment, now this needs to be done by hand or automated ( although this only needs to be done every 2-4 years )
- Unmanaged bare hardware can be less reliable than cloud hosting, don't like that i possibly have to respond to hardware SHTF
- No selective upgrade/downgrade of server
- Located in Germany so Americans in the middle of the USA will see +150ms to response time
Makes money sense: Totally
Time needed to make change: 4 days ( move, plan, automate )

Option: Move from AWS to Hetzner Dedicated Cloud US
Pros:
- Same as above but 60%
Cons:
- Web server will be much less snappy, you'll notice it during heavy searches.
Makes money sense: Totally, but with compromises
Time needed to make change: 4 days ( move, plan, automate )

Option: Submit to cloudflare
Pros:
- Better at handling mass insanity
- Should require less intervention
Cons:
- Amberwolf can't see the site
- Reduction in uptime ( +1-3 days/year )
- Still needs to be messed with from time to time
- Bandwidth reduction may only be ~25%
- Contributes to centralization of the internet
Makes money sense: Yes, but too many compromises
Time needed to make change: 1-3 days?

Option: Adopt Anubis
Pros:
- Approximately equal to cloudflare, more effective as more website operators use it
Cons:
- Rapidly improving but not mature yet
- $50/mo for unbranded version
- From a reduction of operations cost perspective, it needs to pay for itself
Makes money sense: Very possible
Time needed to make change: 1-3 days?

Option: Develop my protection system
Pros:
- Have your cake / eat it too solution that can be catered to ES' needs
Cons:
- In what free time? i will be slow to complete this.
- Small possibility that it won't work well
Makes money sense: Barely ( development cost can be split between 5 parties )
Time needed to make change: ~30 days / 5 = 6 days/party

Option: Do nothing
Pros:
- Doesn't cost developer time
Cons:
- Server and bandwidth cost continues to increase
Makes money sense: No
Time needed to make change: 0 days

Goals:
- dramatically reduce costs
- make the sysadmin's life much easier
- make the lives of people running internet strip mining operations very hard

Chosen route:
- move to Hetzner ( after testing for speed ) for much lower costs
- test and possibly implement anubis as a 2nd layer defense to make these people's lives harder
- finish the next generation, much smarter protection mechanism and replace anubis with it.

I'll update this thread once we get cracking.
 
Ran networks 25yrs ago and realized the stress was bad for me. Have not kept up much but i'm happy.
I personally thank you for what you are doing.
 
Hetzner Dedicated
Many servers ago, at the beginning of the internet, I was searching for a $2.99 a month server whose host would answer the phone 24 hours a day. To test, I called at 4:00 am with a random question. Got my question answered correctly by a guy who spoke perfect English.

Looked at:


Are they in Germany? Might want to try contacting them by typing or talking and ask an internet related question. I hate typing. Telephone is the best way for me to solve computer confusion.

OK.... Just thought of a question. If most members of an internet forum live in the US, where would be the best place to set up a million / billion dollar building full of humming servers? Answer? Wouldn't it be best to have 2 million / billion dollar buildings full of humming servers in case something goes wrong with one building?
 
Hetzner's customer service is not known for being great. But their service seems to be good so far.

Hold up, where are the million / billion dollars coming from?
You're thinking of Amazon Web Services..
 
Well, my thought is that one could start a hosting company, but you have to be enormous to make money at it, because profit margins have thinned out.

I'd never get into a commoditized business unless i had some real trick shot technology up my sleeve to make it easier to compete with the giants.
 
We're having another bot storm this weekend causing some heavy stuttering for 10 minutes.

During this event, i was able to login over SSH, so TCP/IP ports were available. CPU utilization was only 25%.
Bots figured out some other arbitrary limit to test.

Previously i noticed problems like these before the (mostly) Brazilian flood started. My guess is that they are benchmarking the system before slamming it.

1766342559348.png

This is not major enough for me to immediately go hackerman on it, and if my suspicion is correct, i actually want to fail the benchmark instead of passing it.
Will continue to observe and sorry for the interruption.
 
We experienced a large amount of stuttering due to old Endless Sphere being crawled very rapidly and all the TCP/IP ports being consumed.
The redirector is temporarily turned off and our site is stable again.
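
A rough sketch of how socket exhaustion like this could be checked, assuming a Linux host where /proc/net/tcp is readable. This is not the tooling actually used here; tally_tcp_states and the sample lines are hypothetical and only show the idea of counting sockets per TCP state.

```python
from collections import Counter

# TCP state codes used in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def tally_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    lines = proc_net_tcp_text.splitlines()[1:]  # first line is the column header
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        state = TCP_STATES.get(fields[3].upper(), "UNKNOWN")
        counts[state] += 1
    return counts


# Made-up sample in the /proc/net/tcp layout:
# one listener, two TIME_WAIT sockets, one ESTABLISHED socket.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:01BB 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001
   1: 0100007F:01BB 0200007F:C350 06 00000000:00000000 03:00000A00 00000000     0        0 0
   2: 0100007F:01BB 0300007F:C351 06 00000000:00000000 03:00000A00 00000000     0        0 0
   3: 0100007F:01BB 0400007F:C352 01 00000000:00000000 00:00000000 00000000    33        0 1002
"""

if __name__ == "__main__":
    print(tally_tcp_states(SAMPLE))
    # On a real Linux host:
    # with open("/proc/net/tcp") as f:
    #     print(tally_tcp_states(f.read()))
```

Note that /proc/net/tcp only lists IPv4 sockets; IPv6 sockets live in /proc/net/tcp6, which uses the same state codes.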
 
Redirector for ES 1.0 partially working
Image downloads from 1.0 have been broken for a while and are the only part of ES 1.0 that needs to work.
Will rewrite it next week, which will fix the ability to view old forum images on the new site ( example: some of amberwolf's older threads )
 
When we're on hetzner and bandwidth costs ~10% of what it does now, there's lots of monkey wrenching possibilities that we didn't have before 😇

The only good news i got is that our bandwidth costs are down to 40% of what they were before i banned those china nets!
 
I ran a basic benchmark to find out if Hetzner Dedicated cloud CPUs in the USA were fast enough to power the search function, which can guzzle CPU.

aws vs hetzner benchmark.png

The machine on the right is ES' machine. The left is a little optimized wordpress one i run for my company.

This is a synthetic benchmark, but the seat of the pants feel is faster than our current machine, so that's promising.

So it'll work!
Major win and i'm glad i tested before i deployed here, because i had tried the shared hosting before and it ran at molasses speed. It nearly put me off from them. This is much better.

Should be able to reduce hosting costs by >= 75%, excited for our AWS contract to be over in some months!
 
We had a major AI scraper attack yesterday and i spent 2 hours before bed battling it.
It was causing the site to stutter as usual.

Turns out bytedance just bought a giant block of IP addresses as a means of getting through existing banlists, and they were going nuts eating every possible tcp/ip port.

They sprang up another IP block while i was sleeping and that's also banned as of this morning.

It's now increasingly common to get hit with a new batch of 10k+ IP addresses suddenly going nuts on the site, and this is a common cause of site stuttering. So it looks like we need to re-engineer some architecture to prepare it for a high scale setup.. because at random moments it needs to be capable of 5000x the usual traffic, since the higher the new IP address count is, the slower the protection kicks in :/
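
One way to keep a banlist manageable when a whole new block shows up is to ban by network instead of by address. A minimal Python sketch of that idea, not the actual protection mechanism here: it groups offending IPv4 addresses by /24, bans a /24 once enough distinct addresses in it misbehave, and merges adjacent networks. The /24 size, the threshold of 8, and the function name are assumptions for illustration.

```python
import ipaddress
from collections import Counter


def networks_to_ban(offending_ips, prefix=24, min_hosts=8):
    """Turn a pile of offending IPv4 addresses into a short list of CIDR bans.

    prefix and min_hosts are illustrative assumptions, not tuned values.
    """
    # Count distinct offending addresses inside each /prefix network.
    per_net = Counter(
        ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        for ip in set(offending_ips)
    )
    busy = [net for net, n in per_net.items() if n >= min_hosts]
    # collapse_addresses merges adjacent or overlapping networks into bigger ones.
    return list(ipaddress.collapse_addresses(busy))


if __name__ == "__main__":
    # Made-up addresses from the documentation ranges, for demonstration only.
    ips = [f"198.51.100.{i}" for i in range(1, 21)]
    ips += ["203.0.113.5"]  # a lone address: stays below the threshold
    for net in networks_to_ban(ips):
        print(net)
```

The tradeoff is collateral damage: banning a whole network also blocks any legitimate visitor who happens to share it, so the threshold matters.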

Anyway site is in good condition over the last 6 hours, i'll keep an eye on it
 
Hmm, i forget their exact status. So many things up in the air geopolitically.

Didn't even bother to look into where the ip addresses were from, my mentality was more like:

2026-01-25 14_50_57-Reddit - https___preview.redd.it_he5cuzumqq351.png_width=640&crop=smart&au...jpg
 
Resistance is futile. You will be assimilated.
- The Borg
 