[TUHS] TUHS: Maintenance, Succession and Funding

Grant Taylor via TUHS tuhs at tuhs.org
Sun Apr 19 02:24:18 AEST 2026


On 4/18/26 10:20 AM, Kenneth Goodwin via TUHS wrote:
> On your physical firewall,  block the entire subnet range that 
> they were assigned  by their ISP using a single access control list 
> statement with ip address and appropriate subnet mask. Drop all packets 
> from this range. It's been a while, but I believe IANA maintains a 
> list of ip address ranges per internet client. Other organizations 
> might as well. It makes your site disappear from their view. They 
> may automatically stop connecting once enough failed attempts are 
> registered at their end.

The last time I looked IANA maintained a list of which IP ranges were 
handed out to the Regional Internet Registries (RIRs).  You'd need to go 
(multiple hops) deeper to find a viable subnet for the offending IP(s).

I found that (access to) a (read-only) BGP (monitoring) feed can be very 
useful for this.  The BGP feed will have down to the /24 for the network 
the offending IP is in.  What's more, you can see what other prefixes 
the ASN is advertising and block them as well if you want to.
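For example (the IP and ASN here are made up; the two whois servers are real):

```shell
# Real lookups (network-dependent; shown for reference):
#   whois -h whois.cymru.com " -v 203.0.113.5"      # IP -> origin ASN and prefix
#   whois -h whois.radb.net -- '-i origin AS64500'  # routes registered by that ASN
# Offline demo: pull the route: lines out of captured RADb-style output.
radb='route:      203.0.113.0/24
origin:     AS64500
route:      198.51.100.0/24'
prefixes=$(printf '%s\n' "$radb" | awk '$1 == "route:" { print $2 }')
printf '%s\n' "$prefixes"
```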

> If you are using a server based firewall such as iptables or 
> a successor, do the above ACL there. Instead of one ACL per IP 
> address, it's one ACL per offender, blocking everything.

Agreed.

I might recommend an ipset instead that is referenced by iptables / 
nftables.  Changes to the ipset tend to be more efficient than reloading 
the entire *tables ruleset.
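A sketch, assuming Linux with iptables + ipset (the set name "scrapers" is illustrative; the privileged commands are shown as comments):

```shell
# Privileged one-time setup (as root):
#   ipset create scrapers hash:net
#   iptables -I INPUT -m set --match-set scrapers src -j DROP
# After that, membership changes never touch the iptables ruleset:
#   ipset add scrapers 203.0.113.0/24
# Bulk loads go fastest through `ipset restore`; building its input:
offenders='203.0.113.0/24
198.51.100.0/24'
restore_input=$(printf '%s\n' "$offenders" | awk '{ print "add scrapers " $0 }')
printf '%s\n' "$restore_input"
```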

> Failing that, you could add-in subroutines to your web server code 
> that implement a blacklist at connection time. This works if you are not 
> NATing their true source ip to an internal firewall address at the 
> web server level.

Are you talking about in web application code or in the actual web 
server daemon?

> You can drop the connection or send them back a small static reply 
> of well chosen, but polite words to cease and desist and then close 
> the connection.

I feel like this is the domain of the 403 error document.

Arrange for the web server to return 403 to them and that should include 
a simple static "please go away" type message.
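A minimal nginx sketch of that idea (the prefix and page path are made up; on Apache, `ErrorDocument 403` does the same job):

```nginx
# http-level: flag offending prefixes (the prefix is illustrative)
geo $banned {
    default         0;
    203.0.113.0/24  1;
}

server {
    error_page 403 /go-away.html;    # small static "please go away" page
    if ($banned) { return 403; }
}
```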

> If you have a bank of webservers with a load balancer in front of 
> them, you might be able to use that to your advantage.  You have 
> to become uncivilized for this approach.  Basically the if all else 
> fails solution.  Costs may prohibit this type of response.

Are you talking about filtering unwelcome clients?  Or are you talking 
about scaling up compute to be able to serve the load?

> You might be able to have the load balancer or firewall redirect 
> all traffic from these source ips to a separate server which is 
> setup as a counterattack server. This server accepts the connection, 
> verifies the source ip as blacklisted and holds that connection open 
> for a very long time. Essentially a form of reverse denial of service 
> counterattack. Every HTML (et al.) link they try to connect to goes to 
> the same underlying text file. This text file, aka THE PACKAGE, is an 
> enormous text file of absolute rubbish created on a daily basis. You 
> want it to look like new information: Terabytes or Petabytes in overall 
> size. Perhaps even a different one created for every possible link 
> pathway into your system. The file should have a paragraph at the 
> top that mimics your cease and desist request letter. Basically you 
> are tying up their network resources for as long as possible. Then 
> sending them massive amounts of useless garbage which may tie up 
> their disk and cpu resources. "Do unto others as they do unto you". 
> In order to get noticed and cause the desired change in behavior.

If you aren't careful with how you do this, you may inadvertently end up 
participating in the DoS against yourself.

Aside:  I wonder if there is a viable way to induce a Slowloris reply 
attack.  Send one byte per second or something like that.  --  But that 
still uses some resources.
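A bash sketch of the one-byte-at-a-time idea (the function name is made up; wiring it to a listening socket, e.g. via ncat's --sh-exec, is left out):

```shell
# Needs bash >= 4.1 for `read -N`.  trickle copies stdin to stdout one
# byte at a time, sleeping $1 seconds between bytes.
trickle() {
    local delay="$1" c
    while IFS= read -r -N1 c; do
        printf '%s' "$c"
        sleep "$delay"
    done
}
# Demo with a zero delay; in anger you'd use 1 (one second per byte):
reply=$(printf 'please go away' | trickle 0)
printf '%s\n' "$reply"
```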

> Regarding the flat file of blacklisted source ip addresses. Read 
> it into memory at program start or when you send a signal to the 
> program. Cache the list instead of re-reading the file from disk 
> every time, for performance.

Lots of places to help and / or hurt performance / resource consumption 
/ etc.
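A bash sketch of the cache-and-reload-on-signal pattern (the file contents and names are illustrative):

```shell
# Load the blocklist into memory once; re-read it only on SIGHUP.
BLOCKLIST=$(mktemp)                      # stand-in for the real flat file
printf '203.0.113.7\n198.51.100.9\n' > "$BLOCKLIST"
load_blocklist() { mapfile -t BLOCKED < "$BLOCKLIST"; }
trap load_blocklist HUP                  # `kill -HUP <pid>` re-reads it
load_blocklist
printf '%s entries cached\n' "${#BLOCKED[@]}"
```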

> You can build this blacklist automatically by extracting ip addresses, 
> timestamps, etc. from the web server logs. Use a program to process 
> those entries to find the top number of connections from ip 
> addresses. ...

Agreed.

> ... The absurdly high count that points out who is doing this.

I want to agree, but I thought the current M.O. was to do small numbers 
of connections from a large number of IPs distributed out all over the 
Internet.  Effectively a form of DDoS.
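Either way, the per-IP tally Kenneth describes is a one-liner over combined-format logs (the sample lines are fabricated):

```shell
log='203.0.113.7 - - [18/Apr/2026:10:00:01 +1000] "GET / HTTP/1.1" 200 512
203.0.113.7 - - [18/Apr/2026:10:00:02 +1000] "GET /a HTTP/1.1" 200 512
198.51.100.9 - - [18/Apr/2026:10:00:03 +1000] "GET / HTTP/1.1" 200 512'
# Tally hits per client IP (field 1) and list the top talkers first:
top=$(printf '%s\n' "$log" \
      | awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' \
      | sort -rn)
printf '%s\n' "$top"
```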

> You can also use this raw data connection list in another way.
> 
> Use the IANA demographic information to contact the CEO, CFO, 
> and CTO of the offending companies. Email and certified postal 
> mail. Inform them in detail of the situation and its impact on your 
> costs and general internet population.  Use a Cease and Desist legal 
> format. Imply legal action will be necessary if this continues. Make 
> sure they understand that it is the absurdly high volume of nonsense 
> connections and not their occasional connection for information 
> gathering that is the issue.

While I agree in concept, I view almost all of these actions as a 
massive form of DoS on the TUHS team's time and resources.  Sadly this 
cure may be worse than the disease.

> Encourage them to reduce connection frequency to something far 
> more reasonable for what is essentially a static site.

If the TUHS site is not doing so already, consider adding ETag and 
cache-control headers.

With these in place, I'd be willing to tolerate clients sending 
conditional If-None-Match / If-Modified-Since requests for files.  At 
least more so than just out and out unconditional requests for them.
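If the site runs nginx, a sketch of those headers (the max-age is only a guess at a sensible value for a mostly static archive):

```nginx
location / {
    etag on;           # nginx emits ETags for static files by default
    add_header Cache-Control "public, max-age=86400";
}
```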

> Include the connection data that you extracted from the log files 
> proving their abuse of your systems. If you can quantify the 
> out-of-pocket costs to you, include an invoice for said same in your 
> cease and desist response, or at least a mention of actual costs in 
> the letter.

Sadly, I don't think the out-of-band cease and desist will be effective.

I believe that technical solutions to actively refuse the requests will 
be required.

> So basically-
> 
> Phase one - To ease your current pain, block their entire assigned 
> IP ADDRESS RANGE with a single ACL.

You can cheat and forego the ACL by adding a null route and break the 
communications path.  Routers are exceedingly good at processing large 
routing tables.  ;-)

If you enable strict reverse path filtering on your router, it will 
effectively act like a firewall: inbound packets whose source address 
routes to the null route get dropped, so those packets -- and the 
connections they would carry -- never make it to your server(s).
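The null-route trick on a Linux box, sketched as root-only commands (the prefix is illustrative):

```shell
# Linux / iproute2, run as root:
#   ip route add blackhole 203.0.113.0/24
# Strict reverse path filtering then drops inbound packets whose
# *source* address resolves to that blackhole route:
#   sysctl -w net.ipv4.conf.all.rp_filter=1
```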

> Phase two

Sadly, I don't think this will work.

I am confident that the AI Tech' Bro's are EXTREMELY WELL AWARE of what 
they are doing and they seem to not care.

> Phase three - the Counter offensive. If all else fails....  watch out 
> for the applicable laws in your location. Send the cease and desist 
> message in another way. Give them more data than they expect, tie up 
> their resources in every way possible. A non-lethal counter offensive.

You can put the cease and desist in the 403 error page.

If their bot is dumb enough to suck down a TB of data at 1 byte per 
second ... then REDACTED them.  Save for the fact that you are serving 
up that TB of data over a LONG period of time.

I wonder if it might be possible to leverage some older technology to 
identify longer lived connections and cause them to fail.  If the first 
two standard deviations of connection time for requests fall within a 
single-digit number of minutes (hypothetical numbers for discussion), 
then kill any connection to the web server that has lasted longer than 
30 / 60 / 90 minutes.

Aside:  You can use the IPs of clients that have lasted 30 / 60 / 90 
minutes as a list of problem IPs to organically grow your ban list.
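A sketch of harvesting those IPs from logs, assuming the log format has been extended with a request-duration field (e.g. Apache's "%T" or nginx's $request_time -- an assumption, not a default):

```shell
# Fabricated sample lines; the trailing field is taken to be the
# request duration in seconds.
log='203.0.113.7 - - [18/Apr/2026:10:00:01 +1000] "GET /pkg HTTP/1.1" 200 512 2745
198.51.100.9 - - [18/Apr/2026:10:00:03 +1000] "GET / HTTP/1.1" 200 512 0.004'
# IPs whose requests were held open longer than 30 minutes:
slow_ips=$(printf '%s\n' "$log" | awk '$NF + 0 > 1800 { print $1 }' | sort -u)
printf '%s\n' "$slow_ips"
```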

> Basically a honeypot server setup to deal with these extreme situations 
> if no reasonable compromise can be had.

I think this is where we are effectively at.  I see discussions like 
this happening LOTS of places.  There are largely two classes of 
abusers: AI scrapers and malicious bots.  Sometimes it's nigh 
impossible to differentiate between them based on their behavior.

> Just suggestions, you don't have to be a victim and tolerate this 
> impact.

100% agree.

One thing that I will add is that I've seen people changing (or adding) 
a license to the (served) content indicating that training AI with it is 
expressly forbidden.  --  It's effectively a white picket fence that can 
easily be stepped over.  But it is a clear delineation of a line that 
should NOT be crossed.  --  It's germane if legal action is pursued.



-- 
Grant. . . .

