Yesterday my VPS set off a warning, as it was hit by a huge spike in incoming traffic, peaking at 55GB at 2:15pm and lasting for an hour.
Upon investigating, it turns out it was my PeerTube instance that was targeted.
Where did the traffic come from?
meta-externalagent (aka Meta's web crawler which is used to grab content to train its AI system).
I feel a little bit violated thinking my Fediverse promo video was grabbed by it, sigh.
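For anyone wanting to run a similar investigation: the post does not say how the traffic was traced, so the following is only a minimal sketch, assuming the web server writes an access log in the common "combined" format. The sample lines, the `bytes_by_agent` helper, and the `top_agents` helper are all made up for illustration. Note that the log records bytes sent in responses, which may not match the figure a VPS dashboard reports.

```python
import re
from collections import Counter

# Matches the tail of a "combined" format line:
#   "REQUEST" STATUS BYTES "REFERER" "USER-AGENT"
LINE_RE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"\s*$'
)

# Invented sample data, not taken from any real server.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jan/2026:10:00:00 +0000] "GET /a HTTP/1.1" 200 5000 "-" "meta-externalagent/1.1"',
    '203.0.113.7 - - [01/Jan/2026:10:00:01 +0000] "GET /b HTTP/1.1" 206 3000 "-" "meta-externalagent/1.1"',
    '198.51.100.4 - - [01/Jan/2026:10:00:02 +0000] "GET / HTTP/1.1" 200 120 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [01/Jan/2026:10:00:03 +0000] "GET /x HTTP/1.1" 304 - "-" "Mozilla/5.0"',
]


def bytes_by_agent(lines):
    """Sum response bytes per User-Agent string; unparseable lines are skipped."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        # A "-" in the bytes column means no body was sent.
        sent = 0 if match["bytes"] == "-" else int(match["bytes"])
        totals[match["agent"]] += sent
    return totals


def top_agents(lines, n=3):
    """Return the n User-Agents that received the most bytes."""
    return bytes_by_agent(lines).most_common(n)


if __name__ == "__main__":
    for agent, total in top_agents(SAMPLE_LOG):
        print(f"{total:>10}  {agent}")
```

On a real server the lines would come from the log file itself, for example `top_agents(open(path))` with `path` pointing at wherever the access log lives.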
Lisa Melton reshared this.

Mitex Leo
in reply to Elena Rossini ⁂
Andy Piper
in reply to Mitex Leo
Ben Hardill
in reply to Andy Piper
@andypiper @ml
I got hit by this as well last week, 30% of all hits from the bot in the last 14 days.
I've not had any response from the email address they published on their bot page, so for now all those requests are getting 301'd to a 100GiB gzip bomb.
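The redirect part of that approach can be sketched roughly as follows. This is not Ben's actual configuration (his setup is not shown in the thread); it is a small Python WSGI app written for illustration, and `DECOY_URL`, `app`, and `call` are hypothetical names. It only shows the 301 decision based on the User-Agent header, not the payload behind the redirect.

```python
from wsgiref.util import setup_testing_defaults

BOT_TOKEN = "meta-externalagent"
# Hypothetical target; the real redirect destination is not given in the thread.
DECOY_URL = "https://example.org/large-file.gz"


def app(environ, start_response):
    """Send matching crawlers a 301; serve everyone else normally."""
    user_agent = environ.get("HTTP_USER_AGENT", "")
    if BOT_TOKEN in user_agent.lower():
        start_response(
            "301 Moved Permanently",
            [("Location", DECOY_URL), ("Content-Length", "0")],
        )
        return [b""]
    body = b"hello\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def call(user_agent):
    """Invoke the app once in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call("meta-externalagent/1.1")[0])
    print(call("Mozilla/5.0")[0])
```

In practice this kind of rule usually lives in the web server or reverse proxy rather than in application code, so the bot never reaches the app at all.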
blog.hardill.me.uk/2026/03/12/…
WTF is Facebook doing?
Ben's Place
computer maus
in reply to Andy Piper
Mitex Leo
in reply to computer maus
computer maus
in reply to Mitex Leo
Jools
in reply to Elena Rossini ⁂
Chuckles
in reply to Elena Rossini ⁂
D1re_W0lf ⁂🇪🇺🇵🇹
Unknown parent
Or the self-hosted equivalent, Pangolin + CrowdSec.
If you are really into it, you can add Anubis as an extra layer.
Jools
Unknown parent
@Elena Rossini ⁂ I know, I had that problem too. I got a good tip from @Rainer "friendica" Sokoll the other day. It helped me and others immediately:
rainer.sokoll.com/?p=8353
MFierst
in reply to Elena Rossini ⁂
sam
in reply to Elena Rossini ⁂
Jools
Unknown parent
Ben Hardill
Unknown parent
RichBartlett
in reply to Elena Rossini ⁂
Not sure if you've seen this: bluetoot.hardill.me.uk/@ben/11… I particularly like his response of using a 301 redirect to a massive file!
Ben Hardill
2026-03-17 09:48:53
nathan
Unknown parent
RichBartlett
Unknown parent
狐ヴィクシー
in reply to Elena Rossini ⁂
Sylvia
in reply to Elena Rossini ⁂
Ugh. That’s just so aggravating. I have read several people mention that the Meta bot is being aggressive and crashing sites.
That they can so blatantly steal data is just…
Really hope that the EU is going to do something about their theft.
Marian Scales
in reply to Elena Rossini ⁂
Thom
in reply to Elena Rossini ⁂
RootHosts
in reply to Elena Rossini ⁂
That’s frustrating, especially when it spikes traffic like that without warning.
I’m a Linux/Windows system administrator, and this kind of load can be managed. You can limit or block such crawlers and also protect your VPS with anti-DDoS, rate limiting, and traffic filtering.
If you want, I can help you secure and optimize your setup — or we can provide a VPS with built-in protection.
Mastodon Migration
in reply to Elena Rossini ⁂
Scott Starkey
in reply to Elena Rossini ⁂
Ed
in reply to Elena Rossini ⁂
Would you be able to use a user agent block list like ai.robots.txt? I have a cron job that updates it daily from their git repo and then restarts nginx.
Except I strip out the part that refers known agents to robots.txt and just give them a 403, because none of them ever honor the robots file anyway.
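The idea of answering listed agents with a 403 can be sketched in a few lines. This is not the ai.robots.txt project's own nginx snippet and not Ed's cron job; it is a Python illustration of the matching logic only. `BLOCKED_AGENTS` is a short illustrative sample rather than the project's actual list, and `build_pattern` and `status_for` are hypothetical helper names.

```python
import re

# Illustrative sample only; the maintained list lives in the ai.robots.txt repo.
BLOCKED_AGENTS = ["meta-externalagent", "GPTBot", "ClaudeBot"]


def build_pattern(agents):
    """Compile one case-insensitive regex that matches any listed agent name."""
    if not agents:
        # An empty alternation would match everything, so use a never-matching regex.
        return re.compile(r"(?!)")
    return re.compile("|".join(re.escape(a) for a in agents), re.IGNORECASE)


def status_for(user_agent, pattern):
    """Return 403 for a blocked crawler, 200 otherwise."""
    if user_agent and pattern.search(user_agent):
        return 403
    return 200


if __name__ == "__main__":
    pattern = build_pattern(BLOCKED_AGENTS)
    for ua in ("meta-externalagent/1.1", "Mozilla/5.0 (X11; Linux x86_64)"):
        print(status_for(ua, pattern), ua)
```

Matching on the User-Agent header only stops crawlers that identify themselves honestly, which is why other replies in this thread suggest extra layers on top.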
github.com/ai-robots-txt/ai.ro…
GitHub - ai-robots-txt/ai.robots.txt: A list of AI agents and robots to block.
GitHub
Chocobozzz
Unknown parent
Elena Rossini ⁂
in reply to Chocobozzz
@Chocobozzz thank you! 🙏
@ScottStarkey @Framasoft
Elena Rossini ⁂
Unknown parent
@sylvie thanks! I have investigated whether I could use Anubis, but it would mess up my YunoHost installation.
I need to see if I can use BunnyCDN instead (I already use it for my website)
@Chocobozzz @ScottStarkey @Framasoft
ylvie
in reply to Chocobozzz
PaulH
in reply to Elena Rossini ⁂
That's so stupid... 🫤
Here's a repo that blocks AI crawlers on webserver level, in this case Apache: codeberg.org/creatura85/htacce…
There's probably a similar repo for nginx as well?
htaccess
Codeberg.org