Today, #facebook bots (Meta-ExternalAgent, you know, the one to train AI) took 21 days of wallclock time and 19.9 GB of bandwidth to index the development version of one of our websites for which crawling makes no sense and is explicitly forbidden by robots.txt.
Maybe it is a signal that we should spend some of _our_ time making them lose more of _theirs_. Any creative ideas?
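One creative answer that later comes up in this thread is the "tarpit" approach: route known AI crawlers into an endless maze of generated junk pages so the bot burns its own crawl budget. The sketch below is not any real project's implementation; the names `junk_page` and `is_trapped` are made up for illustration, and the user-agent match is based on the `Meta-ExternalAgent` string mentioned above.

```python
import random

# Hypothetical sketch of a crawler tarpit: deterministic junk pages
# full of links that lead back into the maze.

WORDS = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur",
         "adipiscing", "elit", "sed", "do", "eiusmod", "tempor"]

def junk_page(path: str, n_words: int = 200, n_links: int = 10) -> str:
    """Generate a junk HTML page for `path`.

    Seeding the RNG with the path keeps each page stable across
    requests, so the maze looks like a real (if dull) site.
    """
    rng = random.Random(path)
    body = " ".join(rng.choice(WORDS) for _ in range(n_words))
    links = "".join(
        f'<a href="/maze/{rng.getrandbits(32):08x}">more</a>\n'
        for _ in range(n_links)
    )
    return f"<html><body><p>{body}</p>\n{links}</body></html>"

def is_trapped(user_agent: str) -> bool:
    """Route only the offending crawler into the maze; humans never see it."""
    return "meta-externalagent" in user_agent.lower()
```

Wired into a web server's request handler, `is_trapped` would decide whether to serve the real page or `junk_page(request_path)`; the cost per request is a few string operations, which is the point of the approach.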

Aral Balkan
in reply to Romain Tartière

Davide_Sandini
in reply to Aral Balkan

I do not remember who had a similar project, but it was definitely mentioned here on Mastodon.
@smortex
flyinggecko
in reply to Davide_Sandini

It uses a relatively small amount of resources for the large amount of work it creates for crawlers.
Edit: And it's fun following the development toots of @algernon
iocaine - the deadliest poison known to AI
iocaine.madhouse-project.org

Aral Balkan
in reply to flyinggecko