manton

Since Cloudflare’s AI Labyrinth was announced a few days ago I’ve been trying to figure out how I feel about it. Blocking misbehaving bots is good, but creating fake pages and hidden links reminds me of other hacks to trick crawlers that I think could be detrimental to the web. Just not sure yet.

ffmike

@manton I wonder how long it will take the cutting-edge AI crawlers to learn that invisible links aren't worth following? Not all that long, I suspect. And meanwhile using AI on both sides of the arms race will exponentially increase energy use/waste.

rom

@manton as long as they are creating pages with facts, I have no issue. :)

the content we generate is real and related to scientific facts, just not relevant or proprietary to the site being crawled.

manton

@ffmike Right, it’s kind of ironic to use AI to purposefully waste other AI’s time.

manton

@rom That is why I’m conflicted: they are trying to do this right. But it still feels wrong to create slop pages on the web, even if humans should never see them.

tetov

@manton How about Anubis (https://anubis.techaro.lol/) or Nepenthes (https://zadzmo.org/code/nepenthes/)? Proof of work instead of mazes.

Also, is this a problem for micro.blog? Would guess not so much since the blogs are static?

manton

@tetov Shouldn't be a problem for Micro.blog. The only annoying crawling we notice is people trying to exploit WordPress security holes, which don't work here of course, so we block those requests now.

SteveSawczyn

@manton I also worry a bit about the accessibility implications of this. For better or worse, screen readers look at pages mainly through their underlying structure, the way bots do. As such, they aren't able to tell whether a link is truly visible or just in the DOM. This often leads to me thinking there's an error or modal on screen when there isn't; it's just in the DOM, so the screen reader assumes it's there. Dealing with bots without breaking accessibility is admittedly a real challenge, since any technique screen readers might use to better understand what's going on could also be used by the bots.

tetov

@manton yeah, I really wonder what the success rate is on those bots.

manton

@SteveSawczyn I worry about that too. It seems like there are bound to be consequences with these tricks that will affect real users.
