cory@social.lol

Does anyone know Arc Search's user agent so we can block it?

knowler@sunny.garden

@cory I think it might be PerplexityBot. I’m not 100% certain though.

blog.perplexity.ai/blog/arc-x-
darkvisitors.com/agents/perple

cory@social.lol

@knowler 👀 thank you! Going to make sure it’s added on my end and keep an eye out for anything else they’re using.

austincnunn@weird.autos

@cory Why? I feel like I'm out of the loop here.

cory@social.lol

@austincnunn it’s an AI-powered engine that aggregates information for you and provides suspect answers. I really don’t see any benefit to allowing it.

austincnunn@weird.autos

@cory That's fair. Just confused when I saw that and thought I had missed a big hullabaloo.

cory@social.lol

@austincnunn ah yeah it's yet another misguided attempt to layer AI into absolutely everything

austincnunn@weird.autos

@cory Neato. I can't wait till they add AI to our AI.

...

I need an Xzibit emoji.

cory@social.lol

@austincnunn use AI to remove AI from your codebase

knowler@sunny.garden

@cory Ok, so not PerplexityBot, this is the user-agent string: ArcMobile2/11 CFNetwork/1492.0.1 Darwin/23.3.0

austincnunn@weird.autos

@cory AI companies: "Wait, no."

cory@social.lol

@austincnunn

cory@social.lol

@knowler got it! So we'd need to block `ArcMobile2`? I haven't dug into this myself but can't imagine the entire string is required. 😅
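
On whether the entire string is required: a minimal sketch of substring matching, assuming the blocking layer can see the request's `User-Agent` header. The `is_blocked` helper and its `blocked_tokens` parameter are hypothetical names for illustration; only the `ArcMobile2` token and the full string come from the thread.

```python
def is_blocked(user_agent: str, blocked_tokens: tuple[str, ...] = ("ArcMobile2",)) -> bool:
    """Return True if any blocked token appears anywhere in the User-Agent header."""
    ua = user_agent.lower()
    # Case-insensitive substring check, so version numbers after the token do not matter.
    return any(token.lower() in ua for token in blocked_tokens)


# The full string reported earlier in the thread.
full_string = "ArcMobile2/11 CFNetwork/1492.0.1 Darwin/23.3.0"

print(is_blocked(full_string))  # True
print(is_blocked("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"))  # False
```

Matching on the token rather than the whole string should keep the rule working when the version numbers after `ArcMobile2/` change, though that is an assumption about how the string evolves.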

knowler@sunny.garden

@cory It also doesn’t seem like they’re respecting robots.txt
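
For reference, a rough sketch of what a robots.txt rule for that agent would express, checked with Python's standard `urllib.robotparser`. The rule text, the `allowed` helper, and the `example.com` URL are assumptions for illustration. A rule like this only states intent; a client that ignores robots.txt is unaffected by it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow the ArcMobile2 token, allow everyone else.
ROBOTS_TXT = """\
User-agent: ArcMobile2
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent: str, url: str = "https://example.com/posts/") -> bool:
    """Ask the parser whether a well-behaved client with this agent may fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(allowed("ArcMobile2/11 CFNetwork/1492.0.1 Darwin/23.3.0"))  # False
print(allowed("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"))  # True
```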

cory@social.lol

@knowler oh lord — I wish I were surprised by that 🙃

bahua@groupthink.fun

@cory

I have these in one of my vhost access logs:

"Arc/1.19.1 (Mac OS X Version 14.0 (Build 23A5337a))"
"Arc/1.26.2 (Mac OS X Version 14.2.1 (Build 23C71))"
"Arc/1.25.1 (Mac OS X Version 14.3 (Build 23D5051b))"
"ARC Reader (arc.semsol.org/)"

Hope this helps!
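
These strings share the letters "arc" but name different products, so a loose substring rule on `arc` would catch all of them. A small sketch of matching on the leading product name instead; `product_name`, `should_block`, and the `BLOCKED_PRODUCTS` set are hypothetical, and treating the `Arc/…` entries as a regular browser or app rather than the search feature is an assumption.

```python
import re

# First four strings are from the access log above; the last one is the
# ArcMobile2 string reported earlier in the thread.
LOGGED_AGENTS = [
    "Arc/1.19.1 (Mac OS X Version 14.0 (Build 23A5337a))",
    "Arc/1.26.2 (Mac OS X Version 14.2.1 (Build 23C71))",
    "Arc/1.25.1 (Mac OS X Version 14.3 (Build 23D5051b))",
    "ARC Reader (arc.semsol.org/)",
    "ArcMobile2/11 CFNetwork/1492.0.1 Darwin/23.3.0",
]

# Assumption: only this product should be blocked.
BLOCKED_PRODUCTS = {"arcmobile2"}


def product_name(user_agent: str) -> str:
    """Return the text before the first '/' or '(' with surrounding spaces trimmed."""
    match = re.match(r"\s*([^/(]+?)\s*(?:/|\(|$)", user_agent)
    return match.group(1) if match else ""


def should_block(user_agent: str) -> bool:
    """Block only on an exact, case-insensitive product-name match."""
    return product_name(user_agent).lower() in BLOCKED_PRODUCTS


for ua in LOGGED_AGENTS:
    print(f"{product_name(ua)!r:14} block={should_block(ua)}")
```

With an exact match on the product name, only the `ArcMobile2` entry is flagged and the `Arc/…` and `ARC Reader` entries pass through.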

cory@social.lol

@bahua thank you! My site's on Netlify and I don't have any insight into access logs from any visitors.

cory@social.lol

@andyn I don't like the AI-based approach and have blocked and intend to keep blocking similar crawlers (darkvisitors.com is a good reference).

cory@social.lol

@andyn I suppose that's fair — I'm more concerned with the extractive nature of AI writ large and would rather draw the line and add new crawlers as they arise.

torb

@cory Try to make sure you don’t block Arc in its entirety. Many of us Arc users are not exactly fans of the LLM stuff they’ve added lately (which thankfully is opt-in on the Mac version, I never turned any of them on).

cory@social.lol

@torb I sure won’t! I’m not interested in blocking visitors or browsers, just robots and scrapers (provided they honor robots.txt). 😄

Mojeek@mastodon.social

@cory @torb another reason why the robots.txt solution for these LLMs is not so good. A meta tag would be better: noml.info/
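
To illustrate the meta tag idea, here is a hedged sketch of how a well-behaved client could read page-level directives with Python's standard `html.parser`. It assumes the proposal is expressed as a `noml` token inside a `<meta name="robots">` tag, which the thread does not confirm; `RobotsMetaParser`, `robots_directives`, and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect comma-separated directives from <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Normalize each token: trim whitespace, lowercase, skip empties.
        self.directives.update(
            token.strip().lower() for token in content.split(",") if token.strip()
        )


def robots_directives(html: str) -> set[str]:
    """Return the set of robots directives declared in the page's meta tags."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Assumed example page; "noml" is the assumed opt-out token.
PAGE = '<html><head><meta name="robots" content="index, noml"></head><body></body></html>'
print("noml" in robots_directives(PAGE))  # True
```

Unlike a per-agent robots.txt rule, a directive like this would not require knowing each crawler's name in advance, but it still depends on the client choosing to honor it.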
