manton

I reworked our robots.txt parsing to be much better. In the process, I looked at a bunch of websites’ robots.txt files… Blah. No wonder the Internet Archive gave up and now ignores it.

The definition of a “crawler” is increasingly debatable. A search engine? Yes. Bookmarking and archiving? Dunno.

omg.jacky.wtf

@manton at this point, it seems like a crawler is anything that’s not explicitly one layer of operation away from a human driver.

a web browser responding to a user click? sure.
cURL from the command line by a user hitting ENTER? okay.
but a bookmarklet that fires two requests down? iffy!

manton

@omg.jacky.wtf That’s a pretty good definition.

ayjay

@manton I wish that something like iocaine was implementable on micro.blog. I’m not formally requesting it and don’t expect it, but it would be cool.

manton

@ayjay Thanks. At first glance I don’t think it would fit well in our architecture. But maybe something similar in the future could work.
