This is very exciting. Here is the APK I downloaded. And the associated discussion.
It even already seems to support stylus input which is very exciting seeing as there has been talk of porting RNote to Android.
Do you have a better way? It is way more private than anything else I’ve seen.
From an energy usage perspective it also isn’t bad. Spiking the CPU for a few seconds is minor, especially compared to other tasks.
The mersenneforums have users solve an obscure (to a non-mathematician) but relatively simple number theory problem.
Yeah, tarpits. Or, even just intentionally fractionally lagging the connection, or putting a delay on the response to some mime types. Delays don’t consume nearly as much processing as PoW. Personally, I like tar pits that trickle out content like a really slow server. Hidden URLs that users are not likely to click on. These are about the least energy-demanding solutions that have a chance of fooling bots; a true, no-response tarpit would use less energy, but is easily detected by bots and terminated.
Proof of work is just a terrible idea, once you’ve accepted that PoW is bad for the environment, which it demonstrably is.
Tar pits rely on crawlers being dumb. That isn’t necessarily the case with a lot of stuff on the internet. It isn’t uncommon for a bot to render a page and then only process the visible stuff.
Also I’ve yet to see any evidence that Anubis is any worse for the environment than any basic computer function.
Tarpits suck. Not worth the implementation or overhead. Instead the better strategy is to pretend the server is down with a 503 code, or that the URL is invalid with a 404 code, so the bots stop clinging to your content.
Also we already have non-PoW captchas that don’t require javascript. See: go-away for these implementations.
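To make the "pretend the server is down" idea concrete, here is a minimal sketch in Go of a middleware that answers suspected crawlers with a 503 instead of serving content. The `isSuspectedBot` check is a hypothetical placeholder; a real setup would use known IP ranges, failed challenges, or something like go-away rather than a user-agent sniff.

```go
package main

import (
	"net/http"
	"strings"
)

// isSuspectedBot is a hypothetical placeholder heuristic: flag obvious
// crawler user agents. Real deployments would use better signals.
func isSuspectedBot(r *http.Request) bool {
	ua := strings.ToLower(r.UserAgent())
	return strings.Contains(ua, "bot") || strings.Contains(ua, "crawler")
}

// denyBots wraps a handler and claims the server is down for suspected
// bots, so they mark the site dead and stop hammering it.
func denyBots(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isSuspectedBot(r) {
			http.Error(w, "service unavailable", http.StatusServiceUnavailable)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.Handle("/", http.FileServer(http.Dir("./public")))
	http.ListenAndServe(":8080", denyBots(mux))
}
```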
Good luck detecting bots…
It’s actually not that hard. Most of these bots are using a predictable scheme of headless browsers with no js or minimal js rendering to scrape the web page. Fully deployed browser instances are demonstrably harder to scale and basically impossible to detect without behavioral pattern detection or sophisticated captchas that also cause friction to users.
The problem with bots has never rested solely on detectability. It’s about:
A. How much you inconvenience the user to detect them
B. Impacting good or acceptable bots like archival, curl, custom search tools, and loads of other totally benign use cases.
There is negligible server overhead for a tarpit. It can be merely a script that listens on a socket and never replies, or it can reply with markov-generated html with a few characters a second, taking minutes to load a full page. This has almost no overhead. Implementation is adding a link to your page headers and running the script. It’s not exactly rocket science.
Which part of that is overhead, or difficult?
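For illustration, a trickle-tarpit of the kind described above can be sketched in a few lines of Go: a handler on a hidden URL that streams filler HTML a word per second, so a naive crawler spends minutes per page. The path, the timings, and the use of random filler instead of a real markov chain are all arbitrary choices here, not anyone's actual implementation.

```go
package main

import (
	"fmt"
	"math/rand"
	"net/http"
	"time"
)

// tarpit streams junk HTML very slowly to whoever requested it.
func tarpit(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/html")
	fmt.Fprint(w, "<html><body>")
	flusher.Flush()

	// Dribble out filler text one word per second; a full "page" takes minutes.
	words := []string{"lorem", "ipsum", "dolor", "sit", "amet"}
	for i := 0; i < 300; i++ {
		fmt.Fprintf(w, "%s ", words[rand.Intn(len(words))])
		flusher.Flush()
		time.Sleep(time.Second)
	}
	fmt.Fprint(w, "</body></html>")
}

func main() {
	// Link to this path only from a hidden anchor real users won't click.
	http.HandleFunc("/notes/archive", tarpit)
	http.ListenAndServe(":8080", nil)
}
```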
It certainly is not negligible compared to static site delivery, which can breezily be cached, unlike on-the-fly tarpits. Even traditional static sites are getting their asses kicked sometimes by these bots. And you want to make that worse by having the server generate text with markov chains for each request? The point for most is reducing the sheer bandwidth and CPU cycles being eaten up by these bots hitting every endpoint.
Many of these bots are designed to stop hitting endpoints when they return codes that signal they’ve flattened it.
Tarpits only make sense from the perspective of someone trying to cause monetary harm to an otherwise uncaring VC-funded mob with nigh endless amounts of cash to burn. Chances are your middling attempt at causing them friction isn’t, on its own, actually going to get them to leave you alone.
Meanwhile you burn significant amounts of resources and traffic is still stalled for normal users. This is not the kind of method a server operator who actually wants a dependable service deploys to try to get up and running again. You want the bots to hit nothing even slightly expensive (read: preferably something minimal you can cache or mostly cache) and to never come back.
A compromise between these two things is what Anubis is doing. It inflicts maximum pain (on those attempting to bypass it – otherwise it just fails) for minimal cost, by creating a small seed (more trivial than even a markov chain – it’s literally just a SHA-256 hash) that a client then has to solve a challenge based on. It’s nice, but certainly not my preference: I like go-away because it leverages browser APIs these headless agents don’t use (and subsequently lets JS-less browsers work) in this kind of field of problems. Then, if you have a record of known misbehavers (their IP ranges, etc.), or some other scheme to keep track of failed challenges, you hit them with fake server-down errors.
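A toy illustration of that SHA-256 proof-of-work idea, in Go: the server hands out a random seed and a difficulty, and the client brute-forces a nonce so that the hash of seed plus nonce starts with enough zero bits. This is the general shape of the technique, not Anubis’s actual protocol; the seed, difficulty, and encoding here are made up.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/bits"
)

// leadingZeroBits counts the zero bits at the front of a hash.
func leadingZeroBits(sum [32]byte) int {
	n := 0
	for _, b := range sum {
		if b == 0 {
			n += 8
			continue
		}
		n += bits.LeadingZeros8(b)
		break
	}
	return n
}

// solve brute-forces a nonce meeting the difficulty target; this is the
// CPU spike the client pays.
func solve(seed string, difficulty int) uint64 {
	buf := make([]byte, 8)
	for nonce := uint64(0); ; nonce++ {
		binary.BigEndian.PutUint64(buf, nonce)
		sum := sha256.Sum256(append([]byte(seed), buf...))
		if leadingZeroBits(sum) >= difficulty {
			return nonce
		}
	}
}

func main() {
	// The server only generates the seed and later re-hashes once to
	// verify the submitted nonce; the expensive search is client-side.
	seed := "example-seed-from-server"
	nonce := solve(seed, 16) // ~65k hashes on average at 16 bits
	fmt.Println("found nonce:", nonce)
}
```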
Markov chains and slow loading sites are costing you material just to cost them more material.
None of those things work well, is the problem. It doesn’t stop the bots from hammering your site. Crawlers will just time out and move on.
I run a service that gets attacked by AI bots, and while PoW isn’t the only way to do things, none of your suggestions work at all.
I think Anubis is born out of desperation.