• FauxLiving@lemmy.world
        English
        arrow-up
        2
        ·
        1 hour ago

        LLM-driven web scraping is intense for some sites, so their bot detection software is tuned in a way that creates a lot of false positives.

        Obscuring your browser fingerprint, blocking JavaScript, or using an unusual user-agent string can each trigger a captcha challenge.

        If you’re not doing any of that and a site suddenly starts giving you captchas, then it may be getting DDoS’d by scrapers and challenging all clients.
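
        To make that concrete, here is a rough sketch of how such a decision might look. This is only an illustration under my own assumptions, not how Cloudflare or any real bot detection product works: `ClientSignals`, `COMMON_UA_TOKENS`, and `should_challenge` are all made-up names, and real systems weigh far more signals than these three.

```python
from dataclasses import dataclass


@dataclass
class ClientSignals:
    """Hypothetical per-request signals a site might look at."""
    javascript_enabled: bool
    fingerprint_obscured: bool
    user_agent: str


# Assumption: a user agent counts as "usual" if it names a mainstream browser.
COMMON_UA_TOKENS = ("Firefox/", "Chrome/", "Safari/")


def should_challenge(signals: ClientSignals, under_heavy_load: bool = False) -> bool:
    """Return True if this client would be shown a captcha in this toy model."""
    # When the site is being hammered by scrapers, everyone gets challenged.
    if under_heavy_load:
        return True

    suspicious = 0
    if not signals.javascript_enabled:
        suspicious += 1
    if signals.fingerprint_obscured:
        suspicious += 1
    if not any(token in signals.user_agent for token in COMMON_UA_TOKENS):
        suspicious += 1

    # Aggressive tuning: a single suspicious signal is enough,
    # which is exactly what produces the false positives.
    return suspicious >= 1
```

        With a threshold that low, a privacy-hardened browser gets challenged even when nothing about it is malicious, and during heavy load an ordinary browser gets challenged too.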

        A site that archives content is especially vulnerable because they have a lot of the data that is useful for AI training.

        It is incredibly annoying, but until we have a robust way of proving identity that can’t be gamed by bad actors we’re stuck with individual user challenges.

      • mjr@infosec.pub
        English
        arrow-up
        5
        ·
        7 hours ago

        Not every time, but far too often. They don’t seem to care that they’re discriminating against people with AV impairment, plus locking out some secure browsers.

          • cecilkorik@piefed.ca
            English
            arrow-up
            1
            ·
            5 hours ago

            Sometimes I’m able to get around it by tweaking some ublock permissions, but once I was surprised to discover that changing my user-agent with user-agent switcher seemed to do the trick. It’s really strange. Cloudflare’s captcha loops are inscrutable.

      • Pika@sh.itjust.works
        English
        arrow-up
        1
        ·
        5 hours ago

        I haven’t faced a captcha, but it just took a solid 2 minutes to resolve and load the article for me. Maybe something else is happening behind the scenes that’s impacting performance, so they’re locking down certain routes?

      • Vik@lemmy.world
        English
        arrow-up
        2
        ·
        edit-2
        7 hours ago

        No, but I do get about three or four challenges. I can paste the article for you if it helps?

      • Axolotl@feddit.it
        English
        arrow-up
        2
        arrow-down
        4
        ·
        edit-2
        6 hours ago

        I don’t have this problem. You’re probably using Tor or a VPN and that triggered the captcha; if not, then it’s def strange. I’ve never seen this happen.