• FauxLiving@lemmy.world
      English
      arrow-up
      2
      ·
      1 hour ago

      LLM-driven web scraping is intense for some sites, so their bot detection software is tuned in a way that creates a lot of false positives.

      Obscuring your browser fingerprint, or blocking javascript, or using an unusual user-agent string can trigger a captcha challenge.

      If you’re not doing any of that and a site suddenly starts giving you captchas, it may be getting DDoS’d by scrapers and challenging all clients.

      A site that archives content is especially vulnerable because it holds a lot of the data that is useful for AI training.

      It is incredibly annoying, but until we have a robust way of proving identity that can’t be gamed by bad actors, we’re stuck with individual user challenges.

    • mjr@infosec.pub
      English
      arrow-up
      5
      ·
      7 hours ago

      Not every time, but far too often. They don’t seem to care that they’re discriminating against people with AV impairment, plus locking out some secure browsers.

        • cecilkorik@piefed.ca
          English
          arrow-up
          1
          ·
          5 hours ago

          Sometimes I’m able to get around it by tweaking some ublock permissions, but once I was surprised to discover that changing my user-agent with user-agent switcher seemed to do the trick. It’s really strange. Cloudflare’s captcha loops are inscrutable.

    • Pika@sh.itjust.works
      English
      arrow-up
      1
      ·
      5 hours ago

      I haven’t faced a captcha, but it just took a solid 2 minutes to resolve and load the article for me. Maybe they have something else happening behind the scenes that’s impacting performance, so they’re locking down certain routes?

    • Vik@lemmy.world
      English
      arrow-up
      2
      ·
      edit-2
      7 hours ago

      No, but I do get about three or four challenges. I can paste the article for you if it helps?

    • Axolotl@feddit.it
      English
      arrow-up
      2
      arrow-down
      4
      ·
      edit-2
      6 hours ago

      I don’t have this problem. You’re probably using Tor or a VPN and that triggered the captcha; if not, then it’s def strange. I’ve never seen this happen.