• cecilkorik@lemmy.ca
    15 hours ago

    Absolutely true. They’ll buy the data they want from some shitty crawler run by some data broker in some far-flung and lawless part of the world, hallucinate the actual source, and, if they have to, pretend they had no idea their “data partner” wasn’t respecting robots.txt. They won’t ever have to, because it’s practically impossible to detect and prove, and realistically unenforceable.

    This is a company that removed its company motto of “Don’t be evil” because it found it too “limiting”. Don’t be naive.

    • General_Effort@lemmy.world
      14 hours ago

      That’s very different from what I called false.

      What you describe may happen, but probably not as much as you think. Much of that stuff is just not that valuable. Some personal, colloquial writing is necessary, but Google already pays Reddit. Other stuff is better obtained from torrents or shadow libraries like Anna’s Archive.