For one month beginning on October 5, I ran an experiment: Every day, I asked ChatGPT 5 (more precisely, its “Extended Thinking” version) to find an error in “Today’s featured article”. In 28 of these 31 featured articles (90%), ChatGPT identified what I considered a valid error, often several. I have so far corrected 35 such errors.
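As an illustration of the routine described above, here is a minimal sketch of how such a daily check could be scripted, assuming the Wikimedia featured-content feed and the OpenAI chat API. The model name, the prompt wording, and the use of the short lead-section extract (rather than the full article text the author presumably checked) are placeholders for illustration, not the author's actual setup.

```python
# Hypothetical sketch of the daily check: fetch Today's Featured Article
# and ask a model to point out one factual error. Model name and prompt
# are placeholders; error handling is omitted.
from datetime import date

import requests
from openai import OpenAI

FEED = "https://api.wikimedia.org/feed/v1/wikipedia/en/featured/{:%Y/%m/%d}"


def todays_featured_article() -> tuple[str, str]:
    """Return the title and lead-section extract of Today's Featured Article."""
    resp = requests.get(
        FEED.format(date.today()),
        headers={"User-Agent": "tfa-error-check/0.1 (example)"},
        timeout=30,
    )
    tfa = resp.json()["tfa"]
    return tfa["titles"]["normalized"], tfa["extract"]


def find_an_error(title: str, text: str) -> str:
    """Ask the model to identify a single factual error in the article text."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    completion = client.chat.completions.create(
        model="gpt-5",  # placeholder model name
        messages=[{
            "role": "user",
            "content": (
                f"Find one factual error in the Wikipedia article '{title}' "
                f"and explain why it is wrong:\n\n{text}"
            ),
        }],
    )
    return completion.choices[0].message.content


if __name__ == "__main__":
    title, extract = todays_featured_article()
    print(find_an_error(title, extract))
```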

  • kalkulat@lemmy.world · 12 points · 2 days ago

    Finding inconsistencies is not so hard. Pointing them out might be a -little- useful. But resolving them based on trustworthy sources can be a -lot- harder. Most science papers require privileged access. Many news stories may have been grounded in old, mistaken histories … if not in outright guesses, distortions or even lies. (The older the history, the worse.)

    And, since LLMs are usually incapable of citing sources for their own (often batshit) claims anyway, where will ‘the right answers’ come from? I’ve seen LLMs, when questioned again, apologize that their previous answers were wrong.

      • kalkulat@lemmy.world · 1 point · edited · 4 hours ago

        To quote ChatGPT:

        “Large Language Models (LLMs) like ChatGPT cannot accurately cite sources because they do not have access to the internet and often generate fabricated references. This limitation is common across many LLMs, making them unreliable for tasks that require precise source citation.”

        It also mentions Claude. Without a cite, of course.

        Reliable information must come from a source with a reputation for accuracy … a trustworthy one. Otherwise it’s little more than a rumor. Of course, to reveal a source is to reveal having read that source … which might leave the provider open to a copyright lawsuit.

      • jacksilver@lemmy.world · 7 points · 2 days ago

        All of them. If you’re seeing sources cited, it means it’s a RAG setup (an LLM with extra retrieval bits). Those extra bits make a big difference: the response is grounded in a select few retrieved references rather than drawing on everything the model has absorbed about the subject (roughly the idea sketched below).
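A minimal sketch of the RAG idea mentioned in the comment above, assuming an OpenAI-style chat API: a handful of passages are retrieved first, and the model is told to answer only from those passages and to cite their IDs, which is why such systems can show sources at all. The corpus, the naive word-overlap retriever, and the model name are illustrative placeholders; real systems typically use embedding-based search over a large corpus.

```python
# Toy retrieval-augmented generation (RAG): ground the answer in a few
# retrieved passages and ask the model to cite them by ID. The corpus,
# the word-overlap retriever, and the model name are placeholders.
from openai import OpenAI

PASSAGES = {
    "doc1": "Today's featured article appears on the English Wikipedia Main Page each day.",
    "doc2": "Featured articles are marked with a small bronze star icon.",
}


def retrieve(question: str, k: int = 2) -> dict[str, str]:
    """Rank stored passages by crude word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        PASSAGES.items(),
        key=lambda item: len(q_words & set(item[1].lower().split())),
        reverse=True,
    )
    return dict(ranked[:k])


def answer(question: str) -> str:
    """Answer using only the retrieved passages, citing their IDs."""
    sources = retrieve(question)
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in sources.items())
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{
            "role": "user",
            "content": (
                "Answer using only the sources below and cite their IDs "
                f"in brackets.\n\nSources:\n{context}\n\nQuestion: {question}"
            ),
        }],
    )
    return completion.choices[0].message.content
```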