• chaonaut@lemmy.4d2.org
    link
    fedilink
    English
    arrow-up
    5
    ·
    5 hours ago

    Maybe the marketers should be a bit more picky about what they slap “AI” on and maybe decision makers should be a little less eager to follow whatever Better Auto complete spits out, but maybe that’s just me and we really should be pretending that all these algorithms really have made humans obsolete and generating convincing language is better than correspondence with reality.

    • surph_ninja@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      1
      ·
      4 hours ago

      I’m not sure the anti-AI marketing stance is any more solid of a position. Though it’s probably easier to defend, since it’s so vague and not based on anything measurable.

      • chaonaut@lemmy.4d2.org
        link
        fedilink
        English
        arrow-up
        3
        arrow-down
        1
        ·
        4 hours ago

        Calling AI measurable is somewhat unfounded. Between not having a coherent, agreed-upon definition of what does and does not constitute an AI (we are, after all, discussing LLMs as though they were AGI), and the difficulty that exists in discussing the qualifications of human intelligence, saying that a given metric covers how well a thing is an AI isn’t really founded on anything but preference. We could, for example, say that mathematical ability is indicative of intelligence, but claiming FLOPS is a proxy for intelligence falls rather flat. We can measure things about the various algorithms, but that’s an awful long ways off from talking about AI itself (unless we’ve bought into the marketing hype).

        • surph_ninja@lemmy.world
          link
          fedilink
          English
          arrow-up
          1
          ·
          edit-2
          4 hours ago

          So you’re saying the article’s measurements about AI agents being wrong 70% of the time is made up? Or is AI performance only measurable when the results help anti-AI narratives?

          • Jakeroxs@sh.itjust.works
            link
            fedilink
            English
            arrow-up
            3
            ·
            3 hours ago

            I would definitely bet it’s made up and poorly designed.

            I wish that weren’t the case because having actual data would be nice, but these are almost always funded with some sort of intentional slant, for example nic vape safety where they clearly don’t use the product sanely and then make wild claims about how there’s lead in the vapes!

            Homie you’re fucking running the shit completely dry for longer then any humans could possible actually hit the vape, no shit it’s producing carcinogens.

            Go burn a bunch of paper and directly inhale the smoke and tell me paper is dangerous.

          • chaonaut@lemmy.4d2.org
            link
            fedilink
            English
            arrow-up
            1
            arrow-down
            1
            ·
            2 hours ago

            I mean, sure, in that the expectation is that the article is talking about AI in general. The cited paper is discussing LLMs and their ability to complete tasks. So, we have to agree that LLMs are what we mean by AI, and that their ability to complete tasks is a valid metric for AI. If we accept the marketing hype, then of course LLMs are exactly what we’ve been talking about with AI, and we’ve accepted LLMs features and limitations as what AI is. If LLMs are prone to filling in with whatever closest fits the model without regard to accuracy, by accepting LLMs as what we mean by AI, then AI fits to its model without regard to accuracy.

            • surph_ninja@lemmy.world
              link
              fedilink
              English
              arrow-up
              1
              ·
              2 hours ago

              Except you yourself just stated that it was impossible to measure performance of these things. When it’s favorable to AI, you claim it can’t be measured. When it’s unfavorable for AI, you claim of course it’s measurable. Your argument is so flimsy and your understanding so limited that you can’t even stick to a single idea. You’re all over the place.

              • chaonaut@lemmy.4d2.org
                link
                fedilink
                English
                arrow-up
                1
                arrow-down
                1
                ·
                2 hours ago

                It questionable to measure these things as being reflective of AI, because what AI is changes based on what piece of tech is being hawked as AI, because we’re really bad at defining what intelligence is and isn’t. You want to claim LLMs as AI? Go ahead, but you also adopt the problems of LLMs as the problems of AIs. Defining AI and thus its metrics is a moving target. When we can’t agree to what is is, we can’t agree to what it can do.

                • surph_ninja@lemmy.world
                  link
                  fedilink
                  English
                  arrow-up
                  1
                  ·
                  1 hour ago

                  Again, you only say it’s a moving target to dispel anything favorable towards AI. Then you do a complete 180 when it’s negative reporting on AI. Makes your argument meaningless, if you can’t even stick to your own point.

                  • chaonaut@lemmy.4d2.org
                    link
                    fedilink
                    English
                    arrow-up
                    1
                    arrow-down
                    1
                    ·
                    1 hour ago

                    I mean, I argue that we aren’t anywhere near AGI. Maybe we have a better chatbot and autocomplete than we did 20 years, but calling that AI? It doesn’t really track, does it? With how bad they are at navigating novel situations? With how much time, energy and data it takes to eek out just a tiny bit more model fitness? Sure, these tools are pretty amazing for what they are, but general intelligences, they are not.