You know how Google’s new feature called AI Overviews is prone to spitting out wildly incorrect answers to search queries? In one instance, AI Overviews told a user to use glue on pizza to make sure the cheese won’t slide off (pssst…please don’t do this.)

Well, according to an interview at The Verge with Google CEO Sundar Pichai published earlier this week, just before criticism of the outputs really took off, these “hallucinations” are an “inherent feature” of AI large language models (LLMs), which are what drive AI Overviews, and this feature “is still an unsolved problem.”

  • Leate_Wonceslace@lemmy.dbzer0.com · English · 31 upvotes · 6 months ago

    “it’s just expensive”

    I’m a mathematician who’s been following this stuff for a decade or more. It’s not just expensive. Generative neural networks cannot reliably evaluate truth values; it will take time to research how to improve AI in this respect. This is a known limitation of the technology. Closely controlling the training data would certainly make the information more accurate, but that won’t stop it from hallucinating.

    The real answer is that they shouldn’t be trying to answer questions using an LLM, especially because they had a decent algorithm already.

    • Aceticon@lemmy.world · English · 6 upvotes · edited · 6 months ago

      Yeah, I learned neural networks way back when those things were starting in the late 80s/early 90s, use AI (though seldom machine learning) in my job, and really dove into how LLMs are put together when they started getting important. These things operate entirely at the language level, on the probabilities of language tokens appearing in certain places given context. They do not translate from language to meaning and back at all, so there is no logic going on there, nor is there any possibility of it.

      Maybe some kind of ML can help do the transformation from the language space to a meaning space where things can be operated on by logic and then back, but LLMs aren’t a way to do it, as whatever internal representation spaces (yeah, plural) they use in their inner layers aren’t those of meaning, and we don’t really have a way to apply logic to them.

    • snooggums@midwest.social · English · 4 upvotes · 6 months ago

      So with reddit we had several pieces of information that went along with every post.

      The user and community, along with upvotes and downvotes, would inform the majority of users as to whether an average post was actually information or trash. It wasn’t perfect, because early posts always got more votes and jokes in serious topics got upvotes, but the majority of the examples of bad posts like glue on food came from joke subs. If they can’t even filter results by joke sub, there is no way they will successfully handle sarcasm.

      Only basing results on actual professionals won’t address the sarcasm filtering issue for general topics. It would be a great idea for a serious model that is intended to only return results for a specific set of topics.

      • Leate_Wonceslace@lemmy.dbzer0.com · English · 5 upvotes · 6 months ago

        “only return results for a specific set of topics.”

        This is true, but when we’re talking about something that limited you’ll probably get better results with less work by using human-curated answers rather than generating a reply with an LLM.

        • snooggums@midwest.social · English · 5 upvotes · 6 months ago

          Yes, that would be the better solution. Maybe the humans could write down their knowledge and put it into some kind of journal or something!

          • Excrubulent@slrpnk.net · English · 3 upvotes · edited · 6 months ago

            You could call it Hyperpedia! A disruptive new innovation brought to us via AI that’s definitely not just three encyclopedias in a trenchcoat.

    • sudo42@lemmy.world · English · 2 upvotes, 1 downvote · 6 months ago

      It’s worse than that. “Truth” can no more reliably be found by machines than it can be by humans. We’ve spent centuries of philosophy trying to figure out what is “true”. The best we’ve gotten is some concepts we’ve been able to convince a large group of people to agree to.

      But even that is shaky. For a simple example, we mostly agree that bleach will kill “germs” in a petri dish. In a single announcement, we saw 40% of the American population accept as “true” that bleach would also cure them if injected straight into their veins.

      We’re never going to teach machines to reason for us when we meatbags constantly change truth to be whatever will be profitable to some at any given moment.

      • Leate_Wonceslace@lemmy.dbzer0.com · English · 2 upvotes, 1 downvote · 6 months ago

        Are you talking about epistemics in general or alethiology in particular?

        Regardless, the deep philosophical concerns aren’t really germane to the practical issue of just getting people to stop falling for obvious misinformation or people being wantonly disingenuous to score points in the most consequential game of numbers-go-up.