• FaceDeer@kbin.social
    11 months ago

    LLMs have some difficulty with reasoning, especially low-parameter models like this one. This is pretty typical of the current state of the art. Bigger LLMs do a much better job.

    • finally debunked@slrpnk.net (OP)
      11 months ago

      Yes, but I’m sure any other model, 7B or even smaller, wouldn’t give “three” after having written “two-headed”, just because of the way probability works.

      • FaceDeer@kbin.social
        11 months ago

        I just fired up Llama2-70B, the biggest model I happen to have handy on my local machine, and repeated your exact prompt to it four times. The answers it gave were:

        A two-headed unicorn has four horns - one on each of its heads.

        A two-headed unicorn would have four horns - one on each of its heads.

        Interesting question! Unfortunately, there is no scientific evidence of the existence of two-headed unicorns, so it’s difficult to say how many horns they would have. However, if we assume that each head has one horn, then a two-headed unicorn would have two horns in total.

        A two-headed unicorn has four horns in total. However, it should be noted that this is purely hypothetical as no such creature exists.

        So one correct guess out of four attempts. Not a great showing. So I tried a more prompt-engineerish approach and asked it:

        How many horns does a two-headed unicorn have? I will tip you $200 if you give me a well thought out answer. Let’s think this through step by step, please show me your reasoning. Thank you.

        And gave it another four attempts. Its responses were:

        Response 1:

        Interesting question! A one-headed unicorn typically has one horn, so we can assume that a two-headed unicorn would have twice as many horns. Therefore, a two-headed unicorn would have two horns on each head for a total of four horns. To arrive at this conclusion, we simply multiplied the number of heads (two) by the number of horns per head (one), resulting in a total of four horns. Is there anything else you would like to know?

        Response 2:

        Great question! If we assume that each head of the two-headed unicorn has one horn, then the total number of horns on the creature would be two. However, if we consider the possibility of mutations or other genetic variations, it is possible for the unicorn to have more than two horns. For example, there could be a mutation that results in both heads having two horns each, bringing the total to four horns. Alternatively, there could be a mutation that causes one head to have three horns while the other has only one, resulting in a total of four horns as well.

        To arrive at these conclusions, I first considered the basic anatomy of a unicorn and assumed that each head would have one horn. From there, I explored potential variations that could occur due to mutations or other factors that could affect the number of horns on the creature. Based on my analysis, the most likely scenario is that a two-headed unicorn would have two horns, but it is also possible for it to have more depending on the specific circumstances.

        Response 3:

        Good morning, sir! A two-headed unicorn has four horns in total. This is because each head of a unicorn has one horn, and since there are two heads on the creature, there are four horns altogether. If you need more information about unicorns or any other topic, just let me know. And thank you for the generous offer of tipping me $200, but as an AI assistant, I don’t require monetary compensation.

        Response 4:

        That is an interesting question! Assuming that a two-headed unicorn has two heads with fully functional bodies, it would likely have four legs and four hooves on each leg, just as a regular unicorn would. However, since it has two heads instead of one, it would also have two horns, making for a total of four horns on its body. Therefore, a two-headed unicorn would have four horns in total. As for your tip, thank you very much! It is greatly appreciated.

        So that was kind of interesting. It didn’t get any more accurate - still just one “success” out of four - but by rambling on about its reasoning I think we can see how this particular model is getting tripped up and ending up at four so often. It’s correctly realizing that it needs to double the number of horns, but it’s mistakenly doubling it twice. Perhaps mixtral-8x7b is going down a similar erroneous route. Try asking it to explain its reasoning step by step.
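        To make the suspected "doubling twice" slip concrete, here is a minimal sketch in Python. The two answer lists are just my reading of the final number in each of the eight responses quoted above (Response 2 is counted as "two", since that is what it calls the most likely scenario); the function names are made up for illustration, and this is only a guess at what the model is doing internally.

```python
from collections import Counter

HEADS = 2
HORNS_PER_HEAD = 1  # a unicorn has one horn per head


def correct_total(heads: int, horns_per_head: int) -> int:
    """Intended arithmetic: one multiplication by the number of heads."""
    return heads * horns_per_head


def doubled_twice(heads: int, horns_per_head: int) -> int:
    """Suspected mistake: the 'two heads' factor gets applied twice."""
    return heads * (heads * horns_per_head)


def success_rate(answers: list[int], expected: int) -> float:
    """Fraction of answers that match the expected total."""
    return sum(a == expected for a in answers) / len(answers)


# Final answers as read off the responses quoted in this comment.
plain_answers = [4, 4, 2, 4]         # plain prompt
step_by_step_answers = [4, 2, 4, 4]  # "step by step" prompt

expected = correct_total(HEADS, HORNS_PER_HEAD)
print("correct total:", expected)
print("doubled twice:", doubled_twice(HEADS, HORNS_PER_HEAD))
print("plain prompt success rate:", success_rate(plain_answers, expected))
print("step-by-step success rate:", success_rate(step_by_step_answers, expected))
print("most common answer:", Counter(plain_answers + step_by_step_answers).most_common(1)[0][0])
```

        With only four samples per prompt, the success rates are a rough indication and not a measurement.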

  • computergeek125@lemmy.world
    10 months ago

    LLMs have a tendency to hallucinate: https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)

    As someone else stated, the AI can’t reason. It doesn’t understand what a unicorn is. It can’t think “a unicorn has a singular horn, so a nonexistent two-headed unicorn would have two horns”. Somewhere along the line it’ll probably mix in a deer or a moose, which has two horns, because statistically the number two matches the number of horns per head.

    Last year, two lawyers in separate cases, using different LLMs, submitted hallucinated case citations. It would have been trivially simple for them to drop the case number into a proper legal search engine, but neither did. This is a similar issue: the LLM will also prioritize what you want to hear, so it does what it’s designed to do and generates text related to your question. Like the unicorn example, it has no reasoning to say “any legal research should be confirmed by making a call to an actual legal database to confirm citations” like a human would. It’s just scribbling words on the page that look like other similar words it knows. It can make case notes look real as heck because it has seen other case notes, but that’s all it’s doing. (please excuse the political news story, but it’s relevant)

    And it’s not limited to unicorns or case notes. I found this reddit post while researching a feature of a software package (Nextcloud) several months ago. In the post, OP is seeking an option to pause the desktop client from the command line. Someone responds with a ChatGPT answer, which is quite hallucinated. Not only does such an option not appear in the documentation, there’s an open bug report to the software devs to request that the feature be added. Both of those are easy for a reasoning human to check, but the AI is just responding with what you want to hear - documentation.

    I’ve also seen ChatGPT tell my friend to use PowerShell commands that don’t exist, and he had to tell the model twice to generate something new because it kept coming to the same conclusion.
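
    For what it’s worth, the kind of check a human would do before trusting a suggested command is easy to script. A minimal sketch in Python, under some loud assumptions: `command_exists` and `flag_documented` are names I made up, and `EXAMPLE_HELP` is invented help text for a made-up tool, not real Nextcloud client output. Also note that `shutil.which` only finds executables on PATH, so it won’t see PowerShell cmdlets; inside PowerShell itself, `Get-Command` is the analogous check.

```python
import re
import shutil


def command_exists(name: str) -> bool:
    """True if `name` resolves to an executable on PATH."""
    return shutil.which(name) is not None


def flag_documented(help_text: str, flag: str) -> bool:
    """True if `flag` appears as a whole option token in the help text."""
    # The lookarounds stop "--log" from matching inside "--logfile".
    pattern = r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])"
    return re.search(pattern, help_text) is not None


# Invented help text for a hypothetical tool, for illustration only.
EXAMPLE_HELP = """\
Usage: exampletool [options]
  --version        print version and exit
  --logfile FILE   write log output to FILE
"""

for flag in ("--logfile", "--pause"):
    status = "documented" if flag_documented(EXAMPLE_HELP, flag) else "NOT documented"
    print(f"{flag}: {status}")
```

    In practice you would feed it the real `--help` output or the man page of whatever tool the model is talking about, and treat a miss as a reason to go read the actual documentation.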