Engagement poisoning of ChatGPT

esaru@beehaw.org · edited 1 year ago

Zaleramancer@beehaw.org · 1 year ago

As I understand it, most LLMs are almost literally the Chinese room thought experiment. They have a massive collection of data, strong algorithms for matching letters to letters in a productive order, and sufficiently advanced processing power to make use of that. An LLM is very good at presenting conversation; completing sentences, paragraphs or thoughts; or answering questions of very simple fact. They’re not good at analysis, because that’s not what they were optimized for.

This became apparent when people discovered that if you ask them to tell you how many times a letter shows up in a word, to do simple math that’s presented in a weird way, or to write a document with citations, they will hallucinate information, because they are just doing what they were made to do: complete sentences, expanding words along a probability curve that produces legible, intelligible text.

I opened up ChatGPT and asked it to provide me with a short description of how medieval European banking worked, with citations, and it provided me with what I asked for. However, the citations it made were fake:

The minute I asked it, I assume a bit of sleight of hand happened, where it’s been set up so that if someone asks a question like that it’s forwarded to a search engine that verifies if the book exists, probably using Worldcat or something. Then I assume another search is made to provide the prompt for the LLM to present the fact that the author does exist, and possibly accurately name some of their books.

I say sleight of hand because this presents the idea that the model is capable of understanding it made a mistake, but I don’t think it does- if it knew that the book wasn’t real, why would it have mentioned it in the first place?

I tested each of the citations it made. In one case, I asked it to tell me more about one of them and it ended up supplying an ISBN without me asking, which I dutifully checked. It was for a book that exists, but it didn’t share a title or author, because those were made up. The book itself was about the correct subject, but the LLM can’t even tell me what the name of the book is correctly; and, I’m expected to believe what it says about the book itself?

localhost@beehaw.org · 1 year ago

As I understand it, most LLMs are almost literally the Chinese room thought experiment.

Chinese room is not what you think it is.

Searle’s argument is that a computer program cannot ever understand anything, even if it’s a 1:1 simulation of an actual human brain with all capabilities of one. He argues that understanding and consciousness are not emergent properties of a sufficiently intelligent system, but are instead inherent properties of biological brains.

“Brain is magic” basically.

Zaleramancer@beehaw.org · 1 year ago

Let me try again: In the literal sense of it matching patterns to patterns without actually understanding them.

localhost@beehaw.org · 1 year ago

If I were to have a discussion with a person responding to me like ChatGPT does, I would not dare suggest that they don’t understand the conversation, much less that they are incapable of understanding anything whatsoever.

What is making you believe that LLMs don’t understand the patterns? What’s your idea of “understanding” here?

Zaleramancer@beehaw.org · 1 year ago

What’s yours? I’m stating that LLMs are not capable of understanding the actual content of any words they arrange into patterns. This is why they create false information, especially in places like my examples with citations- they are purely the result of it creating “academic citation” sounding sets of words. It doesn’t know what a citation actually is.

Can you prove otherwise? In my sense of “understanding” it’s actually knowing the content and context of something, being able to actually subject it to analysis and explain it accurately and completely. An LLM cannot do this. It’s not designed to. There are neural-network AIs built on similar foundational principles towards divergent goals that can produce remarkable results in terms of data analysis, but ChatGPT is not one of them. It doesn’t understand anything, which is why you can repeatedly ask it about a book only to look it up and discover it doesn’t exist.

localhost@beehaw.org · 1 year ago

In my sense of “understanding” it’s actually knowing the content and context of something, being able to actually subject it to analysis and explain it accurately and completely.

This is something that sufficiently large LLMs like ChatGPT can do pretty much as well as non-expert people on a given topic. Sometimes better.

This definition is also very knowledge dependent. You can find a lot of people who would not meet these criteria, especially if the subject they’d have to explain is arbitrary and not up to them.

Can you prove otherwise?

You can ask it to write a poem or a song on some random esoteric topic. You can ask it to play DnD with you. You can instruct it to write something more concisely, or more verbosely. You can tell it to write in a specific tone. You can ask follow-up questions and receive answers. This is not something that I would expect of a system fundamentally incapable of any understanding whatsoever.

But let me reverse this question. Can you prove that humans are capable of understanding? What test can you posit that every English-speaking human would pass and every LLM would fail, that would prove that LLMs are not capable of understanding while humans are?

Zaleramancer@beehaw.org · 1 year ago

And, yes, I can prove that a human can understand things when I ask: Hey, go find some books on a subject, then read them and summarize them. If I ask for that, and they understood it, they can then tell me the names of those books because their summary is based on actually taking in the information, analyzing it and reorganizing it by apprehending it as actual information.

They do not immediately tell me about the hypothetical summaries of fake books and then state with full confidence that those books are real. The LLM does not understand what I am asking for, but it knows what the shape is. It knows what an academic essay looks like and it can emulate that shape, and if you’re just using an LLM for entertainment that’s really all you need. The shape of a conversation for a D&D NPC is the same as the actual content of it, but the shape of an essay is not the same as the content of that essay. Essays are too diverse, they have critical information in them, and they are about that information. The LLM does not understand the information, which is why it makes up citations: it knows that a citation fits in the pattern, and that citations are structured with a book name and author and all the other relevant details. None of those are assured to be real, because it doesn’t understand what a citation is for or why it’s there, only that they should exist. It is not analyzing the books and reporting on them.

Zaleramancer@beehaw.org · 1 year ago

Hello again! So, I am interested in engaging with this question, but I have to say: My initial post is about how an LLM cannot provide actual, real citations with any degree of academic rigor for a random esoteric topic. This is because it cannot understand what a citation is, only what it is shaped like.

An LLM deals with context over content. They create structures that are legible to humans, and they are quite good at that. An LLM can totally create an entire conversation with a fictional character in their style and voice, but that doesn’t mean it knows what that character is. Consider how AI art can have problems that arise from the fact that the models understand the shape of something, but they don’t know what it actually is. That’s why early AI art had a lot of problems with objects ambiguously becoming other objects. The fidelity of these creations has improved with the technology, but that doesn’t imply understanding of the content.

Do you think an LLM understands the idea of truth? Do you think if you ask it to say a truthful thing, and be very sure of itself and think it over, it will produce something that’s actually more accurate or truthful, or just something that has the linguistic hallmarks of being truthful? I know that an LLM will produce complete fabrications that distort the truth if you expect a baseline level of rigor from them, and I proved that above, in that the LLM couldn’t even accurately report the name of a book it was supposedly using as a source.

What is understanding, if the LLM can make up an entire author, book and bibliography if you ask it to tell you about the real world?

localhost@beehaw.org · 1 year ago

Hey again! First of all, thank you for continuing to engage with me in good faith and for your detailed replies. We may differ in our opinions on the topic but I’m glad that we are able to have a constructive and friendly discussion nonetheless :)

I agree with you that LLMs are bad at providing citations. Similarly, they are bad at providing URLs, ID numbers, titles, and many other things that require high-accuracy memorization. I don’t necessarily agree that this is definitive proof of their inability to understand.

In my view, LLMs are always in an “exam mode”. That is to say, due to the way they are trained, they have to provide answers even if they don’t know them. This is similar to how students act when they are taking an exam - they make up facts not because they’re incapable of understanding the question, but because it’s more beneficial for them to provide a partially wrong answer than no answer at all.

I’m also not taking a definitive position on whether or not LLMs have the capability to understand (IMO that’s pure semantics). I am pushing back against the recently widespread idea that they provably don’t. I think LLMs have some tasks that they are very capable at and some that they are not. It’s disingenuous and possibly even dangerous to downplay a powerful technology under the pretense that it doesn’t fit some very narrow and subjective definition of a word.

And this is unfortunately what I often see here, on other Lemmy instances, and on Reddit: people not only redefining what “understand”, “reason”, or “think” mean so that generative AI falls outside of them, but then using this self-proclaimed classification to argue that it isn’t capable of something else entirely. A car doesn’t lose its ability to move if I classify it as a type of chair. A bomb doesn’t stop being dangerous if I redefine what it means to explode.

Do you think an LLM understands the idea of truth?

I don’t think it’s impossible. You can give ChatGPT a true statement, instruct it to lie to you about it, and it will do it. You can then ask it to point out which part of its statement was a lie, and it will do it. You can interrogate it in numerous ways that don’t require exact memorization of niche subjects and it will generally produce an output that, to me, is consistent with the idea that it understands what truth is.

Let me also ask you a counter question: do you think a flat-earther understands the idea of truth? After all, they will blatantly hallucinate incorrect information about the Earth’s shape and related topics. They might even tell you internally inconsistent statements or change their mind upon further questioning. And yet I don’t think this proves that they have no understanding about what truth is, they just don’t recognize some facts as true.

Zaleramancer@beehaw.org · 1 year ago

Hi, once more, I’m happy to have a discussion about this. I have very firm views on it, and enjoy getting a chance to discuss them and work towards an ever greater understanding of the world.

I completely understand the desire to push back against certain kinds of “understandings” people have about LLMs due to their potentially harmful inaccuracy and the misunderstandings that they could create. I have had to deal with very weird, like, existentialist takes on AI art lacking a quintessential humanity that all human art is magically endowed with. Which, come on, there are very detailed technical art reasons why they’re different, visually! It’s a very complicated phenomenon, but it’s not an inexplicable cosmic mystery! Take an art critique class!

Anyway, I get it- I have appreciated your obvious desire to have a discussion.

On the subject of understanding, I guess what I mean is this: Based on everything I know about an LLM, their “information processing” happens primarily in their training. This is why you can run an LLM instance on, like, a laptop but it takes data centers to train them. They do not actually process new information, because if they did, you wouldn’t need to train them, would you- you’d just have them learn and grow over time. An LLM breaks its training data down into patterns and shapes and forms, and uses very advanced techniques to generate the most likely continuation of a collection of words. You’re right in that they must answer, but that’s because their training data is filled with that pattern of answering the question. The natural continuation of a question is, always, an answer-shaped thing. Because of the miracles of science, we can get a very accurate and high fidelity simulation of what that answer would look like!
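To make “the most likely continuation” concrete, here is a deliberately tiny sketch: a bigram counter, which is nowhere near a real LLM and is only meant as an analogy. The corpus and the function names `train_bigrams` and `continue_text` are made up for illustration. All the “learning” happens once, in the counting step; continuing a prompt afterwards only looks things up.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """'Training': count which word follows which in the corpus."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def continue_text(follows, prompt, max_words=5):
    """'Inference': greedily append the most frequent follower of the last word."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # nothing ever followed this word in training
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Toy corpus, invented for this example.
corpus = "the cat sat on the mat . the cat sat on the rug ."
model = train_bigrams(corpus)

print(continue_text(model, "the", max_words=3))  # the cat sat on
print(continue_text(model, "dog"))               # dog (never seen, so no continuation)
```

Nothing in `continue_text` checks whether the output is true, or updates the counts based on what it is told; it only extends the pattern it was given.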

Understanding, to me, implies a real processing of new information and a synthesis of prior and new knowledge to create a concept. I don’t think it’s impossible for us to achieve this technologically: humans manage it, and I’m positive that we could eventually figure out a synthetic method of replicating it. I do not think an LLM does this. The behavior they exhibit and the methods they use seem radically inconsistent with that end, because their ultimate goal was not to create a thinking thing, but to create something that’s able to make human-like speech that’s coherent, reliable and conversational. They totally did that! It’s incredibly good at that. If it were not for their political, environmental and economic context, I would be so psyched about using them! I would have been trying to create templates to get an LLM to be an amazing TTRPG oracle if it weren’t for the horrors of the world.

It’s incredible that we were able to have a synthetic method of doing that! I just wish it was being used responsibly.

An LLM, based on how it works, cannot understand what it is saying, or what you are saying, or what anything means. It can continue text in a conversational and coherent way, and it does so very reliably. The size, depth and careful curation of its training data mean that those responses are probably as accurate to being an appropriate response as they can be. This is why, for questions of common knowledge, or anything you’d do a light Google search for, they’re fine. They will provide you with an appropriate response because the question probably exists hundreds of thousands of times in the training data; and, the information you are looking for also exists in huge redundancies across the internet that got poured into that data. If I ask an LLM which of the characters of My Little Pony has a southern accent, it will probably answer correctly because that information has been repeated so much online that it probably dwarfs the human written record of all things from 1400 and earlier.

The problem becomes evident when you ask something that is absolutely part of a structured system in the English language, but which has a highly variable element to it. This is why I use the “citation problem” when discussing them, because citations are perfect for this: A citation is part of a formal or informal essay, which are deeply structured and information dense, making them great subjects for training data. Their structure includes a series of regular, repeating elements in particular orders (name, date, book name, year, etc.), and these are present and repeated with such regularity that the pattern must be quite established for the LLM as a correct form of speech. The names of academic books are often also highly patterned, and an LLM is great at creating human names, so there’s no problem there.

The issue is this: How can an LLM tell if a citation it makes is real? It gets a pattern that says, “The citation for this information is:” and it continues that pattern by putting a name, date, book title, etc. in that slot. However, this isn’t like asking what a rabbit is: the pattern of citations leads into an endless warren of hundreds of thousands of names, book titles, dates, and publishing companies. It generates them, but it cannot understand what a citation really means, just that there is a pattern it must continue, so it does.
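A crude way to picture the slot-filling problem, as a rough analogy only and not a description of how an LLM actually samples text: treat a citation as three slots and fill each one with any value that has been seen in that slot before. Everything below is a placeholder (the `catalog` entries are not real books, and the variable names are invented for this sketch).

```python
from itertools import product

# Hypothetical catalog of records that "really exist". Placeholder data only.
catalog = {
    ("Author A", "Title One", 1990),
    ("Author B", "Title Two", 2005),
}

# Values seen in each slot, remembered independently of one another.
authors = sorted({author for author, _, _ in catalog})
titles = sorted({title for _, title, _ in catalog})
years = sorted({year for _, _, year in catalog})

# Every combination is a perfectly well-formed, citation-shaped triple.
candidates = list(product(authors, titles, years))

# Only some of them correspond to a record in the catalog.
real = [c for c in candidates if c in catalog]
fake = [c for c in candidates if c not in catalog]

print(len(candidates), len(real), len(fake))  # 8 2 6
```

Even with two records, most of the well-formed combinations do not exist. The shape of a citation says nothing about whether it points at anything, and the gap only widens as the pool of names, titles and dates grows.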

Let me also ask you a counter question: do you think a flat-earther understands the idea of truth? After all, they will blatantly hallucinate incorrect information about the Earth’s shape and related topics. They might even tell you internally inconsistent statements or change their mind upon further questioning. And yet I don’t think this proves that they have no understanding about what truth is, they just don’t recognize some facts as true.

A flat-earther has some understanding of what truth is, even if their definition is divergent from the norm. The things they say are deeply inaccurate, but you can tell that they are the result of a chain of logic that you can ask about and follow. It’s possible to trace flat-earth ideas down to sources. They’re incorrect, but they’re arrived at because of an understanding of prior (incorrect) information. A flat-earther does not always invent their entire argument and the basis for their beliefs on the spot; they are presenting things they know about from prior events, and they can show the links. An LLM cannot tell you how it arrived at a conclusion, because if you ask it, you are just receiving a new continuation of your prior text. Whatever it says is accurate only when probability and data set size are on its side.