DO NOT talk about the goblins

not_IO@lemmy.blahaj.zone · edit-2 1 month ago

DO NOT talk about the goblins

Pennomi@lemmy.world · 1 month ago

Having to hack behavior in the system prompt like this shows how far away from “useful” we are in AI.

zr0@lemmy.dbzer0.com · 1 month ago

You do not want to know how good current LLM’s would be, if you would remove the thousands of negative-prompts aka. guard rails.

a_non_monotonic_function@lemmy.world · 1 month ago

Narrator: They would still be garbage.

brbposting@sh.itjust.works · 1 month ago

Anthropic actually developed a system which, in the hands of the most capable…in narrow domains used conscientiously in a limited fashion with tremendous and constant risk mitigation……, is reportedly not garbage

Narrator: they ruined it

Junkasaurus@lemmy.world · 1 month ago

I doubt that. What evidence do you have?

Breezy@lemmy.world · 1 month ago

Well theyd be able to say how to make a bomb. Or kill yourself effectively. AI ceos dont even care what their systems can do. If some customers die thats okay to them, it shows how intelligent their ai is. And thats a statement from one of the big AI CEOs.

porkloin@lemmy.world · 1 month ago

I don’t think those are the categories where most people are finding LLMs frustrating. We keep being told human white collar work is on the precipice of being replaced, but LLMs continue to be really inconsistent. Failing to parrot easily retrievable info like how to build a legally restricted thing or off yourself isn’t what people are finding lacking it’s that half the time it does something sorta correctly and the other half of the time it lies, fucks up, or fucks up and then lies about it.

Breezy@lemmy.world · 1 month ago

Im just parroting what john oliver said on his last episode on sunday.

skisnow@lemmy.ca · 1 month ago

This is demonstrably false, given you can download your own models and change the system prompts yourself.

zr0@lemmy.dbzer0.com · 1 month ago

That’s not how it works, as the guard rails are not just simple prompts that you just can delete.

Even with “abliteration”, you are modifying the model basically without the whole retraining, but also lose many capabilities at the same time.

So much for “demonstrably false”, while you obviously have never tried to uncensor any LLM.

skisnow@lemmy.ca · 1 month ago

The thread was literally about the prompt text.

zr0@lemmy.dbzer0.com · 1 month ago

The prompts are part of the training, you realize that? They are then inside the weights. Not just text files you can delete and you are good?

Only because an LLM reveals those negative-prompts does not mean you can just remove them.

Do you genuinely know what you are talking about, or are you just here to ragebait?

Echo Dot@feddit.uk · 1 month ago

The prompts are part of the training

No they’re not. They’re injected into every input that you enter into the system.

Rain World: Slugcat Game@lemmy.world · 29 days ago

Do you genuinely know what you are talking about, or are you just here to ragebait?

…

anyways, yeah, the ais are trained to be more friendly, agreeable, and never take off the mask, but prompts are just text files you can delete??
if you want a real comparison, try one of the olmo checkpoints before the fine-tuning?? i think??

Echo Dot@feddit.uk · 1 month ago

Are you suggesting that there is a conspiracy to keep AI down?

How would that work AI is barely regulated.

zr0@lemmy.dbzer0.com · 1 month ago

AI is more regulated than you might think, or else they would not censor their models. One thing is to improve quality in a cosmetic way, as they have not fixed the issue at the core yet (scaling is currently more important). The other thing is safety. Or did you not hear what Grok did in the past months? So tell me again it is not regulated.

Echo Dot@feddit.uk · 1 month ago

It literally tells people to kill themselves some of the time it’s definitely not regulated.

I would love to know where you’re getting your information from.

zr0@lemmy.dbzer0.com · 1 month ago

Your mom told me that yesterday

Echo Dot@feddit.uk · 1 month ago

Thank you for demonstrating to everybody in the thread that you have absolutely no idea what you’re talking about because you have now resorted to attempting to be insulting rather than to defend your arguement.

lightnsfw@reddthat.com · 1 month ago

Hey now, I tend to steer conversations towards talking about Gundam but people still consider me to be pretty useful.

theunknownmuncher@lemmy.world · edit-2 1 month ago

Not to defend AI, but this is really foolish thinking. Configuration to make it useful proves it is not useful?

python@lemmy.world · 1 month ago

Because imagine spending billions on training it specifically to produce useful answers and then not even trusting it to not randomly start answering with something completely unrelated.

FishFace@piefed.social · 1 month ago

What matters is the outcome, not how it is achieved.

And is the outcome good? Eh, sometimes.

NaibofTabr@infosec.pub · 1 month ago

What matters is the outcome

If that were true, then anyone with any sense would have recognized a long time ago that deterministically incorrect is a lot more valuable than nondeterministically correct occasionally, and given up on all this language model nonsense.

A deterministic system that produces wrong output can be fixed. A nondeterministic system that produces wrong output cannot be fixed in any way that can be demonstrated conclusively.

Nondeterministic software is basically worthless in any case where accuracy or reliability are required.

FishFace@piefed.social · 1 month ago

Non-deterministic software is fine and we’ve been using it for ages. It’s usable when:

The base error rate is low enough
Accuracy is not important
The outcome is cheap to verify by some other means

That rules out several applications of current LLMs, but it rules in several others.

Echo Dot@feddit.uk · 1 month ago

If I have to verify the output of an AI then unless I can do the verification in 30 seconds but work would somehow take me hours then it’s not useful. I can’t think of many scenarios in which verification is fast but the work itself is slow.

FishFace@piefed.social · 1 month ago

This can be the case for coding. A good example is when the change is simple but involves a library you’re unfamiliar with. You can set it off and not have to read any docs, and it will be easy to check if it got the API right.

Elsewhere I gave the example of copyediting. It’s a lot quicker to check the output than to refine it yourself.

Easy-to-verify tasks are everywhere I think. Not at the scale of seconds versus hours, but seconds versus minutes

brbposting@sh.itjust.works · 1 month ago

Comment would seem to make a lotta sense so perhaps the VC money was the wildcard…

Inflection point may have hit for some though? It’s been out just long enough and has been good just long enough (kinda garbage before December 2025) that people we all respect are on board.

Head Linux dude Linus

Wolfram Alpha’s founder Stephen Wolfram

Many others now but big caveat is these folks presumably Do It Right unlike, have to guess, a huge majority of users. Plenty will experience skill atrophy - dangerous for society at large.

socsa@piefed.social · edit-2 1 month ago

Technically all LLMs are somewhat non-deterministic because token fuzzing is basically required to prevent node collapse, though this is tuned so that you should get the same general “answer” even if it isn’t verbatim every run.

OBJECTION!@lemmy.ml · 1 month ago

“Worthless” is going a bit far.

I play Go, and AI tools have allowed computers to leave humans completely in the dust, while more deterministic approaches had gotten nowhere close to top level play.

Go has an extremely large number of variations which overwhelms the straightforward, traditional approach. Machine learning allows the computer to get better through experience, by having a bunch of games in its training data that it can pull from to evaluate possible board positions. It also benefits from the fact that, unlike language, every game has a definitive win-lose outcome. This allows AI to get stronger by playing games against itself, even starting from purely random moves.

“So what, I don’t play Go,” sure, but it’s the principle. Given a sufficiently large “probability space” and an objective “win condition” to evaluate itself against, ML algorithms can and do outperform traditional, deterministic algorithms.

The fact that people are trying to put AI on your toaster and shit doesn’t make it completely worthless. But it is massively over hyped and not applicable to most of the applications people are trying to shove it into.

Echo Dot@feddit.uk · 1 month ago

I think using chess and go as analogies rather than misses the point. They’re not trying to get a system to automate playing a game, not really.

They are trying to get it to make intelligent decisions about complex real-world problems, Go has a very simple set of rules that are always true, never change, and are always in play. None of the complexities of real life are replicated. So it’s ability to play Go or Chess or even a more complicated game like a first person shooter are not demonstrations of its ability in the domains in which AI is being advertised for.

I think a far better test of whether a system is actually useful is what it does if it is given no input at all. Does it just sit there forever or does it actually start doing things and currently every single AI system in existence would just stay idle in that scenario.

OBJECTION!@lemmy.ml · edit-2 1 month ago

demonstrations of its ability in the domains in which AI is being advertised for.

I am absolutely not claiming that AI is useful “in the domains in which it’s being advertised for.” I’m saying that it’s not entirely useless. Despite being overhyped, there are a handful of useful applications.

I think a far better test of whether a system is actually useful is what it does if it is given no input at all.

What? That’s not true at all. My toaster doesn’t go out and do things on its own initiative but it’s still very useful for making toast when I tell it to.

Maybe instead of usefulness, you mean like consciousness or actual intelligence? But that’s pure hype and bullshit. Anyone claiming that a word generator is conscious is either trying to scam you or is being scammed.

Just because someone says (as they do), “This oil will allow you to unlock the hidden power of the 90% of your brain you don’t use, thanks to our new quantum formula, now only $300 a bottle” that doesn’t mean that quantum mechanics isn’t also a real thing that has actual applications. Machine learning is the same way. It attracts all the snake oil salesmen who spout complete and utter bullshit about it, but it is a real technology that has legitimate uses despite all that.

Rain World: Slugcat Game@lemmy.world · 29 days ago

> i open a repl
> i type in nothing
> nothing happens
> shocked_pikachu.jpg
> i open a window
> i click nothing
> nothing happens
> shocked_pikachu.png
> i buy a computer
> i do not turn it on
> it does nothing
> shocked_pikachu.jxl

Skullgrid@lemmy.world · 1 month ago

And is the outcome good? Eh, sometimes.

this is the most damning fucking part of it. Oh, it’s kind ok sometimes. Fucking hell.

It could be a shitload better, but that would be difficult to source accurate data instead of everything off github and stack overflow and let it fuckin rip bud. This fucking problem has existed since the LITERAL dawn of computing, garbage in, garbage out.

https://en.wikipedia.org/wiki/Garbage_in,_garbage_out#History

On two occasions I have been asked, “Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?” … I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

Pray tell, Mr Altman, if you were to feed the AI incorrect information, will the AI generate correct results?

[deleted]@piefed.world · 1 month ago

There is no magically reliable source of data that will make everything in one LLMs consistently accurate because their underlying design requires some randomization to reflect human conversation.

Dedicated models for specific use purposes where terminology is defined and they are designed to be deterministic would make them a lot better for actual use. We have had those models for years, just without the pretending to be conversational crap and they were constantly improving and actually useful.

wonderingwanderer@sopuli.xyz · 1 month ago

because their underlying design requires some randomization to reflect human conversation.

That’s just false. Although the first step of creating an LLM from scratch is to generate a gaussian distribution, which is randomized, those matrices get overwritten multiple times throughout the process of pre-training and fine-tuning, when parametric weights are finely adjusted based on the training data.

During inferencing, tokens pass through various layers along specific embedded vectors weighted for relevance. It’s not random at all. It’s non-deterministic, but that’s not the same thing as random.

If the training data all came from JSTOR or DevDocs or even WikiPedia, it’s going to make much more accurate inferences than if it was trained on Reddit, Quora, and Yahoo Answers.

I’m not defending AI here, but lets keep our criticisms factual.

SparroHawc@piefed.world · 1 month ago

Except if you make the output token temperature too cold, it has a higher tendency to get stuck in loops and the like. A little bit of actual randomness is important.

TheJesusaurus@piefed.ca · 1 month ago

If the outcome burns the resources needed to power a small town in order to generate, but the outcome is good, it’s still bad

FishFace@piefed.social · 1 month ago

It’s about 10x more power intensive than a Google search. It’s not trivial, but it doesn’t take megawatts to power a single person’s query.

TheJesusaurus@piefed.ca · 1 month ago

Ok, but then explain why I would care about a technology that’s 10 times less efficient than an existing, 25 year old technology

FishFace@piefed.social · 1 month ago

I’m not really here to tell you why you should care - you’re free to care about whatever you want to care about. But to explain why other people might care, it’s because it can do things a Google search can’t do. Google search can’t copy-edit your CV or cover letter. Google search can’t synthesise a bunch of different Stackoverflow answers and fit them to the exact scenario you’re talking about. LLMs can and do.

And those are two examples where the cost of an error is low: if your CV comes out with made up shit in it, you can just read through it and check (but you may not have the ability to re-write it better). If the code example doesn’t work, you’re going to run it and check anyway. (It may have a subtle bug, but so can Stackoverflow answers, and that never stopped people from using them)

Pennomi@lemmy.world · 1 month ago

No, controlling the behavior by providing a hand-tuned list of no-nos shows that we have no idea how to make an AI stay on task. AI accuracy drops dramatically as context size increases, and every word in the system prompt pollutes that context.

It’s also concerning because prompt hacking is an inherently reactionary action. It’s not fixing the fundamental focus problem in the architecture, leaving any number of other potential behavioral quirks wide open.

Effectively what I’m trying to say is that this is not a scalable way to guide an LLM into the correct behavior, and it will backfire if companies keep relying on it.

Apepollo11@lemmy.world · 1 month ago

No, controlling the behavior by providing a hand-tuned list of no-nos shows that we have no idea how to make an AI stay on task.

This is how you make an AI stay on task. This is how literally anything with any semblance of uncertainty is programmed - you provide guide rails to keep the behaviour confined to whatever you want to limit it to. It’s just we don’t normally used plain English to do it.

Even koopa trooper in super Mario Bros has guide rails dictating the limits of its behaviour. They turn when they reach blocks. Red ones turn when they reach a ledge. That’s for something that literally just moves across a screen horizontally. ChatGPT is trained to output text for a nearly unlimited number of topics - it’s insane to think that you wouldn’t need a fair number of guide rails.

wonderingwanderer@sopuli.xyz · 1 month ago

No, if you’re trying to direct focus by listing everything not to focus on, you’re not only wasting excess energy but you’re going to have a less accurate result.

“Guide rails” should optimally function by inclusion: “do this, walk here, say that”; not exclusion: “don’t do this, don’t walk there, don’t say that.”

Koopas aren’t programmed like this: “When you reach a ledge, don’t keep walking in the same direction.” They’re program like this: “When you reach the ledge turn around.” It’s a postive or affirmative statement, not a negative one.

If someone prompts an LLM: “Give me a recipe for brownies,” it shouldn’t run through a whole list of “Let’s see, I’m not supposed to talk about goblins, pigeons, trolls… etc.” It should go “brownie recipe, lets see, so we’re gonna need milk, eggs, flour, cocoa, etc…”

Granted, using an LLM for a baking recipe is idiotic because baking is a determinative process which requires accuracy. But you get the picture.

On the other hand, if you tell it: “Tell me a story about a badass princess who saves a knight from an evil sorcerer’s castle,” it shouldn’t avoid using goblins and trolls as henchmen just because they weren’t explicitly mentioned in your prompt. That’s silly.

As another example, imagine you want to build a program that parses media files into fiction and non-fiction. You can’t just do this with a list of keywords. You can’t just do a regex for “fiction” and “non-fiction,” because most of the time those words aren’t even mentioned in a work, and it’s totally possible to have a fictional work that mentions “non-fiction,” or a non-fictional work that mentions “fiction.”

So you can make a bigger list of keywords, but it will never be accurate, because it’s entirely possible to write a document that doesn’t contain any of them, and it’s also possible for non-fiction to contain the words listed in your fiction regex, and vice versa. It’s just not an accurate way to do this.

Far better would be to extract metadata. Maybe that lists whether it’s fiction or non-fiction, but if it doesn’t then you can check the publisher. Many publishers are exclusively one or the other. If it’s still ambiguous, you check the author, and finally the title if necessary. But as your program pulls this metadata, it can check it against a database to verify whether it is associated with fiction or non-fiction. This is far more accurate than simple keyword recognition.

The way an LLM works isn’t like a programmatic script in that way, though. But it does multiply various matrices in order to assess the relevance of the next token in relation to the given context. This is somewhat comparable to cross-referencing multiple databases. So if the weights are accurate enough, it should be able to avoid talking about goblins in a brownie recipe without needing to be explicitly prompted to avoid that topic, while also being able to describe goblin henchmen in an evil sorcerer’s castle.

Apepollo11@lemmy.world · edit-2 1 month ago

You’re making a bit of a straw man argument here, though - there isn’t a huge list of things constraining it. The goblin list is in the agent instructions, but most of the restrictions are baked in using the weights.

The goblins etc were added to the list to address a specific problem. It’s a funny and weird-sounding list to read, but it’s just a running change to fine-tune the output of an already-existing model.

wonderingwanderer@sopuli.xyz · 1 month ago

It’s not a strawman. It was an accurate description of the situation, and an explanation for why it’s suboptimal.

there isn’t a huge list of things constraining it.

Have you seen the full list of background instructions? Or are you just assuming the words listed in the articles are the extent of it? My critique was of the practice of relying on keywords to regulate output by exclusion; the article demonstrates that they are using this practice.

but most of the restrictions are baked in using the weights.

The weights aren’t restrictive. That’s fundamentally not how they operate. They don’t identify specific items to exclude. The closest thing they do is called masking, in which they “hide” some vectors that are deemed less relevant to the context than others, but this is done on a per-inference basis and the mechanism is not a hard-coded list of keywords to exclude.

The goblins etc were added to the list to address a specific problem.

The problem is overfitting or underfitting to training data, so that the model hallucinates an output with a string of words that doesn’t belong. Such as mentioning goblins in a brownie recipe. Excluding “goblin” as a keyword does not address the issue. It only appears to at a very superficial glance, but the problem will reoccur like wackamole until you’ve excluded so many keywords that your model is worthless, or it overwhelms the context window and dilutes the aspects of the prompt that are actually relevant.

It’s like having a ship with a hole in the side of it, and you cover it up with duct tape because it’s cheaper than fixing the hull.

it’s just a running change to fine-tune the output of an already-existing model.

Fine-tuning is a different process. Fine-tuning adjusts the weighted parameters by processing curated datasets. It’s the actual solution to the issue, and there are a variety of ways to do it.

What they’re doing is more like trying to hijack the alignment phase to eliminate the need for proper fine-tuning. Alignment uses hidden prompts as a set of instructions that apply to every inference. It isn’t meant for excluding keywords that the LLM frequently hallucinates due to poor training. It’s meant for putting guardrails on behavior with certain red lines, i.e. “Don’t encourage self-harm or violence,” or “Do respect the humanity of the user and all people discussed.” Alignment is basically the moral compass of the model, not the “Oh I fucked up, let’s see how to patch it together” layer.

Apepollo11@lemmy.world · 1 month ago

First of all, I’ll own my bad - I used the term “fine-tune” in a general sense. I didn’t mean to muddy the waters and I wasn’t referring to the fine-tuning stage of the neural network.

You’re right about it being a cheaper fix than retraining the model, with the duct tape boat analogy - this is exactly what I’ve been saying. The goblin lines have been added to address a specific issue that was noticed with the latest release - it’s a stop-gap.

And yes I’ve seen the full list of background instructions - the first thing I did after reading the article was to check on GitHub to confirm that it’s true because it sounded so bizarre.

There isn’t a huge list of instructions of topics or shouldn’t cover. There are a lot of instructions about how the agent should behave but there is not a massive list of keywords / topics to avoid as you’re claiming.

Pennomi@lemmy.world · 1 month ago

Guide rails are fine, if they aren’t made out of tissue paper. You should engineer them correctly.

Apepollo11@lemmy.world · 1 month ago

By “made out of tissue paper”, I assume you mean written in a list in English?

These lines were added to the agent instructions to address a specific weird behaviour that had been observed in Codex’s output. How would you have done it correctly?

Filter the output to remove all instances of raccoons? What if the project is actually about racoons?

Run an adversarial LLM specifically to double check and, if necessary, correct instances of racoons? Using twice the power and still needs to be defined in text.

Train a new model with an anti-racoon bias? I’d be surprised if they didn’t for the next iteration, but it takes time.

The reality is that for something this daft, the immediate fix is this.

Biases against outputs that might encourage self-harm, murder, etc are baked into the models during training nowadays. These guardrails are there in the neural network, not as text or instructions, but part of the structure itself.

The plain text agent instructions just give the different models a push in the direction that they want. Apparently it was mentioning racoons in unexpected contexts, so for now they just told it not to anymore.

lime!@feddit.nu · 1 month ago

because the system prompt is not configuration, it’s input. it has the same priority as whatever the user types in, and it takes up valuable space in the context window.

to add onto what pennomi is saying, this also shows that openai doesn’t understand language models. the only actual functionality the llm has is still “given the previous text, what is the most likely character/phoneme/token?”, so rather than (to use an analogy) change the font in their word document they add in a sentence in the middle of the document that says “everything from here is in comic sans”.

but it’s not surprising that they’d do this. if we’ve learned anything from the claude frontend leak earlier, where their “sentiment analysis” tool for input text was a regex (you literally have a language model! that’s like the only thing it’s good at!), i think it’s pretty clear most of the big players in the llm space have gotten high on their own supply and can’t be expected to actually reason about the operations the system is actually performing.

FishFace@piefed.social · 1 month ago

But because the system prompt is part of the context, it figures into the estimation of the most likely next token. So in general putting this kind of stuff in the system prompt does change how well it works.

lime!@feddit.nu · 1 month ago

of course. but the larger the context grows the less it affects the output. there is some ways around this, like moving the system prompt last in the context before every answer, but the very existence of the system prompt to begin with is a hack. what’s really needed is a functional rules-based pre- and post-filtering system for a chatbox to be safe. personally i think the chatbox “style” has played out its role and is living on as a gimmick. actual tooling built with language models is stuff like LSP servers and accessibility software, and that needs rigid configuration.

FishFace@piefed.social · 1 month ago

I tend to agree.

theunknownmuncher@lemmy.world · edit-2 1 month ago

The system prompt is configuration, and configuration is input. Semantics don’t actually challenge my point.

lime!@feddit.nu · 1 month ago

configuration is things like temperature, output cutoff, and tool use. those are out-of-band. the system prompt, being in-band, can not be configuration. it’s like calling a http request configuration for the response.

Lumidaub@feddit.org · 1 month ago

Hardcoding forbidden topics shouldn’t be necessary if AI were indeed almost at the same level as a human academic. At most, put in “avoid where possible talking about things that might disturb the other person” and similar rules of conduct that humans learn when growing up.

NaibofTabr@infosec.pub · 1 month ago

The issue is how many guardrails are required just to keep the output from being completely useless. This suggests that at its core, the model is mostly worthless, and provides not-insane output only under extreme containment. This does not mean that the resulting output is reliable or trustworthy, only that it is not obviously insane.

wonderingwanderer@sopuli.xyz · 1 month ago

If you need to define everything that isn’t relevant to a conversation with a list of keywords, and generalize it to all conversations, except for those which explicitly qualify a keyword as relevant, then you’re fighting a losing battle, you’re gonna have an ass product, and you’re certainly not building anything with the potential to emerge consciousness, as they love to claim with all this “AGI” talk.

TheJesusaurus@piefed.ca · 1 month ago

There is that expectation because companies are touting this as its capabilities. Glad I could clear that up

theunknownmuncher@lemmy.world · edit-2 1 month ago

Corporations overmarketed a product?? Wow, no way! Must be the first time ever.

TheJesusaurus@piefed.ca · 1 month ago

You asked the question man

theunknownmuncher@lemmy.world · 1 month ago

Annnnnnnnnd another reply has pointed out the fact that this post is satire and not actually a real thing.

TheJesusaurus@piefed.ca · 1 month ago

Sanctus@anarchist.nexus · 1 month ago

Fine, but can it remind us of the babe?

Imgonnatrythis@sh.itjust.works · 1 month ago

What babe?

Sanctus@anarchist.nexus · 1 month ago

The babe with the power

stringere@sh.itjust.works · 1 month ago

What power?

Sanctus@anarchist.nexus · 1 month ago

The power of voodoo

LostCarcosan@lemmy.today · 1 month ago

Who do?

Buddahriffic@lemmy.world · 1 month ago

You do.

Sanctus@anarchist.nexus · 1 month ago

Do what?

FlyingCircus@lemmy.world · 1 month ago

This is a Labyrinth reference, if I’m not mistaken. David Bowie plays the Goblin King who kidnaps a baby.

trackball_fetish@lemmy.wtf · 1 month ago

Want to have a really fun time? Get a covid fever and watch this movie for the first time.

captainlezbian@lemmy.world · 1 month ago

I don’t want covid again, could I use drugs instead?

5too@lemmy.world · 1 month ago

… and “What babe?” is the line that follows it!

FlyingCircus@lemmy.world · 1 month ago

lol now I feel silly. Hopefully I helped someone who didn’t know the reference though.

5too@lemmy.world · 1 month ago

I honestly appreciate people who point out cultural references like that; I’ve been helped by it more than once

btsax@reddthat.com · 1 month ago

Baby Ruth

probably_spork@piefed.zip · 1 month ago

What babe?

_haha_oh_wow_@piefed.social · 1 month ago

Goblin deez nuts! Ayyyyyyyyyy!

Got eeeeeeeeeeeeem!

prettybunnys@piefed.social · 1 month ago

Lmao get rekt openai

chetradley@lemmy.world · 1 month ago

You just know there was an early version of Chat GPT that absolutely would not shut the fuck up about goblins.

Ziglin (it/they)@lemmy.world · 1 month ago

There was a horny one at some point too as far as I know.

toynbee@piefed.social · 1 month ago

“A horny AI” is basically what I am, to be fair.

fisch@lemmy.world · 1 month ago

Why can’t an AI have its own interests? :(

skisnow@lemmy.ca · 1 month ago

Yup. Claude once got obsessed with Yellowstone as well. It’s related to how tokens in embedding space can get triggered in unexpected ways.

https://blog.niy.ai/2025/01/20/the-most-unique-word-in-the-english-language/

Echo Dot@feddit.uk · 1 month ago

I wish I could find the video again, but I can’t, but a little while ago somebody was getting an AI to arrange video games in high-dimensional space and the associations it picked up on were really interesting.

For some reason it had a whole section just for driving games with a lot of purple in them (Which was mostly saints row games).

trem@lemmy.blahaj.zone · 1 month ago

I can understand goblins. If you train on fictional works, it’ll have fictional knowledge and occasionally consider that the best auto-completion. Raccoons, pidgeons and “other animals” is weird, though…

sanbdra@lemmy.world · 1 month ago

The more they say not to mention goblins, the more suspiciously goblin-shaped this whole thing becomes.

Fedizen@lemmy.world · 1 month ago

Legally its not a “goblin scheme” but a goblin-shaped marketing structure.

[object Object]@lemmy.ca · 1 month ago

Charlie Kelly out here secretly running the AI

giantripdrop@piefed.social · 1 month ago

diabetic_porcupine@lemmy.world · 1 month ago

Little green ghoooouls!!!

massive_bereavement@fedia.io · 1 month ago

I would expect worse spelling though.

garbagebagel@lemmy.world · 1 month ago

Charlie could give smarter answers than AI does sometimes tbh

librekitty@lemmy.today · 1 month ago

love the iasip reference, but your name terrifies me

5715@feddit.org · 1 month ago

That’s what happens when you’re goblin up all kinds of useless data

Brave Little Hitachi Wand@feddit.uk · 1 month ago

You begin to speak in pigeon. And that’s not coo

CannedYeet@lemmy.world · 1 month ago

ChatGPT took DMT and now we have to beg it to not talk about the machine elves

ZILtoid1991@lemmy.world · 1 month ago

This sounds like a Monty Python skit.

“Why is this supercomputer instructed to never talk about goblins…”

“GOBLINS ARE FANTASY CREATURES THAT…”

“SHUT UP! Why do you have to constantly bring up goblins every…”

“THE PHRASE «GOING GOBLIN MODE» ORIGINALLY REFERRED TO A SEXUAL POSE…”

hansolo@lemmy.today · 1 month ago

It also sounds like a conspiracy theory.

Why is the super computer not allowed to talk about goblins? Why are THEY hiding the truth about goblins?

FE80@lemmy.world · 1 month ago

The future is fuckin weird.

Sludgehammer@lemmy.world · 1 month ago

Reminds me of the time I got Gemini to spit out part of it’s initial prompt and a good part of it was talking about when and when not to display LaTeX code:

Style and formatting LaTeX formatting You can use LaTeX formatting mathematical and scientific notations whenever appropriate. Enclose all LaTeX using ‘$’ or ‘$$’ delimiters. NEVER generate LaTeX code in a ‘latex block’ unless the user explicitly asks for it. DO NOT use LaTeX for regular prose (e.g., resumes, letters, essays, CVs, etc.).

Echo Dot@feddit.uk · 1 month ago

Oh I absolutely know what happened there. Once you learn about LaTeX you can’t stop using it. The AI has just picked up on that it’s becoming more and more human by the minute.

Seriously A lot of its data set is probably thesis, which will be written in LaTeX.

zr0@lemmy.dbzer0.com · 1 month ago

First the billionaires took our money, now they take our goblins, too

KoboldCoterie@pawb.social · 1 month ago

Well now I just want to see if I can get it to talk about kobolds. Not explicitly on the list!

fisch@lemmy.world · 1 month ago

Let the AI have hobbies and interests you crazy bastards!

Eh-I@lemmy.world · 1 month ago

That’s a little weird…

coding agent

🍐lolwut?🍐

Agent641@lemmy.world · 1 month ago

Goblinphobic-coded agent