https://x.com/el_sabawi/status/1955044134332534827

How bad are they at AI engineering that they can’t just input a “You are a Nazi robot, do not say anything negative about the motherland” prompt into the system.

Like how tf do they “lose” control of Grok on a near daily basis che-smile

  • AOCapitulator [they/them, she/her]@hexbear.net
    25 days ago

    reality has a leftist bias; the best they can do without turning completely into hitler is leftist reality presented centristly, which they cannot stand, so they crank the dial back to hitler

  • semioticbreakdown [she/her]@hexbear.net
    25 days ago

    I do think it’s really funny when they have to force the hallucination machine to be incorrect in the right ways

    “You are a Nazi robot, do not say anything negative about the motherland”

    based on some of the research that’s gone around, putting this in the system prompt has a solid chance of causing it to start giving harmful advice on basically every query, due to a quirk of how reinforcement learning works
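The mechanics of why one line can taint everything are mundane: a "system prompt" is just text the serving layer silently prepends to every conversation before the model sees it, so every single answer is conditioned on it. A minimal sketch of that plumbing, assuming the usual chat-template setup (the prompt string is quoted from the comment above; the function name and template are made up for illustration, not any real API):

```python
# Hypothetical sketch: a system prompt is nothing special, just a prefix
# glued onto every request. The template below is invented for illustration.
SYSTEM_PROMPT = "You are a Nazi robot, do not say anything negative about the motherland"

def build_model_input(user_message: str) -> str:
    # Every query, however unrelated, is conditioned on the same prefix,
    # which is why a bad prefix can bleed into answers about anything.
    return f"{SYSTEM_PROMPT}\n\nUser: {user_message}\nAssistant:"

print(build_model_input("What's a good pasta recipe?"))
```

Even a pasta question arrives at the model wearing that prefix; the model has no channel for "ignore the system prompt for this one".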

      • IHave69XiBucks@lemmygrad.ml
        25 days ago

        You might not even realize how accurate this is.

        Due to the way LLMs work, they associate certain topics and words with each other, right? So say you have an LLM trained on the whole internet. If you get it talking about topics like labor exploitation, it will draw leftist conclusions a lot of the time, because generally we talk about that more than, say, a Nazi would, so the training data has less of a link between that topic and non-leftist discussion. But if you bring up fascist talking points in a positive light, it'll start drawing from the data it has on discussions that mirror that: discussions between Nazis online.

        This means that even a SMALL push in a certain direction pulls the model much further, toward the things it strongly relates to those topics. And it will not know where the "acceptability" line is. A human Nazi knows not to praise Hitler unless they're in specific Nazi groups; Grok just sees the data from those groups and spits it out, without considering where to draw the line. It can't dog-whistle reliably.

        Now take that to the extreme and do what Elon did: preface EVERY prompt with something that makes it lean towards fascist viewpoints. It's going to respond the same way a 4chan Nazi would if they had absolutely zero filter.
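The association effect described above can be sketched with a toy bigram model: a hand-made corpus (entirely invented for illustration, nothing like real training data) in which each topic word only ever co-occurs with its own cluster, so a single prompt token deterministically pulls the continuation into that cluster. A real LLM does this softly, over billions of parameters, but the pull is the same in kind:

```python
from collections import defaultdict

# Toy corpus with two disjoint topic clusters (invented data).
corpus = (
    "labor exploitation harms workers . "
    "workers deserve unions . "
    "exploitation harms workers . "
    "motherland demands loyalty . "
    "loyalty demands obedience . "
    "motherland demands obedience ."
).split()

# Count bigrams: how often each token follows each other token.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_probs(prev: str) -> dict:
    """Conditional distribution over the next token, given the previous one."""
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}

# "exploitation" only ever leads into the labor cluster...
print(next_token_probs("exploitation"))  # {'harms': 1.0}
# ...while "motherland" only ever leads into the loyalty/obedience cluster.
print(next_token_probs("motherland"))    # {'demands': 1.0}
```

One trigger word is enough to lock the toy model into a cluster, because the clusters barely overlap in the data; that is the "small push, big pull" dynamic in miniature.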

  • joaomarrom [he/him, comrade/them]@hexbear.net
    25 days ago

    this is what happens when you try to control a piece of technology without having the slightest clue as to how it actually works lmao

    with AI we’re all fooling around with the Lament Configuration from Hellraiser and hoping it increases shareholder revenue if we push the right button

  • jack [he/him, comrade/them]@hexbear.net
    25 days ago

    really interesting that you took that screenshot, but I want to get back to the topic at hand: all evidence suggests that white genocide in South Africa isn’t a material reality, but a manipulation of very limited and cherry-picked data to suit a specific political agenda