• Jankatarch@lemmy.world · ↑15 · 6 hours ago

    Maintainers’ only responsibility is to ensure quality; they shouldn’t have to check for rogue AI submissions.

    Tho I still miss consistent fucking weather so year of the netbsd?

    • MoogleMaestro@lemmy.zip · ↑11 ↓1 · 5 hours ago

      Microsoft needs to try to ruin Linux somehow; it can’t just hurt Windows 11 with AI slop code, it needs to expand its efforts to other systems.

    • MoogleMaestro@lemmy.zip · ↑7 · 5 hours ago

      It’s definitely financially motivated. Linus said himself in that one LTT video that AI has been very lucrative for Linux, as it has expanded investment from companies that normally wouldn’t give a fuck (he name-dropped NVidia specifically).

    • Horsey@lemmy.world · ↑14 ↓9 · 6 hours ago

      Saying no to code just because it was AI generated is like saying you can’t trust Excel to be your bookkeeper. It’s a tool, and what happened here is exactly that: the person using the tool was at fault.

      • GreenBeanMachine@lemmy.world · ↑12 ↓2 · 4 hours ago

        Some good points, but poor comparison. Excel is deterministic, AI is not.

        Yes, you can ALWAYS trust Excel, after configuring it correctly ONCE. You can NEVER trust AI to produce the same output given the same inputs. Excel never hallucinates, AI hallucinates all the time.

        • Feyd@programming.dev · ↑5 ↓5 · 1 hour ago

          You can actually set it up to give the same outputs given the same inputs (temperature = 0). The variability is on purpose
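
          A rough sketch of what the temperature knob does (the three-token logits here are made up); at temperature 0, sampling collapses to picking the single highest-scoring token, which is the repeatability being described:

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Sample a token index from logits; temperature 0 means greedy argmax."""
    if temperature == 0:
        # Greedy decoding: always pick the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature: higher temperature flattens the distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5]
# Temperature 0 gives the same token every time, regardless of seed.
greedy = {sample_with_temperature(logits, 0, random.Random(i)) for i in range(100)}
print(greedy)  # {0}
# Temperature 1 can give different tokens across runs with different seeds.
sampled = {sample_with_temperature(logits, 1.0, random.Random(i)) for i in range(100)}
print(sampled)
```

          With a nonzero temperature, the same seed still reproduces the same choice; the variability comes from the sampler, not the weights, which is the sense in which it is “on purpose”.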

          • GreenBeanMachine@lemmy.world · ↑5 ↓1 · 1 hour ago

            Not true. While setting temperature to zero will drastically reduce variation, it is still only a near-deterministic and not fully deterministic system.

            You also have to run the model with the input to determine what the output will be, no way to determine it BEFORE running. With a deterministic system, if you know the code you can predict the output with 100% accuracy without ever running it.
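
            One concrete source of that residual variation, assuming GPU-style parallel inference: floating-point addition is not associative, so the order in which partial sums are reduced can change results bit-for-bit, and the winning token can flip when two logits are nearly tied. A minimal illustration:

```python
# Floating-point addition is not associative: regrouping the same three
# numbers produces two different IEEE-754 doubles. Parallel reductions on
# a GPU can sum in different orders from run to run, which is one reason
# temperature-0 inference is only near-deterministic in practice.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)   # False
print(left, right)
```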

            • Feyd@programming.dev · ↑3 ↓4 · 59 minutes ago

              You also have to run the model with the input to determine what the output will be, no way to determine it BEFORE running. With a deterministic system, if you know the code you can predict the output with 100% accuracy without ever running it.

              This is not the definition of determinism. You are adding qualifications.

              I did look it up, and I see now there are other factors that aren’t under your control if you’re using a remote system, so I’ll amend my statement: you can have deterministic inference systems, but the big ones most people use cannot be configured to be deterministic by the user.

              • GreenBeanMachine@lemmy.world · ↑5 ↓1 · 43 minutes ago

                Deterministic systems are always predictable, even if you never ran the system. Can you determine the output of an LLM with zero temperature without ever having run it?

                And even disregarding the above, no, they are still NOT deterministic systems, and can still give different results, even if unlikely. The variation is NOT absolute zero when the temperature is set to zero.

                • Feyd@programming.dev · ↑1 · 17 minutes ago

                  Deterministic systems are always predictable, even if you never ran the system. Can you determine the output of an LLM with zero temperature without ever having run it?

                  You don’t have to understand a deterministic system for it to be deterministic. You are making that up.

                  And even disregarding the above, no, they are still NOT deterministic systems

                  I conceded that setting temperature to 0 for an arbitrary system (including all the remote ones most people are using) does not mean it is deterministic after reading about other factors that influence inference in these systems. That does not mean there are not deterministic implementations of LLM inference, and repeating yourself with NO additional information and using CAPS does NOT make you more CORRECT lol.

  • Seth Taylor@lemmy.world · ↑38 ↓11 · edited · 13 hours ago

    Bad actors submitting garbage code aren’t going to read the documentation anyway, so the kernel should focus on holding human developers accountable rather than trying to police the software they run on their local machines.

    “Guns don’t kill people. People kill people”

    Torvalds and the maintainers are acknowledging reality: developers are going to use AI tools to code faster, and trying to ban them is like trying to ban a specific brand of keyboard.

    The author should elaborate on how exactly AI is like “a specific brand of keyboard”. Last I checked a keyboard only enters what I type, without hallucinating 50 extra pages. And if AI, a tool that generates content, is like “a specific brand of keyboard”, does that mean my brain is also a “specific brand of keyboard”?

    I get their point. If you want to create good code by having AI create bad code and then spending twice the time to fix it, feel free to do that. But I’m in favor of a complete ban.

    • Miaou@jlai.lu · ↑13 · 5 hours ago

      The (very obvious) point is that this cannot be enforced. So might as well deal with it upfront.

    • Simulation6@sopuli.xyz · ↑29 ↓1 · 11 hours ago

      The keyboard thing is sort of a parable: it is as difficult to determine whether code was generated in part by AI as it is to determine what keyboard was used to create it.

    • Shayeta@feddit.org · ↑16 ↓2 · edited · 11 hours ago

      AI is a useful tool for coding as long as it’s being used properly. The problem isn’t the tool, the problem is the companies who scraped the entire internet, trained LLM models, and then put them behind paywalls with no options to download the weights so that they could be self-hosted. Brazen, unaccountable profiteering off of the goodwill of many open source projects without giving anything back.

      If LLMs were community-trained on available, open-source code with weights freely available for anyone to host there wouldn’t be nearly as much animosity against the tech itself. The enemy isn’t the tool, but the ones who built the tool at the expense of everyone and are hogging all the benefits.

      • cartoon meme dog@lemmy.zip · ↑2 · 42 minutes ago

        There are hundreds of such LLMs with published training sets and weights available on places like HuggingFace. Lots of people run their own LLMs locally; it’s not hard if you have enough VRAM and a bit of patience to wait longer for each reply.

    • ede1998@feddit.org · ↑4 ↓1 · 12 hours ago

      Last I checked a keyboard only enters what I type

      I’ve had (broken) keyboard “hallucinate” extra keystrokes before, because of stuck keys. Or ignore keypresses. But yeah, that means the keyboard is broken.

  • sonofearth@lemmy.world · ↑45 ↓3 · 15 hours ago

    I am the c/fuck_ai person, but at this point I have made peace with the fact that we can’t avoid it. I still don’t want it doing artsy stuff (image gen, video gen) or being used blindly in critical stuff, because humans are the ones that should be doing that, or should at least have constant oversight. I think the team’s logic is correct here, because there is no way to know if the code is from an LLM or a human unless something there screams LLM or the contributor explicitly mentions it. Mandating the latter seems like a reasonable move for now.

    • DaleGribble88@programming.dev · ↑10 ↓3 · 13 hours ago

      I consider myself to be more pro-AI than not, but I’m certainly not a zealot and mostly agree with the take that it shouldn’t be used in artistic pursuits. However, I love using AI to help me create art. It can give great critiques, often good advice on how to improve, and is great for rapid experimentation and prototyping. I actually used it this weekend to see what a D&D mini might look like with different color schemes before painting it. I could have done the same with Gimp, but it would have taken much longer for worse results, and it was ultimately just a brainstorming session. How do you feel about my AI usage from your perspective? I suppose from an energy-conservation perspective all of it was bad, but I’m more interested in a less trivial take.

      • sonofearth@lemmy.world · ↑5 · 12 hours ago

        Yes, the energy consumption is bad. My main gripe about LLM-generated art is that it will not be original: it will use training data from uncredited artworks to generate it. Art is usually made by humans to express or convey something in a creative way, and LLMs fail at that. What LLMs can actually be helpful at is making learning art more accessible to everyone. Art schools or private art classes can be expensive; this lowers the barrier to entry.

        As for your use of generated art: it might be really beautiful, but it will be very difficult to maintain that style, and even more difficult to convince anyone that it is your style. The artist doesn’t get much recognition with LLM-generated art. Using it for critique also seems stupid, because LLMs will always try to give an objective view rather than a subjective one. Your art won’t trigger an emotion in it, and it might say it is bad or “do this to make it more understandable”; that’s where you lose as an artist.

        My mom likes to paint as a hobby. What she does is search for stuff on Pinterest (which is mostly LLM-generated) and use it as inspiration to do it in her own style, maybe giving it some spin. She keeps all of it for herself.

    • ashughes@feddit.uk · ↑3 · 2 hours ago

      You know what? Just fucking move on top of a fucking mountain and Into the wild yourself.

      Workin’ on it.

    • neclimdul@lemmy.world · ↑38 ↓1 · 5 hours ago

      Fuck the corporate ransacking, the chatbot subscription hellhole, and the general breaking of the internet done under the framing of “AI”.

      Guess that doesn’t really roll off the tongue like “Fuck AI”, but sure, yeah, let’s just move to a mountain instead of pushing for a better world.

      • yabbadabaddon@lemmy.zip · ↑6 ↓23 · 5 hours ago

        Funny how nothing you wrote has anything to do with AI, but with capitalism. But yeah, sure, let’s blame AI instead of the USA, its government, and its oligarchs ruining the world for everybody.

        • quack@lemmy.zip · ↑7 · 4 hours ago

          Obviously capitalism makes pretty much everything worse but let’s not pretend AI wouldn’t have issues without capitalism too.

            • quack@lemmy.zip · ↑9 · 4 hours ago

              You’re very angry for a person who literally used the term themselves a couple of comments ago. What term would you rather use then? It’s colloquial, everyone knows what I’m talking about. Are you the kind of person who gets angry when someone doesn’t call it “GNU/Linux” too?

    • Miaou@jlai.lu · ↑13 ↓1 · 5 hours ago

      People confuse GPTs with AI, but your comment takes the wrong approach: it’s not that AI hate is not deserved, it’s that the hate should be directed towards the chatbots and the associated bubble.

      • imjustmsk@lemmy.ml · ↑5 · 4 hours ago

        Yeah, but when an average person talks about AI they just mean a chatbot or GenAI, right?

      • jackalope@lemmy.ml · ↑3 · 4 hours ago

        “AI” is simply a field of study. There is no true bar for “AI” that GPTs fail, because there is no true bar for AI. A symbolic AI system is as much AI as the most advanced LLM or world model or whatever.

    • Mr_Dr_Oink@lemmy.world · ↑6 · 4 hours ago

      I didn’t think that was the point. “Fuck AI” is just a slogan representing people’s disdain for corporate types who think ChatGPT is literally the second coming of Jesus and is going to save us all. It’s people who are taking LLMs and pretending they can reason and think like humans. People who think they can sack all their staff and replace them with AI. It’s more complex than that. You know that, I am certain you do. AI can do some things very well, and at other things it absolutely falls flat on its face.

      Unless I am misunderstanding, this was never about the blanket boycotting of anything AI; it was more about not pretending it is more than it is and shoving it down the throats of non-consenting consumers.

        • Mr_Dr_Oink@lemmy.world · ↑9 ↓1 · 4 hours ago

          As someone who works in healthcare, in IT, and who has been directly involved in the commissioning of an AI designed to spot skin cancers from pictures taken with special lenses attached to iPhones: no healthcare provider is using these tools in place of doctors. These AI models are incredibly accurate, but the human is still needed to spot false positives. They don’t leave diagnostic decisions up to AI. I can tell you that for a fact.

          • yabbadabaddon@lemmy.zip · ↑2 ↓1 · 4 hours ago

            Same thing with everything related to every single algorithm implementation in every single sector.

    • imjustmsk@lemmy.ml · ↑2 · 4 hours ago

      Fuck AI, anyway.

      The whole AI hype is just making tech-giant whackjobs richer, as well as FUCKING us over in so many ways.

      The world ain’t black and white; you cannot just hate AI, it’s just a general term. But fuck all those mofos tryna make more bucks off of this, as if they weren’t rich already.

      I wonder why they just give away free “intelligence”, as in free AI chatbots that everyone can access, which is so obviously, extremely non-profitable. They keep yapping that they need to make “information” more accessible while throwing money into a hole.

      FUCKING make education more accessible :|

      People I know, most of them rely on texting their little ChatGPT on their phone to get through day-to-day tasks; algorithms choose what they watch, and now large language models decide what they do throughout their lives. We are supposed to learn shit ourselves; if we cognitively offload everything from our brains, we are just making ourselves more stupid.

      TL;DR: that was just a useless and brainless rant on AI lol

      • Washedupcynic@lemmy.ca · ↑1 · 3 hours ago

        People I know, most of them rely on texting their little ChatGPT on their phone to get through day-to-day tasks; algorithms choose what they watch, and now large language models decide what they do throughout their lives. We are supposed to learn shit ourselves; if we cognitively offload everything from our brains, we are just making ourselves more stupid.

        That’s what the oligarchs want. They want us ignorant so we will be good little wage slaves and consumers.

      • yabbadabaddon@lemmy.zip · ↑1 ↓1 · 4 hours ago

        Then fight against what matters: the fucking oligarchs and their fucking piece-of-shit friends at the head of the USA.

  • NewNewAugustEast@lemmy.zip · ↑49 ↓2 · 19 hours ago

    Copilot? You mean the AI whose terms of service say, in bold and explicit terms, “for entertainment purposes only”?

    Which is why it’s in the title and not the article? EntertainBait?

  • CanIFishHere@lemmy.ca · ↑63 ↓11 · 20 hours ago

    AI is here, another tool to use…the correct way. Very reasonable approach from Torvalds.

    • Newsteinleo@infosec.pub · ↑28 ↓1 · 19 hours ago

      I don’t have a problem with LLMs as much as the way people use them. My boss has offloaded all of his thinking to LLMs to the point he can’t fix a sentence in a slide deck without using an LLM.

      It’s the people that try to use LLMs for things outside their domain of expertise that really cause the problems.

      • InternetCitizen2@lemmy.world · ↑10 ↓2 · 16 hours ago

        This is a big point. People need to understand that LLMs are more like a fancy graphing calculator: they are very good and can handle multiple things, but it’s on you to understand why the calculation is meaningful. At a certain point no one wants to see your long division or factorials. We want the results, and for students and professionals to focus on the concept.

        • NekoKoneko@lemmy.world · ↑2 · 7 hours ago

          I get the metaphor but it’s not a great one for AI in mathematics especially. A statistical word generator is not going to perform reliable math and woe to anyone who acts otherwise.

          I would call it an autistic sycophantic savant with brain damage. It’s able to perform apparent miraculous feats of memory and creativity but then be unable to tell reality from fiction, to tell if even the simplest response is valid, and likely will lie about it to make itself seem more competent to please you.

          If you have a use for an assistant like that, then great. But a calculator - simple and cheap and reliable - it definitely is not.

      • NotMyOldRedditName@lemmy.world · ↑4 ↓1 · 16 hours ago

        It’s the people that try to use LLMs for things outside their domain of expertise that really cause the problems.

        That seems too general. I’m a mobile developer, and sometimes I need a simple script outside my knowledge area. I needed to scrape a website recently, not for anything serious, but to save me time. Claude wrote it and it works. It’s probably trash code, but it works and it helped. But you wouldn’t want me using Claude to do important work outside my specific area of focus either, or I’m sure I’d cause problems.
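
        For what it’s worth, the kind of throwaway scraper described above can be a few dozen lines of standard library. A hypothetical sketch (the HTML string here stands in for a page fetched with urllib.request):

```python
from html.parser import HTMLParser

class LinkScraper(HTMLParser):
    """Collect every href from <a> tags in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Stand-in for a page fetched with urllib.request.urlopen(url).read().decode()
page = '<html><body><a href="/a">A</a><p>text</p><a href="/b">B</a></body></html>'
scraper = LinkScraper()
scraper.feed(page)
print(scraper.links)  # ['/a', '/b']
```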

        • Newsteinleo@infosec.pub · ↑2 · 10 hours ago

          I’m talking about accountants who now think they can create software, or engineers who think they can now write legal briefs for court.

      • CanIFishHere@lemmy.ca · ↑4 ↓2 · 17 hours ago

        Very frustrating for sure. Like any tool, it’s up to humans to know when the tool is useful.

        • filcuk@lemmy.zip · ↑3 · 16 hours ago

          Partly a marketing issue.
          Companies keep advertising their new AIs as destroyers of worlds, as things too dangerous to even release.
          As with anything else, the average user will have only the most surface-level understanding of the tool.

    • null@lemmy.zip · ↑6 · 19 hours ago

      Clickbait got me. No mention of “Yes copilot” which I assumed was a joke anyway.

  • Blue_Morpho@lemmy.world · ↑256 ↓3 · 1 day ago

    The title of the article is extraordinarily wrong, which makes it clickbait.

    There is no “yes to copilot”

    It is only a formalization of what Linus said before: all AI is fine, but a human is ultimately responsible.

    " AI agents cannot use the legally binding “Signed-off-by” tag, requiring instead a new “Assisted-by” tag for transparency"

    The only mention of copilot was this:

    “developers using Copilot or ChatGPT can’t genuinely guarantee the provenance of what they are submitting”

    This remains a problem that the new guidelines don’t resolve: even using AI as a tool and having a human review it still means the code the LLM output could have come from non-GPL sources.
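
    For reference, the tagging scheme quoted above works at the level of commit trailers. A sketch of what a patch might carry under the new policy (the subject line, names, and exact tool string are illustrative, not taken from the article):

```
<subject: one-line summary of the change>

<commit body explaining the change>

Assisted-by: <AI tool name and version>
Signed-off-by: Developer Name <dev@example.org>
```

    The point of the split is that Signed-off-by stays a legally meaningful attestation by a human, while Assisted-by is pure disclosure.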

    • lechekaflan@lemmy.world · ↑8 · 12 hours ago

      The title of the article is extraordinarily wrong, which makes it clickbait.

      It’s a pain in the ass with some of those fucking tech/video/showbiz news outlets, and then there are rules in some fora where you cannot make “editorialized” post titles, even though it’s so tempting to correct the awful titling.

    • Fmstrat@lemmy.world · ↑2 · 10 hours ago

      Because even using AI as a tool and having a human review it still means the code the LLM output could have come from non GPL sources.

      I get why they are passing this by though, since you don’t know the provenance of that Stack Overflow snippet, either.

    • marlowe221@lemmy.world · ↑70 · edited · 1 day ago

      Yeah, that’s also my question. Partially because I am a former-lawyer-turned-software-developer… but, yeah. How are the kernel maintainers supposed to evaluate whether a particular PR contains non-GPL code?

      Granted, this was potentially an issue before LLMs too, but nowhere near the scale it will be now.

      (In the interests of full disclosure, my legal career had nothing to do with IP law or software licensing - I did public interest law).

      • wonderingwanderer@sopuli.xyz · ↑13 ↓1 · 22 hours ago

        If it’s flagged as “assisted by <LLM>” then it’s easy to identify where that code came from. If a commercial LLM is trained on proprietary code, that’s on the AI company, not on the developer who used the LLM to write code. Unless they can somehow prove that the developer had access to said proprietary code and was able to personally exploit it.

        If AI companies are claiming “fair use,” and it holds up in court, then there’s no way in hell open-source developers should be held accountable when closed-source snippets magically appear in AI-assisted code.

        Granted, I am not a lawyer, and this is not legal advice. I think it’s better to avoid using AI-written code in general. At most use it to generate boilerplate, and maybe add a layer to security audits (not as a replacement for what’s already being done).

        But if an LLM regurgitates closed-source code from its training data, I just can’t see any way how that would be the developer’s fault…

        • sem@piefed.blahaj.zone · ↑8 ↓2 · 13 hours ago

          Pretty convenient.

          This is how copyleft code gets laundered into closed source programs.

          All part of the plan.

          • wonderingwanderer@sopuli.xyz · ↑1 · 8 hours ago

            How would they launder it? Just declare it their own property because a few lines of code look similar? When there’s no established connection between the developers and anyone who has access to the closed-source code?

            That makes no sense. Please tell me that wouldn’t hold up in court.

            • lagoon8622@sh.itjust.works · ↑2 · 2 hours ago

              Please tell me that wouldn’t hold up in court.

              First tell us how much money you have. Then we’ll be able to predict whether the courts will find in your favor or not

            • sem@piefed.blahaj.zone · ↑2 · 2 hours ago

              First of all, who is going to discover the closed-source use of GPL code and bring a lawsuit anyway?

              Second, the LLM ingests the code and then spits it back out, with maybe a few changes. That is how it benefits from copyleft code while stripping the license.

              Maybe a human could do the same thing, but it would take much longer.

      • Alex@lemmy.ml · ↑38 · 1 day ago

        They don’t, just like they don’t with human-submitted stuff. The point of Signed-off-by is that the author attests they have the rights to submit the code.

        • ell1e@leminal.space · ↑2 · 7 hours ago

          Which I’m guessing they cannot attest, if LLMs truly have the 2-10% plagiarism rate that multiple studies seem to claim. It’s an absurd rule, if you ask me. (Not that I would know, I’m not a lawyer.)

          • Alex@lemmy.ml · ↑3 · 3 hours ago

            Where are you seeing the 2-10% figure?

            In my experience code generation is most affected by the local context (i.e. the codebase you are working on). On top of that a lot of code is purely mechanical - code generally has to have a degree of novelty to be protected by copyright.

    • anarchiddy@lemmy.dbzer0.com · ↑11 ↓1 · 1 day ago

      Yup.

      I would also just point out that this doesn’t change the Linux kernel’s legal exposure to infringing submissions from before the advent of LLMs.

    • TheOctonaut@piefed.zip · ↑4 ↓6 · 17 hours ago

      the LLM output could have come from non-GPL sources

      Fundamentally not how LLMs work, it’s not a database of code snippets.

  • gandalf_der_12te@discuss.tchncs.de · ↑34 ↓2 · 22 hours ago

    I agree. If AI becomes outlawed, it will simply be used without other people knowing about it.

    This approach, at least, means that people will label AI-generated code as such.

    • emmy67@lemmy.world · ↑19 · 22 hours ago

      Maybe. There’s still strong disapproval around it. I can imagine many will still hide it.

  • theherk@lemmy.world · ↑143 ↓4 · 1 day ago

    Seems like a reasonable approach. Make people be accountable for the code they submit, no matter the tools used.

    • ell1e@leminal.space · ↑27 ↓1 · 1 day ago

      If the accountability cannot be practically fulfilled, the reasonable policy becomes a ban.

      What good is it to say “oh yeah you can submit LLM code, if you agree to be sued for it later instead of us”? I’m not a lawyer and this isn’t legal advice, but sometimes I feel like that’s what the Linux Foundation policy says.

      • ViatorOmnium@piefed.social · ↑50 ↓1 · 1 day ago

        But this was already the case. When someone submitted code to Linux they always had to assume responsibility for the legality of the submitted code, that’s one of the points of mandatory Signed-off-by.

        • badgermurphy@lemmy.world · ↑5 ↓20 · 1 day ago

          But now, even the person submitting the license-breaching content may be unaware that they are doing that, so the problem is surely worse now that contributors can easily unwittingly be on the wrong side of the law.

          • Traister101@lemmy.today · ↑45 ↓1 · 1 day ago

            That’s their problem. If they are using an LLM and cannot verify the output they shouldn’t be using an LLM

            • hperrin@lemmy.ca · ↑2 · 17 hours ago

              Nobody can verify that the output of an LLM isn’t from its training data except those with access to its training data.

            • jj4211@lemmy.world · ↑6 · 23 hours ago

              Problem is that, broadly, most GenAI users don’t take that risk seriously. So far no one can point to a court case where a rights holder successfully sued someone over LLM infringement.

              The strongest case is Getty’s, with very blatantly obvious infringement. They lost in the UK, so that’s not a good sign.

            • badgermurphy@lemmy.world
              link
              fedilink
              English
              arrow-up
              4
              arrow-down
              13
              ·
              1 day ago

              It is their problem until the second they submit it; then it is the project’s problem. You can lay the blame for the bad actions wherever you want, but the reality is that the work of verifying the legality and validity of these submissions is being abdicated, crippling projects under the increased workload of going through ever more submissions that amount to junk.

              What is the solution for that? The fact that it is the fault of the lazy submitter doesn’t clean up the mess they left.

              • Traister101@lemmy.today
                link
                fedilink
                English
                arrow-up
                13
                arrow-down
                1
                ·
                1 day ago

                Frankly I expect the kernel dudes to be pretty good about this; their style guides alone are quite strict, and any funny business in a PR that isn’t marked correctly is, I think, likely a ban from making PRs at all. How it worked beforehand, as already stated by others, is the author says “I promise this follows the rules” and that’s basically the end of it. Giving an official avenue for generated code is a great way to reduce the negatives of what will happen anyway. We know this from decades of real-life experience trying to ban things like alcohol or drugs: time after time, providing a legal avenue with some rules makes things safer. Why wouldn’t we see a similar effect here?

                • badgermurphy@lemmy.world
                  link
                  fedilink
                  English
                  arrow-up
                  2
                  ·
                  18 hours ago

                  I do think that some projects will fare better than others, particularly ones like you mentioned, where the team is robust and capable of handling the filtering of increased submissions from these new sources.

                  I believe we are going to end up having to see some new mechanism for project submissions to deal with the growing imbalance between submission volume and work hours available for review, as became necessary when viruses, malware, and spam first came into being. It has quickly become incredibly easy for anyone to make a PR, but not at all easier to review them, so something is going to have to give in the FOSS world.

    • hperrin@lemmy.ca
      link
      fedilink
      English
      arrow-up
      12
      arrow-down
      6
      ·
      1 day ago

      No, it’s not a reasonable approach. Making people the authors of the code they submit is reasonable, because then it can be released under the GPL. AI-generated code is public domain.

      • theherk@lemmy.world
        link
        fedilink
        English
        arrow-up
        9
        arrow-down
        7
        ·
        1 day ago

        I suppose there should be no code generators, assemblers, compilers, linkers, or LSPs then either? Just etching 1s and 0s?

        • hperrin@lemmy.ca
          link
          fedilink
          English
          arrow-up
          5
          ·
          20 hours ago

          The copyright office has made it explicitly clear that those tools do not interfere with the traditional elements of authorship, and that the use of LLMs does. So, if you don’t want to take my word for it, take the US Copyright Office’s word for it.

          • theherk@lemmy.world
            link
            fedilink
            English
            arrow-up
            2
            arrow-down
            1
            ·
            edit-2
            17 hours ago

            As the agency overseeing the copyright registration system, the Office has extensive experience in evaluating works submitted for registration that contain human authorship combined with uncopyrightable material, including material generated by or with the assistance of technology. It begins by asking “whether the ‘work’ is basically one of human authorship, with the computer [or other device] merely being an assisting instrument, or whether the traditional elements of authorship in the work (literary, artistic, or musical expression or elements of selection, arrangement, etc.) were actually conceived and executed not by man but by a machine.” In the case of works containing AI-generated material, the Office will consider whether the AI contributions are the result of “mechanical reproduction” or instead of an author’s “own original mental conception, to which [the author] gave visible form.” The answer will depend on the circumstances, particularly how the AI tool operates and how it was used to create the final work. This is necessarily a case-by-case inquiry. If a work’s traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it. For example, when an AI technology receives solely a prompt from a human and produces complex written, visual, or musical works in response, the “traditional elements of authorship” are determined and executed by the technology—not the human user. Based on the Office’s understanding of the generative AI technologies currently available, users do not exercise ultimate creative control over how such systems interpret prompts and generate material. Instead, these prompts function more like instructions to a commissioned artist—they identify what the prompter wishes to have depicted, but the machine determines how those instructions are implemented in its output.
For example, if a user instructs a text-generating technology to “write a poem about copyright law in the style of William Shakespeare,” she can expect the system to generate text that is recognizable as a poem, mentions copyright, and resembles Shakespeare’s style. But the technology will decide the rhyming pattern, the words in each line, and the structure of the text. When an AI technology determines the expressive elements of its output, the generated material is not the product of human authorship. As a result, that material is not protected by copyright and must be disclaimed in a registration application.

            In other cases, however, a work containing AI-generated material will also contain sufficient human authorship to support a copyright claim. For example, a human may select or arrange AI-generated material in a sufficiently creative way that “the resulting work as a whole constitutes an original work of authorship.” Or an artist may modify material originally generated by AI technology to such a degree that the modifications meet the standard for copyright protection. In these cases, copyright will only protect the human-authored aspects of the work, which are “independent of ” and do “not affect” the copyright status of the AI-generated material itself.

            This policy does not mean that technological tools cannot be part of the creative process. Authors have long used such tools to create their works or to recast, transform, or adapt their expressive authorship. For example, a visual artist who uses Adobe Photoshop to edit an image remains the author of the modified image, and a musical artist may use effects such as guitar pedals when creating a sound recording. In each case, what matters is the extent to which the human had creative control over the work’s expression and “actually formed” the traditional elements of authorship.

            https://www.copyright.gov/ai/ai_policy_guidance.pdf

            What this makes clear is that it certainly isn’t black and white as you say. Nevertheless, automation converting an input to an output simply cannot be the only mechanism used in determining authorship.

            And that wouldn’t change my statement anyway, but rather supports it. The person submitting a patch must be accountable for its contents.

            An outright ban would need to carefully define how an input gets converted to an output, and that may not be so clear. To be effectively clear, one would potentially have to end the use of many tools that have been used for years in the kernel, including snippet generation, spelling and grammar correction, and IDE autocompletion. So such a reductive view simply will not suffice.


            Additionally, copyrightability and licensability are wholly different questions. And it does not violate the GPL to include public domain content, since the license applies to the aggregate work.

            • hperrin@lemmy.ca
              link
              fedilink
              English
              arrow-up
              4
              arrow-down
              1
              ·
              17 hours ago

              If a work’s traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it. For example, when an AI technology receives solely a prompt from a human and produces complex written, visual, or musical works in response, the “traditional elements of authorship” are determined and executed by the technology—not the human user. Based on the Office’s understanding of the generative AI technologies currently available, users do not exercise ultimate creative control over how such systems interpret prompts and generate material. Instead, these prompts function more like instructions to a commissioned artist—they identify what the prompter wishes to have depicted, but the machine determines how those instructions are implemented in its output. For example, if a user instructs a text-generating technology to “write a poem about copyright law in the style of William Shakespeare,” she can expect the system to generate text that is recognizable as a poem, mentions copyright, and resembles Shakespeare’s style. But the technology will decide the rhyming pattern, the words in each line, and the structure of the text. When an AI technology determines the expressive elements of its output, the generated material is not the product of human authorship. As a result, that material is not protected by copyright and must be disclaimed in a registration application.

              That seems very clear to me. Generative AI output is not human authored, and therefore not copyrighted.

              The policy I use also makes very clear the definition of AI generated material:

              https://sciactive.com/human-contribution-policy/#Definitions

              I’m not exactly sure how you can possibly think there is an equivalence between a tool like a spelling and grammar checker and a generative AI, but there’s a reason the copyright office will register works that have been authored using spelling and grammar checkers, but not works that have been authored using LLMs.

              • theherk@lemmy.world
                link
                fedilink
                English
                arrow-up
                2
                ·
                17 hours ago

                Just read the next two paragraphs. Don’t just stop because you got to something that you like. The equivalence I draw is clear. You don’t like it, and that’s okay. But one would have to clarify exactly what the ban entails, and that wouldn’t be as clear as you might think. LLMs only? Transformers specifically? What about graph generation, or other ML models? Is it just ML? If so, is that because a matrix lattice was used to get from input to output? Could other deterministic math functions trigger the same ban? What if a spell checker used RNG to select the best replacement from a list of correct options? What if a compiler introduces an assembled output with an optimization not of the author’s writing?

                Do you see why they say “The answer will depend on the circumstances, particularly how the AI tool operates and how it was used to create the final work. This is necessarily a case-by-case inquiry”?

                And that still affects copyrightability, not license compliance.

                • hperrin@lemmy.ca
                  link
                  fedilink
                  English
                  arrow-up
                  1
                  ·
                  edit-2
                  16 hours ago

                  Do you want to explain to me what, in those two paragraphs, means that the use of spell checkers and LLMs is equivalent with regard to copyrightability? It seems like those paragraphs make it clear that the use of spell checkers is not the same as LLMs.

                  The policy I use bans “generative AI model” output. Generative AI is a pretty well defined term:

                  https://en.wikipedia.org/wiki/Generative_AI

                  https://www.merriam-webster.com/dictionary/generative%20AI

                  If you have trouble determining whether something is a generative AI model, you can usually just look up how it is described in the promotional materials or on Wikipedia.

                  Type: Large language model, Generative pre-trained transformer

                  - https://en.wikipedia.org/wiki/Claude_(language_model)

                  I never said it violates GPL to include public domain code. I’m not sure where you got that from. What I said is that public domain code can’t really be released under the GPL. You can try, but it’s not enforceable. As in, you can release it under that license, but I can still do whatever I want with it, license be damned, because it’s public domain.

                  I did that with this vibe coded project:

                  https://github.com/hperrin/gnata

                  I just took it and rereleased it as public domain, because that’s what it is anyway.

      • ziproot@lemmy.ml
        link
        fedilink
        English
        arrow-up
        3
        arrow-down
        2
        ·
        1 day ago

        Isn’t that the rule? The author has to be a human?

        The new guidelines mandate that AI agents cannot use the legally binding “Signed-off-by” tag, requiring instead a new “Assisted-by” tag for transparency. Ultimately, the policy legally anchors every single line of AI-generated code and any resulting bugs or security flaws firmly onto the shoulders of the human submitting it.
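
        For reference, both tags are Git commit trailers. A hypothetical commit message under the new policy might look like the sketch below (the subject line, tool name, and contributor are invented for illustration; only the two tag names come from the policy as reported):

        ```
        subsystem: fix example bug in frobnication path

        <commit body explaining the change>

        Assisted-by: ExampleLLM v2 (output reviewed by submitter)
        Signed-off-by: Jane Contributor <jane@example.org>
        ```

        The Signed-off-by line stays human, carrying the legal certification, while Assisted-by only discloses tool use.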

  • null@lemmy.org
    link
    fedilink
    English
    arrow-up
    42
    arrow-down
    4
    ·
    24 hours ago

    Ah, the solution that recognizes there’s no way to eliminate AI from the supply chain after it’s already been introduced.

    • sunbeam60@feddit.uk
      link
      fedilink
      English
      arrow-up
      8
      arrow-down
      11
      ·
      23 hours ago

      You make it sound as if there was another choice, if only people had better principles. Pray tell us, what would you have done, now? Not in the past, now.

      • null@lemmy.org
        link
        fedilink
        English
        arrow-up
        17
        ·
        21 hours ago

        That wasn’t my intent. This is me saying, “of course that’s what they’re going to do because there’s nothing else they can do.”

      • Feyd@programming.dev
        link
        fedilink
        English
        arrow-up
        10
        arrow-down
        1
        ·
        21 hours ago

        You’re agreeing with the comment you replied to. Why the fuck are you trying to be so smug???

  • catlover@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    57
    ·
    1 day ago

    I’d still be highly sceptical about pull requests with code created by LLMs. Personally, what I’ve noticed is that the author of such a PR doesn’t even read the code, and I have to go through all the slop

    • kcuf@lemmy.world
      link
      fedilink
      English
      arrow-up
      19
      ·
      1 day ago

      Ya, I’m finding myself being the bad code generator at work, as I’m scattered across so many things at the moment due to attrition and AI can do a lot of the boilerplate work. But it’s such a time and energy sink to fully review what it generates, and I’ve found basic things I missed that others catch, which shows the sloppiness. I usually take pride in my code, but I have no attachment to what’s generated, and that’s exposing issues with trying to scale out using this

      • Repple (she/her)@lemmy.world
        link
        fedilink
        English
        arrow-up
        17
        ·
        edit-2
        1 day ago

        Same. There’s reduction in workforce, pressure to move faster, and no good way to do that without sloppiness. I have never been this down on the industry before; it was never great, but now it’s terrible.

        • Danitos@reddthat.com
          link
          fedilink
          English
          arrow-up
          9
          ·
          edit-2
          6 hours ago

          A thought I had the other day: LLMs are supposed to make us more productive, say by 20%. Have you won a 20% pay rise since you adopted them? I haven’t

      • Feyd@programming.dev
        link
        fedilink
        English
        arrow-up
        4
        arrow-down
        5
        ·
        21 hours ago

        Just fucking stop using it? Wtf? Tell your boss to pound sand! They’re going to blame you when it goes south anyway, so you might as well stay honest.

    • jj4211@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      arrow-down
      1
      ·
      23 hours ago

      I suspect the answer will be that such large requests as you frequently see with LLM codegen will just be rejected.

      Already I see changes broken up and suggested bit by bit, so I presume the same best practice applies.

    • terabyterex@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      12
      ·
      edit-2
      1 day ago

      Did we all forget about stackoverflow?

      People blindly copy/pasted from there all the time.

      • Railcar8095@lemmy.world
        link
        fedilink
        English
        arrow-up
        9
        ·
        24 hours ago

        A couple of years back I got a PR at work that used a block of code that read a CSV, used some stream method to convert it to binary, then fed it to pandas to make a dataframe. I don’t remember the exact steps it did, but it was just crazy when pd.read_csv existed.

        On a hunch I pasted the code into Google and found an exact match on Stack Overflow for a very weird use case on very early pandas.

        I’m lucky: if people send obvious shit at work I can just cc their manager. But I feel for the volunteers at large FOSS projects, or even the paid employees.
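
        The roundabout CSV dance described above is exactly what pd.read_csv makes unnecessary. A minimal sketch of the direct call (column names and data invented for illustration; StringIO stands in for a real file path):

        ```python
        import io

        import pandas as pd

        # Instead of manually reading the file, converting it to a binary
        # stream, and only then building a DataFrame, read_csv handles
        # opening and parsing in one call.
        csv_text = "name,qty\napple,3\nbanana,5\n"
        df = pd.read_csv(io.StringIO(csv_text))
        print(df.shape)  # (2, 2): two rows, two columns
        ```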