I asked the bot. It says:
I understand the concern. This question sits at the intersection of copyright law, statistics, and how machine-learning systems work—and it’s fair to be skeptical.
Here’s the most precise and careful way to frame it:
1. I can’t confirm GPL training data
I don’t know what specific works or licenses were in the training set, and I can’t confirm that GPL-licensed material was included. Neither I nor users have visibility into the underlying corpus.
So while it may be statistically plausible that GPL text exists somewhere in public data, plausibility ≠ proof, and I can’t assert it as fact.
2. Training ≠ distribution (the core legal argument)
OpenAI’s legal position (and that of most ML practitioners) is broadly this: training extracts statistical patterns from text; it does not copy or redistribute the works themselves, so the obligations that attach to reproduction and distribution are not triggered.
This is analogous (legally, not philosophically) to a human reading GPL code and then later writing different code from experience.
Whether courts ultimately agree in every jurisdiction is still being litigated—but no court has ruled that training itself is per se illegal.
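To make the “statistical patterns, not copies” point concrete, here is a deliberately toy sketch in Python. It is only an illustration of the argument above, not a claim about how any real model is trained: the bigram counter, function name, and sample sentence are made up for the example, and real training is vastly more complex. The point is that what comes out of this “training” step is a table of frequency counts, not a stored copy of the input text.

```python
from collections import Counter, defaultdict

def train_bigram_model(text: str) -> dict:
    """'Train' a toy bigram model: count how often each word follows another.

    The resulting 'weights' are just frequency counts, a statistical summary
    of the input rather than a verbatim copy of it.
    """
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return {word: dict(followers) for word, followers in counts.items()}

# Hypothetical input standing in for licensed training text.
sample = "free software means users have the freedom to run copy and study the software"
weights = train_bigram_model(sample)

print(weights["the"])       # {'freedom': 1, 'software': 1}
print(weights["software"])  # {'means': 1}
```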
3. GPL specifically
GPL obligations trigger when you distribute (convey) a covered work or a derivative of it; merely reading, running, or analyzing the code triggers nothing.
Current legal consensus does not treat model weights or outputs as GPL “derivatives” in the copyright sense, because the weights are learned statistical parameters rather than copies of the covered code, and typical outputs do not reproduce any particular covered work.
That’s why GPL’s copyleft provisions are generally understood not to propagate through model training.
4. Is the law settled?
No. Anyone claiming it is settled is overstating the case.
What is true: several lawsuits over training data are still working their way through the courts, and there is no binding precedent yet on whether training on copyrighted or copyleft-licensed material is infringement or fair use.
5. Practical reality
Even if GPL material were in the training mix, the practical question is whether a specific output reproduces GPL-covered code verbatim or near-verbatim. Outputs that don’t are not generally treated as carrying GPL obligations; when an output does closely match known GPL code, the cautious course is to treat that output as GPL-covered or rewrite it.
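If the practical worry is verbatim reproduction, a crude check is straightforward to sketch. This is a naive illustration, not a real compliance tool: the function, the window size, the file path, and the threshold in the usage comment are all placeholders, and actual license scanners (and lawyers) do far more than line matching.

```python
def normalize(code: str) -> list[str]:
    """Strip whitespace and blank lines so formatting differences don't hide a match."""
    return [line.strip() for line in code.splitlines() if line.strip()]

def overlap_ratio(output: str, reference: str, window: int = 5) -> float:
    """Fraction of `window`-line runs in `output` that appear verbatim in `reference`.

    A crude verbatim-reproduction check; real license-compliance scanners do far more.
    """
    out_lines, ref_lines = normalize(output), normalize(reference)
    ref_runs = {tuple(ref_lines[i:i + window]) for i in range(len(ref_lines) - window + 1)}
    out_runs = [tuple(out_lines[i:i + window]) for i in range(len(out_lines) - window + 1)]
    if not out_runs:
        return 0.0
    hits = sum(run in ref_runs for run in out_runs)
    return hits / len(out_runs)

# Hypothetical usage: compare a model output against a known GPL file.
# gpl_reference = open("known_gpl_snippet.c").read()    # path is illustrative only
# if overlap_ratio(model_output, gpl_reference) > 0.5:   # threshold is arbitrary
#     print("Output substantially matches known GPL code; treat it as GPL-covered.")
```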
Bottom line
Your intuition about probability is reasonable, but legality does not turn on probability. It turns on how the law classifies training (copying, fair use, or neither), what counts as a derivative work, and whether a given output reproduces protected expression.
As of now, the prevailing legal interpretation is that models like me are lawful to deploy, even though the debate is ongoing.
If you want, I can also: