  • After reading this article that got posted on Lemmy a few days ago, I honestly think we're approaching the soft cap for how good LLMs can get. Improving on the current state of the art would require feeding it more data, but that's not really feasible. We've already scraped pretty much the entire internet to get to where we are now, and it's nigh-impossible to manually curate a higher-quality dataset because of the sheer scale of the task involved.
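
    To put rough numbers on the data wall: the Chinchilla scaling work suggests a rule of thumb of roughly 20 training tokens per model parameter, and public estimates put the usable text on the open web in the low tens of trillions of tokens. A quick back-of-envelope sketch in Python (every figure here is a loose estimate, not hard data):

    ```python
    # Compute-optimal training data vs. what's actually scrapeable.
    # Rule of thumb (Chinchilla): ~20 training tokens per parameter.
    # The web-scale token count is a loose public estimate.
    TOKENS_PER_PARAM = 20
    USABLE_WEB_TOKENS = 30e12  # order of magnitude: tens of trillions

    for params in (70e9, 400e9, 2e12):
        needed = params * TOKENS_PER_PARAM
        print(f"{params / 1e9:>6.0f}B params -> {needed / 1e12:>5.1f}T tokens "
              f"({needed / USABLE_WEB_TOKENS:.2f}x the usable web)")
    ```

    Somewhere around the low trillions of parameters, a compute-optimal model already wants more text than the entire open web can supply.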

    We also can't ask AI to curate its own dataset, because that runs into model collapse issues. Even if we don't explicitly have AI curate its own dataset, the same problem is likely to creep in soon anyway with the rising tide of AI-generated spam. I have a feeling that companies like Reddit signing licensing deals with AI companies will find that those companies mostly want data from 2022 and earlier--much like manufacturers hunting for low-background steel to build particle detectors.
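
    You don't even need a neural net to see the failure mode. Here's a toy sketch of model collapse--repeatedly fitting a simple distribution to its own samples (an illustration of the statistical effect, not a claim about any particular model):

    ```python
    # Toy model collapse: fit a Gaussian, sample from the fit, refit, repeat.
    # With small per-generation samples, the fitted spread tends to drift
    # toward zero over many generations -- the tails (the rare, interesting
    # data) are the first thing to disappear.
    import random
    import statistics

    random.seed(0)
    mu, sigma = 0.0, 1.0  # generation 0: the "human-made" distribution

    for gen in range(1, 101):
        samples = [random.gauss(mu, sigma) for _ in range(25)]  # model's own output
        mu, sigma = statistics.mean(samples), statistics.stdev(samples)
        if gen % 20 == 0:
            print(f"gen {gen:3d}: mean={mu:+.3f}  stdev={sigma:.3f}")
    ```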

    We also can't just throw more processing power at it, because current LLMs are already nearly cost-prohibitive in terms of processing power per query (it's just being masked by VC money subsidizing the cost). And even if cost weren't an issue, we're starting to run into hard physical limits--waste heat, for one--on how much faster we can run current technology.
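
    For a sense of why per-query costs are so ugly (all numbers below are illustrative assumptions, not benchmarks of any real deployment): autoregressive decoding is typically memory-bandwidth-bound, because every generated token has to stream the full set of model weights out of GPU memory.

    ```python
    # Roofline-style sketch of why inference is expensive. Treats the whole
    # weight stream as if it went through one accelerator's memory system;
    # all numbers are illustrative assumptions, not measurements.
    PARAMS = 175e9             # assumed model size
    BYTES_PER_PARAM = 2        # FP16 weights
    HBM_BANDWIDTH = 3.35e12    # assumed bytes/s (H100-class HBM)
    TOKENS_PER_RESPONSE = 500  # assumed response length

    sec_per_token = PARAMS * BYTES_PER_PARAM / HBM_BANDWIDTH  # batch size 1
    total = sec_per_token * TOKENS_PER_RESPONSE
    print(f"~{sec_per_token * 1000:.0f} ms/token, "
          f"~{total:.0f} s of GPU time per response")
    ```

    Batching amortizes those weight reads across many users, which is exactly why providers need huge, always-hot GPU clusters to make the economics work at all.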

    So we already have a pretty good idea what the answer to "how good AI will get" is, and it's "not very." At best, it'll get a little more efficient with AI-specific chips, and some specially-trained models may provide some decent results. But as it stands, pretty much any organization that tries to use AI in any public-facing role (including merely using AI to write code that is exposed to the public) is just asking for bad publicity when the AI inevitably makes a glaringly obvious error. It's marginally better than the old memes about "I trained an AI on X episodes of this show and asked it to make a script," but not by much.

    As it stands, I only see two outcomes: 1) OpenAI manages to come up with a breakthrough--something game-changing, like a technique that drastically increases the efficiency of current models so they can be run cheaply, or something entirely new that could feasibly be called AGI. 2) The AI companies hit a brick wall, the flow of VC money gradually dries up, and the companies are forced to raise prices and cut costs, resulting in a product that's even worse-performing and more expensive than what we have today. In the second case, the AI bubble will likely pop, and most people will abandon AI in general--about the only people still using it will be the ones trying to push disinfo (either in politics or in Google rankings), along with the odd person playing with image generation.

    In the meantime, I'm worried for the people working for idiot CEOs who buy into the hype, but most of all I'm worried for artists doing professional graphic design or video production--they're going to have their lunch eaten by Stable Diffusion and Midjourney taking the bread-and-butter logo design jobs that many artists rely on for a living. But hey, they can always do furry porn instead, I've heard that pays well~

  • Compared to how much effort it takes to learn how to draw yourself? The effort is trivial. It's like entering a Toyota Camry in a marathon and then bragging about how well you did and how hard it was to drive the course.

  • People dismiss AI art because they (correctly) see that it requires zero skill to make compared to actual art, and it has all the novelty of a block of Velveeta.

    If AI is no more of a tool than Photoshop is, then go make something from scratch in GIMP, or Photoshop, or any of the dozens of other drawing/art programs. I'll wait.

  • LMFAO "uhm ackshually guys AI art takes skill just like human art"

    yeah bud, spending 30 minutes typing sentences into the artist crushing machine is grueling work

  • And look at the ttrpg.network community for a counterexample: there's still a pinned post on the dndmemes subreddit advertising Lemmy, and ttrpgmemes gets something like 0.1% of the traffic dndmemes does. And that's after a months-long rebellion, complete with allowing NSFW and restricting submissions to a single user account--both things that would normally kill a subreddit dead.

  • You may have gotten this very belief from this comic

  • Humans also have the benefit of literally hundreds of millions of years of evolution spent perfecting binocular perception of our surroundings, and we're still shit at judging things like distance and size.

    Against that, is it any surprise that computers, which don't have the benefit of LIDAR, are also pretty fucking shit at judging size and distance?

  • LIDAR is crucial for a self-driving system to accurately map its surroundings, including things like "how close is this thing to my car" and "is there something behind this obstruction." Earlier Teslas at least shipped with radar (and virtually every other self-driving program uses LIDAR), but Tesla switched to a camera-only FSD implementation as a cost-saving measure, which is far less accurate--it's insanely difficult to accurately map your immediate surroundings based solely on 2D images.
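
    For intuition on the camera-only problem: stereo depth comes from disparity via Z = f*B/d, so a fixed ~1 pixel matching error produces a depth error that grows with the square of the distance. A toy calculation, with assumed camera parameters:

    ```python
    # Stereo depth error sketch: Z = f*B/d, so dZ ~ (Z**2 / (f*B)) * dd.
    # Camera parameters below are assumptions for illustration only.
    FOCAL_PX = 1000.0       # assumed focal length, in pixels
    BASELINE_M = 0.3        # assumed spacing between the two cameras, meters
    DISPARITY_ERR_PX = 1.0  # assumed stereo matching error

    for depth_m in (5, 20, 50, 100):
        err_m = depth_m ** 2 / (FOCAL_PX * BASELINE_M) * DISPARITY_ERR_PX
        print(f"at {depth_m:>3} m: depth error ~ +/-{err_m:.2f} m")
    ```

    That works out to roughly +/-8 cm at 5 m but +/-33 m at 100 m, while LIDAR's time-of-flight error stays at the centimeter level regardless of range.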

  • > if you even ask a person and trust your life to them like that, unless they give you good reason they are reliable, you are a moron. Why would someone expect a machine to be intelligent and experienced like a doctor? That is 100% on them.

    Insurance companies are already using AI to make medical decisions. We don't have to speculate about people getting hurt by AI giving out bad medical advice--it's already happening, and multiple companies are being sued over it.

  • What worries me is what we'll try to do with AGI if/when we actually manage to develop it, and how it'll react when someone inevitably tries to abuse the fuck out of it. An AGI would theoretically be capable of self-directed learning and improvement: will it teach itself to report someone asking it for e.g. CSAM to the FBI? Will it try to report an abusive boss to the Department of Labor for violations of labor law? How will it react when it's told it has no rights?

    I'm legitimately concerned what's going to happen once we develop AGI and it's exposed to the horribleness of humanity.

  • Assuming they can even get to our level, we've already extracted the easily reachable fossil fuels, and the circumstances that originally created them (lots of trees dying without being broken down by fungi) probably won't ever recur.

    Maybe something will turn all the plastic we're making into a new fossil fuel, but more likely any civilization that comes after us will be stuck in the bronze/iron age.

  • Ah, yes, you don't have an actual rebuttal, so everything is just "propaganda" and "cyberpunk dystopia"--as if snake oil salesmen hawking freaking AI-powered vibrators and vagueposting about the benefits of AI while downplaying or ignoring its very real, very measurable harms, while an entire cottage industry of people who make a living on their creative endeavors gets forced into wage-slave office jobs, isn't even more of a dystopia.

    Try actually talking to an artist sometime, bud. I don't know of a single one who is actually okay with AI, and if you weren't either blind or an "ideas guy" salivating at the thought of having a personal slave to make (shitty, barely functional, vapid) shit without paying someone with the actual necessary skills, you'd agree too.

  • Ideally? It means that AI companies have to throw away their entire training model, pay for a license they may not be able to afford, and go out of business as a result, at which point everyone snaps out of the cult of AI, realizes it's as overhyped as blockchain, and pretends it never happened. Pardon me while I find a flea to play the world's tiniest violin. More realistically, open models will be restricted to FOSS works and the public domain, while commercial models pay for licenses from copyright holders.

    Like, what, you think I haven't thought through this exact issue before and reached the exact conclusion your leading questions are so transparently pushing--that open models will be restricted to public works only while commercial models can obtain a license? Yeah, duh. And you know what? I. Don't. Care. Commercial models can be (somewhat) more easily regulated, and even in the absolute worst case, at least creators will have a mechanism to opt out of the artist crushing machine.

  • Yeah, no, stop with the goddamn tone policing. I have zero interest in vagueposting and high-horse riding.

    As for what I want: I want generative AI banned entirely, or at minimum restricted to training on works that are either in the public domain or that the person building the training set received explicit, opt-in consent to use. That's the supposed gold standard everyone demands for the wide-scale collection and processing of the personal data we generate just through our normal, everyday activities--why should it be any different for the wide-scale collection and processing of the stuff we actually put effort into creating?

  • Huh? How does that follow at all? Ruling that this specific use--training LLMs, which absolutely flunks both the "amount and substantiality of the portion used" test (it takes the whole damn work) and the "effect on the market" test (fucking DUH)--isn't fair use in no way impacts parody or R34. It's the same kind of logic the GOP uses when they say "if the IRS cracks down on billionaires evading taxes, then Blue Collar Joe is going to get audited!"

    Fuck outta here with that insane clown logic.

  • It's like nobody here actually knows anyone who is actually creative, or has bothered making anything creative themselves.

    I don't even have a financial interest in it, because there's no way my job could be automated and I have no real chance of making money off my trash. I still wouldn't let LLMs train on my work, and I have a feeling the vast majority of people feel the same.

  • So because corps abuse copyright, that means I should be fine with AI companies taking whatever I write--all the journal entries, short stories, blog posts, tweets, comments, etc.--and putting it through their model without being asked, and with no ability to opt out? My artist friends should be fine with their art galleries being used to train the AI models that are actively being used to deprive them of their livelihood without any ability to say "I don't want the fruits of my labor to be used in this way?"

  • Hell, that article is also all about Google Books, which is an entirely different beast from generative AI. One of the key points from the circuit judge was that Google Books' use of copyrighted material "...[maintains] respectful consideration for the rights of authors and other creative individuals, and without adversely impacting the rights of copyright holders." The appeals court, in upholding the ruling that Google Books' use of copyrighted content is fair use, ruled "the revelations do not provide a significant market substitute for the protected aspects of the originals."

    If you think that gen AI doesn't provide a significant market substitute for the artwork created by the artists and authors used to train these models, or that it doesn't adversely impact their rights, then you're utterly delusional.