Pro-AI mod and self-proclaimed 'communist' got mad about being downvoted.

This is fine🔥🐶☕🔥@lemmy.world · edit-2 2 months ago

Arthur Besse@lemmy.ml · 2 months ago

They aren’t pro-corpo AI.

They’re very much against the mass scraping/DDoS that AI companies are doing.

All of the self-hostable LLMs and image generators (or at least, all of the ones capable of the quality people have come to expect over the last few years) that people are using today are trained on massive scraped datasets far beyond the reach of hobbyists. There are many so-called “open source” models which are free to modify (e.g., by fine-tuning) and to redistribute, but the data used for the initial training (which hobbyists are allowed to build upon) cannot be published because doing so would obviously be large-scale copyright infringement.

Also, even with the data (which in many cases also needs to be labeled/annotated using human labor), the cost of training such a model from scratch is astronomical.

As a pirate myself, I totally understand how, after reading that Meta’s training data included 82TB of pirated books they torrented, one’s first thought might be “🤤” … but to imagine that this makes Meta our ally in the fight against copyright is some temporarily-embarrassed-millionaire kind of thinking.