There are lightweight models that are as good as some heavier ones. It's a bit like Intel's advertised tick-tock cadence: heavy, memory-hungry models are the "tick", but there's also a "tock" - say, the light "lfm2.5-thinking" model in the ollama repository seems almost as good as qwen3.5 to me, except it's very lightweight and lightning-fast by comparison.
These things are being optimized. It's just that in the market capture phase nobody bothered.
That they are not being used correctly - yeah, absolutely. My idea of their proper use is some graph-based system where each node is handled by a selected LLM (or just a piece of logic), with its own selected set of tools, actions, and choices. A bit like ComfyUI, but something saner than a zoom-and-pan web UI - more like the macOS Automator application.
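A minimal sketch of what I mean, in Python - the node names and stand-in functions are invented for illustration; in practice each `run` would call its own local model or plain code, with its own toolset:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Node:
    name: str
    run: Callable[[str], str]              # a select LLM call, or plain logic
    next: list[str] = field(default_factory=list)

def tag(text: str) -> str:                 # stand-in for plain logic, no LLM needed
    kind = "math" if any(c.isdigit() for c in text) else "prose"
    return f"[{kind}] {text}"

def summarize(text: str) -> str:           # stand-in for a small local model
    return text[:80]

graph = {
    "ingest":    Node("ingest", run=str.strip, next=["classify"]),
    "classify":  Node("classify", run=tag, next=["summarize"]),
    "summarize": Node("summarize", run=summarize, next=[]),
}

def execute(graph: dict[str, Node], start: str, payload: str) -> str:
    node = graph[start]
    while True:
        payload = node.run(payload)
        if not node.next:
            return payload
        node = graph[node.next[0]]          # a real version would branch here

print(execute(graph, "ingest", "  2 plus 2 is 4  "))
```

The point is that the wiring is ordinary, inspectable logic; the model only does the work inside its node.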
It's "Large Language Model", and the point is in "Large" and that on really large datasets and well-selected attention dimensions set it's good at extrapolating language describing real world, thus extrapolating how real world events will be described. So the task is more of an oracle.
I agree that providing anything accurate is not the task. It's the opposite of the task, actually: all the usefulness of LLMs is in areas where you don't have a good enough model of the world but need to make some assumptions.
Except for "diagnose these symptoms", with proper framework around it (only using it for flagging things, not for actually making decisions, things that have been discussed thousands of times) that's a valid task for them.
Models are becoming more optimized. I've recently tried LFM2.5, small version, and it's ridiculously close in usefulness to Qwen3.5, for example. Or RNJ-1.
To maintain - meaning keeping the datasets up to date - well, that's somewhat expensive, but they were assembling those as a side effect of their main businesses anyway.
So this is not what'll kill them. Their size will. These are very big companies with lots of internal corruption and inefficiency pulling them down. As for the newer AI companies, the ones I think will survive are centered around specific products; some will die, but I'd expect LiquidAI or Anthropic or such to still be around some time after the crash.
The crash might coincide with a bubble burst, but notice how this family of technologies really is delivering results. Instead of using a bunch of specialized applications, people are asking LLMs and often getting good-enough answers. LLM agents can retrieve data from web services, perform operations, and assist in using tools.
You shouldn't look at the big ones in the cloud, but rather at what value local LLMs give you for the energy spent. Right now it's not that good, but honestly it's approaching good, and I don't feel like they've stopped getting better. Human time is still more expensive. The tools are there and are being improved, and humans are slowly gaining experience using them, which makes them more efficient at various tasks.
It's to all kinds of reference and knowledge tools what Google was to search.
And there's one just amazing thing about these models - they are self-contained, even if some can use tools to access external sources. Our corporate overlords spent 20 years building a dependent, networked world, only to break it by popularizing a technology that almost neuters it. They probably thought they were reaping the crops of the web for themselves; instead they taught everyone that you don't have to eat at the diner - you can take the food home.
A common trope in stories is that to gain any kind of scary access you need to find a "hacker" who'll do it - and at the same time it's some obscure power that nobody else has, not even the company being "hacked" into.
People still feel as if such news were something unique that couldn't just as easily happen to them and the things they use. With computers, there's nothing unique about it.
That doesn't make the technology bad. You just have to remember that its weak and strong sides are logically connected: it's fuzzy logic based on next-token probabilities across a few attention spaces - thus there will always be artifacts.
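A toy illustration of that point, with a made-up next-token distribution - under sampling there's always some probability mass on the wrong continuation, which is exactly where the artifacts come from:

```python
import random

# Invented distribution over the next token after "2 + 2 =".
dist = {"4": 0.92, "5": 0.04, "four": 0.03, "22": 0.01}

def sample(dist: dict[str, float]) -> str:
    r = random.random()
    acc = 0.0
    for tok, p in dist.items():
        acc += p
        if r < acc:
            return tok
    return tok  # fall back to the last token on rounding error

counts = {}
for _ in range(10_000):
    tok = sample(dist)
    counts[tok] = counts.get(tok, 0) + 1

print(counts)  # several hundred of these will not be "4"
```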
People care when the things they care about break in their hands and explode in their faces.
ASD and BAD, probably also ADHD.
People also love to assume that what they keep on their hard drives and memory sticks is somehow preserved across time and machine changes. Bit flips and other physical effects on your imagined perfect machine are why it's not - it's about as durable as, or worse than, what's written on paper. A cat decides to piss on your grandpa's diary and there's no more diary. Or humidity slowly eats it. With computers it's even faster.
You can't claim you don't have frequent file corruption when you aren't using tools that detect it. I can guarantee you already have plenty of corrupt stuff on your hard drives. RAM bit flips do contribute to that.
You see bugs (broken documents, things failing, freezes, crashes) in the applications you use, and some of them are due not to developer error but to uncorrected memory errors.
If you tried using a filesystem like ZFS with checksumming and regular scrubs, you'd see detected errors quite often. Probably not corrected ones, because you wouldn't use mirroring, to save space, dummy.
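If switching filesystems isn't an option, a crude userspace analogue of a scrub is easy to sketch: build a checksum manifest once, re-run it later, and compare. A Python sketch - the manifest name is arbitrary, and unlike ZFS this can't tell legitimate edits from silent corruption, so it only makes sense for files you don't modify:

```python
import hashlib, json, sys
from pathlib import Path

MANIFEST = Path("checksums.json")   # arbitrary name

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def scan(root: Path) -> dict[str, str]:
    return {str(p): sha256(p) for p in sorted(root.rglob("*")) if p.is_file()}

if __name__ == "__main__":
    current = scan(Path(sys.argv[1]))
    if MANIFEST.exists():
        previous = json.loads(MANIFEST.read_text())
        for name, digest in previous.items():
            if name in current and current[name] != digest:
                print(f"changed or corrupt: {name}")
    MANIFEST.write_text(json.dumps(current, indent=2))
```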
And if you were using ECC, you'd see messages about corrected memory errors in dmesg often enough.
There's a jump instruction whose target address is read from RAM; a bit flip occurs, and a condition meant as "if friend, greet; else, kill" executes as "if friend, rape; else, kill". Absolutely anything can happen that wasn't determined by the program's design flaws or errors. A digital computer is a deterministic system (sometimes with intentional non-deterministic elements like analog-based RNGs); these are non-deterministic, random changes of its state.
In concrete terms: things break for no reason. A perfect program with no bugs, if such a thing exists, will still do random wrong things if bit flips occur. Clear enough?
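A toy version of the same point, with an invented flag layout - the code has no bug, yet a single flipped bit in the state it branches on changes which branch runs:

```python
FRIEND = 0b0000_0001               # invented flag bit

def react(status: int) -> str:
    return "greet" if status & FRIEND else "attack"

status = FRIEND                    # stored correctly: this one is a friend
print(react(status))               # -> greet

status ^= 0b0000_0001              # a single bit flips in RAM
print(react(status))               # -> attack; same code, corrupted state
```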
Of course. I'm also generally a low-end user, so to speak. 18GB is the most I've had on one machine, and the program that realistically takes up much of it most often isn't even the web browser - it's POV-Ray. Sometimes some work VMs, but that's rare.
No, that's you happily laughing at nonsense you yourself said and then attributed to me.
I said that RAM compression in macOS is an OS feature, well-tested and always on. You can play with something similar under Linux (zram/zswap) and find that it really does make things better, meaning you can fit more in the same RAM. Something like 10-20% more is notable enough.
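A rough way to see why compressed memory buys anything at all: most of what sits in application memory is repetitive and compresses well. A toy measurement with zlib on synthetic data - the kernel uses faster codecs and only compresses inactive pages, so real-world gains are smaller than this ratio suggests:

```python
import json, zlib

# Synthetic stand-in for typical in-memory application data: structured and repetitive.
records = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(10_000)]
raw = json.dumps(records).encode()

packed = zlib.compress(raw, level=1)       # fast setting, as a compressed pool would use
print(f"{len(raw)} -> {len(packed)} bytes, "
      f"{100 * len(packed) / len(raw):.0f}% of original")
```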
And I said that unified memory is a feature of their hardware, which is correct - and which is the reason Intel and AMD were playing with that X86-S idea (a new architecture with much of the legacy removed, and also, yes, unified memory), until they dropped it because Intel is going to shit.
I don't see any marketing nonsense in technical facts. Your GPU can use the same RAM with less overhead for doing so. And RAM allocated to applications does get compressed - which is more CPU-intensive, obviously, but it happens.
Unified memory, so it's more efficient with that. Also, macOS has RAM compression.
I suppose more is better, and 8GB seems like the bare minimum for something useful. But one should keep in mind that now (unlike before 2020) Apple's hardware has caught up with its advertising, in that it really is specifically optimized for the job.
It's fine for an "Apple Chromebook" I think, especially if bulk orders for institutions will get different deals.
You are writing pretentious nonsense, go someplace else.