I only sampled some of the docs and interesting-sounding modules. I did not carefully read anything.
First, the user-facing structure. The compiler is far too configurable; it has lots of options that surely haven't been tested in combination. The idea of a pipeline is enticing, but it's not actually user-programmable. File types are guessed using a combination of magic numbers and file extensions. The tail wags the dog in the design decisions, which might be fair; anybody writing a new C compiler has to contend with old C code.
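For flavor, here's roughly what that sniffing pattern looks like. This is a minimal sketch of magic-numbers-plus-extensions guessing, not the compiler's actual code; `FileKind` and `guess_kind` are invented names:

```rust
use std::path::Path;

/// Hypothetical sketch, not the compiler's code: magic numbers win,
/// file extensions are the fallback.
#[derive(Debug, PartialEq)]
enum FileKind {
    CSource,
    Object,
    Archive,
    Unknown,
}

fn guess_kind(path: &Path, header: &[u8]) -> FileKind {
    // Well-known leading bytes: ELF objects and `ar` archives.
    if header.starts_with(b"\x7fELF") {
        return FileKind::Object;
    }
    if header.starts_with(b"!<arch>\n") {
        return FileKind::Archive;
    }
    // Otherwise trust the extension, which is only a convention.
    match path.extension().and_then(|e| e.to_str()) {
        Some("c") | Some("h") => FileKind::CSource,
        Some("o") => FileKind::Object,
        Some("a") => FileKind::Archive,
        _ => FileKind::Unknown,
    }
}
```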
Next, I cannot overstate how generated the internals are. Every hunk of code tastes bland; even when it does things correctly and in a way which resembles a healthy style, the intent seems to be lacking. At best, I might say that the intent is cargo-culted from existing code without a deeper theory; more on that in a moment. Consider these two hunks. The first is generated code from my fork of META II:
```python
while i < len(self.s) and self.clsWhitespace(ord(self.s[i])): i += 1
```

And the second is generated code from their C compiler:

```rust
while self.pos < self.input.len() && self.input[self.pos].is_ascii_whitespace() { self.pos += 1; }
```

In general, the lexer looks generated, but in all seriousness, lexers might be too simple to fuck up relative to our collective understanding of what they do. There's also a lot of code which is block-copied from one place to another within a single file, in lists of options or lists of identifiers or lists of operators, and Transformers are known to be good at that sort of copying.
The backend's layering is really bad. There's too much optimization during lowering and assembly, and not enough in the high-level IR; the result is enormous amounts of spaghetti. There's a standard algorithm for new backends, NOLTIS, which is based on building mosaics from a collection of low-level tiles; there's no indication that the assembler uses it.
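For context, NOLTIS is Koes & Goldstein's near-optimal instruction selection on DAGs. The following is a toy sketch of the simpler tree-tiling idea underneath it, with invented tiles and costs; it is not the NOLTIS algorithm itself, and certainly not this compiler's code:

```rust
// Toy sketch of tiling-based instruction selection over an expression
// tree: cover the tree with machine-instruction "tiles", picking the
// cheapest cover. Tiles and costs here are made up.
enum Expr {
    Const(i64),
    Add(Box<Expr>, Box<Expr>),
    Load(Box<Expr>),
}

/// Returns (cost, instructions) for the cheapest tiling of `e`.
fn tile(e: &Expr) -> (u32, Vec<&'static str>) {
    match e {
        Expr::Const(_) => (1, vec!["mov imm"]),
        // A load of `base + const` matches a single addressing-mode tile,
        // folding the constant into the instruction for free.
        Expr::Load(addr) => match &**addr {
            Expr::Add(base, disp) if matches!(&**disp, Expr::Const(_)) => {
                let (cost, mut insns) = tile(base);
                insns.push("load [base+disp]");
                (cost + 1, insns)
            }
            _ => {
                let (cost, mut insns) = tile(addr);
                insns.push("load [reg]");
                (cost + 1, insns)
            }
        },
        Expr::Add(a, b) => {
            let (ca, mut insns) = tile(a);
            let (cb, more) = tile(b);
            insns.extend(more);
            insns.push("add");
            (ca + cb + 1, insns)
        }
    }
}
```

As I understand it, NOLTIS's contribution is doing this over DAGs with shared subexpressions, where greedy or tree-only tiling stops being near-optimal.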
The biggest issue is that the codebase is big. The second-biggest issue is that it doesn't have a Naur-style theory underlying it. A Naur theory, in the sense of Naur's "Programming as Theory Building," is how humans conceptualize a codebase; we care not only about what it does but why it does it. The docs are reasonably accurate descriptions of what's in each Rust module, as if they were documents to summarize, but struggle to show why certain algorithms were chosen.
Choice sneer, credit to the late Jessica Walter for the intended reading: "It's one topological sort, implemented here. What could it cost? Ten lines?"
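To be fair to the sneer, a topological sort really is about ten lines, give or take the ceremony. A throwaway sketch of Kahn's algorithm, not the compiler's implementation:

```rust
use std::collections::{HashMap, VecDeque};

// Kahn's algorithm: repeatedly emit a node with no remaining predecessors.
fn toposort(edges: &[(u32, u32)]) -> Vec<u32> {
    let mut indeg: HashMap<u32, usize> = HashMap::new();
    let mut succ: HashMap<u32, Vec<u32>> = HashMap::new();
    for &(a, b) in edges {
        indeg.entry(a).or_insert(0);
        *indeg.entry(b).or_insert(0) += 1;
        succ.entry(a).or_default().push(b);
    }
    let mut queue: VecDeque<u32> =
        indeg.iter().filter(|&(_, &d)| d == 0).map(|(&n, _)| n).collect();
    let mut order = Vec::new();
    while let Some(n) = queue.pop_front() {
        order.push(n);
        for &m in succ.get(&n).into_iter().flatten() {
            let d = indeg.get_mut(&m).unwrap();
            *d -= 1;
            if *d == 0 {
                queue.push_back(m);
            }
        }
    }
    order // shorter than the node set if there was a cycle
}
```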
That's the secret: any generative tool which adapts to feedback can do that. Previously, on Lobsters, I linked to a 2006/2007 paper which I've used for generating code; it directly uses a random number generator to make programs, and it also disassembles programs into gene-like snippets which can be recombined with a genetic algorithm. The LLM is a distraction, and people only prefer it for the ELIZA effect: they want the accompanying explanation and Naur-style theorizing.
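To illustrate the kind of tool I mean, here is a generic sketch of that recipe, not the paper's code; all names are invented, and it assumes the `rand` crate's 0.8-style API:

```rust
use rand::Rng; // external crate assumption: rand 0.8

// Programs are flat lists of opcodes ("genes"); new candidates come
// either from a random generator or from crossover of two parents.
#[derive(Clone, Copy, Debug)]
enum Op {
    Push(i64),
    Add,
    Mul,
    Dup,
    Swap,
}

type Program = Vec<Op>;

fn random_program(rng: &mut impl Rng, len: usize) -> Program {
    (0..len)
        .map(|_| match rng.gen_range(0..5) {
            0 => Op::Push(rng.gen_range(-9..10)),
            1 => Op::Add,
            2 => Op::Mul,
            3 => Op::Dup,
            _ => Op::Swap,
        })
        .collect()
}

// One-point crossover: splice a prefix of one parent onto a suffix of
// the other, the recombination step of a genetic algorithm.
fn crossover(rng: &mut impl Rng, a: &Program, b: &Program) -> Program {
    let i = rng.gen_range(0..=a.len());
    let j = rng.gen_range(0..=b.len());
    a[..i].iter().chain(b[j..].iter()).copied().collect()
}
```

A fitness function scoring candidate programs closes the feedback loop; nothing in that loop needs a language model.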