

I would point out that I think you might be overly confident about the manner in which it was trained to do addition. I'm open to being wrong here, but when you say "It was not trained to do trigonometry to solve addition problem", that suggests to me that you either know how it was trained or are making assumptions about how it was trained. I would suggest that unless you work at one of these companies, you probably are not privy to their training data. This is not an accusation; I think that is probably a trade secret at this point. And if the idea is that nobody would train an LLM to do addition in this manner, I invite you to glance at the Wikipedia article on addition. Really, glance at literally any math topic on Wikipedia. I didn't notice any trigonometry in this entry, but I did find a discussion of finding the limits of logarithmic equations in the "Related operations" section: https://en.m.wikipedia.org/wiki/Addition. They also cite convolution as another way to add, in which they jump straight to calculus: https://en.m.wikipedia.org/wiki/Convolution.
This is all to say: I would suggest that we don't know how they're training LLMs. We don't know what the training data is or exactly how it is being used. What we do know is that LLMs work on tokens and weights. The weights, and each token's statistical relevance to every other token, depend on the training data, which we don't have access to.
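To make the "tokens" part concrete, here is a toy sketch. The vocabulary and the greedy longest-match rule are invented for illustration; real models use learned subword vocabularies (e.g. BPE) with tens of thousands of entries. The point is just that the model never sees "numbers", only IDs from its vocabulary:

```python
# Toy illustration: the vocabulary below is made up.
vocab = {"12": 0, "34": 1, " +": 2, " 56": 3, "78": 4}

def tokenize(text, vocab):
    # Greedy longest-match tokenization, a simplified stand-in
    # for real subword tokenizers like BPE.
    ids = []
    i = 0
    while i < len(text):
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                ids.append(vocab[piece])
                i += length
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return ids

ids = tokenize("1234 + 5678", vocab)  # → [0, 1, 2, 3, 4]
```

Notice that "1234" is split into "12" and "34" here; how a number gets chopped into tokens depends entirely on the vocabulary, which in turn depends on the training data.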
I know this is not the point, but up until now I've been fairly pedantic and tried to use the correct terminology, so I would point out that technically LLMs have "tensors", not "neurons". I get that tensors are designed to behave like neurons, and this is just me being pedantic. I know what you mean when you say neurons; I just wanted to clarify and be consistent. No shade intended.
I don't doubt that it can perform addition in multiple ways. I would go as far as to say it can probably attempt addition in more ways than the average person, as it has probably been trained on a lot of math. Can it perform them correctly? Sometimes. That's OK; people make mistakes all the time too. I don't hold it against LLMs just because they make mistakes. The ability to do math in multiple ways is not evidence of thinking, though. That is evidence that it has been trained on at least a fair bit of math. I would say if you train it on a lot of math, it will attempt to do a lot of math. That's not thinking; that's just increased weighting on tokens related to math. If you were to train an LLM on nothing but math and texts about math, then ask it an art question, it would respond somewhat nonsensically with math. That's not thinking; that's just choosing the statistically most likely next token.
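For what "choosing the statistically most likely next token" looks like mechanically, here's a minimal sketch. The tokens and scores are invented; a real model produces raw scores (logits) over its entire vocabulary and converts them to probabilities with a softmax:

```python
import math

def softmax(logits):
    # Convert raw per-token scores into a probability distribution.
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

# Hypothetical logits after a prompt like "2 + 2 =" (values made up).
logits = {"4": 9.1, "four": 6.0, "5": 3.2, "fish": -2.0}
probs = softmax(logits)

# Greedy decoding: pick the single most probable token.
next_token = max(probs, key=probs.get)  # → "4"
```

If you trained the model on nothing but math, the math-related tokens would carry the high scores for almost any prompt, which is why the art question would come back as math.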
I had no idea about artificial neurons, TIL. I suppose that makes "neural networks" make more sense. In my readings on ML they always seemed to go straight to the tensor and overlook the neuron. They would go over the functions used to populate the weights but never used that term. Now I know.
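For what it's worth, the way I'd reconcile the two terms: the "neuron" is the little computation (weighted sum plus an activation function), and the "tensors" are just the arrays holding the weights that computation reads from. A toy sketch, with made-up numbers:

```python
import math

def neuron(inputs, weights, bias):
    # One artificial neuron: weighted sum of inputs plus a bias,
    # passed through an activation function (sigmoid here).
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

# The weights and bias are just numbers stored in arrays (tensors);
# the "neuron" is the function that combines them with the inputs.
out = neuron([0.5, -1.0], [2.0, 0.5], 0.1)  # sigmoid(0.6) ≈ 0.646
```

Training is then just the process of adjusting those stored weights; the neuron function itself never changes.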