We are all word rotators
A recurring pattern observed recently when non-technical “technical” people talk about the rise of LLMs is:
they don’t really know anything, they are just pattern matching based on the data the model was “pretrained” on.
It is somewhat amusing: do these people really, truly believe they understand how LLMs work, when even seasoned veterans own up to the fact that LLMs might as well be working because of “divine benevolence”? Even the literature has the humility to admit that we as humans don’t really know how any of Deep Learning actually works.
Regardless of the confidence with which people deny these models any true understanding, they often have clear arguments. Neural networks are just shape rotators and pattern matchers, they imply, once their arguments are stripped to their most basic form. Usually, this is the pillar of wisdom they cite in their refutation of AI as future overlords.
“No shit genius”
Isn’t any kind of understanding just pattern matching?
If LLMs are nothing but shape rotators, then we humans are nothing but word rotators.
What about us? It is remarkable how, in the DAG of understanding, going just one abstraction level down to explain nouns with newer nouns we recently learnt can be so confidence-inducing.
I need to tread carefully while making the above argument, because it is a slippery slope: a hyperbole along these lines raises the question of whether there can be any notion of understanding at all, since any concept, broken down, is composed only of other concepts/words. Hence, at some point, we need to share some axiomatic concepts that we assume mean the same thing to all of us.
How do we then verify whether a child’s understanding of the word “water” is the same as the parents’?
Evidence of the utility of said token when used in real life.
No one cares what the embedding of the token “water” is for the child, or what the cosine similarity of that embedding is with the parent’s embedding. But when the child is pacified by that sip from a glass after crying for water, we know that somewhere in his latent space, he understands the concept of water the same way we do.
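(For the uninitiated, cosine similarity is just a measure of how closely two embedding vectors point in the same direction. A minimal sketch, with made-up vectors purely for illustration:)

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine of the angle between two vectors: 1.0 means same direction,
    # 0.0 means orthogonal ("unrelated"), -1.0 means opposite.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings of the token "water" in two different latent
# spaces (the child's and the parent's). The numbers are invented.
child_water = np.array([0.12, -0.40, 0.88, 0.05])
parent_water = np.array([0.10, -0.35, 0.90, 0.10])

print(cosine_similarity(child_water, parent_water))  # close to 1.0
```

Which, of course, is exactly the measurement nobody bothers to make. The sip of water is the only benchmark that matters.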
Hence, the next time an LLM is able to single-shot the prose you were writing, or the bug you were debugging, at least pretend to be scared.
For when both you and the LLM are put behind a screen, the person on the other side has no way to peek into either of your word embeddings. The only way to distinguish between the two of you would be by the shared utility of the tokens you spit out.
And, as the months pass, the tokens spit out by LLMs are proving more and more useful.
References
On “Divine Benevolence” and Machine Learning as Alchemy:
- Noam Shazeer, in a Fortune interview (August 2024): “My best guess is divine benevolence. Nobody really understands what’s going on. This is a very experimental science… It’s more like alchemy or whatever chemistry was in the Middle Ages.”
- Shazeer also wrote in the SwiGLU paper (2020): “We offer no explanation as to why these architectures seem to work; we attribute their success, as all else, to divine benevolence.”
- Ali Rahimi’s “Alchemy” speech at NIPS 2017: “Machine learning has become alchemy.” (Synced coverage)
On LLMs “Going to the Mid”:
- Ben Affleck on Joe Rogan Experience #2440 (January 2026): “If you try to get ChatGPT or Claude or Gemini to write you something, it’s really shitty. It’s shitty because by its nature it goes to the mid, to the average.”
- Vijender Chauhan, UPSC educator (January 2026): “ChatGPT’s data has largely been created by upper-caste, privileged sections of society. One should not expect social justice from it.”
- ThePrimeagen, in many of his earlier streams.
On “Shape Rotators”:
- The term was coined/popularized by @roon (tszzl) in early 2022. See his essay: “A Song of Shapes and Words”