🤖 LLM Thinking Explorer
Step 1: Tokenization
The model breaks the input text into smaller structural units called tokens.
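As a rough illustration of this step, here is a minimal sketch of greedy longest-match tokenization. The vocabulary `TOY_VOCAB` and the function `tokenize` are hypothetical, made up for this example. Real models use learned subword vocabularies (for example BPE), so this is not the tokenizer any particular model uses.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization over a toy vocabulary (illustrative only)."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a vocabulary entry matches.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # Nothing in the vocabulary matches: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical vocabulary; leading spaces are part of the token, as in many real tokenizers.
TOY_VOCAB = {"the", " dog", " is", " study", "ing"}

print(tokenize("the dog is studying", TOY_VOCAB))
# ['the', ' dog', ' is', ' study', 'ing']
```

Note how "studying" is split into two pieces: a token is not necessarily a whole word.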
Step 2: Prediction Difficulty (Surprisal)
Shows how "surprising" each token was to the model, that is, how hard it was to predict from the preceding context. Surprisal is usually high for the first word, because the model has no context yet, and it rises again whenever you type an ambiguous or unexpected word. For example, if you type "the dog is studying", the surprisal of "studying" will be high.
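Surprisal is commonly defined as the negative log of the probability the model assigned to the token that actually appeared. The sketch below assumes base-2 logs (so the unit is bits). The probabilities in `token_probs` are invented for illustration, not taken from a real model.

```python
import math


def surprisal_bits(prob):
    """Surprisal in bits: the less probable the token, the higher the value."""
    return -math.log2(prob)


# Hypothetical probabilities a model might assign to each token given its left context.
token_probs = [("the", 0.05), ("dog", 0.02), ("is", 0.40), ("studying", 0.001)]

for token, p in token_probs:
    print(f"{token:>10}  p={p:<6}  surprisal={surprisal_bits(p):.2f} bits")
```

With these made-up numbers, "is" gets the lowest surprisal because it is the easiest to predict, and "studying" gets the highest.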
Step 3: Next-Token Probabilities (Top-K)
For each position, the model calculates the probability of various words coming next. Here are its top choices.
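A minimal sketch of how top-K choices can be derived: the model produces a raw score (logit) per candidate token, a softmax turns the scores into probabilities, and the K largest are kept. The candidate tokens and logits in `toy_logits` are hypothetical. A real model scores its whole vocabulary.

```python
import math


def softmax(logits):
    """Convert a {token: logit} mapping into probabilities that sum to 1."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_k(logits, k):
    """Return the k most probable (token, probability) pairs, best first."""
    probs = softmax(logits)
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical logits for the token following "the dog is".
toy_logits = {" barking": 3.0, " sleeping": 2.5, " running": 2.0, " studying": -1.0}

for token, p in top_k(toy_logits, 3):
    print(f"{token!r}: {p:.3f}")
```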
Step 4: Attention Mechanism
Visualizes which previous words the model focused on while processing the current word.
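The grid in this step can be thought of as a matrix of attention weights. Below is a minimal single-head sketch of scaled dot-product attention with a causal mask, so each token only looks at itself and earlier tokens. The query and key vectors are invented 2-dimensional toys. Real models learn these vectors and use many heads and layers.

```python
import math


def causal_attention_weights(queries, keys):
    """Return one row of attention weights per token; future positions get 0."""
    d = len(keys[0])
    weights = []
    for i, q in enumerate(queries):
        # Token i may only attend to positions 0..i.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps] + [0.0] * (len(keys) - i - 1)
        weights.append(row)
    return weights


# Hypothetical vectors for three tokens.
toy_queries = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
toy_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

for row in causal_attention_weights(toy_queries, toy_keys):
    print([round(w, 3) for w in row])
```

Each row sums to 1, and the first token can only attend to itself.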
How Humans Read
We skip intermediate words to connect subjects structurally to their verbs across long distances.
How LLMs Read
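Models must assign a probability to every token in sequence, one position at a time. Because each token's attention weights have to sum to 1, attention that is not needed elsewhere tends to get dumped onto the first token (the "attention sink").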
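One simple way to discount the attention sink is to drop the first-token column from each row of attention weights and rescale what remains so it sums to 1 again. This is a sketch under that assumption, not necessarily the normalization this tool applies. The function name `hide_attention_sink` and the example matrix are hypothetical.

```python
def hide_attention_sink(weights):
    """Zero out attention on the first token and renormalize the rest of each row."""
    result = []
    for row in weights:
        rest = row[1:]
        total = sum(rest)
        if total == 0:
            # The row attended only to the first token; nothing is left to rescale.
            result.append([0.0] * len(row))
        else:
            result.append([0.0] + [w / total for w in rest])
    return result


# Hypothetical causal attention rows where most weight sits on the first token.
toy_weights = [
    [1.00, 0.00, 0.00],
    [0.90, 0.10, 0.00],
    [0.80, 0.05, 0.15],
]

for row in hide_attention_sink(toy_weights):
    print([round(w, 2) for w in row])
```

In the last row, the small weights 0.05 and 0.15 become roughly 0.25 and 0.75 after rescaling, which makes the relative pattern among the remaining tokens much easier to see.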
Normalize Attention (Hide 'Attention Sink')
Ignores the heavy focus on the first word to reveal the hidden semantic connections in the darker squares.
Final Review: Human vs. LLM
A live replay of how biological brains and artificial transformers process the exact sentence you entered above.