A comprehensive Anki flashcard deck covering mechanistic interpretability (MI) of transformer language models, based on Neel Nanda's Comprehensive Mechanistic Interpretability Explainer & Glossary.
Cards were generated by asking Claude to parse and synthesize Neel's glossary page into question-and-answer format, then reviewed and corrected for accuracy.
~190 cards organized across these topics:
| Tag | Coverage |
|---|---|
| mech-interp::basics | Features, circuits, motifs, universality, localization |
| mech-interp::features | Features as directions, privileged bases, bottleneck dims |
| mech-interp::superposition | Toy model of superposition, polysemanticity, antipodal pairs |
| mech-interp::field | Black-box vs white-box interpretability, XAI, BERTology |
| ml::basics | Tensors, activations, softmax, cross-entropy, MLPs |
| ml::training | SGD, Adam, AdamW, grokking, phase transitions, scaling laws |
| transformer::basics | Residual stream, LayerNorm, d_model, tokenization |
| transformer::attention | QK/OV circuits, attention patterns, composition |
| transformer::induction | Induction heads/circuits, pointer arithmetic, in-context learning |
| transformer::ioi | Indirect Object Identification circuit |
| transformer::solu | SoLU activation, activation sparsity, lateral inhibition |
| techniques::* | Ablation, logit difference, activation patching, causal scrubbing, probing |
| models | GPT-2, GPT-Neo, OPT, GPT-J, BERT, Pythia, CLIP, etc. |
| tooling | TransformerLens, CircuitsVis, Neuroscope |
- Open Anki
- File → Import
- Select mech-interp-anki.tsv
- Confirm settings:
  - Type: Basic (the file uses two fields: Front, Back)
  - Separator: Tab
  - Allow HTML in fields: off (plain text)
  - Tags column: 3
- Click Import
Cards will be tagged hierarchically (e.g., transformer::induction) so you can study individual topics using the Browser or a filtered deck.
To study on mobile, import the file in Anki desktop first, then sync to your other devices.
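To see how the hierarchical tags group cards before building a filtered deck, the rows can be tallied by top-level tag. The following is a minimal sketch, not part of this repository: the function name `count_by_top_level_tag` and the sample rows are made up for illustration, and it assumes the tags field holds one or more space-separated tags with `::` as the hierarchy separator.

```python
from collections import Counter


def count_by_top_level_tag(rows):
    """Count cards per top-level tag.

    rows: iterable of (front, back, tags) tuples, where tags is a
    space-separated string such as "transformer::induction".
    A card is counted once per distinct top-level tag it carries.
    """
    counts = Counter()
    for _front, _back, tags in rows:
        # "transformer::induction" -> "transformer"; "models" -> "models"
        top_levels = {tag.split("::")[0] for tag in tags.split()}
        counts.update(top_levels)
    return counts


# Placeholder rows for illustration only; these are not cards from the deck.
sample_rows = [
    ("front 1", "back 1", "transformer::induction"),
    ("front 2", "back 2", "transformer::attention transformer::basics"),
    ("front 3", "back 3", "mech-interp::superposition ml::basics"),
]

tag_counts = count_by_top_level_tag(sample_rows)
print(dict(tag_counts))
```

In practice the rows would come from reading mech-interp-anki.tsv and splitting each line on tabs.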
All content is derived from:
Neel Nanda, A Comprehensive Mechanistic Interpretability Explainer & Glossary (2022).
https://www.neelnanda.io/mechanistic-interpretability/glossary
The original glossary is the authoritative reference — these cards are a study aid, not a substitute. When in doubt, check the source.
Recommended further reading:
- A Mathematical Framework for Transformer Circuits
- In-Context Learning and Induction Heads
- A Toy Model of Superposition
- Interpretability in the Wild (IOI)
- Circuits: Zoom In
| File | Description |
|---|---|
| mech-interp-anki.tsv | Anki import file (tab-separated, 3 columns: front, back, tags) |
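If the file is edited by hand, a quick check that every row still has three tab-separated fields can catch problems before import. This is a rough sketch under stated assumptions: `find_malformed_rows` is a hypothetical helper, not something shipped with this repository, and skipping blank lines and lines starting with `#` is an assumption about how such lines should be treated. It also splits on raw tabs and does not handle quoted fields.

```python
def find_malformed_rows(tsv_text, expected_fields=3):
    """Return (line_number, field_count) for rows with the wrong field count.

    Assumption: blank lines and lines starting with '#' are ignored.
    """
    problems = []
    for line_no, line in enumerate(tsv_text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != expected_fields:
            problems.append((line_no, len(fields)))
    return problems


# Placeholder content for illustration; not taken from the deck.
sample_text = (
    "front 1\tback 1\tml::basics\n"
    "front 2\tback 2\n"               # missing the tags column
    "\n"
    "front 3\tback 3\ttooling\textra"  # one field too many
)

malformed = find_malformed_rows(sample_text)
print(malformed)
```

To check the real file, read it with `open("mech-interp-anki.tsv", encoding="utf-8").read()` and pass the result to the function; an empty list means every row has the expected three fields.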
To contribute corrections or new cards, see CONTRIBUTING.md.
The flashcard content (mech-interp-anki.tsv) is licensed under CC BY 4.0. Attribution: derived from Neel Nanda's glossary.
The source glossary (A Comprehensive Mechanistic Interpretability Explainer & Glossary.md) belongs to its original author.