donjguido/mechanistic-interpretability-anki-cards

Mechanistic Interpretability Anki Cards

A comprehensive Anki flashcard deck covering mechanistic interpretability (MI) of transformer language models, based on Neel Nanda's Comprehensive Mechanistic Interpretability Explainer & Glossary.

Cards were generated by asking Claude to parse and synthesize Neel's glossary page into question-and-answer format, then reviewed and corrected for accuracy.

What's Included

~190 cards organized across these topics:

| Tag | Coverage |
| --- | --- |
| mech-interp::basics | Features, circuits, motifs, universality, localization |
| mech-interp::features | Features as directions, privileged bases, bottleneck dims |
| mech-interp::superposition | Toy model of superposition, polysemanticity, antipodal pairs |
| mech-interp::field | Black-box vs white-box interpretability, XAI, BERTology |
| ml::basics | Tensors, activations, softmax, cross-entropy, MLPs |
| ml::training | SGD, Adam, AdamW, grokking, phase transitions, scaling laws |
| transformer::basics | Residual stream, LayerNorm, d_model, tokenization |
| transformer::attention | QK/OV circuits, attention patterns, composition |
| transformer::induction | Induction heads/circuits, pointer arithmetic, in-context learning |
| transformer::ioi | Indirect Object Identification circuit |
| transformer::solu | SoLU activation, activation sparsity, lateral inhibition |
| techniques::* | Ablation, logit difference, activation patching, causal scrubbing, probing |
| models | GPT-2, GPT-Neo, OPT, GPT-J, BERT, Pythia, CLIP, etc. |
| tooling | TransformerLens, CircuitsVis, Neuroscope |

Import Instructions

Anki Desktop

  1. Open Anki
  2. File → Import
  3. Select mech-interp-anki.tsv
  4. Confirm settings:
    • Type: Basic (the file uses two fields: Front, Back)
    • Separator: Tab
    • Allow HTML in fields: off (plain text)
    • Tags column: 3
  5. Click Import
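
Before importing, it can help to confirm that every row really has the three tab-separated columns the settings above expect. The following is a minimal sketch, not part of this repository: the `check_rows` helper and the sample rows are hypothetical, and it assumes the file has no `#`-prefixed header lines and no quoted fields.

```python
import csv
import io


def check_rows(tsv_text):
    """Return (line_number, problem) pairs for rows that would import badly."""
    problems = []
    # QUOTE_NONE: treat quote characters as ordinary text (an assumption
    # about the file, not a statement of how Anki parses it).
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for line_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            problems.append(
                (line_number, f"expected 3 columns, found {len(row)}"))
        elif not row[0].strip() or not row[1].strip():
            problems.append((line_number, "empty front or back"))
    return problems


# Hypothetical sample rows; the second one is missing its tags column.
sample = (
    "What is a feature?\tA property of the input.\tmech-interp::basics\n"
    "Broken row\tonly two columns\n"
)
print(check_rows(sample))  # [(2, 'expected 3 columns, found 2')]
```

To check the real file, read it with `open("mech-interp-anki.tsv", encoding="utf-8").read()` and pass the text to `check_rows`; an empty list means every row has a front, a back, and a tags column.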

Cards will be tagged hierarchically (e.g., transformer::induction) so you can study individual topics using the Browser or a filtered deck.
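
The hierarchical tags can also be used outside Anki, for example to pull one topic out of the TSV. This is a minimal sketch under stated assumptions: the `has_tag` helper and the placeholder cards are hypothetical, and it assumes the tags column holds space-separated tags with `::` marking nesting.

```python
def has_tag(card_tags, wanted):
    """True if any space-separated tag equals `wanted` or is nested under it."""
    return any(
        tag == wanted or tag.startswith(wanted + "::")
        for tag in card_tags.split()
    )


# Placeholder cards as (front, back, tags) tuples, mirroring the 3-column file.
cards = [
    ("Q1", "A1", "transformer::induction"),
    ("Q2", "A2", "transformer::attention"),
    ("Q3", "A3", "ml::training"),
]

selected = [front for front, back, tags in cards if has_tag(tags, "transformer")]
print(selected)  # ['Q1', 'Q2']
```

Matching on `wanted + "::"` rather than a bare prefix keeps a parent tag such as `transformer` from also matching an unrelated tag that merely starts with the same letters.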

AnkiWeb / AnkiMobile

Import via desktop first, then sync.

Source Material

All content is derived from:

Neel Nanda, A Comprehensive Mechanistic Interpretability Explainer & Glossary (2022).
https://www.neelnanda.io/mechanistic-interpretability/glossary

The original glossary is the authoritative reference — these cards are a study aid, not a substitute. When in doubt, check the source.

Recommended further reading:

File Reference

| File | Description |
| --- | --- |
| mech-interp-anki.tsv | Anki import file (tab-separated, 3 columns: front, back, tags) |

Contributing

See CONTRIBUTING.md.

License

The flashcard content (mech-interp-anki.tsv) is licensed under CC BY 4.0. Attribution: derived from Neel Nanda's glossary.

The source glossary (A Comprehensive Mechanistic Interpretability Explainer & Glossary.md) belongs to its original author.
