ThinkLLM

Mechanistic Interpretability

evaluation

The study of how a language model's internal components and computations produce its outputs, typically by reverse-engineering them into mechanisms a human can understand.
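One common way to probe an internal component is ablation: zero out a unit and see how the output changes. The sketch below is a minimal illustration on a toy network, not a method taken from this entry. The weights `W1` and `W2`, the network shape, and the helper names `forward` and `ablation_effects` are all hypothetical and chosen only for the example.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


# Hypothetical hand-set weights for a toy network:
# 2 inputs -> 3 hidden units (ReLU) -> 1 output.
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W2 = [[0.5, -0.5, 2.0]]


def forward(x, ablate=None):
    """Run the toy network; optionally zero out one hidden unit."""
    hidden = relu(matvec(W1, x))
    if ablate is not None:
        hidden[ablate] = 0.0  # ablation: remove this unit's contribution
    return matvec(W2, hidden)[0]


def ablation_effects(x):
    """Return, per hidden unit, how much the output drops when it is ablated."""
    baseline = forward(x)
    return [baseline - forward(x, ablate=i) for i in range(len(W1))]


if __name__ == "__main__":
    x = [1.0, 2.0]
    print("baseline output:", forward(x))
    print("effect of ablating each hidden unit:", ablation_effects(x))
```

On the input `[1.0, 2.0]`, ablating the third hidden unit changes the output far more than ablating either of the other two. That is the kind of evidence used to attribute a behavior to a specific component. Real language models need far more careful interventions than this toy suggests.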

