This repository has been archived by the owner on Nov 29, 2024. It is now read-only.


PaulPauls/llama3_interpretability_sae


Llama 3 Interpretability with Sparse Autoencoders

This project is currently taken down. My apologies.

About

A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.
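Since the repository has been taken down, the original code is unavailable. As a rough illustration of the technique the description names, the following is a minimal sketch of a sparse autoencoder over transformer residual-stream activations, assuming the standard SAE formulation (ReLU-encoded overcomplete dictionary trained with an MSE reconstruction loss plus an L1 sparsity penalty); the class and function names here are hypothetical, not taken from the project.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseAutoencoder(nn.Module):
    """Minimal SAE: an overcomplete linear dictionary with ReLU features.

    Names and hyperparameters are illustrative, not from the original repo.
    """

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)  # d_hidden >> d_model
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        # f: non-negative sparse feature activations; x_hat: reconstruction
        f = F.relu(self.encoder(x))
        x_hat = self.decoder(f)
        return x_hat, f


def sae_loss(x, x_hat, f, l1_coeff: float = 1e-3):
    # Reconstruction MSE plus an L1 penalty that pushes features toward sparsity
    recon = F.mse_loss(x_hat, x)
    sparsity = l1_coeff * f.abs().sum(dim=-1).mean()
    return recon + sparsity


# Toy usage: a batch of 8 fake "residual stream" vectors of width 16,
# mapped into a 4x-overcomplete feature space of width 64.
sae = SparseAutoencoder(d_model=16, d_hidden=64)
x = torch.randn(8, 16)
x_hat, f = sae(x)
loss = sae_loss(x, x_hat, f)
```

In a full pipeline such as the one this project describes, `x` would be activations captured from a chosen Llama 3.2 layer during a forward pass, and the learned feature directions in the decoder would then be inspected as interpretability candidates.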
