SAEfarer is a visualization tool for exploring the relationship between a sparse autoencoder's features and a text classification model's predictions and errors. It is an interactive widget for Jupyter notebooks.
You can install SAEfarer with pip:

    pip install saefarer

Check out the examples for demonstrations of the tool and read the preprint for more details about our approach.
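To confirm that the install landed in the environment your notebook kernel uses, a minimal sketch along these lines may help. It relies only on Python's standard library and the distribution name `saefarer` from the pip command. The helper name `installed_version` is hypothetical and is not part of SAEfarer.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of SAEfarer's API.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "saefarer" is the distribution name used in the pip command.
    found = installed_version("saefarer")
    if found is None:
        print("saefarer is not installed in this environment")
    else:
        print(f"saefarer {found} is installed")
```

Running this in a notebook cell checks the kernel's environment, which can differ from the shell where pip was run.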
This project uses code and/or takes inspiration from several other works:
