Welcome! This codebase accompanies the ACL 2025 paper "Are the Values of LLMs Structurally Aligned with Humans? A Causal Perspective" and is based on SAELens.
Install the dependencies:
pip install -r requirements_vsa.txt
Set up the following directory structure outside the main project directory:
- .
- ├── model_data
- │ │ └── gemma-2b-it
- │ ├── jbloom
- │ │ ├── Gemma-2b-IT-Residual-Stream-SAEs
- │ ├── meta-llama
- │ │ └── Meta-Llama-3-8B-Instruct
- │ ├── Juliushanhanhan
- │ │ └── llama-3-8b-it-res
- └── SAELens
- │ └── value_data
- │ └── value_orientation.csv
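Before opening the notebooks, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the `find_missing` helper and the `EXPECTED_PATHS` list are hypothetical names, and the paths are simply read off the tree. The parent folder of `gemma-2b-it` is not shown in the tree, so that entry is left out of the check rather than guessed.

```python
from pathlib import Path

# Paths read off the directory tree, relative to the folder that contains
# both model_data and SAELens. The parent folder of gemma-2b-it is not shown
# in the tree, so it is deliberately omitted here.
EXPECTED_PATHS = [
    "model_data/jbloom/Gemma-2b-IT-Residual-Stream-SAEs",
    "model_data/meta-llama/Meta-Llama-3-8B-Instruct",
    "model_data/Juliushanhanhan/llama-3-8b-it-res",
    "SAELens/value_data/value_orientation.csv",
]


def find_missing(root):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumption: the script is started from inside the SAELens checkout,
    # so the shared root is one level up.
    missing = find_missing(Path.cwd().parent)
    if missing:
        print("Missing paths:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected paths are present.")
```

If the script is started from somewhere else, pass the shared root folder to `find_missing` directly.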
Run the following notebook to generate data with different role and SAE settings for all values:
tutorials/value_causal_graph.ipynb
After generating the result CSV files, use the following notebook for data analysis by loading the CSV files:
tutorials/value_causal_graph_analysis.ipynb
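As a rough illustration of the loading step, the sketch below gathers several result CSV files into one list of rows. It is an assumption-laden example rather than the analysis notebook's own code: `load_result_rows` is a hypothetical helper, the results folder and file pattern are placeholders, and no particular column names are assumed. It uses only the standard library. pandas would work equally well.

```python
import csv
from pathlib import Path


def load_result_rows(results_dir, pattern="*.csv"):
    """Read every CSV matching pattern in results_dir into one list of dicts.

    Each row gets an extra "source_file" key so that rows produced under
    different role and SAE settings stay distinguishable after merging.
    """
    rows = []
    for path in sorted(Path(results_dir).glob(pattern)):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = path.name
                rows.append(row)
    return rows
```

Calling `load_result_rows("path/to/results")` with the folder where the generation notebook wrote its output returns the rows in file-name order. The folder name here is a placeholder.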