ConSim: Measuring Concept-Based Explanations' Effectiveness with Automated Simulatability

Code related to the paper: https://arxiv.org/abs/2501.05855

Authors:

Installation

git clone https://github.com/AntoninPoche/ConSim.git
cd ConSim
pip install -e .

Launching experiments

First, download the datasets and adapt src/utils/dataset_utils.py to load them. You will also need to adapt src/utils/models_configs.py to create model configs for your datasets. Finally, you may have to add a dataset-specific prompt for SplittedLlamaForCausalLM in src/utils/splitted_models.py.

The scripts are in the scripts folder. Run them in the following order:

  • train_evaluate.py to train and evaluate a model on a dataset and compute its embeddings.
  • llama_embeddings.py to compute the Llama embeddings on a dataset.
  • compute_concepts_and_co.py to compute the concepts and their importance for a model-dataset pair.
  • concepts_communication.py to compute the communication between concepts for a model-dataset pair.
  • make_prompts.py to create the simulatability prompts for a model-dataset pair.
  • call_openai_api.py to call the OpenAI API, using OpenAI models as meta-predictors on the simulatability prompts for a model-dataset pair.
  • compute_methods_perfs.py to compute the performance of the different methods from the OpenAI models' answers for a model-dataset pair.
  • visualize_methods_perfs.py to visualize the performance of the different methods from the OpenAI models' answers.
  • compute_metrics.py to compute the other metrics for a model-dataset pair.
  • analyze_metrics.py to analyze the other metrics with regard to simulatability.
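
The ordering above can be sketched as a small Python driver. This is a minimal sketch and not part of the repository: the function name build_pipeline_commands, the default scripts directory, and the extra_args placeholder are assumptions made for illustration. The actual script parameters are not shown here; they are defined in src/utils/general_utils.py.

```python
import sys
from pathlib import Path

# Script names, in the order listed above.
PIPELINE = [
    "train_evaluate.py",
    "llama_embeddings.py",
    "compute_concepts_and_co.py",
    "concepts_communication.py",
    "make_prompts.py",
    "call_openai_api.py",
    "compute_methods_perfs.py",
    "visualize_methods_perfs.py",
    "compute_metrics.py",
    "analyze_metrics.py",
]


def build_pipeline_commands(scripts_dir="scripts", extra_args=()):
    """Return one argv list per script, in pipeline order.

    `extra_args` is a hypothetical placeholder for whatever command-line
    parameters the scripts accept; none are assumed here.
    """
    return [
        [sys.executable, str(Path(scripts_dir) / name), *extra_args]
        for name in PIPELINE
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in build_pipeline_commands():
        print(" ".join(cmd))
```

To execute the steps rather than print them, each command could be passed to subprocess.run(cmd, check=True), which stops the sequence at the first failing script.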

Parameters for scripts can be found in src/utils/general_utils.py.

Example invocations are listed in xp_to_launch.txt, and launch_scripts.sh shows how to launch many scripts at once.
