generative-social-choice/gsc_abortion

Generative Social Choice: a study of public opinion on abortion

This repo contains the code and data associated with the most up-to-date version of the paper Generative Social Choice (link to paper). This study was conducted in August 2024.

See also our general audience report, our EC 2024 paper, and the code and data from an earlier pilot study on chatbot personalization (November 2023), conducted as part of OpenAI's Democratic Inputs to AI grant program.

Authors of Generative Social Choice: Sara Fish, Paul Gölz, David Parkes, Ariel Procaccia, Gili Rusak, Itai Shapira, and Manuel Wüthrich.


Researchers interested in building on this framework may also be interested in the follow-up paper Generative Social Choice: The Next Generation by Niclas Böhmer, Sara Fish, and Ariel Procaccia (link to code).

Setup instructions

The code was tested using Python 3.12.7.

Step 0. Set up a virtual environment, if applicable.

Step 1. Install required dependencies.

pip install pandas seaborn numpy gurobipy tqdm openai tenacity pulp scipy inflect tabulate 

Step 2. Install this package in development mode (run this in the directory where this README.md is located).

pip install -e .

Step 3. Set your OpenAI API key in the environment variable OPENAI_API_KEY_GENERATIVE_SOCIAL_CHOICE. To verify that it is set properly:

echo $OPENAI_API_KEY_GENERATIVE_SOCIAL_CHOICE
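
The same check can be done from Python, which avoids printing the key to the terminal. This is a minimal sketch: the helper name api_key_is_set is hypothetical and not part of this repo; only the environment variable name is taken from Step 3.

```python
import os

# Name of the environment variable from Step 3.
ENV_VAR = "OPENAI_API_KEY_GENERATIVE_SOCIAL_CHOICE"


def api_key_is_set(environ=os.environ) -> bool:
    """Return True if the key variable is present and non-empty.

    Hypothetical helper for illustration; not part of this repository.
    """
    return bool(environ.get(ENV_VAR, "").strip())


if __name__ == "__main__":
    if api_key_is_set():
        print(f"{ENV_VAR} is set.")
    else:
        print(f"{ENV_VAR} is missing or empty; set it before running the experiments.")
```

Passing a plain dict as environ makes the helper easy to check without touching the real environment.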

Overview of codebase

  • data/
    • llm_data: anonymized logs from the experiments in the paper
    • survey_data: anonymized survey data collected from the study of public opinion on abortion
  • experiments/: Experiment scripts (see below for more details)
  • generate_scripts/: Methods for preprocessing the data (generating summaries and tags), generating a slate using our methodology, and generating a baseline slate
  • plots/: Plotting code for all figures in the paper
  • queries/: Implementation of LLM-based components (discriminative query, generative query, and preprocessing methods)
  • test/: Simple test script for LLM calling and small slate generation
  • utils/: Helper functions

Quick scripts

To run quick tests (to verify that the OpenAI API calls and queries work):

python -m unittest 

To generate all figures from the paper at once:

python plots/plot_all.py 

Reproducing the experiments in the paper

Preprocessing data: generating summaries and tags

Generate summaries of user data from generation survey:

python generate_scripts/generate_summaries.py --ignore_rating_questions

Generate feature representations of users from generation survey:

python generate_scripts/generate_tags.py 
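
The two preprocessing commands above can also be chained from a small driver script. The following is a sketch under assumptions, not part of this repo: the names PREPROCESSING_STEPS, build_commands, and run_steps are hypothetical, and the script simply runs the commands in the order they are listed here. It defaults to a dry run that only prints the commands, since preprocessing is listed among the LLM-based components and may therefore make OpenAI API calls.

```python
import subprocess
import sys

# The two preprocessing commands from this section, in the order listed.
PREPROCESSING_STEPS = [
    ["generate_scripts/generate_summaries.py", "--ignore_rating_questions"],
    ["generate_scripts/generate_tags.py"],
]


def build_commands(steps, python=sys.executable):
    """Prefix each script invocation with the Python interpreter to use."""
    return [[python, *step] for step in steps]


def run_steps(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence if a step exits with an error.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Run from the directory containing this README so the relative paths resolve.
    run_steps(build_commands(PREPROCESSING_STEPS), dry_run=True)
```

Switching dry_run to False executes the scripts for real, so it is worth confirming the API key from Step 3 first.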

Empirical Validation of Discriminative Queries (Figures 1 and 2)

To re-run the empirical validation of the discriminative query from the paper:

python experiments/disc_query_eval.py --use_logprobs --num_threads 10 

To generate Figure 1:

python plots/figure01_disc_query_boxplot.py

To generate Figure 2:

python plots/figure02_disc_query_grid.py

Empirical Validation of Generative Queries (Figures 3 and 4)

To re-run the empirical validation of the generative query from the paper:

python experiments/gen_query_eval_subtask_ii.py
python experiments/gen_query_eval_subtask_i.py

To generate the data displayed in Figure 3:

python experiments/gen_query_eval_subtask_ii_analysis.py

To generate Figure 3 (corresponding to "subtask (ii)" experiment):

python plots/figure03_gen_query_winrate.py

To generate Figure 4 (corresponding to "subtask (i)" experiment):

python plots/figure04_gen_query_top20_rating.py

Slate Generation and Evaluation (Figure 5)

To re-run the slate generation procedure from the paper, run the notebook generate_slate_from_survey_data_real_survey.ipynb.

To re-run the baseline slate generation procedure from the paper (both the generation and validation baseline):

python generate_scripts/generate_baseline_slates.py

To compare our slate with the baseline slate (the data displayed in Figure 5):

python experiments/baseline_comparison.py 

To generate Figure 5:

python plots/plot_figure05_validation_piechart.py

Demographics Analysis (Appendix C.1)

For transparency, we release the code implementing our demographics analysis, even though we do not release the demographics data of our survey participants for data privacy reasons.

To generate Figure 6:

python plots/plot_figure06_demographics.py

Rating Distributions (Appendix C.2)

To generate the figures displayed in Appendix C.2:

python plots/plot_figure07_baseline_comparison.py

Further technical notes

Dependencies

  • MinSizeKMeans and pulp are only used for the generative query and can be omitted otherwise.
  • gurobipy is only used for the matching step and can be omitted otherwise.
  • The MinSizeKMeans we use is a fork of this implementation by Behrouz-Babaki, with a slight modification to allow for fixing the seed when clustering. (link to fork, link to original)
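
To see which dependencies are available in the current environment, a quick import check along the following lines may help. It is a sketch, not part of this repo: the function name missing_modules and the REQUIRED/OPTIONAL split are assumptions based on the install command in Step 1 and the notes above. MinSizeKMeans is left out because its import name is not stated here.

```python
from importlib.util import find_spec

# Import names of the packages from the pip install line in Step 1,
# minus the two that the notes above describe as step-specific.
REQUIRED = ["pandas", "seaborn", "numpy", "tqdm", "openai", "tenacity",
            "scipy", "inflect", "tabulate"]

# Packages the notes above describe as needed only for specific steps.
OPTIONAL = {
    "pulp": "generative query",
    "gurobipy": "matching step",
}


def missing_modules(names, finder=find_spec):
    """Return the names that cannot be located for import."""
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    for name in missing_modules(REQUIRED):
        print(f"missing required package: {name}")
    for name in missing_modules(OPTIONAL):
        print(f"missing optional package: {name} (only needed for the {OPTIONAL[name]})")
```

The check only confirms that a package can be found, not that its version matches the one used in the paper.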
