janeswingler/snoopy_chart_gen_llm

Chart Generation and Context Enrichment Scripts with GPT-4


This repository contains scripts used for generating various types of charts and enriching them with contextual metadata using GPT. The charts include bar, line, and pie charts, with options for creating both standard and misleading visualizations.

Prerequisites

  • ATTENTION: This repository is not designed to function as a standalone program. These scripts are intended for research purposes, and the researcher will need to manually enter file names, adjust parameters in the main scripts, and run validation scripts themselves.
  • OpenAI API Key: You will need an OpenAI API key to run these scripts, as they utilize GPT for generating metadata and descriptions. Ensure that your API key is set up in your environment.
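As a minimal sketch of the API key prerequisite: the helper below reads the key from the environment and fails early with a clear message if it is missing. The variable name `OPENAI_API_KEY` is the one the official OpenAI Python client reads by default; whether these scripts read the key the same way is an assumption, and `get_openai_api_key` is a hypothetical helper, not part of the repository.

```python
import os


def get_openai_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the scripts."
        )
    return key
```

Calling this once at the top of a script surfaces a missing key before any charts are processed.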

LLM Generation Scripts

  • generate_chart_context.py: Generates charts with contextual metadata.
  • generate_chart_x_categories.py: Generates the x-axis categories for the charts.
  • generate_chart_y_values.py: Generates the y-axis values for the charts.
  • generate_chart_captions.py: Generates captions and additional descriptive metadata for the charts.

Programmatic Generation Scripts

  • add_misleading_feature.py: Programmatically adds a specified misleading feature to existing charts. You choose the type of feature to apply, such as a non-zero baseline or inconsistent time intervals, and the script modifies the chart's metadata accordingly.
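To illustrate what adding a misleading feature to chart metadata can look like, here is a rough sketch for the non-zero baseline case. The metadata layout (`y_values`, `y_axis_min`, `misleading`, `misleading_feature`) and the function name are assumptions made for this example; the actual field names used by add_misleading_feature.py are not documented here.

```python
import copy


def add_non_zero_baseline(chart, fraction=0.8):
    """Return a copy of `chart` whose y-axis starts above zero.

    All keys used here are hypothetical; adapt them to the real metadata schema.
    """
    values = chart["y_values"]
    if not values or min(values) <= 0:
        raise ValueError("a non-zero baseline needs strictly positive y-values")

    modified = copy.deepcopy(chart)  # leave the original record untouched
    # Start the axis just below the smallest value to exaggerate differences.
    modified["y_axis_min"] = round(min(values) * fraction, 2)
    modified["misleading"] = True
    modified["misleading_feature"] = "non_zero_baseline"
    return modified
```

Returning a modified copy rather than editing in place keeps the standard version of each chart available for comparison.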

Validation Scripts

  • remove_parse_error_charts.py: Removes charts that have parsing errors, which can occur when the LLM returns an unexpected response. Run after every run of an LLM script.
  • validate_generated_charts.py: Validates the charts produced by the LLM-based script by checking for empty or null values in the generated fields. Invalid charts are logged in a separate file. Run after generate_chart_context.py.
  • validate_x_categories.py: Checks the x-axis categories of generated charts for duplicate or null values. Charts that fail are removed from the dataset, and the cleaned data is saved to a new file. Run after generate_chart_x_categories.py.
  • validate_y_values.py: Checks the y-axis values of generated charts for duplicate or null values. Charts that fail are removed from the dataset, and the cleaned data is saved to a new file. Run after generate_chart_y_values.py.
  • validate_chart_captions.py: Validates the captions generated for charts and removes charts with null or empty values. To be run after generate_chart_captions.py.
  • validate_charts_misleading_features.py: Validates the misleading features applied to charts. It checks whether the charts marked as misleading have the correct features, such as non-zero baselines or inconsistent time intervals for bar and line charts, and non-sum-to-100% or over-segmentation for pie charts. Any discrepancies are logged for further review. To be run after add_misleading_feature.py.
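The checks described in this list can be sketched roughly as follows. This is an illustration only: the field names (`title`, `x_categories`, `y_values`, `caption`, `y_axis_min`, `misleading_feature`), the feature labels, and all function names are assumptions, and the repository's validation scripts may be structured differently.

```python
def find_field_problems(chart, required=("title", "x_categories", "y_values", "caption")):
    """Return a list of human-readable problems found in one chart record."""
    problems = []
    for field in required:
        value = chart.get(field)
        if value is None or value == "" or value == []:
            problems.append(f"{field}: empty or null")

    categories = chart.get("x_categories") or []
    if any(c is None for c in categories):
        problems.append("x_categories: null entry")
    if len(set(categories)) != len(categories):
        problems.append("x_categories: duplicate entries")

    if any(v is None for v in (chart.get("y_values") or [])):
        problems.append("y_values: null entry")
    return problems


def misleading_feature_matches(chart):
    """Check that the labelled misleading feature is actually present."""
    feature = chart.get("misleading_feature")
    if feature == "non_zero_baseline":
        return chart.get("y_axis_min", 0) != 0
    if feature == "non_sum_to_100":
        return abs(sum(chart["y_values"]) - 100) > 1e-6
    return False  # unknown label: flag for manual review


def split_valid_invalid(charts):
    """Separate clean charts from those to be logged or removed."""
    valid, invalid = [], []
    for chart in charts:
        problems = find_field_problems(chart)
        if problems:
            invalid.append({"chart": chart, "problems": problems})
        else:
            valid.append(chart)
    return valid, invalid
```

Keeping the problem list alongside each rejected chart mirrors the logging behaviour described above and makes it easier to see why a chart was dropped.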

Chart Figure Generation Scripts

  • generate_bar_images.py
  • generate_line_images.py
  • generate_pie_images.py

Processing for Hugging Face Scripts

  • prepare_final_metadata_for_hugging_face.py
  • push_to_hugging_face.py
