
@MaxGhenis
Contributor

Summary

  • Adds CPS_2024.file_path (cps_2024.h5) to the HuggingFace upload list in upload_completed_datasets.py
  • PR #438 (Add support for 2024 CPS ASEC data) added the CPS_2024 class and CensusCPS_2024 (2024 ASEC / March 2025 survey), but the upload script was never updated to include the raw CPS dataset
  • Only enhanced_cps_2024.h5 and small_enhanced_cps_2024.h5 were being uploaded — the unenhanced cps_2024.h5 was missing from HuggingFace
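
The change in the first bullet can be sketched roughly as below. The actual layout of upload_completed_datasets.py is not shown in this PR, so `STORAGE_FOLDER`, `files_to_upload`, and the stand-in `CPS_2024` class are hypothetical; only the `CPS_2024.file_path` attribute and the three file names come from the summary above.

```python
from pathlib import Path

# Hypothetical storage location; the real one is defined by the package.
STORAGE_FOLDER = Path("storage")


class CPS_2024:
    """Stand-in for the dataset class added in #438 (assumed shape)."""

    file_path = STORAGE_FOLDER / "cps_2024.h5"


def files_to_upload():
    """Return the dataset files to push to HuggingFace (illustrative only)."""
    return [
        STORAGE_FOLDER / "enhanced_cps_2024.h5",
        STORAGE_FOLDER / "small_enhanced_cps_2024.h5",
        CPS_2024.file_path,  # the entry this PR adds
    ]
```

Referencing `CPS_2024.file_path` rather than a hard-coded string keeps the upload list in step with the dataset class if its path ever changes.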

Why this matters

For the SPM child poverty decomposition, we need both the raw and enhanced CPS for 2024 to cleanly decompose the gap between Census-published child poverty (13.4%) and PolicyEngine's estimate (22.6%). Without cps_2024.h5 on HuggingFace, we had to use cps_2023.h5 (different survey year) as a workaround.

Test plan

  • Verify cps_2024.h5 appears on HuggingFace after next dataset build
  • Verify Microsimulation(dataset="hf://policyengine/policyengine-us-data/cps_2024.h5") works
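
The first check could be scripted along these lines. This is a minimal sketch, not part of the PR: `missing_files` and `list_hf_files` are hypothetical helpers, the repository ID is read off the `hf://` URL above, and `repo_type="model"` as well as the enhanced files sitting at the repository root are assumptions. `HfApi.list_repo_files` is the `huggingface_hub` call used to list the repository contents.

```python
# Repository ID taken from the hf:// URL in the test plan.
REPO_ID = "policyengine/policyengine-us-data"

# Files expected at the repository root after the next build (assumption
# for the two enhanced files; cps_2024.h5 matches the hf:// URL).
EXPECTED = [
    "cps_2024.h5",
    "enhanced_cps_2024.h5",
    "small_enhanced_cps_2024.h5",
]


def missing_files(repo_files, expected=EXPECTED):
    """Return the expected file names that are absent from repo_files."""
    present = set(repo_files)
    return [name for name in expected if name not in present]


def list_hf_files(repo_id=REPO_ID, repo_type="model"):
    """List files in the HuggingFace repo (needs network access).

    repo_type="model" is an assumption; adjust if the repo is a dataset repo.
    """
    from huggingface_hub import HfApi  # imported lazily so the helper above works offline

    return HfApi().list_repo_files(repo_id=repo_id, repo_type=repo_type)


# Usage after the next dataset build (performs a network request):
# missing_files(list_hf_files())
```

An empty list from `missing_files(list_hf_files())` would indicate that the upload picked up the raw CPS file.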

Closes #501

🤖 Generated with Claude Code

The raw (unenhanced) CPS 2024 dataset was never being uploaded to
HuggingFace — only enhanced_cps_2024.h5 and small_enhanced_cps_2024.h5
were included. This means downstream consumers couldn't access the raw
CPS for the 2024 ASEC survey year.

Closes #501

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@MaxGhenis merged commit 3be1d13 into main on Feb 1, 2026
6 of 7 checks passed

Development

Successfully merging this pull request may close these issues.

Upload cps_2024.h5 to HuggingFace
