
WIP: feat(zenodo): automate monthly dataset publishing to Zenodo #815

Open
vcnainala wants to merge 2 commits into development from feat/zenodo-monthly-export

@vcnainala (Member)

Summary

  • Zip CSV exports during the monthly pipeline so coconut_csv-*.zip files are included in S3 uploads (required for Zenodo and the download page).
  • Add publish_zenodo.py to create new Zenodo dataset versions via the deposit API after each monthly export.
  • Expose ZENODO_* configuration through .env.example and config/services.php (Laravel service config pattern).
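
The first bullet (zip CSV exports before they are cleaned up) can be sketched roughly as below. This is not the code from the PR: the function names, the `coconut_csv-*.csv` input pattern, and the verify-before-delete step are assumptions made for illustration.

```python
import zipfile
from pathlib import Path
from typing import List


def zip_csv_export(csv_path: Path, zip_path: Path) -> Path:
    """Compress one CSV export into a zip archive and return the archive path."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store only the file name so the archive carries no directory prefix.
        archive.write(csv_path, arcname=csv_path.name)
    return zip_path


def zip_then_cleanup(export_dir: Path) -> List[Path]:
    """Zip every coconut_csv-*.csv in export_dir, then delete the raw CSV.

    The glob pattern is a hypothetical naming scheme, not taken from the PR.
    """
    archives = []
    for csv_path in sorted(export_dir.glob("coconut_csv-*.csv")):
        zip_path = zip_csv_export(csv_path, csv_path.with_suffix(".zip"))
        # Delete the CSV only after the archive passes a CRC check.
        with zipfile.ZipFile(zip_path) as archive:
            if archive.testzip() is not None:
                raise RuntimeError(f"corrupt archive: {zip_path}")
        csv_path.unlink()
        archives.append(zip_path)
    return archives
```

Ordering the steps as zip, verify, delete means a failed compression leaves the raw CSV in place, so the export can be retried without rerunning the whole pipeline.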

Changes

| Commit | Description |
| --- | --- |
| fix(exports): zip CSV archives for monthly S3 downloads | Zip lite/full CSV before cleanup |
| feat(zenodo): publish monthly exports to Zenodo after S3 upload | Zenodo publisher script + pipeline hook + env config |

Configuration

ZENODO_ENABLED=false
ZENODO_ACCESS_TOKEN=
ZENODO_LATEST_DEPOSITION_ID=13897048
ZENODO_CONCEPT_DOI=10.5281/zenodo.13382750
ZENODO_API_URL=https://zenodo.org/api
ZENODO_AUTO_PUBLISH=false
ZENODO_DRY_RUN=false
ZENODO_RELEASE_NOTES=
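
One way a Python script could read these variables is sketched below. The PR does not show how `publish_zenodo.py` parses its configuration, so the `ZenodoSettings` class, the `env_bool` helper, and the rule that a token is required only for a real (non-dry-run) publish are all assumptions.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}


def env_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret an env string as a boolean; empty or missing means default."""
    if value is None or value.strip() == "":
        return default
    return value.strip().lower() in TRUTHY


@dataclass(frozen=True)
class ZenodoSettings:  # hypothetical container, not from the PR
    enabled: bool
    access_token: str
    latest_deposition_id: Optional[int]
    concept_doi: str
    api_url: str
    auto_publish: bool
    dry_run: bool
    release_notes: str


def load_settings(env: Mapping[str, str]) -> ZenodoSettings:
    """Build settings from a mapping such as os.environ."""
    raw_id = env.get("ZENODO_LATEST_DEPOSITION_ID", "").strip()
    settings = ZenodoSettings(
        enabled=env_bool(env.get("ZENODO_ENABLED")),
        access_token=env.get("ZENODO_ACCESS_TOKEN", "").strip(),
        latest_deposition_id=int(raw_id) if raw_id else None,
        concept_doi=env.get("ZENODO_CONCEPT_DOI", "").strip(),
        api_url=env.get("ZENODO_API_URL", "https://zenodo.org/api").rstrip("/"),
        auto_publish=env_bool(env.get("ZENODO_AUTO_PUBLISH")),
        dry_run=env_bool(env.get("ZENODO_DRY_RUN")),
        release_notes=env.get("ZENODO_RELEASE_NOTES", ""),
    )
    # Assumed policy: a token is only mandatory for a real publish run.
    if settings.enabled and not settings.dry_run and not settings.access_token:
        raise ValueError("ZENODO_ACCESS_TOKEN is required when publishing is enabled")
    return settings
```

Calling `load_settings(os.environ)` at startup would fail fast on a missing token instead of partway through an upload.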

Test plan

  • Run monthly export with ZENODO_ENABLED=false and confirm CSV zips appear under prod/downloads/{YYYY-MM}/
  • Run python publish_zenodo.py --dry-run against a completed export directory
  • Test against Zenodo sandbox with ZENODO_API_URL=https://sandbox.zenodo.org/api
  • Create a draft version on sandbox and verify file names match existing Zenodo conventions
  • Confirm ZENODO_LATEST_DEPOSITION_ID is updated after each publish
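
For the dry-run and sandbox checks above, it may help to list the requests a publish run would make without sending any of them. The sketch below only builds that plan. The endpoint paths follow Zenodo's documented deposit API (`actions/newversion`, bucket uploads, `actions/publish`), but the function name, the placeholder strings, and the step order are assumptions and not the behavior of `publish_zenodo.py`.

```python
from typing import List, Sequence, Tuple

Step = Tuple[str, str]  # (HTTP method, URL or placeholder)


def plan_new_version(
    api_url: str,
    deposition_id: int,
    file_names: Sequence[str],
    auto_publish: bool = False,
) -> List[Step]:
    """Return the requests a new-version publish would issue, without sending them."""
    base = api_url.rstrip("/")
    steps: List[Step] = [
        # Create a draft that becomes the next version of the dataset.
        ("POST", f"{base}/deposit/depositions/{deposition_id}/actions/newversion"),
        # The response's links.latest_draft points at the new draft.
        ("GET", "<links.latest_draft>"),
    ]
    for name in file_names:
        # Files are uploaded to the draft's bucket, known only at run time.
        steps.append(("PUT", f"<links.bucket>/{name}"))
    # Update metadata (version, publication date, release notes) on the draft.
    steps.append(("PUT", f"{base}/deposit/depositions/<draft-id>"))
    if auto_publish:
        steps.append(("POST", f"{base}/deposit/depositions/<draft-id>/actions/publish"))
    return steps
```

With `auto_publish=False` the plan stops at the metadata update, which matches the manual-review default described in this PR: the draft stays unpublished until someone reviews it.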

WIP notes

  • Validate SQL dump zipping on production-sized dumps (~30 GB uncompressed)
  • Decide on default ZENODO_AUTO_PUBLISH policy (currently false for manual review)
  • Update docs/versions.md DOI references once first automated release is published
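
For the first WIP item, a production-sized dump has to be zipped without being loaded into memory. A minimal sketch using only the standard library follows. The function name and chunk size are assumptions, and it has not been measured against a ~30 GB dump.

```python
import shutil
import zipfile
from pathlib import Path

CHUNK_SIZE = 16 * 1024 * 1024  # 16 MiB per read; an arbitrary illustrative value


def zip_large_file(src: Path, dest: Path) -> int:
    """Stream src into a zip archive at dest and return the archive size in bytes."""
    with zipfile.ZipFile(dest, "w", compression=zipfile.ZIP_DEFLATED, allowZip64=True) as archive:
        # force_zip64 writes ZIP64 headers up front, which is needed when the
        # member may exceed the 4 GiB limit of the classic zip format.
        with src.open("rb") as reader, archive.open(src.name, "w", force_zip64=True) as writer:
            shutil.copyfileobj(reader, writer, CHUNK_SIZE)
    return dest.stat().st_size
```

Memory use stays near one chunk regardless of dump size, so the remaining things to validate on production data are wall-clock time and free disk space for the archive.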

vcnainala added 2 commits June 8, 2026 22:17

fix(exports): zip CSV archives for monthly S3 downloads
CSV exports were deleted after SDF conversion without being zipped,
so coconut_csv-*.zip files were missing from prod/downloads uploads.

feat(zenodo): publish monthly exports to Zenodo after S3 upload
Add publish_zenodo.py to create new Zenodo dataset versions via the
deposit API, wire it into the monthly export pipeline, and expose
ZENODO_* settings through config/services.php and .env.example.

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 6.13%. Comparing base (4223a58) to head (1b8f8eb).

Additional details and impacted files
@@              Coverage Diff              @@
##             development    #815   +/-   ##
=============================================
  Coverage           6.13%   6.13%           
  Complexity          1737    1737           
=============================================
  Files                227     227           
  Lines               8961    8961           
=============================================
  Hits                 550     550           
  Misses              8411    8411           

