
Add MinIO round-trip markdown demo #4748

Open
perryrighthere wants to merge 2 commits into opendatalab:master from perryrighthere:minio-markdown-demo


@perryrighthere

Motivation

MinerU already provides local CLI, API, and HTTP client examples, but it does not include an official example for an object-storage-first workflow.

This PR adds a MinIO round-trip demo that shows how to upload a local file to MinIO, read it back from MinIO for parsing, upload the generated artifacts back to MinIO, and produce Markdown whose image references point to MinIO object URLs.
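
The image-reference rewriting step can be illustrated with a minimal sketch. This is not the code from demo/minio_markdown_demo.py: the function names `object_url` and `rewrite_image_refs`, the endpoint, the bucket, and the object prefix are all hypothetical, and the sketch assumes path-style URLs of the form `http(s)://<endpoint>/<bucket>/<key>`. Whether such URLs are readable by downstream systems depends on the bucket's access policy.

```python
import posixpath
import re
from urllib.parse import quote

# Matches Markdown image syntax ![alt](target) where target has no whitespace.
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")
# Matches targets that already carry a URL scheme (http://, https://, s3://, ...).
SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def object_url(endpoint: str, bucket: str, key: str, secure: bool = False) -> str:
    """Build a path-style object URL (assumed layout: scheme://endpoint/bucket/key)."""
    scheme = "https" if secure else "http"
    return f"{scheme}://{endpoint}/{bucket}/{quote(key)}"


def rewrite_image_refs(markdown: str, endpoint: str, bucket: str,
                       prefix: str, secure: bool = False) -> str:
    """Replace relative image targets with object URLs under `prefix`."""

    def replace(match: re.Match) -> str:
        alt, target = match.group(1), match.group(2)
        if SCHEME_PATTERN.match(target):
            return match.group(0)  # already absolute, leave untouched
        # Join the prefix and the relative path, collapsing "./" segments.
        key = posixpath.normpath(posixpath.join(prefix, target))
        return f"![{alt}]({object_url(endpoint, bucket, key, secure)})"

    return IMAGE_PATTERN.sub(replace, markdown)


if __name__ == "__main__":
    md = "![fig](images/a.png) and ![ext](https://example.com/x.png)"
    print(rewrite_image_refs(md, "localhost:9000", "mineru", "out/doc"))
```

Only relative targets are rewritten here; references that already point at an absolute URL pass through unchanged, which keeps the function safe to run more than once on the same Markdown.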

Modification

  • add demo/minio_markdown_demo.py as an official MinIO round-trip parsing example
  • support MinIO endpoint, access key, and secret key configuration via CLI arguments, environment variables, and mineru.json fallback
  • rewrite generated Markdown and JSON image references to MinIO object URLs
  • save the downloaded Markdown locally after the MinIO round-trip completes
  • document the new example in docs/zh/usage/quick_usage.md and docs/en/usage/quick_usage.md
  • add lightweight unit tests for helper behavior in tests/unittest/test_minio_markdown_demo.py
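
The layered configuration described in the list above could be resolved along the following lines. This is a sketch under stated assumptions rather than the PR's implementation: the precedence order (CLI argument, then environment variable, then mineru.json) is inferred from the word "fallback", the environment variable names are hypothetical, and the `"minio"` section in the JSON file is an invented layout used only for illustration.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical environment variable names; the demo may use different ones.
ENV_NAMES = {
    "endpoint": "MINIO_ENDPOINT",
    "access_key": "MINIO_ACCESS_KEY",
    "secret_key": "MINIO_SECRET_KEY",
}


def load_file_config(path: str) -> dict:
    """Read a hypothetical "minio" section from a JSON config file, if present."""
    config_path = Path(path)
    if not config_path.is_file():
        return {}
    with config_path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    section = data.get("minio", {})
    return section if isinstance(section, dict) else {}


def resolve_setting(name: str, cli_value: Optional[str],
                    env: Mapping[str, str], file_config: Mapping[str, str]) -> str:
    """Return the first non-empty value: CLI, then environment, then file."""
    for candidate in (cli_value, env.get(ENV_NAMES[name]), file_config.get(name)):
        if candidate:
            return candidate
    raise ValueError(f"missing MinIO setting: {name}")


def resolve_minio_config(cli_values: Mapping[str, Optional[str]],
                         env: Optional[Mapping[str, str]] = None,
                         file_config: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve endpoint, access key, and secret key with layered fallback."""
    env = os.environ if env is None else env
    file_config = file_config or {}
    return {
        name: resolve_setting(name, cli_values.get(name), env, file_config)
        for name in ENV_NAMES
    }
```

Passing `env` and `file_config` explicitly keeps the resolution logic easy to unit test without touching the real environment or file system.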

BC-breaking

None.

Use cases

  • integrate MinerU into MinIO or S3-compatible object storage workflows
  • provide an official example for storage-backed parsing pipelines
  • generate Markdown that can directly reference MinIO-hosted images in downstream systems

Validation

  • python -m py_compile demo/minio_markdown_demo.py
  • .venv/bin/python -m unittest discover -s tests/unittest -p 'test_minio_markdown_demo.py'

@github-actions
Contributor

github-actions bot commented Apr 7, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Apr 7, 2026
perryrighthere marked this pull request as ready for review April 7, 2026 15:07
dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and enhancement (New feature or request) labels Apr 7, 2026
dosubot bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.) label and removed the size:L (This PR changes 100-499 lines, ignoring generated files.) label Apr 11, 2026

Labels

enhancement (New feature or request), size:XXL (This PR changes 1000+ lines, ignoring generated files.)

1 participant