
Add HTML-to-markdown generation for llms.txt#239

Merged
svekars merged 1 commit into pytorch_sphinx_theme2 from generate-md
Mar 25, 2026

Conversation

@svekars
Contributor

@svekars svekars commented Mar 25, 2026

Add an llm_generate_md theme option that converts Sphinx HTML output to clean .md files and makes llms.txt link to those .md files instead of the .html pages.
It also generates llms-full.txt, which concatenates the content of all pages.
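
For readers who want to set the option explicitly, here is a minimal conf.py sketch. Theme options registered in theme.conf are set through Sphinx's html_theme_options; the theme name below is an assumption taken from the target branch name, not something this description confirms.

```python
# conf.py (sketch; the html_theme value is assumed from the branch name)
html_theme = "pytorch_sphinx_theme2"

html_theme_options = {
    # Option added by this PR. The description says it defaults to True,
    # so this line only makes the choice explicit; set False to opt out.
    "llm_generate_md": True,
}
```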

  • Add _html_to_markdown() with BeautifulSoup (+ regex fallback)
  • Strip theme-injected metadata (date info, headerlinks, nav elements)
  • Normalize Unicode punctuation to ASCII for compatibility
  • Generate llms-full.txt per the llms.txt spec
  • Register the llm_generate_md option in theme.conf (True by default)
  • Add 33 tests covering conversion, file generation, and edge cases
  • Add .claude/ to .gitignore
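
The conversion steps listed above can be sketched roughly as follows. This is not the PR's _html_to_markdown(); the function names, the punctuation table, and the set of stripped elements (nav, script, style, and a.headerlink anchors) are illustrative assumptions, and only headings and paragraphs are handled. It uses BeautifulSoup when it is installed and falls back to regular expressions otherwise.

```python
import html
import re

try:
    from bs4 import BeautifulSoup  # optional dependency
except ImportError:
    BeautifulSoup = None  # use the regex fallback below

# Illustrative subset of Unicode punctuation mapped to ASCII (assumed table).
PUNCT = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...", "\u00a0": "",  # ellipsis, non-breaking space
}


def normalize_punctuation(text: str) -> str:
    """Replace the Unicode punctuation listed in PUNCT with ASCII."""
    for src, dst in PUNCT.items():
        text = text.replace(src, dst)
    return text


def html_to_markdown_sketch(page_html: str) -> str:
    """Hypothetical converter: headings and paragraphs only."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(page_html, "html.parser")
        # Drop navigation, scripts, styles, and Sphinx permalink anchors.
        for selector in ("nav", "script", "style", "a.headerlink"):
            for tag in soup.select(selector):
                tag.decompose()
        # Turn <h1>..<h6> into ATX headings.
        for level in range(1, 7):
            for h in soup.find_all(f"h{level}"):
                title = h.get_text(" ", strip=True)
                h.replace_with(f"\n\n{'#' * level} {title}\n\n")
        # Separate paragraphs with blank lines.
        for p in soup.find_all("p"):
            p.replace_with(f"\n\n{p.get_text()}\n\n")
        text = soup.get_text()
    else:
        text = re.sub(r"(?is)<(nav|script|style)\b.*?</\1>", "", page_html)
        text = re.sub(
            r'(?is)<a\b[^>]*class="[^"]*headerlink[^"]*"[^>]*>.*?</a>', "", text
        )
        text = re.sub(
            r"(?is)<h([1-6])\b[^>]*>(.*?)</h\1>",
            lambda m: f"\n\n{'#' * int(m.group(1))} {m.group(2).strip()}\n\n",
            text,
        )
        text = re.sub(r"(?i)</p>", "\n\n", text)
        text = re.sub(r"(?s)<[^>]+>", "", text)  # remove remaining tags
        text = html.unescape(text)
    text = re.sub(r"\n{3,}", "\n\n", text)  # collapse runs of blank lines
    return normalize_punctuation(text).strip() + "\n"
```

The regex branch is deliberately crude (it does not handle nested elements of the same name), which is a reasonable trade-off for a fallback but one reason to prefer the parser-based path when BeautifulSoup is available.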

Example generated .md files:
HTML: https://deploy-preview-239--pytorchsphinxtheme.netlify.app/community/build_ci_governance
.md: https://deploy-preview-239--pytorchsphinxtheme.netlify.app/community/build_ci_governance.md

HTML: https://deploy-preview-239--pytorchsphinxtheme.netlify.app/community/design
.md: https://deploy-preview-239--pytorchsphinxtheme.netlify.app/community/design.md

llms.txt: https://deploy-preview-239--pytorchsphinxtheme.netlify.app/llms.txt
llms-full.txt: https://deploy-preview-239--pytorchsphinxtheme.netlify.app/llms-full.txt
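
As a rough illustration of how the two index files linked above could be assembled from already converted pages, the sketch below writes each page as a .md file, lists the .md URLs in llms.txt, and concatenates all page bodies into llms-full.txt. The function name, its parameters, and the exact file layout (an H1 title, a "## Docs" section, one link per page) are assumptions for illustration and are not taken from the PR.

```python
from pathlib import Path


def write_llms_files(out_dir, pages, base_url, title="Documentation"):
    """Hypothetical helper.

    pages maps a docname such as "community/design" to a
    (page_title, markdown_text) tuple.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    index_lines = [f"# {title}", "", "## Docs", ""]
    full_parts = []
    for docname, (page_title, markdown) in sorted(pages.items()):
        # Write the per-page markdown file next to where the HTML would live.
        md_path = out / f"{docname}.md"
        md_path.parent.mkdir(parents=True, exist_ok=True)
        md_path.write_text(markdown, encoding="utf-8")
        # Link to the .md file rather than the .html page.
        index_lines.append(f"- [{page_title}]({base_url}/{docname}.md)")
        full_parts.append(markdown.strip())

    (out / "llms.txt").write_text("\n".join(index_lines) + "\n", encoding="utf-8")
    (out / "llms-full.txt").write_text(
        "\n\n".join(full_parts) + "\n", encoding="utf-8"
    )
```

A call might look like `write_llms_files("_build/html", {"community/design": ("Design", "# Design\n\n...")}, "https://example.com")`, where the output directory and URL are placeholders.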

@netlify

netlify bot commented Mar 25, 2026

Deploy Preview for pytorchsphinxtheme ready!

Name Link
🔨 Latest commit 2943cd6
🔍 Latest deploy log https://app.netlify.com/projects/pytorchsphinxtheme/deploys/69c426784b3abc0008d9c0a6
😎 Deploy Preview https://deploy-preview-239--pytorchsphinxtheme.netlify.app

@meta-cla meta-cla bot added the cla signed label Mar 25, 2026
@svekars svekars merged commit f987ecc into pytorch_sphinx_theme2 Mar 25, 2026
7 checks passed
@svekars svekars mentioned this pull request Apr 3, 2026
2 tasks


2 participants