Add endpoint + examples #17

Merged
hershaw merged 9 commits into main from nc/27feb/read-cmd on Mar 2, 2026

@nfcampos nfcampos commented Feb 27, 2026

Summary

  • New examples/read.ts entrypoint for multi-document QnA using witan read to extract text from PDF, DOCX, PPTX, HTML, and 10 other formats
  • Demo mode generates 3 interlocking Acme Corp fixtures and asks a cross-document question; custom mode accepts user files + question
  • New skills/read-source/SKILL.md skill teaching the agent witan read usage (formats, flags, strategy)
  • Fix format.ts to pair parallel tool calls with their results — previously parallel results printed orphaned from their commands
  • Clean up cross-skill references in xlsx-code-mode and xlsx-verify scope sections
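
For reviewers, the pairing fix in format.ts can be pictured roughly as below. This is only a sketch under assumptions: the event shapes (`ToolCall`, `ToolResult`) and the function name `pairCallsWithResults` are hypothetical and not taken from the PR; the actual message types may differ. The idea is to index results by the id of the call that produced them, then print each call next to its own result regardless of arrival order.

```typescript
// Hypothetical event shapes; the real types used by format.ts are not shown in this PR.
type ToolCall = { kind: "call"; id: string; command: string };
type ToolResult = { kind: "result"; callId: string; output: string };
type ToolEvent = ToolCall | ToolResult;
type Pair = { command: string; output: string | null };

// Pair every tool call with its result, keeping the order in which calls were issued.
function pairCallsWithResults(events: ToolEvent[]): Pair[] {
  // First pass: index results by the id of the call they answer.
  const outputs = new Map<string, string>();
  for (const e of events) {
    if (e.kind === "result") outputs.set(e.callId, e.output);
  }
  // Second pass: emit calls in order, attaching the matching result (null if none arrived).
  const pairs: Pair[] = [];
  for (const e of events) {
    if (e.kind === "call") {
      pairs.push({ command: e.command, output: outputs.get(e.id) ?? null });
    }
  }
  return pairs;
}
```

With two parallel calls whose results arrive in reverse order, each command still prints with its own output rather than with whichever result happened to come next.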

Test plan

  • pnpm read --help prints usage
  • pnpm read runs demo mode end-to-end (generates fixtures, agent reads all 3 docs, synthesizes correct answer)
  • pnpm read --verbose shows full message stream with correct tool call/result pairing
  • pnpm read report.pdf minutes.docx "question" runs custom mode with user files
  • Verify format.ts pairing works for pnpm qna and pnpm verify (non-read examples)

🤖 Generated with Claude Code

nfcampos and others added 3 commits February 27, 2026 16:22
- New examples/read.ts entrypoint: demo mode generates 3 Acme Corp
  fixtures (PDF/DOCX/PPTX), custom mode accepts user files + question
- New skills/read-source/SKILL.md teaching the agent witan read usage
- New examples/lib/acme-fixtures.ts with deterministic fixture generators
  and ACME_QUESTION cross-document question
- Fix format.ts to pair parallel tool calls with their results instead
  of printing results orphaned from their commands
- Update AGENTS.md, README.md, package.json with read example docs
- Clean up cross-skill references in xlsx-code-mode and xlsx-verify

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@hershaw hershaw changed the title from "Add read command" to "Add read.ts example for multi-document QnA" on Mar 2, 2026
hershaw and others added 6 commits March 2, 2026 17:20
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@hershaw hershaw changed the title from "Add read.ts example for multi-document QnA" to "Add endpoint + examples" on Mar 2, 2026
@hershaw hershaw marked this pull request as ready for review March 2, 2026 17:42

@hershaw hershaw left a comment

tested and added an example, lgtm

@hershaw hershaw merged commit 64e9cac into main Mar 2, 2026
4 checks passed
@hershaw hershaw deleted the nc/27feb/read-cmd branch March 2, 2026 17:43

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2cf5e797a2

Comment on lines +67 to +68
const basename = path.basename(resolved);
fs.copyFileSync(resolved, path.join(workDir, basename));

P2: Avoid clobbering files with duplicate basenames

In custom mode, each input is copied into the temp workspace under path.basename(...), so two different paths like dirA/report.pdf and dirB/report.pdf will silently overwrite each other and only one document will actually be available to the agent. This can produce incorrect cross-document answers while appearing to process all files, so the staging step should ensure unique destination names.
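
One possible way to get unique destination names is sketched below. It is not the fix adopted in the repository; the helper name `uniqueStagedNames` and the `-2`, `-3` suffix scheme are assumptions made for illustration. The helper only computes names, so the existing `fs.copyFileSync` call could keep doing the copy with the returned name in place of `basename`.

```typescript
import * as path from "node:path";

// Hypothetical helper: map each resolved input path to a destination file name
// that is unique within the temp workspace.
function uniqueStagedNames(inputs: string[]): Map<string, string> {
  const used = new Set<string>();
  const staged = new Map<string, string>();
  for (const input of inputs) {
    const resolved = path.resolve(input);
    // The same file listed twice only needs to be staged once.
    if (staged.has(resolved)) continue;
    const { name, ext } = path.parse(resolved);
    let candidate = `${name}${ext}`;
    // On a collision, append -2, -3, ... before the extension until the name is free.
    for (let n = 2; used.has(candidate); n++) {
      candidate = `${name}-${n}${ext}`;
    }
    used.add(candidate);
    staged.set(resolved, candidate);
  }
  return staged;
}
```

For `dirA/report.pdf` and `dirB/report.pdf` this yields `report.pdf` and `report-2.pdf`, so both documents reach the agent. The comparison here is case-sensitive; a case-insensitive filesystem would need the names normalized before the `used` check.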

Comment on lines +267 to +269
ext := extFromContentType(resp.Header.Get("Content-Type"))
if ext == "" {
ext = filepath.Ext(urlPath(input))

P2: Fall back to URL suffix for generic text content types

The URL path suffix is only used when extFromContentType(...) returns empty, but extFromContentType returns .txt for any unrecognized text/* value; that means URLs like .../notes.md served as text/plain are forced to .txt, so downstream upload uses text/plain instead of markdown-specific handling (notably affecting --outline). Treating generic text types as "unknown" (or checking the URL suffix first in that case) avoids this misclassification.
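
The suggested ordering can be sketched as follows. The reviewed code is Go; this restates the logic in TypeScript to match the examples in this PR, and it is an illustration only. The `SPECIFIC_TYPES` table and the names `extFromUrl` and `pickExtension` are hypothetical, since the real `extFromContentType` mapping is not shown here. A specific content type wins, then the URL suffix, and `.txt` is used only as a last resort for `text/*`.

```typescript
// Hypothetical table of content types considered specific enough to trust.
const SPECIFIC_TYPES: Record<string, string> = {
  "application/pdf": ".pdf",
  "text/html": ".html",
  "text/markdown": ".md",
};

// Extension of the last path segment of a URL, or "" if there is none.
function extFromUrl(url: string): string {
  const pathname = new URL(url).pathname;
  const last = pathname.slice(pathname.lastIndexOf("/") + 1);
  const dot = last.lastIndexOf(".");
  // dot > 0 skips names with no dot and dotfiles such as ".env".
  return dot > 0 ? last.slice(dot).toLowerCase() : "";
}

function pickExtension(contentType: string, url: string): string {
  // Drop parameters such as "; charset=utf-8" before matching.
  const mediaType = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  const specific = SPECIFIC_TYPES[mediaType];
  if (specific) return specific;
  // Generic or unknown type: prefer what the URL path says.
  const fromUrl = extFromUrl(url);
  if (fromUrl) return fromUrl;
  // Last resort for text/* with no usable suffix.
  return mediaType.startsWith("text/") ? ".txt" : "";
}
```

Under this ordering, `.../notes.md` served as `text/plain` resolves to `.md`, while a PDF served from an extensionless URL still resolves to `.pdf`.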
