16 changes: 16 additions & 0 deletions README.ja.md
@@ -80,6 +80,22 @@ By default, the CLI keeps legacy 0-based numeric string column keys (`
By default, the CLI also omits `provenance` / `approximation_level` / `confidence` for shapes/charts. Specify `--include-backend-metadata` if you need them.
Note: MCP `exstruct_extract` defaults to `options.alpha_col=true`, which differs from the CLI default (`false`).

## Quick Start Editing CLI

```bash
exstruct patch --input book.xlsx --ops ops.json --backend openpyxl
exstruct patch --input book.xlsx --ops - --dry-run --pretty < ops.json
exstruct make --output new.xlsx --ops ops.json --backend openpyxl
exstruct ops list
exstruct ops describe create_chart --pretty
exstruct validate --input book.xlsx --pretty
```

- `patch` / `make` print a JSON `PatchResult` to stdout.
- `ops list` / `ops describe` let you inspect the public patch-op schema.
- `validate` reports workbook readability (`is_readable`, `warnings`, `errors`).
- Phase 2 keeps the existing extraction CLI unchanged; it does not yet add `exstruct extract` or interactive safety flags.

## MCP Server (stdio)

### Quick Start with uvx (recommended)
17 changes: 17 additions & 0 deletions README.md
@@ -90,6 +90,23 @@ By default, the CLI keeps legacy 0-based numeric string column keys (`"0"`, `"1"
By default, serialized shape/chart output omits backend metadata (`provenance`, `approximation_level`, `confidence`) to reduce token usage. Use `--include-backend-metadata` or the corresponding Python/MCP option when you need it.
Note: MCP `exstruct_extract` defaults to `options.alpha_col=true`, which differs from the CLI default (`false`).

## Quick Start Editing CLI

```bash
exstruct patch --input book.xlsx --ops ops.json --backend openpyxl
exstruct patch --input book.xlsx --ops - --dry-run --pretty < ops.json
exstruct make --output new.xlsx --ops ops.json --backend openpyxl
exstruct ops list
exstruct ops describe create_chart --pretty
exstruct validate --input book.xlsx --pretty
```

- `patch` and `make` print JSON `PatchResult` to stdout.
- `ops list` / `ops describe` expose the public patch-op schema.
- `validate` reports workbook readability (`is_readable`, `warnings`, `errors`).
- Phase 2 keeps the legacy extraction CLI unchanged; it does not add
`exstruct extract` or interactive safety flags yet.

## MCP Server (stdio)

### Quick Start with `uvx` (recommended)
@@ -0,0 +1,88 @@
# ADR-0007: Editing CLI as Public Operational Interface

## Status

`accepted`

## Background

Phase 1 established `exstruct.edit` as the first-class Python API for workbook
editing while preserving MCP as the host-owned integration layer. That still
left a gap for command-line and agent-oriented workflows: extraction already had
an `exstruct` CLI, but editing was still exposed mainly through MCP tools.

Phase 2 needs to answer two policy questions that are likely to recur:

- how editing commands should coexist with the legacy extraction CLI
- whether the public editing CLI should expose JSON-first operational flows
directly over `exstruct.edit` or continue to depend on MCP-facing entrypoints

The change also touches a public CLI contract, so the compatibility and layering
decision must be recorded explicitly rather than left in implementation notes.

## Decision

- ExStruct adds a first-class editing CLI on the existing `exstruct` console
  script with these Phase 2 commands:
  - `patch`
  - `make`
  - `ops list`
  - `ops describe`
  - `validate`
- The legacy extraction entrypoint `exstruct INPUT.xlsx ...` remains valid and
  is not replaced with `exstruct extract` in Phase 2.
- `patch`, `make`, and `ops*` are thin wrappers around the public
  `exstruct.edit` contract.
- Editing commands are JSON-first:
  - `patch` and `make` serialize `PatchResult`
  - `ops list` / `ops describe` serialize patch-op schema metadata
  - `validate` serializes workbook readability results
- Phase 2 does not introduce:
  - interactive confirmation flows
  - backup / allow-root / deny-glob flags
  - a request-envelope JSON CLI format
  - a new public Python validation API

## Consequences

- Users and agents gain a stable command-line surface for workbook editing
without routing through MCP.
- The existing extraction CLI keeps backward compatibility because editing
dispatch is opt-in by subcommand.
- The operational CLI now aligns with the public Python API, which reduces the
risk of CLI-only business logic drift.
- `validate` remains a CLI-only operational helper in Phase 2, so Python API
parity for validation is still deferred.
- The `exstruct` CLI now has two invocation styles (legacy extraction and edit
subcommands), which is slightly less uniform than a full subcommand redesign
but materially lowers migration risk.

## Rationale

- Tests:
  - `tests/cli/test_edit_cli.py`
  - `tests/cli/test_cli.py`
  - `tests/cli/test_cli_alpha_col.py`
  - `tests/edit/test_api.py`
  - `tests/mcp/test_validate_input.py`
- Code:
  - `src/exstruct/cli/main.py`
  - `src/exstruct/cli/edit.py`
  - `src/exstruct/edit/__init__.py`
  - `src/exstruct/mcp/validate_input.py`
- Related specs:
  - `dev-docs/specs/editing-api.md`
  - `dev-docs/specs/editing-cli.md`
  - `dev-docs/architecture/overview.md`
  - `docs/cli.md`
  - `docs/api.md`
  - `README.md`
  - `README.ja.md`

## Supersedes

- None

## Superseded by

- None
1 change: 1 addition & 0 deletions dev-docs/adr/README.md
@@ -41,3 +41,4 @@ ADRs record what was decided, under which constraints, and which trade-offs were
| `ADR-0004` | Patch Backend Selection Policy | `accepted` | `mcp` |
| `ADR-0005` | PathPolicy Safety Boundary | `accepted` | `safety` |
| `ADR-0006` | Public Edit API and Host-Owned Safety Boundary | `accepted` | `editing` |
| `ADR-0007` | Editing CLI as Public Operational Interface | `accepted` | `editing` |
6 changes: 6 additions & 0 deletions dev-docs/adr/decision-map.md
@@ -32,6 +32,7 @@ This document is a human-readable map for navigating ADRs by domain.

- `ADR-0003` Output Serialization Omission Policy (`accepted`)
- `ADR-0004` Patch Backend Selection Policy (`accepted`)
- `ADR-0007` Editing CLI as Public Operational Interface (`accepted`)

## mcp

@@ -47,11 +48,16 @@ This document is a human-readable map for navigating ADRs by domain.
## editing

- `ADR-0006` Public Edit API and Host-Owned Safety Boundary (`accepted`)
- `ADR-0007` Editing CLI as Public Operational Interface (`accepted`)

## api

- `ADR-0006` Public Edit API and Host-Owned Safety Boundary (`accepted`)

## cli

- `ADR-0007` Editing CLI as Public Operational Interface (`accepted`)

## Supersession Relationships

- There are currently no ADR supersession relationships.
18 changes: 18 additions & 0 deletions dev-docs/adr/index.yaml
@@ -84,3 +84,21 @@ adrs:
      - dev-docs/specs/data-model.md
      - docs/api.md
      - docs/mcp.md
  - id: ADR-0007
    title: Editing CLI as Public Operational Interface
    status: accepted
    path: dev-docs/adr/ADR-0007-editing-cli-as-public-operational-interface.md
    primary_domain: editing
    domains:
      - editing
      - cli
      - compatibility
    supersedes: []
    superseded_by: []
    related_specs:
      - dev-docs/specs/editing-api.md
      - dev-docs/specs/editing-cli.md
      - docs/cli.md
      - docs/api.md
      - README.md
      - README.ja.md
7 changes: 7 additions & 0 deletions dev-docs/architecture/overview.md
@@ -41,6 +41,7 @@ exstruct/
    specs.py
    types.py
  cli/
    edit.py
    main.py
```

@@ -84,6 +85,12 @@ PDF/PNG output (for RAG use cases)

CLI entry point

- `main.py` keeps the legacy extraction CLI and dispatches to editing
subcommands only when the first token matches `patch` / `make` / `ops` /
`validate`
- `edit.py` contains the Phase 2 editing parser, JSON serialization helpers,
and wrappers around `exstruct.edit`

### edit/

First-class public workbook editing API
3 changes: 3 additions & 0 deletions dev-docs/specs/editing-api.md
@@ -3,6 +3,9 @@
This document defines the Phase 1 public editing contract exposed from
`exstruct.edit`.

Phase 2 adds a CLI wrapper around this contract; the CLI-specific surface is
documented separately in `dev-docs/specs/editing-cli.md`.

## Public import path

- Primary public package: `exstruct.edit`
85 changes: 85 additions & 0 deletions dev-docs/specs/editing-cli.md
@@ -0,0 +1,85 @@
# Editing CLI Specification

This document defines the Phase 2 public editing CLI contract.

## Command surface

- Editing commands are exposed from the existing `exstruct` console script.
- Phase 2 commands:
  - `exstruct patch`
  - `exstruct make`
  - `exstruct ops list`
  - `exstruct ops describe`
  - `exstruct validate`
- The legacy extraction entrypoint `exstruct INPUT.xlsx ...` remains valid and
is not rewritten to `exstruct extract` in Phase 2.

## Dispatch and compatibility rules

- `exstruct.cli.main` dispatches to the editing parser only when the first
token is one of the Phase 2 editing subcommands.
- All other invocations continue to use the extraction parser and
`process_excel` path unchanged.
- Phase 2 does not add a new console script or top-level Python export.
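
The dispatch rule above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `exstruct.cli.main`: the names `EDIT_SUBCOMMANDS`, `is_edit_invocation`, and `route` are hypothetical, and the returned labels only stand in for the two parsers.

```python
from collections.abc import Sequence

# Phase 2 editing subcommands named in this spec.
EDIT_SUBCOMMANDS = frozenset({"patch", "make", "ops", "validate"})


def is_edit_invocation(argv: Sequence[str]) -> bool:
    """Return True only when the first token is an editing subcommand."""
    return len(argv) > 0 and argv[0] in EDIT_SUBCOMMANDS


def route(argv: Sequence[str]) -> str:
    """Name the parser that would handle argv (hypothetical labels)."""
    # Anything that does not start with an editing subcommand keeps
    # going through the legacy extraction path.
    return "edit" if is_edit_invocation(argv) else "extract"
```

Because only the first token is inspected, an invocation such as `exstruct book.xlsx --format json` never reaches the editing parser in this sketch.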

## Patch and make commands

- `patch` is the CLI wrapper over `exstruct.edit.patch_workbook`.
- `make` is the CLI wrapper over `exstruct.edit.make_workbook`.
- Shared request flags:
  - `--sheet`
  - `--on-conflict {overwrite,skip,rename}`
  - `--backend {auto,com,openpyxl}`
  - `--auto-formula`
  - `--dry-run`
  - `--return-inverse-ops`
  - `--preflight-formula-check`
  - `--pretty`
- `patch` requires:
  - `--input PATH`
  - `--ops FILE|-`
- `patch` optionally accepts `--output PATH`; when omitted, the existing patch
output defaulting behavior remains in effect.
- `make` requires `--output PATH`.
- `make` accepts optional `--ops FILE|-`; when omitted, `ops=[]`.
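
As a rough illustration of the `patch` flag surface listed above, the flags could be declared with `argparse` as shown here. The function name `build_patch_parser` is hypothetical, modelling the valueless flags as `store_true` switches is an assumption, and this spec does not state defaults for `--on-conflict` or `--backend`, so the sketch leaves them as `None`.

```python
import argparse


def build_patch_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the `patch` flags listed in this spec."""
    parser = argparse.ArgumentParser(prog="exstruct patch")
    parser.add_argument("--input", required=True)
    parser.add_argument("--ops", required=True)  # file path, or "-" for stdin
    parser.add_argument("--output")
    parser.add_argument("--sheet")
    parser.add_argument("--on-conflict", choices=["overwrite", "skip", "rename"])
    parser.add_argument("--backend", choices=["auto", "com", "openpyxl"])
    # Assumption: flags without a value are plain boolean switches.
    for flag in (
        "--auto-formula",
        "--dry-run",
        "--return-inverse-ops",
        "--preflight-formula-check",
        "--pretty",
    ):
        parser.add_argument(flag, action="store_true")
    return parser
```

A `make` parser would differ only in requiring `--output`, dropping `--input`, and making `--ops` optional.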

## Ops input contract

- `--ops` reads UTF-8 JSON from a file path or stdin marker `-`.
- The top-level JSON value must be an array.
- Each array item follows the existing public patch-op normalization rules
exposed from `exstruct.edit`, including alias normalization and JSON-string
op coercion.
- Phase 2 does not accept a request-envelope JSON document on the CLI.
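
The file-or-stdin rule and the top-level array check can be sketched as below. The helper names `parse_ops_text` and `load_ops` are hypothetical, as is the error message; per-item normalization (aliases, JSON-string op coercion) is left to `exstruct.edit` and is not modelled here.

```python
from __future__ import annotations

import json
import sys
from pathlib import Path
from typing import Any


def parse_ops_text(text: str) -> list[Any]:
    """Parse ops JSON and enforce the top-level array rule."""
    ops = json.loads(text)
    if not isinstance(ops, list):
        # A request-envelope object such as {"ops": [...]} is rejected.
        raise ValueError("--ops must contain a top-level JSON array")
    return ops


def load_ops(source: str) -> list[Any]:
    """Read ops from a file path, or from stdin when source is "-"."""
    if source == "-":
        # Note: sys.stdin decoding follows the environment's settings.
        text = sys.stdin.read()
    else:
        text = Path(source).read_text(encoding="utf-8")
    return parse_ops_text(text)
```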

## Output contract

- `patch` and `make` serialize `PatchResult` to stdout as JSON.
- `validate` serializes the existing input validation result shape:
  - `is_readable`
  - `warnings`
  - `errors`
- `ops list` returns compact summaries with `op` and `description`.
- `ops describe` returns detailed patch-op schema metadata for one op.
- `--pretty` applies `indent=2` JSON formatting to all Phase 2 editing
commands.
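
The `--pretty` rule maps directly onto the `indent` argument of `json.dumps`. A minimal sketch, assuming the result has already been converted to plain JSON-compatible data (`render_json` is a hypothetical helper name, and other serialization settings such as `ensure_ascii` are not specified here):

```python
import json
from typing import Any


def render_json(payload: Any, pretty: bool = False) -> str:
    """Serialize a result payload; --pretty corresponds to indent=2."""
    return json.dumps(payload, indent=2 if pretty else None)
```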

## Exit-code rules

- `patch` / `make` exit `0` when the serialized `PatchResult` has
`error is None`; otherwise they exit `1`.
- `validate` exits `0` when `is_readable=true`; otherwise `1`.
- `ops list` exits `0` on success.
- `ops describe` exits `1` for unknown op names.
- JSON parse failures, request validation failures, and local I/O failures are
reported as stderr CLI errors and exit `1`; Phase 2 does not introduce a
separate generic JSON error envelope for these cases.
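
The result-driven exit codes above can be expressed as two small functions. This sketch treats the serialized results as plain mappings; only the `error` and `is_readable` keys come from this spec, and the function names are hypothetical.

```python
from collections.abc import Mapping
from typing import Any


def patch_exit_code(result: Mapping[str, Any]) -> int:
    """Exit 0 when the serialized PatchResult carries no error, else 1."""
    return 0 if result.get("error") is None else 1


def validate_exit_code(result: Mapping[str, Any]) -> int:
    """Exit 0 only when the workbook was reported as readable."""
    return 0 if result.get("is_readable") is True else 1
```

Failures that happen before a result exists (bad JSON, invalid requests, local I/O errors) fall outside these helpers, since they are reported on stderr rather than through a result payload.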

## Explicit non-goals for Phase 2

- No `exstruct extract` subcommand
- No backup / confirmation / allow-root / deny-glob flags
- No summary-mode output
- No changes to backend selection or fallback policy
- No changes to MCP tool contracts
- No new public Python validation API
2 changes: 2 additions & 0 deletions docs/api.md
@@ -105,6 +105,8 @@ Key points:
MCP patch contract in Phase 1.
- Use `list_patch_op_schemas()` / `get_patch_op_schema()` to inspect the public
operation schema programmatically.
- The matching operational CLI commands are `exstruct patch`, `exstruct make`,
`exstruct ops`, and `exstruct validate`.

## Dependencies

61 changes: 60 additions & 1 deletion docs/cli.md
@@ -1,6 +1,11 @@
# CLI User Guide

This page explains how to run ExStruct from the command line, what each flag does, and common workflows. The CLI wraps `process_excel` under the hood.
This page explains how to run ExStruct from the command line, what each flag
does, and common workflows.

- Extraction keeps the legacy `exstruct INPUT.xlsx ...` form and wraps
`process_excel`.
- Editing uses subcommands such as `exstruct patch` and wraps `exstruct.edit`.

## Basic usage

@@ -14,6 +19,58 @@ exstruct INPUT.xlsx --format toon # TOON output (needs python-toon)
- `INPUT.xlsx` supports `.xlsx/.xlsm/.xls`.
- Exit code `0` on success, `1` on failure.

## Editing commands

Phase 2 adds JSON-first editing commands while keeping the extraction entrypoint
unchanged.

```bash
exstruct patch --input book.xlsx --ops ops.json --backend openpyxl
exstruct patch --input book.xlsx --ops - --dry-run --pretty < ops.json
exstruct make --output new.xlsx --ops ops.json --backend openpyxl
exstruct ops list
exstruct ops describe create_chart --pretty
exstruct validate --input book.xlsx --pretty
```

- `patch` serializes `PatchResult` to stdout and exits `1` when
`PatchResult.error` is present. JSON parse, request validation, and local I/O
failures are reported on stderr and also exit `1`.
- `make` serializes `PatchResult` for new workbook creation.
- `ops list` returns compact `{op, description}` summaries.
- `ops describe` returns the detailed schema for one patch op.
- `validate` returns input readability checks (`is_readable`, `warnings`,
`errors`).
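
For scripts or agents that drive these commands from Python, a thin wrapper might look like the sketch below. It assumes `exstruct` is on `PATH`; `build_patch_argv` and `run_patch` are hypothetical helper names, not part of ExStruct. Only flags documented on this page are used.

```python
from __future__ import annotations

import json
import subprocess
from typing import Any


def build_patch_argv(input_path: str, ops_path: str, *, dry_run: bool = False) -> list[str]:
    """Assemble an `exstruct patch` command line."""
    argv = ["exstruct", "patch", "--input", input_path, "--ops", ops_path]
    if dry_run:
        argv.append("--dry-run")
    return argv


def run_patch(
    input_path: str, ops_path: str, *, dry_run: bool = False
) -> tuple[int, dict[str, Any] | None]:
    """Run the CLI and return (exit code, parsed PatchResult or None)."""
    proc = subprocess.run(
        build_patch_argv(input_path, ops_path, dry_run=dry_run),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is an expected outcome, not an exception
    )
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        # stdout may hold no JSON when the CLI reports an error on stderr.
        payload = None
    return proc.returncode, payload
```

Checking the exit code first and then the parsed payload keeps the caller independent of how errors are worded on stderr.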

## Editing options

### `patch`

| Flag | Description |
| ---- | ----------- |
| `--input PATH` | Existing workbook to edit. |
| `--ops FILE\|-` | JSON array of patch ops from a file or stdin. |
| `--output PATH` | Optional output workbook path. If omitted, the existing default patch output naming applies. |
| `--sheet TEXT` | Top-level sheet fallback for patch ops. |
| `--on-conflict {overwrite,skip,rename}` | Output conflict policy. |
| `--backend {auto,com,openpyxl}` | Backend selection. |
| `--auto-formula` | Treat `=...` values in `set_value` ops as formulas. |
| `--dry-run` | Simulate changes without saving. |
| `--return-inverse-ops` | Return inverse ops when supported. |
| `--preflight-formula-check` | Run formula-health validation before saving when supported. |
| `--pretty` | Pretty-print JSON output. |

### `make`

`make` accepts the same flags as `patch`, except that `--output PATH` is
required and `--input` is not used. `--ops` is optional; omitting it creates an
empty workbook.

### `ops` and `validate`

- `exstruct ops list [--pretty]`
- `exstruct ops describe OP [--pretty]`
- `exstruct validate --input PATH [--pretty]`

## Options

| Flag | Description |
@@ -59,6 +116,8 @@ exstruct sample.xlsx --pdf --image --dpi 144 -o out.json
## Notes

- Optional dependencies are lazy-imported. Missing packages raise a `MissingDependencyError` with install hints.
- Editing commands are JSON-first and do not add interactive confirmation,
backup creation, or path-restriction flags in Phase 2.
- On non-COM environments, prefer `--mode libreoffice` for best-effort rich extraction on `.xlsx/.xlsm`, or `--mode light` for minimal extraction.
- `--mode libreoffice` is best-effort, not a strict subset of COM output. It does not render PDFs/PNGs and does not compute auto page-break areas in v1.
- `--mode libreoffice` combined with `--pdf`, `--image`, or `--auto-page-breaks-dir` fails early with a configuration error instead of silently ignoring the option.