Skip to content

v1.5.0#6

Open
mattyhansen wants to merge 18 commits into
mainfrom
dev
Open

v1.5.0#6
mattyhansen wants to merge 18 commits into
mainfrom
dev

Conversation

@mattyhansen

@mattyhansen mattyhansen commented May 10, 2026

Copy link
Copy Markdown
Contributor

This pull request prepares the v1.5.0 release with PHP client runtime features, typed response helpers, streaming improvements, and GOAT Flow documentation and agent-workflow scaffolding. It is not documentation-only; the runtime changes are part of the release scope.

Key changes include:

PHP Client Runtime Features

  • Added OpenTelemetry tracing middleware that emits client spans and injects W3C trace-context headers when explicitly configured.
  • Added typed stream callback handlers, including a printing reference handler and cancellation support through typed on* hooks.
  • Added typed citation and guardrail assessment DTO accessors while preserving the existing raw array fields for backward compatibility.
  • Added structured-output hydration, additional stop reasons, richer AgentInput media helpers, and specialized AgentErrorException subclasses.

Tests, Fixtures, and Release Notes

  • Expanded unit coverage and fixtures for citations, guardrails, streaming helpers, exceptions, media inputs, and tracing middleware.
  • Updated README and CHANGELOG entries for the v1.5.0 release features.

GOAT Flow and Agent Workflow Materials

  • Added and updated GOAT Flow architecture, code map, learning-loop buckets, scratch/log/task anchors, and skill-reference playbooks.
  • Mirrored agent skill documentation for the .agents and .claude runtimes, including goat-critique reference extraction for rubrics and sub-agent directives.
  • Added local agent hook/config templates and repository housekeeping for the agent workflow.

Summary by CodeRabbit

  • New Features

    • New multi-skill “goat-*” workflows (critique, review, debug, plan, qa, security), richer multi‑modal inputs (images/videos/documents/cache points), enhanced tracing/observability, sanitized stream summaries, response-observer hooks, and finer error/stop classifications.
  • Documentation

    • Published Wire Contract v1, comprehensive skill/reference guides, patterns, architecture, examples, and a Python gateway reference.
  • Tests

    • Expanded fixtures and suites covering tracing, streams, citations, guardrails, media inputs, exception mapping, and contract compatibility.
  • Chores

    • Added safety hooks, formatting/guard scripts, dependency/install utilities, release tooling, and repo housekeeping.

Review Change Stack

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR bumps the project to a v1.5.0 feature release, adding new runtime capabilities to the Strands PHP client (tracing, richer DTOs, streaming helpers, exception specialization, richer media inputs) alongside a large set of goat-flow reference/documentation scaffolding and repo housekeeping.

Changes:

  • Added OpenTelemetry tracing middleware, plus new streaming callback handler utilities (including a printing reference handler).
  • Introduced typed DTOs/accessors for citations and guardrail assessments, plus structured-output hydration and new StopReason variants.
  • Added specialized AgentErrorException subclasses and expanded tests/fixtures; updated docs/config/ignore files and added Node dev tooling lockfiles.

Reviewed changes

Copilot reviewed 118 out of 121 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/Unit/Streaming/StreamEventCitationTest.php Unit tests for StreamEvent citation DTO hydration
tests/Unit/Streaming/StreamCallbackHandlerTest.php Unit tests for typed stream callback dispatch helper
tests/Unit/Streaming/PrintingCallbackHandlerTest.php Unit tests for stdout/stderr printing handler
tests/Unit/Response/GuardrailAssessmentTest.php Unit tests for GuardrailAssessment DTO hydration
tests/Unit/Response/Citation/CitationTest.php Unit tests for Citation DTO hydration
tests/Unit/Response/Citation/CitationLocationTest.php Unit tests for CitationLocation DTO hydration
tests/Unit/GuardrailTraceTest.php Added tests for typed guardrail assessment accessor + caching
tests/Unit/Exceptions/ThrottledExceptionTest.php Unit tests for 429 exception subtype
tests/Unit/Exceptions/MaxTokensExceptionTest.php Unit tests for max-tokens exception subtype
tests/Unit/Exceptions/ContextOverflowExceptionTest.php Unit tests for context-overflow exception subtype
tests/Unit/Exceptions/AgentErrorExceptionTest.php Unit tests for subtype resolution in fromHttpResponse()
tests/Unit/AgentResponseTest.php Added tests for new StopReason values, citation DTO accessor, structured output hydration
tests/Unit/AgentInputTest.php Added tests for new AgentInput media helper methods
tests/Http/Middleware/OtelTracingMiddlewareTest.php Unit tests for OpenTelemetry tracing middleware behavior
tests/Fixtures/invoke-with-guardrail.json New invoke fixture including guardrail trace data
tests/Fixtures/invoke-with-citations.json New invoke fixture including citations content blocks
src/Streaming/StreamEvent.php Added typed citation accessor on stream events
src/Streaming/StreamCallbackHandler.php New typed dispatcher base class for stream callbacks
src/Streaming/PrintingCallbackHandler.php New reference callback handler that prints tokens/errors
src/Response/StopReason.php Added Cancelled + Checkpoint stop reasons
src/Response/GuardrailTrace.php Added typed assessment accessor with caching; adjusted class mutability
src/Response/GuardrailAssessment.php New DTO for typed guardrail assessments
src/Response/Citation/CitationSourceContent.php New DTO for citation source content
src/Response/Citation/CitationLocation.php New DTO for citation location metadata
src/Response/Citation/CitationGeneratedContent.php New DTO for citation generated content
src/Response/Citation/Citation.php New top-level Citation DTO + fromArray()
src/Response/AgentResponse.php Added citation DTO accessor + structured output hydration helper
src/Http/Middleware/OtelTracingMiddleware.php New OpenTelemetry span + header injection middleware
src/Exceptions/ThrottledException.php New 429-specific exception subtype
src/Exceptions/MaxTokensException.php New max-tokens exception subtype
src/Exceptions/ContextOverflowException.php New context-overflow exception subtype
src/Exceptions/AgentErrorException.php Added subtype resolution logic in fromHttpResponse()
src/Context/AgentInput.php Added new rich-media helper methods (S3 image, URL image/doc/video, base64 video)
README.md Documented tracing middleware + stream callback handler usage
pnpm-lock.yaml Added pnpm lockfile for goat-flow tooling dependency
package.json Added Node dev dependency for goat-flow tooling
composer.json Added OTEL + psr7 dev dependencies; updated suggest/branch-alias/plugins
CLAUDE.md Added/updated repo-local agent workflow/instructions documentation
CHANGELOG.md Added v1.5.0 unreleased entries for new client features
.goat-flow/tasks/README.md Documented local-only tasks directory purpose
.goat-flow/tasks/.gitignore Ignored task contents while keeping anchor files tracked
.goat-flow/skill-reference/skill-quality-testing/deployment.md Added skill deployment/testing reference
.goat-flow/skill-reference/skill-quality-testing/adversarial-framing.md Added review-class skill authoring reference
.goat-flow/skill-reference/skill-quality-testing.md Added index for skill quality testing references
.goat-flow/skill-reference/skill-preamble.md Added shared skill execution/evidence conventions
.goat-flow/skill-reference/skill-conventions.md Added full-depth skill authoring conventions
.goat-flow/skill-reference/README.md Added index for skill-reference playbooks
.goat-flow/skill-reference/browser-use.md Added browser-use tool playbook
.goat-flow/security-policy.md Added placeholder security policy hook file
.goat-flow/scratchpad/README.md Documented local-only scratchpad purpose
.goat-flow/scratchpad/.gitignore Ignored scratchpad contents while keeping anchors tracked
.goat-flow/patterns/testing.md Added pattern describing fixture-based testing approach
.goat-flow/patterns/README.md Added patterns index and formatting conventions
.goat-flow/logs/security/README.md Documented security log persistence behavior
.goat-flow/logs/quality/README.md Documented quality report persistence behavior
.goat-flow/logs/critiques/README.md Documented critique snapshot persistence behavior
.goat-flow/lessons/README.md Added lessons index and formatting conventions
.goat-flow/lessons/history.md Added lesson bucket frontmatter anchor
.goat-flow/glossary.md Added/updated project glossary entries
.goat-flow/footguns/transport.md Added transport-related footguns documentation
.goat-flow/footguns/README.md Added footguns index and formatting conventions
.goat-flow/footguns/integration.md Added integration-related footguns documentation
.goat-flow/decisions/README.md Added ADR conventions and constraints
.goat-flow/config.yaml Added goat-flow config (version/agents/skills)
.goat-flow/code-map.md Added codebase structure map
.goat-flow/architecture.md Added high-level architecture + request flow docs
.goat-flow/.gitignore Added goat-flow-specific ignore rules for local-only logs/tasks/scratchpad
.gitignore Expanded ignore list (node_modules, coverage artifacts, editor/OS files, agent tooling)
.github/git-commit-instructions.md Added commit message conventions
.codex/hooks.json Added Codex hooks configuration
.codex/config.toml Added Codex config template
.claude/skills/goat/SKILL.md Updated/added goat dispatcher skill (v1.5.0)
.claude/skills/goat-security/references/secrets-and-data-exposure.md Added goat-security reference pack
.claude/skills/goat-security/references/project-policy-template.md Added goat-security policy template pack
.claude/skills/goat-security/references/file-upload-and-paths.md Added goat-security file/path reference pack
.claude/skills/goat-security/references/dependency-and-supply-chain.md Added goat-security supply-chain reference pack
.claude/skills/goat-security/references/common-threats.md Added goat-security common threats reference pack
.claude/skills/goat-security/references/cicd-and-agent-surfaces.md Added goat-security CI/agent-surfaces reference pack
.claude/skills/goat-security/references/auth-authz.md Added goat-security auth/authz reference pack
.claude/skills/goat-review/references/refuter-spec.md Added goat-review refuter spec reference
.claude/skills/goat-review/references/examples.md Added goat-review examples reference
.claude/skills/goat-plan/references/milestone-examples.md Added goat-plan milestone examples reference
.claude/skills/goat-plan/references/issue-format.md Added goat-plan ISSUE.md format reference
.claude/skills/goat-debug/SKILL.md Updated/added goat-debug skill (v1.5.0)
.claude/skills/goat-critique/references/sub-agent-directives.md Added goat-critique sub-agent directives reference
.claude/skills/goat-critique/references/rubric-examples.md Added goat-critique rubric examples reference
.claude/settings.json Added/updated Claude tool permissions + hooks
.claude/hooks/stop-lint.sh Added stop hook for linting checks
.claude/hooks/guard-write-size.sh Added write-size guard hook
.claude/hooks/format-file.sh Added post-write auto-formatter hook
.agents/skills/goat/SKILL.md Mirrored goat dispatcher skill for .agents runtime
.agents/skills/goat-security/SKILL.md Mirrored goat-security skill for .agents runtime
.agents/skills/goat-security/references/secrets-and-data-exposure.md Mirrored goat-security reference pack
.agents/skills/goat-security/references/project-policy-template.md Mirrored goat-security reference pack
.agents/skills/goat-security/references/file-upload-and-paths.md Mirrored goat-security reference pack
.agents/skills/goat-security/references/dependency-and-supply-chain.md Mirrored goat-security reference pack
.agents/skills/goat-security/references/common-threats.md Mirrored goat-security reference pack
.agents/skills/goat-security/references/cicd-and-agent-surfaces.md Mirrored goat-security reference pack
.agents/skills/goat-security/references/auth-authz.md Mirrored goat-security reference pack
.agents/skills/goat-review/references/refuter-spec.md Mirrored goat-review reference pack
.agents/skills/goat-review/references/examples.md Mirrored goat-review reference pack
.agents/skills/goat-plan/references/milestone-examples.md Mirrored goat-plan reference pack
.agents/skills/goat-plan/references/issue-format.md Mirrored goat-plan reference pack
.agents/skills/goat-debug/SKILL.md Mirrored goat-debug skill for .agents runtime
.agents/skills/goat-critique/references/sub-agent-directives.md Mirrored goat-critique reference pack
.agents/skills/goat-critique/references/rubric-examples.md Mirrored goat-critique reference pack
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +38 to +41
};

return null;
}
Comment on lines +17 to +20
protected function onComplete(StreamEvent $event): void
{
echo PHP_EOL;
}
Comment on lines +22 to +25
protected function onError(StreamEvent $event): void
{
fwrite(STDERR, "Error [{$event->errorCode}]: {$event->errorMessage}" . PHP_EOL);
}
Comment on lines +94 to +95
public function afterResponse(string $url, int $statusCode, float $durationMs, ?\Throwable $error = null): void
{
Comment on lines +52 to +56
ob_start();
$handler($event);
$stdout = ob_get_clean();

// Error should NOT go to stdout
Comment thread CHANGELOG.md Outdated
Comment on lines +8 to +12
## [1.5.0] - Unreleased

### Added

- **OpenTelemetry tracing middleware** — `OtelTracingMiddleware` emits `KIND_CLIENT` spans and injects W3C `traceparent`/`tracestate` headers for distributed tracing. Zero runtime cost when not configured. OTEL packages in `require-dev` + `suggest` only.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b9540601de

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +27 to +31
$sourceData = is_array($data['source_content'] ?? null) ? $data['source_content'] : [];
/** @var array<string, mixed> $generatedData */
$generatedData = is_array($data['generated_content'] ?? null) ? $data['generated_content'] : [];

return new self(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve flat citation fields when hydrating Citation DTOs

Citation::fromArray() only maps location, source_content, and generated_content, but existing citation payloads in this codebase also use flat keys such as source, title, and text (for example the fixture used by AgentResponse citation parsing and the stream citation event shape). As a result, getCitationObjects()/getCitationObject() can return DTOs with all-null properties for valid citations, so consumers of the new typed API lose citation data unless they fall back to raw arrays.

Useful? React with 👍 / 👎.

Comment thread src/Streaming/StreamCallbackHandler.php Outdated
Comment on lines +36 to +40
StreamEventType::Complete => $this->onComplete($event),
StreamEventType::Error => $this->onError($event),
};

return null;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Propagate cancellation from typed stream event handlers

StreamCallbackHandler documents that returning false cancels streaming, but __invoke() always returns null after dispatch and the on* methods are void, so overriding typed hooks alone can never signal cancellation. Because StrandsClient::stream() only cancels when the callback result is exactly false, this abstraction cannot implement early-stop behavior without reimplementing __invoke().

Useful? React with 👍 / 👎.

@mattyhansen mattyhansen changed the title v1.5.0 v1.5.0 client features and GOAT Flow docs May 10, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 59fbe352ce

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +78 to +79
$scope = $span->activate();
$this->spanStack->push([$span, $scope]);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge End activated spans when setup throws before transport call

beforeRequest() activates and stores a span immediately, but there is no guard to detach/end it if request setup fails before afterResponse() runs. In StrandsClient, buildRequest() is executed before the try/catch that calls notifyAfterResponse() (e.g., src/StrandsClient.php:83), so an exception from later middleware/authentication can leave this scope active indefinitely. That leaks context into subsequent operations on the same process and produces incorrect parent-child traces; wrap activation/injection in a cleanup path (or defer activation) so failed setup does not leave orphaned spans.

Useful? React with 👍 / 👎.

@coderabbitai

coderabbitai Bot commented May 24, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds goat-flow v1.5: new/updated skills and references, repository safety hooks/settings, Wire Contract docs and Python gateway + fixtures, SDK features (OpenTelemetry middleware, ResponseObserver, streaming callbacks, citation/guardrail DTOs, exceptions), extensive tests, and supporting scripts/config.

Changes

Goat Flow v1.5 rollout

Layer / File(s) Summary
Updated goat- skill specifications and references*
.agents/skills/**, .claude/skills/**
Introduces or updates goat-* skill specifications including goat-critique, goat-debug, goat-plan, goat-qa, goat-review, goat-security, and goat dispatcher; adds sub-agent directives, rubric examples, refuter spec, and audit/excuse reference packs.
SDK core feature additions and client integration
src/Http/Middleware/OtelTracingMiddleware.php, src/Http/ResponseObserver.php, src/StrandsClient.php, src/Integration/**, src/Streaming/*, src/Response/Citation/*, src/Exceptions/*, src/Response/*
Adds full OpenTelemetry tracing middleware with span lifecycle, W3C header injection, and low-level span naming; new ResponseObserver interface for parsed-agent/stream events; updated StrandsClient to notify observers; extended stream parsing and results with context size; new stream callback classes; DTO hydration enhancements for citations and guardrails; agent exceptions expanded with new specific subclasses; new enum values and response/message metadata expansion; integration updates for Laravel and Symfony DI.
Repository hooks and runtime configuration
.claude/hooks/*, .codex/hooks/*, .claude/settings.json, .codex/config.toml
Adds pre-tool safety hooks deny-dangerous.sh blocking risky Bash commands, a filesystem write size guard, post-tool format hook, stop-linting hook, and configures hook lifecycles for Claude and Codex runtimes including safety lists and compact notifications.
Wire Contract documentation, Python gateway, fixtures, and scripts
docs/wire-contract*.md, examples/python-gateway/*, tests/Fixtures/wire-contract/*, scripts/*
Introduces Strands HTTP Wire Contract documentation and audit, project compatibility matrix, a FastAPI/Python reference gateway with tracing, contract mapping utilities, safety checks and tests, rich set of canonical invoke/stream/error/guardrail interrupt fixtures, and management shell scripts for version bumping and dependency install/update.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant OtelMiddleware
  participant Tracer
  participant Server
  Client->>OtelMiddleware: beforeRequest(url, headers, body)
  OtelMiddleware->>Tracer: startSpan + inject trace headers
  OtelMiddleware->>Server: send request with traceparent
  Server-->>OtelMiddleware: status/response
  OtelMiddleware->>Tracer: endSpan(status, error?)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

"I hopped through lines and docs with cheer,
Traces, hooks, and fixtures all appear.
DTOs tidy, streams that sing,
Safety fences strong as spring.
A rabbit’s nod—v1.5 is here!"

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c2efd0cbc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +206 to +210
'type' => 'image',
'source' => [
'type' => 'url',
'url' => $url,
'media_type' => $mediaType,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Add required format field to URL image blocks

withImageFromUrl() builds an image content block without a format key, even though the wire contract and fixtures in this change use format as a required field for image blocks. In environments where the wrapper validates content-block schema, requests created via this new helper will be rejected (or handled inconsistently) while withImageFromS3() works, so URL-based image input can fail at runtime.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (14)
.agents/skills/goat-qa/SKILL.md (1)

19-19: ⚡ Quick win

Mixed spelling variants: US and British English.

Line 19 uses "prioritized" (US spelling) while Line 128 uses "deprioritise" (UK spelling). Pick one variant and apply it consistently throughout the document.

Suggested fix for consistency

If standardizing on US English:

-Output: prioritized "must test / safe to skip / should test" guidance.
+Output: prioritized "must test / safe to skip / should test" guidance.
-**Misaligned effort:** deprioritise plan cases not mapped to current changes
+**Misaligned effort:** deprioritize plan cases not mapped to current changes

Or if standardizing on British English:

-Output: prioritized "must test / safe to skip / should test" guidance.
+Output: prioritised "must test / safe to skip / should test" guidance.
-**Misaligned effort:** deprioritise plan cases not mapped to current changes
+**Misaligned effort:** deprioritise plan cases not mapped to current changes

Also applies to: 128-128

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/goat-qa/SKILL.md at line 19, The document contains mixed US
and British spellings (e.g., "prioritized" on Line 19 vs "deprioritise" on Line
128); pick one English variant (US or British) and make the spelling consistent
across SKILL.md by updating all occurrences (search for "prioritized",
"prioritised", "deprioritise", "deprioritize", etc.) to the chosen variant;
ensure headings and inline guidance (including the phrase shown in the diff
"Output: prioritized \"must test / safe to skip / should test\" guidance") are
updated to match the chosen spelling.
.agents/skills/goat-review/references/refuter-spec.md (1)

10-26: 💤 Low value

Add language identifier to fenced code block.

The refuter prompt template code block (line 10) is missing a language identifier. While this is a template/example rather than executable code, adding a language hint improves readability and satisfies markdown linting rules.

Suggest adding markdown or text identifier:

📝 Proposed fix
-```
+```text
 You are a code review refuter. Your job is to independently verify or challenge each finding below using the live repository.

As per coding guidelines: markdownlint rule MD040.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/goat-review/references/refuter-spec.md around lines 10 - 26,
The fenced code block in refuter-spec.md (the template starting "You are a code
review refuter...") lacks a language identifier which triggers markdownlint
MD040; update that fenced block to include a language tag (e.g., "text" or
"markdown") by changing the opening triple backticks to ```text (or ```markdown)
so the block is properly annotated and linter-friendly, ensuring the template
string inside the file remains unchanged except for the added language
identifier.
.agents/skills/goat-review/SKILL.md (2)

166-166: ⚡ Quick win

Strengthen cross-model refuter failure handling.

The constraint mentions logging cross-model-refuter-failed if the refuter call errors, but doesn't specify what happens to findings that were meant to be verified. Should they remain at their original severity, be downgraded, or be flagged differently?

Recommend adding: "When refuter fails, retain original severity and add refuter-unavailable flag to affected findings."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/goat-review/SKILL.md at line 166, Update the constraint text
around "cross-model-refuter-failed" to explicitly state post-failure handling:
when the cross-model refuter call errors (emit "cross-model-refuter-failed"),
retain the original finding severity and attach a "refuter-unavailable" flag to
each affected finding rather than downgrading or changing severity, and clarify
that REFUTER-REFUTED remains advisory; ensure the new clause mentions both the
emitted event name "cross-model-refuter-failed" and the introduced flag
"refuter-unavailable".

91-91: ⚡ Quick win

Clarify consumer verification expectation.

The Blast Radius Rule requires running rg to locate external call-sites, but doesn't specify what to do when zero consumers are found. Should this be treated as suspicious (dead code), acceptable (new API), or require human confirmation?

Consider adding guidance: "If zero consumers found, mark as needs-verification and note in Review Integrity."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/goat-review/SKILL.md at line 91, Update the Blast Radius Rule
wording to explicitly handle the zero-consumer case: when running rg -n
'<symbol>' -t ts -t js -t py -t php -t go -t rust returns no external
call-sites, mark the suspicion state as needs-verification (do not
auto-resolve), add a short note in Review Integrity documenting the rg command
output and that no consumers were found, and keep the item UNRESOLVED with
coverage-degraded until a human verifies whether this is dead code or an
intended new API; reference the existing symbols Blast Radius Rule, rg -n
'<symbol>'..., needs-verification, Review Integrity, UNRESOLVED, and
coverage-degraded in the updated text.
.claude/settings.json (1)

31-39: ⚡ Quick win

Apply write-size guard to Edit operations too.

guard-write-size.sh currently runs only for Write, so large mutations via Edit can bypass this control.

🛠️ Suggested update
-        "matcher": "Write",
+        "matcher": "Edit|Write",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/settings.json around lines 31 - 39, The matcher currently lists only
"Write", so the guard-write-size.sh hook doesn't run for Edit operations; update
the JSON matcher array to include "Edit" alongside "Write" so the existing
command hook (bash "$(git rev-parse
--show-toplevel)/.claude/hooks/guard-write-size.sh") executes for both operation
types, ensuring the write-size guard applies to Edit as well as Write.
.claude/skills/goat-qa/SKILL.md (1)

188-191: ⚡ Quick win

Missing output format for Regression Guard Mode.

Regression Guard Mode is mentioned in lines 188-191 and referenced in the Constraints (line 208), but the Output Format section (lines 211-295) only provides templates for Standard mode and Audit mode. Consider adding an output format template for Regression Guard Mode to match the completeness of the other modes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/skills/goat-qa/SKILL.md around lines 188 - 191, The SKILL.md is
missing an Output Format template for "Regression Guard Mode": add a new
template in the Output Format section mirroring the style of the existing
"Standard mode" and "Audit mode" templates that includes fields for
"Verification Source" (e.g., /goat-debug, human sign-off, PR check),
"Invariants" (1–2 with concise descriptions), "Coverage Assessment" for each
invariant, and "Recommended Guard Tests" (clear test steps and pass/fail
criteria) and ensure the Constraints section's reference to Regression Guard
Mode still points to this new template; update the "Regression Guard Mode"
heading and any cross-references in SKILL.md so the mode is fully documented.
.claude/skills/goat-review/references/refuter-spec.md (1)

10-26: 💤 Low value

Code block should specify language identifier.

The refuter prompt template (lines 10-26) uses a fenced code block without a language identifier. While this is intentional (it's a template, not executable code), adding a language hint like text or markdown would satisfy linters and improve clarity.

📝 Suggested fix
-```
+```text
 You are a code review refuter. Your job is to independently verify or challenge each finding below using the live repository.

Based on learnings: Static analysis hints should be selectively incorporated; markdownlint flagged this as missing language specification.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/skills/goat-review/references/refuter-spec.md around lines 10 - 26,
The fenced code block in the refuter prompt template (the block starting with
"You are a code review refuter..." and containing "FINDINGS TO VERIFY:
<findings_list>") lacks a language identifier; update that fenced block in
.claude/skills/goat-review/references/refuter-spec.md to use a language hint
such as ```text or ```markdown so linters accept it and clarity improves,
leaving the block contents unchanged; ensure only the opening fence is modified
(e.g., change ``` to ```text) and run your markdown linter to verify the warning
is resolved.
.goat-flow/code-map.md (2)

3-116: 💤 Low value

Add language specifier to fenced code block.

The code block should specify text or tree as the language identifier.

📝 Proposed fix
-```
+```text
 strands-php-client/
 ├── src/
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.goat-flow/code-map.md around lines 3 - 116, The fenced code block in
.goat-flow/code-map.md is missing a language specifier; update the opening fence
for the tree listing (the block starting with ``` on the file) to include a
language token such as text or tree (e.g., change ``` to ```text) so the block
is properly annotated in Markdown and rendered as a plain text/tree listing.

22-27: ⚡ Quick win

Incomplete Response DTO listing in code map.

The code map shows only AgentResponse.php, GuardrailTrace.php, InterruptDetail.php, StopReason.php, and Usage.php under src/Response/, but the changeset includes additional files:

  • Citation.php, CitationGeneratedContent.php, CitationLocation.php, CitationSourceContent.php (in src/Response/Citation/ subdirectory)
  • GuardrailAssessment.php
🗺️ Proposed fix to complete the Response/ section
 │   ├── Response/
 │   │   ├── AgentResponse.php               = Invoke response DTO with fromArray() factory
+│   │   ├── Citation/
+│   │   │   ├── Citation.php                = Citation DTO
+│   │   │   ├── CitationGeneratedContent.php = Generated content in citation
+│   │   │   ├── CitationLocation.php        = Citation source location
+│   │   │   └── CitationSourceContent.php   = Source content in citation
+│   │   ├── GuardrailAssessment.php         = Guardrail assessment DTO
 │   │   ├── GuardrailTrace.php              = Guardrail intervention data
 │   │   ├── InterruptDetail.php             = Human-in-the-loop interrupt data
 │   │   ├── StopReason.php                  = Backed enum (EndTurn, ToolUse, MaxTokens, etc.)
 │   │   └── Usage.php                       = Token usage stats with fromArray() factory
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.goat-flow/code-map.md around lines 22 - 27, The Response/ section in
.goat-flow/code-map.md is missing new DTOs—add entries for
src/Response/Citation/Citation.php, CitationGeneratedContent.php,
CitationLocation.php, CitationSourceContent.php (group under a Citation/
subdirectory with brief descriptions like "citation DTOs / location and source
content"), and add GuardrailAssessment.php alongside the other response DTOs
(describe as guardrail assessment results); keep formatting consistent with
existing lines and include any factory notes (e.g., fromArray) where applicable
(e.g., for DTOs that implement fromArray()).
.goat-flow/architecture.md (1)

11-23: 💤 Low value

Add language specifier to fenced code block.

The code block showing request flow should specify a language or use text to satisfy markdown linting.

📝 Proposed fix
-```
+```text
 User Code
   └─ StrandsClient.invoke(message, context, sessionId)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.goat-flow/architecture.md around lines 11 - 23, The fenced code block lacks
a language specifier causing markdown lint errors; update the opening fence for
the request flow block to include a language (e.g., "text" or "diff") so the
block renders and lints correctly — locate the block that documents
StrandsClient.invoke (and its steps buildJsonRequest,
RequestMiddleware.beforeRequest, AuthStrategy.authenticate, postWithRetry,
HttpTransport.post, SymfonyHttpTransport, PsrHttpTransport,
AgentResponse::fromArray, RequestMiddleware.afterResponse) and change the
opening ``` to ```text (or another appropriate language).
.goat-flow/decisions/ADR-001-strands-http-wire-contract.md (1)

24-27: 💤 Low value

Consider redacting environment-specific paths.

Lines 24-27 reference absolute paths in a local development environment (/home/devgoat/projects/...). These paths are environment-specific and may expose internal project structure.

Consider either:

  1. Generalizing the references (e.g., "consumer project wrapper contracts")
  2. Removing the absolute paths and keeping only the search terms
  3. Documenting these as external references in a separate evidence file
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.goat-flow/decisions/ADR-001-strands-http-wire-contract.md around lines 24 -
27, The document exposes environment-specific absolute paths; remove or redact
them and replace with generalized references: change occurrences of
`/home/devgoat/projects/the-summit-chatroom/strands_agents/api/server.py`
(search: `InvokeRequest`),
`/home/devgoat/projects/healthkit/strands_agents/api/helpers.py` (search:
`_extract_usage`),
`/home/devgoat/projects/halaxy-agents-lab/strands_agents/api/schemas/shared.py`
(search: `AgentUsage`), and
`/home/devgoat/projects/ambient-scribe/docs/workflow.md` (search: `contracts are
tightly coupled`) to either project-agnostic labels (e.g., "consumer project:
strands_agents/api/server.py - search: InvokeRequest"), keep only the search
terms, or move them into a separate evidence/reference file; ensure the
references still include the unique identifiers (`InvokeRequest`,
`_extract_usage`, `AgentUsage`, `contracts are tightly coupled`) so readers can
locate the code without revealing absolute paths.
src/Context/AgentInput.php (1)

151-151: ⚡ Quick win

Tighten $source PHPDoc to an explicit array shape.

Line 151 uses array<string, mixed>, but this structure is well-known and can be fully typed for stronger static analysis.

🧹 Proposed PHPDoc shape
-        /** `@var` array<string, mixed> $source */
+        /** `@var` array{type: string, uri: string, bucket_owner?: string} $source */
As per coding guidelines: `**/*.php`: Use fully qualified PHPDoc array shapes such as `array`; avoid bare `array` and `mixed` unless justified.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Context/AgentInput.php` at line 151, Replace the loose PHPDoc /** `@var`
array<string, mixed> $source */ with a concrete array shape that matches the
actual keys and types used by AgentInput (e.g., specify keys like 'type',
'content', 'metadata' and their types) so static analyzers can validate it;
locate the $source variable in AgentInput.php, inspect where $source is
read/written (constructor/methods in the AgentInput class or any methods
referencing $source) and declare a PHPDoc like array{key1:type1, key2:type2,
...} listing exact keys and types (use nested array shapes for structured
values) replacing the current generic annotation.
src/Exceptions/AgentErrorException.php (1)

80-85: ⚡ Quick win

Add regression tests for single-keyword error codes in exception classification.

Mutation output indicates the context+overflow guard can be weakened without test failure. Add cases where only one token is present to ensure Line 80 stays strict.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Exceptions/AgentErrorException.php` around lines 80 - 85, Add regression
tests covering single-keyword cases for the exception classification logic in
AgentErrorException (the method containing the shown guards) to ensure the
ContextOverflowException branch only triggers when both 'context' and 'overflow'
appear together and not for messages containing only one of those tokens;
specifically add assertions that a message with only "context" and a message
with only "overflow" do NOT map to ContextOverflowException while a message with
both does. Also add a test that a message containing "max_tokens" still maps to
MaxTokensException (and that similar partial matches don't misclassify). Use the
exception class names ContextOverflowException::class and
MaxTokensException::class in your assertions so the tests will fail if the guard
logic at AgentErrorException is weakened.
tests/Fixtures/wire-contract/error-response.json (1)

1-8: Reminder: Run validation commands per coding guidelines.

As per the coding guidelines, please run the following commands to validate the PHP changes:

# Validate PSR-12 code style
composer cs:check

# Run PHPStan static analysis (Level 10)
composer analyse

# Check cyclomatic complexity (max 20)
composer analyse:complexity

# Validate PHPMD rules
composer analyse:messdetector

# Run PHPUnit test suite (477 tests)
composer test
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/Fixtures/wire-contract/error-response.json` around lines 1 - 8, Run the
full PHP validation suite before updating this fixture: execute composer
cs:check, composer analyse, composer analyse:complexity, composer
analyse:messdetector and composer test locally (or in CI) and fix any reported
PSR-12, PHPStan, cyclomatic complexity, PHPMD or PHPUnit failures; ensure the
branch passes all these commands and include the successful run output in the PR
checks for composer cs:check, composer analyse, composer analyse:complexity,
composer analyse:messdetector and composer test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.agents/skills/goat/SKILL.md:
- Around line 32-36: The fenced "Route Snapshot" block that begins with the
lines "Intent: [classified intent]" and "Route: [/goat-* or direct]" is missing
a language tag (causing MD040); update the opening fence from ``` to ```text to
add a neutral language identifier so the markdown linter stops flagging it.

In @.claude/hooks/deny-dangerous.sh:
- Around line 1458-1462: The script currently fails to split on a single
backgrounding ampersand so a segment like "echo ok & rm -rf /" is treated as one
segment and the whitelist in check_segment() (matching cmd_verb) returns early;
to fix, update split_command_segments() to also treat a single '&' as a segment
delimiter (but not '&&')—e.g. split on '&' only when not preceded or followed by
another '&' or when not inside quotes—so "echo ok & rm -rf /" becomes two
segments; after that, check_segment() will validate the second segment (e.g.
detect rm_has_recursive). Ensure the split logic preserves existing separators
(&&, ||, ;, newline) and adjust any unit tests for split_command_segments() and
check_segment() accordingly.

In @.claude/hooks/guard-write-size.sh:
- Around line 26-30: The guard incorrectly treats an empty new content as a
parse failure, mixes bytes vs characters when computing OLD_SIZE/NEW_SIZE, and
can fail when jq is unavailable; update the hook to (1) read raw JSON input into
a variable (e.g. INPUT_RAW=$(cat)) and parse with jq but tolerate jq failures
(use || true or a presence check) so missing jq causes a fail-open exit 0, (2)
distinguish parse-failure from an explicitly empty "content" by testing whether
the jq invocation succeeded versus the content being an empty string (do not use
[[ -z "$FILE_PATH" || -z "$NEW_CONTENT" ]] to detect parse errors), and (3)
compute sizes in bytes consistently by using wc -c on the old file and on
NEW_CONTENT via printf '%s' "$NEW_CONTENT" | wc -c (or write NEW_CONTENT to a
temp file and wc -c it), then calculate the reduction percentage from those byte
counts and enforce the >80% rule. Ensure references to FILE_PATH, NEW_CONTENT,
OLD_SIZE, NEW_SIZE, jq, and the existing exit 0 fail-open behavior are preserved
and adjusted accordingly.

In @.claude/settings.json:
- Around line 13-17: The deny rules currently only block reads for secrets (the
"Read(**/*.pem)", "Read(**/*.key)", "Read(**/.ssh/**)", "Read(**/.aws/**)",
"Read(**/credentials*)") but do not prevent writes/edits; update the deny
permissions to include matching write/modify/delete patterns (e.g., add
"Write(...)" or "Modify(...)" and "Delete(...)" entries with the same glob
patterns) so secret file paths are blocked for both reads and any write
operations; ensure the new entries use the exact same glob patterns as the
existing Read rules so they cover the same files.

In @.claude/skills/goat/SKILL.md:
- Around line 32-36: The unlabeled fenced code block containing the Route
Snapshot (the three-line block starting with "Intent: [classified intent]" and
including "Route: [/goat-* or direct]" and "Rationale: [...]") should have a
language tag to satisfy markdownlint MD040; update that fence to include a
language identifier such as "text" (e.g., replace the opening ``` with ```text)
so the block is labeled while leaving the block contents unchanged.

In @.codex/hooks/deny-dangerous.sh:
- Around line 1527-1528: The current regex that checks the variable cmd uses
[[:space:]]+ after the redirect operators, so no-space forms like echo "{}">file
bypass it; update the pattern in .codex/hooks/deny-dangerous.sh (the if that
tests "$cmd") to allow zero or more spaces after the redirect tokens (e.g.
change [[:space:]]+ to [[:space:]]* or use \s*), for example match ((>|>>) | tee
|
sed[[:space:]]+-i)[[:space:]]*.*(package-lock\.json|pnpm-lock\.yaml|composer\.lock|Cargo\.lock|yarn\.lock)
so that both spaced and no-space redirections are blocked and still trigger the
block "Direct lockfile modification..." call.

In @.goat-flow/patterns/testing.md:
- Line 10: Update the documentation in .goat-flow/patterns/testing.md to match
the actual fixture filenames used in tests by changing the SSE fixture extension
from .txt to .sse (i.e., refer to SSE fixtures as sse-*.sse or explicitly
mention files like stream-error.sse and stream-guardrail.sse in
tests/Fixtures/wire-contract/); ensure the doc text that describes loading
fixtures via file_get_contents(__DIR__ . '/../Fixtures/<name>') and examples
(invoke-*.json, sse-*.sse) reflect the correct extension so tests that reference
these fixtures (e.g., any test that loads stream-error.sse or
stream-guardrail.sse) will match the documentation.

In `@CLAUDE.md`:
- Line 58: Remove the hardcoded PHPUnit test count from the command example in
CLAUDE.md: replace the line containing "composer test                    #
PHPUnit (477 tests)" with a generic comment like "composer test                 
# PHPUnit" (or no inline count) so the docs don't drift as test numbers change;
update the same literal line in CLAUDE.md accordingly.

In `@docs/wire-contract-audit.md`:
- Around line 11-14: The table contains workstation-local absolute paths (e.g.
/home/devgoat/...) that must be replaced with portable repo-relative paths or
commit permalinks; update each row referencing the-summit-chatroom,
ambient-scribe, halaxy-agents-lab, and healthkit to use repo-relative links (or
GitHub/GitLab permalinks) pointing to the files that contain the referenced
symbols (search for InvokeRequest, stream_response, event_data; postJson;
AgentUsage, _extract_usage; and streamSse) so other contributors can locate the
referenced code without local paths.

In `@src/Context/AgentInput.php`:
- Around line 152-154: The emitted source objects in AgentInput.php use the
wrong type string; change the 'type' field from 's3_location' to the wrapper
wire-contract value 's3' wherever $source is constructed for S3 image blocks
(both the block starting at the $source = [...] near the first occurrence and
the similar block around lines 161-165). Update the $source arrays in the
relevant methods in the AgentInput class so they emit 'type' => 's3' to conform
to the Strands HTTP Wire Contract v1.

In `@src/Response/AgentResponse.php`:
- Around line 77-113: Replace the generic \RuntimeException usage in
AgentResponse::structuredOutputAs with the project custom exception(s): throw
AgentErrorException (or StrandsException if more appropriate to your error
taxonomy) when no structured output is present and when hydration fails,
including the original exception as previous and, if the response has an agent
error body, pass it via a responseBody constructor/field; update the method
PHPDoc to document the thrown custom exception(s) instead of \RuntimeException,
and ensure you import the exception class names and preserve the existing
previous exception chaining (use the caught Throwable $e as previous).

In `@tests/Unit/Streaming/StreamEventCitationTest.php`:
- Around line 59-77: The test testGetCitationObjectPreservesFlatCitationData
includes 'page' => 3 but never asserts it; update the test to assert the page
value is preserved by calling $citation = $event->getCitationObject() and adding
an assertion that $citation->page === 3 (or $this->assertSame(3,
$citation->page)); locate this in the StreamEvent test method and use the
Citation class/property name 'page' to reference the preserved field when adding
the assertion.

---

Nitpick comments:
In @.agents/skills/goat-qa/SKILL.md:
- Line 19: The document contains mixed US and British spellings (e.g.,
"prioritized" on Line 19 vs "deprioritise" on Line 128); pick one English
variant (US or British) and make the spelling consistent across SKILL.md by
updating all occurrences (search for "prioritized", "prioritised",
"deprioritise", "deprioritize", etc.) to the chosen variant; ensure headings and
inline guidance (including the phrase shown in the diff "Output: prioritized
\"must test / safe to skip / should test\" guidance") are updated to match the
chosen spelling.

In @.agents/skills/goat-review/references/refuter-spec.md:
- Around line 10-26: The fenced code block in refuter-spec.md (the template
starting "You are a code review refuter...") lacks a language identifier which
triggers markdownlint MD040; update that fenced block to include a language tag
(e.g., "text" or "markdown") by changing the opening triple backticks to ```text
(or ```markdown) so the block is properly annotated and linter-friendly,
ensuring the template string inside the file remains unchanged except for the
added language identifier.

In @.agents/skills/goat-review/SKILL.md:
- Line 166: Update the constraint text around "cross-model-refuter-failed" to
explicitly state post-failure handling: when the cross-model refuter call errors
(emit "cross-model-refuter-failed"), retain the original finding severity and
attach a "refuter-unavailable" flag to each affected finding rather than
downgrading or changing severity, and clarify that REFUTER-REFUTED remains
advisory; ensure the new clause mentions both the emitted event name
"cross-model-refuter-failed" and the introduced flag "refuter-unavailable".
- Line 91: Update the Blast Radius Rule wording to explicitly handle the
zero-consumer case: when running rg -n '<symbol>' -t ts -t js -t py -t php -t go
-t rust returns no external call-sites, mark the suspicion state as
needs-verification (do not auto-resolve), add a short note in Review Integrity
documenting the rg command output and that no consumers were found, and keep the
item UNRESOLVED with coverage-degraded until a human verifies whether this is
dead code or an intended new API; reference the existing symbols Blast Radius
Rule, rg -n '<symbol>'..., needs-verification, Review Integrity, UNRESOLVED, and
coverage-degraded in the updated text.

In @.claude/settings.json:
- Around line 31-39: The matcher currently lists only "Write", so the
guard-write-size.sh hook doesn't run for Edit operations; update the JSON
matcher array to include "Edit" alongside "Write" so the existing command hook
(bash "$(git rev-parse --show-toplevel)/.claude/hooks/guard-write-size.sh")
executes for both operation types, ensuring the write-size guard applies to Edit
as well as Write.

In @.claude/skills/goat-qa/SKILL.md:
- Around line 188-191: The SKILL.md is missing an Output Format template for
"Regression Guard Mode": add a new template in the Output Format section
mirroring the style of the existing "Standard mode" and "Audit mode" templates
that includes fields for "Verification Source" (e.g., /goat-debug, human
sign-off, PR check), "Invariants" (1–2 with concise descriptions), "Coverage
Assessment" for each invariant, and "Recommended Guard Tests" (clear test steps
and pass/fail criteria) and ensure the Constraints section's reference to
Regression Guard Mode still points to this new template; update the "Regression
Guard Mode" heading and any cross-references in SKILL.md so the mode is fully
documented.

In @.claude/skills/goat-review/references/refuter-spec.md:
- Around line 10-26: The fenced code block in the refuter prompt template (the
block starting with "You are a code review refuter..." and containing "FINDINGS
TO VERIFY: <findings_list>") lacks a language identifier; update that fenced
block in .claude/skills/goat-review/references/refuter-spec.md to use a language
hint such as ```text or ```markdown so linters accept it and clarity improves,
leaving the block contents unchanged; ensure only the opening fence is modified
(e.g., change ``` to ```text) and run your markdown linter to verify the warning
is resolved.

In @.goat-flow/architecture.md:
- Around line 11-23: The fenced code block lacks a language specifier causing
markdown lint errors; update the opening fence for the request flow block to
include a language (e.g., "text" or "diff") so the block renders and lints
correctly — locate the block that documents StrandsClient.invoke (and its steps
buildJsonRequest, RequestMiddleware.beforeRequest, AuthStrategy.authenticate,
postWithRetry, HttpTransport.post, SymfonyHttpTransport, PsrHttpTransport,
AgentResponse::fromArray, RequestMiddleware.afterResponse) and change the
opening ``` to ```text (or another appropriate language).

In @.goat-flow/code-map.md:
- Around line 3-116: The fenced code block in .goat-flow/code-map.md is missing
a language specifier; update the opening fence for the tree listing (the block
starting with ``` on the file) to include a language token such as text or tree
(e.g., change ``` to ```text) so the block is properly annotated in Markdown and
rendered as a plain text/tree listing.
- Around line 22-27: The Response/ section in .goat-flow/code-map.md is missing
new DTOs—add entries for src/Response/Citation/Citation.php,
CitationGeneratedContent.php, CitationLocation.php, CitationSourceContent.php
(group under a Citation/ subdirectory with brief descriptions like "citation
DTOs / location and source content"), and add GuardrailAssessment.php alongside
the other response DTOs (describe as guardrail assessment results); keep
formatting consistent with existing lines and include any factory notes (e.g.,
fromArray) where applicable (e.g., for DTOs that implement fromArray()).

In @.goat-flow/decisions/ADR-001-strands-http-wire-contract.md:
- Around line 24-27: The document exposes environment-specific absolute paths;
remove or redact them and replace with generalized references: change
occurrences of
`/home/devgoat/projects/the-summit-chatroom/strands_agents/api/server.py`
(search: `InvokeRequest`),
`/home/devgoat/projects/healthkit/strands_agents/api/helpers.py` (search:
`_extract_usage`),
`/home/devgoat/projects/halaxy-agents-lab/strands_agents/api/schemas/shared.py`
(search: `AgentUsage`), and
`/home/devgoat/projects/ambient-scribe/docs/workflow.md` (search: `contracts are
tightly coupled`) to either project-agnostic labels (e.g., "consumer project:
strands_agents/api/server.py - search: InvokeRequest"), keep only the search
terms, or move them into a separate evidence/reference file; ensure the
references still include the unique identifiers (`InvokeRequest`,
`_extract_usage`, `AgentUsage`, `contracts are tightly coupled`) so readers can
locate the code without revealing absolute paths.

In `@src/Context/AgentInput.php`:
- Line 151: Replace the loose PHPDoc /** `@var` array<string, mixed> $source */
with a concrete array shape that matches the actual keys and types used by
AgentInput (e.g., specify keys like 'type', 'content', 'metadata' and their
types) so static analyzers can validate it; locate the $source variable in
AgentInput.php, inspect where $source is read/written (constructor/methods in
the AgentInput class or any methods referencing $source) and declare a PHPDoc
like array{key1:type1, key2:type2, ...} listing exact keys and types (use nested
array shapes for structured values) replacing the current generic annotation.

In `@src/Exceptions/AgentErrorException.php`:
- Around line 80-85: Add regression tests covering single-keyword cases for the
exception classification logic in AgentErrorException (the method containing the
shown guards) to ensure the ContextOverflowException branch only triggers when
both 'context' and 'overflow' appear together and not for messages containing
only one of those tokens; specifically add assertions that a message with only
"context" and a message with only "overflow" do NOT map to
ContextOverflowException while a message with both does. Also add a test that a
message containing "max_tokens" still maps to MaxTokensException (and that
similar partial matches don't misclassify). Use the exception class names
ContextOverflowException::class and MaxTokensException::class in your assertions
so the tests will fail if the guard logic at AgentErrorException is weakened.

In `@tests/Fixtures/wire-contract/error-response.json`:
- Around line 1-8: Run the full PHP validation suite before updating this
fixture: execute composer cs:check, composer analyse, composer
analyse:complexity, composer analyse:messdetector and composer test locally (or
in CI) and fix any reported PSR-12, PHPStan, cyclomatic complexity, PHPMD or
PHPUnit failures; ensure the branch passes all these commands and include the
successful run output in the PR checks for composer cs:check, composer analyse,
composer analyse:complexity, composer analyse:messdetector and composer test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f4a67f1d-c99c-4353-80eb-b94e70cdca1e

📥 Commits

Reviewing files that changed from the base of the PR and between 00e8b8d and 3c2efd0.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (138)
  • .agents/skills/goat-critique/SKILL.md
  • .agents/skills/goat-critique/references/rubric-examples.md
  • .agents/skills/goat-critique/references/sub-agent-directives.md
  • .agents/skills/goat-debug/SKILL.md
  • .agents/skills/goat-plan/SKILL.md
  • .agents/skills/goat-plan/references/issue-format.md
  • .agents/skills/goat-plan/references/milestone-examples.md
  • .agents/skills/goat-qa/SKILL.md
  • .agents/skills/goat-review/SKILL.md
  • .agents/skills/goat-review/references/examples.md
  • .agents/skills/goat-review/references/refuter-spec.md
  • .agents/skills/goat-security/SKILL.md
  • .agents/skills/goat-security/references/auth-authz.md
  • .agents/skills/goat-security/references/cicd-and-agent-surfaces.md
  • .agents/skills/goat-security/references/common-threats.md
  • .agents/skills/goat-security/references/dependency-and-supply-chain.md
  • .agents/skills/goat-security/references/file-upload-and-paths.md
  • .agents/skills/goat-security/references/project-policy-template.md
  • .agents/skills/goat-security/references/secrets-and-data-exposure.md
  • .agents/skills/goat/SKILL.md
  • .claude/hooks/deny-dangerous.sh
  • .claude/hooks/format-file.sh
  • .claude/hooks/guard-write-size.sh
  • .claude/hooks/stop-lint.sh
  • .claude/settings.json
  • .claude/skills/goat-critique/SKILL.md
  • .claude/skills/goat-critique/references/rubric-examples.md
  • .claude/skills/goat-critique/references/sub-agent-directives.md
  • .claude/skills/goat-debug/SKILL.md
  • .claude/skills/goat-plan/SKILL.md
  • .claude/skills/goat-plan/references/issue-format.md
  • .claude/skills/goat-plan/references/milestone-examples.md
  • .claude/skills/goat-qa/SKILL.md
  • .claude/skills/goat-review/SKILL.md
  • .claude/skills/goat-review/references/examples.md
  • .claude/skills/goat-review/references/refuter-spec.md
  • .claude/skills/goat-security/SKILL.md
  • .claude/skills/goat-security/references/auth-authz.md
  • .claude/skills/goat-security/references/cicd-and-agent-surfaces.md
  • .claude/skills/goat-security/references/common-threats.md
  • .claude/skills/goat-security/references/dependency-and-supply-chain.md
  • .claude/skills/goat-security/references/file-upload-and-paths.md
  • .claude/skills/goat-security/references/project-policy-template.md
  • .claude/skills/goat-security/references/secrets-and-data-exposure.md
  • .claude/skills/goat/SKILL.md
  • .codex/config.toml
  • .codex/hooks.json
  • .codex/hooks/deny-dangerous.sh
  • .github/git-commit-instructions.md
  • .gitignore
  • .goat-flow/.gitignore
  • .goat-flow/architecture.md
  • .goat-flow/code-map.md
  • .goat-flow/config.yaml
  • .goat-flow/decisions/ADR-001-strands-http-wire-contract.md
  • .goat-flow/decisions/README.md
  • .goat-flow/footguns/README.md
  • .goat-flow/footguns/integration.md
  • .goat-flow/footguns/transport.md
  • .goat-flow/glossary.md
  • .goat-flow/lessons/README.md
  • .goat-flow/lessons/history.md
  • .goat-flow/logs/critiques/README.md
  • .goat-flow/logs/quality/README.md
  • .goat-flow/logs/security/README.md
  • .goat-flow/logs/sessions/.gitkeep
  • .goat-flow/patterns/README.md
  • .goat-flow/patterns/testing.md
  • .goat-flow/scratchpad/.gitignore
  • .goat-flow/scratchpad/README.md
  • .goat-flow/security-policy.md
  • .goat-flow/skill-reference/README.md
  • .goat-flow/skill-reference/browser-use.md
  • .goat-flow/skill-reference/page-capture.md
  • .goat-flow/skill-reference/skill-conventions.md
  • .goat-flow/skill-reference/skill-preamble.md
  • .goat-flow/skill-reference/skill-quality-testing.md
  • .goat-flow/skill-reference/skill-quality-testing/adversarial-framing.md
  • .goat-flow/skill-reference/skill-quality-testing/deployment.md
  • .goat-flow/skill-reference/skill-quality-testing/tdd-iteration.md
  • .goat-flow/tasks/.gitignore
  • .goat-flow/tasks/README.md
  • AGENTS.md
  • CHANGELOG.md
  • CLAUDE.md
  • README.md
  • composer.json
  • docs/rich-input.md
  • docs/usage-guide.md
  • docs/wire-contract-audit.md
  • docs/wire-contract.md
  • package.json
  • src/Context/AgentInput.php
  • src/Exceptions/AgentErrorException.php
  • src/Exceptions/ContextOverflowException.php
  • src/Exceptions/MaxTokensException.php
  • src/Exceptions/ThrottledException.php
  • src/Http/Middleware/OtelTracingMiddleware.php
  • src/Response/AgentResponse.php
  • src/Response/Citation/Citation.php
  • src/Response/Citation/CitationGeneratedContent.php
  • src/Response/Citation/CitationLocation.php
  • src/Response/Citation/CitationSourceContent.php
  • src/Response/GuardrailAssessment.php
  • src/Response/GuardrailTrace.php
  • src/Response/StopReason.php
  • src/Streaming/PrintingCallbackHandler.php
  • src/Streaming/StreamCallbackHandler.php
  • src/Streaming/StreamEvent.php
  • tests/Fixtures/invoke-with-citations.json
  • tests/Fixtures/invoke-with-guardrail.json
  • tests/Fixtures/wire-contract/README.md
  • tests/Fixtures/wire-contract/error-response.json
  • tests/Fixtures/wire-contract/invoke-request-rich-input.json
  • tests/Fixtures/wire-contract/invoke-request-text.json
  • tests/Fixtures/wire-contract/invoke-response-citations.json
  • tests/Fixtures/wire-contract/invoke-response-guardrail.json
  • tests/Fixtures/wire-contract/invoke-response-interrupt.json
  • tests/Fixtures/wire-contract/invoke-response-metadata.json
  • tests/Fixtures/wire-contract/invoke-response-success.json
  • tests/Fixtures/wire-contract/stream-error.sse
  • tests/Fixtures/wire-contract/stream-guardrail.sse
  • tests/Fixtures/wire-contract/stream-interrupt.sse
  • tests/Fixtures/wire-contract/stream-success.sse
  • tests/Http/Middleware/OtelTracingMiddlewareTest.php
  • tests/Unit/AgentInputTest.php
  • tests/Unit/AgentResponseTest.php
  • tests/Unit/Exceptions/AgentErrorExceptionTest.php
  • tests/Unit/Exceptions/ContextOverflowExceptionTest.php
  • tests/Unit/Exceptions/MaxTokensExceptionTest.php
  • tests/Unit/Exceptions/ThrottledExceptionTest.php
  • tests/Unit/GuardrailTraceTest.php
  • tests/Unit/Response/Citation/CitationLocationTest.php
  • tests/Unit/Response/Citation/CitationTest.php
  • tests/Unit/Response/GuardrailAssessmentTest.php
  • tests/Unit/Streaming/PrintingCallbackHandlerTest.php
  • tests/Unit/Streaming/StreamCallbackHandlerTest.php
  • tests/Unit/Streaming/StreamEventCitationTest.php

Comment thread .agents/skills/goat/SKILL.md
Comment thread .claude/hooks/deny-dangerous.sh
Comment thread .claude/hooks/guard-write-size.sh Outdated
Comment thread .claude/settings.json Outdated
Comment thread .claude/skills/goat/SKILL.md
Comment thread CLAUDE.md Outdated
Comment thread docs/wire-contract-audit.md Outdated
Comment on lines +152 to +154
$source = [
'type' => 's3_location',
'uri' => $s3Uri,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use wrapper wire-contract s3 source type for S3 image blocks.

Line 153 currently emits s3_location, which diverges from the v1 wrapper contract shape documented in this PR (type: s3). That can break wrappers expecting the normalized contract.

🔧 Proposed fix
         $source = [
-            'type' => 's3_location',
+            'type' => 's3',
             'uri' => $s3Uri,
         ];
As per coding guidelines: `src/**/*.php`: Ensure client targets the Strands HTTP Wire Contract v1 emitted by wrapper services, not raw sdk-python TypedDict shapes.

Also applies to: 161-165

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Context/AgentInput.php` around lines 152 - 154, The emitted source
objects in AgentInput.php use the wrong type string; change the 'type' field
from 's3_location' to the wrapper wire-contract value 's3' wherever $source is
constructed for S3 image blocks (both the block starting at the $source = [...]
near the first occurrence and the similar block around lines 161-165). Update
the $source arrays in the relevant methods in the AgentInput class so they emit
'type' => 's3' to conform to the Strands HTTP Wire Contract v1.

Comment thread src/Response/AgentResponse.php
Comment thread tests/Unit/Streaming/StreamEventCitationTest.php

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.claude/hooks/deny-dangerous.sh:
- Around line 1527-1529: The guard currently requires whitespace after
redirection, letting constructs like `>package-lock.json` bypass protection;
update the regex that tests the cmd variable so the redirection tokens include
the no-space variants (>, >>, >|) and allow zero-or-more spaces after them (use
[[:space:]]*), keeping the existing checks for tee and sed -i; ensure the
updated pattern still references $cmd and invokes block("Direct lockfile
modification...") when matched.

In `@scripts/dependencies-update.sh`:
- Around line 69-76: The script checks for npm on PATH before verifying
package.json, causing failures in environments without npm even when no
package.json exists; update the RUN_NPM branch so it first checks for the
presence of package.json (the file test for package.json) and if absent logs the
warn "No package.json found - skipping npm update" and skips npm logic, and only
when package.json exists then test for npm with command -v npm and error out if
missing; adjust the ordering around the RUN_NPM conditional (the if [[
"$RUN_NPM" == "true" ]] block) to perform the package.json existence check
before invoking command -v npm.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 1e2cf44d-1895-4745-8f5c-b17cdbf3ecf5

📥 Commits

Reviewing files that changed from the base of the PR and between 3c2efd0 and 7bb28d6.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (4)
  • .claude/hooks/deny-dangerous.sh
  • scripts/bump-version.sh
  • scripts/dependencies-install.sh
  • scripts/dependencies-update.sh
✅ Files skipped from review due to trivial changes (1)
  • scripts/dependencies-install.sh

Comment thread .claude/hooks/deny-dangerous.sh Outdated
Comment thread scripts/dependencies-update.sh

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7bb28d6aa5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +49 to +50
type: is_string($data['type'] ?? null) ? $data['type'] : null,
action: is_string($data['action'] ?? null) ? $data['action'] : null,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Hydrate normalized guardrail assessment keys

GuardrailAssessment::fromArray() currently only maps type/action plus policy objects, but this commit’s wire-contract fixtures for guardrails use normalized assessment entries like name, result, and confidence (e.g., tests/Fixtures/wire-contract/invoke-response-guardrail.json and stream-guardrail.sse). When callers use GuardrailTrace::getAssessmentObjects(), those valid assessments hydrate into DTOs with null fields, so typed consumers lose whether a guardrail blocked or allowed content; the mapper should preserve the normalized keys as typed properties (or equivalent) to avoid dropping this data.

Useful? React with 👍 / 👎.

@mattyhansen mattyhansen requested a review from Copilot May 24, 2026 05:49
@mattyhansen mattyhansen changed the title v1.5.0 client features and GOAT Flow docs v1.5.0 May 24, 2026

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 185 out of 188 changed files in this pull request and generated 2 comments.

Comment on lines +73 to +78
private static function int(array $data, string $key): ?int
{
$value = $data[$key] ?? null;

return is_int($value) ? $value : null;
}
Comment on lines +42 to +45
protected function onError(StreamEvent $event): ?bool
{
$this->writeError("Error [{$event->errorCode}]: {$event->errorMessage}" . PHP_EOL);

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 629933479a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/StrandsClient.php Outdated
static fn (RequestMiddleware $mw): bool => $mw instanceof ResponseObserver,
));

return [...$observerMiddleware, ...$responseObservers];

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Deduplicate response observers before dispatch

normaliseResponseObservers() concatenates middleware-derived observers with explicitly provided observers without removing duplicates. In the new Symfony wiring, services that implement both RequestMiddleware and ResponseObserver are tagged into both lists, so the same observer instance is notified twice per request (afterInvoke/afterStream/etc.). This double-fires side effects like metrics increments or audit/log sinks and can inflate observability data in production.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
.goat-flow/patterns/testing.md (1)

10-10: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Correct the SSE fixture file extension.

The documentation states SSE fixtures use sse-*.txt naming, but the actual fixture files in tests/Fixtures/wire-contract/ use the .sse extension.

📝 Proposed fix
-**Approach:** Use static fixture files in `tests/Fixtures/` rather than inline response data. JSON fixtures (`invoke-*.json`) provide canned agent responses for invoke tests. SSE text fixtures (`sse-*.txt`) provide raw event streams for streaming/parser tests. Tests load fixtures via `file_get_contents(__DIR__ . '/../Fixtures/<name>')` and feed them to mocked transports. Add new fixtures for new response shapes rather than embedding large JSON/SSE payloads in test methods.
+**Approach:** Use static fixture files in `tests/Fixtures/` rather than inline response data. JSON fixtures (`invoke-*.json`) provide canned agent responses for invoke tests. SSE fixtures (`*.sse`) provide raw event streams for streaming/parser tests. Tests load fixtures via `file_get_contents(__DIR__ . '/../Fixtures/<name>')` and feed them to mocked transports. Add new fixtures for new response shapes rather than embedding large JSON/SSE payloads in test methods.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.goat-flow/patterns/testing.md at line 10, Docs and fixtures disagree on SSE
fixture extension: tests/Fixtures/wire-contract/ uses .sse but the docs and
loading convention expect sse-*.txt. Rename the SSE fixture files in
tests/Fixtures/wire-contract/ from *.sse to sse-*.txt (or update the doc if you
prefer the .sse extension) and update any references that load them via
file_get_contents(__DIR__ . '/../Fixtures/<name>') so they match the sse-*.txt
pattern; also confirm the .goat-flow/patterns/testing.md line listing "SSE text
fixtures (`sse-*.txt`)" stays consistent with the actual fixture filenames.
src/Response/AgentResponse.php (1)

98-123: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use project exception hierarchy for structured-output hydration failures.

The structuredOutputAs() method throws generic \RuntimeException which violates the coding guideline. Per the guidelines: "Use custom exceptions: StrandsException, AgentErrorException, and StreamInterruptedException."

The method should throw StrandsException instead of \RuntimeException at lines 101 and 118-121, and the PHPDoc at line 96 should be updated accordingly.

As per coding guidelines: "Use custom exceptions: StrandsException, AgentErrorException, and StreamInterruptedException. Pass structured agent error bodies via responseBody where available and document thrown exceptions in PHPDoc".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Response/AgentResponse.php` around lines 98 - 123, The method
structuredOutputAs currently throws generic \RuntimeException on missing
structured output and on hydration failures; replace those throws with the
project custom StrandsException (imported/fully-qualified as needed) so both
error paths use StrandsException and include the original throwable as previous
where applicable; update the PHPDoc above structuredOutputAs to declare that it
throws StrandsException (and document that responseBody may contain structured
agent error info if relevant) and ensure the thrown StrandsException messages
preserve the existing messages and previous exception chaining from the caught
\Throwable.
🧹 Nitpick comments (2)
src/Integration/Symfony/DependencyInjection/StrandsExtension.php (1)

7-14: ⚡ Quick win

Reorder imports per coding guidelines.

Imports should be alphabetical in groups: PHP built-ins, third-party (Symfony), then StrandsPhpClient\.... Currently, StrandsPhpClient imports appear before Symfony imports.

📦 Proposed fix
-use StrandsPhpClient\Http\RequestMiddleware;
-use StrandsPhpClient\Http\ResponseObserver;
-use StrandsPhpClient\StrandsClient;
 use Symfony\Component\DependencyInjection\Argument\TaggedIteratorArgument;
 use Symfony\Component\DependencyInjection\ContainerBuilder;
 use Symfony\Component\DependencyInjection\Definition;
 use Symfony\Component\DependencyInjection\Reference;
 use Symfony\Component\HttpKernel\DependencyInjection\Extension;
+use StrandsPhpClient\Http\RequestMiddleware;
+use StrandsPhpClient\Http\ResponseObserver;
+use StrandsPhpClient\StrandsClient;

As per coding guidelines: Imports are alphabetical in groups: PHP built-ins, third-party, then StrandsPhpClient\...

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Integration/Symfony/DependencyInjection/StrandsExtension.php` around
lines 7 - 14, Reorder the use statements in StrandsExtension.php so they follow
the project import ordering and alphabetical grouping: first any PHP built-ins
(if present), then third-party Symfony imports (Argument\TaggedIteratorArgument,
ContainerBuilder, Definition, Reference,
HttpKernel\DependencyInjection\Extension) sorted alphabetically, and finally the
StrandsPhpClient imports (RequestMiddleware, ResponseObserver, StrandsClient)
sorted alphabetically; update the import block to reflect this order so the
Symfony imports come before the StrandsPhpClient imports.
src/Context/AgentInput.php (1)

425-436: ⚡ Quick win

Tighten citations PHPDoc from array<string, mixed> to a concrete shape (or justified union).

The new public document APIs propagate array<string, mixed>|null through documentBlock(), which weakens static guarantees on a new contract surface.

As per coding guidelines: **/*.php: "Use fully qualified PHPDoc array shapes such as array<string, string>; avoid bare array and mixed unless justified".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Context/AgentInput.php` around lines 425 - 436, The PHPDoc for
documentBlock's $citations parameter is too loose (array<string,mixed>|null)
which weakens type guarantees; update the docblock for the private static
function documentBlock to specify a concrete array shape (e.g., array<int,
array{source:string, title?:string, url?:string}>|null or another justified
union) that matches the actual citation structure used by callers, adjust the
return PHPDoc if it depends on citation shape, and update any callsites that
pass citations to conform to the new shape (or provide a justified union type if
multiple shapes are required) so static analysis enforces the real contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Context/AgentInput.php`:
- Around line 321-336: Add a unit test for AgentInput::withCachePoint that
verifies immutability: create an AgentInput instance, capture its original state
(e.g., contentBlocks), call withCachePoint('default', '123'), and assert the
returned instance is a different object with the new 'cache_point' block while
the original AgentInput (the pre-call instance) remains unchanged (no new block
and identical contentBlocks); target the withCachePoint method to prevent
clone-removal regressions.

In `@src/StrandsClient.php`:
- Around line 765-774: The normaliseResponseObservers method currently
concatenates middleware-derived observers ($observerMiddleware) with explicit
$responseObservers which can cause the same ResponseObserver instance to run
twice; update normaliseResponseObservers to merge both lists and remove
duplicates (by object identity, e.g. using spl_object_hash) while preserving
order so each ResponseObserver runs only once; operate on the arrays derived
from RequestMiddleware and $responseObservers and return the deduplicated list
of ResponseObserver instances.

---

Duplicate comments:
In @.goat-flow/patterns/testing.md:
- Line 10: Docs and fixtures disagree on SSE fixture extension:
tests/Fixtures/wire-contract/ uses .sse but the docs and loading convention
expect sse-*.txt. Rename the SSE fixture files in tests/Fixtures/wire-contract/
from *.sse to sse-*.txt (or update the doc if you prefer the .sse extension) and
update any references that load them via file_get_contents(__DIR__ .
'/../Fixtures/<name>') so they match the sse-*.txt pattern; also confirm the
.goat-flow/patterns/testing.md line listing "SSE text fixtures (`sse-*.txt`)"
stays consistent with the actual fixture filenames.

In `@src/Response/AgentResponse.php`:
- Around line 98-123: The method structuredOutputAs currently throws generic
\RuntimeException on missing structured output and on hydration failures;
replace those throws with the project custom StrandsException
(imported/fully-qualified as needed) so both error paths use StrandsException
and include the original throwable as previous where applicable; update the
PHPDoc above structuredOutputAs to declare that it throws StrandsException (and
document that responseBody may contain structured agent error info if relevant)
and ensure the thrown StrandsException messages preserve the existing messages
and previous exception chaining from the caught \Throwable.

---

Nitpick comments:
In `@src/Context/AgentInput.php`:
- Around line 425-436: The PHPDoc for documentBlock's $citations parameter is
too loose (array<string,mixed>|null) which weakens type guarantees; update the
docblock for the private static function documentBlock to specify a concrete
array shape (e.g., array<int, array{source:string, title?:string,
url?:string}>|null or another justified union) that matches the actual citation
structure used by callers, adjust the return PHPDoc if it depends on citation
shape, and update any callsites that pass citations to conform to the new shape
(or provide a justified union type if multiple shapes are required) so static
analysis enforces the real contract.

In `@src/Integration/Symfony/DependencyInjection/StrandsExtension.php`:
- Around line 7-14: Reorder the use statements in StrandsExtension.php so they
follow the project import ordering and alphabetical grouping: first any PHP
built-ins (if present), then third-party Symfony imports
(Argument\TaggedIteratorArgument, ContainerBuilder, Definition, Reference,
HttpKernel\DependencyInjection\Extension) sorted alphabetically, and finally the
StrandsPhpClient imports (RequestMiddleware, ResponseObserver, StrandsClient)
sorted alphabetically; update the import block to reflect this order so the
Symfony imports come before the StrandsPhpClient imports.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 4ad0cfa2-4f88-4347-b5b6-14a066a97067

📥 Commits

Reviewing files that changed from the base of the PR and between 7bb28d6 and 6299334.

📒 Files selected for processing (60)
  • .goat-flow/decisions/ADR-002-otel-response-observation.md
  • .goat-flow/footguns/contract.md
  • .goat-flow/footguns/generated-files.md
  • .goat-flow/footguns/observability.md
  • .goat-flow/lessons/verification.md
  • .goat-flow/patterns/observability.md
  • .goat-flow/patterns/release.md
  • .goat-flow/patterns/testing.md
  • CHANGELOG.md
  • README.md
  • docs/laravel-config.md
  • docs/rich-input.md
  • docs/symfony-config.md
  • docs/usage-guide.md
  • docs/wire-contract-audit.md
  • docs/wire-contract-consumer-matrix.md
  • docs/wire-contract.md
  • examples/python-gateway/README.md
  • examples/python-gateway/app.py
  • examples/python-gateway/contract.py
  • examples/python-gateway/tests/smoke_contract.py
  • examples/python-gateway/tracing.py
  • src/Context/AgentInput.php
  • src/Http/Middleware/OtelTracingMiddleware.php
  • src/Http/ResponseObserver.php
  • src/Integration/Laravel/StrandsServiceProvider.php
  • src/Integration/StrandsClientFactory.php
  • src/Integration/Symfony/DependencyInjection/StrandsExtension.php
  • src/Response/AgentResponse.php
  • src/Response/Message.php
  • src/Response/MessageMetadata.php
  • src/Response/Usage.php
  • src/StrandsClient.php
  • src/Streaming/StreamEvent.php
  • src/Streaming/StreamResult.php
  • src/Streaming/StreamSseSummary.php
  • tests/Fixtures/wire-contract/consumer-summit-stream.sse
  • tests/Fixtures/wire-contract/custom-ambient-history-response.json
  • tests/Fixtures/wire-contract/custom-ambient-roles-response.json
  • tests/Fixtures/wire-contract/custom-halaxy-file-metadata-response.json
  • tests/Fixtures/wire-contract/custom-halaxy-file-summarise-stream.sse
  • tests/Fixtures/wire-contract/custom-halaxy-suggested-actions-analysis-response.json
  • tests/Fixtures/wire-contract/custom-healthkit-booking-respond-stream.sse
  • tests/Fixtures/wire-contract/custom-healthkit-chat-response.json
  • tests/Fixtures/wire-contract/custom-healthkit-intent-response.json
  • tests/Fixtures/wire-contract/discovery-response.json
  • tests/Fixtures/wire-contract/invoke-request-cache-point.json
  • tests/Fixtures/wire-contract/invoke-request-document-context-citations.json
  • tests/Fixtures/wire-contract/invoke-response-context-size.json
  • tests/Fixtures/wire-contract/invoke-response-message-metadata.json
  • tests/Fixtures/wire-contract/invoke-response-tools-full.json
  • tests/Http/Middleware/OtelTracingMiddlewareTest.php
  • tests/Http/Middleware/OtelTracingPhiSafetyTest.php
  • tests/Unit/AgentInputTest.php
  • tests/Unit/AgentResponseTest.php
  • tests/Unit/Contract/ConsumerCompatibilityTest.php
  • tests/Unit/Contract/WireContractFixtureTest.php
  • tests/Unit/ResponseObserverTest.php
  • tests/Unit/StrandsClientStreamTest.php
  • tests/Unit/StreamParserTest.php
✅ Files skipped from review due to trivial changes (23)
  • tests/Fixtures/wire-contract/custom-ambient-roles-response.json
  • tests/Fixtures/wire-contract/custom-healthkit-chat-response.json
  • tests/Fixtures/wire-contract/custom-halaxy-file-summarise-stream.sse
  • tests/Fixtures/wire-contract/custom-halaxy-suggested-actions-analysis-response.json
  • tests/Fixtures/wire-contract/discovery-response.json
  • .goat-flow/footguns/contract.md
  • tests/Fixtures/wire-contract/custom-healthkit-intent-response.json
  • .goat-flow/footguns/observability.md
  • tests/Fixtures/wire-contract/invoke-response-tools-full.json
  • tests/Fixtures/wire-contract/invoke-response-message-metadata.json
  • .goat-flow/footguns/generated-files.md
  • tests/Fixtures/wire-contract/invoke-request-document-context-citations.json
  • .goat-flow/lessons/verification.md
  • examples/python-gateway/README.md
  • tests/Fixtures/wire-contract/custom-healthkit-booking-respond-stream.sse
  • .goat-flow/patterns/observability.md
  • tests/Fixtures/wire-contract/invoke-response-context-size.json
  • docs/laravel-config.md
  • docs/wire-contract-consumer-matrix.md
  • .goat-flow/decisions/ADR-002-otel-response-observation.md
  • docs/wire-contract-audit.md
  • README.md
  • docs/rich-input.md

Comment on lines +321 to +336
public function withCachePoint(string $type = 'default', ?string $ttl = null): self
{
$clone = clone $this;
$block = [
'type' => 'cache_point',
'cache_type' => $type,
];

if ($ttl !== null) {
$block['ttl'] = $ttl;
}

$clone->contentBlocks[] = $block;

return $clone;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add an immutability unit test for withCachePoint() to prevent clone regressions.

Mutation testing already found an escaped clone-removal mutant here; a focused test that asserts the original instance is unchanged after withCachePoint() will close that gap.

As per coding guidelines: **/*.php: "AgentContext style builders are immutable clone-and-mutate APIs".

🧰 Tools
🪛 GitHub Check: Mutation Testing

[warning] 323-323:
Escaped Mutant for Mutator "CloneRemoval":

@@ @@
*/
public function withCachePoint(string $type = 'default', ?string $ttl = null): self
{

  •    $clone = clone $this;
    
  •    $clone = $this;
       $block = [
           'type' => 'cache_point',
           'cache_type' => $type,
    
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Context/AgentInput.php` around lines 321 - 336, Add a unit test for
AgentInput::withCachePoint that verifies immutability: create an AgentInput
instance, capture its original state (e.g., contentBlocks), call
withCachePoint('default', '123'), and assert the returned instance is a
different object with the new 'cache_point' block while the original AgentInput
(the pre-call instance) remains unchanged (no new block and identical
contentBlocks); target the withCachePoint method to prevent clone-removal
regressions.

Comment thread src/StrandsClient.php

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d29f351b28

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +231 to +235
if ($summary->usage !== null) {
self::setUsageAttributes($span, $summary->usage);
}

if ($summary->stopReason !== null) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Mark stream_sse spans as failed on terminal error events

streamSse() now tracks terminal event type (complete vs error) in StreamSseSummary, but afterStreamSse() never uses terminalType to set an error status. When a custom SSE endpoint emits a terminal error event but returns HTTP 200, the span is currently recorded as successful, which hides real stream failures in observability dashboards and alerting. Set span status to STATUS_ERROR (and ideally an error description) when terminalType === 'error'.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/Context/AgentInput.php (1)

139-139: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use wire-contract s3 source type instead of s3_location.

Lines 139, 166, and 301 emit 'type' => 's3_location', which diverges from the Strands HTTP Wire Contract v1. The wrapper contract expects 'type' => 's3' for S3-backed content sources.

🔧 Proposed fix

Apply to all three S3 source constructions:

         $source = [
-            'type' => 's3_location',
+            'type' => 's3',
             'uri' => $s3Uri,
         ];

As per coding guidelines: This client targets the Strands HTTP Wire Contract v1 emitted by wrapper services, not raw sdk-python TypedDict shapes.

Also applies to: 166-166, 301-301

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Context/AgentInput.php` at line 139, Update the S3 source type string to
match the Strands HTTP Wire Contract v1: replace all occurrences of "'type' =>
's3_location'" with "'type' => 's3'" in the AgentInput class where S3-backed
content sources are constructed (ensure you change the three spots referenced in
the review), i.e., within the S3 source construction logic used by AgentInput
(update the arrays/typed dicts created by the methods that build S3 sources so
they emit type: "s3" instead of "s3_location").
src/Response/AgentResponse.php (1)

145-155: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep wrapper metadata out of the forward-compat bag.

fromArray() now lifts the top-level metadata field into $wrapperMetadata, but it is still missing from $knownKeys. That means the same payload is also copied into $metadata, even though $metadata is documented as “unrecognised top-level response fields”. Add 'metadata' to $knownKeys so callers do not see the same field through two different public properties.

🩹 Suggested change
         $knownKeys = [
             'text', 'agent', 'session_id', 'usage', 'tools_used',
             'has_objective', 'stop_reason', 'structured_output',
             'interrupts', 'guardrail_trace', 'trace', 'message',
-            'context_size', 'projected_context_size',
+            'context_size', 'projected_context_size', 'metadata',
         ];
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Response/AgentResponse.php` around lines 145 - 155, The top-level
"metadata" field is being lifted into $wrapperMetadata but not excluded from the
forward-compat bag, so it still appears in $metadata; update the $knownKeys
array in AgentResponse::fromArray (the $knownKeys variable) to include
'metadata' so array_diff_key excludes it and callers don't see duplicate data
between $metadata and $wrapperMetadata (leave the existing
$rawWrapperMetadata/$wrapperMetadata handling unchanged).
🧹 Nitpick comments (2)
src/Context/AgentInput.php (2)

405-410: 💤 Low value

Consider adding a test case to cover the explode limit parameter.

The explode limit of 2 is correct for typical MIME types, but mutation testing shows that changing it to 3 doesn't affect behavior (because typical image MIME types have no second slash). Adding a test with an edge-case MIME type would kill this mutant and document the expected behavior.

💡 Example test case
// Test that derives format correctly even from unusual MIME types
$this->assertEquals('png', AgentInput::deriveImageFormat('image/png'));
$this->assertEquals('svg+xml', AgentInput::deriveImageFormat('image/svg+xml'));
$this->assertEquals('vnd.custom', AgentInput::deriveImageFormat('image/vnd.custom'));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Context/AgentInput.php` around lines 405 - 410, Add a unit test for
AgentInput::deriveImageFormat that covers edge-case MIME types with additional
subtypes (e.g., "image/svg+xml" and "image/vnd.custom") to ensure the
explode('/', $mediaType, 2) behavior is preserved; specifically assert
deriveImageFormat('image/png') == 'png', deriveImageFormat('image/svg+xml') ==
'svg+xml', and deriveImageFormat('image/vnd.custom') == 'vnd.custom' so the
mutation that changes the explode limit is caught and the expected behavior is
documented.

323-338: ⚡ Quick win

Add an immutability unit test for withCachePoint().

Mutation testing flagged an escaped clone-removal mutant in this method. A focused test asserting that the original instance remains unchanged after calling withCachePoint() will prevent regressions.

Do you want me to generate a unit test that verifies the clone behavior and ensures the original AgentInput instance is not modified?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Context/AgentInput.php` around lines 323 - 338, Add a unit test for
AgentInput::withCachePoint that asserts immutability: create an AgentInput
instance, record its original contentBlocks (or whole state), call
withCachePoint('foo', '60s'), then assert the returned instance is not the same
object as the original (!==), assert the original instance's contentBlocks/state
is unchanged (same count and same contents), and assert the returned instance
has one additional content block matching
['type'=>'cache_point','cache_type'=>'foo','ttl'=>'60s']; also add a
complementary assertion for the case when ttl is null to verify ttl is absent in
the new block.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.claude/hooks/deny-dangerous.sh:
- Around line 1641-1648: split_command_segments() currently splits on any '&'
when "$next" != '&', which incorrectly splits redirection tokens like "2>&1" and
"&>out.log"; introduce a prev char variable updated each iteration and change
the split condition to only treat '&' as a background separator when it is not
part of a redirection token: i.e., only split when char == '&' && next != '&' &&
prev is not a digit && prev is not '>' or '<' && next is not '>' (so '&' in
"2>&1", "<&..." or "&>file" won’t split), updating split_segments and current as
before and continuing; reference function split_command_segments() and variables
char, next, prev, split_segments, current when making the change.

In `@src/Http/Middleware/OtelTracingMiddleware.php`:
- Around line 65-70: The current unconditional call to endOrphanedSpans() at the
top of OtelTracingMiddleware::beforeRequest() prematurely closes legitimate
parent spans when the middleware is re-used; remove that top-level call and
instead invoke endOrphanedSpans() only on explicit setup-failure paths (for
example inside the catch blocks that handle failures during span initialization
in beforeRequest() or where buildRequest()/invoke() fails in
StrandsClient::invoke()), so that orphan cleanup runs only when span setup
actually failed and not on every new request.

---

Outside diff comments:
In `@src/Context/AgentInput.php`:
- Line 139: Update the S3 source type string to match the Strands HTTP Wire
Contract v1: replace all occurrences of "'type' => 's3_location'" with "'type'
=> 's3'" in the AgentInput class where S3-backed content sources are constructed
(ensure you change the three spots referenced in the review), i.e., within the
S3 source construction logic used by AgentInput (update the arrays/typed dicts
created by the methods that build S3 sources so they emit type: "s3" instead of
"s3_location").

In `@src/Response/AgentResponse.php`:
- Around line 145-155: The top-level "metadata" field is being lifted into
$wrapperMetadata but not excluded from the forward-compat bag, so it still
appears in $metadata; update the $knownKeys array in AgentResponse::fromArray
(the $knownKeys variable) to include 'metadata' so array_diff_key excludes it
and callers don't see duplicate data between $metadata and $wrapperMetadata
(leave the existing $rawWrapperMetadata/$wrapperMetadata handling unchanged).

---

Nitpick comments:
In `@src/Context/AgentInput.php`:
- Around line 405-410: Add a unit test for AgentInput::deriveImageFormat that
covers edge-case MIME types with additional subtypes (e.g., "image/svg+xml" and
"image/vnd.custom") to ensure the explode('/', $mediaType, 2) behavior is
preserved; specifically assert deriveImageFormat('image/png') == 'png',
deriveImageFormat('image/svg+xml') == 'svg+xml', and
deriveImageFormat('image/vnd.custom') == 'vnd.custom' so the mutation that
changes the explode limit is caught and the expected behavior is documented.
- Around line 323-338: Add a unit test for AgentInput::withCachePoint that
asserts immutability: create an AgentInput instance, record its original
contentBlocks (or whole state), call withCachePoint('foo', '60s'), then assert
the returned instance is not the same object as the original (!==), assert the
original instance's contentBlocks/state is unchanged (same count and same
contents), and assert the returned instance has one additional content block
matching ['type'=>'cache_point','cache_type'=>'foo','ttl'=>'60s']; also add a
complementary assertion for the case when ttl is null to verify ttl is absent in
the new block.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f34c353e-87f2-4733-a88e-e79947996539

📥 Commits

Reviewing files that changed from the base of the PR and between 6299334 and d29f351.

📒 Files selected for processing (17)
  • .claude/hooks/deny-dangerous.sh
  • .claude/hooks/guard-write-size.sh
  • .claude/settings.json
  • .goat-flow/patterns/testing.md
  • CLAUDE.md
  • docs/wire-contract-audit.md
  • scripts/dependencies-update.sh
  • src/Context/AgentInput.php
  • src/Http/Middleware/OtelTracingMiddleware.php
  • src/Response/AgentResponse.php
  • src/Response/Citation/CitationLocation.php
  • src/Response/GuardrailAssessment.php
  • src/StrandsClient.php
  • src/Streaming/PrintingCallbackHandler.php
  • tests/Unit/AgentInputTest.php
  • tests/Unit/Response/Citation/CitationLocationTest.php
  • tests/Unit/Streaming/StreamEventCitationTest.php
💤 Files with no reviewable changes (1)
  • tests/Unit/Streaming/StreamEventCitationTest.php
✅ Files skipped from review due to trivial changes (1)
  • docs/wire-contract-audit.md

Comment thread .claude/hooks/deny-dangerous.sh Outdated
Comment on lines +65 to +70
// Drain any spans left over from a previous request whose afterResponse()
// never fired (e.g. buildRequest() failed before the try/catch in
// StrandsClient::invoke()). Without this, the active scope leaks into
// subsequent operations and produces incorrect parent-child traces.
$this->endOrphanedSpans();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Unconditional orphan cleanup tears down valid parent spans.

Calling endOrphanedSpans() at the top of every beforeRequest() will also close any still-live span when the same middleware instance is reused re-entrantly before its matching afterResponse() fires. That turns legitimate nesting into an “orphaned span” error and breaks parent-child trace linkage. The cleanup should run only on explicit setup-failure paths, not on every new request.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Http/Middleware/OtelTracingMiddleware.php` around lines 65 - 70, The
current unconditional call to endOrphanedSpans() at the top of
OtelTracingMiddleware::beforeRequest() prematurely closes legitimate parent
spans when the middleware is re-used; remove that top-level call and instead
invoke endOrphanedSpans() only on explicit setup-failure paths (for example
inside the catch blocks that handle failures during span initialization in
beforeRequest() or where buildRequest()/invoke() fails in
StrandsClient::invoke()), so that orphan cleanup runs only when span setup
actually failed and not on every new request.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 185 out of 188 changed files in this pull request and generated 3 comments.

Comment thread src/Response/Usage.php
Comment on lines 16 to +28
@@ -24,6 +25,7 @@ public function __construct(
public readonly int $cacheWriteInputTokens = 0,
public readonly int $latencyMs = 0,
public readonly int $timeToFirstByteMs = 0,
public readonly int $totalTokens = 0,
Comment thread .goat-flow/glossary.md Outdated
Comment on lines +7 to +8
- **StreamEvent** — A single typed event from an SSE stream. Types: Text, Thinking, ToolUse, ToolResult, Citation, Complete, Error, ReasoningSignature. Terminal events: Complete and Error.
- **StreamResult** — Accumulated result from a streaming session: full text, session ID, usage, tools used, TTFT (time to first text token), interrupts, guardrail trace, citations.
Comment thread .goat-flow/glossary.md Outdated
- **RequestMiddleware** — Interface for pre-/post-request hooks. `beforeRequest()` runs before auth; `afterResponse()` runs after completion (exceptions swallowed).
- **StrandsClientFactory** — Shared factory used by both Laravel and Symfony integrations to construct `StrandsClient` instances from framework config.
- **preflight** — The `composer preflight` command that runs all quality gates: PHPUnit, PHPStan, PHP-CS-Fixer, PHPMD, cyclomatic complexity, and optionally Infection.
- **StopReason** — Backed enum for why an agent stopped: `EndTurn`, `ToolUse`, `MaxTokens`, `StopSequence`, `GuardrailIntervened`, `Interrupted`.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 185 out of 188 changed files in this pull request and generated 6 comments.

Comment thread composer.json
Comment on lines 43 to 49
"suggest": {
"symfony/http-client": "Required for SSE streaming support via SymfonyHttpTransport. Without it, only invoke() is available via PsrHttpTransport.",
"symfony/framework-bundle": "Required for Symfony bundle integration (^6.4|^7.0)",
"illuminate/support": "Required for Laravel service provider integration (^10.0|^11.0|^12.0)",
"guzzlehttp/guzzle": "PSR-18 HTTP client -use with PsrHttpTransport for invoke() support without Symfony."
"guzzlehttp/guzzle": "PSR-18 HTTP client -use with PsrHttpTransport for invoke() support without Symfony.",
"open-telemetry/api": "Required to use OtelTracingMiddleware for distributed tracing of agent calls"
},
Comment thread docs/usage-guide.md
Comment on lines 949 to +959
Example SSE output:

```

### OpenTelemetry trace continuation

The PHP client injects W3C `traceparent` and `tracestate` headers when `OtelTracingMiddleware` is configured. Python wrappers should extract those headers and start wrapper/sdk-python spans as children of the PHP client span.

The reference middleware lives in `examples/python-gateway/tracing.py`. It continues the inbound trace and records only safe wrapper metadata such as operation, route, and wire version. Do not attach prompts, response text, document content, filenames, raw context metadata, raw tool payloads, or session ID values to spans.

Custom endpoints should follow the same rule. `postJson()` and `streamSse()` may carry app-owned payloads, so telemetry should summarize them by route, status, duration, event counts, terminal state, and safe usage counters only.
Comment thread tests/Unit/ResponseObserverTest.php Outdated
Comment on lines +65 to +67
$client->stream('hello', static function (): void {
});
}
Comment thread tests/Unit/ResponseObserverTest.php Outdated
Comment on lines +121 to +122
$client->streamSse('/custom-stream', ['message' => 'hello'], static function (): void {
});
Comment on lines +16 to +17
$received = null;
$handler = new class () extends StreamCallbackHandler {
Comment thread .goat-flow/glossary.md Outdated
Comment on lines +16 to +18
- **preflight** — The `composer preflight` command that runs all quality gates: PHPUnit, PHPStan, PHP-CS-Fixer, PHPMD, cyclomatic complexity, and optionally Infection.
- **StopReason** — Backed enum for why an agent stopped: `EndTurn`, `ToolUse`, `MaxTokens`, `StopSequence`, `GuardrailIntervened`, `Interrupted`.
- **TTFT** — Time to First Token. Measured client-side in `StreamResult` as the elapsed time from request start to the first Text event.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 796279b250

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// never fired (e.g. buildRequest() failed before the try/catch in
// StrandsClient::invoke()). Without this, the active scope leaks into
// subsequent operations and produces incorrect parent-child traces.
$this->endOrphanedSpans();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve active spans when starting nested requests

Do not clear the stack unconditionally at the start of beforeRequest(): this call ends any currently active span, including legitimate in-flight spans. If user code makes a nested/re-entrant Strands call (for example, an onEvent callback in stream() triggering another client call on the same middleware instance), the outer span is popped here, marked orphaned, and ended before its own afterResponse() runs, which corrupts parent/child relationships and drops final status/duration for the outer operation.

Useful? React with 👍 / 👎.

Comment on lines +191 to +196
self::setUsageAttributes($span, $result->usage);
self::setToolsAttributes($span, $result->toolsUsed);
$span->setAttribute('strands.session.present', $result->sessionId !== null);
$span->setAttribute('strands.stream.text_events', $result->textEvents);
$span->setAttribute('strands.stream.total_events', $result->totalEvents);
$span->setAttribute('strands.stream.cancelled', $result->cancelled);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Mark stream spans failed on terminal error events

Handle terminal error events from /stream as failures in tracing. The stream path can finish with an SSE error event (treated as terminal by the parser) without throwing an exception, so afterResponse() receives HTTP 200 and this observer currently records only normal attributes, leaving the span successful even though the agent reported an error. Propagate terminal event type into StreamResult (or equivalent) and set STATUS_ERROR here when that terminal type is error.

Useful? React with 👍 / 👎.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8880ab10ff

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .gitignore
@@ -1,17 +1,60 @@
# Dependencies
/vendor/
<<<<<<< Updated upstream

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove unresolved merge markers from .gitignore

Resolve the merge conflict markers that were committed into .gitignore (<<<<<<<, =======, >>>>>>>). Leaving conflict markers in versioned files means this merge was not actually resolved and can silently pollute ignore rules (the marker lines are parsed as literal patterns), while also signaling an inconsistent tree state for downstream automation and maintainers. I also found the same marker pattern in .github/git-commit-instructions.md and .goat-flow/decisions/README.md, so this needs a full cleanup before release.

Useful? React with 👍 / 👎.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0785f92e02

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +322 to +324
$accept = $headers['Accept'] ?? '';

return str_contains($accept, 'text/event-stream')

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Normalize Accept header lookup for operation detection

deriveOperationName() reads only $headers['Accept'], but HTTP header names are case-insensitive and middleware can legally rebuild the header map using a different key casing (for example accept). In that case, SSE requests are misclassified as post_json instead of stream_sse, which corrupts span operation names and any metrics/alerts grouped by operation type. Use a case-insensitive header lookup (or normalize keys before access) before checking for text/event-stream.

Useful? React with 👍 / 👎.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fca51033a6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +412 to +414
$parts = explode('/', $mediaType, 2);

return $parts[1] ?? $mediaType;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Strip MIME parameters before deriving image format

deriveImageFormat() currently returns everything after the first /, so inputs like image/jpeg; charset=binary (or other parameterized MIME values passed through from HTTP metadata) produce an invalid format such as jpeg; charset=binary. Because both withImage() and withImageFromUrl() now send this format into wire-contract image blocks, strict wrappers can reject otherwise valid image requests at runtime. Normalize the subtype (e.g., split on ; and trim/lowercase) before assigning format.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants