
docs: add manual A365 span instrumentation guide (without SDK)#255

Open
juliomenendez wants to merge 11 commits into main from docs/manual-a365-span-instrumentation

Conversation

@juliomenendez
Contributor

Adds documentation for teams that want A365 portal compatibility without the SDK dependency.

What this covers

  • Tiered attribute contract (required/recommended/optional) for all three span types (invoke_agent, inference, execute_tool)
  • Export protocol documentation (endpoint URL, auth, payload format, constraints, retry strategy)
  • Complete runnable Python examples using only opentelemetry-sdk + requests:
    • Example 1: Minimal invoke_agent span
    • Example 2: Full agent turn with span hierarchy
    • Example 3: Custom SpanExporter implementation for the A365 backend
    • Example 4: End-to-end agent loop
  • Validation and troubleshooting guide
  • Cross-link from existing integration guide

Audience

Python developers with existing OpenTelemetry instrumentation who want their spans to appear in the Agent 365 portal without importing any microsoft-agents-a365-* package.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

juliomenendez and others added 9 commits May 19, 2026 09:26
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 19, 2026 15:55
@juliomenendez juliomenendez requested a review from a team as a code owner May 19, 2026 15:55
@github-actions

github-actions Bot commented May 19, 2026

⚠️ Deprecation Warning: The deny-licenses option is deprecated for possible removal in the next major release. For more information, see issue 997.

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Adds a standalone documentation path for Python teams to emit Agent 365–compatible spans and export them to the A365 ingest endpoint without depending on any microsoft-agents-a365-* packages. This fits into the repo’s observability story by complementing the existing SDK-based “Integrating with existing OpenTelemetry” guide with a manual, SDK-free option.

Changes:

  • Introduces a new end-user guide (docs/manual-a365-span-instrumentation.md) covering attribute contract, export protocol, and runnable Python examples (including a DIY exporter).
  • Adds internal “superpowers” spec + implementation plan documents capturing the design and rollout steps.
  • Adds a cross-link from the existing OpenTelemetry integration guide to the new manual instrumentation guide.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
|------|-------------|
| docs/superpowers/specs/2026-05-19-manual-a365-span-instrumentation-design.md | Design/spec for the manual instrumentation contract and export protocol. |
| docs/superpowers/plans/2026-05-19-manual-a365-span-instrumentation.md | Execution plan for authoring the manual instrumentation guide and related cross-links. |
| docs/manual-a365-span-instrumentation.md | New user-facing manual instrumentation guide with attribute tables, protocol, and Python examples. |
| docs/integrating-with-existing-opentelemetry.md | Adds a callout link to the new manual instrumentation guide for SDK-free scenarios. |
Comments suppressed due to low confidence (4)

docs/manual-a365-span-instrumentation.md:85

  • The inference span table states gen_ai.operation.name may be TextCompletion or GenerateContent, but those values are currently filtered out by the Agent365 exporter (unsupported inference operation types). Update the table (and/or add an explicit note) so users don’t produce spans that won’t be exported/ingested.
| Tier | Attribute | Expected value | Notes |
|------|-----------|----------------|-------|
| **Required** | `gen_ai.operation.name` | `"Chat"` or `"TextCompletion"` or `"GenerateContent"` | See accepted values above |
| **Required** | `microsoft.tenant.id` | Tenant GUID | Same as parent |
| **Required** | `gen_ai.agent.id` | Agent GUID | Same as parent |
| **Required** | `gen_ai.request.model` | Model name (e.g. `"gpt-4o"`) | |
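
One way to catch this before export is a small pre-flight check over a span's attribute dictionary. The sketch below is illustrative only: `check_inference_attributes` and both constants are hypothetical names, the required list covers just the four rows quoted above, and the accepted-operation set assumes (per this comment) that only `"Chat"` currently passes the exporter's filter.

```python
# Hypothetical pre-flight check; none of these names come from an A365 package.
REQUIRED_INFERENCE_ATTRS = (
    "gen_ai.operation.name",
    "microsoft.tenant.id",
    "gen_ai.agent.id",
    "gen_ai.request.model",
)

# Assumption taken from the review comment: TextCompletion and GenerateContent
# are filtered out, so only "Chat" is treated as exportable here.
EXPORTED_INFERENCE_OPERATIONS = {"Chat"}


def check_inference_attributes(attributes: dict) -> list:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = [
        f"missing required attribute: {name}"
        for name in REQUIRED_INFERENCE_ATTRS
        if not attributes.get(name)
    ]
    operation = attributes.get("gen_ai.operation.name")
    if operation and operation not in EXPORTED_INFERENCE_OPERATIONS:
        problems.append(f"operation {operation!r} may be filtered out by the exporter")
    return problems
```

Running this in a unit test against the attributes an agent actually sets would surface a `TextCompletion` span before it silently disappears from the portal.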

docs/manual-a365-span-instrumentation.md:403

  • The constraints state a 250,000-byte max individual span size, but the SDK’s enforced limit is 250 * 1024 bytes (256,000). Consider documenting this as “250 KiB (256,000 bytes)” or updating the exact byte count to match the implementation so users can size payloads correctly.
| Constraint | Value | Behavior |
|------------|-------|----------|
| Max payload size | ~900,000 bytes | Split spans across multiple POST requests |
| Max individual span | 250,000 bytes | Largest attributes are replaced with `"TRUNCATED"` |
| Retry on | 408, 429, 5xx | Exponential backoff; respect `Retry-After` header for 429 |
| Fail on | Other 4xx | Non-retryable; check auth and payload format |
| Timeout | 30 seconds | Per-request HTTP timeout |
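
The retry rows of this table can be read as a small decision function. The following is a sketch under stated assumptions, not the SDK's implementation: `is_retryable` and `backoff_delay` are hypothetical helpers, the 1-second base and 30-second cap are illustrative values, and only the delta-seconds form of `Retry-After` is handled.

```python
from typing import Optional

# Status codes the constraints table marks as retryable (besides 5xx).
RETRYABLE_STATUS = {408, 429}


def is_retryable(status: int) -> bool:
    """408, 429 and any 5xx are retried; other 4xx responses are treated as fatal."""
    return status in RETRYABLE_STATUS or 500 <= status <= 599


def backoff_delay(
    attempt: int,
    status: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,   # assumed starting delay in seconds
    cap: float = 30.0,   # assumed upper bound in seconds
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if status == 429 and retry_after:
        try:
            # Respect the server's hint when it is given as a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch; fall back to backoff.
    return min(cap, base * (2 ** attempt))
```

A production exporter would usually add random jitter to the computed delay; it is left out here so the behaviour stays deterministic and easy to test.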

docs/manual-a365-span-instrumentation.md:460

  • Example 3’s Agent365ManualExporter claims to be a replacement for the SDK exporter, but it doesn’t implement payload chunking (~900KB limit) or per-span truncation (~250KB) that the doc describes (and that the SDK implements). Either add chunking/truncation to the example or clearly label it as non-production and point readers to the required logic; otherwise the sample will routinely fail with larger traces.
    def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult:
        # Partition by (tenant_id, agent_id)
        groups = self._partition(spans)
        if not groups:
            return SpanExportResult.SUCCESS

        any_failure = False
        for (tenant_id, agent_id), group_spans in groups.items():
            url = (
                f"{A365_ENDPOINT}/observability/tenants/{tenant_id}"
                f"/otlp/agents/{agent_id}/traces?api-version=1"
            )
            payload = self._build_payload(group_spans)
            body = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
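
For reference, the missing logic could look roughly like the sketch below. It assumes spans have already been converted to plain dictionaries with an `attributes` mapping; `encoded_size`, `truncate_span`, and `chunk_by_size` are hypothetical helpers, and the size estimate ignores the envelope and separators that the real payload adds, so the limits should be treated as approximate.

```python
import json

MAX_PAYLOAD_BYTES = 900_000   # approximate payload budget from the constraints table
MAX_SPAN_BYTES = 250 * 1024   # per-span limit cited in the review (256,000 bytes)


def encoded_size(obj) -> int:
    """Size in bytes of the compact JSON encoding, matching the exporter's dumps() options."""
    return len(json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8"))


def truncate_span(span: dict, limit: int = MAX_SPAN_BYTES) -> dict:
    """Replace the largest attribute values with "TRUNCATED" until the span fits."""
    span = {**span, "attributes": dict(span.get("attributes", {}))}  # leave the input untouched
    attrs = span["attributes"]
    while encoded_size(span) > limit:
        candidates = [key for key, value in attrs.items() if value != "TRUNCATED"]
        if not candidates:
            break  # nothing left to shrink; the caller decides whether to drop the span
        largest = max(candidates, key=lambda key: encoded_size(attrs[key]))
        attrs[largest] = "TRUNCATED"
    return span


def chunk_by_size(spans: list, limit: int = MAX_PAYLOAD_BYTES) -> list:
    """Greedily group spans so each group's summed span size stays within `limit`."""
    chunks, current, current_size = [], [], 0
    for span in spans:
        size = encoded_size(span)
        if current and current_size + size > limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(span)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

In the exporter, each `(tenant_id, agent_id)` group would be truncated span by span and then chunked, with one POST per chunk.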

docs/manual-a365-span-instrumentation.md:473

  • The sample exporter always sends an Authorization: Bearer {token} header without validating that the resolver returned a non-empty token. The SDK treats a missing token differently; for manual export, it’s safer to fail the export (or at least log and skip) when the token is empty/None to avoid confusing 401s and hard-to-debug behavior.
            # Resolve auth token
            try:
                token = self._token_resolver(agent_id, tenant_id)
            except Exception as e:
                logger.error(f"Token resolution failed: {e}")
                any_failure = True
                continue

            headers = {
                "content-type": "application/json",
                "authorization": f"Bearer {token}",
            }
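
A minimal sketch of the suggested guard follows. `build_auth_headers` is a hypothetical helper, and the resolver is assumed to take `(agent_id, tenant_id)` and return a string or `None`, as in the excerpt above; returning `None` from the helper signals that the group should be skipped and the export marked as failed.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def build_auth_headers(
    token_resolver: Callable[[str, str], Optional[str]],
    agent_id: str,
    tenant_id: str,
) -> Optional[dict]:
    """Return request headers, or None when no usable token is available."""
    try:
        token = token_resolver(agent_id, tenant_id)
    except Exception as exc:
        logger.error("Token resolution failed: %s", exc)
        return None
    if not token or not token.strip():
        # Fail fast instead of sending "Bearer None" and getting an opaque 401.
        logger.error("Token resolver returned no token for agent %s", agent_id)
        return None
    return {
        "content-type": "application/json",
        "authorization": f"Bearer {token}",
    }
```

Inside `export`, a `None` result would set `any_failure = True` and `continue` to the next group, mirroring how the excerpt already handles resolver exceptions.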

Comment on lines +42 to +43
| `TextCompletion` | Inference (text completion) |
| `GenerateContent` | Inference (content generation) |
- Add Agent365.Observability.OtelWrite auth scope requirement
- Add agent-ID-must-match-token constraint documentation
- Add output_messages operation type and span section
- Add server.port and gen_ai.output.messages to attribute tables
- Fix max payload: document 1MB server limit (900KB SDK buffer)
- Add payload chunking helper and span truncation guidance
- Fix token resolver signature to str | None, handle None case
- Add links mapping to DIY exporter (was hardcoded None)
- Add _chunk_by_size method to exporter for large batches

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>