docs: add manual A365 span instrumentation guide (without SDK) #255
juliomenendez wants to merge 11 commits into
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Pull request overview
Adds a standalone documentation path for Python teams to emit Agent 365–compatible spans and export them to the A365 ingest endpoint without depending on any microsoft-agents-a365-* packages. This fits into the repo’s observability story by complementing the existing SDK-based “Integrating with existing OpenTelemetry” guide with a manual, SDK-free option.
Changes:
- Introduces a new end-user guide (`docs/manual-a365-span-instrumentation.md`) covering the attribute contract, export protocol, and runnable Python examples (including a DIY exporter).
- Adds internal “superpowers” spec + implementation plan documents capturing the design and rollout steps.
- Adds a cross-link from the existing OpenTelemetry integration guide to the new manual instrumentation guide.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/superpowers/specs/2026-05-19-manual-a365-span-instrumentation-design.md | Design/spec for the manual instrumentation contract and export protocol. |
| docs/superpowers/plans/2026-05-19-manual-a365-span-instrumentation.md | Execution plan for authoring the manual instrumentation guide and related cross-links. |
| docs/manual-a365-span-instrumentation.md | New user-facing manual instrumentation guide with attribute tables, protocol, and Python examples. |
| docs/integrating-with-existing-opentelemetry.md | Adds a callout link to the new manual instrumentation guide for SDK-free scenarios. |
Comments suppressed due to low confidence (4)
docs/manual-a365-span-instrumentation.md:85
- The `inference` span table states `gen_ai.operation.name` may be `TextCompletion` or `GenerateContent`, but those values are currently filtered out by the Agent365 exporter (unsupported inference operation types). Update the table (and/or add an explicit note) so users don’t produce spans that won’t be exported/ingested.
| Tier | Attribute | Expected value | Notes |
|------|-----------|----------------|-------|
| **Required** | `gen_ai.operation.name` | `"Chat"` or `"TextCompletion"` or `"GenerateContent"` | See accepted values above |
| **Required** | `microsoft.tenant.id` | Tenant GUID | Same as parent |
| **Required** | `gen_ai.agent.id` | Agent GUID | Same as parent |
| **Required** | `gen_ai.request.model` | Model name (e.g. `"gpt-4o"`) | |
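A pre-export check along the lines of this comment could look like the sketch below. It is illustrative only: `check_inference_attributes` and both constants are hypothetical names, the required keys are copied from the table rows above, and restricting exportable operations to `"Chat"` is an assumption inferred from the comment, not a confirmed list.

```python
# Required attributes for an inference span, copied from the table above.
REQUIRED_INFERENCE_ATTRS = (
    "gen_ai.operation.name",
    "microsoft.tenant.id",
    "gen_ai.agent.id",
    "gen_ai.request.model",
)

# Assumption: only "Chat" is treated as exportable, because the review comment
# says "TextCompletion" and "GenerateContent" are filtered out by the exporter.
EXPORTABLE_INFERENCE_OPERATIONS = frozenset({"Chat"})


def check_inference_attributes(attrs):
    """Return a list of problems found; an empty list means the span looks OK."""
    problems = []
    for key in REQUIRED_INFERENCE_ATTRS:
        value = attrs.get(key)
        if value is None or value == "":
            problems.append(f"missing required attribute: {key}")
    operation = attrs.get("gen_ai.operation.name")
    if operation and operation not in EXPORTABLE_INFERENCE_OPERATIONS:
        problems.append(f"operation {operation!r} may be filtered out by the exporter")
    return problems
```

Running this in tests or a debug span processor would surface spans that might be silently dropped before they reach the ingest endpoint.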
docs/manual-a365-span-instrumentation.md:403
- The constraints state a 250,000-byte max individual span size, but the SDK’s enforced limit is 250 * 1024 bytes (256,000). Consider documenting this as “250 KiB (256,000 bytes)” or updating the exact byte count to match the implementation so users can size payloads correctly.
| Constraint | Value | Behavior |
|------------|-------|----------|
| Max payload size | ~900,000 bytes | Split spans across multiple POST requests |
| Max individual span | 250,000 bytes | Largest attributes are replaced with `"TRUNCATED"` |
| Retry on | 408, 429, 5xx | Exponential backoff; respect `Retry-After` header for 429 |
| Fail on | Other 4xx | Non-retryable; check auth and payload format |
| Timeout | 30 seconds | Per-request HTTP timeout |
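The retry rows of this table can be expressed as a small decision function. This is a sketch, not the SDK's logic: `is_retryable` and `retry_delay` are hypothetical names, and the `base` and `cap` backoff values are assumptions, since the table only says "exponential backoff".

```python
RETRYABLE_STATUS = {408, 429}  # plus any 5xx, per the table above


def is_retryable(status):
    """True for 408, 429 and 5xx; other 4xx responses are non-retryable."""
    return status in RETRYABLE_STATUS or 500 <= status <= 599


def retry_delay(status, attempt, retry_after=None, base=1.0, cap=30.0):
    """Return seconds to wait before the next attempt, or None to give up.

    `attempt` is zero-based. `base` and `cap` are illustrative defaults.
    """
    if not is_retryable(status):
        return None
    if status == 429 and retry_after is not None:
        try:
            # Retry-After given as a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed here; fall back to backoff.
    return min(cap, base * (2 ** attempt))
```

A caller would sleep for the returned delay and stop once it gets `None` or exhausts its attempt budget. Adding random jitter to the delay is a common refinement that is left out here to keep the behavior deterministic.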
docs/manual-a365-span-instrumentation.md:460
- Example 3’s `Agent365ManualExporter` claims to be a replacement for the SDK exporter, but it doesn’t implement payload chunking (~900KB limit) or per-span truncation (~250KB) that the doc describes (and that the SDK implements). Either add chunking/truncation to the example or clearly label it as non-production and point readers to the required logic; otherwise the sample will routinely fail with larger traces.
    def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult:
        # Partition by (tenant_id, agent_id)
        groups = self._partition(spans)
        if not groups:
            return SpanExportResult.SUCCESS
        any_failure = False
        for (tenant_id, agent_id), group_spans in groups.items():
            url = (
                f"{A365_ENDPOINT}/observability/tenants/{tenant_id}"
                f"/otlp/agents/{agent_id}/traces?api-version=1"
            )
            payload = self._build_payload(group_spans)
            body = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
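The chunking and truncation this comment asks for might be sketched as below. The helper names (`encoded_size`, `truncate_span`, `chunk_by_size`) are hypothetical, and the span shape (a dict with an `"attributes"` mapping) is an assumption about the serialized form. The limits come from the constraints table and the 250 KiB figure in the review comment. Envelope overhead around the span list is not counted, so a real exporter would likely want some headroom.

```python
import json

MAX_PAYLOAD_BYTES = 900_000  # "~900,000 bytes" from the constraints table
MAX_SPAN_BYTES = 250 * 1024  # 256,000 bytes, per the review comment


def encoded_size(obj):
    """Size in bytes of obj when serialized the same way as the request body."""
    text = json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def truncate_span(span, limit=MAX_SPAN_BYTES):
    """Replace the largest attribute values with "TRUNCATED" until the span fits."""
    span = {**span, "attributes": dict(span.get("attributes", {}))}
    attrs = span["attributes"]
    # Visit attributes from largest to smallest encoded value.
    for key in sorted(attrs, key=lambda k: encoded_size(attrs[k]), reverse=True):
        if encoded_size(span) <= limit:
            break
        attrs[key] = "TRUNCATED"
    return span


def chunk_by_size(spans, limit=MAX_PAYLOAD_BYTES):
    """Yield lists of spans whose combined encoded size stays within limit."""
    chunk, size = [], 0
    for span in spans:
        n = encoded_size(span) + 1  # +1 for the separating comma
        if chunk and size + n > limit:
            yield chunk
            chunk, size = [], 0
        chunk.append(span)
        size += n
    if chunk:
        yield chunk
```

In an exporter like the one quoted above, each group's serialized spans would first pass through `truncate_span`, and each chunk from `chunk_by_size` would then be sent as its own POST request.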
docs/manual-a365-span-instrumentation.md:473
- The sample exporter always sends an `Authorization: Bearer {token}` header without validating that the resolver returned a non-empty token. The SDK treats a missing token differently; for manual export, it’s safer to fail the export (or at least log and skip) when the token is empty/None to avoid confusing 401s and hard-to-debug behavior.
            # Resolve auth token
            try:
                token = self._token_resolver(agent_id, tenant_id)
            except Exception as e:
                logger.error(f"Token resolution failed: {e}")
                any_failure = True
                continue
            headers = {
                "content-type": "application/json",
                "authorization": f"Bearer {token}",
            }
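One way to apply the guard this comment suggests is to build the headers in a helper that refuses an empty token. `build_auth_headers` is a hypothetical name used for illustration; how the caller reacts (log and skip, or fail the export) remains a choice for the guide's author.

```python
def build_auth_headers(token):
    """Return request headers, or None when no usable token was resolved."""
    if not token or not token.strip():
        return None
    return {
        "content-type": "application/json",
        "authorization": f"Bearer {token}",
    }
```

Inside the export loop, a `None` result would be handled like a resolver exception: log an error, set `any_failure = True`, and `continue` to the next group rather than sending a request that is bound to return 401.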
| | `TextCompletion` | Inference (text completion) | |
| | `GenerateContent` | Inference (content generation) | |
- Add `Agent365.Observability.OtelWrite` auth scope requirement
- Add agent-ID-must-match-token constraint documentation
- Add `output_messages` operation type and span section
- Add `server.port` and `gen_ai.output.messages` to attribute tables
- Fix max payload: document 1MB server limit (900KB SDK buffer)
- Add payload chunking helper and span truncation guidance
- Fix token resolver signature to `str | None`, handle `None` case
- Add links mapping to DIY exporter (was hardcoded `None`)
- Add `_chunk_by_size` method to exporter for large batches
Adds documentation for teams that want A365 portal compatibility without the SDK dependency.
What this covers
`opentelemetry-sdk` + `requests`:
Audience
Python developers with existing OpenTelemetry instrumentation who want their spans to appear in the Agent 365 portal without importing any `microsoft-agents-a365-*` package.