ci: gate docstring quality and coverage in CI (#616) #3
Open
…ive-computing#563) * feat: add token usage counter metrics Add mellea.llm.tokens.input/output counters following Gen-AI semantic conventions with zero overhead when disabled Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * feat: integrate token metrics into OpenAI, Ollama, WatsonX, and LiteLLM backends Add record_token_usage_metrics() calls to all backend post_processing methods to track input/output tokens. Add get_value() helper in backends/utils.py to handle dict/object attribute extraction. Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * feat: add token metrics to HuggingFace backend Calculate token counts from input_ids and output sequences. Records to both tracing spans and metrics using helper function. Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * test: add token metrics integration tests for all backends - Add integration tests for Ollama, OpenAI, LiteLLM, HuggingFace, WatsonX - Tests revealed metrics were coupled with tracing (architectural issue) - Fixed: Metrics now record independently of tracing spans - WatsonX: Store full response to preserve usage information - HuggingFace: Add zero-overhead guard, optimize test model Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * fix: use module-scoped fixture to prevent tracer provider reinitialization Use MonkeyPatch for cleanup and update Watsonx to granite-4-h-small. 
Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * docs: add token usage metrics documentation and examples - Add Token Usage Metrics section to docs/dev/telemetry.md with metric definitions, backend support table, and configuration examples - Create metrics_example.py demonstrating token tracking with tested console output - Update telemetry_example.py to reference new metrics example - Update examples/telemetry/README.md with metrics quick start guide Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * fix: lazy import is_metrics_enabled in backends Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * test: add streaming token metrics test and document timing Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * refactor: consolidate duplicate get_value function Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * feat: add streaming token usage metrics support Enable token metrics for streaming responses in OpenAI and LiteLLM backends. Parametrize backend tests for streaming/non-streaming coverage. Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * test: update to non-deprecated Granite 4 hybrid models - Replace llama3.2:1b with granite4:micro-h in telemetry tests - Replace deprecated granite-4.0-micro with granite-4.0-h-micro in HF tests - Use model constants instead of hardcoded strings - Remove redundant gh_run checks (rely on pytest markers) Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * style: apply ruff formatting to test signatures Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * test: skip HuggingFace test in CI (requires model download) Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * test: add unit tests and reorganize telemetry tests Add 4 unit tests for record_token_usage_metrics() in test_metrics_token.py. 
Split test_backend_telemetry.py into focused modules: - test_tracing_backend.py: backend tracing integration tests - test_metrics_backend.py: backend token metrics integration tests - test_metrics_token.py: unit tests for record_token_usage_metrics() Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * doc: addressed review comments Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> --------- Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> Co-authored-by: jakelorocco <59755218+jakelorocco@users.noreply.github.com>
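The commit above mentions a `get_value()` helper in `backends/utils.py` that handles dict/object attribute extraction. A minimal sketch of what such a helper could look like follows; the signature, the `default` parameter, and the field name `prompt_tokens` are assumptions for illustration, not the actual Mellea code.

```python
from types import SimpleNamespace
from typing import Any


def get_value(obj: Any, key: str, default: Any = None) -> Any:
    """Read `key` from a dict, or as an attribute from any other object.

    Hypothetical sketch: the real helper in backends/utils.py may differ.
    """
    if isinstance(obj, dict):
        return obj.get(key, default)
    return getattr(obj, key, default)


# Some clients return usage as a dict, others as an object with attributes.
dict_usage = {"prompt_tokens": 12}
object_usage = SimpleNamespace(prompt_tokens=12)

same = get_value(dict_usage, "prompt_tokens") == get_value(object_usage, "prompt_tokens")
```

A single accessor like this lets each backend's post-processing code read token counts without branching on the response type.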
…ting#569) * use larger model and change jinja ref for decompose example * llm as default constraint strategy if none * add comment stating 8b is needed for tags * adds default llm to constraint validation strategies
…#582) * docs: add initial specification for extensibility hooks in Mellea Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: update hook system spec to factor component hooks and address design drifts Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: add clarifications for component hook payload fields and additional suggestions by maintainers Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: add implementation plan Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: update implementation plan Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: minor cleanups to implementation plan Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: update to reflect programmatic and functional-first design Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: specify hook payload write protection Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: add optional dependency for plugin framework Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: implemented hook system and initial set of hook types Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: add plugin examples Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: update examples to use MelleaHookType enum Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: add PluginMode enum Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: drop estimated_tokens from generation_pre_call payload Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: add context manager block support for plugins and plugin sets Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: update hook system specification to document with-block support Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: removed unused imports Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: implement tool hooks 
Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: update example for tool call hooks Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: tune internal log levels for clarity Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: tune internal log levels for clarity Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: updated spec with not implemented payload fields Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * fix: minor implementaiton bugs and tests Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * Update hook_system.md Added implementation priorities to Hook Table. * refactor: use cpex package; update handling of modified_payloads Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: update lock file Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: bump cpex version to 0.1.0.dev2 Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * fix: mode semantics Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: implemented fire_and_forget mode Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: update execution mode map Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: update plugin modes and specs Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: update examples with concurrent hooks Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: update examples Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: refine has_plugins to accept hook type Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: update cpex version Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: cleanup Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: tool call hook types Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: tool hooks example; payload mutation handling Signed-off-by: Frederico Araujo 
<frederico.araujo@ibm.com> * fix: PR review comments Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: renamed dependency group from cpex to hooks Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: lint and formatting fixes Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * fix: mypy and remaining lint issues Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: updated specs to reflect implementation changes Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: backend generate_from_context wrapper Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: generation pre call Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * fix: generation_post_call hook placement Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: added modify result object, unregister function, other cleanups Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * feat: improve handling of generate_post_call Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * fix: previously existing mypy issues (can be cherry picked to fix main branch) Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: improves invoke_hook function; deduplicate context and other objects passed to plugins Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: update hook specs Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: drop backend_kwargs from session_pre_init payload Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: improvements and bug fixes Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: remove unimplemented generation_stream_chunk from hook type enum Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * fix: regression introduced with weakrefs Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: cleanup Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: add 
tutorial-style examples for plugins Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * chore: fix formatting issues in new examples Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: drop weakrefs, unwrap session in payloads, and refactor execution modes Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: clarify payload mutability approach in docstring Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: update hook spec to document payload mutability approach Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: set default to silence plugin errors; added acceptance tests for plugins Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * fix: minor regressions in examples Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: initial user docs for plugins Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: converted a few writable fields into observe-only fields in hook payloads Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * docs: minor updates and nits to the plugin docs Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> * refactor: move pre and post gen hooks * refactor: modify plugin fixture and fix tests * refactor: refactor tests for hook_call sites * refactor: move to local imports for hooks; fix pre-commit issues * fix: add back type error ignore --------- Signed-off-by: Frederico Araujo <frederico.araujo@ibm.com> Co-authored-by: Hendrik Strobelt <HendrikStrobelt@users.noreply.github.com> Co-authored-by: Jake LoRocco <jake.lorocco@ibm.com>
* fix: guarding optional imports for hooks Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com> * fix: issues with mellea when hooks not installed --------- Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com> Co-authored-by: Jake LoRocco <jake.lorocco@ibm.com>
- astream() on computed MOT now raises error - the last astream call does NOT contain the full text again. All astream() chunks concatenated will be the full text. Related tests are added/modified.
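The streaming contract described above can be illustrated with a stand-in class. This is a sketch under assumptions: `SketchThunk`, its `computed` flag, and the use of `RuntimeError` are hypothetical and do not reproduce Mellea's `ModelOutputThunk`. It only shows the two stated rules: each `astream()` call returns new text only, and calling it on a computed thunk raises.

```python
import asyncio


class SketchThunk:
    """Hypothetical stand-in for a streaming output thunk."""

    def __init__(self, chunks):
        self._pending = list(chunks)
        self.value = ""
        # A thunk with nothing left to stream counts as computed.
        self.computed = not self._pending

    async def astream(self) -> str:
        if self.computed:
            # Assumed error type; the source only says "raises error".
            raise RuntimeError("astream() called on a computed thunk")
        chunk = self._pending.pop(0)
        self.value += chunk
        if not self._pending:
            self.computed = True
        # Only the new chunk is returned, never the accumulated text.
        return chunk


async def collect(thunk):
    """Drain the thunk, keeping each chunk separately."""
    parts = []
    while not thunk.computed:
        parts.append(await thunk.astream())
    return parts


def run_demo():
    thunk = SketchThunk(["Hel", "lo", "!"])
    parts = asyncio.run(collect(thunk))
    return thunk, parts
```

Under this contract, callers can join the chunks to rebuild the full text and should check the computed state before calling `astream()` again.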
* fix: hf metrics tests run out of memory Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * test: add requires_heavy_ram marker to HuggingFace backend tests Add @pytest.mark.requires_heavy_ram to tests that instantiate LocalHFBackend to address memory leak issues when running these tests in pytest. This ensures tests are skipped on systems without sufficient RAM. Changes: - test/telemetry/test_metrics_backend.py: Added marker to test_huggingface_token_metrics_integration - test/stdlib/test_spans.py: Added marker to module-level pytestmark Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> --------- Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
…erative-computing#605) - Add --isolate-heavy CLI flag for explicit GPU isolation - Add @pytest.mark.requires_gpu_isolation marker - Rewrite pytest_collection_finish with 4-guard architecture - Fix test discovery (pytest --collect-only now works instantly) - Apply markers to all 4 heavy GPU test files - Fix failure propagation from subprocesses - Update documentation for new markers and flags Fixes generative-computing#604
…eus) (generative-computing#610) * feat: add configurable OTLP metrics exporter - Add MELLEA_METRICS_OTLP env var for explicit enablement - Support metrics-specific endpoint via OTEL_EXPORTER_OTLP_METRICS_ENDPOINT - Add configurable export interval via MELLEA_METRICS_EXPORT_INTERVAL - Add error handling and validation with helpful warnings Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * test: add unit tests for OTLP exporter enhancements Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * feat(telemetry): add Prometheus metrics exporter with tests Add Prometheus exporter support with HTTP endpoint for metrics scraping. Includes comprehensive unit tests and uses standard OpenTelemetry env vars. Also updates previous OTLP implementation to use standard OTEL_METRIC_EXPORT_INTERVAL (milliseconds) instead of custom MELLEA_METRICS_EXPORT_INTERVAL (seconds). Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * docs: add comprehensive metrics exporter documentation - Updated docs/dev/telemetry.md with detailed configuration for Console, OTLP, and Prometheus exporters - Added troubleshooting section for common metrics issues - Enhanced mellea/telemetry/metrics.py module docstring with exporter examples - Updated docs/examples/telemetry/metrics_example.py with configuration examples for all exporters Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * fix: addressed review Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> * fix(telemetry): remove HTTP server from library, use registry-based Prometheus The library should not start an HTTP server for Prometheus scraping. Instead, register metrics with the prometheus_client default registry via PrometheusMetricReader and let the application expose the endpoint. 
- Replace OTEL_EXPORTER_PROMETHEUS_PORT/HOST with MELLEA_METRICS_PROMETHEUS - Remove start_http_server() call from metrics module - Update example to show application-side server startup and keep-alive - Update docs with registry-based approach and framework examples - Add missing env vars to telemetry __init__.py docstring - Clean up tests: remove port/server mocking, unused imports * fix: update docs/examples/telemetry/metrics_example.py Co-authored-by: Paul Schweigert <paul@paulschweigert.com> * fix: github ui commits skip pre-format Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> --------- Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com> Co-authored-by: Paul Schweigert <paul@paulschweigert.com>
…ng#609) * Fix flaky test Signed-off-by: Fred Reiss <frreiss@us.ibm.com> * Add test cases for intrinsics formatters Signed-off-by: Fred Reiss <frreiss@us.ibm.com> * First step of manual merge Signed-off-by: Fred Reiss <frreiss@us.ibm.com> * xfail tests to work around known CI issues Signed-off-by: Fred Reiss <frreiss@us.ibm.com> * Make failing test produce useful output on CI Signed-off-by: Fred Reiss <frreiss@us.ibm.com> * More CI debugging Signed-off-by: Fred Reiss <frreiss@us.ibm.com> --------- Signed-off-by: Fred Reiss <frreiss@us.ibm.com>
* docs: API docs pipeline improvements
- Fix MDX parse errors: wrap bare doctest blocks, escape {/} outside fences
- Fix GitHub source links to use versioned tag format
- Richer landing page with prose bullet list and module descriptions
- Add --source-dir support to build.py and audit_coverage.py
- Add --quality docstring audit (8 issue categories, 322/322 pass)
- Add --orphans nav audit for MDX files absent from docs.json
- Add --fail-on-quality flag for CI/pre-commit hard gate
- Emit GitHub Actions annotations when run in CI (GITHUB_ACTIONS=true)
- Fix *args/**kwargs forwarder exemption using Griffe ParameterKind
- Remove generated API docs from version control; add to .gitignore
- Add poethepoet dev tasks (apidocs, apidocs-quality, apidocs-preview, etc)
- Disable CI workflow pending branch strategy decision (see PR generative-computing#611)
- Remove redundant files (requirements.txt, stray diagnostic output)
Fixes generative-computing#532
* chore: remove 'Made with Bob' attribution comments from docs-autogen tooling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
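The `--fail-on-quality` gate and the GitHub Actions annotations listed in the commit above could be combined roughly as follows. This is a sketch, not the audit script itself: the function names and the issue tuple layout are hypothetical. The `::warning file=...,line=...::message` form is the GitHub Actions workflow-command syntax; escaping of special characters in the message is omitted here.

```python
import os


def format_issue(path: str, line: int, message: str, env=None) -> str:
    """Render one docstring issue, as an annotation when running in CI."""
    env = os.environ if env is None else env
    if env.get("GITHUB_ACTIONS") == "true":
        # Workflow command picked up by the Actions runner as an annotation.
        return f"::warning file={path},line={line}::{message}"
    return f"{path}:{line}: {message}"


def exit_code(issues, fail_on_quality: bool) -> int:
    """Return a non-zero status only when the hard gate is enabled."""
    return 1 if (issues and fail_on_quality) else 0


# Hypothetical issue record: (path, line, message).
issues = [("mellea/example.py", 10, "missing Returns section")]
ci_line = format_issue(*issues[0], env={"GITHUB_ACTIONS": "true"})
local_line = format_issue(*issues[0], env={})
```

Keeping the exit code separate from the reporting means the same audit can run as an advisory check locally and as a hard gate in CI or pre-commit.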
…n repair messages (generative-computing#633) * fix: update multiturnstrategy repair message append validation failure reasons to the repair message for MultiTurnStrategy Fixes generative-computing#631 Signed-off-by: va <va@us.ibm.com> * docs: update docstring to be correct --------- Signed-off-by: va <va@us.ibm.com> Co-authored-by: jakelorocco <59755218+jakelorocco@users.noreply.github.com>
…-computing#619) Adds return type and parameter annotations to ~50 functions across the 14 high-priority public API files identified in issue generative-computing#615. Also fixes two latent type bugs uncovered during annotation: - stdlib/requirements/md.py: _md_list/_md_table were returning bare bool where Requirement.validate() expected ValidationResult; wrap in ValidationResult(result=...) - stdlib/tools/interpreter.py: add narrowing asserts for stdout/stderr to confirm the -> str return type that the top-level assert guarantees Closes generative-computing#615
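The first type bug described above (a validator returning a bare `bool` where a `ValidationResult` was expected) can be sketched with a stand-in class. The `ValidationResult` dataclass below is a hypothetical substitute for Mellea's class, and the list check is a simplified illustration rather than the logic in `stdlib/requirements/md.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationResult:
    """Stand-in for the real class; only `result` and `reason` are modeled."""

    result: bool
    reason: Optional[str] = None


def looks_like_md_list(text: str) -> bool:
    """Simplified check: at least one line starts with a list marker."""
    return any(
        line.lstrip().startswith(("- ", "* "))
        for line in text.splitlines()
    )


def md_list_before(text: str) -> bool:
    # Before the fix: a bare bool leaks out of the validator.
    return looks_like_md_list(text)


def md_list_after(text: str) -> ValidationResult:
    # After the fix: the bool is wrapped, matching the annotated return type.
    return ValidationResult(result=looks_like_md_list(text))
```

Adding the `-> ValidationResult` annotation is what lets a type checker flag the bare-bool version, which is how annotation work can surface latent bugs of this kind.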
) (generative-computing#601) * docs: Phase 0 infrastructure + getting-started.md - CONTRIBUTING.md: writing conventions, PR checklist, code block runability rule, Backend note callout type - .markdownlint.json: fix MD025 front_matter_title so body H1 is allowed alongside YAML frontmatter title - getting-started.md: full tutorial page — install, hello world, user variables, requirements, core concepts, troubleshooting - glossary.md: skeleton in place * docs: Phase 1.2 — the-instruction-model.md Full how-to page covering instruct(), user variables, requirements, custom validation functions (req/check/simple_validate), sampling strategies + IVR loop, grounding context, images, ChatContext, and chat() vs instruct() comparison. Imports verified against source. One inline review note on icl_examples API pending verification. * docs: Phase 1.3 — backends-and-configuration.md Covers Ollama (default), OpenAI-compatible, LiteLLM, HuggingFace, and WatsonX backends. ModelOption constants table, system prompt pattern, direct backend construction. Backend note callouts on each provider. Imports verified against source. * docs: Phase 2.1 — generative-functions.md Covers @Generative decorator, Literal type constraints, Pydantic structured output, pre/post-conditions (PreconditionException), composing generative pipelines, and chain-of-thought pattern. Imports verified against source. * docs: Phase 2.2 — tools-and-agents.md Covers @tool decorator, MelleaTool.from_callable/from_langchain/ from_smolagents, ModelOption.TOOLS, uses_tool, tool_arg_validator, react() agentic loop with structured output, code_interpreter. Incorporates agent definition and ReACT context from old agents.mdx. Imports verified against source (react is async). * docs: Phase 2.3 — working-with-data.md Covers grounding context, RAG with FAISS + generative filtering, @mify / MObject pattern (query/transform, ad-hoc mify, custom stringify, funcs_include), and RichDocument with PDF parsing and table extraction. 
Incorporates content from mobjects.mdx and generative-slots.mdx. Imports verified against CI examples. * docs: Phase 2.4 — intrinsics.md Covers all RAG intrinsic operations: answerability, context relevance, hallucination detection, answer relevance rewriting, query rewriting, citations, and direct Intrinsic/GraniteCommonAdapter usage. Backend note callout on HF requirement. Imports verified against source. Note: adapters.mdx content (tool calling) already covered in tools-and-agents.md. * docs: Phase 2.5 — sampling-strategies.md Covers RejectionSamplingStrategy (with SamplingResult inspection), validation feedback via ValidationResult.reason, SOFAISamplingStrategy dual-model escalation with s2_solver_mode table, BudgetForcingSamplingStrategy, and MajorityVotingStrategyForMath. Review notes on budget forcing and majority voting exports/parameters. * docs: Phase 2.6 — async-and-streaming.md Covers async/sync method table, parallel generation with ModelOutputThunk, wait_for_all_mots, streaming with ModelOption.STREAM + astream(), and context warnings for concurrent ChatContext use. Imports verified. * docs: Phase 2.7 — act-and-aact.md Covers three abstraction levels (instruct/act/mfuncs), working with Message and Document, validation + sampling strategies via act(), structured output with format=, functional API (mfuncs.act/aact), and aact() async usage. Fixed stale numeric cross-references. * docs: Phase 4.1 — safety-and-validation.md Covers GuardianCheck + GuardianRisk (full enum table), custom criteria, groundedness detection, use as instruct() requirement, and input gate pattern. Backend note on Guardian model independence. Verified against CI example docs/examples/safety/guardian.py. * docs: Phase 4.2 — mcp-integration.md Covers FastMCP server creation, @mcp.tool decorator, mcp dev UI, ModelOption in tools, multiple tools in one server. Imports verified against mcp_example.ipynb CI notebook. 
* docs: Phase 4.3 — telemetry.md Covers two independent OTEL trace scopes (application + backend), all configuration env vars, start_session() as context manager for trace lifecycle, console debugging, Jaeger/OTLP export, programmatic status checks, and metrics API (create_counter/create_histogram). Verified against mellea/telemetry/__init__.py and telemetry_example.py. Includes Gen-AI semantic convention attribute tables. * docs: Phase 4.4 — custom-sessions.md Covers SimpleContext vs ChatContext, ctx introspection helpers (last_output/last_turn), session.clone() for context branching, session.reset(), and extending MelleaSession with a ChatCheckingSession example. Absorbs content from core-concept/context-management.mdx. Verified against session.py and creating_a_new_type_of_session.py. * docs: Phase 5.1 — generative-programming.md Conceptual page explaining what generative programs are, the deterministic/stochastic interleaving challenge, requirements as the core reliability mechanism, failure handling, uncertainty compounding, context management, and Mellea's position as execution layer (not orchestrator). Absorbs content from overview/generative-programming.mdx and overview/mellea-welcome.mdx. * docs: Phase 5.2 — mellea-core-internals.md Covers the three core data structures (CBlock, Component, ModelOutputThunk), six abstraction layers from MelleaSession down to direct backend.generate_from_context() with lazy thunks, composition via SimpleComponent, and template/prompt engineering (TemplateFormatter, TemplateRepresentation, Jinja2 template resolution, model-specific paths). Verified imports against session_deepdive step files. Absorbs content from prompt-engineering.mdx. 
* docs: Phase 5.3 — troubleshooting.md Covers installation errors (outlines/Rust, Intel Mac, missing extras), Ollama connectivity, requirements/sampling diagnosis with return_sampling_results=True, PreconditionException, react() loop exhaustion, tool selection debugging, async/event-loop errors, Jupyter nest_asyncio, and Guardian setup issues. * docs: Phase B — restructure guide/ into target hierarchy Reorganises the 18 flat guide/ pages written in Phase A into the target Diataxis-aligned directory structure: - getting-started/ installation.md + quickstart.md (split from getting-started.md) - concepts/ generative-programming.md, instruct-validate-repair.md - how-to/ use-async-and-streaming.md, use-context-and-sessions.md - integrations/ mcp-and-m-serve.md - evaluation-and-observability/ metrics-and-telemetry.md - advanced/ intrinsics.md, inference-time-scaling.md, security-and-taint-tracking.md, mellea-core-internals.md - troubleshooting/ common-errors.md - guide/ generative-functions, tools-and-agents, working-with-data, backends-and-configuration, act-and-aact, glossary (in place) Updates: - docs.json: replaces old MDX nav with new hierarchy (9 groups) - All cross-links updated to new relative paths - Nav footers updated to match new linear order - Navbar "Contribution Guide" link updated to /guide/CONTRIBUTING Old MDX pages (overview/, core-concept/) removed from nav; files kept on disk until Phase C content is verified complete. * docs: Phase C.1 — concepts/requirements-system.md Adds depth page on the Requirements system: Requirement class, ValidationResult, simple_validate(), req()/check(), check_only/purple-elephant effect, precondition_requirements + PreconditionException, SamplingResult inspection, and LLM-as-judge vs custom validator trade-offs. Updates instruct-validate-repair.md footer and docs.json nav. * docs: Phase C.2 — concepts/architecture-vs-agents.md Adds positioning page explaining Mellea as execution layer vs. 
orchestration frameworks (LangChain, smolagents). Covers the three adoption paths (greenfield, leaf-node injection, tool enrichment) with concrete code examples showing how Mellea functions compose inside smolagents and LangChain. Updates requirements-system.md footer and docs.json nav. * docs: Phase C.3 — how-to/enforce-structured-output.md Adds task-oriented guide for structured output covering @Generative with Literal/Pydantic return types, instruct(format=...) for dynamic prompts, content validation on structured output (at_least_n pattern), and guidance on choosing between the two approaches. Updates docs.json nav. * docs: Phase C.4 — how-to/write-custom-verifiers.md Adds practical guide for writing custom validation functions: full validation_fn signature, simple_validate shortcut, common patterns (JSON, Pydantic schema, regex, external API), ValidationResult.score, composing verifiers, and debugging with SamplingResult.sample_validations. Updates docs.json nav. * docs: Phase C.5 — integrations/ollama.md Adds Ollama integration page covering installation, default setup (granite4:micro), recommended models table, custom host configuration, ModelOption usage, vision models, OpenAI-compatible endpoint, and troubleshooting section. Updates docs.json nav. * docs: Phase C.6 — integrations/openai.md Adds OpenAI integration page covering OpenAI API setup, OpenAI-compatible local servers (LM Studio, Ollama endpoint, vLLM), vision/multimodal input, structured output with format=, ModelOption usage, and troubleshooting. Updates docs.json nav. * docs: Phase C.7 — tutorials/01-your-first-generative-program.md Adds the first tutorial: an 8-step walkthrough building a document analysis pipeline from a single instruct() call through requirements, rejection sampling, @Generative with Literal and Pydantic, and composition. Uses a consistent customer feedback example throughout. Adds Tutorials group to docs.json nav. 
* docs: Phase C.8 — concepts/context-and-sessions.md Adds architecture explanation page covering the Component/Backend/Context/ Session four-layer architecture, SimpleContext vs ChatContext trade-offs, context window management, session cloning, context inspection, and why explicit context management matters. Updates docs.json nav. * docs: Phase C.9 — evaluation-and-observability/handling-exceptions.md Adds error handling page covering SamplingResult.success=False patterns, PreconditionException inspection, ComponentParseError, backend connection errors, fallback patterns (simpler call, stronger model / SOFAI), and logging failures. Updates docs.json nav. * docs: Phase C.10 — integrations/bedrock-and-watsonx.md Adds cloud backends page: AWS Bedrock via create_bedrock_mantle_backend and LiteLLM, IBM WatsonX with WatsonxAIBackend. Covers credentials, region selection, available models, direct and environment-variable auth, and troubleshooting for both providers. Updates docs.json nav. * docs: Phase C-review fixes — nav footers, code corrections, linting Fix 9 nav footer mismatches caused by incremental page insertions not updating adjacent pages: quickstart, generative-programming, architecture-vs-agents, use-context-and-sessions, write-custom-verifiers, ollama, openai, metrics-and-telemetry, tutorials/01. 
Code fixes: - requirements-system.md: add missing RejectionSamplingStrategy import in precondition example - bedrock-and-watsonx.md: str(result) for consistency - instruct-validate-repair.md: correct diataxis to explanation - tutorials/01: fix stale Full example pointer, remove broken Next link to unwritten page 02 - use-context-and-sessions.md: add sidebarTitle to disambiguate from concepts page; fix over-heavy prerequisite Linting: - Add .markdownlint.json at docs/docs/ level so config covers all subdirectories (concepts/, how-to/, integrations/, etc.), not just guide/ * docs: fix README and add reader-facing index.md README.md had a broken fenced code block (mismatched backticks), a duplicate Getting Started section, emoji, and a wrong URL pointing to mellea.ai instead of docs.mellea.ai. Rewritten as a clean contributor setup guide. index.md is a new reader-facing landing page for GitHub and non-Mintlify browsing. Mintlify ignores it (root redirects to getting-started/installation via docs.json) but GitHub renders it as the directory index. 
* docs: expand index.md to show full section structure * docs: port 4 missing pages from Hendrik's MDX — generative-functions, mobjects-and-mify, configure-model-options, template-formatting * docs: fix convention violations in 4 new pages (US English, missing import, table spacing) * docs: update index.md with 4 new pages * docs: add Core Reference to index.md; cross-link tools-and-agents from generative-functions * docs: add advanced/lora-and-alora-adapters.md — train and use custom adapters * docs: fix import errors, deprecated model IDs, nav link, and add Mintlify redirects - configure-model-options.md: fix ModelOption import path (backends.types → backends); replace deprecated IBM_GRANITE_3_2_8B/IBM_GRANITE_4_MICRO_3B with current models - mobjects-and-mify.md: fix mify/MifiedProtocol import path (stdlib.mify → stdlib.components); fix ModelOption import path - docs.json: fix CONTRIBUTING navbar href to GitHub URL (was unreachable /guide/CONTRIBUTING); add feedback.thumbsRating; add redirects for all removed MDX pages to new paths - CONTRIBUTING.md: add docs writing guide link in Additional Resources * docs: fix docs badge URL in README (mellea.ai → docs.mellea.ai) * docs: add m serve section, fix landing page, add GitHub nav index - mcp-and-m-serve.md: retitle to "MCP and m serve"; add m serve section (serve() signature, starting the server, calling the endpoint); fix deprecated model IDs; fix nav footer (Previous was wrong page); fix MD028/MD024 lint warnings - index.mdx: new Mintlify landing page with CardGroup layout covering core concepts, integrations, and quick-start paths; replaces the plain list that was being served at / - docs/index.md: move GitHub-only nav index out of Mintlify root (to docs/ parent) so it no longer overrides the landing page * docs: revise landing page — closer to original style with updated content * docs: align landing page with mellea.ai messaging and voice * docs: fix landing page links — remove non-existent HuggingFace page, 
add How-To section * docs: remove oversized logo from landing page — navbar logo is sufficient * docs: split MCP page, add HuggingFace/vLLM integration, update landing page Split integrations/mcp-and-m-serve.md into two focused pages: - integrations/mcp.md — FastMCP tool wrapping for MCP clients - integrations/m-serve.md — OpenAI-compatible serving with m serve Add integrations/huggingface-and-vllm.md covering LocalHFBackend (experimental features: aLoRA, constrained decoding; cuda/mps/cpu auto) and LocalVLLMBackend (high-throughput batched inference; Linux only). Update index.mdx: add HuggingFace/vLLM card to Backends section, fix MCP card link, add subtle Mellea logo. Update docs.json: nav uses new page slugs, redirect /integrations/mcp-and-m-serve → /integrations/mcp. * docs: remove redundant logo from landing page body (navbar logo sufficient) * docs: fix logo CSS classes — dark/light were inverted * docs: remove page-body logo (wordmark-only SVG; navbar already shows it) * docs: add Mellea mushroom mascot to landing page * docs: fix and expand glossary — correct 5 wrong definitions, add 7 missing terms * docs: add m decompose guide page; expand glossary with 5 missing terms * docs: add glossary links on first use; strengthen CONTRIBUTING standard - Link Mellea-specific terms to glossary on first use across 8 pages: quickstart, tutorial/01, concepts/generative-programming, concepts/generative-functions, concepts/instruct-validate-repair, concepts/requirements-system, concepts/context-and-sessions - Add external links for Jinja2 and Pydantic on first use - Expand Requirement glossary entry to document req(), check(), and simple_validate() including the prompt-inclusion distinction - Fix metrics-and-telemetry.md Previous footer (was mcp-and-m-serve, now m-serve) - CONTRIBUTING.md: formalise glossary link rule with required-terms table and add checklist item for glossary links * docs: add integrations/langchain-and-smolagents.md Covers two integration patterns: - 
MelleaTool.from_langchain() — wrap any LangChain BaseTool for use in Mellea - MelleaTool.from_smolagents() — wrap smolagents tools (pip install 'mellea[smolagents]') - Seeding ChatContext from LangChain message history via convert_to_openai_messages Add to docs.json nav after m-serve; update m-serve and metrics-and-telemetry nav footers to reflect new page position. * docs: split bedrock-and-watsonx into separate bedrock.md and watsonx.md AWS Bedrock and IBM WatsonX are distinct platforms with different auth, packages, and model IDs. Each now has its own page. Nav chain: openai → bedrock → watsonx → huggingface-and-vllm Redirect: /integrations/bedrock-and-watsonx → /integrations/bedrock * docs: add how-to/use-images-and-vision.md; fix nav footer chain Covers PIL image input via instruct()/chat(), ImageBlock for OpenAI backend, multi-turn vision with ChatContext, and backend support matrix. Sources verified against vision_ollama_chat.py and vision_openai_examples.py examples. Also fix pre-existing nav bug: ollama.md Previous was pointing to write-custom-verifiers, skipping configure-model-options entirely. 
Nav chain: configure-model-options → use-images-and-vision → ollama * docs: fix landing page card, add ImageBlock to glossary, improve backend pages - index.mdx: split single "Bedrock / watsonx" card into separate AWS Bedrock and IBM WatsonX cards pointing to the correct split pages - glossary.md: add ImageBlock entry (used by use-images-and-vision.md) - bedrock.md: add glossary links for Backend/MelleaSession on first prose use; add Vision support section noting image input works via OpenAI-compatible path - watsonx.md: add glossary links for start_session/Backend on first prose use; add Vision support section noting WatsonxAIBackend does not support images * docs: split huggingface-and-vllm into separate huggingface.md and vllm.md - Create integrations/huggingface.md (LocalHFBackend, device selection, KV cache, aLoRA, vision, troubleshooting) - Create integrations/vllm.md (LocalVLLMBackend, batched inference, vision, troubleshooting) - Delete integrations/huggingface-and-vllm.md - docs.json: replace combined entry with huggingface + vllm; add redirect for old URL - index.mdx: split single card into separate HuggingFace and vLLM cards - Update nav footers: watsonx.md Next, mcp.md Previous * docs: split langchain-and-smolagents into separate langchain.md and smolagents.md - Create integrations/langchain.md (tool bridging, message history bridge, comparison table) - Create integrations/smolagents.md (tool bridging, comparison table) - Delete integrations/langchain-and-smolagents.md - docs.json: replace combined entry with langchain + smolagents; add redirect for old URL - Update nav footers: m-serve.md Next, metrics-and-telemetry.md Previous * docs: reorganise nav — rename Core Reference to Guides, co-locate m-serve, fix section assignments - Rename "Core Reference" → "Guides" (all 6 pages were diataxis how-to, not reference) - Move m-serve from Integrations → Guides alongside m-decompose (both first-party CLI tools) - Move handling-exceptions from Evaluation and 
Observability → How-To (it's a coding how-to, not observability) - Reorder Integrations: local (ollama, huggingface, vllm) → cloud (openai, bedrock, watsonx) → protocol/frameworks (mcp, langchain, smolagents) All 102 nav pages verified to exist on disk. * docs: float mascot logo left so intro paragraph wraps alongside it * docs: remove redundant Previous/Next footer nav (Mintlify handles this) * docs: remove Discord link from landing page * docs: expand ModelOutputThunk glossary entry with value, async, and streaming details * docs: remove .md extensions from internal links so Mintlify renders pages correctly * chore: trigger Mintlify rebuild * fix: use jsx styles on index.mdx Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com> * docs: remove duplicate H1 headings — Mintlify renders frontmatter title automatically * docs: add 10 new glossary entries and first-use cross-links * docs: add prefix-caching-and-kv-blocks page, KV smashing + SimpleLRUCache glossary entries * docs: add tutorials 02-03, LLM-as-a-judge how-to, and new glossary entries Add three new content pages: - tutorials/02-mifying-legacy-code: five-step tutorial on @mify — query and transform existing Python objects with m.query() and m.transform(), stringify_func, fields_include, funcs_include, and ad-hoc mify(obj) - tutorials/03-using-generative-slots: five-step tutorial on @Generative — Literal/Pydantic returns, pipeline composition, ChatContext injection, m.reset(), and pre/postcondition validation patterns - evaluation-and-observability/evaluate-with-llm-as-a-judge: how-to covering default LLMaJ behavior, standalone m.validate(), GenerateLog capture, purple elephant effect with check(), simple_validate bypass, combined checks, and SamplingResult metadata Also: - Add all three pages to docs.json nav - Add GenerateLog, LLM-as-a-judge, and Purple elephant effect to glossary - Add first-use glossary cross-links and full example pointers in each page * docs: add 14 new pages, fix nav, update 
AGENTS.md writing guide New pages: - tutorials/04-making-agents-reliable (ReACT, requirements, GuardianCheck) - how-to/refactor-prompts-with-cli (m decompose workflow) - how-to/unit-test-generative-code (pytest markers, TestBasedEval) - integrations/vertex-ai (LiteLLMBackend, vertex_ai/ model strings) - advanced/custom-components (Component protocol, TemplateRepresentation) - evaluation-and-observability/opentelemetry-tracing (spans, OTLP, Jaeger) - examples/index + 4 example pages (data-extraction, legacy-code, rag, telemetry) - community/contributing-guide, building-extensions, code-of-conduct - troubleshooting/faq (10 Q&A) Fixes: - tutorials/01: broken Next steps links; model-config review note added - docs.json: handling-exceptions moved to Eval & Observability (was How-To) - docs.json nav: all new pages registered - glossary: ComponentParseError, GuardianRisk, GuardianCheck expanded - AGENTS.md: Section 10 "Writing Docs" added with key conventions * docs: fix lint, complete review items, add missing strategy docs - Fix MD012 multiple-blank-lines in 20 files (trailing double blank lines) - Fix MD028 blank-line-inside-blockquote in smolagents.md - vertex-ai.md: replace "Hendrik please confirm" review note with verified LiteLLM docs — vertex_project/vertex_location keys are correct and override env vars at call time - inference-time-scaling.md: remove two "review needed" notes on BudgetForcingSamplingStrategy and MajorityVotingStrategyForMath; add source-verified parameter docs for both - inference-time-scaling.md: add sections for RepairTemplateStrategy, MultiTurnStrategy, and BaseSamplingStrategy (all in __all__ but previously undocumented) * docs: add missing glossary entries for new sampling strategies and PythonExecutionReq - Sampling strategy table: add RepairTemplateStrategy, MultiTurnStrategy, MBRDRougeLStrategy, BaseSamplingStrategy, correct MajorityVoting name to MajorityVotingStrategyForMath - Requirement entry: document PythonExecutionReq (code 
execution validator) with import path and key parameters * chore: delete legacy MDX files — replaced by new docs structure * ci: add markdownlint docs-lint job to CI * ci: add markdownlint pre-commit hook for docs/ * docs: refresh landing page cards; add index.mdx reminder to CONTRIBUTING checklist - Key patterns: swap MCP card for Tools and agents (@tool, MelleaTool, react()) - How-to guides: swap Handling exceptions for Use images and vision - Backends: add LiteLLM / Vertex AI card - CONTRIBUTING.md checklist: add item to review landing page cards when adding a major page * docs: separate contributor vs user content; fix internal references unit-test-generative-code.md: - Add single top-of-page callout directing Mellea contributors to contributing-guide#testing; remove all other contributor callouts - Rewrite session fixture using plain OllamaModelBackend (no gh_run) - Rewrite module markers section as generic user guidance with pyproject.toml snippet - Rewrite CI strategy section with a user-owned conftest.py pattern (CI=true) instead of Mellea's internal CICD=1 convention traced-generation-loop.md: - Replace dead internal reference docs/dev/telemetry.md (deleted file) with link to user-facing OpenTelemetry Tracing page mellea-core-internals.md: - mfuncs async row: "Mellea contributors" → "Advanced users building async pipelines" template-formatting.md: - "contributors and advanced users" → "advanced users and library authors" * docs: improve landing page key patterns and backends grid - Key patterns: remove 'Generative slots' (concept already in How it works section) replace 'Intrinsics and adapters' (too advanced/niche) with: - Async and streaming (use-async-and-streaming) - Safety checks (GuardianCheck via tutorial 04) - Backends: add LangChain as 8th card — makes even 4+4 grid * docs: add streaming/async tutorial; promote to T02, demote mify to T05 Add 02-streaming-and-async.md covering ainstruct(), streaming with ModelOption.STREAM/astream(), concurrent 
batch processing with wait_for_all_mots, mixed parallel/sequential pipelines, and context behaviour with async. Rename 02-mifying-legacy-code.md → 05-mifying-legacy-code.md so the main onboarding path (01 → 02 → 03 → 04) builds from universal async patterns to agents before introducing the Mellea-specific @mify feature. Update Tutorial 04 prerequisites to include Tutorial 02, since Step 7 introduces asyncio and react(). Update docs.json nav. * docs: add RAG how-to; expand examples index to all categories Add how-to/build-a-rag-pipeline.md covering the full RAG pattern: embedding and indexing, vector search, @Generative bool relevance filter, grounding_context for grounded generation, IVR requirements on the answer, and optional GuardianCheck groundedness verification. Includes a tuning table and a complete worked example. Expand examples/index.md from 4 documented examples to a comprehensive catalogue of all example categories, grouped by area (core concepts, data, agents, safety, integrations, performance, multimodal, observability, experimental). Preserves the existing 4 walkthrough pages at the top. Register build-a-rag-pipeline in docs.json How-To nav group. * docs: add cross-linking guideline for paired explanation/how-to pages When a feature has both a concepts/ explanation and a guide/ or how-to/ page, contributors should add a brief cross-link near the top of each so readers who land on either page can find the other. Adds the guideline under Diataxis classification and a PR checklist item. * docs: merge Guides into How-To; rename Advanced to Deep Dives Removes the Guides nav section — all 6 pages were how-to content and are now merged into a single How-To section (15 pages total). Core feature how-tos lead; task-specific how-tos follow. Moves integrations/m-serve from Guides to the Integrations section where its path already placed it logically. 
Renames Advanced to Deep Dives to signal optional/technical depth rather than implying a content type distinct from How-To. * docs: revert Advanced rename — keep as Advanced pending discussion * docs: add cross-links between paired explanation and how-to pages Add concept overview / practical usage callouts to the two clear paired page sets: - concepts/generative-functions ↔ guide/generative-functions - concepts/context-and-sessions ↔ how-to/use-context-and-sessions * docs: nav reorder, glossary additions, and CONTRIBUTING fixes Move Examples section to position 5 (after How-To, before Integrations) so runnable code follows concept and how-to content in the learning path. Glossary: add grounding_context and wait_for_all_mots entries. Fix CONTRIBUTING guide violations in new pages: - tutorials/02: add --- footer separator; link ModelOutputThunk, start_session()/SimpleContext/ChatContext on first use - how-to/build-a-rag-pipeline: link @Generative, GuardianCheck, grounding_context on first use; change Note to Backend note for the Ollama-specific GuardianCheck requirement - examples/resilient-rag-fallback: link @Generative and grounding_context on first use; add missing navigation footer * docs: correct Navigation footer guideline — Mintlify generates prev/next automatically * docs: fix footers, add cross-links, and standardise imports - Fix how-to/build-a-rag-pipeline: ## See also H2 → **See also:** bold footer - Add RAG how-to card to index.mdx (How-To section, 7 of 8 cards) - Add paired explanation/how-to cross-links: - concepts/requirements-system ↔ how-to/write-custom-verifiers - concepts/mobjects-and-mify ↔ tutorials/05-mifying-legacy-code - Add **See also:** footers to 12 pages missing them: guide/act-and-aact, guide/backends-and-configuration, guide/generative-functions, guide/m-decompose, guide/tools-and-agents, guide/working-with-data, how-to/use-async-and-streaming, tutorials/01, tutorials/03 (add --- separator), tutorials/04, 
examples/data-extraction-pipeline, examples/legacy-code-integration - Convert ## Next steps / ## What to try next H2 headings to **See also:** inline format (tutorials/01, tutorials/04) or bold text (examples) - Standardise import style in build-a-rag-pipeline to match example: import mellea → from mellea import generative, start_session * docs: fix three code correctness issues found in review 1. tutorials/03-using-generative-slots: add missing from typing import Literal to step 2 code block — FeedbackAnalysis uses Literal but the import was absent, causing a NameError if the block was run standalone. 2. tutorials/02-streaming-and-async: remove dead code from step 4 — FeedbackIssues class and extract_issues @Generative function were defined but never called; the pipeline used m.ainstruct() directly for extraction instead. 3. examples/resilient-rag-fallback: fix create_index() using global docs instead of parameter ds in the documentation page. Code worked by coincidence (always called with docs) but would silently ignore any other dataset passed in. Also removes spurious double --- separator from how-to/build-a-rag-pipeline footer. Note: the same bug exists in docs/examples/rag/simple_rag_with_filter.py (source file). That fix is tracked separately — committing the Python file here would trigger the mypy hook which currently fails on pre-existing optional-dependency import-not-found errors in mellea/backends/ that are unrelated to this change. 
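The `create_index()` fix in item 3 above comes down to a function reading a module-level global instead of its own parameter. A minimal sketch of that bug pattern follows; `encode` here is a hypothetical stand-in (the real example calls an embedding model, which is not reproduced), and the `_buggy`/`_fixed` suffixes are for illustration only.

```python
docs = ["alpha", "beta"]  # module-level data, as in the example script


def encode(items: list[str]) -> list[int]:
    """Hypothetical stand-in for an embedding call: maps each string to its length."""
    return [len(item) for item in items]


def create_index_buggy(ds: list[str]) -> list[int]:
    # Bug: ignores the parameter and silently indexes the global instead.
    return encode(docs)


def create_index_fixed(ds: list[str]) -> list[int]:
    # Fix: index whatever dataset the caller passed in.
    return encode(ds)
```

Calling both with `docs` gives identical results, which is why the bug went unnoticed; they only diverge when a different dataset is passed.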
* docs: fix broken links, shell quoting, and add validation tooling Address reviewer-reported issues from PR generative-computing#601 review: - Convert 22 relative ../../examples/ links to absolute GitHub URLs (Mintlify only serves docs/docs/, so relative paths 404 on the site) - Fix 5 other broken links (docs.json navbar, CONTRIBUTING placeholders, building-extensions API link, glossary docling URL, README escaping) - Quote all [extras] in pip/uv install commands for zsh compatibility (26 instances across 12+ files) - Fix simple_rag_with_filter.py: encode(docs) → encode(ds) parameter bug Add review tooling: - docs/scripts/check_docs.py: standalone validation script (stdlib only) checking links, Python code blocks, and shell quoting - docs/PR601-REVIEW.md: review comment tracker * docs: add missing imports to tutorial snippets and fix title casing - tutorials/03: add `from mellea import generative` to Steps 1-3 code blocks - tutorials/05: add `import mellea` and mify import to Steps 2-5 code blocks - concepts/generative-functions.md: "functions" → "Functions" in title Addresses reviewer comments M1-M4 and C3. * docs: fix missing start_session import in FAQ code blocks Two FAQ answers imported `generative` but used `start_session()` without importing it. Found by check_docs.py, not flagged by reviewers. * docs: fix 5 runtime errors found in PR review (E2, E4, E6, E7, E8) - E2: Remove unnecessary MelleaTool.from_callable() wrapping — @tool decorated functions are already MelleaTool objects - E4: Fix result.body → result.parsed_repr.body on ModelOutputThunk - E6: Fix langchain.tools → langchain_core.tools import path - E7: Fix mellea.stdlib.docs → mellea.stdlib.components.docs import path - E8: Replace broken Document example with working Message approach; filed generative-computing#636 for the underlying Document.parts() bug * docs: enhance import validation to check full mellea module paths Previously check_docs.py only validated the top-level package name (e.g. 
`mellea` exists), missing incorrect submodule paths like `mellea.stdlib.docs` (should be `mellea.stdlib.components.docs`). Now walks the filesystem to verify each dotted component resolves to a real package directory or .py file. Would have caught E7 from the PR review mechanically. * docs: address review comments — installation rewrite, WatsonX deprecation, tutorial cleanup - C1: reword landing page intro per reviewer suggestion - C4: remove dead blog link in requirements-system - C5: document grounding_context arbitrary key convention - C6: add sample output to tutorial 01 Step 1 - C7: remove duplicate @Generative steps from tutorial 01 (covered in tutorial 03) - C9: clarify ChatContext deprecation warning in tutorial 02 - E3: add ddgs + langchain-community install note - E5: add smolagents install note - E9: add mkdir and model-size warnings to m-decompose - I2+I3: rewrite installation.md with pip/uv as equals, add mellea[all] note - I6: replace WatsonX backend section with deprecation notice - Add deprecation banner to integrations/watsonx.md - Update PR601-REVIEW.md tracker with root cause analysis Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: bump Python version references from 3.10+ to 3.11+ Upstream merged generative-computing#603 (move off python 3.10), pyproject.toml now requires >=3.11. Update all doc references to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: fix angle-bracket email parsing error in code-of-conduct Mintlify treats `<email@example.com>` as JSX/HTML tags, causing a parse error at line 88. Use markdown link syntax instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com> Co-authored-by: Paul S. Schweigert <paul@paulschweigert.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
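The import-path check described for `check_docs.py` (walking the filesystem so that each dotted component resolves to a package directory or `.py` file) could look roughly like the sketch below. This is an illustration under assumptions, not the script's actual code: the function name `module_path_exists` and its signature are hypothetical.

```python
from pathlib import Path


def module_path_exists(root: Path, dotted: str) -> bool:
    """Return True if every component of `dotted` resolves under `root`.

    Intermediate components must be package directories; the final
    component may be a package directory or a `.py` file.
    """
    current = root
    parts = dotted.split(".")
    for index, part in enumerate(parts):
        is_last = index == len(parts) - 1
        package_dir = current / part
        if package_dir.is_dir():
            # Descend into the package and keep checking.
            current = package_dir
            continue
        # Only the last component is allowed to be a plain module file.
        return is_last and (current / f"{part}.py").is_file()
    return True
```

With a tree containing `mellea/stdlib/components/docs.py`, the path `mellea.stdlib.components.docs` resolves while `mellea.stdlib.docs` does not, which is the kind of error E7 represented.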
* feat: add OTLP logging export

  Adds OTLP logging handler that exports logs to OpenTelemetry collectors. Configured via MELLEA_LOGS_OTLP and OTEL_EXPORTER_OTLP_LOGS_ENDPOINT environment variables. Integrates with existing FancyLogger infrastructure.

  Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>

* refactor: rename get_otlp_handler to get_otlp_log_handler

  Renamed function for clarity to indicate it's specifically for log handling. Updated all references in telemetry module, core utils, and tests.

  Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>

---------

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
…ative-computing#646) * docs: implement publishing pipeline (generative-computing#617) Add docs-publish.yml workflow that builds, validates, and deploys docs to orphan deployment branches (docs/staging from main, docs/production from releases). Replaces docs-autogen-pr.yml. Includes pre-commit hooks for MDX validation and docstring quality, docs/PUBLISHING.md strategy document, and .gitignore entries for generated API docs. Validation is soft-fail by default with strict_validation toggle. workflow_dispatch supports force_publish for testing. * docs: use docs/preview as default workflow_dispatch target Safer default for manual dispatch — testing deploys to docs/preview rather than docs/staging. The automatic paths (push→staging, release→production) are unaffected. * docs: add label-gated preview deployment for PRs PRs with the 'docs-preview' label deploy to docs/preview branch. PRs without the label only run build + validation (no deploy). Also triggers on 'labeled' event so adding the label fires the workflow. * docs: document preview deployment and docs-preview label Update PUBLISHING.md with docs/preview branch, label-based PR deployment, and updated manual dispatch instructions. * docs: document fork PR limitation and fix markdownlint warnings Fork PRs can't deploy because GITHUB_TOKEN lacks write access to upstream. Documented the workaround (manual dispatch or push to upstream). Fixed MD032 blank line warnings. * docs: clarify Mintlify must use root path on deployment branches * docs: rename workflow to Docs, remove duplicate docs-lint from ci.yml markdownlint is already in Docs/build-and-validate with proper path filtering. Running it in ci.yml on every PR regardless of changed files was wasteful and duplicated. * ci: add job summaries to Docs and code-checks workflows Each validation step now tees its output to a temp log file. 
A final "Write job summary" step in both workflows writes a markdown table to $GITHUB_STEP_SUMMARY showing pass/fail per sub-check with key stats (lint issues, coverage %, symbol counts, test pass/fail counts). Collapsible <details> blocks include the full output for deeper inspection without cluttering the summary view. * ci: restore default pytest traceback (revert --tb=short) * ci: add branch, trigger, PR number and run URL to deploy commit message * docs: document deploy commit message format in PUBLISHING.md * docs: rename 'deployment branches' to 'docs branches' per review feedback * ci: opt into Node.js 24 for actions, add docstring quality to job summary * ci: run docstring quality in CI, fix false-positive annotation, update action versions audit_coverage.py emitted 'All documented symbols pass quality checks' even when --quality was not passed (quality_issues always [] without the flag). Now emits 'skipped' when --quality is absent, so the annotation is accurate. Add --quality to the CI audit step so quality is actually run and surfaced in the job summary. Update actions to latest versions (Node.js 24 compatible): - actions/checkout v4 → v6 - astral-sh/setup-uv v5 → v7 - actions/upload-artifact v4 → v7 - actions/download-artifact v4 → v8 * ci: fix job summary layout and docstring quality presentation - Split quality row into proper Result/Details columns matching table header - Strip 'run locally' from annotation message (we run it in CI now) - Add separate 'Docstring quality details' collapsible extracted from the coverage log so quality issues are easy to expand inline * ci: add job summary to deploy step showing destination branch and status * ci: remove per-item docstring quality annotations Individual annotations per missing docstring don't point to diff lines so they just create noise in the Annotations panel (14+ entries). The single summary annotation and the job summary collapsible provide sufficient visibility. 
Consistent with markdownlint and MDX validation which only emit one top-level result, not per-issue annotations. * ci: strip ANSI codes from job summary logs; defer docstring quality pre-commit hook - Strip ANSI escape codes from captured logs before writing to GITHUB_STEP_SUMMARY so collapsible sections render as clean text rather than raw escape sequences - Move docs-docstring-quality pre-commit hook to stages: [manual] with a TODO pointing to generative-computing#616 — Griffe loads the full package (~10s) which is too slow for normal commit flow; re-enable once quality issues reach 0 - CI audit still checks all symbols including methods (no --no-methods) so the job summary reports the full picture; hard-fail deferred to generative-computing#616 * ci: check all symbols including methods in docstring quality audit * ci: improve job summary detail rows and fix duplicate quality output - MDX Validation row now shows per-check error counts (e.g. "1 syntax error(s), 2 broken link(s)") instead of just PASS/FAIL - Markdownlint row already showed issue count; API Coverage row shows pct + symbols - Fix: coverage log is split at the quality section boundary so the docstring quality details collapsible no longer duplicates the coverage output * ci: raise collapsible log limits (100k for quality, 5k for others) * ci: raise docstring quality collapsible cap to 1MB * docs: clarify docs branches retain no history between deploys * docs: add --quality to local audit command to match CI behaviour
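The log clean-up described above (strip ANSI escape codes, cap the length, wrap the result in a collapsible section) can be sketched in Python as follows. This is not the workflow's actual implementation, which is not shown here; the helper names are hypothetical, and treating the 5k cap as a character count is an assumption.

```python
import re

# Matches CSI sequences such as "\x1b[31m" (colour) or "\x1b[0m" (reset).
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove ANSI CSI escape sequences so logs render as plain text."""
    return ANSI_CSI.sub("", text)


def details_block(title: str, log: str, limit: int = 5_000) -> str:
    """Wrap a cleaned, length-capped log in a collapsible <details> block."""
    clean = strip_ansi(log)
    if len(clean) > limit:
        clean = clean[:limit] + "\n... (truncated)"
    fence = "`" * 3  # built at runtime to avoid nesting fences in this example
    return (
        f"<details><summary>{title}</summary>\n\n"
        f"{fence}text\n{clean}\n{fence}\n\n</details>\n"
    )
```

The returned string would be appended to the file named by `$GITHUB_STEP_SUMMARY`; stripping before truncating keeps escape sequences from being cut in half at the cap.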
* docs: add Qiskit code validation IVR example

  Add comprehensive example demonstrating Instruct-Validate-Repair pattern for Qiskit quantum computing code generation with external validation.

  Features:
  - Pre-condition validation (prompt and input code)
  - Post-condition validation using flake8-qiskit-migration
  - Automatic repair loop with detailed error feedback
  - Code extraction from markdown blocks
  - Real-world use case: fixing deprecated Qiskit APIs

  Updated README with example description and requirements.

  Co-authored-by: va <va@us.ibm.com>
  Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>

* refactor: review feedback

* fix: format code

  run ruff format

  Signed-off-by: va <va@us.ibm.com>

* docs: add script block and fix uv run command in qiskit example

  Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>

* refactor: extract validation helpers into separate module

  - Move helper functions to validation_helpers.py
  - Reorganize into qiskit_code_validation/ subdirectory
  - Prepare structure for additional documentation

  Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>

* docs: add README and improve qiskit example configuration

  - Add comprehensive README with example prompts and troubleshooting
  - Move configuration inline for easier editing
  - Update parent README to document subdirectory
  - Add 'aer' to codespell ignore list for qiskit_aer package

  Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>

* docs: add Future Work section to Qiskit IVR example

  Add concise documentation of planned enhancements:
  - MultiTurnStrategy integration for conversation-based repair
  - Grounding context to enable smaller models

  Addresses remaining work items from PR generative-computing#576 review discussion.

  Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>

---------

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
Signed-off-by: va <va@us.ibm.com>
Co-authored-by: va <va@us.ibm.com>
Co-authored-by: //va <vabarbosa@users.noreply.github.com>
…mocks (generative-computing#567)

* test: isolate astream_incremental tests from CI

  Fixes generative-computing#562

* test: add deterministic mock tests for astream incremental logic

  Introduces `test_astream_mock.py` to test `ModelOutputThunk`'s async queue incremental streaming logic deterministically without relying on highly-variable LLM backends.

* test: adapt astream mock tests to upstream incremental semantics

  Update tests to match the astream() behavior change from PR generative-computing#618:
  - astream() now always returns incremental content (including final call)
  - astream() on a computed MOT raises RuntimeError

* chore: trigger CI rebuild
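The two semantics listed above (every `astream()` call returns only new content, and calling it after completion raises `RuntimeError`) can be exercised without a model backend by feeding an `asyncio.Queue` by hand. The sketch below uses a hypothetical `FakeThunk` class written for illustration; it is not Mellea's `ModelOutputThunk` and makes no claim about that class's internals.

```python
import asyncio

_DONE = object()  # sentinel marking the end of the stream


class FakeThunk:
    """Hypothetical stand-in for a streaming output object."""

    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue()
        self.computed = False
        self.value = ""

    async def astream(self) -> str:
        """Return only the content that arrived since the previous call."""
        if self.computed:
            raise RuntimeError("stream already fully consumed")
        chunks = []
        item = await self.queue.get()  # wait for at least one item
        while True:
            if item is _DONE:
                self.computed = True
                break
            chunks.append(item)
            if self.queue.empty():
                break
            item = self.queue.get_nowait()
        delta = "".join(chunks)
        self.value += delta
        return delta


async def demo() -> list[str]:
    thunk = FakeThunk()
    for chunk in ("Hel", "lo"):
        thunk.queue.put_nowait(chunk)
    first = await thunk.astream()   # drains both queued chunks
    thunk.queue.put_nowait("!")
    thunk.queue.put_nowait(_DONE)
    second = await thunk.astream()  # final call is still incremental
    try:
        await thunk.astream()
        third = "no error"
    except RuntimeError:
        third = "RuntimeError"
    return [first, second, third, thunk.value]
```

Because the test controls exactly what enters the queue and when, the chunk boundaries are deterministic, which is the property a live LLM backend cannot offer.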
…generative-computing#652)

The dictionary keys were generator expressions. Generator objects hash and compare by identity rather than by content, so every key was unique and duplicates were never detected. Wrapping each key in tuple() explicitly creates a hashable key that compares by value, so records are properly deduplicated.

Closes generative-computing#651

Signed-off-by: Yannis Katsis <35782820+yannisk2@users.noreply.github.com>
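A small, self-contained reproduction of this class of bug is sketched below. The record layout, the `fields` tuple, and the function names are invented for illustration and are not taken from the patched code.

```python
records = [
    {"id": 1, "name": "a"},
    {"id": 1, "name": "a"},  # duplicate of the first record
    {"id": 2, "name": "b"},
]
fields = ("id", "name")


def dedupe_buggy(rows: list[dict]) -> list[dict]:
    seen = {}
    for row in rows:
        # A generator object is a distinct key on every iteration,
        # regardless of the values it would yield.
        key = (row[f] for f in fields)
        seen[key] = row
    return list(seen.values())


def dedupe_fixed(rows: list[dict]) -> list[dict]:
    seen = {}
    for row in rows:
        # tuple() materialises the values, so equal records share a key.
        key = tuple(row[f] for f in fields)
        seen[key] = row
    return list(seen.values())
```

The buggy version raises no error, which makes the failure easy to miss: it simply returns every input row.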
…generative-computing#614) * docs: expand module-level docstrings for API reference (generative-computing#612) Replace one-liner and missing module docstrings across 53 files with substantive 2-4 sentence descriptions covering each module's purpose, key exports, and when to reach for it. Covers all three priority tiers from the issue: __init__.py landing pages, undocumented modules, and short stubs that merely restated the module name. * fix: replace EN DASH with hyphen in docstrings (RUF002) * docs: add Google-style Args/Returns to public functions (generative-computing#613) Add missing ``Args:`` and ``Returns:`` sections to all public functions and methods across ``mellea/`` and ``cli/`` that lacked them. Also converts RST-style ``:param:``/``:returns:`` docstrings inherited from the ``granite_io`` upstream into Google style to match the project convention (``pyproject.toml`` ``convention = "google"``). * docs: expand docstrings for public classes and functions (generative-computing#613) Add or expand Google-style docstrings across CLI and library modules: - cli/alora/train.py: add docstrings to load_dataset_from_json, formatting_prompts_func, SaveBestModelCallback, SafeSaveTrainer, train_model - cli/decompose/decompose.py: add class docstring to DecompVersion, expand run() with full Args section - cli/decompose/pipeline.py: add class docstrings to ConstraintResult, DecompSubtasksResult, DecompPipelineResult, DecompBackend; add full docstring with Args/Returns to decompose() - cli/decompose/utils.py: add Args/Returns to validate_filename - cli/eval/commands.py: add full docstring with Args to eval_run - cli/m.py: expand callback docstring - mellea/backends/adapters/adapter.py: expand LocalHFAdapter docstring - mellea/backends/cache.py: expand SimpleLRUCache docstring - mellea/backends/tools.py: expand SubscriptableBaseModel docstring; fix json_extraction to use Returns: instead of Yields: - mellea/core/backend.py: expand Backend docstring; add Args/Returns to 
generate_walk - mellea/core/base.py: add Args/Returns to blockify and get_images_from_component - mellea/core/utils.py: expand RESTHandler, JsonFormatter, FancyLogger class docstrings - mellea/helpers/openai_compatible_helpers.py: add Args/Returns to message_to_openai_message and messages_to_docs - mellea/stdlib/components/docs/richdocument.py: expand TableQuery and TableTransform class docstrings - mellea/stdlib/components/genslot.py: expand Argument, Function, Arguments, GenerativeSlot class docstrings; add SyncGenerativeSlot class docstring - mellea/stdlib/components/mobject.py: expand Query and Transform class docstrings - mellea/stdlib/requirements/requirement.py: add Args sections to req() and check() - mellea/stdlib/sampling/budget_forcing.py: expand BudgetForcingSamplingStrategy class docstring * docs: add missing Args: sections to class-level docstrings (generative-computing#613) Add Args: sections to class docstrings that had Attributes: but no Args:, completing Google-style docstring coverage: - core/requirement.py: ValidationResult, Requirement - core/sampling.py: SamplingResult - core/base.py: CBlock, ImageBlock, ModelOutputThunk, ContextTurn, TemplateRepresentation, GenerateLog, ModelToolCall - formatters/template_formatter.py: TemplateFormatter - formatters/granite/intrinsics/input.py: IntrinsicsRewriter - formatters/granite/intrinsics/output.py: TransformationRule, TokenToFloat, DecodeSentences, MergeSpans - formatters/granite/retrievers/embeddings.py: InMemoryRetriever - stdlib/components/chat.py: Message, ToolMessage - helpers/async_helpers.py: ClientCache * docs: remove redundant Attributes: entries and drop TASK.md (generative-computing#613) Remove Attributes: sections where fields duplicate constructor params verbatim (same name, same type, same description), keeping only entries that add genuinely new information (computed/derived attributes): - mellea/backends/backend.py: FormatterBackend - mellea/core/base.py: CBlock, ImageBlock, 
ModelOutputThunk, ContextTurn - cli/eval/runner.py: InputEvalResult, TestEvalResult (keep computed properties passed_count, total_count, pass_rate) Also removes TASK.md left over from development. * chore: upgrade mypy to 1.19.1 to fix cpex import-not-found in CI mypy 1.18.2 reports import-not-found for cpex.framework.* (which has no py.typed marker). 1.19.1 handles this correctly and was the version used when PR generative-computing#582 passed CI. * docs: fix pipeline docstring - call count depends on constraints not fixed at five * docs: apply Option C docstring convention — Args: on class only, clean Attributes: (generative-computing#613) - Remove Args: from __init__ docstrings across 60 classes; Args: moved to (or already present on) the class docstring so the docs pipeline (which skips __init__) and IDE hover both show the full parameter list without duplication - Review Attributes: on 46 classes: remove pure-echo entries that repeat Args: verbatim; retain sections where stored values differ in type or behaviour (type transforms such as str→CBlock, class constants such as YAML_NAME, computed values such as SamplingResult.result_index) - audit_coverage.py: drop no_attributes check (now optional by design); add duplicate_init_args check to catch Args: appearing in both class and __init__ - CONTRIBUTING.md: add class/__init__ docstring placement section with example - AGENTS.md: expand Google-style docstring bullet with Option C rule; remove duplicate line * docs: remove isolated venv from generate-ast.py, fix audit_coverage.py (generative-computing#613) - generate-ast.py: remove isolated .venv-docs-autogen — use the project venv (sys.executable) directly; mdxify and mellea are already installed via uv sync --all-extras --group dev; keep clean MDXIFY_CWD for import safety; remove --no-venv / --pypi-name / --pypi-version args and ensure_venv / pip_install helpers; update module docstring - audit_coverage.py: fix no_class_args to filter variadic *args/**kwargs by 
ParameterKind (matching the existing function check) so SimpleComponent (**kwargs) is not falsely flagged; update find_documented_symbols to match mdxify 0.2.37 heading format (### `SymbolName`) - mellea/core/utils.py: add Args: to RESTHandler class docstring (missed in Option C sweep); remove pure-echo Attributes: section * docs: add docstring validation guidance to CONTRIBUTING.md (generative-computing#613) Add a 'Validating docstrings' subsection under the class docstring placement rule with: - audit_coverage.py command to run after generate-ast.py - Table of key quality checks (no_class_args, duplicate_init_args, etc.) - Three real classes to hover over in VS Code to verify IDE UX * docs: add class/__init__ docstring placement rule to docs guide (generative-computing#613) The docs contribution guide only showed the function docstring pattern. Add the class docstring placement rule (Option C): Args: on class only, __init__ gets a summary sentence, Attributes: only for genuine transforms. Link back to CONTRIBUTING.md for the full validation workflow. * docs: update nav with plugins/telemetry groups (generative-computing#613) - docs/docs/docs.json: add plugins and telemetry/logging groups to API Reference nav (generated by pipeline run) - .gitignore: add .venv-docs-autogen/ (leftover from earlier failed run, no longer created by generate-ast.py) - docs/PR614-review-summary.md: add full review summary for agents and reviewers covering problem, Option C rationale, changes made, and known issues * fix: remove --no-venv arg from build.py calls to generate-ast.py generate-ast.py no longer accepts --no-venv (removed when the isolated venv was dropped in favour of sys.executable). build.py was still passing it, breaking the docs CI build. Signed-off-by: Nigel Jones <jonesn@uk.ibm.com> --------- Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
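The `no_class_args` fix above (filtering variadic `*args`/`**kwargs` by parameter kind) can be illustrated with a minimal sketch. It uses the standard library's `inspect.Parameter` kinds rather than Griffe's `ParameterKind`, and `concrete_params`, `SimpleComponentLike` and `Message` are hypothetical names, not the code in `audit_coverage.py`.

```python
import inspect

# Parameter kinds that never need their own entry in an Args: section.
_VARIADIC = {inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD}


def concrete_params(func) -> list[str]:
    """Return the parameter names a docstring would be expected to document."""
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind not in _VARIADIC and name not in ("self", "cls")
    ]


class SimpleComponentLike:
    """Hypothetical stand-in for a class whose constructor only takes **kwargs."""

    def __init__(self, **kwargs):
        self.fields = kwargs


class Message:
    """Hypothetical class mixing concrete and variadic parameters."""

    def __init__(self, role: str, content: str, *extra, **kwargs):
        self.role = role
        self.content = content
```

With this filter, a constructor that only accepts `**kwargs` has no concrete parameters, so a check for a missing `Args:` section has nothing to flag.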
…tive-computing#654) (generative-computing#664) * docs: fix missing docstring sections in plugins and telemetry (generative-computing#654) Add Args/Returns/Raises sections to ~15 functions and Args to 4 classes in mellea/plugins/ and mellea/telemetry/tracing.py. Expand short HookType docstring. Convert RST cross-references to Google style. Audit now reports 0 quality issues. * docs: remove spurious Returns from contextmanager generators trace_application() and trace_backend() are @contextmanager generators that already have Yields: sections. Remove incorrect Returns: sections per Google docstring style.
…rative-computing#663) - Move plugins.mdx from core-concept/ to concepts/ (matches existing convention; all other core-concept/ pages already migrated) - Add to Concepts group in docs.json navigation - Apply documentation standards: diataxis frontmatter, sentence-case headings, markdownlint fixes, See also footer, glossary links on first use of Component/Requirement/MelleaSession - Fix broken internal link (/core-concept/interoperability → /integrations/mcp) - Add glossary entries: Hook/HookType, Plugin, PluginSet - Trim docs/dev/hook_system.md to internal design notes only, with pointer to user-facing page for usage docs
…inks (generative-computing#658) RST-style ``Symbol`` in docstrings caused add_cross_references to generate malformed link syntax: `[`Symbol`](url)` renders as inline code rather than a clickable link in Mintlify. - Add normalize_rst_backticks() pass in decorate_api_mdx.py (runs before add_cross_references) to convert ``x`` to `x` in MDX prose - Add validate_rst_docstrings() in validate.py to scan source files and report occurrences as a warning (does not fail the build) 91 source files / 992 occurrences detected; fix is applied at build time. Source cleanup tracked separately.
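A minimal sketch of what a pass like `normalize_rst_backticks()` might do, assuming it should rewrite RST double-backtick literals to single backticks in prose and leave fenced code blocks untouched. The fence handling and the regex are assumptions; the real function in `decorate_api_mdx.py` may behave differently.

```python
import re

# Assumption: an RST literal is a run of non-backtick text wrapped in double backticks.
_RST_LITERAL = re.compile(r"``([^`\n]+)``")
# Build the fence marker programmatically so this example stays easy to embed.
FENCE_MARK = "`" * 3


def normalize_rst_backticks(mdx: str) -> str:
    """Convert ``x`` to `x` in prose lines, skipping fenced code blocks."""
    out = []
    in_fence = False
    for line in mdx.splitlines():
        if line.lstrip().startswith(FENCE_MARK):
            in_fence = not in_fence  # entering or leaving a code fence
            out.append(line)
            continue
        out.append(line if in_fence else _RST_LITERAL.sub(r"`\1`", line))
    return "\n".join(out)
```

Running such a pass before cross-referencing means the link generator only ever sees single-backtick symbols, which is the ordering the commit message describes.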
…ncies) (generative-computing#665) * docs: pre-release verification pass (generative-computing#645) - Fix 7 missing Returns sections in core, plugins, stdlib docstrings - Delete stale docs/index.md and docs/PR601-REVIEW.md artifacts - Add stale-file detection to validate.py with tests - Wire stale-file check into docs-publish workflow summary * docs: add doc import validation and fix stale imports - Add validate_doc_imports() to check mellea imports in doc code blocks resolve at import time (skips optional deps, handles submodule imports) - Add validate_stale_files() integration with generate_report() - Fix glossary.md: ChatContext/SimpleContext import paths - Fix intrinsics.md: GraniteCommonAdapter → CustomIntrinsicAdapter - Add 3 tests for doc import validation (17/17 pass) * docs: distinguish generator Yields from Returns in docstring audit The quality audit incorrectly flagged @contextmanager generators as missing Returns sections. Generators (Generator, Iterator, etc.) should use Yields, not Returns. Adds no_yields check kind and _GENERATOR_RETURN_PATTERNS detection. Fixes: trace_application/trace_backend no longer flagged for Returns. New: json_extraction correctly flagged for missing Yields. * docs: fix Yields section for json_extraction generator docstring
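The Yields-versus-Returns distinction described above could be decided from the return annotation alone. The sketch below is an illustration under that assumption: `GENERATOR_NAMES` and `expected_section` are hypothetical, and the actual `_GENERATOR_RETURN_PATTERNS` in the audit script is not shown in this PR text.

```python
import ast

# Assumption: these annotation names mark a function as a generator.
GENERATOR_NAMES = ("Generator", "Iterator")


def expected_section(func_source: str) -> str:
    """Return the docstring section ("Yields" or "Returns") a function should have."""
    func = ast.parse(func_source).body[0]
    annotation = ast.unparse(func.returns) if func.returns else ""
    if any(name in annotation for name in GENERATOR_NAMES):
        return "Yields"
    return "Returns"
```

Substring matching also covers `AsyncGenerator` and `AsyncIterator`, at the cost of treating any annotation that merely contains those names as a generator.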
…g#551) * Add uncertainty/certainty intrinsic support Wire up the uncertainty intrinsic from ibm-granite/granite-lib-core-r1.0 with a high-level check_certainty() API. The intrinsic evaluates model confidence in its response given a user question and assistant answer. - Add check_certainty(context, backend) in core.py - Extract shared call_intrinsic() helper into _util.py - Update catalog to point uncertainty at granite-lib-core-r1.0 - Add test, example, and README entry Co-Authored-By: ink-pad <inkit.padhi@gmail.com> * Fix uncertainty description in README Co-Authored-By: ink-pad <inkit.padhi@gmail.com> * Rename _call_intrinsic to call_intrinsic Drop the underscore prefix and alias — use call_intrinsic consistently across _util.py, rag.py, core.py, and test_rag.py. Co-Authored-By: ink-pad <inkit.padhi@gmail.com> * Rename GraniteCommonAdapter to IntrinsicAdapter in _util.py Upstream PR generative-computing#571 renamed GraniteCommonAdapter to IntrinsicAdapter. Update our _util.py to match before rebasing onto origin/main. * Feat/requirement_check (#1) * Add uncertainty/certainty intrinsic support Wire up the uncertainty intrinsic from ibm-granite/granite-lib-core-r1.0 with a high-level check_certainty() API. The intrinsic evaluates model confidence in its response given a user question and assistant answer. - Add check_certainty(context, backend) in core.py - Extract shared call_intrinsic() helper into _util.py - Update catalog to point uncertainty at granite-lib-core-r1.0 - Add test, example, and README entry Co-Authored-By: ink-pad <inkit.padhi@gmail.com> * Fix uncertainty description in README Co-Authored-By: ink-pad <inkit.padhi@gmail.com> * Rename _call_intrinsic to call_intrinsic Drop the underscore prefix and alias — use call_intrinsic consistently across _util.py, rag.py, core.py, and test_rag.py. 
Co-Authored-By: ink-pad <inkit.padhi@gmail.com> * test * removed test file * added req check intrinsic * Update README.md * Update _util.py * updated files --------- Co-authored-by: ink-pad <inkit.padhi@gmail.com> Co-authored-by: manish-nagireddy <manish.nagireddy@ibm.com> * fix: add comment on repo structures * fix: linting * fix rc repo name * resolve rc references * fix: field ref in req check intrinsic --------- Co-authored-by: inkpad <inkit.padhi@ibm.com> Co-authored-by: Manish Nagireddy <65432909+mnagired@users.noreply.github.com> Co-authored-by: manish-nagireddy <manish.nagireddy@ibm.com> Co-authored-by: jakelorocco <59755218+jakelorocco@users.noreply.github.com>
* docs: removed outdated tutorial.md Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com> * dropping additional references to tutorial.md Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com> --------- Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
generative-computing#670) * docs: fix MelleaPlugin/MelleaBasePayload missing from API coverage (generative-computing#667) Replace dual if/else class definitions with dynamic base classes so Griffe's static AST parser sees a single ClassDef node per class and always picks up the authoritative docstring. * docs: convert intrinsic core docstrings to Google style The Sphinx :param:/:return: style is not recognised by Griffe's quality audit. Convert to Google Args/Returns sections.
…generative-computing#669) * feat: add codeowners specifically for the granite-common part of mellea intrinsics * fix: empty commit to unstuck mergify
…e-computing#645) (generative-computing#672) * docs: add missing example categories to examples catalogue (generative-computing#645) Add plugins, m_decompose, tutorial, notebooks, and hello_world to the examples index page so all runnable example directories are discoverable from the published docs. Also add a note to CONTRIBUTING.md reminding authors to update the catalogue when adding new example directories. * docs: add examples catalogue validation check to doc validator Add validate_examples_catalogue() to validate.py that checks every example directory under docs/examples/ (containing .py files) has a corresponding entry in docs/docs/examples/index.md. Also refine the CONTRIBUTING.md guideline to frame it as a check rather than only an instruction for when adding examples.
…generative-computing#690) * fix: add support back for older models and requirement_check adapter * fix: fix model used in test_rag for intrinsics * fix: remove answer_relevance_classifier and answer_relevance_rewriter
…e-computing#679) * Add uncertainty/certainty intrinsic support Wire up the uncertainty intrinsic from ibm-granite/granite-lib-core-r1.0 with a high-level check_certainty() API. The intrinsic evaluates model confidence in its response given a user question and assistant answer. - Add check_certainty(context, backend) in core.py - Extract shared call_intrinsic() helper into _util.py - Update catalog to point uncertainty at granite-lib-core-r1.0 - Add test, example, and README entry Co-Authored-By: ink-pad <inkit.padhi@gmail.com> * Fix uncertainty description in README Co-Authored-By: ink-pad <inkit.padhi@gmail.com> * Rename _call_intrinsic to call_intrinsic Drop the underscore prefix and alias — use call_intrinsic consistently across _util.py, rag.py, core.py, and test_rag.py. Co-Authored-By: ink-pad <inkit.padhi@gmail.com> * test * removed test file * added req check intrinsic * feat: extend sentence boundary marking to conversation history with shared index - Add `index` parameter to `mark_sentence_boundaries()` to allow callers to continue numbering across multiple calls; return the next available index - Add `all_but_last_message` as a valid `sentence_boundaries` key - Extend `_mark_sentence_boundaries()` to tag prior conversation turns when `all_but_last_message` is configured, using a shared running index with documents so that each context sentence has a globally unique tag Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Dennis Wei <dwei@us.ibm.com> * feat: extend DecodeSentences to conversation history and multi-source decoding - Accept `source: str | list[str]` to allow a single DecodeSentences rule to decode sentences from multiple locations in one pass - Add `all_but_last_message` as a valid source, decoding prior conversation turns with a running sentence index shared across all sources - Add optional `message_index` output field that records which conversation turn each attributed sentence came from Co-Authored-By: Claude Sonnet 
4.6 <noreply@anthropic.com> Signed-off-by: Dennis Wei <dwei@us.ibm.com> * feat: add context-attribution catalog entry and update core repo - Update _CORE_REPO to "ibm-granite/granitelib-core-r1.0" - Add context-attribution intrinsic pointing to _CORE_REPO Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Dennis Wei <dwei@us.ibm.com> * test: add context-attribution test data and formatter tests - Add input/test_canned_input/test_canned_output/expected_result JSON test data files - Add YamlJsonCombo entry for context-attribution pointing to ibm-granite/granitelib-core-r1.0 - Exclude context-attribution from Ollama inference tests via _NO_OLLAMA_ADAPTER since an Ollama LoRA adapter is not yet available on the HF Hub Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Dennis Wei <dwei@us.ibm.com> * test: update context-attribution test_run_transformers file for mellea The model consistently produces {"r": 1, "c": [2, 0, 1, 19, 3]} with the mellea codebase, yielding 7 attribution records rather than the 12 produced on the granite-common side. Update the expected output accordingly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Dennis Wei <dwei@us.ibm.com> * feat: add find_context_attributions() API function Add find_context_attributions() to core.py since the context-attribution adapter lives in the ibm-granite/granitelib-core-r1.0 repo, but modelled after find_citations() in rag.py. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Dennis Wei <dwei@us.ibm.com> * docs: add context-attribution example Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Dennis Wei <dwei@us.ibm.com> * test: add test_find_context_attributions and test files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Dennis Wei <dwei@us.ibm.com> --------- Signed-off-by: Dennis Wei <dwei@us.ibm.com> Co-authored-by: inkpad <inkit.padhi@ibm.com> Co-authored-by: ink-pad <inkit.padhi@gmail.com> Co-authored-by: manish-nagireddy <manish.nagireddy@ibm.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…puting#678) * Feat/guardianlib (generative-computing#8) * Update README.md Added policy_guardrails intrinsic * Update catalog.py Added policy guardrails * Create policy_guardrails.json Initial checkin * Create test_guardian.py initial checkin * Create policy_guardrails.py Initial check in * Create guardian.py Adding policy_guardrails intrinsic * Update guardian.py Fixed method call (call_intrinsic instead of _call_intrinsic) * feat: add guardian core intrinsic component * Added factuality examples * Added unit tests for factuality intrinsics * Fixed the pre-commit errors * fix guardian intrinsic names to match HF repo paths * fix: lint fixes from pre-commit * Removed duplicated code in correction example * refactor: remove _call_guardian_intrinsic workaround, use call_intrinsic * style: ruff format guardian.py --------- Co-authored-by: Moninder Singh <39064734+monindersingh@users.noreply.github.com> Co-authored-by: Subhajit Chaudhury <subhajit@ibm.com> Co-authored-by: Radu Marinescu <radu.marinescu@ie.ibm.com> * Fixed factuality correction example * fix: typo in factuality_correction example * fix: typo in factuality_correction example * fix: typo in factuality_correction example --------- Co-authored-by: Moninder Singh <39064734+monindersingh@users.noreply.github.com> Co-authored-by: Subhajit Chaudhury <subhajit@ibm.com> Co-authored-by: Radu Marinescu <radu.marinescu@ie.ibm.com> Co-authored-by: jakelorocco <jake.lorocco@ibm.com> Co-authored-by: jakelorocco <59755218+jakelorocco@users.noreply.github.com>
…puting#694) (generative-computing#697) Token count extraction in _post_process_async was gated behind `span is not None or metrics_enabled`, so mot.usage was never populated in plain (non-telemetry) runs. Now extracted unconditionally — usage is a standard mot field, not a telemetry concern.
…ests to run (generative-computing#674) Adds a "Required models" section listing the Ollama models that the non-qualitative tests need in order to pass
…ed module issues (generative-computing#676) * upd: clean over long or result files * add: validation code generation * add: validation code icl * add: prompt init * add: validation decision * add: decomp jinja * refact: pipeline and primary stages * refact: module logging * fea: validation report * fea: validation report icl * fea: cli script with config * add: examples * add: examples * add: README doc * pre-commit: add test attribute * upd: type annotations * fix: add constraint type annotation * upd: pre-commit format * upd: pre-commit type annotations * upd: pre-commit format * add: m_decompose tests * add: constraint retry * upd: constraint * upd: constraint * upd: same logmode * fix: a missed parse * pre-commit format * add: multi request support * clean * fix: type clean * upd: input file arg * fea: constraint retry * test * fix * fix * fix * clean: final result * add: decompose tests * add: decompose tests * clean: pre-commit format * clean: pre-commit formating on decompose tests
…erative-computing#706) * fix: modify plugin logging to not print * fix: increase test timeout
…generative-computing#709) ruff version; and fix new issues
Add a hard-fail docstring quality gate to the docs-publish workflow:
- New 'Docstring quality gate' step runs --quality --fail-on-quality --threshold 100; fails if any quality issue is found or coverage drops below 100% (both currently pass in CI)
- Existing audit_coverage step (soft-fail, threshold 80) retained for the summary coverage metric
Add typeddict_mismatch checks to audit_coverage.py:
- typeddict_phantom: Attributes: documents a field not declared in the TypedDict
- typeddict_undocumented: declared field absent from Attributes: section
- Mirrors the existing param_mismatch logic for functions
Pre-commit: enable --fail-on-quality on the manual-stage hook (CI is the hard gate; hook remains stages: [manual] as docs must be pre-built).
Update CONTRIBUTING.md and docs/docs/guide/CONTRIBUTING.md with TypedDict docstring requirements and the two new audit check kinds.
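To make the two TypedDict check kinds concrete, here is a small sketch that compares declared fields with the names listed under `Attributes:`. It relies on runtime introspection (`__annotations__`, `__doc__`), which is an assumption made for brevity; the audit script itself appears to work on static analysis output, and `documented_attributes`, `typeddict_mismatches` and `Usage` are hypothetical names.

```python
import re
from typing import TypedDict

# An entry looks like "name (type): description" or "name: description".
_ENTRY = re.compile(r"(\w+)(?:\s*\([^)]*\))?:\s+\S")
# A section header looks like "Args:" or "Attributes:" on its own line.
_HEADER = re.compile(r"[A-Z][\w ]*:$")


def documented_attributes(docstring: str) -> set[str]:
    """Collect field names listed under a Google-style Attributes: section."""
    names: set[str] = set()
    in_section = False
    for line in docstring.splitlines():
        stripped = line.strip()
        if stripped == "Attributes:":
            in_section = True
        elif in_section and _HEADER.match(stripped):
            in_section = False  # another section starts
        elif in_section:
            match = _ENTRY.match(stripped)
            if match:
                names.add(match.group(1))
    return names


def typeddict_mismatches(cls) -> dict[str, set[str]]:
    """Report phantom (documented only) and undocumented (declared only) fields."""
    declared = set(cls.__annotations__)
    documented = documented_attributes(cls.__doc__ or "")
    return {
        "typeddict_phantom": documented - declared,
        "typeddict_undocumented": declared - documented,
    }


class Usage(TypedDict):
    """Hypothetical token usage record.

    Attributes:
        prompt_tokens (int): Tokens in the prompt.
        total (int): Documented but never declared.
    """

    prompt_tokens: int
    completion_tokens: int
```

For `Usage`, `total` would be reported as phantom and `completion_tokens` as undocumented. Continuation lines that happen to start with `word:` would be misread as entries, so a production check needs indentation-aware parsing.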
… and fix hints
audit_coverage.py:
- Add file/line fields to every issue dict (repo-relative path + def line)
- _print_quality_report now shows [file:line] per issue, per-kind Fix:/Ref: hints linking to CONTRIBUTING.md anchors, and emits ::error file=...,line=... GHA annotations so issues appear inline in PR diffs
- Cap GHA annotations at 10 per check kind with "N more in job log" notice
- Add _KIND_FIX_HINTS and _gha_file_annotation helpers; _CONTRIB_DOCS_URL constant
validate.py:
- Convert all check functions from list[str] to list[dict] errors (file/line/message)
- Add line-number tracking to validate_source_links, validate_internal_links, validate_anchor_collisions, and validate_doc_imports
- Emit per-error GHA annotations with file/line; shared 20-annotation budget across all checks so every category gets representation in PR diff
- Fix icon bug: summary rows now use correct pass/fail icon
- Group detailed errors by check type with section headers
docs-publish.yml:
- Add --orphans and --output /tmp/quality_report.json to quality gate step
- Upload quality_report.json as docstring-quality-report artifact (30-day retention)
pyproject.toml:
- cli/**/*.py: suppress only D2/D3/D4xx style rules; enable D1xx (missing docstrings) as a ruff-level complement to the audit_coverage quality gate
docs/docs/guide/CONTRIBUTING.md:
- Add CI docstring checks reference section with per-kind tables (fix instructions + anchors) for all 11 check kinds across 4 categories
- Add callout explaining GHA annotation cap (10 per kind) and where to find the full list (job log + JSON artifact)
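The capped annotation behaviour described above could look roughly like the following. This is a sketch under assumptions: the issue dicts are taken to carry `kind`, `file`, `line` and `message` keys, and `gha_annotations` is a hypothetical helper rather than the `_gha_file_annotation` function the commit adds. The `::error file=...,line=...::message` form is the standard GitHub Actions workflow-command syntax.

```python
from collections import defaultdict

ANNOTATION_CAP = 10  # per check kind, as stated in the commit message


def gha_annotations(issues: list[dict], cap: int = ANNOTATION_CAP) -> list[str]:
    """Build workflow-command lines, emitting at most `cap` errors per kind."""
    emitted: dict[str, int] = defaultdict(int)
    skipped: dict[str, int] = defaultdict(int)
    lines = []
    for issue in issues:
        kind = issue["kind"]
        if emitted[kind] < cap:
            emitted[kind] += 1
            lines.append(
                f"::error file={issue['file']},line={issue['line']}"
                f"::[{kind}] {issue['message']}"
            )
        else:
            skipped[kind] += 1
    # One summary notice per kind that hit the cap.
    for kind, count in skipped.items():
        lines.append(f"::notice::{count} more {kind} issue(s) in job log")
    return lines
```

Capping per kind rather than globally keeps one noisy check from using up the annotations that other checks need to show up in the PR diff.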
…y links
audit_coverage.py (audit_nav_orphans):
- Probe docs/docs/docs.json before docs/mint.json so both Mintlify v1
and v2 nav configs are supported
- Extend _extract to handle plain string page entries used by docs.json
(v2 uses "pages": ["api/..."] strings; v1 used {"page": "api/..."} dicts)
- Previously mint.json was never found, nav_refs stayed empty, and every
MDX file was reported as an orphan
docs-publish.yml (Write job summary):
- When the quality gate fails, render a prominent markdown callout with a
direct link to the CI docstring checks reference section in CONTRIBUTING.md
- Add a per-kind fix reference table with clickable anchor links to each
category section (missing/short, args/returns, class Option C, TypedDict)
- Per-kind Ref: URLs in the raw log are inside a text block and do
not render as links in the step summary; this table surfaces them rendered
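The `_extract` change above (accepting both plain string page entries and `{"page": ...}` dicts) can be sketched as a recursive walk over the nav config. `extract_page_refs` is a hypothetical name, and the sample structures in the usage note are simplified guesses at the two config shapes, not copies of the project's `docs.json` or `mint.json`.

```python
def extract_page_refs(node, in_pages: bool = False) -> set[str]:
    """Collect page references from a nested nav structure.

    Strings count as pages only when they sit inside a "pages" list;
    dicts contribute their "page" value when it is a string.
    """
    refs: set[str] = set()
    if isinstance(node, str):
        if in_pages:
            refs.add(node)
    elif isinstance(node, list):
        for item in node:
            refs |= extract_page_refs(item, in_pages)
    elif isinstance(node, dict):
        page = node.get("page")
        if isinstance(page, str):
            refs.add(page)
        for key, value in node.items():
            if isinstance(value, (list, dict)):
                refs |= extract_page_refs(value, in_pages=(key == "pages"))
    return refs
```

For example, a group such as `{"group": "Core", "pages": ["api/core/base", {"group": "Sub", "pages": ["api/core/utils"]}]}` yields both page paths, and `{"pages": [{"page": "api/a"}]}` yields `api/a`. An empty result for a non-empty config is the failure mode the commit fixes, where every MDX file was reported as an orphan.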
…ped notice
docs-publish.yml:
- Parse per-kind counts from _print_quality_report section headers in the quality gate log (e.g. "Missing docstrings (12)") and show them as a comma-separated breakdown in the Docstring Quality table cell instead of just the total, giving developers an immediate view of which categories are failing without expanding the log
audit_coverage.py:
- Remove the "skipped (pass --quality to enable)" GHA notice emitted by the coverage-only step; there is always a dedicated quality gate step immediately after so the notice was misleading and redundant
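A sketch of the header parsing described above, assuming section headers have the form `Label (N)` on their own line as in the "Missing docstrings (12)" example. `per_kind_counts` and `breakdown` are hypothetical helpers, and the workflow itself may do this with shell tools rather than Python.

```python
import re

# A section header: a label starting with a letter, then a count in parentheses.
_HEADER = re.compile(r"^\s*(?P<label>[A-Za-z][^()\n]*?)\s*\((?P<count>\d+)\)\s*$")


def per_kind_counts(log_text: str) -> dict[str, int]:
    """Map each section label found in the log to its issue count."""
    counts: dict[str, int] = {}
    for line in log_text.splitlines():
        match = _HEADER.match(line)
        if match:
            counts[match.group("label")] = int(match.group("count"))
    return counts


def breakdown(counts: dict[str, int]) -> str:
    """Render the counts as a comma-separated summary for a table cell."""
    return ", ".join(f"{label}: {count}" for label, count in counts.items())
```

Per-issue lines that begin with `[file:line]` do not match, because the label must start with a letter.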
audit_coverage.py:
- Coverage miss section now shows a structured Fix:/Ref: block with the exact generate-ast.py command and a link to CONTRIBUTING.md#validating-docstrings
- Missing symbols listed one per line (symbol indented under module) for scannability instead of comma-joined on one long line
- Emit a ::error or ::warning GHA annotation with symbol/module counts when coverage symbols are undocumented
validate.py:
- Add _CHECK_FIX_HINTS dict mapping each check label to a (fix text, ref URL) pair, covering all 8 check types with specific fix instructions and links into CONTRIBUTING.md (root or guide as appropriate)
- _print_check_errors now prints Fix:/Ref: under each section header, matching the pattern established by _print_quality_report
audit_coverage.py:
- Add missing_param_type check: fires when Args: section exists but one or more concrete params lack Python type annotations; naturally non-overlapping with no_args (which fires when section is absent)
- Add missing_return_type check: fires when Returns: section is documented but the function has no return annotation; naturally non-overlapping with no_returns (annotation exists but section absent)
- Add fix hints and CONTRIBUTING.md anchors for both new check kinds
- Update kind_labels and iteration order in _print_quality_report
generate-ast.py:
- Add remove_internal_modules() post-generation filter step
- Uses AST-based import analysis: a submodule is internal when the parent __init__.py imports from at least one sibling submodule but not from this one (import-based visibility, not __all__ name-matching)
- Conservative: keeps module when parent imports nothing (indeterminate) or __init__.py cannot be parsed
- _CONFIRMED_INTERNAL_MODULES hardcoded set for known internals where parent imports nothing (json_util, backend_instrumentation); these should eventually be renamed with _ prefix per Python convention
- Package index files (stem == parent dir) are never filtered
docs.json: nav regenerated by build; internal modules removed from nav
CONTRIBUTING.md: add missing_param_type / missing_return_type to CI docstring checks reference table
docs-publish.yml: add both new kinds to summary kind_short and kind_anchors fix-reference table
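The import-based visibility rule for `remove_internal_modules()` can be sketched with the standard `ast` module. The sketch only considers single-dot relative imports (`from .sub import x`, `from . import sub`); whether the real filter also handles absolute imports is not stated, and `imported_siblings` and `is_internal` are hypothetical names.

```python
import ast


def imported_siblings(init_source: str) -> set[str]:
    """Names of sibling submodules that a package __init__.py imports from."""
    siblings: set[str] = set()
    for node in ast.walk(ast.parse(init_source)):
        if isinstance(node, ast.ImportFrom) and node.level == 1:
            if node.module:
                # from .sub import name, or from .sub.deep import name
                siblings.add(node.module.split(".")[0])
            else:
                # from . import sub
                siblings.update(alias.name for alias in node.names)
    return siblings


def is_internal(submodule: str, init_source: str) -> bool:
    """Apply the rule: internal only if the parent imports siblings but not this one."""
    try:
        siblings = imported_siblings(init_source)
    except SyntaxError:
        return False  # __init__.py cannot be parsed: keep the module
    if not siblings:
        return False  # parent imports nothing: indeterminate, keep the module
    return submodule not in siblings
```

Both fallbacks return `False`, which matches the conservative behaviour in the commit message: a module is dropped only when there is positive evidence that its siblings are exported and it is not.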
audit_coverage.py was walking all non-_-prefixed source modules via Griffe, including internal modules (json_util, backend_instrumentation, etc.) whose MDX files were removed by remove_internal_modules() in generate-ast.py. This caused coverage to drop because those symbols were no longer 'documented' but were still counted in the denominator. Apply the same import-based public-API filter in discover_public_symbols(): skip submodules that the parent __init__.py does not import from, mirroring the generate-ast.py logic. _CONFIRMED_INTERNAL_MODULES kept in sync. Also drop the per-kind anchor table from the GHA job summary. Anchors in GitHub Actions summaries only navigate to the top of the referenced document, so the table added noise without working links.
Adds two new quality check kinds that fire when the type explicitly
stated in an Args:/Returns: docstring entry disagrees with the Python
type annotation in the function signature:
param_type_mismatch — 'param (OldType): ...' vs annotation 'NewType'
return_type_mismatch — 'Returns: OldType: ...' vs annotation '-> NewType'
Both checks fire only when BOTH sides have an explicit type; one-sided
absence is already handled by missing_param_type / missing_return_type.
Type comparison uses _types_match() / _normalize_type() which handles:
- typing aliases: List→list, Dict→dict, Optional→X|None, Union→A|B
- typing. prefix stripping
- pipe-union component ordering (str|None == None|str)
- incidental whitespace
Known conservative suppressions (prefer false negatives over false positives,
since there is no per-site suppression mechanism):
- Nested generics not fully expandable by regex (e.g. Optional[list[str]])
are silently skipped — both sides must fully normalise to be compared
- Union with bracket-containing members
- Callable argument ordering
…ubmodule For a module file mellea/pkg/submodule.py, Griffe gives filepath ending in .py (not __init__.py). The parent __init__.py is fp.parent/__init__.py. The previous code used fp.parent.parent which is correct for packages (whose filepath IS the __init__.py) but goes one level too far for plain module files — it was checking the grandparent init instead of the parent. Effect: genslot, react, unit_test_eval and similar non-exported modules in stdlib/components were incorrectly counted as public symbols, inflating the denominator and lowering the reported coverage percentage.
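The path fix described above amounts to choosing the parent `__init__.py` differently for packages and plain module files. A minimal sketch, with `parent_init` as a hypothetical helper name:

```python
from pathlib import Path


def parent_init(filepath: Path) -> Path:
    """Locate the __init__.py of the package that contains this module or package."""
    if filepath.name == "__init__.py":
        # A package's filepath is its own __init__.py, so the enclosing
        # package lives one directory further up.
        return filepath.parent.parent / "__init__.py"
    # A plain module file sits directly inside its parent package.
    return filepath.parent / "__init__.py"
```

For `mellea/pkg/submodule.py` this returns `mellea/pkg/__init__.py`; using `parent.parent` there would instead inspect `mellea/__init__.py`, which is the grandparent the commit message refers to.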
- Fix missing_param_type, missing_return_type, param_type_mismatch, return_type_mismatch, no_args, no_returns, and missing docstring issues
- Add TYPE_CHECKING imports for HuggingFace types in util.py with type: ignore[union-attr] for pre-existing None-safety gaps
- Add Granite3ChatCompletion import to granite32/33 input.py for correct sanitize() parent signature match
- Convert reST-style docstrings to Google style in intrinsics/input.py
- Document AST single-quote normalization for Literal types in CONTRIBUTING.md
Adding TYPE_CHECKING annotations to util.py made mypy check function bodies it previously skipped (untyped params = implicit Any = no body checking). This exposed a pre-existing Tensor-not-callable issue and a dict-variance issue in mobject.py. Suppress with targeted type: ignore comments — these are not new bugs, just newly visible ones.
Misc PR
Type of PR
Description
Adds a hard-fail docstring quality gate to the docs-publish workflow (`--quality --fail-on-quality --threshold 100`). Both checks currently pass in CI (100% coverage, 0 quality issues).

Also adds a `typeddict_mismatch` scanner to `audit_coverage.py`: it flags `Attributes:` sections on `TypedDict` classes that document phantom fields or omit declared ones (mirrors the existing `param_mismatch` logic for functions).

Pre-commit hook updated to use `--fail-on-quality`; stays `stages: [manual]` since it requires pre-built docs. CI is the hard gate.

Contribution docs updated with TypedDict docstring requirements and the two new check kinds.
Testing