v0.5.0a1 The Sentinel: plugin-enabled scanning, adaptive engine hardening, and preflight green #21

Closed
PythonWoods-Dev wants to merge 1 commit into main from release/v0.5.0a1-final

@PythonWoods-Dev
Contributor

Summary
This PR finalizes the Sentinel milestone for v0.5.0a1 by closing the remaining blocker and stabilizing runtime, performance, and release quality gates.

What was delivered
Plugin-enabled scanning (Safe Harbor fully implemented)
  • Added explicit plugin allowlist support in project configuration.
  • Rule loading is now deterministic:
      • Core rules are always active.
      • Custom regex rules are loaded from configuration.
      • External plugin rules are loaded only when explicitly listed.
  • Added robust core fallback behavior for environments where entry-point metadata is not available.
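
The load order described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `Rule`, `assemble_rules`, and the choice to raise `LookupError` for an allowlisted plugin that is not installed are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for a rule object; only the fields used here."""

    id: str
    origin: str  # "core", "custom", or "plugin"


def assemble_rules(
    core: Iterable[Rule],
    custom: Iterable[Rule],
    discovered: dict[str, Callable[[], Rule]],
    allowlist: Iterable[str],
) -> list[Rule]:
    """Build the rule list in a fixed order: core, custom, allowlisted plugins."""
    rules = list(core)      # 1. core rules are always active
    rules.extend(custom)    # 2. custom rules, in configuration order
    for name in allowlist:  # 3. plugins only when explicitly listed
        loader = discovered.get(name)
        if loader is None:
            # Assumed behaviour: a listed but missing plugin is an error.
            raise LookupError(f"plugin {name!r} is listed but not installed")
        rules.append(loader())
    return rules
```

In a real project the `discovered` mapping could plausibly be built from `importlib.metadata.entry_points(group="zenzic.rules")`; plugins that are installed but absent from the allowlist are simply never loaded.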
Parallel Phase B optimization
  • In the parallel + validate links path, Phase B now performs link collection without re-running rule checks.
  • This removes redundant compute and reduces overhead on larger documentation trees.
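
A rough sketch of the idea, assuming a per-document scan function with a switch for rule execution. `scan_text`, `run_rules`, and the simplified link regex are hypothetical names for illustration and do not reflect the scanner's real internals.

```python
import re
from typing import Callable

# Simplified inline-link pattern; real Markdown link parsing is more involved.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")

RuleFn = Callable[[str], list[str]]


def scan_text(
    text: str, rules: list[RuleFn], *, run_rules: bool = True
) -> tuple[list[str], list[str]]:
    """Return (findings, link targets) for one document."""
    findings: list[str] = []
    if run_rules:
        for rule in rules:
            findings.extend(rule(text))
    links = LINK_RE.findall(text)
    return findings, links


calls: list[int] = []  # records each rule invocation


def no_todo(text: str) -> list[str]:
    calls.append(1)
    return ["TODO found"] if "TODO" in text else []


doc = "TODO: see [guide](guide.md) and [site](https://example.com)"
phase_a = scan_text(doc, [no_todo])                   # rules run here
phase_b = scan_text(doc, [no_todo], run_rules=False)  # link collection only
```

After both calls the rule has been invoked once, not twice, which is the saving the optimization targets.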
Robustness and UX improvements
  • Added clear fail-fast validation for worker values.
  • Updated scanner documentation text to match actual execution behavior and constraints.
  • Fixed final typing issue detected by mypy in preflight.
Quality and release readiness
  • Full preflight pipeline passes.
  • Test suite passes with coverage above required threshold.
  • Examples were verified end-to-end and classified by expected behavior.
  • Changelog and release-track messaging were aligned:
      • 0.4.x documented as abandoned exploratory line.
      • 0.5.x documented as active stabilization line.
Configuration and behavior impact
  • New plugin allowlist behavior enables strict user control over third-party rule activation.
  • Existing projects without plugin configuration continue to work with core rules enabled.
  • Invalid worker values now return immediate, clear validation errors.
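
Fail-fast worker validation might look like the sketch below. The accepted values (`None` for automatic selection, otherwise an integer of at least 1) and the function name `validate_workers` are assumptions; the PR description does not state the exact contract.

```python
from typing import Optional


def validate_workers(workers: object) -> Optional[int]:
    """Validate a worker count before any scanning starts.

    Assumed contract: None means "choose automatically"; any other value
    must be an integer >= 1. bool is rejected even though it subclasses int.
    """
    if workers is None:
        return None
    if isinstance(workers, bool) or not isinstance(workers, int):
        raise ValueError(
            f"workers must be a positive integer or None, got {workers!r}"
        )
    if workers < 1:
        raise ValueError(f"workers must be >= 1, got {workers}")
    return workers
```

Validating at the entry point means a bad value surfaces as one clear message rather than as an obscure failure inside a process pool.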
Validation performed
  • nox preflight: pass
  • ruff check and format check: pass
  • mypy: pass
  • pytest: pass
  • coverage threshold: pass
Examples verification matrix
| Example | Result | Expected | Notes |
|---|---|---|---|
| examples/broken-docs | Fail | Fail | Negative fixture with intentionally broken and unsafe links |
| examples/i18n-standard | Pass | Pass | Healthy i18n fixture |
| examples/mkdocs-basic | Pass | Pass | Healthy MkDocs baseline fixture |
| examples/security_lab | Fail | Fail | Security fixture with traversal and absolute-path violations |
| examples/vanilla | Pass | Pass | Healthy engine-agnostic fixture |
| examples/zensical-basic | Pass | Pass | Healthy Zensical fixture |
Examples status summary:

  • Expected Pass confirmed: 4 out of 4
  • Expected Fail confirmed: 2 out of 2
Why this PR matters
This change set converts the Sentinel release promises into executable behavior:

  • Safe Harbor is real, not only documented.
  • Parallel scanning is leaner under link validation.
  • Error handling is clearer for users and CI operators.
  • Release gates are green with verified fixture behavior across examples.

Body commit:

  • enable plugin-driven scanning with safe-harbor allowlist via plugins config
  • keep core rules always active with robust fallback when entry points are unavailable
  • optimize parallel Phase B by skipping rule engine re-run during link collection
  • validate workers input early with clear error messages and aligned docstrings
  • add and update tests for plugin loading, config parsing, and worker validation
  • fix mypy typing issue in scanner to pass nox preflight
  • verify examples matrix: positive fixtures pass, negative security/broken fixtures fail as expected

Copilot AI left a comment

Pull request overview

This PR finalizes the v0.5.0a1 “Sentinel” milestone by introducing a plugin-enabled rule system (with an explicit allowlist), unifying sequential/parallel scanning into a single adaptive entry point, and extending configuration loading to support [tool.zenzic] in pyproject.toml.

Changes:

  • Replaced the legacy RuleEngine and separate scan entry points with AdaptiveRuleEngine and a unified scan_docs_references(...) -> (reports, link_errors) API (adaptive sequential/parallel + optional link validation + telemetry).
  • Added plugin discovery/loading via the zenzic.rules entry-point group, plus plugins = [...] allowlisting in config and a new zenzic plugins list CLI command.
  • Added pyproject.toml config fallback ([tool.zenzic]) and updated docs/tests/changelog for the new behavior and breaking API changes.

Reviewed changes

Copilot reviewed 33 out of 34 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| uv.lock | Bumps project version and constrains httpx to <1.0. |
| tests/test_rules.py | Updates rule engine tests for AdaptiveRuleEngine, adds plugin/contract coverage. |
| tests/test_references.py | Updates tests for new scan_docs_references tuple return and unified link-validation path. |
| tests/test_parallel.py | Migrates parallel tests to unified adaptive API; adds workers validation coverage. |
| tests/test_integration_finale.py | Adds integration coverage for plugin listing CLI and telemetry emission. |
| tests/test_config.py | Adds config parsing tests for plugins and pyproject.toml support. |
| tests/test_cli.py | Updates CLI tests to patch the unified scan entry point. |
| src/zenzic/models/references.py | Updates documentation text to reference AdaptiveRuleEngine. |
| src/zenzic/models/config.py | Adds plugins field and implements pyproject.toml fallback loading. |
| src/zenzic/main.py | Registers new plugins Typer sub-app. |
| src/zenzic/core/scanner.py | Unifies scan entry point, adds adaptive parallel mode, telemetry, plugin-aware rule engine construction. |
| src/zenzic/core/rules.py | Introduces AdaptiveRuleEngine, eager pickle validation, and plugin registry/discovery helpers. |
| src/zenzic/core/exceptions.py | Adds PluginContractError. |
| src/zenzic/cli.py | Switches CLI to unified scan API; adds zenzic plugins list command. |
| src/zenzic/__init__.py | Version bump to 0.5.0a1. |
| README.md | Updates highlights and configuration-loading documentation for v0.5.0a1. |
| README.it.md | Mirrors the v0.5.0a1 highlights and release-track messaging (Italian). |
| pyproject.toml | Version bump; adds zenzic.rules entry point for broken-links; constrains httpx. |
| docs/usage/commands.md | Documents zenzic plugins list. |
| docs/usage/advanced.md | Updates programmatic API docs for unified scan entry point + hybrid adaptive engine. |
| docs/it/usage/commands.md | Documents zenzic plugins list (Italian). |
| docs/it/usage/advanced.md | Updates advanced usage docs for unified scan entry point (Italian). |
| docs/it/developers/plugins.md | Adds Italian plugin-authoring contract documentation. |
| docs/it/developers/index.md | Adds link to plugin-authoring docs (Italian). |
| docs/it/configuration/index.md | Documents config priority chain incl. pyproject.toml fallback (Italian). |
| docs/it/architecture.md | Documents hybrid adaptive engine behavior/diagram (Italian). |
| docs/it/about/index.md | Adds changelog link and release-track messaging (Italian). |
| docs/developers/plugins.md | Adds plugin-authoring contract documentation. |
| docs/developers/index.md | Adds link to plugin-authoring docs. |
| docs/configuration/index.md | Documents config priority chain incl. pyproject.toml fallback. |
| docs/architecture.md | Replaces old parallel-scan section with hybrid adaptive engine behavior/diagram. |
| docs/about/index.md | Adds changelog link and release-track messaging. |
| CHANGELOG.md | Adds v0.5.0a1 entry, including breaking change notes and new features. |
| CHANGELOG.it.md | Removes the Italian changelog file. |

source="broken-links",
origin="zenzic",
)
)

Copilot AI Apr 2, 2026

PluginRegistry.list_rules() documents/assumes results are sorted by entry-point name (and tests assert this), but when the core fallback "broken-links" is appended it is not re-sorted. If entry-points are discoverable but Zenzic’s own distribution metadata isn’t (e.g., running from source while third-party plugins are installed), this will return an unsorted list and make CLI output / API ordering non-deterministic. Consider sorting again before returning (or inserting the fallback in order).

Suggested change
)
)
# Ensure deterministic ordering: sorted by source name, as documented.
results.sort(key=lambda r: r.source)
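
The ordering problem and the suggested fix can be reproduced in isolation. The sketch below is a simplified model, not the actual `PluginRegistry.list_rules()`: `RuleInfo`, `core_visible`, and `resort` are hypothetical names, while `source="broken-links"` and `origin="zenzic"` mirror the quoted code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleInfo:
    """Hypothetical listing entry with the two fields shown in the diff."""

    source: str
    origin: str


def list_rules(
    discovered: list[RuleInfo], core_visible: bool, resort: bool = True
) -> list[RuleInfo]:
    results = sorted(discovered, key=lambda r: r.source)
    if not core_visible:
        # Core fallback appended after the initial sort.
        results.append(RuleInfo(source="broken-links", origin="zenzic"))
    if resort:
        # Suggested fix: restore the documented ordering before returning.
        results.sort(key=lambda r: r.source)
    return results


plugins = [RuleInfo("zeta", "third-party"), RuleInfo("acme", "third-party")]
before = [r.source for r in list_rules(plugins, core_visible=False, resort=False)]
after = [r.source for r in list_rules(plugins, core_visible=False)]
```

Here `before` ends with the appended fallback out of order, while `after` is sorted by source name as documented.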

Comment on lines +743 to +767
 def _build_rule_engine(config: ZenzicConfig) -> AdaptiveRuleEngine | None:
     """Construct a :class:`~zenzic.core.rules.AdaptiveRuleEngine` from the config.

-    Returns ``None`` when no custom rules are configured, avoiding the
-    overhead of engine construction on projects that do not use the feature.
+    Load order is deterministic:
+
+    1. Core rules registered by Zenzic itself (always enabled).
+    2. Regex rules from ``[[custom_rules]]``.
+    3. External plugin rules explicitly listed in ``plugins = [...]``.
+
+    Returns ``None`` when no rules are available.
     """
-    from zenzic.core.rules import CustomRule  # deferred to keep import graph clean
+    from zenzic.core.rules import CustomRule, PluginRegistry  # deferred to keep import graph clean

-    if not config.custom_rules:
-        return None
-    rules = [
+    registry = PluginRegistry()
+    rules = registry.load_core_rules()
+    rules.extend(
         CustomRule(
             id=cr.id,
             pattern=cr.pattern,
             message=cr.message,
             severity=cr.severity,
         )
         for cr in config.custom_rules
-    ]
-    return RuleEngine(rules)

+    )
+    rules.extend(registry.load_selected_rules(config.plugins))

Copilot AI Apr 2, 2026

_build_rule_engine() always loads core rules, and currently the only core rule is VSMBrokenLinkRule whose check() is a no-op (it only works via check_vsm). With no custom_rules/plugins configured, this still constructs an AdaptiveRuleEngine and causes _scan_single_file() to do an extra md_file.read_text() + engine.run() for every file, but it can never produce any findings. Consider returning None when there are no non-VSM rules to run in this pipeline (e.g., when config.custom_rules and config.plugins are empty), or alternatively running VSM-aware rules via run_vsm in a VSM-capable pipeline so core rules provide value.
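
The first option in this comment, returning `None` when nothing but the no-op core rule would run, could be sketched like this. `SketchConfig` and `build_engine_or_none` are hypothetical, and a plain list of rule names stands in for the engine object.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical subset of the config: only the fields the guard inspects."""

    custom_rules: list[str] = field(default_factory=list)
    plugins: list[str] = field(default_factory=list)


def build_engine_or_none(
    config: SketchConfig, core_rules: list[str]
) -> Optional[list[str]]:
    """Skip engine construction when only VSM-only core rules would be present."""
    if not config.custom_rules and not config.plugins:
        # Nothing in this pipeline can produce findings, so the caller can
        # skip the extra per-file read and engine run entirely.
        return None
    return [*core_rules, *config.custom_rules, *config.plugins]
```

The guard is only valid while every core rule is VSM-only; a core rule with a real text-level check would need to be accounted for in the condition.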

Comment on lines +858 to +860
**O(N) reads:** each file is read exactly once in sequential mode. In
parallel mode external URL registration runs a lightweight sequential pass
in the main process after workers complete (workers discard scanners).

Copilot AI Apr 2, 2026

scan_docs_references() docstring claims each file is read exactly once in sequential mode, but ReferenceScanner.harvest() already performs two independent passes over the file (Shield stream + content stream), and rule execution can add an additional read_text() pass. This docstring should be updated to reflect the actual read behaviour so users can reason about performance accurately (especially for the parallel+validate_links path which adds another scan pass).

Suggested change
-    **O(N) reads:** each file is read exactly once in sequential mode. In
-    parallel mode external URL registration runs a lightweight sequential pass
-    in the main process after workers complete (workers discard scanners).
+    **Read behaviour:** total I/O remains :math:`O(N)` in the number of files,
+    but individual files may be read multiple times. In sequential mode the
+    scanner typically performs separate Shield and content passes, and some
+    rules may trigger an additional ``read_text()`` call. In parallel mode the
+    same per-worker behaviour applies; when ``validate_links=True`` an extra
+    lightweight sequential pass in the main process registers external URLs
+    after workers complete (workers discard scanners).

@PythonWoods-Dev
Contributor Author

superseded by release/v0.5.0a1-final-v2

@PythonWoods-Dev PythonWoods-Dev deleted the release/v0.5.0a1-final branch April 2, 2026 17:27