A security gate tool for Python projects that integrates multiple security scanners into a unified CI/CD pipeline with policy-based enforcement and baseline management.
- Multi-Tool Integration: Run Bandit, pip-audit, and Semgrep from a single command
- Policy-Based Enforcement: Configure severity thresholds for pass/warn/fail
- Baseline Management: Suppress known low/medium findings while enforcing new issues
- Structured Output: JSON reports for automation and auditing
- Configuration File Support: Define project security policies in `.sentinel.toml`
- Flexible Exclusions: Customize which files/directories to skip
- Tool Validation: Check if required tools are installed before running
- Detailed Logging: Configurable log levels for debugging
- Zero Dependencies: Core runs on Python stdlib (tools installed separately)
- ML-Based False Positive Reduction: Intelligent scoring to predict which findings are likely false positives
Sentinel can use machine learning to predict which findings are likely false positives, helping you focus on real security issues.
```shell
# Use heuristic scoring (no model or dependencies needed)
sentinel --ml-enabled

# Use a trained model
sentinel --ml-enabled --ml-model-path models/my-model.json
```

The ML scorer analyzes each finding using 25+ features:
File Path Signals:
- Test files (`tests/`, `*_test.py`)
- Scripts and tools (`scripts/`, `bin/`)
- Migrations (`migrations/`, `alembic/`)
- Example/demo code
- Vendor/third-party code
Code Pattern Detection:
- User input sources (`request.`, `sys.argv`)
- Shell execution (`shell=True`, `os.system`)
- Dangerous functions (`eval`, `exec`)
- Hardcoded secrets (`password =`, `api_key =`)
- SQL queries with string formatting
- File operations, network calls, crypto usage
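To make the feature-based approach concrete, here is a minimal sketch of how path and code-pattern signals like those above could combine into a heuristic score. The feature names, weights, and scoring formula are illustrative assumptions, not Sentinel's actual internals:

```python
import re

# Illustrative heuristic features: (name, predicate over (path, snippet), weight).
# Negative weights push toward "likely false positive", positive toward "real issue".
FEATURES = [
    ("is_test_file", lambda p, s: bool(re.search(r"(^|/)tests?/|_test\.py$", p)), -0.3),
    ("is_migration", lambda p, s: "/migrations/" in p or "/alembic/" in p, -0.2),
    ("is_vendor", lambda p, s: "/vendor/" in p or "/third_party/" in p, -0.2),
    ("uses_shell_true", lambda p, s: "shell=True" in s, 0.3),
    ("dangerous_call", lambda p, s: bool(re.search(r"\b(eval|exec)\(", s)), 0.3),
    ("user_input", lambda p, s: "request." in s or "sys.argv" in s, 0.2),
]

def score_finding(path: str, snippet: str) -> tuple[float, list[dict]]:
    """Return a pseudo-probability that the finding is real, plus explanations."""
    total, reasons = 0.5, []  # start from a neutral prior
    for name, pred, weight in FEATURES:
        if pred(path, snippet):
            total += weight
            reasons.append({"feature": name, "contribution": weight})
    return max(0.0, min(1.0, total)), reasons
```

Because each triggered feature records its contribution, the per-finding score is explainable in exactly the `ml_reason` style shown in the report output below.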
Report Output:

```json
{
  "ml_score": 0.234,
  "ml_label": "likely_fp",
  "ml_confidence": 0.532,
  "ml_reason": [
    {"feature": "is_test_file", "contribution": -0.3},
    {"feature": "severity_high_critical", "contribution": 0.2}
  ],
  "model_type": "heuristic"
}
```

You can train models on your own labeled data:
- Label findings as true/false positives in `my_training_data.json`:

```json
{
  "findings": [
    {
      "finding": {...},
      "label": false,
      "code_snippet": "assert user.is_authenticated"
    }
  ]
}
```

- Train the model (requires `scikit-learn`):

```shell
pip install ".[ml]"
python examples/train_ml_model.py
```

- Use your model:

```shell
sentinel --ml-enabled --ml-model-path models/sentinel-ml-model.json
```

Add to `.sentinel.toml`:

```toml
[ml]
enabled = true
model_path = "models/sentinel-ml-model.json"  # Optional
```

- Explainable: See top contributing features for each score
- Lightweight: Heuristic mode works without dependencies
- Trainable: Learn from your team's historical data
- Optional: ML is off by default, fully opt-in
See docs/ml_scoring.md for details.
```shell
pip install -e .
```

Sentinel requires external security tools to be installed:

```shell
# Install security scanners
pip install bandit pip-audit semgrep

# Or install specific versions
pip install bandit==1.7.5 pip-audit==2.6.1 semgrep==1.45.0
```

Scan the current directory:

```shell
sentinel
```

Scan a specific directory:

```shell
sentinel /path/to/repo
```

Select specific tools:

```shell
sentinel --tools bandit,pip-audit
```

Create a baseline from current findings:

```shell
sentinel --write-baseline sentinel-baseline.json
```

Run a scan with baseline filtering:

```shell
sentinel --baseline sentinel-baseline.json
```

Only new findings (or existing high/critical) will cause failures.
Create `.sentinel.toml` in your project root:

```toml
[policy]
fail_on = ["high", "critical"]
warn_on = ["medium"]
tools = ["bandit", "pip-audit", "semgrep"]
exclusions = ["vendor", "third_party"]
baseline_path = "sentinel-baseline.json"
report_path = "sentinel-report.json"
log_level = "INFO"
```

See `.sentinel.toml.example` for a complete example.
```
sentinel [path] [options]

Positional Arguments:
  path                   Path to repository (default: current directory)

Options:
  --baseline PATH        Path to baseline JSON file for filtering
  --write-baseline PATH  Create baseline file and exit
  --out PATH             Output report path (default: sentinel-report.json)
  --tools TOOLS          Comma-separated tools to run (default: all)
  --config PATH          Path to config file (default: .sentinel.toml)
  --exclude PATTERN      Exclusion pattern (can be repeated)
  --log-level LEVEL      Logging level (DEBUG, INFO, WARNING, ERROR)
  --skip-tool-check      Skip checking if external tools are installed
```
Sentinel uses exit codes to integrate with CI/CD:
- 0: Passed (no findings above threshold)
- 1: Passed with warnings (medium findings only)
- 2: Failed (high or critical findings present)
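If you wrap Sentinel in your own CI script rather than relying on the shell's exit status directly, the documented codes map cleanly to a gate decision. A small sketch; `run_gate` assumes `sentinel` is installed and on PATH:

```python
import subprocess

# Sentinel's documented exit codes mapped to a gate decision.
STATUSES = {0: "passed", 1: "passed_with_warnings", 2: "failed"}

def interpret(returncode: int) -> str:
    # Any undocumented code (e.g. a crash) is treated as a hard error.
    return STATUSES.get(returncode, "error")

def run_gate(repo: str = ".") -> str:
    """Run Sentinel on a repo and translate its exit code."""
    proc = subprocess.run(["sentinel", repo])
    return interpret(proc.returncode)
```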
Configure thresholds in `.sentinel.toml`:

```toml
[policy]
fail_on = ["critical"]        # Only critical findings fail
warn_on = ["high", "medium"]  # High and medium trigger warnings
```

The JSON report has the following shape:

```json
{
  "repo_path": "/path/to/repo",
  "generated_at": "2024-01-15T10:30:00Z",
  "findings": [
    {
      "tool": "bandit",
      "severity": "high",
      "title": "B602",
      "path": "app/shell.py",
      "line": 42,
      "message": "subprocess call with shell=True",
      "metadata": {
        "test_id": "B602",
        "confidence": "high"
      },
      "fingerprint": "a1b2c3d4e5f6g7h8"
    }
  ],
  "counts": {
    "low": 5,
    "medium": 2,
    "high": 1,
    "critical": 0
  }
}
```

The console summary looks like:

```
Sentinel summary:
critical: 0
high: 1
medium: 2
low: 5
Report: sentinel-report.json

Top high findings:
- B602 | app/shell.py:42 | a1b2c3d4e5f6g7h8
```
Baselines allow you to acknowledge existing technical debt while preventing new issues:
- Create Baseline: Captures fingerprints of current low/medium findings
- Filter on Scan: Suppresses baselined low/medium findings
- Never Suppress High/Critical: High and critical findings always surface
```shell
# Scan and create baseline
sentinel --write-baseline baseline.json

# Commit to version control
git add baseline.json
git commit -m "Add security baseline"

# Run with baseline filtering
sentinel --baseline baseline.json
```

Or configure in `.sentinel.toml`:

```toml
baseline_path = "sentinel-baseline.json"
```

- Baseline low/medium findings during initial adoption
- Gradually fix baselined issues over time
- Never baseline high/critical findings (they always fail)
- Review baseline changes in pull requests
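The filtering rules above can be sketched in a few lines. The baseline file shape used here (a `fingerprints` list) is an assumption for illustration; only the rule itself, suppress baselined low/medium but never high/critical, comes from the behavior documented above:

```python
import json

NEVER_SUPPRESS = {"high", "critical"}

def load_baseline(path: str) -> set[str]:
    """Assumed baseline shape: {"fingerprints": ["...", ...]}."""
    with open(path) as f:
        return set(json.load(f).get("fingerprints", []))

def filter_findings(findings: list[dict], baseline: set[str]) -> list[dict]:
    """Drop baselined low/medium findings; high/critical always surface."""
    kept = []
    for finding in findings:
        if (finding["severity"] in NEVER_SUPPRESS
                or finding["fingerprint"] not in baseline):
            kept.append(finding)
    return kept
```

Because fingerprints are deterministic, the same known finding is suppressed consistently across runs while anything new falls through the filter.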
Bandit scans Python code for common security issues:

```shell
# Runs: bandit -r . -f json --quiet
```

Default exclusions: `.venv`, `.git`, `__pycache__`, etc.

pip-audit scans Python dependencies for known vulnerabilities:

```shell
# Runs: pip-audit -r requirements.txt -f json
```

Looks for `requirements.txt` or `requirements-dev.txt`.

Semgrep provides pattern-based static analysis:

```shell
# Runs: semgrep scan --config=auto --json --quiet
```

Uses Semgrep Registry rules (requires network access).
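Per the changelog, Sentinel can also run tools as Python modules when their executables are not on PATH. A sketch of what that resolution logic could look like; the dash-to-underscore module-name mapping is an illustrative assumption:

```python
import shutil
import sys

def build_command(tool: str, args: list[str]) -> list[str]:
    """Prefer the tool's console script; fall back to `python -m <module>`."""
    if shutil.which(tool):
        return [tool, *args]
    # e.g. pip-audit's importable module would be pip_audit (assumed mapping)
    module = tool.replace("-", "_")
    return [sys.executable, "-m", module, *args]
```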
GitHub Actions:

```yaml
name: Security Gate

on: [push, pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install tools
        run: |
          pip install sentinel-pipeline
          pip install bandit pip-audit semgrep
      - name: Run Sentinel
        run: sentinel --baseline sentinel-baseline.json
      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: security-report
          path: sentinel-report.json
```

GitLab CI:

```yaml
security_gate:
  stage: test
  image: python:3.11
  before_script:
    - pip install sentinel-pipeline bandit pip-audit semgrep
  script:
    - sentinel --baseline sentinel-baseline.json
  artifacts:
    when: always
    paths:
      - sentinel-report.json
```

```shell
# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=sentinel

# Format code
ruff format .

# Lint code
ruff check .
```

See docs/architecture.md for detailed architecture documentation.
- JSON as Source of Truth: All decisions derive from structured output
- Fail Loudly: Tool errors become high-severity findings
- Protocol-Based Runners: Easy to add new scanners
- Deterministic Fingerprints: Enable reproducible baselining
- Zero Network Calls: All operations are local (except tools themselves)
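The protocol-based runner principle can be illustrated with `typing.Protocol`; the interface and class names below are hypothetical, not Sentinel's actual API:

```python
from typing import Protocol

class Runner(Protocol):
    """Anything with a name and a run() returning normalized findings plugs in."""
    name: str
    def run(self, repo_path: str) -> list[dict]: ...

class EchoRunner:
    """Trivial conforming runner; a real one would invoke a scanner and parse JSON."""
    name = "echo"
    def run(self, repo_path: str) -> list[dict]:
        return [{"tool": self.name, "severity": "low",
                 "message": f"scanned {repo_path}"}]

def run_all(runners: list[Runner], repo_path: str) -> list[dict]:
    """Aggregate findings from every runner into one normalized list."""
    findings: list[dict] = []
    for runner in runners:
        findings.extend(runner.run(repo_path))
    return findings
```

Because the protocol is structural, adding a new scanner means writing one class with a `name` and a `run()`, with no registration machinery or base-class inheritance required.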
- Architecture: System design and components
- Design Decisions: Rationale for key choices
- Threat Model: Security considerations
Contributions welcome! Please:
- Add tests for new features
- Update documentation
- Follow existing code style
- Ensure all tests pass
MIT License - see LICENSE file for details
- Added ML-based false positive prediction system
- Heuristic scoring with 25+ features (no dependencies required)
- Optional trained model support using logistic regression
- Explainable predictions with feature contributions
- ML scoring integration in CLI and reports
- Training infrastructure with example data
- Comprehensive ML documentation
- Support for running tools as Python modules when not in PATH
- Enhanced tool validation to check both PATH and module execution
- Added configuration file support (`.sentinel.toml`)
- Added tool availability validation
- Added path validation in CLI
- Made exclusions configurable
- Added structured logging system
- Improved error messages with better context
- Added comprehensive test coverage (53 tests)
- Fixed duplicate code in semgrep runner
- Completed architecture and threat model documentation
- Initial release
- Basic scanner integration (Bandit, pip-audit, Semgrep)
- Baseline management
- JSON reports
- Policy-based exit codes