feat: Integrate Bandit security scanner as pre-commit hook for automated Python code analysis #2417

@khushal-winner

Description

Is your feature request related to a problem? Please describe.
Currently, Python scripts in the scripts/ folder (e.g. convert.py, check_translations.py, capec_map_enricher.py) are not automatically scanned for common security issues during development or CI.
This means potential vulnerabilities like unsafe YAML deserialization (yaml.load without safe loader), dangerous subprocess calls, hardcoded secrets patterns, or string-based SQL injection risks could be introduced and go unnoticed until manual review or production issues occur.
Recent PRs like #2406 (switching to FAILSAFE_SCHEMA) show the value of proactive security hardening, but we need automated enforcement to catch similar issues early.

Describe the solution you'd like
Add Bandit (https://github.com/PyCQA/bandit) — a fast, AST-based security linter for Python — as a pre-commit hook in the existing .pre-commit-config.yaml.
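To make "AST-based" concrete, here is a minimal sketch of the kind of check such a linter performs. It is not Bandit's implementation: the function name and the sample source are made up for illustration, and the rule is deliberately simpler than Bandit's real yaml.load check. It only parses the source text, so PyYAML does not need to be installed.

```python
import ast


def find_unsafe_yaml_load(source: str) -> list[int]:
    """Return line numbers of yaml.load(...) calls that pass no Loader argument."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_yaml_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "yaml"
        )
        if not is_yaml_load:
            continue
        # A Loader may be given positionally (second argument) or by keyword.
        has_loader = len(node.args) >= 2 or any(
            kw.arg == "Loader" for kw in node.keywords
        )
        if not has_loader:
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical source text; it is parsed, never executed.
SAMPLE = '''import yaml
data = yaml.load(open("a.yml"))
safe = yaml.safe_load(open("a.yml"))
ok = yaml.load(open("a.yml"), Loader=yaml.SafeLoader)
'''

print(find_unsafe_yaml_load(SAMPLE))
```

Because the check walks the syntax tree rather than matching text, it ignores yaml.safe_load and the call that passes a Loader, and reports only the bare yaml.load call.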

This would:

  • Run automatically on git commit for changed .py files in scripts/
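  • Flag high-confidence issues such as unsafe YAML loading, exec/eval, and pickle use (Bandit rates several of these as medium severity, so the severity threshold needs to be medium or lower for them to be reported)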
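  • Be fast (expected to take under a second on typical changes), so it does not noticeably slow down commits; a reported finding still stops the commit until it is fixed or explicitly suppressed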
  • Allow easy configuration (e.g. severity thresholds, exclusions for tests/)

Proposed hook addition to .pre-commit-config.yaml:

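- repo: https://github.com/PyCQA/bandit
  rev: 1.7.9  # example pin; use the latest stable tag when opening the PR
  hooks:
    - id: bandit
      name: "Bandit security scan"
      # pre-commit passes the staged file paths to the hook, so "-r scripts/" is not needed
      # medium severity: Bandit rates yaml.load, exec/eval and pickle findings as medium
      args: ["--severity-level", "medium", "--confidence-level", "high", "--format", "txt"]
      files: ^scripts/.*\.py$
      # no effect while files is limited to scripts/; kept in case the scope widens
      exclude: ^tests/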
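As a rough sketch of how the files and exclude patterns decide which staged paths reach the hook: pre-commit applies both as regular-expression searches against each path. The helper name and every staged path except scripts/convert.py are hypothetical, and this is an emulation for illustration rather than pre-commit's own code.

```python
import re


def select_files(staged, files, exclude):
    """Keep paths that match `files` and do not match `exclude` (both via re.search)."""
    return [
        path
        for path in staged
        if re.search(files, path) and not re.search(exclude, path)
    ]


# Hypothetical staged paths; only scripts/convert.py is named in this issue.
STAGED = ["scripts/convert.py", "docs/build.py", "tests/test_convert.py", "README.md"]

# A bare \.py$ pattern matches Python files anywhere outside tests/.
broad = select_files(STAGED, files=r"\.py$", exclude=r"^tests/")
# Anchoring the pattern to scripts/ limits the scan to that folder.
narrow = select_files(STAGED, files=r"^scripts/.*\.py$", exclude=r"^tests/")

print(broad)
print(narrow)
```

The comparison suggests that limiting the scan to scripts/ is best done in the files pattern itself, since a bare \.py$ pattern also lets Python files from other folders through.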

Describe alternatives you've considered

  1. Relying on existing pylint (already in pre-commit)
    → Pylint catches some general issues, but it is not security-focused. It misses dedicated vulnerability patterns such as unsafe yaml.load (B506), exec/eval (B102/B307), pickle use (B301), or subprocess calls with shell=True (B602). Bandit's AST-based checks are purpose-built for these.

  2. Running Bandit (or similar) only in CI (GitHub Actions)
    → This is a good backup and would block bad merges, but it doesn't prevent developers from committing vulnerable code in the first place. Pre-commit catches issues at the earliest stage (before even pushing), saving reviewer time and reducing noise in PRs.

  3. Using heavier / more advanced security tools (e.g. Semgrep, Snyk, Trivy, or SonarQube)
    → Semgrep is powerful for custom rules but is typically slower and more complex to set up. Snyk requires an account, Trivy targets dependency, container, and IaC scanning rather than Python source linting, and SonarQube needs a server. Bandit is lightweight, Python-native, fast (expected <1s on typical changes), and already widely used in security-focused projects. That makes it a good first step: the only new dependency is the hook itself, which pre-commit installs in its own isolated environment.

  4. Doing nothing / waiting for more manual hardening PRs
    → This risks subtle vulnerabilities accumulating in scripts/ (especially as features grow, e.g. more external calls in the converter). Proactive automation aligns better with OWASP's security ethos.
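For reference, the patterns named in alternative 1 usually have straightforward safer equivalents. The sketch below uses only the standard library; the function names are hypothetical and nothing here is taken from the scripts/ folder. Note that Bandit may still report low-severity informational findings for any use of subprocess, which would fall below a medium or high threshold.

```python
import ast
import subprocess
import sys


def parse_literal(text: str):
    """Parse a Python literal without executing code (an alternative to eval)."""
    return ast.literal_eval(text)


def run_python_snippet(code: str) -> str:
    """Run a child process from an argument list, with no shell involved."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(parse_literal("[1, 2, 3]"))
print(run_python_snippet("print(6 * 7)"))
```

ast.literal_eval rejects anything that is not a plain literal, and passing an argument list without shell=True avoids shell interpretation of the command string.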

Additional context

  • Current pre-commit already includes gitleaks (secrets), shellcheck (shell), and pylint (general Python quality) — adding Bandit fills the dedicated Python security linting gap without overlap.
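  • Negligible expected performance impact: Bandit is a static AST check and, as a pre-commit hook, runs only on staged .py files. The <1s figure is an estimate that the trial scan should confirm.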
  • Complements recent security PRs such as #2406 (fix: explicitly use FAILSAFE_SCHEMA for yaml.load() security hardening) by enforcing similar best practices automatically.
  • If approved, happy to:
    • Open PR with exact .pre-commit-config.yaml update
    • Add optional bandit.yml for custom skips/rules
    • Update README "Development Setup" section
    • Run trial scans and share results
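For sharing trial-scan results, one option is to have Bandit write a JSON report (for example with bandit -r scripts/ -f json -o bandit-report.json) and condense it with a short script. The sketch below assumes the usual fields of Bandit's JSON output (results, filename, line_number, test_id, issue_severity). The sample report is invented for illustration and is not a real scan of this repository.

```python
import json
from collections import Counter


def summarize(report_text: str):
    """Return (count per severity, one summary line per finding) for a Bandit JSON report."""
    results = json.loads(report_text).get("results", [])
    by_severity = Counter(item["issue_severity"] for item in results)
    lines = [
        f'{item["filename"]}:{item["line_number"]} {item["test_id"]} {item["issue_severity"]}'
        for item in results
    ]
    return by_severity, lines


# Made-up findings, shaped like Bandit's JSON output; not real scan results.
SAMPLE_REPORT = json.dumps(
    {
        "results": [
            {
                "filename": "scripts/convert.py",
                "line_number": 10,
                "test_id": "B506",
                "issue_severity": "MEDIUM",
            },
            {
                "filename": "scripts/convert.py",
                "line_number": 42,
                "test_id": "B602",
                "issue_severity": "HIGH",
            },
        ]
    }
)

counts, summary = summarize(SAMPLE_REPORT)
print(dict(counts))
for line in summary:
    print(line)
```

A per-severity count like this would also help in choosing the threshold for the hook before it starts rejecting commits.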
