Security scanner for the LLM fine-tuning lifecycle
Detect backdoors in datasets, malicious LoRA adapters, data poisoning, and model weight tampering across the entire fine-tuning pipeline — from base model selection through deployment.
Built for security engineers, ML engineers, and auditors securing LLM fine-tuning workflows against supply chain attacks, backdoor injection, and safety alignment bypass.
- Features
- Installation
- Quick Start
- CLI Commands
- Advanced Usage
- Architecture
- Ecosystem Integration
- Related Projects
- Academic Foundation
- Development
- License

Features

- Dataset Poisoning Detection — Trigger patterns, label flipping, role imbalance, duplicate injection, statistical anomalies (Shannon entropy, length distributions)
- LoRA Backdoor Detection — Suspicious target modules (embed, lm_head), extreme weight values, trigger keywords in config, layer modification analysis
- Weight Integrity Verification — SHA-256 manifests, model fingerprinting, format validation, symlink detection, tamper detection
- Pipeline Audit — Hardcoded credential scanning, unsafe model loading patterns, suspicious imports, environment variable exposure, dependency pinning checks
- Supply Chain Analysis — Base model provenance, fine-tuning records, deployment configuration review, model card validation
- Compliance Checks — NIST AI RMF 1.0 (12 checks across Govern/Map/Measure/Manage), OWASP LLM Top 10 (10 categories)
- Multiple Output Formats — Rich console, structured JSON, Jinja2 HTML with auto-escaped templates
- Configurable — Adjustable file size limits, scan timeout, and statistical outlier thresholds
- MITRE ATLAS Mapped — All findings mapped to MITRE ATLAS v2 techniques (25 entries)
- Ecosystem Integrations — MCPGuard policy export, mcp-taxonomy adapter for MCPscop dashboards
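
The LoRA target-module check listed above can be pictured with a short sketch. It assumes a PEFT-style `adapter_config.json` with `target_modules` and `modules_to_save` keys. The function name `flag_suspicious_modules` and the keyword tuple are illustrative assumptions, not AIShield's actual API or rule set.

```python
import json
import tempfile
from pathlib import Path

# Assumption: substrings treated as suspicious, taken from the feature list
# above (embedding and output-head layers). Not AIShield's real rule set.
SUSPICIOUS_KEYWORDS = ("embed", "lm_head")


def flag_suspicious_modules(config_path):
    """Return sorted module names in a PEFT-style config that touch embeddings or the LM head."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    modules = []
    for key in ("target_modules", "modules_to_save"):
        value = config.get(key) or []
        # PEFT also accepts a single string here, so normalize to a list.
        if isinstance(value, str):
            value = [value]
        modules.extend(value)
    return sorted({m for m in modules if any(k in m.lower() for k in SUSPICIOUS_KEYWORDS)})


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "adapter_config.json"
        sample = {
            "peft_type": "LORA",
            "target_modules": ["q_proj", "v_proj", "embed_tokens"],
            "modules_to_save": ["lm_head"],
        }
        path.write_text(json.dumps(sample), encoding="utf-8")
        print(flag_suspicious_modules(path))  # prints ['embed_tokens', 'lm_head']
```

A hit here is only a signal for review: some legitimate fine-tunes do train embeddings or the output head, for example when new tokens are added to the vocabulary.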

Installation

pip install aishield-scanner

With optional PyTorch support for deep weight analysis:

pip install "aishield-scanner[torch]"

For development:

git clone https://github.com/Carlos-Projects/AIShield.git
cd AIShield
pip install -e ".[dev]"

Quick Start

# Full security scan
aishield scan ./my-fine-tuned-model/
# Dataset poisoning analysis
aishield dataset ./training-data/
# LoRA adapter analysis
aishield lora ./lora-adapter/
# Weight integrity check
aishield weights ./model/
# Pipeline audit
aishield pipeline ./fine-tuning-project/
# Generate weight manifest
aishield manifest ./model/
# Verify weights against manifest
aishield manifest ./model/ --verify
# Supply chain trust assessment
aishield supply-chain ./model/

CLI Commands

| Command | Description |
|---|---|
| `aishield scan <path>` | Full security scan (dataset + LoRA + weights + pipeline) |
| `aishield dataset <path>` | Dataset poisoning and provenance analysis |
| `aishield lora <path>` | LoRA adapter backdoor detection |
| `aishield weights <path>` | Weight integrity and fingerprinting |
| `aishield pipeline <path>` | Pipeline audit and supply chain analysis |
| `aishield manifest <path>` | Generate or verify weight integrity manifest |
| `aishield supply-chain <path>` | Supply chain trust assessment |

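
To make the generate-then-verify flow behind `aishield manifest` concrete, here is a minimal sketch of a SHA-256 weight manifest. The helper names (`sha256_file`, `build_manifest`, `verify_manifest`) and the in-memory dict layout are assumptions for illustration; the manifest format AIShield actually writes is not described here.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in streaming chunks so large weight files are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(model_dir):
    """Map each regular file (relative POSIX path) to its SHA-256 digest; symlinked files are skipped."""
    root = Path(model_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        manifest[path.relative_to(root).as_posix()] = sha256_file(path)
    return manifest


def verify_manifest(model_dir, manifest):
    """Compare the directory against a stored manifest and report the differences."""
    current = build_manifest(model_dir)
    return {
        "modified": sorted(k for k in manifest if k in current and current[k] != manifest[k]),
        "missing": sorted(k for k in manifest if k not in current),
        "unexpected": sorted(k for k in current if k not in manifest),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "model.safetensors").write_bytes(b"weights-v1")
        manifest = build_manifest(root)
        (root / "model.safetensors").write_bytes(b"weights-v2")  # simulate tampering
        print(verify_manifest(root, manifest)["modified"])  # prints ['model.safetensors']
```

A manifest only proves that files have not changed since it was generated, so it should be created from a trusted copy and stored separately from the weights it describes.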
# JSON output
aishield scan ./model/ --json
# HTML report
aishield scan ./model/ --html report.html
# Save to file
aishield scan ./model/ -o report.txt
# Redact local paths from output
aishield scan ./model/ --redact-paths
# NIST AI RMF compliance check
aishield pipeline ./project/ --compliance nist
# OWASP LLM Top 10 coverage
aishield pipeline ./project/ --compliance owasp

# Scan only specific types
aishield scan ./model/ --types dataset,lora
# Skip files larger than 50MB
aishield scan ./model/ --max-file-size 52428800 # 50MB in bytes
# Set scan timeout to 10 minutes
aishield scan ./model/ --timeout 600
# Tighten outlier detection (z-score threshold = 2.0)
aishield dataset ./data/ --outlier-threshold 2.0
# Disable timeout entirely
aishield scan ./model/ --timeout 0
# Combine flags (256MB, 15min timeout, tighter anomaly detection)
aishield scan ./large-model/ \
--max-file-size 268435456 \
--timeout 900 \
--outlier-threshold 2.5 \
--redact-paths

# Full supply chain report
aishield supply-chain ./model/ --json
# Generate and verify weight manifest
aishield manifest ./model/
aishield manifest ./model/ --verify

Try AIShield with the included sample files:

cd examples/
# Scan sample dataset
aishield dataset . --json
# Full scan with HTML report
aishield scan . --html report.html
open report.html # macOS; use xdg-open on Linux

# NIST AI RMF 1.0
aishield pipeline ./project/ --compliance nist
# OWASP LLM Top 10
aishield pipeline ./project/ --compliance owasp

Architecture

aishield/
├── scanner.py # Core scanning engine + Finding/ScanResult models
├── cli.py # Typer CLI interface (7 commands)
├── dataset/
│ ├── poisoning_detector.py # Trigger patterns, label flipping, duplicates
│ ├── provenance_verifier.py # Chain of custody, source verification
│ └── statistics.py # Shannon entropy, length anomalies (z-score configurable)
├── lora/
│ ├── analyzer.py # Config analysis, target module checks
│ ├── backdoor_detector.py # Weight extremes, layer modifications
│ └── diff.py # Adapter vs base model diff
├── weights/
│ ├── integrity_checker.py # Manifest verification, format checks, symlinks
│ ├── fingerprinter.py # Model fingerprinting for tamper detection
│ └── manifest.py # SHA-256 weight manifest generation
├── pipeline/
│ ├── auditor.py # Credential, unsafe load, import audit
│ ├── supply_chain.py # Base model → fine-tune → deploy tracing
│ └── compliance.py # NIST AI RMF, OWASP LLM Top 10 checks
├── reporters/
│ ├── console.py # Rich-formatted console output
│ ├── json.py # Structured JSON export
│ └── html.py # Jinja2 HTML reports (auto-escaped)
├── utils/
│ ├── crypto.py # SHA-256 hashing, fingerprinting
│ └── file_io.py # Streaming reads, UTF-8 validation
├── export/
│ └── mcpguard.py # MCPGuard-compatible policy generation
└── taxonomy.py # mcp-taxonomy adapter
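
The statistical checks attributed to `dataset/statistics.py` in the tree above (Shannon entropy and z-score length anomalies) can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, entropy is computed per character (AIShield may work on tokens instead), and the `threshold` parameter is only meant to mirror the idea behind `--outlier-threshold`.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def shannon_entropy(text):
    """Character-level Shannon entropy in bits per character (0.0 for empty text)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def length_outliers(samples, threshold=3.0):
    """Return indices of samples whose length z-score exceeds the threshold in absolute value."""
    lengths = [len(s) for s in samples]
    if len(lengths) < 2:
        return []  # a z-score needs at least two samples
    mu = mean(lengths)
    sigma = pstdev(lengths)
    if sigma == 0:
        return []  # every sample has the same length, so nothing stands out
    return [i for i, n in enumerate(lengths) if abs(n - mu) / sigma > threshold]


if __name__ == "__main__":
    print(shannon_entropy("abcd"))  # prints 2.0
    samples = ["x" * 10] * 20 + ["x" * 500]
    print(length_outliers(samples))  # prints [20]
```

Lowering the threshold flags more samples. With a population standard deviation, a single outlier among n samples can reach a z-score of at most sqrt(n - 1), so very small datasets may never cross a high threshold.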

Ecosystem Integration

| Project | Integration |
|---|---|
| reverse-abliterate | Shared weight manifest patterns, CLI conventions |
| MCPGuard | Generates YAML policies compatible with MCPGuard rules |
| MCPscop | JSON reports via mcp-taxonomy adapter → MCPscop dashboard |
| mcp-taxonomy | aishield_finding_to_taxonomy() maps 7 categories → AttackCategory |
| agentforensics | Post-incident forensics for AI agent behavior analysis |

Related Projects

AIShield is part of the Carlos-Projects AI security ecosystem:

| Project | Description |
|---|---|
| AIShield | 🔒 Security scanner for LLM fine-tuning (you are here) |
| MCPGuard | 🛡️ Runtime security proxy for MCP/A2A protocols |
| MCPwn | ⚔️ Offensive security testing for MCP servers |
| Palisade Scanner | 🔍 Web content scanner for prompt injection |
| reverse-abliterate | 🔄 Detect and reverse model abliteration |
| AgentGate | 🚪 Policy-based firewall for AI agents |
| ModelChain | 📦 SBOM generator for AI models |
| DataShield | 🧹 Privacy-preserving data sanitization for AI training |
| mcp-taxonomy | 📐 Canonical MCP security taxonomy |
| MCPscop | 📊 Unified security dashboard for MCP scanners |

Academic Foundation

AIShield's detection methodology draws on published research and industry frameworks:
- arXiv:2605.25073 — "Security in the Fine-Tuning Lifecycle of Large Language Models"
- arXiv:2605.25937 — "Building an Adversarial Malware Dataset by Family and Type"
- arXiv:2605.25376 — "KYA: A Framework-Agnostic Trust Layer for Autonomous Systems"
- MITRE ATLAS v2 — 25 technique mappings for fine-tuning attack scenarios
- NIST AI RMF 1.0 — AI Risk Management Framework (Govern, Map, Measure, Manage)
- OWASP Top 10 for LLMs — LLM application security risk coverage

Development

make dev # install dev dependencies
make test # run 212 tests
make lint # ruff check (0 errors)
make typecheck # mypy check (0 errors)
make coverage # test + coverage report
make check # lint + typecheck + test (all-in-one)
make build # build Python package
make docker # build Docker image

# Install dev dependencies
pip install -e ".[dev]"
# Run tests (212 tests)
python -m pytest tests/ -v
# Run with coverage
python -m pytest tests/ -v --cov
# Lint (ruff — 0 errors)
ruff check src/ tests/
# Type check (mypy — 0 errors on 30 source files)
mypy src/aishield/

License

MIT. See LICENSE for details.