AIShield 🔒

Security scanner for the LLM fine-tuning lifecycle

Detect backdoors in datasets, malicious LoRA adapters, data poisoning, and model weight tampering across the entire fine-tuning pipeline — from base model selection through deployment.

Built for security engineers, ML engineers, and auditors securing LLM fine-tuning workflows against supply chain attacks, backdoor injection, and safety alignment bypass.

Features

Dataset Poisoning Detection — Trigger patterns, label flipping, role imbalance, duplicate injection, statistical anomalies (Shannon entropy, length distributions)
LoRA Backdoor Detection — Suspicious target modules (embed, lm_head), extreme weight values, trigger keywords in config, layer modification analysis
Weight Integrity Verification — SHA-256 manifests, model fingerprinting, format validation, symlink detection, tamper detection
Pipeline Audit — Hardcoded credential scanning, unsafe model loading patterns, suspicious imports, environment variable exposure, dependency pinning checks
Supply Chain Analysis — Base model provenance, fine-tuning records, deployment configuration review, model card validation
Compliance Checks — NIST AI RMF 1.0 (12 checks across Govern/Map/Measure/Manage), OWASP LLM Top 10 (10 categories)
Multiple Output Formats — Rich console, structured JSON, Jinja2 HTML with auto-escaped templates
Configurable — Adjustable file size limits, scan timeout, and statistical outlier thresholds
MITRE ATLAS Mapped — All findings mapped to MITRE ATLAS v2 techniques (25 entries)
Ecosystem Integrations — MCPGuard policy export, mcp-taxonomy adapter for MCPscop dashboards
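
The statistical checks listed under Dataset Poisoning Detection mention Shannon entropy. As a rough illustration of that metric (not AIShield's actual implementation), the sketch below computes character-level entropy for a single training sample. The function name `shannon_entropy` and the choice of characters rather than tokens are assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the character-level Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # H = sum(p * log2(1/p)) over the observed characters.
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

Samples with unusually low entropy (long runs of a repeated token) or unusually high entropy (encoded payloads) relative to the rest of a dataset are the kind of anomaly such a check could surface for manual review.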

Installation

pip install aishield-scanner

With optional PyTorch support for deep weight analysis:

pip install "aishield-scanner[torch]"

For development:

git clone https://github.com/Carlos-Projects/AIShield.git
cd AIShield
pip install -e ".[dev]"

Quick Start

# Full security scan
aishield scan ./my-fine-tuned-model/

# Dataset poisoning analysis
aishield dataset ./training-data/

# LoRA adapter analysis
aishield lora ./lora-adapter/

# Weight integrity check
aishield weights ./model/

# Pipeline audit
aishield pipeline ./fine-tuning-project/

# Generate weight manifest
aishield manifest ./model/

# Verify weights against manifest
aishield manifest ./model/ --verify

# Supply chain trust assessment
aishield supply-chain ./model/

CLI Commands

| Command | Description |
| --- | --- |
| `aishield scan <path>` | Full security scan (dataset + LoRA + weights + pipeline) |
| `aishield dataset <path>` | Dataset poisoning and provenance analysis |
| `aishield lora <path>` | LoRA adapter backdoor detection |
| `aishield weights <path>` | Weight integrity and fingerprinting |
| `aishield pipeline <path>` | Pipeline audit and supply chain analysis |
| `aishield manifest <path>` | Generate or verify weight integrity manifest |
| `aishield supply-chain <path>` | Supply chain trust assessment |
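
The pipeline audit is described as checking for unsafe model loading patterns. The sketch below shows one way such a check could be written, using Python's `ast` module to flag `pickle.load`/`pickle.loads` calls and `torch.load` calls that do not explicitly pass `weights_only=True`. The function name `find_unsafe_loads` and the rule set are assumptions for illustration; AIShield's own rules may differ.

```python
import ast

# Calls that deserialize arbitrary pickled objects (illustrative rule set).
ALWAYS_UNSAFE = {("pickle", "load"), ("pickle", "loads")}


def find_unsafe_loads(source: str) -> list[tuple[int, str]]:
    """Return (line number, message) pairs for risky deserialization calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Only consider calls of the form `module.function(...)`.
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if not isinstance(node.func.value, ast.Name):
            continue
        call = (node.func.value.id, node.func.attr)
        if call in ALWAYS_UNSAFE:
            findings.append((node.lineno, f"{call[0]}.{call[1]} deserializes arbitrary objects"))
        elif call == ("torch", "load"):
            explicit_safe = any(
                kw.arg == "weights_only"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
                for kw in node.keywords
            )
            if not explicit_safe:
                findings.append((node.lineno, "torch.load without weights_only=True"))
    return sorted(findings)
```

This only matches direct `module.function` calls by name, so aliased imports (`import pickle as p`) would slip past it. A production check would need to resolve imports first.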

Output Formats

# JSON output
aishield scan ./model/ --json

# HTML report
aishield scan ./model/ --html report.html

# Save to file
aishield scan ./model/ -o report.txt

# Redact local paths from output
aishield scan ./model/ --redact-paths

# NIST AI RMF compliance check
aishield pipeline ./project/ --compliance nist

# OWASP LLM Top 10 coverage
aishield pipeline ./project/ --compliance owasp
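
For CI use, the `--json` output can be consumed from a script. The helper below is a minimal sketch that runs a command and parses its standard output as JSON. It assumes that `--json` writes a single JSON document to stdout and nothing else, which this README does not state, and it makes no assumption about the report's schema or about AIShield's exit codes. `run_json` is a hypothetical helper name.

```python
import json
import subprocess


def run_json(cmd: list[str]) -> object:
    """Run `cmd` and parse its standard output as JSON.

    check=False: a scanner may exit non-zero when it reports findings,
    so the exit code is not treated as a failure here (assumption).
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return json.loads(proc.stdout)
```

A call such as `run_json(["aishield", "scan", "./model/", "--json"])` would then return the parsed report. Inspect its structure before relying on specific keys.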

Advanced Usage

Configurable Scan Flags

# Scan only specific types
aishield scan ./model/ --types dataset,lora

# Skip files larger than 50 MiB (52428800 bytes)
aishield scan ./model/ --max-file-size 52428800

# Set scan timeout to 10 minutes
aishield scan ./model/ --timeout 600

# Tighten outlier detection (z-score threshold = 2.0)
aishield dataset ./data/ --outlier-threshold 2.0

# Disable timeout entirely
aishield scan ./model/ --timeout 0

# Combine flags (256 MiB size limit, 15-minute timeout, tighter outlier threshold)
aishield scan ./large-model/ \
  --max-file-size 268435456 \
  --timeout 900 \
  --outlier-threshold 2.5 \
  --redact-paths
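
To make the effect of `--outlier-threshold` concrete, the sketch below flags samples whose length has an absolute z-score above a threshold. It illustrates the general technique only: the function name `length_outliers`, the default of 3.0, and the use of the population standard deviation are assumptions, not details taken from AIShield.

```python
import statistics


def length_outliers(texts: list[str], threshold: float = 3.0) -> list[int]:
    """Return indices of texts whose length z-score exceeds `threshold` in absolute value."""
    lengths = [len(t) for t in texts]
    if len(lengths) < 2:
        return []
    mean = statistics.fmean(lengths)
    stdev = statistics.pstdev(lengths)  # population standard deviation
    if stdev == 0:
        return []  # all lengths identical, so nothing stands out
    return [i for i, n in enumerate(lengths) if abs(n - mean) / stdev > threshold]
```

Lowering the threshold (for example from 3.0 to 2.0) flags more samples, which raises the chance of catching injected records at the cost of more false positives.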

Supply Chain Assessment

# Full supply chain report
aishield supply-chain ./model/ --json

# Generate and verify weight manifest
aishield manifest ./model/
aishield manifest ./model/ --verify
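
The idea behind a SHA-256 weight manifest can be sketched in a few lines of standard-library Python: hash every file under the model directory, store the digests, and later compare a fresh set of digests against the stored ones. The function names, the in-memory dictionary format, and the decision to skip symlinks are assumptions for this example. The manifest format AIShield writes is not documented here.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each regular file under `root` (relative POSIX path) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file() and not p.is_symlink()  # symlinks are skipped in this sketch
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the current tree against a saved manifest."""
    current = build_manifest(root)
    return {
        "modified": sorted(k for k in manifest if k in current and current[k] != manifest[k]),
        "missing": sorted(k for k in manifest if k not in current),
        "unexpected": sorted(k for k in current if k not in manifest),
    }
```

A manifest only proves that files have not changed since it was generated, so it should be created from a trusted copy of the weights and stored somewhere an attacker who can modify the model directory cannot also rewrite.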

Examples

Try AIShield with the included sample files:

cd examples/

# Scan sample dataset
aishield dataset . --json

# Full scan with HTML report
aishield scan . --html report.html
open report.html  # macOS; on Linux use: xdg-open report.html

Compliance Reporting

# NIST AI RMF 1.0
aishield pipeline ./project/ --compliance nist

# OWASP LLM Top 10
aishield pipeline ./project/ --compliance owasp

Architecture

aishield/
├── scanner.py              # Core scanning engine + Finding/ScanResult models
├── cli.py                  # Typer CLI interface (7 commands)
├── dataset/
│   ├── poisoning_detector.py   # Trigger patterns, label flipping, duplicates
│   ├── provenance_verifier.py  # Chain of custody, source verification
│   └── statistics.py           # Shannon entropy, length anomalies (z-score configurable)
├── lora/
│   ├── analyzer.py             # Config analysis, target module checks
│   ├── backdoor_detector.py    # Weight extremes, layer modifications
│   └── diff.py                 # Adapter vs base model diff
├── weights/
│   ├── integrity_checker.py    # Manifest verification, format checks, symlinks
│   ├── fingerprinter.py        # Model fingerprinting for tamper detection
│   └── manifest.py             # SHA-256 weight manifest generation
├── pipeline/
│   ├── auditor.py              # Credential, unsafe load, import audit
│   ├── supply_chain.py         # Base model → fine-tune → deploy tracing
│   └── compliance.py           # NIST AI RMF, OWASP LLM Top 10 checks
├── reporters/
│   ├── console.py              # Rich-formatted console output
│   ├── json.py                 # Structured JSON export
│   └── html.py                 # Jinja2 HTML reports (auto-escaped)
├── utils/
│   ├── crypto.py               # SHA-256 hashing, fingerprinting
│   └── file_io.py              # Streaming reads, UTF-8 validation
├── export/
│   └── mcpguard.py             # MCPGuard-compatible policy generation
└── taxonomy.py                 # mcp-taxonomy adapter
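
The tree lists `lora/analyzer.py` as handling config analysis and target module checks. As a hedged sketch of what a target module check might look like, the code below reads a PEFT-style `adapter_config.json` and reports module names in `target_modules` or `modules_to_save` that contain `embed` or `lm_head`, the two names given in the feature list. The function names are hypothetical and the substring matching is an assumption, not AIShield's actual logic.

```python
import json
from pathlib import Path

# Substrings treated as sensitive in this sketch (from the feature list).
SENSITIVE = ("embed", "lm_head")


def suspicious_modules(config: dict) -> list[str]:
    """Return target/saved module names that touch embedding or output layers."""
    names: list[str] = []
    for key in ("target_modules", "modules_to_save"):
        value = config.get(key) or []
        # PEFT allows target_modules to be a single string as well as a list.
        names.extend([value] if isinstance(value, str) else value)
    return sorted({n for n in names if any(s in n for s in SENSITIVE)})


def check_adapter(adapter_dir: Path) -> list[str]:
    """Load adapter_config.json from a PEFT-style adapter directory and check it."""
    config = json.loads((adapter_dir / "adapter_config.json").read_text(encoding="utf-8"))
    return suspicious_modules(config)
```

Adapters sometimes modify embeddings or the output head for legitimate reasons, such as vocabulary extension, so a hit here is a signal for closer inspection rather than proof of a backdoor.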

Ecosystem Integration

| Project | Integration |
| --- | --- |
| reverse-abliterate | Shared weight manifest patterns, CLI conventions |
| MCPGuard | Generates YAML policies compatible with MCPGuard rules |
| MCPscop | JSON reports via `mcp-taxonomy` adapter → MCPscop dashboard |
| mcp-taxonomy | `aishield_finding_to_taxonomy()` maps 7 categories → AttackCategory |
| agentforensics | Post-incident forensics for AI agent behavior analysis |

Related Projects

AIShield is part of the Carlos-Projects AI security ecosystem:

| Project | Description |
| --- | --- |
| AIShield | 🔒 Security scanner for LLM fine-tuning (you are here) |
| MCPGuard | 🛡️ Runtime security proxy for MCP/A2A protocols |
| MCPwn | ⚔️ Offensive security testing for MCP servers |
| Palisade Scanner | 🔍 Web content scanner for prompt injection |
| reverse-abliterate | 🔄 Detect and reverse model abliteration |
| AgentGate | 🚪 Policy-based firewall for AI agents |
| ModelChain | 📦 SBOM generator for AI models |
| DataShield | 🧹 Privacy-preserving data sanitization for AI training |
| mcp-taxonomy | 📐 Canonical MCP security taxonomy |
| MCPscop | 📊 Unified security dashboard for MCP scanners |

Academic Foundation

AIShield's detection methodology draws on the following research papers and industry frameworks:

arXiv:2605.25073 — "Security in the Fine-Tuning Lifecycle of Large Language Models"
arXiv:2605.25937 — "Building an Adversarial Malware Dataset by Family and Type"
arXiv:2605.25376 — "KYA: A Framework-Agnostic Trust Layer for Autonomous Systems"
MITRE ATLAS v2 — 25 technique mappings for fine-tuning attack scenarios
NIST AI RMF 1.0 — AI Risk Management Framework (Govern, Map, Measure, Manage)
OWASP Top 10 for LLMs — LLM application security risk coverage

Development

Quick start

make dev       # install dev dependencies
make test      # run 212 tests
make lint      # ruff check (0 errors)
make typecheck # mypy check (0 errors)
make coverage  # test + coverage report
make check     # lint + typecheck + test (all-in-one)
make build     # build Python package
make docker    # build Docker image

Manual commands

# Install dev dependencies
pip install -e ".[dev]"

# Run tests (212 tests)
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ -v --cov

# Lint (ruff — 0 errors)
ruff check src/ tests/

# Type check (mypy — 0 errors on 30 source files)
mypy src/aishield/

License

MIT — See LICENSE for details.

Report Bug · Request Feature · Contributing · Security · Code of Conduct · Changelog
