feat: add NVD CVE database lookup for banner scanning#3

Merged
santhsecurity merged 2 commits into
santhsecurity:mainfrom
PNP-MA:feat/nvd-cve-lookup
May 23, 2026

Conversation

@PNP-MA

@PNP-MA PNP-MA commented May 23, 2026

Contributor

Integrates a local NVD SQLite cache for CVE matching beyond the built-in rule set. The scanner queries extracted banner keywords against CVE descriptions when a synced NVD database is available.

Also wires cve_ids and confidence into portscan findings — CVE ID is now a structured field and confidence is derived from CVSS score (built-in rules) or severity (NVD lookup).

Adds --nvd-db CLI arg for configuring the database path. NVD findings carry the nvd tag so they can be filtered independently.

Verification

Before the fix, portscan CVE findings had "cve_ids": [] and "confidence": null. After:

{
  "confidence": 0.53,
  "cve_ids": ["CVE-2018-15473"]
}

Usage

The NVD database must be synced separately with nvd_cve (install via cargo install nvd_cve). Once synced to the default path (~/.cache/nvd/nvd.sqlite3), gossan automatically queries it during port scans.
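The path lookup described above can be sketched as a small pure function. This is only an illustration, not the code merged in the PR: the function name `resolve_nvd_db_path`, the precedence order, and the `nvd/nvd.sqlite3` subpath under `XDG_CACHE_HOME` are assumptions inferred from the default path and the `--nvd-db` / `NVD_DB_PATH` options mentioned here.

```rust
use std::path::PathBuf;

/// Treat empty strings as unset, so `NVD_DB_PATH=` does not yield an empty path.
fn non_empty(v: Option<&str>) -> Option<&str> {
    v.filter(|s| !s.is_empty())
}

/// Hypothetical helper: choose the NVD database path.
/// Assumed precedence: explicit override (CLI flag or NVD_DB_PATH),
/// then $XDG_CACHE_HOME, then $HOME/.cache, then the system temp directory.
pub fn resolve_nvd_db_path(
    override_path: Option<&str>,
    xdg_cache_home: Option<&str>,
    home: Option<&str>,
) -> PathBuf {
    if let Some(p) = non_empty(override_path) {
        return PathBuf::from(p);
    }
    let cache_dir = match (non_empty(xdg_cache_home), non_empty(home)) {
        (Some(xdg), _) => PathBuf::from(xdg),
        (None, Some(h)) => PathBuf::from(h).join(".cache"),
        (None, None) => std::env::temp_dir(),
    };
    cache_dir.join("nvd").join("nvd.sqlite3")
}
```

Passing the environment values in as arguments, rather than reading them inside the function, keeps the sketch testable without mutating process state.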

Summary by CodeRabbit

  • New Features
    • Added CVE lookup using a local NVD database to enrich scan results when an NVD cache is available.
    • Scan results now show associated CVE identifiers and a confidence score for each match.
    • New optional CLI argument to specify a custom local NVD database path for lookups.


@coderabbitai

coderabbitai Bot commented May 23, 2026


Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ea45f247-8e41-4ebd-8ade-13fbaf052663

📥 Commits

Reviewing files that changed from the base of the PR and between 5f589f2 and bad9850.

📒 Files selected for processing (4)
  • crates/cli/src/args.rs
  • crates/cli/src/main.rs
  • crates/portscan/src/cve/nvd.rs
  • crates/portscan/src/lib.rs

📝 Walkthrough

Walkthrough

Adds optional CLI nvd_db path, rusqlite dependency, an NVD lookup module (init, try_search, banner parsing, severity heuristics, tests), and integrates NVD enrichment into portscan to append CVE-tagged findings with confidence scores.

Changes

NVD CVE Database Lookup

| Layer | File(s) | Summary |
| --- | --- | --- |
| CLI option and dependency export | crates/cli/src/args.rs, crates/portscan/Cargo.toml, crates/portscan/src/cve.rs | Adds nvd_db: Option<String> (env NVD_DB_PATH) behind portscan feature; adds rusqlite dependency; cve.rs exports pub mod nvd. |
| Application startup initialization | crates/cli/src/main.rs | Resolves DB path from CLI or env and calls gossan_portscan::cve::nvd::init(...) at startup. |
| NvdDatabase init and lifecycle | crates/portscan/src/cve/nvd.rs (init/path logic) | Introduces NvdDatabase, global OnceLock, init and try_search, and default DB path resolution (XDG_CACHE_HOME/HOME/temp). |
| Banner parsing and SQLite search | crates/portscan/src/cve/nvd.rs (core search, helpers, tests) | Implements banner normalization and term extraction, search_description using SQLite LIKE, JSON extraction of English descriptions, keyword-based severity heuristics, building Findings with CVE id, confidence, evidence, and tags; includes unit tests. |
| CVE enrichment in portscan | crates/portscan/src/cve.rs, crates/portscan/src/lib.rs | correlate_with_rules sets CVE and confidence from rule.cvss; probe_port (cfg feature "nvd") spawns blocking task to call cve::nvd::try_search and extends findings with NVD results. |
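The "global OnceLock, init and try_search" lifecycle summarized above might look roughly like the following. The names `NvdDatabase`, `init`, and `try_search` come from the walkthrough, but the signatures, return types, and the struct's contents are assumptions; the real type presumably wraps a SQLite connection, which this std-only sketch replaces with a stored path.

```rust
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

/// Stand-in for the PR's `NvdDatabase` (assumption: only the path is kept here).
pub struct NvdDatabase {
    path: PathBuf,
}

/// Process-wide handle, set at most once.
static NVD: OnceLock<NvdDatabase> = OnceLock::new();

/// Register the database at startup. Returns false when the file is missing
/// or when a database was already registered.
pub fn init(path: &Path) -> bool {
    if !path.is_file() {
        return false; // no synced database: lookups stay a no-op
    }
    NVD.set(NvdDatabase { path: path.to_path_buf() }).is_ok()
}

/// Returns an empty list when `init` never succeeded, so callers can always
/// extend their findings with the result.
pub fn try_search(banner: &str, service: &str) -> Vec<String> {
    let Some(db) = NVD.get() else {
        return Vec::new();
    };
    // A real implementation would query SQLite here; this placeholder only
    // demonstrates the control flow.
    vec![format!("would query {} for {service}: {banner}", db.path.display())]
}
```

The point of the pattern is that scan code never needs to know whether a database was configured: an uninitialized global simply yields no extra findings.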

Sequence Diagram

sequenceDiagram
  participant User
  participant CLI
  participant Main
  participant Portscan
  participant NvdDatabase
  participant SQLite
  User->>CLI: run CLI with/without --nvd-db
  CLI->>Main: parsed nvd_db value
  Main->>NvdDatabase: init(db_path)
  Portscan->>NvdDatabase: try_search(banner, service)
  NvdDatabase->>SQLite: query cve WHERE description LIKE '%term%'
  SQLite-->>NvdDatabase: rows with cve id + data JSON
  NvdDatabase-->>Portscan: findings (CVE id, severity, confidence, evidence)
  Portscan->>Main: append findings
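The `'%term%'` query in the diagram depends on terms pulled out of the banner. A minimal sketch of such an extraction step follows; the tokenization rules (lowercasing, splitting on non-alphanumeric characters, dropping short tokens and tokens that do not start with a letter) are assumptions for illustration and may differ from the PR's actual `extract_search_terms`.

```rust
/// Hypothetical term extraction for banner strings such as
/// "SSH-2.0-OpenSSH_7.4p1". Version fragments are dropped on purpose,
/// since the lookup matches product names against CVE descriptions.
pub fn extract_search_terms(banner: &str) -> Vec<String> {
    let lowered = banner.to_lowercase();
    let mut terms: Vec<String> = Vec::new();
    for token in lowered.split(|c: char| !c.is_ascii_alphanumeric()) {
        let starts_with_letter = token.starts_with(|c: char| c.is_ascii_alphabetic());
        let already_seen = terms.iter().any(|t| t == token);
        // Keep tokens of at least three characters that look like names.
        if token.len() >= 3 && starts_with_letter && !already_seen {
            terms.push(token.to_string());
        }
    }
    terms
}
```

Each returned term would then be bound as the `?1` parameter of the `LIKE` query rather than interpolated into the SQL string.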

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble banners, sniff the DB well,
I query SQLite where CVE stories dwell,
I tease out versions, map severity true,
And hop back with findings, a rabbit's small coup!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: adding NVD CVE database lookup functionality for banner scanning. It directly reflects the primary purpose of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@PNP-MA PNP-MA force-pushed the feat/nvd-cve-lookup branch from 4504b6f to 5f589f2 Compare May 23, 2026 18:35

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/portscan/src/cve.rs`:
- Around line 349-350: Clamp the computed confidence to the [0.0, 1.0] range
before passing it to .confidence: compute a bounded value from rule.cvss (e.g.
let confidence = (rule.cvss / 10.0).clamp(0.0, 1.0) and then call
.confidence(confidence as f64)) so out-of-range CVSS in user rules cannot
produce invalid confidence values; update the code around .cve(&rule.cve) and
.confidence((rule.cvss / 10.0) as f64) to use this clamped value.

In `@crates/portscan/src/cve/nvd.rs`:
- Around line 172-183: The search_description function issues unbounded "LIKE"
scans (search_description) which can explode on large NVD DBs; change the
function to bound matches and/or use full-text search: update the signature to
accept a max_results parameter (or switch to using the FTS5 virtual table and
MATCH) and modify the SQL to either use an FTS MATCH query or append "LIMIT ?2"
to the prepared statement, passing the max_results parameter when calling
stmt.query_map; ensure callers are updated to pass a sensible limit so repeated
keyword queries cannot trigger unbounded scans.

In `@crates/portscan/src/lib.rs`:
- Around line 680-686: The spawn_blocking join result for the NVD lookup is
currently ignoring Err(JoinError) silently; update the #[cfg(feature = "nvd")]
block that calls tokio::task::spawn_blocking(move || cve::nvd::try_search(&b,
&s)).await to explicitly handle the Err branch: match the await result (or use
if let Err(e) = ...) and emit a clear diagnostic including the JoinError and
context (e.g., which CVE/scan input b/s and that the NVD enrichment failed)
instead of dropping it, then decide whether to continue without enrichment or
propagate the error; reference the cve::nvd::try_search call, the spawn_blocking
future, and the nvd_findings/findings variables when adding the
logging/handling.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3c78a381-06c9-4718-956a-f99957ff9ece

📥 Commits

Reviewing files that changed from the base of the PR and between 9f2d36a and 5f589f2.

📒 Files selected for processing (6)
  • crates/cli/src/args.rs
  • crates/cli/src/main.rs
  • crates/portscan/Cargo.toml
  • crates/portscan/src/cve.rs
  • crates/portscan/src/cve/nvd.rs
  • crates/portscan/src/lib.rs

Comment on lines +349 to +350
.cve(&rule.cve)
.confidence((rule.cvss / 10.0) as f64)


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clamp confidence to [0.0, 1.0] before writing it.

Line 350 assumes CVSS is always bounded. Since community rules are user-provided, out-of-range CVSS can produce invalid confidence values.

Suggested fix
-            .confidence((rule.cvss / 10.0) as f64)
+            .confidence(((rule.cvss / 10.0) as f64).clamp(0.0, 1.0))
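As a standalone illustration of the suggested clamp, the conversion can be pulled into a helper. The function name `confidence_from_cvss` and the `f32` input type are assumptions (the cast in the diff suggests `rule.cvss` is an `f32`). One detail worth knowing: `clamp` on floats returns NaN when the input is NaN, so this sketch handles that case explicitly.

```rust
/// Hypothetical helper: map a CVSS base score (nominally 0.0..=10.0)
/// to a confidence value in 0.0..=1.0.
pub fn confidence_from_cvss(cvss: f32) -> f64 {
    // `clamp` passes NaN through unchanged, so treat it as "no confidence".
    if cvss.is_nan() {
        return 0.0;
    }
    (f64::from(cvss) / 10.0).clamp(0.0, 1.0)
}
```

With this helper, a rule carrying a CVSS score of 5.3 yields a confidence of roughly 0.53, and out-of-range scores saturate at the nearest bound.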

Comment on lines +172 to +183
fn search_description(
    conn: &rusqlite::Connection,
    text: &str,
) -> Result<Vec<String>, rusqlite::Error> {
    let mut stmt = conn.prepare(
        "SELECT id FROM cve WHERE description LIKE '%' || ?1 || '%'",
    )?;
    let ids = stmt
        .query_map(rusqlite::params![text], |row| row.get(0))?
        .filter_map(|r| r.ok())
        .collect();
    Ok(ids)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bound NVD match cardinality to avoid unbounded scan cost.

Line 177 performs an unbounded LIKE scan, and the caller runs it for many keywords. On a large NVD DB this can explode results and stall scans.

Suggested fix
 fn search_description(
     conn: &rusqlite::Connection,
     text: &str,
 ) -> Result<Vec<String>, rusqlite::Error> {
     let mut stmt = conn.prepare(
-        "SELECT id FROM cve WHERE description LIKE '%' || ?1 || '%'",
+        "SELECT id FROM cve WHERE description LIKE '%' || ?1 || '%' LIMIT 200",
     )?;
     let ids = stmt
         .query_map(rusqlite::params![text], |row| row.get(0))?
         .filter_map(|r| r.ok())
         .collect();
     Ok(ids)
 }
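The per-query `LIMIT` above bounds each keyword lookup, but the caller still runs one lookup per term. A caller-side budget could complement it, as in this std-only sketch. The function `collect_bounded` and its closure parameter are hypothetical; the closure stands in for the SQLite query and receives the remaining budget so it could forward it as a `LIMIT ?2` parameter.

```rust
use std::collections::HashSet;

/// Hypothetical caller-side budget: run one lookup per term, deduplicate
/// CVE IDs, and stop as soon as `max_results` IDs have been collected.
pub fn collect_bounded<F>(terms: &[&str], max_results: usize, mut search: F) -> Vec<String>
where
    F: FnMut(&str, usize) -> Vec<String>,
{
    let mut seen = HashSet::new();
    let mut ids = Vec::new();
    for &term in terms {
        let remaining = max_results - ids.len();
        if remaining == 0 {
            break; // budget spent: skip the remaining lookups entirely
        }
        for id in search(term, remaining) {
            if ids.len() == max_results {
                break;
            }
            // `insert` returns false for IDs already seen under another term.
            if seen.insert(id.clone()) {
                ids.push(id);
            }
        }
    }
    ids
}
```

Once the budget is exhausted, later terms are never queried, which caps the total work per banner regardless of how many keywords were extracted.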

Comment thread crates/portscan/src/lib.rs Outdated
Comment on lines +680 to +686
if let Ok(nvd_findings) = tokio::task::spawn_blocking(move || {
    cve::nvd::try_search(&b, &s)
})
.await
{
    findings.extend(nvd_findings);
}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP --type=rust 'spawn_blocking\s*\(' -C3
rg -nP --type=rust 'if let Ok\([^)]*\)\s*=\s*tokio::task::spawn_blocking' -C3
rg -nP --type=rust 'JoinError|tracing::(warn|error|debug).*(spawn_blocking|nvd)' -C3

Repository: santhsecurity/gossan

Length of output: 2851


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== portscan/src/lib.rs context around lines 660-710 ==="
sed -n '650,710p' crates/portscan/src/lib.rs | nl -ba | sed -n '1,120p'

echo
echo "=== NVD-related references in crates/portscan/src/lib.rs ==="
rg -n --hidden --no-ignore --glob='*.rs' "nvd|try_search|cve::nvd" crates/portscan/src/lib.rs

echo
echo "=== How cve::nvd::try_search is called elsewhere in repo ==="
rg -n --hidden --no-ignore --glob='*.rs' "cve::nvd::try_search" crates | head -n 50

echo
echo "=== Locate definition of try_search in cve/nvd module ==="
rg -n --hidden --no-ignore --glob='*.rs' "mod nvd|try_search" crates | head -n 50

echo
echo "=== Search for spawn_blocking join-error handling patterns elsewhere ==="
rg -n --hidden --no-ignore --glob='*.rs' \
  "match\s+tokio::task::spawn_blocking|tokio::task::spawn_blocking.*await\s*\{|if let Ok\([^)]*\)\s*=\s*tokio::task::spawn_blocking|JoinError" \
  crates | head -n 200

Repository: santhsecurity/gossan

Length of output: 182


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== portscan/src/lib.rs context around lines 650-710 ==="
awk 'NR>=650 && NR<=710 {printf "%d:%s\n", NR, $0}' crates/portscan/src/lib.rs

echo
echo "=== NVD-related references in crates/portscan/src/lib.rs ==="
rg -n --hidden --no-ignore --glob='*.rs' "nvd|try_search|cve::nvd" crates/portscan/src/lib.rs

echo
echo "=== How cve::nvd::try_search is called elsewhere in repo (first 50 matches) ==="
rg -n --hidden --no-ignore --glob='*.rs' "cve::nvd::try_search" crates | head -n 50

echo
echo "=== Search for definition of try_search in cve/nvd module (first 50 matches) ==="
rg -n --hidden --no-ignore --glob='*.rs' "fn\s+try_search|try_search\s*\(" crates | head -n 50

echo
echo "=== Search for spawn_blocking join-error handling patterns elsewhere (first 200 matches) ==="
rg -n --hidden --no-ignore --glob='*.rs' \
  "match\s+tokio::task::spawn_blocking|if let Ok\([^)]*\)\s*=\s*tokio::task::spawn_blocking|JoinError" \
  crates | head -n 200

Repository: santhsecurity/gossan

Length of output: 3210


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Does crates/portscan/src/lib.rs use tracing? ==="
rg -n --glob='*.rs' "tracing::(warn|error|debug|info)!" crates/portscan/src/lib.rs crates/portscan/src | head -n 50

echo
echo "=== Is tracing imported in crates/portscan/src/lib.rs? ==="
rg -n --glob='*.rs' "^\\s*(use\\s+)?tracing::" crates/portscan/src/lib.rs

echo
echo "=== Any existing JoinError handling in portscan crate? ==="
rg -n --glob='*.rs' "JoinError" crates/portscan/src | head -n 50

echo
echo "=== Any other spawn_blocking call sites using if let Ok(...) pattern? ==="
rg -n --glob='*.rs' "if let Ok\\([^)]*\\)\\s*=\\s*tokio::task::spawn_blocking" crates/portscan/src | head -n 50

Repository: santhsecurity/gossan

Length of output: 4961


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== crates/portscan/src/cve/nvd.rs around try_search (lines ~120-190) ==="
awk 'NR>=110 && NR<=210 {printf "%d:%s\n", NR, $0}' crates/portscan/src/cve/nvd.rs

echo
echo "=== crates/scm/src/git_scanner.rs spawn_blocking await handling (lines ~1-120) ==="
awk 'NR>=1 && NR<=140 {printf "%d:%s\n", NR, $0}' crates/scm/src/git_scanner.rs

echo
echo "=== crates/intel/src/lib.rs spawn_blocking join handling (lines ~140-250) ==="
awk 'NR>=140 && NR<=260 {printf "%d:%s\n", NR, $0}' crates/intel/src/lib.rs

Repository: santhsecurity/gossan

Length of output: 11161


Handle spawn_blocking join failures explicitly for NVD lookup.

In crates/portscan/src/lib.rs (the #[cfg(feature = "nvd")] block), the if let Ok(...) = ... .await path drops Err(JoinError) silently, so a failed/panicking NVD lookup leads to missing enrichment without any diagnostics. Log the Err branch.

Suggested patch
-            if let Ok(nvd_findings) = tokio::task::spawn_blocking(move || {
-                cve::nvd::try_search(&b, &s)
-            })
-            .await
-            {
-                findings.extend(nvd_findings);
-            }
+            match tokio::task::spawn_blocking(move || cve::nvd::try_search(&b, &s)).await {
+                Ok(nvd_findings) => findings.extend(nvd_findings),
+                Err(e) => tracing::warn!(error = %e, "NVD lookup task failed"),
+            }
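The same "handle the join error instead of dropping it" shape can be shown without tokio. The sketch below swaps in `std::thread::spawn` and `JoinHandle::join`, which also returns `Err` when the worker panics, as a std-only analogue of `spawn_blocking(...).await`; it is not the patched code. The function name `enrich` is hypothetical, the `lookup` closure stands in for `cve::nvd::try_search`, and `eprintln!` stands in for `tracing::warn!`.

```rust
use std::thread;

/// Hypothetical enrichment step: run `lookup` on a worker thread and append
/// its results, keeping the existing findings if the worker panics.
pub fn enrich<F>(findings: &mut Vec<String>, lookup: F)
where
    F: FnOnce() -> Vec<String> + Send + 'static,
{
    match thread::spawn(lookup).join() {
        Ok(extra) => findings.extend(extra),
        // Surface the failure instead of silently skipping enrichment.
        Err(_) => eprintln!("NVD lookup task failed; continuing without enrichment"),
    }
}
```

Either way the scan continues with the findings it already has; the difference from the original `if let Ok(...)` is only that the failure becomes visible.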

@santhsecurity

Owner

Pushed cargo fmt --all to fix the CI failure (cosmetic only — your code is unchanged). Reviewed: clean OnceLock singleton, optional via --nvd-db / NVD_DB_PATH, graceful no-op when DB absent, NVD-tagged findings filterable. Two non-blocking followups worth opening as issues: (1) extract_search_terms overlaps with existing crates/classify keyword logic — consolidation candidate; (2) substring-match against CVE descriptions has no version-range comparison, so any version of e.g. openssh matches every openssh CVE — confidence-gated and filterable, but worth a precision pass eventually. Thanks for the contribution!
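To make follow-up (2) concrete, a version-range check might start from something like the sketch below. Everything here is hypothetical: the function names, the half-open range convention, and the rule of reading only the leading digits of each dotted component are illustrative choices, not part of the merged code or of the NVD data model.

```rust
/// Hypothetical parser: keep the leading digits of each dotted component,
/// so "7.4p1" becomes [7, 4]. Parsing stops at the first component with no digits.
pub fn parse_version(s: &str) -> Vec<u32> {
    s.split('.')
        .map_while(|part| {
            let digits: String = part.chars().take_while(|c| c.is_ascii_digit()).collect();
            digits.parse::<u32>().ok()
        })
        .collect()
}

/// True when `start_incl <= version < end_excl`, comparing component-wise.
/// Caveat: a shorter prefix sorts first, so "7" compares below "7.0".
pub fn in_range(version: &str, start_incl: &str, end_excl: &str) -> bool {
    let v = parse_version(version);
    parse_version(start_incl) <= v && v < parse_version(end_excl)
}
```

Comparing numeric components avoids the string-ordering trap where "7.10" would sort before "7.4".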

@santhsecurity santhsecurity merged commit 42cd5b8 into santhsecurity:main May 23, 2026
0 of 2 checks passed

3 participants