[recipes] Perplexity conversation and memory import #109
Conversation
justfinethanku
left a comment
Code Review: Perplexity Conversation Import
What's Good
This is a high-quality contribution that follows OB1 standards well:
- Excellent documentation: README is comprehensive, well-structured, and includes all required sections (Prerequisites, Step-by-Step, Expected Outcome, Troubleshooting)
- Clean security posture: No credentials, secrets, or API keys hardcoded. Environment variables used correctly. .gitignore properly configured.
- Safe database operations: No dangerous SQL. Only inserts to thoughts table via REST API. Core table structure untouched.
- Smart deduplication: Sync log prevents duplicate imports on re-runs
- Timestamp preservation: Original Perplexity dates retained in created_at column (not import date) — excellent attention to detail
- Dry-run mode: Safe testing without database writes
- Local LLM option: Ollama support for privacy-conscious users
- Proper error handling: Retry logic, graceful failures, informative error messages
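As an illustration of the deduplication point, a sync-log pattern like the one praised above might look roughly like this sketch (the file name and structure here are assumptions for the demo, not the recipe's actual code):

```python
import json
from pathlib import Path

SYNC_LOG = Path("perplexity_sync_log.json")  # hypothetical log file name

def load_synced_ids(path: Path = SYNC_LOG) -> set[str]:
    """IDs of conversations already imported on previous runs."""
    return set(json.loads(path.read_text())) if path.exists() else set()

def mark_synced(conv_id: str, synced: set[str], path: Path = SYNC_LOG) -> None:
    """Record a successful import so re-runs skip this conversation."""
    synced.add(conv_id)
    path.write_text(json.dumps(sorted(synced)))

SYNC_LOG.unlink(missing_ok=True)  # start fresh for this demo
synced = load_synced_ids()
imported = []
for conv_id in ["conv-001", "conv-002", "conv-001"]:
    if conv_id in synced:
        continue  # already imported — skipped on re-run
    imported.append(conv_id)  # summarize + ingest would happen here
    mark_synced(conv_id, synced)
```

Because the log is written only after a successful ingest, an interrupted run can safely be restarted without producing duplicates.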
metadata.json Validation
All required fields present and valid:
- ✅ name, description, category
- ✅ author.name and author.github
- ✅ version (valid semver: 1.0.0)
- ✅ requires.open_brain = true
- ✅ tags (6 tags)
- ✅ difficulty (beginner)
- ✅ estimated_time (20 minutes)
- ✅ created and updated (2026-03-22)
Folder Structure
✅ Contribution is in recipes/perplexity-conversation-import/
✅ Contains: README.md, metadata.json, import-perplexity.py, requirements.txt, .env.example, .gitignore
PR Format
✅ Title: [recipes] Perplexity conversation and memory import — follows convention
Technical Review
Strengths:
- Three-stage pipeline is well-designed: Parse → Summarize → Ingest
- LLM summarization prompt is thoughtfully tuned for Perplexity's Q&A format and appropriately selective
- JSON profile flattening handles complex memory exports elegantly
- Rate limiting (0.2s between ingests) respects Supabase REST API limits
- Cost estimates provided ($0.0003/conversation for summarization, $0.000002/thought for embeddings)
- Dependencies are minimal and appropriate (requests, openpyxl)
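For readers skimming the script, the three-stage shape amounts to something like this minimal sketch (field names and the Thought type are invented for illustration — the real script is far more thorough):

```python
from dataclasses import dataclass

@dataclass
class Thought:
    content: str
    created_at: str  # original Perplexity date, not the import date

def parse(rows: list[dict]) -> list[dict]:
    """Stage 1: keep only rows that contain an actual conversation."""
    return [r for r in rows if r.get("question")]

def summarize(conversations: list[dict]) -> list[Thought]:
    """Stage 2: condense each Q&A into a thought (LLM call stubbed out)."""
    return [Thought(f"Q: {c['question']}", c["date"]) for c in conversations]

def ingest(thoughts: list[Thought], dry_run: bool = True) -> int:
    """Stage 3: write to the thoughts table; dry-run only counts."""
    if not dry_run:
        pass  # POST each thought to the REST API here
    return len(thoughts)

rows = [{"question": "What is RAG?", "date": "2025-01-02"}, {"date": "2025-01-03"}]
n = ingest(summarize(parse(rows)))  # dry run: counts what would be written
```

Keeping the stages as separate functions is what makes the dry-run mode cheap: everything up to the final stage runs exactly as it would in a real import.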
Code quality:
- Clean Python 3.10+ code with type hints where helpful
- Good error handling with retry logic
- Proper use of environment variables
- No SQL injection risks (uses JSON payloads to REST API)
- File size acceptable (~40KB Python script)
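The profile-flattening idea noted under strengths boils down to turning nested JSON into dotted leaf paths so each leaf can become one searchable thought. A generic sketch of the technique (not the recipe's exact code):

```python
def flatten(obj, prefix: str = "") -> dict[str, str]:
    """Flatten nested dicts/lists into {"a.b.0": "leaf"} style paths."""
    flat: dict[str, str] = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{i}."))
    else:
        flat[prefix.rstrip(".")] = str(obj)  # leaf value
    return flat

profile = {"preferences": {"topics": ["ai", "history"]}, "name": "Alex"}
flat = flatten(profile)
# e.g. {"preferences.topics.0": "ai", "preferences.topics.1": "history", "name": "Alex"}
```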
Automated Review Checklist
- ✅ Folder structure correct for recipes category
- ✅ Required files present (README.md + metadata.json + code)
- ✅ metadata.json valid and complete
- ✅ No credentials, API keys, or secrets
- ✅ SQL safety — no DROP TABLE, DROP DATABASE, TRUNCATE, or unqualified DELETE FROM
- ✅ Category-specific artifacts — Python script with detailed instructions
- ✅ PR format — title starts with [recipes]
- ✅ No binary blobs over 1MB
- ✅ README completeness — all required sections present
- ✅ No primitive dependencies declared
- ✅ Scope check — all changes within contribution folder
- ✅ No local MCP servers (N/A for this recipe)
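The SQL-safety and rate-limiting points reinforce each other: writes go out as JSON payloads to the REST endpoint, with a pause between calls. A rough sketch with an injectable sender (the endpoint shape and 0.2s delay mirror the review's description; this is not verified against the actual script):

```python
import time

def build_payload(content: str, created_at: str) -> dict:
    """JSON body for POST /rest/v1/thoughts — no SQL strings anywhere,
    and created_at preserves the original Perplexity timestamp."""
    return {"content": content, "created_at": created_at}

def ingest_all(thoughts: list[tuple[str, str]], send, delay: float = 0.2) -> int:
    """Send each thought via `send` (e.g. a requests.post wrapper),
    sleeping between calls to stay under REST API rate limits."""
    for content, created_at in thoughts:
        send(build_payload(content, created_at))
        time.sleep(delay)
    return len(thoughts)

sent: list[dict] = []
n = ingest_all([("First imported thought", "2025-01-02T10:00:00Z")], sent.append, delay=0)
```

Injecting `send` also makes the ingest stage trivially testable without a live database.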
Test Results Validation
PR author reports testing against real Perplexity export:
- 1,607 conversations found, 1,605 processed, 1,733 thoughts generated
- 64 memory entries found, 61 processed, 61 thoughts generated
- Estimated cost: $0.39
- Zero errors
This demonstrates thorough testing on a production-scale dataset.
Minor Suggestions (Optional)
These are nice-to-haves, not blockers:
- README clarity: Consider adding a note in Prerequisites about requesting the Perplexity export early (can take days) so users aren't blocked mid-setup
- Error messaging: Line 223 could suggest checking Perplexity sheet names if export format changes
- Progress indicator: For large exports (hundreds of conversations), a progress bar would improve UX (though the current verbose output works fine)
Verdict: ✅ Ready to merge
This contribution meets all automated and human review standards. It's well-documented, thoroughly tested, secure, and provides genuine value to the Open Brain community.
Great work, @demarant! This is exactly the kind of quality contribution OB1 needs. The attention to detail on timestamp preservation and the selective summarization prompt shows real thought about how people will use this data long-term.
Recommendation: Merge after author acknowledges any minor suggestions (or merge as-is — they're truly optional).
Contribution Type
Recipe (/recipes)
What does this do?
Imports Perplexity AI search history and memory entries into Open Brain as searchable thoughts. Handles two data streams:
- Conversations: Perplexity search history, summarized into thoughts
- Memory: profile and memory entries, flattened into individual thoughts
Original timestamps from the Perplexity export are preserved in the created_at column so imported thoughts retain their real dates.
The script operates on the .xlsx file that Perplexity provides as a data export (users request it from the privacy team — no self-service UI export is currently available).
Requirements
- Open Brain instance with the thoughts table
Checklist
- README.md with prerequisites, step-by-step instructions, and expected outcome
- metadata.json has all required fields
Test Results
Tested against a real Perplexity export on a live Open Brain instance:
- 1,607 conversations found, 1,605 processed, 1,733 thoughts generated
- 64 memory entries found, 61 processed, 61 thoughts generated
- Zero errors
Estimated API cost: $0.39 ($0.38 summarization + $0.004 embeddings).
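For anyone adapting this recipe, reading the export with openpyxl might look roughly like this — the sheet and column names below are invented for the demo, since Perplexity's export layout isn't publicly documented:

```python
from openpyxl import Workbook, load_workbook  # third-party: pip install openpyxl

def read_sheet(xlsx_path: str, sheet_name: str) -> list[dict]:
    """Return each data row as a dict keyed by the header row."""
    wb = load_workbook(xlsx_path, read_only=True)
    rows = wb[sheet_name].iter_rows(values_only=True)
    header = [str(cell) for cell in next(rows)]
    return [dict(zip(header, row)) for row in rows]

# Build a stand-in export to demonstrate (real sheet/column names may differ)
wb = Workbook()
ws = wb.active
ws.title = "Conversations"
ws.append(["question", "answer", "date"])
ws.append(["What is RAG?", "Retrieval-augmented generation combines search with LLMs.", "2025-01-02"])
wb.save("demo_export.xlsx")

convs = read_sheet("demo_export.xlsx", "Conversations")
```

Reading header names from the first row, rather than hard-coding column positions, is what makes the suggested "check the sheet names if the export format changes" error message easy to add.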