[recipes] Perplexity conversation and memory import #109

Merged
justfinethanku merged 1 commit into NateBJones-Projects:main from demarant:contrib/demarant/perplexity-import
Mar 24, 2026

@demarant
Contributor

Contribution Type

  • Recipe (/recipes)

What does this do?

Imports Perplexity AI search history and memory entries into Open Brain as searchable thoughts. Handles two data streams:

  • Conversations: Each search query + answer is summarized by an LLM into 1-3 standalone thoughts (gpt-4o-mini via OpenRouter).
  • Memory: Perplexity's curated memory entries are ingested directly — they're already concise summaries. JSON profile rows (user persona data with demographics, interests, technology preferences, etc.) are automatically detected and flattened into per-section thoughts.
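
As a rough illustration of the profile handling described above, the sketch below detects whether a memory row is a JSON object and, if so, emits one thought per top-level section. The function name `flatten_profile` and the output wording are assumptions made for this example; the actual script may detect and format profile rows differently.

```python
import json


def flatten_profile(raw: str) -> list[str]:
    """Return one thought per top-level section when `raw` is a JSON object.

    Plain-text memory entries are returned unchanged as a single thought.
    (Hypothetical helper; not taken from import-perplexity.py.)
    """
    text = raw.strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return [text]  # ordinary memory entry, already a concise summary
    if not isinstance(data, dict):
        return [text]  # valid JSON, but not a profile-style object

    thoughts = []
    for section, value in data.items():
        if isinstance(value, dict):
            body = "; ".join(f"{k}: {v}" for k, v in value.items())
        elif isinstance(value, list):
            body = ", ".join(str(v) for v in value)
        else:
            body = str(value)
        thoughts.append(f"{section}: {body}")
    return thoughts
```

Splitting by section keeps each thought small enough to embed and search on its own, which is the stated reason for flattening.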

Original timestamps from the Perplexity export are preserved in the created_at column so imported thoughts retain their real dates.
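
A minimal sketch of how an insert payload might carry the original timestamp. Only the `created_at` column name comes from this PR; the `content` key, the helper name `build_thought_row`, and the choice to treat naive timestamps as UTC are assumptions for illustration.

```python
from datetime import datetime, timezone


def build_thought_row(content: str, exported_at: datetime) -> dict:
    """Build a row dict whose created_at is the export's date, not the import date."""
    if exported_at.tzinfo is None:
        # Assumption: timestamps without a timezone are treated as UTC.
        exported_at = exported_at.replace(tzinfo=timezone.utc)
    return {
        "content": content,  # hypothetical column name
        "created_at": exported_at.astimezone(timezone.utc).isoformat(),
    }
```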

The script operates on the .xlsx file that Perplexity provides as a data export (users request it from the privacy team — no self-service UI export currently available).

Requirements

  • Python 3.10+
  • OpenRouter API key (for embeddings + conversation summarization)
  • Working Open Brain setup with thoughts table

Checklist

  • I've read CONTRIBUTING.md
  • My contribution has a README.md with prerequisites, step-by-step instructions, and expected outcome
  • My metadata.json has all required fields
  • I tested this on my own Open Brain instance
  • No credentials, API keys, or secrets are included

Test Results

Tested against a real Perplexity export on a live Open Brain instance:

Metric              Conversations  Memory
Found               1,607          64
Processed           1,605          61
Thoughts generated  1,733          61
Ingested            1,733          61
Errors              0              0

Estimated API cost: $0.39 ($0.38 summarization + $0.004 embeddings).

Collaborator

@justfinethanku left a comment

Code Review: Perplexity Conversation Import

What's Good

This is a high-quality contribution that follows OB1 standards well:

  • Excellent documentation: README is comprehensive, well-structured, and includes all required sections (Prerequisites, Step-by-Step, Expected Outcome, Troubleshooting)
  • Clean security posture: No credentials, secrets, or API keys hardcoded. Environment variables used correctly. .gitignore properly configured.
  • Safe database operations: No dangerous SQL. Only inserts to thoughts table via REST API. Core table structure untouched.
  • Smart deduplication: Sync log prevents duplicate imports on re-runs
  • Timestamp preservation: Original Perplexity dates retained in created_at column (not import date) — excellent attention to detail
  • Dry-run mode: Safe testing without database writes
  • Local LLM option: Ollama support for privacy-conscious users
  • Proper error handling: Retry logic, graceful failures, informative error messages
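
For readers unfamiliar with the sync-log and dry-run pattern mentioned above, here is one way it could be sketched. The class name `SyncLog`, the JSON-file format, and the method names are hypothetical; the recipe's real log format is not shown in this PR.

```python
import json
from pathlib import Path


class SyncLog:
    """Tracks already-imported item IDs in a small JSON file (illustrative only)."""

    def __init__(self, path: Path):
        self.path = path
        self.seen: set[str] = set()
        if path.exists():
            self.seen = set(json.loads(path.read_text()))

    def is_done(self, item_id: str) -> bool:
        return item_id in self.seen

    def mark_done(self, item_id: str, dry_run: bool = False) -> None:
        if dry_run:
            return  # dry runs must leave no trace, so a later real run imports everything
        self.seen.add(item_id)
        self.path.write_text(json.dumps(sorted(self.seen)))
```

A re-run would check `is_done` before summarizing a conversation, so completed items cost no further API calls.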

metadata.json Validation

All required fields present and valid:

  • ✅ name, description, category
  • ✅ author.name and author.github
  • ✅ version (valid semver: 1.0.0)
  • ✅ requires.open_brain = true
  • ✅ tags (6 tags)
  • ✅ difficulty (beginner)
  • ✅ estimated_time (20 minutes)
  • ✅ created and updated (2026-03-22)
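
The field list above could be checked mechanically along these lines. This is a sketch, not the repository's actual validator: the function names are made up, and the version check accepts only plain MAJOR.MINOR.PATCH strings, which is narrower than full semver.

```python
import re

# Field names taken from the checklist above; dots denote nested keys.
REQUIRED_FIELDS = [
    "name", "description", "category",
    "author.name", "author.github",
    "version", "requires.open_brain",
    "tags", "difficulty", "estimated_time",
    "created", "updated",
]
SIMPLE_SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # ignores pre-release/build suffixes


def lookup(data: dict, dotted: str):
    """Follow a dotted path through nested dicts; return None if any part is missing."""
    current = data
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def validate_metadata(meta: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the file passed."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if lookup(meta, f) is None]
    version = meta.get("version")
    if isinstance(version, str) and not SIMPLE_SEMVER.match(version):
        problems.append(f"version is not MAJOR.MINOR.PATCH: {version}")
    return problems
```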

Folder Structure

✅ Contribution is in recipes/perplexity-conversation-import/
✅ Contains: README.md, metadata.json, import-perplexity.py, requirements.txt, .env.example, .gitignore

PR Format

✅ Title: [recipes] Perplexity conversation and memory import — follows convention

Technical Review

Strengths:

  1. Three-stage pipeline is well-designed: Parse → Summarize → Ingest
  2. LLM summarization prompt is thoughtfully tuned for Perplexity's Q&A format and appropriately selective
  3. JSON profile flattening handles complex memory exports elegantly
  4. Rate limiting (0.2s between ingests) respects Supabase REST API limits
  5. Cost estimates provided ($0.0003/conversation for summarization, $0.000002/thought for embeddings)
  6. Dependencies are minimal and appropriate (requests, openpyxl)
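
To make the Parse → Summarize → Ingest flow and the 0.2s pause concrete, here is a small sketch. The function `run_pipeline` and its callback parameters are assumptions for illustration; the real script's structure may differ. The `sleep` parameter exists only so the pause can be swapped out in tests.

```python
import time
from typing import Callable, Iterable

INGEST_DELAY_SECONDS = 0.2  # pause between ingests, as noted in the review


def run_pipeline(
    rows: Iterable[str],
    summarize: Callable[[str], list[str]],
    ingest: Callable[[str], None],
    delay: float = INGEST_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Feed parsed rows through summarize and ingest; return the number of thoughts ingested."""
    ingested = 0
    for row in rows:                     # output of the Parse stage
        for thought in summarize(row):   # Summarize: one row may yield several thoughts
            ingest(thought)              # Ingest: e.g. a REST insert
            ingested += 1
            sleep(delay)                 # throttle so the API is not flooded
    return ingested
```

Keeping the stages as separate callables also makes a dry run simple: pass an `ingest` that only prints.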

Code quality:

  • Clean Python 3.10+ code with type hints where helpful
  • Good error handling with retry logic
  • Proper use of environment variables
  • No SQL injection risks (uses JSON payloads to REST API)
  • File size acceptable (~40KB Python script)
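
The retry logic mentioned above is not shown in this PR; a common shape for it is exponential backoff, sketched below. The helper name `with_retries`, the default of three attempts, and the doubling delay are all assumptions rather than the script's actual settings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with delays of base_delay * 1, 2, 4, ..."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```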

Automated Review Checklist

  1. ✅ Folder structure correct for recipes category
  2. ✅ Required files present (README.md + metadata.json + code)
  3. ✅ metadata.json valid and complete
  4. ✅ No credentials, API keys, or secrets
  5. ✅ SQL safety — no DROP TABLE, DROP DATABASE, TRUNCATE, or unqualified DELETE FROM
  6. ✅ Category-specific artifacts — Python script with detailed instructions
  7. ✅ PR format — title starts with [recipes]
  8. ✅ No binary blobs over 1MB
  9. ✅ README completeness — all required sections present
  10. ✅ No primitive dependencies declared
  11. ✅ Scope check — all changes within contribution folder
  12. ✅ No local MCP servers (N/A for this recipe)
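
Checklist item 5 can be approximated with a small scanner like the one below. This is not the project's actual review tooling; it is a sketch that reads "unqualified DELETE FROM" as "a DELETE FROM statement with no WHERE clause" and splits statements naively on semicolons.

```python
import re

DROP_OR_TRUNCATE = re.compile(r"\b(DROP\s+(TABLE|DATABASE)|TRUNCATE)\b", re.IGNORECASE)
DELETE_FROM = re.compile(r"\bDELETE\s+FROM\b", re.IGNORECASE)
WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def find_dangerous_sql(text: str) -> list[str]:
    """Return the statements in `text` that match one of the flagged patterns."""
    findings = []
    for statement in text.split(";"):
        stmt = " ".join(statement.split())  # collapse whitespace and newlines
        if not stmt:
            continue
        if DROP_OR_TRUNCATE.search(stmt):
            findings.append(stmt)
        elif DELETE_FROM.search(stmt) and not WHERE.search(stmt):
            findings.append(stmt)
    return findings
```

A regex scan like this can produce false positives (for example, keywords inside string literals), so it suits a first-pass review aid rather than a guarantee.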

Test Results Validation

PR author reports testing against real Perplexity export:

  • 1,607 conversations found, 1,605 processed, 1,733 thoughts generated
  • 64 memory entries found, 61 processed, 61 thoughts generated
  • Estimated cost: $0.39
  • Zero errors

This demonstrates thorough testing on a production-scale dataset.

Minor Suggestions (Optional)

These are nice-to-haves, not blockers:

  1. README clarity: Consider adding a note in Prerequisites about requesting the Perplexity export early (can take days) so users aren't blocked mid-setup
  2. Error messaging: Line 223 could suggest checking Perplexity sheet names if export format changes
  3. Progress indicator: For large exports (hundreds of conversations), a progress bar would improve UX (though the current verbose output works fine)
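
Suggestions 2 and 3 could be addressed with helpers along these lines. Both functions are hypothetical, and the sheet names used in the usage note are placeholders, since the export's real sheet names are not given in this PR.

```python
def pick_sheet(available: list[str], wanted: str) -> str:
    """Return the matching sheet name, or raise an error that lists what the export contains."""
    for name in available:
        if name.strip().lower() == wanted.strip().lower():
            return name
    listing = ", ".join(available) or "(none)"
    raise ValueError(
        f"Sheet {wanted!r} not found. Sheets in this export: {listing}. "
        "Perplexity may have changed the export format; check the sheet names."
    )


def progress_line(done: int, total: int, width: int = 20) -> str:
    """Render a one-line text progress bar such as '[#####.....] 5/10'."""
    filled = width * done // total if total else width
    return f"[{'#' * filled}{'.' * (width - filled)}] {done}/{total}"
```

With openpyxl, `available` would typically come from the workbook's `sheetnames` attribute, for example `pick_sheet(workbook.sheetnames, "Conversations")` where "Conversations" is a placeholder name.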

Verdict: ✅ Ready to merge

This contribution meets all automated and human review standards. It's well-documented, thoroughly tested, secure, and provides genuine value to the Open Brain community.

Great work, @demarant! This is exactly the kind of quality contribution OB1 needs. The attention to detail on timestamp preservation and the selective summarization prompt shows real thought about how people will use this data long-term.


Recommendation: Merge after author acknowledges any minor suggestions (or merge as-is — they're truly optional).

@justfinethanku merged commit 726e506 into NateBJones-Projects:main on Mar 24, 2026