feat: Add org-mode file indexing and search #9

@plur9

Summary

Add support for indexing and searching .org files (org-mode format) alongside markdown files.

Motivation

Currently, datacortex indexes only markdown files (zettel, page, journal, etc.). As a result, GTD tasks in org/inbox.org and org/next_actions.org are not searchable via /search or the knowledge graph.

Being able to search across both knowledge (markdown) and tasks (org) would enable queries like:

  • "What tasks mention the Dubai pilot?"
  • "Show me all TODO items related to fundraising"
  • "Find blocked tasks from last month"

Current State

-- File types currently indexed:
SELECT DISTINCT type FROM files;
-- active, agent, clipping, journal, page, readme, research, zettel

-- Org files indexed:
SELECT COUNT(*) FROM files WHERE path LIKE '%.org';
-- 0

Proposed Solution

  1. Add org parser to datacortex/indexer/ that extracts:

    • Headlines (as searchable titles)
    • TODO state (TODO, NEXT, WAITING, DONE)
    • Tags (:tag1:tag2:)
    • Properties (:PROPERTIES: drawer)
    • Body content
    • Timestamps

  2. New node types: task, project, heading

  3. Link extraction: Parse [[wiki-links]] and file: links to build graph connections

  4. Embedding: Embed task headlines + context for semantic search
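
The extraction in steps 1 to 3 could be sketched roughly as follows. This is a minimal illustration, not the datacortex implementation: the names OrgNode, parse_org, and node_type are hypothetical, Python is assumed as the indexer language, and only a simplified subset of org syntax is handled (a fixed TODO keyword list, :PROPERTIES: drawers, and [[...]] links). The "project" node type is left out because the issue does not define how a project is recognised.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumption: the keyword set is fixed to the states listed in this issue.
TODO_STATES = ("TODO", "NEXT", "WAITING", "DONE")

# "* TODO Some title   :tag1:tag2:" -> stars, optional state, title, optional tags
HEADLINE_RE = re.compile(
    r"^(?P<stars>\*+)\s+"
    r"(?:(?P<state>" + "|".join(TODO_STATES) + r")\s+)?"
    r"(?P<title>.*?)"
    r"(?:\s+(?P<tags>:[\w@#%:]+:))?\s*$"
)
PROPERTY_RE = re.compile(r"^\s*:(?P<key>[^:\s]+):\s*(?P<value>.*)$")
# Active <...> and inactive [...] timestamps that start with a date.
TIMESTAMP_RE = re.compile(r"[<\[]\d{4}-\d{2}-\d{2}[^>\]\n]*[>\]]")
# [[target]] or [[target][description]]; only the target is captured.
LINK_RE = re.compile(r"\[\[([^\]\[]+)\](?:\[[^\]]*\])?\]")


@dataclass
class OrgNode:  # hypothetical record, one per headline
    level: int
    title: str
    todo_state: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    properties: Dict[str, str] = field(default_factory=dict)
    timestamps: List[str] = field(default_factory=list)
    links: List[str] = field(default_factory=list)
    body: List[str] = field(default_factory=list)


def parse_org(text: str) -> List[OrgNode]:
    """Split org text into one OrgNode per headline (simplified)."""
    nodes: List[OrgNode] = []
    current: Optional[OrgNode] = None
    in_drawer = False
    for line in text.splitlines():
        headline = HEADLINE_RE.match(line)
        if headline:
            tags = headline.group("tags")
            current = OrgNode(
                level=len(headline.group("stars")),
                title=headline.group("title").strip(),
                todo_state=headline.group("state"),
                tags=tags.strip(":").split(":") if tags else [],
            )
            nodes.append(current)
            in_drawer = False
        elif current is None:
            continue  # text before the first headline is ignored in this sketch
        else:
            stripped = line.strip()
            if stripped == ":PROPERTIES:":
                in_drawer = True
                continue
            if stripped == ":END:":
                in_drawer = False
                continue
            if in_drawer:
                prop = PROPERTY_RE.match(line)
                if prop:
                    current.properties[prop.group("key")] = prop.group("value").strip()
                continue
            current.body.append(line)
        # Headline and body lines both contribute timestamps and links.
        current.timestamps.extend(TIMESTAMP_RE.findall(line))
        current.links.extend(LINK_RE.findall(line))
    return nodes


def node_type(node: OrgNode) -> str:
    """Assumed mapping: headlines with a TODO state are tasks, others headings."""
    return "task" if node.todo_state else "heading"
```

Calling parse_org on the contents of a file such as org/inbox.org would yield one record per headline, whose title, tags, and body could then be passed to the existing indexing and embedding steps. Link targets are returned as raw strings (for example a wiki-link name or a file: path), so resolving them to graph nodes remains a separate step.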

Example Use Cases

# Find tasks about a topic
datacortex search "dubai pilot" --type task

# Search across everything
datacortex search "fundraising deadline"

🤖 Generated with Claude Code
