# Epic: AI-Driven Semantic Search for envited-x Data Assets

## Description

**Goal:** Implement an intelligent, scalable search solution for envited-x data assets to enable semantic discovery, unlock Work Package 1 (Indexing), and prepare the platform for handling extremely large datasets.

**Problem:** LLM context windows are limited, so it is infeasible to search massive datasets (e.g., terabytes of video footage, gigabytes of point clouds) directly. We need to structure and index the associated metadata so that these assets become searchable and can be related to one another.

**Proposed Solution: The Agent-Based Search Bar**

This Epic proposes creating a search interface where an AI agent acts as an intelligent intermediary between the user and the data catalog.

1.  **Metadata Indexing:**
    
      * Index JSON-LD files (referenced by tokens/IPFS) into a structured database.
      * **Primary Focus:** Implement the indexing into a **Graph Database** to leverage existing ontologies and create connections (triples) between entities.
      * **Investigation:** Research and compare the benefits of a **Vector Database** versus a Graph Database, or explore a combined approach.

2.  **Agent-Led Query Translation:**
    
      * The AI Agent will receive natural language queries from the user.
      * The Agent will use its knowledge of the ontologies to understand the query's intent (e.g., "map of Munich" -> "GEO reference").
      * The Agent will generate and execute precise, structured queries (e.g., SPARQL) against the indexed database.

3.  **Interactive Refinement:**
    
      * The Agent will engage in a conversational flow to resolve ambiguities, asking the user for clarification (e.g., "You mentioned Munich. Are you searching for a Geo reference, or something else?").
      * This interactive process will also provide valuable feedback to refine ontologies and understand actual user search intent.

4.  **Flexible Search Tiers:**
    
      * Implement an optional **"Extended Search"** or **"Deep Search"** feature that allows users to pay for a more comprehensive search.
      * In a Deep Search, the agent would load a large portion of the pre-sorted data into its context window (analogous to RAM or an L1 cache) and let the LLM search that data directly.
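
To make the metadata indexing step (item 1) more concrete, here is a minimal sketch of flattening a JSON-LD document into triples and looking them up by predicate. It is only an illustration under stated assumptions: the sample document, the `ex:` namespace, and names such as `HDMap`, `hasGeoReference`, and `TripleIndex` are hypothetical, and the hand-rolled expansion handles only simple prefix mappings. A real indexer would likely use a proper JSON-LD processor and write into the chosen Graph Database.

```python
import json
from collections import defaultdict

# Hypothetical sample metadata; real envited-x JSON-LD files will differ.
SAMPLE = """
{
  "@context": {"ex": "https://example.org/ontology#"},
  "@id": "ex:asset-1",
  "@type": "ex:HDMap",
  "ex:name": "Munich city centre map",
  "ex:hasGeoReference": {"@id": "ex:geo-munich", "ex:city": "Munich"}
}
"""


def expand(term, context):
    """Expand a compact IRI such as 'ex:name' using simple prefix mappings."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


def to_triples(node, context):
    """Yield (subject, predicate, object) triples for a node and its nested nodes.

    Assumption: every node carries an "@id" (blank nodes are not handled).
    """
    subject = expand(node["@id"], context)
    for key, value in node.items():
        if key in ("@id", "@context"):
            continue
        predicate = "rdf:type" if key == "@type" else expand(key, context)
        for v in value if isinstance(value, list) else [value]:
            if isinstance(v, dict):
                # Nested node: link to it, then index its own properties.
                yield (subject, predicate, expand(v["@id"], context))
                yield from to_triples(v, context)
            elif key == "@type":
                yield (subject, predicate, expand(v, context))
            else:
                # Plain literal value, stored as a string.
                yield (subject, predicate, str(v))


class TripleIndex:
    """Tiny in-memory stand-in for a graph database, keyed by predicate."""

    def __init__(self):
        self.by_predicate = defaultdict(set)

    def add(self, triple):
        s, p, o = triple
        self.by_predicate[p].add((s, o))

    def subjects(self, predicate, obj):
        """Return all subjects that have the given predicate/object pair."""
        return sorted(s for s, o in self.by_predicate.get(predicate, ()) if o == obj)


EX = "https://example.org/ontology#"
doc = json.loads(SAMPLE)
index = TripleIndex()
for triple in to_triples(doc, doc.get("@context", {})):
    index.add(triple)

print(index.subjects(EX + "city", "Munich"))
```

The nested geo reference becomes its own node connected to the asset by a triple, which is the kind of entity-to-entity connection the Graph Database is meant to expose to queries.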

## Key Objectives & Deliverables

  * Database schema and indexing logic for Graph DB.
  * Basic indexer functionality for JSON-LD metadata.
  * Functional AI agent capable of translating user text into database queries.
  * Working prototype of the search bar integrated with the agent and database.
  * Unblock Work Package 1 by demonstrating a solid understanding and a working implementation of the data structure and indexing solution.

## Potential Sub-Tasks (Initial Sprint Focus)

  * Define and set up the Graph Database infrastructure.
  * Implement a tool to index token-referenced JSON-LD files.
  * Develop a minimal agent skill to map a natural language input to a basic SPARQL query.
  * Create an initial set of test data and ontologies to demonstrate core search capabilities.
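
As a starting point for the minimal agent skill, the sketch below maps a natural language input to a basic SPARQL query and falls back to a clarifying question when the intent is ambiguous. It is a hedged illustration, not the intended design: a keyword lookup stands in for the LLM, and the `ex:` namespace, the `TYPE_KEYWORDS` and `KNOWN_PLACES` tables, and the `translate` function are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ontology terms; the real envited-x ontologies define the actual IRIs.
EX = "https://example.org/ontology#"
TYPE_KEYWORDS = {"map": EX + "HDMap", "scenario": EX + "Scenario"}
KNOWN_PLACES = {"munich": "Munich", "berlin": "Berlin"}


@dataclass
class Translation:
    sparql: Optional[str]         # set when the query could be translated
    clarification: Optional[str]  # set when the agent needs to ask back


def translate(query: str) -> Translation:
    """Translate user text into SPARQL, or return a clarifying question."""
    words = [w.strip(".,?!\"'").lower() for w in query.split()]
    asset_type = next((TYPE_KEYWORDS[w] for w in words if w in TYPE_KEYWORDS), None)
    place = next((KNOWN_PLACES[w] for w in words if w in KNOWN_PLACES), None)

    if place is None:
        return Translation(None, "Which location are you interested in?")
    if asset_type is None:
        return Translation(
            None,
            f"You mentioned {place}. Are you searching for a Geo reference, "
            "or something else?",
        )

    # Values come from the fixed lookup tables above, never from raw user
    # text, so nothing user-controlled is interpolated into the query.
    sparql = (
        f"PREFIX ex: <{EX}>\n"
        "SELECT ?asset WHERE {\n"
        f"  ?asset a <{asset_type}> ;\n"
        "         ex:hasGeoReference ?geo .\n"
        f"  ?geo ex:city \"{place}\" .\n"
        "}"
    )
    return Translation(sparql, None)


result = translate("map of Munich")
print(result.sparql)

ambiguous = translate("Munich")
print(ambiguous.clarification)
```

Returning either a query or a question from the same function keeps the conversational refinement loop explicit: the search bar can execute the SPARQL when it is present and otherwise show the clarification to the user.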

