fix(search): expose markdown frontmatter on chunks #314

Open
RerankerGuo wants to merge 4 commits into agentscope-ai:main from RerankerGuo:fix/markdown-chunk-frontmatter-filter-api

Conversation

@RerankerGuo

Contributor

Summary

  • copy parsed Markdown frontmatter onto emitted FileChunk metadata
  • preserve JSON-safe frontmatter values such as conversation_date for search filters
  • add coverage showing chunk metadata and metadata search_filter plumbing
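
For readers without the diff open, the first two bullets can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the `FileChunk` dataclass here is a stand-in, `json_safe_frontmatter` and `attach_frontmatter` are hypothetical helper names, and both the date-to-ISO-string coercion and the "existing chunk metadata wins" collision rule are assumptions.

```python
import datetime as dt
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FileChunk:
    """Stand-in for the project's chunk type (assumed shape, not the real class)."""

    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def json_safe_frontmatter(frontmatter: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only frontmatter values that can be serialized to JSON."""
    safe: Dict[str, Any] = {}
    for key, value in frontmatter.items():
        # Assumption: YAML dates (e.g. conversation_date) become ISO strings.
        if isinstance(value, (dt.date, dt.datetime)):
            value = value.isoformat()
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            continue  # drop values that JSON cannot represent
        safe[str(key)] = value
    return safe


def attach_frontmatter(
    chunks: List[FileChunk],
    frontmatter: Dict[str, Any],
) -> List[FileChunk]:
    """Copy JSON-safe frontmatter onto every chunk's metadata."""
    safe = json_safe_frontmatter(frontmatter)
    for chunk in chunks:
        # Assumption: keys already set on the chunk take precedence.
        chunk.metadata = {**safe, **chunk.metadata}
    return chunks
```

With this shape, a filter on `conversation_date` only needs to compare plain strings in chunk metadata, which is the kind of search-filter use the summary mentions.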

Refs #302

Verification

  • .. [100%]
    2 passed in 1.02s
  • check python ast.........................................................Passed
    check yaml...........................................(no files to check)Skipped
    check xml............................................(no files to check)Skipped
    check toml...........................................(no files to check)Skipped
    check json...........................................(no files to check)Skipped
    detect private key.......................................................Passed
    trim trailing whitespace.................................................Passed
    Add trailing commas......................................................Passed
    black....................................................................Passed
    flake8...................................................................Passed
    pylint...................................................................Passed
    Check package with Pyroma................................................Passed

@ployts

ployts commented Jul 2, 2026

Collaborator

Thanks for working on this! Exposing the Markdown frontmatter on chunks is a great addition for search filtering.
To keep this backward compatible and avoid unexpected side effects for existing users, I suggest adding a configuration flag (e.g., enable_frontmatter_metadata or include_frontmatter_in_metadata) to toggle this behavior. Could we set it to False by default?
That would let users opt in when needed while keeping the default behavior unchanged. Let me know your thoughts!

@RerankerGuo

RerankerGuo commented Jul 2, 2026

Contributor Author

Updated per feedback. I added the include_frontmatter_in_metadata flag, set the default config value to false, and adjusted tests to cover both the backward-compatible default and the opt-in metadata behavior.

Verification:

  • Targeted pytest: 2 passed
  • Pre-commit on touched files: passed
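
A rough sketch of the opt-in behavior described above, for illustration only. The flag name `include_frontmatter_in_metadata` and its false default come from this thread; the `ChunkingConfig` class and `build_chunk_metadata` function are hypothetical names, and the real config object and call sites in the PR may look different.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ChunkingConfig:
    """Hypothetical config holder; only the flag name is taken from the thread."""

    include_frontmatter_in_metadata: bool = False  # default keeps old behavior


def build_chunk_metadata(
    base: Dict[str, Any],
    frontmatter: Dict[str, Any],
    config: ChunkingConfig,
) -> Dict[str, Any]:
    """Return chunk metadata, adding frontmatter only when the flag is on."""
    metadata = dict(base)  # copy so the caller's dict is not mutated
    if config.include_frontmatter_in_metadata:
        for key, value in frontmatter.items():
            # Assumption: existing metadata keys are never overwritten.
            metadata.setdefault(key, value)
    return metadata


base = {"path": "notes.md"}
frontmatter = {"conversation_date": "2024-05-01"}

# Default config: metadata is unchanged, matching pre-PR behavior.
default_meta = build_chunk_metadata(base, frontmatter, ChunkingConfig())

# Opt-in config: frontmatter keys become available to search filters.
opt_in_meta = build_chunk_metadata(
    base,
    frontmatter,
    ChunkingConfig(include_frontmatter_in_metadata=True),
)
```

The two calls at the end mirror the two cases the updated tests are said to cover: the backward-compatible default and the opt-in path.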


2 participants