Implement evals for ai-helpers skills #99

@patternfly-jira-sync

Description

Define and implement an evaluation framework for measuring AI skill output quality across all ai-helpers plugins. The spike (child story) determines the framework and criteria; follow-up stories implement evals per plugin.

Scope:

  • Research and select an eval framework

  • Coordinate with UXD on their evals research to date

  • Define "good output" criteria per skill

  • Implement eval test cases for all skills across all plugins (react, migration, design-to-code, code-review, pf-workshop)

  • Make evals runnable both locally and in CI
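
As a rough illustration of the scope items above, here is a minimal sketch of how "good output" criteria and an eval test case might be expressed. It does not propose a framework; that choice belongs to the spike. Every name here (`EvalCase`, `run_case`, `pass_rate`, `fake_skill`, the skill name, and the three criteria) is hypothetical and is not taken from ai-helpers. The skill call is stubbed so the example runs without a model.

```python
from dataclasses import dataclass
from typing import Callable

# A criterion is a named yes/no check on the skill's text output.
# All names in this sketch are hypothetical, not part of ai-helpers.
Criterion = Callable[[str], bool]


@dataclass
class EvalCase:
    plugin: str                      # e.g. "react" (plugin name from the issue)
    skill: str                       # hypothetical skill identifier
    prompt: str                      # input given to the skill
    criteria: dict[str, Criterion]   # "good output" checks, keyed by name


def run_case(case: EvalCase, generate: Callable[[str], str]) -> dict[str, bool]:
    """Run one case and report pass/fail for each named criterion."""
    output = generate(case.prompt)
    return {name: check(output) for name, check in case.criteria.items()}


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of criteria that passed; 0.0 when there are no criteria."""
    return sum(results.values()) / len(results) if results else 0.0


def fake_skill(prompt: str) -> str:
    """Stub standing in for a real skill invocation (assumption)."""
    return (
        'import { Button } from "@patternfly/react-core";\n'
        '<Button variant="primary">Save</Button>'
    )


case = EvalCase(
    plugin="react",
    skill="example-skill",
    prompt="Create a primary Save button",
    criteria={
        "imports_from_react_core": lambda out: "@patternfly/react-core" in out,
        "uses_button_component": lambda out: "<Button" in out,
        "no_inline_styles": lambda out: "style=" not in out,
    },
)

results = run_case(case, fake_skill)
print(results, pass_rate(results))
```

Because the generator is passed in as a plain function, the same cases could run against a stub locally and against a real skill invocation in CI. Simple string checks like these only cover deterministic criteria; subjective quality would likely need a different scoring approach.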


Jira Issue: PF-4230
