
Create evaluations #167

@gr2m

Description


Whenever we change the prompt in tigent.yml, the system prompt, or any relevant code, we want to make sure we don't break current behavior.

Ideas we have so far:

  1. Add tests to the code, using webhook fixtures as input and the desired outcome expressed as labels (see the sketch after this list)
  2. Create a dedicated evaluations repository in the @tigent org with issues and labels, then use the issues as input (fetched dynamically when running evaluations) and the applied labels as the expected output
  3. Use the vercel/ai repository itself by adding an `evaluations` array of issue/PR numbers, and use those issues/PRs on the ai repository as evals
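For idea 1, a test could look roughly like the following. This is a minimal sketch assuming a hypothetical `runTigent(payload)` entry point that takes a parsed webhook payload and returns the labels the agent would apply, plus hypothetical fixture file names; the real entry point, its signature, and the fixture format would depend on how tigent is actually wired up.

```ts
// Sketch of idea 1: webhook fixtures in, expected labels out.
import { describe, expect, it } from "vitest";
import { readFileSync } from "node:fs";

// Hypothetical entry point; the real one lives in the tigent codebase
// and may have a different name and signature.
import { runTigent } from "../src/tigent";

// Each case pairs a recorded webhook payload (fixture names are made up
// for illustration) with the labels we expect the agent to apply.
const cases = [
  { fixture: "issues.opened.bug-report.json", expected: ["bug", "ai/core"] },
  { fixture: "issues.opened.feature-request.json", expected: ["feature"] },
];

describe("tigent evaluations", () => {
  for (const { fixture, expected } of cases) {
    it(`applies ${expected.join(", ")} for ${fixture}`, async () => {
      const payload = JSON.parse(
        readFileSync(new URL(`./fixtures/${fixture}`, import.meta.url), "utf8"),
      );
      const labels = await runTigent(payload);
      expect([...labels].sort()).toEqual([...expected].sort());
    });
  }
});
```

One upside of this layout is that the expected labels live next to the fixtures, so any prompt or code change that shifts behavior shows up as an ordinary test failure in CI.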

Metadata

Labels

- `ai/core`: core functions like generateText, streamText, etc., provider utils, and the provider spec
- `feature`: new feature or request
- `pull request welcome`: for issues with clear instructions that welcome community contributions
