This issue tracks progress on an evaluation framework for the code generation functionality.

- [ ] Pass the RAG data (#3) to a large-ish LLM
- [ ] Ingest the codebase and its dependencies
- [ ] Delete functions and ask the LLM to recreate them, using the project's own tests to evaluate the result (see the sketch at the bottom of this issue)
- [ ] Any other coding benchmarks we can think of, with a focus on using the contextual data

Resources:
- https://aider.chat
- https://github.com/FSoft-AI4Code/RepoHyper/
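
A minimal sketch of the "delete a function, regenerate it, run the tests" item, assuming Python target repos and pytest as the test runner. `remove_function`, `generate_function`, and `evaluate` are hypothetical names introduced here for illustration; `generate_function` is a stub for whatever LLM call (fed the RAG context from #3) we end up using.

```python
import ast
import pathlib
import shutil
import subprocess
import tempfile


def remove_function(source: str, name: str) -> tuple[str, str]:
    """Return the source with `name` cut out, plus the removed function text."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            lines = source.splitlines(keepends=True)
            removed = "".join(lines[node.lineno - 1:node.end_lineno])
            kept = "".join(lines[:node.lineno - 1] + lines[node.end_lineno:])
            return kept, removed
    raise ValueError(f"function {name!r} not found")


def generate_function(gutted_source: str, name: str) -> str:
    """Placeholder for the LLM call; would take the gutted file plus RAG context."""
    raise NotImplementedError


def evaluate(repo: pathlib.Path, rel_path: str, func_name: str) -> bool:
    """Copy the repo, cut one function, regenerate it, and run the project's tests."""
    with tempfile.TemporaryDirectory() as tmp:
        work = pathlib.Path(tmp) / repo.name
        shutil.copytree(repo, work)
        target = work / rel_path
        gutted, _original = remove_function(target.read_text(), func_name)
        # Appending at the end of the file is a simplification; a real harness
        # would splice the regenerated body back into its original position.
        target.write_text(gutted + "\n" + generate_function(gutted, func_name))
        result = subprocess.run(["pytest", "-q"], cwd=work)
        return result.returncode == 0  # pass iff the project's own tests still pass
```

A per-repo score could then be the fraction of deleted functions whose regenerated versions keep the test suite green.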