Project Page: https://infi-coder.github.io/inficoder-eval/
(Forked from Code Generation LM Evaluation Harness)
This is a very lightweight fork of bigcode-evaluation-harness to support inference on InfiCoder-Eval benchmark prompts.
The setup process and prerequisites are the same as for the original bigcode-evaluation-harness framework. There are only minor changes to the original code (e.g., supporting `max_new_tokens` and always enabling `use_cache` in generation), along with the InfiCoder-Eval tasks added.
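Concretely, the generation-side tweak amounts to capping newly generated tokens (rather than the total sequence length) and keeping the KV cache on. A hedged sketch of the idea with plain `transformers`, not the fork's exact diff:

```python
# Hedged sketch: cap new tokens (not total length) and always keep the
# KV cache on during decoding. Illustrative, not this fork's exact diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)

inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,   # budget for generated tokens only, excluding the prompt
    use_cache=True,      # reuse the KV cache at every decoding step
    do_sample=True,
    temperature=0.2,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```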
New tasks for InfiCoder-Eval:

- `code-ffqa-v2`: the default task; prompts with `system_prompt + '\n' + content_prompt` (see the sketch after this list).
- `code-ffqa-v2-endn`: prompts with `system_prompt + '\n' + content_prompt + '\n'`.
- `code-ffqa-v2-deepseek-chat`: deepseek-coder-instruct format.
- `code-ffqa-v2-baichuan2`: Baichuan2 model format.
- `code-ffqa-v2-zypher`: zephyr-7b-beta format.
- `code-ffqa-v2-octo`: OctoPack model format.
- `code-ffqa-v2-wizard`: WizardCoder-Python model format.
- `code-ffqa-v2-phi`: phi-1.5 model format.
- `code-ffqa-v2-inficoder`: our InfiCoder model format.
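For the two generic variants the assembly is plain string concatenation. A minimal sketch follows (illustrative only, with `system_prompt` and `content_prompt` standing for the two parts of each benchmark prompt; the real code lives in bigcode_eval/tasks/code_ffqa_v200.py):

```python
# Minimal sketch of the generic prompt assembly; names are illustrative.
def build_prompt(system_prompt: str, content_prompt: str, endn: bool = False) -> str:
    """code-ffqa-v2 joins the two parts with a newline;
    code-ffqa-v2-endn additionally appends a trailing newline."""
    prompt = system_prompt + "\n" + content_prompt
    if endn:
        prompt += "\n"
    return prompt

assert build_prompt("You are a helpful assistant.", "How do I sort a list?") == \
    "You are a helpful assistant.\nHow do I sort a list?"
```

The model-specific tasks wrap the same two parts in the chat template that each model family expects.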
For detailed information, please visit the InfiCoder-Eval project page.
For InfiCoder-Eval, we use this framework only for response generation. The actual evaluation is delegated to our Evaluation Repo, which can be deployed on the same instance or a separate one.
An example usage can be found in run.sh:

```bash
# This script shows how to run inference for InfiCoder-Eval with this repo.
# See detailed instructions at https://infi-coder.github.io/inficoder-eval/
export DATASET_CSV_PATH=..../inficoder-eval-framework/batched_prompts/suite_v2.0.0_dev.csv

# For example, to evaluate phi-1.5:
# First, generate the responses.
accelerate launch ..../ffqa-evaluation-harness/main.py --model microsoft/phi-1_5 --tasks code-ffqa-v2-phi --batch_size 16 --precision bf16 --n_samples 30 --do_sample True --temperature 0.2 --top_p 0.9 --save_generations --save_references --trust_remote_code --generation_only --max_length_generation 2048 --save_generations_path generations_phi-1_5.json --eos='<|endoftext|>'

# Then, join the generations with case names and output a CSV file that the
# evaluation framework can process.
python3 ffqa_processor.py generations_phi-1_5.json references.json ../phi-1_5_output.csv --eos '<|endoftext|>'
```

A detailed illustration is on our project page: https://infi-coder.github.io/inficoder-eval/.
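To give a sense of what the join step produces, here is a hedged sketch of the processing ffqa_processor.py performs. The JSON layouts and CSV columns below are assumptions for illustration, not the documented format; see ffqa_processor.py for the actual logic:

```python
# Hedged sketch of the generations-to-CSV join; the real logic lives in
# ffqa_processor.py. The JSON layouts and CSV columns are assumptions.
import csv
import json

with open("generations_phi-1_5.json") as f:
    generations = json.load(f)  # assumed: one list of n_samples responses per prompt
with open("references.json") as f:
    case_names = json.load(f)   # assumed: case names aligned with the prompts

with open("../phi-1_5_output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["case_name", "sample_id", "response"])
    for case_name, responses in zip(case_names, generations):
        for i, response in enumerate(responses):
            # Trim anything after the EOS marker, mirroring the --eos flag.
            writer.writerow([case_name, i, response.split("<|endoftext|>")[0]])
```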
To implement a new task or prompting method for InfiCoder-Eval, please read and modify bigcode_eval/tasks/code_ffqa_v200.py. For generic task extensions, see the guide in docs/guide. There are also contribution guidelines in CONTRIBUTING.md.
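If you only need a new prompting variant, the shape of a task class is roughly the following. This is a minimal sketch against the bigcode-evaluation-harness `Task` interface; the class name and CSV column names (`system_prompt`, `content_prompt`, `case_name`) are illustrative assumptions, and a real task must also be registered in bigcode_eval/tasks/__init__.py:

```python
# Illustrative sketch of a new prompting variant for InfiCoder-Eval; names
# are hypothetical. See bigcode_eval/tasks/code_ffqa_v200.py for the real code.
import os

import pandas as pd
from bigcode_eval.base import Task


class CodeFFQAMyFormat(Task):
    def __init__(self):
        super().__init__(stop_words=["<|endoftext|>"], requires_execution=False)
        # Benchmark prompts come from the CSV exported by the inficoder-eval
        # framework (see DATASET_CSV_PATH in run.sh). Column names are assumed.
        self.dataset = pd.read_csv(os.environ["DATASET_CSV_PATH"])

    def get_dataset(self):
        return self.dataset.to_dict("records")

    def get_prompt(self, doc):
        # Swap in whatever chat template your model expects.
        return doc["system_prompt"] + "\n" + doc["content_prompt"]

    def get_reference(self, doc):
        # The case name lets ffqa_processor.py join generations back to cases.
        return doc["case_name"]

    def postprocess_generation(self, generation, idx):
        # Strip the echoed prompt, keeping only the model's response.
        prompt = self.get_prompt(self.get_dataset()[idx])
        return generation[len(prompt):]

    def process_results(self, generations, references):
        # Generation-only here; scoring happens in the separate evaluation repo.
        return {}
```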
In the long term, we plan to integrate the InfiCoder-Eval evaluation framework into this repo and merge this benchmark into the official bigcode-evaluation-harness. If you are interested in this effort, you are more than welcome to contact us!
We thank the BigCode team for developing such a great framework, and EleutherAI for their work on the lm-evaluation-harness, upon which this repository is built.
To cite InfiCoder-Eval:

```
@misc{li2023inficodereval,
  author = {InfiCoderTeam},
  title = {InfiCoder-Eval: Systematically Evaluating Question-Answering for Code Large Language Models},
  year = {2023},
  publisher = {GitHub Pages},
  howpublished = {\url{https://infi-coder.github.io/inficoder-eval/}}
}
```