diff --git a/content-blog/lm-eval-challenges.md b/content-blog/lm-eval-challenges.md
new file mode 100644
index 0000000..7fad229
--- /dev/null
+++ b/content-blog/lm-eval-challenges.md
@@ -0,0 +1,41 @@
+---
+title: "Challenges in Language Model Evaluation"
+date: 2024-04-20T00:00:00
+description: "ICML 2024 Tutorial"
+author: ["Lintang Sutawika", "Hailey Schoelkopf"]
+draft: true
+---
+
+$$
+\text{July 22nd, Time and Place TBA}
+$$
+
+NLP and Machine Learning rely on benchmarks and evaluation to accurately track progress in the field and assess the efficacy of new models and methodologies. For this reason, good evaluation practices and accurate reporting are crucial. However, language models not only inherit the challenges previously faced in benchmarking, but also introduce a slew of novel considerations which can make proper comparison across models difficult, misleading, or near-impossible. In this tutorial, we aim to bring attendees up to speed on the state of language model evaluation and to highlight current challenges in evaluating language model performance by discussing the various evaluation methods, tasks, and benchmarks commonly associated with measuring progress in language model research. We will then discuss how these common pitfalls can be addressed and what considerations should be taken into account to enhance future work.
+
+## Contact Info
+
+- Lintang Sutawika: `lintang@eleuther.ai`
+- Hailey Schoelkopf: `hailey@eleuther.ai`
+
+## Schedule
+
+TBA
+
+## Reading List
+
+TBA
+
+## Citation
+
+TBA