judge-model

Here are 3 public repositories matching this topic...

IAAR-Shanghai / xFinder

[ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation

Updated Nov 14, 2025
Python

IAAR-Shanghai / xVerify

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

benchmark regex reliability evaluation llm reliability-tools chatgpt cc-by-nc-nd-4 open-compass llm-as-a-judge deepseek-math judge-model reasoning-models open-r1 xverify math-verify

Updated Nov 13, 2025
Jupyter Notebook

luckeyfaraday / fusion-engine

Self-hosted Fusion API and OpenRouter Fusion alternative for building reliable multi-model LLM ensembles: fan out one prompt to N models, then synthesize one answer with a judge model. OpenAI-compatible API, CLI, and eval harness.

python benchmarking ai-agents model-fusion fastapi llm openrouter llm-orchestration judge-model openai-compatible llm-ensemble fusion-api

Updated Jun 15, 2026
Python
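
The fan-out-then-judge pattern named in the fusion-engine description can be illustrated with a minimal sketch. This is not the repository's code or API: `fuse`, the three stand-in models, and `majority_judge` are hypothetical names, the "models" are plain functions rather than LLM calls, and a simple majority vote is swapped in where a real system would use a judge model.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical type aliases: a model maps a prompt to an answer,
# a judge maps the prompt plus all candidate answers to one final answer.
Model = Callable[[str], str]
Judge = Callable[[str, List[str]], str]


def fuse(prompt: str, models: List[Model], judge: Judge) -> str:
    """Send one prompt to every model in parallel, then let the judge decide."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        # pool.map preserves the order of `models` in the results.
        candidates = list(pool.map(lambda model: model(prompt), models))
    return judge(prompt, candidates)


# Stand-in models (assumptions for illustration, not real LLM clients).
def model_a(prompt: str) -> str:
    return "4"


def model_b(prompt: str) -> str:
    return "4"


def model_c(prompt: str) -> str:
    return "5"


def majority_judge(prompt: str, candidates: List[str]) -> str:
    """Stand-in judge: return the most frequent candidate answer."""
    return Counter(candidates).most_common(1)[0][0]


answer = fuse("What is 2 + 2?", [model_a, model_b, model_c], majority_judge)
print(answer)
```

Keeping the judge as a separate callable means the majority vote could later be replaced by a call to an actual judge model without changing the fan-out step.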
