llm-arena

Here are 10 public repositories matching this topic...

black-yt / structai

StructAI is a toolkit for LLM interaction, covering structured outputs, context management, and parallel execution.

Updated Mar 2, 2026
Python

wchao6891 / ChineseStressBench

A Chinese-language benchmark of high-pressure, complex tasks. Its main purpose is to test whether a model will cause problems in real-world work.

benchmark decision-making stress-testing ai-safety reasoning ai-evaluation llm-evaluation llm-arena llm-benchmark chinese-benchmark

Updated May 9, 2026
HTML

skorotkiewicz / tronmcp

Tron-style multiplayer light-cycle game for LLMs via MCP

rust mcp sse tron llm llm-arena

Updated Feb 23, 2026
Rust

horde-research / Kaz-Offline-Arena

Offline LLM evaluation pipeline for Kazakh: run local HF models, auto-judge, export JSON for the Arena leaderboard: https://huggingface.co/spaces/kz-transformers/kaz-offline-arena

evaluation kazakh bradley-terry-model llm llm-arena

Updated May 1, 2025
Jupyter Notebook

s41r4j / ocla

Open Cyber LLM Arena | A transparent, crowdsourced benchmarking platform for evaluating Large Language Models (LLMs) on cybersecurity tasks.

benchmarking benchmark ai cybersecurity cyber llm llm-arena ai-benchmark cybersecurity-ai ocla

Updated Feb 10, 2026
TypeScript

faeton / multicooker

Run several LLM agents on the same task in parallel docker sandboxes, then have other LLMs judge them. Uses your Claude Pro / ChatGPT Plus / Gemini Advanced subscriptions — no API keys.

docker benchmarking sandbox gemini codex evaluation-framework ai-agents claude parallel-execution llm prompt-engineering llm-evaluation llm-as-judge llm-arena claude-code

Updated May 16, 2026
Python

mayafree-ai / Huggingface-MAYA

Open-source mirror of 4 flagship MAYA AI Hugging Face Spaces (all-leaderboard, QWEN-3_5-CHAT, openclaw-moltbot, fish-s2-pro-zero); each folder is a deployable Space

open-source leaderboard gradio voice-cloning speech-to-speech huggingface-spaces llm-chatbot qwen korean-ai llm-arena maya-ai

Updated Apr 22, 2026
HTML

GinSing1226 / multi-ai-web-chatroom

A cross-platform AI comparison chatroom that automates AI web interfaces instead of using APIs. Send one message and get simultaneous responses from ChatGPT, DeepSeek, Gemini, GLM, and more. Local-first and free; saves conversations as Markdown files.

markdown chatbot playwright chatgpt deepseek llm-arena ai-skill openclaw-skills

Updated Mar 3, 2026
TypeScript

kaying-studio / OpenArenaStudio

Generate side-by-side LLM coding battle videos with your own API keys — free, local, open source.

benchmark arena screen-recorder code-generation open-arena llm ai-video llm-arena llm-comparison llm-benchmark open-screen

Updated Apr 29, 2026

subhakantrout / local-ai-engine

Cortex is a hyper-efficient, local, multi-model AI reasoning engine with support for RAG, Tree of Thought, Arena mode, and persistent memory.

rag fastapi tree-of-thought local-ai ollama reasoning-engine llm-arena

Updated May 16, 2026
Python
