This repo contains Oumi’s problem statement and starter template for:
- Nebius.Build SF, March 15, 2026
- Eclipse 6.0, April 4-5, 2026
The theme for Oumi’s hackathon track is Small Language Models (SLMs) for Voice Agents.
A Voice Agent differs from a standard AI agent in that users interact with it through spoken conversation. Instead of typing prompts, users speak to the agent and it responds with synthesized speech.
A typical Voice Agent pipeline works as follows:
- Audio is captured from the user’s microphone.
- A speech-to-text (STT) model transcribes the audio into text.
- The text is processed by an AI agent powered by a language model.
- The agent’s response is converted back into audio using a text-to-speech (TTS) model and played to the user.
Voice Agents are commonly used in real-time applications such as automated telephone customer support.
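The pipeline above can be sketched in a few lines. This is a minimal illustration, not the template's actual code: the `stt()`, `agent()`, and `tts()` helpers are hypothetical placeholders you would replace with a real STT model, your fine-tuned SLM, and a TTS engine.

```python
# Minimal Voice Agent turn (sketch). All three helpers are placeholders --
# swap in real STT, SLM, and TTS components in an actual build.

def stt(audio: bytes) -> str:
    """Placeholder speech-to-text: pretend the audio decodes to a question."""
    return "what are your opening hours"

def agent(text: str) -> str:
    """Placeholder agent: a fine-tuned SLM would generate this reply."""
    return f"You asked: '{text}'. We are open 9am to 5pm."

def tts(text: str) -> bytes:
    """Placeholder text-to-speech: encode the reply as audio bytes."""
    return text.encode("utf-8")

def voice_agent_turn(audio_in: bytes) -> bytes:
    """One conversational turn: audio in -> text -> agent -> audio out."""
    transcript = stt(audio_in)   # 1. transcribe the user's speech
    reply = agent(transcript)    # 2. reason over the text
    return tts(reply)            # 3. synthesize the spoken response

print(voice_agent_turn(b"\x00\x01").decode("utf-8"))
```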
In this hackathon, Small Language Models (SLMs) are defined as language models with fewer than ~10 billion parameters. Compared to large models, SLMs are significantly cheaper to run and can often operate efficiently on consumer-grade edge devices. When fine-tuned for specific tasks, SLMs can even outperform much larger models such as GPT-5.4 or Claude Opus 4.6.
Voice Agents have strict latency requirements because they operate in real time. Delays in transcription, reasoning, or speech synthesis can significantly degrade the conversational experience. Because SLMs are smaller and faster to run, they can reduce response times and improve overall responsiveness. In many cases, an architecture composed of multiple specialized SLMs working together may achieve lower latency and better performance than a single large general-purpose model.
Your task in this hackathon is to build a Voice Agent where one or more fine-tuned SLMs play a central role.
Requirements:
- You must use Oumi to fine-tune the models in your solution and 🌟 star the Oumi GitHub repo 🌟.
- There are no restrictions on the application domain, but the agent should address a specific, realistic use case.
- There is no need to justify the use of SLMs with evaluations, although the latency benefits should be clear.

- There is no requirement to use open-weight models, although doing so is highly encouraged.
Submissions will be evaluated based on:
- Creativity
- Real-world impact
- Technical quality
Here are some suggestions for how you can modularize an agent into task-specific models, each of which could be implemented as a fine-tuned SLM:
- Guardrails and LLM-Judges (are my inputs and outputs valid, safe, and relevant?)
- Query rewriting (how could this query be rewritten for more effective knowledge retrieval?)
- Execution routing (which step in the workflow should I take next given the user query?)
- Retrieval routing (which of my data sources - vector/graph database(s) etc. - should I search given the user query?)
- Model routing (should this query go to the powerful LLM or simpler SLM?)
- Planner (develop a multi-step plan to achieve the intended outcome)
- Verifier (what would happen if we carried out the plan - is it a good idea?)
- Executors (convert the plan into a sequence of tool calls)
- Memory management (are there any relevant facts in the query that would be useful in the future?)
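As one example of the modules above, "execution routing" can be framed as a classification task for a small model. The sketch below uses a keyword stub, `fake_slm_route()`, in place of a fine-tuned SLM so the example runs without a model; the route labels are hypothetical.

```python
# Execution-routing sketch: a small model picks the next workflow step.
# fake_slm_route() is a stand-in for a fine-tuned SLM that emits one label.

ROUTES = ["faq_lookup", "order_status", "human_handoff"]

def fake_slm_route(query: str) -> str:
    """Placeholder router: keyword rules instead of a real SLM."""
    q = query.lower()
    if "order" in q:
        return "order_status"
    if "human" in q or "agent" in q:
        return "human_handoff"
    return "faq_lookup"

def handle(query: str) -> str:
    route = fake_slm_route(query)
    # A real system should validate the model's output against known routes.
    assert route in ROUTES
    return route

print(handle("Where is my order?"))      # order_status
print(handle("Can I talk to a human?"))  # human_handoff
```

A fine-tuned SLM trained on (query, route) pairs would replace the keyword rules, keeping this decision fast enough for a real-time voice loop.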
We have included a starter template Voice Agent in this repo under /template. See the README there for instructions on installation and usage.
The template is intended to let participants focus on the "agent" part of the voice agent without having to worry about the STT, TTS, and audio pipeline parts. There is no requirement to use this code, although it may help you build faster.
After judging is complete, we will add interesting submissions to this section.
- Blog
- Oumi Open-Source Stack
- Agent frameworks
- Other libraries
- Papers
