#102 vLLM subclass implementation as inference backend and example updated. by solankinitish · Pull Request #107 · vitalops/datatune

solankinitish · 2026-06-09T17:07:12Z

Closes #102

Problem

The existing VLLM implementation (added in #106) was incomplete: it imported httpx inside the class body to fetch max_model_len from the vLLM server at init time. As a result, instantiation failed without a live server, and the code relied on an undeclared dependency.

Changes

- Reimplemented VLLM as a clean subclass following the same pattern as Ollama: api_base and max_tokens are passed explicitly, and no network calls are made at init.
- max_tokens defaults to 4096 and can be overridden by the user to match their model's context length.
- Rate limiting and batch distribution are handled by the existing LLM infrastructure.
- Added vLLM to the LLM provider section in examples/Getting_started.ipynb.
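
For reviewers who want to see the shape of the pattern described above, here is a minimal, self-contained sketch. It is not the code in this PR: `BaseLLMSketch`, `VLLMSketch`, and `request_kwargs` are hypothetical names used only for illustration, and the real base class in datatune may look different. The only details taken from this description are the explicit api_base and max_tokens arguments, the 4096 default, and the absence of network calls at init.

```python
from dataclasses import dataclass


@dataclass
class BaseLLMSketch:
    """Hypothetical stand-in for the shared LLM base class (not datatune's real class)."""

    model_name: str
    api_base: str
    max_tokens: int

    def request_kwargs(self) -> dict:
        # Build the arguments a completion call would use; nothing is sent here.
        return {
            "model": self.model_name,
            "api_base": self.api_base,
            "max_tokens": self.max_tokens,
        }


class VLLMSketch(BaseLLMSketch):
    """Hypothetical subclass: stores configuration only, no network access in __init__."""

    def __init__(self, model_name: str, api_base: str, max_tokens: int = 4096):
        if max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        # Strip a trailing slash so later URL joins stay predictable (an assumption,
        # not something this PR states).
        super().__init__(model_name, api_base.rstrip("/"), max_tokens)


# Construction succeeds with no vLLM server running.
llm = VLLMSketch(
    "mistralai/Mistral-7B-Instruct-v0.1",
    api_base="http://localhost:8000/v1/",
)

# Users override max_tokens to match their model's context length.
long_ctx = VLLMSketch(
    "mistralai/Mistral-7B-Instruct-v0.1",
    api_base="http://localhost:8000/v1",
    max_tokens=8192,
)
```

Because the constructor only records its arguments, a missing or unreachable server surfaces on the first request rather than at import or instantiation time.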

Usage

from datatune.llm.llm import VLLM

llm = VLLM(
    model_name="mistralai/Mistral-7B-Instruct-v0.1",
    api_base="http://localhost:8000/v1",
    max_tokens=4096
)

fix(vllm): simplify VLLM backend, remove httpx dependency, add example (vitalops#102)

2395fbe

solankinitish wants to merge 1 commit into vitalops:main from solankinitish:issue-102-vllm-backend
