
feat(workspaces): add opinionated llm-inference workspace#176

Merged
mcd01 merged 13 commits into main from
feat/llm_inference_workspace
Feb 13, 2026
Conversation


@mcd01 mcd01 commented Feb 2, 2026

Re-integrate LLM Inference Workspace (llm-d based, experimental)

Summary

This PR re-integrates the LLM inference workspace, now built on top of the llm-d reference stack.

The feature should be considered experimental.
Compared to other “exalsius-native” workspaces, the setup is not yet as streamlined, but it already enables Exalsius members to deploy and serve LLMs directly inside a workspace.

The goal is to provide an early, practical way to run model inference workloads while we iterate on UX and integration.


What’s included

Workspace template integration

  • Add new integrated template: LLM_D
  • Add default editing comments for the template
  • Register template alongside existing ones (Jupyter, Marimo, Dev Pod, Dist Training)

New configurator

  • Introduce LLMInferenceConfigurator
  • Automatically configures:
    • Hugging Face token secret
    • Model URI (hf://<repo>/<model>)
    • Model labels
    • InferencePool label matching
    • Tensor parallelism (= num_gpus)
    • Accelerator type (AMD / NVIDIA)
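To make the mapping concrete, here is a minimal sketch of how such a configurator could translate the CLI inputs into template values. The class and field names below are illustrative assumptions, not the actual `LLMInferenceConfigurator` implementation:

```python
# Hypothetical sketch of the input-to-template mapping; names are
# illustrative and do not reflect the real exalsius code.
from dataclasses import dataclass


@dataclass
class LLMInferenceValues:
    hf_token_secret: str     # name of the secret holding the HF token
    model_uri: str           # artifact URI consumed by llm-d
    model_label: str         # label used for InferencePool matching
    tensor_parallelism: int  # one shard per GPU
    accelerator_type: str    # "nvidia" or "amd"


def configure(model_name: str, num_gpus: int, gpu_vendor: str) -> LLMInferenceValues:
    repo, model = model_name.split("/", 1)
    return LLMInferenceValues(
        hf_token_secret="hf-token-secret",
        model_uri=f"hf://{repo}/{model}",
        model_label=model.lower(),
        tensor_parallelism=num_gpus,
        accelerator_type=gpu_vendor.lower(),
    )
```

The key point is that tensor parallelism is derived directly from `--num-gpus` and the accelerator type from the cluster's GPU vendor, so the user never edits these template variables by hand.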

CLI command

New command:

exls workspaces deploy llm-inference

Options:

  • --huggingface-token
  • --model-name
  • --num-gpus
  • --wait-for-ready

Includes:

  • Model name validation (<repo>/<model>)
  • GPU count validation
  • Resource selection based on cluster
  • Standard confirmation + deployment flow
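The validation steps could look roughly like the following sketch (illustrative only; the actual CLI checks and error messages may differ):

```python
# Illustrative validation helpers; not the actual exls implementation.
import re

# A model name must look like <repo>/<model>, e.g. "Qwen/Qwen3-1.7B".
MODEL_NAME_RE = re.compile(r"^[\w.-]+/[\w.-]+$")


def validate_model_name(name: str) -> str:
    if not MODEL_NAME_RE.match(name):
        raise ValueError(f"model name must look like <repo>/<model>, got {name!r}")
    return name


def validate_num_gpus(value: int) -> int:
    if value < 1:
        raise ValueError("num-gpus must be a positive integer")
    return value
```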

Example usage

exls workspaces deploy llm-inference \
  --model-name Qwen/Qwen3-1.7B \
  --hf-token $HF_TOKEN \
  --num-gpus 2 \
  --wait-for-ready

Behavior

The configurator automatically translates inputs into template variables:

Input        Effect
model name   Sets model artifact URI + labels
HF token     Creates auth secret reference
num GPUs     Sets tensor parallelism
GPU vendor   Sets accelerator type (nvidia/amd)

This removes most manual editing from the deployment process.


Helm chart

This workspace integrates the existing chart:

https://github.com/exalsius/exalsius-workspace-hub/tree/main/workspace-templates/llm-inference/llm-d-model


Notes

  • Experimental feature
  • Not yet as polished as other workspace types
  • Intended for early adopters and internal testing
  • Backwards compatible (pure addition)

@mcd01 mcd01 requested a review from alek-thunder February 6, 2026 17:06
@mcd01 mcd01 self-assigned this Feb 6, 2026
@mcd01 mcd01 force-pushed the feat/llm_inference_workspace branch from ec23cb7 to 7aa45fe on February 6, 2026 17:08
@mcd01 mcd01 marked this pull request as ready for review February 6, 2026 17:12
@mcd01 mcd01 changed the title feat: add opinionated llm-inference workspace feat(workspaces): add opinionated llm-inference workspace Feb 6, 2026
@srnbckr srnbckr force-pushed the feat/llm_inference_workspace branch from bff3596 to cd0a266 on February 12, 2026 13:20

srnbckr commented Feb 12, 2026

I've rebased the branch and fixed some minor issues directly. Please have a look at my commits and check whether that matches what you had in mind.

We could discuss adding a short interactive flow, e.g. prompting the user for the Hugging Face token and model name instead of requiring them as flags (similar to how it's done for the marimo or jupyter workspaces), but otherwise I would approve this PR.

@mcd01 mcd01 merged commit 0f9428b into main Feb 13, 2026
1 check failed
@mcd01 mcd01 deleted the feat/llm_inference_workspace branch February 13, 2026 13:49
