
fix: terminate MLflow server when Caldera exits #8

Open
ChenFryd wants to merge 1 commit into mitre:main from autonet-internal:fix/mlflow-not-shutdown-on-exit

@ChenFryd

Description

hook.py started the MLflow server with subprocess.Popen() but discarded the process reference,
leaving MLflow running as an orphan after Caldera shuts down.

MLflow internally spawns worker processes via Python multiprocessing. When the MLflow parent dies
without cleanup, those workers get reparented to init (PID=1) and continue listening on port 5000
indefinitely. Repeated Caldera restarts accumulate these orphaned processes.

The fix stores the Popen reference in _mlflow_proc and registers an atexit handler
(_shutdown_mlflow) that calls terminate() on exit, waits up to 5 seconds for the process to
stop, and then falls back to kill().

Closes #7

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  1. Started Caldera with the MCP plugin enabled
  2. Confirmed MLflow was listening: ss -tlnp | grep 5000
  3. Stopped Caldera (Ctrl+C)
  4. Confirmed MLflow was no longer listening: ss -tlnp | grep 5000
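
The manual check in steps 2 and 4 could also be scripted. The helper below is a hypothetical sketch, not part of this PR: it only tests whether something accepts TCP connections on a port, and it assumes the server is reachable on 127.0.0.1.

```python
import socket


def is_port_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
```

Calling is_port_listening(5000) while Caldera is running and again after stopping it would mirror the two ss checks. Unlike ss -tlnp, it does not show which process owns the port.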

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

The MLflow server was started with subprocess.Popen() but the process
reference was discarded, leaving it running as an orphan after Caldera
shuts down.

Store the Popen reference in _mlflow_proc and register an atexit handler
that terminates it gracefully (with a 5-second timeout before force-kill)
when the Caldera process exits.
@github-actions

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

Development

Successfully merging this pull request may close these issues.

MLflow server is not terminated when Caldera exits