
# production_cluster

1. Host the MLflow server (fill in the credentials and endpoint for your S3-compatible store):

```bash
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export MLFLOW_S3_ENDPOINT_URL=

mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root \
  --host 0.0.0.0 \
  --port 5000
```
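Every client of this tracking server (training jobs and the serving container alike) needs the same tracking URI and S3 credentials. A small helper like the following keeps them in one place; it is a hypothetical sketch, not code from this repo, and the default URI assumes the server started above on port 5000:

```python
import os

# Hypothetical helper: gather the MLflow/S3 settings that every client of
# this tracking server needs. Credential values fall back to empty strings
# when the corresponding environment variables are not set.
def mlflow_env(tracking_uri="http://localhost:5000"):
    return {
        "MLFLOW_TRACKING_URI": tracking_uri,
        "AWS_ACCESS_KEY_ID": os.environ.get("AWS_ACCESS_KEY_ID", ""),
        "AWS_SECRET_ACCESS_KEY": os.environ.get("AWS_SECRET_ACCESS_KEY", ""),
        "MLFLOW_S3_ENDPOINT_URL": os.environ.get("MLFLOW_S3_ENDPOINT_URL", ""),
    }

env = mlflow_env()
print(sorted(env))
```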
2. Build the Docker image for vLLM serving:

```bash
sudo docker build -t lora-vllm .
```
3. [Optional] Expose MLflow over HTTP (using ngrok or zrok):

```bash
ngrok http 5000
```
4. Test the Docker image (the image tag matches the build step above; fill in the empty credential values):

```bash
docker run --rm \
  -e MODEL_NAME="thinking" \
  -e MODEL_VERSION="5" \
  -e MODEL_ALIAS="champion" \
  -e MLFLOW_TRACKING_URI="" \
  -e AWS_ACCESS_KEY_ID="" \
  -e AWS_SECRET_ACCESS_KEY="" \
  -e MLFLOW_S3_ENDPOINT_URL="" \
  -e MLFLOW_TRACKING_USERNAME="admin" \
  -e MLFLOW_TRACKING_PASSWORD="" \
  -e VLLM_LOGGING_LEVEL=DEBUG \
  --gpus all \
  -p 8000:8000 \
  lora-vllm
```
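When the container is launched from code rather than by hand (as the deployment API in the later steps does), the same invocation can be assembled programmatically. A minimal sketch, with a hypothetical helper that is not part of this repo:

```python
# Hypothetical sketch: build the `docker run` argument list for the serving
# image from a dict of environment variables, mirroring the command above.
def build_docker_run(image, env, port=8000, gpus="all"):
    cmd = ["docker", "run", "--rm", "--gpus", gpus, "-p", f"{port}:{port}"]
    for key, value in env.items():
        cmd += ["-e", f"{key}={value}"]  # one -e flag per environment variable
    cmd.append(image)
    return cmd

cmd = build_docker_run("lora-vllm", {"MODEL_NAME": "thinking", "MODEL_VERSION": "5"})
print(" ".join(cmd))
```

The list form is suitable for passing straight to `subprocess.run` without shell quoting concerns.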
5. Host the deployment API:

```bash
uvicorn serve_vllm_api:app --host 0.0.0.0 --port 6789
```
6. Start vLLM through the deployment API:

```bash
curl -X POST http://localhost:6789/start-vllm \
  -H "Content-Type: application/json" \
  -d '{"MODEL_NAME": "initial-sft", "MODEL_VERSION": "latest", "MLFLOW_TRACKING_URI": "", "AWS_ACCESS_KEY_ID": "", "AWS_SECRET_ACCESS_KEY": "", "MLFLOW_S3_ENDPOINT_URL": "", "VLLM_LOGGING_LEVEL": "DEBUG"}'
```
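The same request can be sent from Python with only the standard library. A sketch, assuming the deployment API is listening on localhost:6789 as started above; the empty credential fields are placeholders to fill in:

```python
import json
import urllib.request

# Payload mirrors the curl example; empty strings are placeholders.
payload = {
    "MODEL_NAME": "initial-sft",
    "MODEL_VERSION": "latest",
    "MLFLOW_TRACKING_URI": "",
    "AWS_ACCESS_KEY_ID": "",
    "AWS_SECRET_ACCESS_KEY": "",
    "MLFLOW_S3_ENDPOINT_URL": "",
    "VLLM_LOGGING_LEVEL": "DEBUG",
}

def start_vllm(payload, url="http://localhost:6789/start-vllm"):
    """POST the model spec to the deployment API and return its JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # requires the API to be running
        return json.loads(resp.read())
```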
7. The model is now being served on port 8000; exercise it with the streaming test:

```bash
python test_streaming.py
```
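vLLM's default entrypoint serves an OpenAI-compatible API, so the stream on port 8000 arrives as server-sent events. A sketch of parsing one such event line; the sample chunk is illustrative only, and `test_streaming.py` in this repo is the actual client:

```python
import json

# Each SSE line from an OpenAI-compatible streaming endpoint looks like
# "data: {...json...}"; the stream terminates with "data: [DONE]".
def parse_sse_line(line):
    """Return the decoded chunk dict, or None for blanks and the [DONE] sentinel."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    body = line[len("data:"):].strip()
    if body == "[DONE]":
        return None
    return json.loads(body)

# Illustrative chunk shaped like an OpenAI-style streaming completion.
sample = 'data: {"choices": [{"text": "Hello"}]}'
chunk = parse_sse_line(sample)
print(chunk["choices"][0]["text"])  # prints "Hello"
```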
