You can find more details in the Kubeflow `mnist` example. The difference is that this end-to-end example uses Kubeflow Fairing to build a Docker image and launch a `TFJob` for distributed training, and then creates an `InferenceService` (KFServing CRD) to deploy the model as a service.
This example guides you through:
- Taking an example TensorFlow model and modifying it to support distributed training.
- Using Kubeflow Fairing to build a Docker image and launch a `TFJob` to train the model.
- Using Kubeflow Fairing to create an `InferenceService` (KFServing CR) to deploy the trained model.
- Cleaning up the `TFJob` and `InferenceService` using the `kubeflow-tfjob` and `kfserving` SDK clients.
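Supporting distributed training (the first item above) typically means having the model code read the `TF_CONFIG` environment variable that a `TFJob` sets on each replica. A minimal sketch of that parsing step, with an illustrative sample cluster spec (the hostnames, ports, and variable names here are placeholders, not values from the notebook):

```python
import json
import os

# A TFJob injects TF_CONFIG into each replica's environment.
# This sample value is illustrative; in a real TFJob it is already set.
os.environ.setdefault("TF_CONFIG", json.dumps({
    "cluster": {
        "worker": ["mnist-worker-0:2222", "mnist-worker-1:2222"],
        "ps": ["mnist-ps-0:2222"],
    },
    "task": {"type": "worker", "index": 0},
}))

tf_config = json.loads(os.environ["TF_CONFIG"])
cluster_spec = tf_config["cluster"]      # replica addresses, grouped by role
task_type = tf_config["task"]["type"]    # e.g. "worker", "ps", or "chief"
task_index = tf_config["task"]["index"]  # this replica's index within its role

print(task_type, task_index, len(cluster_spec["worker"]))
```

TensorFlow's distribution strategies read this same variable, so the model code itself usually only needs to pick a strategy and let the `TFJob` controller supply the cluster layout.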
- Launch a Jupyter notebook.
- Open the notebook `mnist_e2e_on_prem.ipynb`.
- Follow the notebook to train and deploy MNIST on Kubeflow.
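The notebook creates the `InferenceService` through the Fairing and `kfserving` SDKs rather than raw manifests, but the resource it produces is conceptually similar to the following sketch (the name, namespace, and `storageUri` below are illustrative placeholders):

```yaml
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: mnist-sample        # placeholder name
  namespace: kubeflow       # placeholder namespace
spec:
  default:
    predictor:
      tensorflow:
        # placeholder location of the exported SavedModel
        storageUri: "pvc://mnist-pvc/export"
```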