Introduction
MLOps comes up often in discussions about deploying machine learning models. Data scientists face many challenges with software engineering and DevOps processes. MLOps addresses these challenges and provides a methodology for scalable deployment that follows the best practices of software engineering and DevOps.
We will explore one such MLOps tool, Kubeflow, and get familiar with it.
Machine Learning Project Workflow
A data science project has two main phases, each with its own steps:
- Experimentation
  - Data Prep
  - Model Training
- Inference
  - Prediction
  - Service Management
Kubeflow
Deploy Kubeflow on Minikube
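The commands in this section rely on the minikube and kubectl CLIs, and the minikube output shown here uses the docker driver. A minimal sketch of a pre-flight check, assuming a POSIX shell; `missing_tools` is a hypothetical helper name introduced here, not part of either tool.

```shell
# Sketch: print the name of every given command that is not on PATH.
# missing_tools is a hypothetical helper, not part of minikube or kubectl.
missing_tools() {
  for tool in "$@"; do
    # command -v exits non-zero when the command cannot be found
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}
```

Calling `missing_tools minikube kubectl docker` prints nothing when all three are installed, and otherwise prints one missing name per line.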
$ minikube start --cpus 4 --memory 8096 --disk-size=40g
😄 minikube v1.23.2 on Darwin 11.7.2
✨ Using the docker driver based on existing profile
👍 Starting control plane node minikube in cluster minikube
🚜 Pulling base image ...
🎉 minikube 1.28.0 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.28.0
💡 To disable this notice, run: 'minikube config set WantUpdateNotification false'
🔄 Restarting existing docker container for "minikube" ...
🐳 Preparing Kubernetes v1.22.2 on Docker 20.10.8 ...
🔎 Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
    ▪ Using image kubernetesui/dashboard:v2.3.1
    ▪ Using image kubernetesui/metrics-scraper:v1.0.7
🌟 Enabled addons: storage-provisioner, default-storageclass, dashboard
❗ /usr/local/bin/kubectl is version 1.24.0, which may have incompatibilites with Kubernetes 1.22.2.
    ▪ Want kubectl v1.22.2? Try 'minikube kubectl -- get pods -A'
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
$ minikube dashboard --url
🤔 Verifying dashboard health ...
🚀 Launching proxy ...
🤔 Verifying proxy health ...
http://127.0.0.1:60665/api/v1/namespaces/kubernetes-dashboard/services/http:kubernetes-dashboard:/proxy/
$ export PIPELINE_VERSION=1.8.5
$ kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
$ kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
$ kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION"
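The kubectl install steps above could be wrapped in a single function so they can be previewed before touching the cluster. This is a minimal sketch, assuming a POSIX shell; `run`, `install_kfp`, `DRY_RUN`, and `KFP_MANIFESTS` are hypothetical names introduced for illustration, while the kubectl arguments are copied from the commands above.

```shell
# Sketch: the three install steps as one function, with an optional dry run.
# run, install_kfp, DRY_RUN and KFP_MANIFESTS are hypothetical names.
PIPELINE_VERSION="${PIPELINE_VERSION:-1.8.5}"
KFP_MANIFESTS="github.com/kubeflow/pipelines/manifests/kustomize"

# Execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Apply cluster-scoped resources, wait for the CRD, then apply the environment.
# The && chain stops at the first step that fails.
install_kfp() {
  run kubectl apply -k "${KFP_MANIFESTS}/cluster-scoped-resources?ref=${PIPELINE_VERSION}" &&
  run kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io &&
  run kubectl apply -k "${KFP_MANIFESTS}/env/platform-agnostic-pns?ref=${PIPELINE_VERSION}"
}
```

Running `DRY_RUN=1; install_kfp` prints the three commands without executing them. Calling `install_kfp` with `DRY_RUN` unset runs them against the current kubectl context.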
$ kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
Forwarding from 127.0.0.1:8080 -> 3000
Forwarding from [::1]:8080 -> 3000
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
With the port-forward running, the Kubeflow Pipelines UI is available at http://localhost:8080/
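The UI pods can take a while to start after the manifests are applied, so the forwarded port may not answer right away. One way to check is to poll the URL from a second terminal. This is a minimal sketch, assuming curl is installed; `wait_for_url` and its attempt and delay parameters are hypothetical and not part of kubectl or Kubeflow.

```shell
# Sketch: poll a URL until it answers successfully or the attempts run out.
# wait_for_url is a hypothetical helper; it assumes curl is on PATH.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failures, -s: silent, -o /dev/null: discard body
    if curl -fs -o /dev/null --max-time 5 "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}
```

For example, `wait_for_url http://localhost:8080/` retries every 2 seconds, up to 30 times, and returns a non-zero status if the UI never responds.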