Turning your OpenShift Pipelines into an MLOps pipeline involves adapting your existing CI/CD workflows to accommodate the unique requirements of machine learning (ML) model development, deployment, and monitoring. MLOps extends DevOps practices to include the full ML lifecycle, ensuring that models are reproducible, scalable, and maintainable.
Introduction
Machine Learning Operations (MLOps) is the practice of applying DevOps principles to machine learning workflows. It addresses common challenges such as:
- Automation: Reducing manual efforts in data preprocessing, model training, and deployment.
- Version Control: Tracking changes in models, code, and datasets.
- Reproducibility: Ensuring consistent results across different environments.
- Monitoring: Continuously monitoring model performance in production.
OpenShift Pipelines is a Kubernetes-native CI/CD framework based on Tekton. It allows you to build, test, and deploy across cloud providers and on-premises systems.
KitOps is a tool that simplifies the packaging and deployment of AI models and their dependencies into portable units called ModelKits.
By combining these tools, you can create a robust MLOps pipeline that automates your machine learning workflows.
Prerequisites
Before you begin, ensure you have the following:
- OpenShift Account: Access to OpenShift with administrative privileges. You can use the OpenShift Developer Sandbox.
- KitOps: Installed locally; see the KitOps installation guide.
- Container Registry: Such as Jozu Hub, Docker Hub, or GitHub Container Registry.
- GitHub Account: For hosting your code repository.
- HuggingFace Account: To access pre-trained models.
- Git CLI: Installed on your local machine.
- Basic Knowledge: Familiarity with Kubernetes, Docker, and YAML files.
Step-by-Step Guide
Step 1: Install KitOps
First, install KitOps on your local machine.
For Linux:
wget https://github.com/jozu-ai/kitops/releases/latest/download/kitops-linux-x86_64.tar.gz
tar -xzvf kitops-linux-x86_64.tar.gz
sudo mv kit /usr/local/bin/
Verify the Installation:
kit version
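If the install succeeded, kit version prints the release you just unpacked. For scripts, a small guard (plain POSIX shell, nothing KitOps-specific) makes the check explicit:

```shell
#!/bin/sh
# Confirm the kit binary is reachable on PATH before proceeding.
if command -v kit >/dev/null 2>&1; then
  kit version
else
  echo "kit not found on PATH - re-check the install step"
fi
```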
Step 2: Set Up a Container Registry
You’ll need a container registry to store your packaged models.
Using Jozu Hub:
- Create an Account: Sign up at Jozu Hub.
- Create a Repository: Name it appropriately, e.g., my-ml-model.
- Authenticate: Log in from your terminal.
kit login jozu.ml
You’ll be prompted for your username (email) and password.
Step 3: Download a Model from HuggingFace
Choose a pre-trained model from HuggingFace. For this guide, we’ll use the Qwen2-0.5B-Instruct-GGUF model.
Download the Model and Related Files:
mkdir my-ml-project && cd my-ml-project
# Download the model file
wget https://huggingface.co/Qwen/Qwen2-0.5B-Instruct-GGUF/resolve/main/qwen2-0_5b-instruct-q2_k.gguf
# Download the LICENSE and README
wget https://huggingface.co/Qwen/Qwen2-0.5B-Instruct-GGUF/resolve/main/LICENSE
wget https://huggingface.co/Qwen/Qwen2-0.5B-Instruct-GGUF/resolve/main/README.md
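Downloads from HuggingFace occasionally fail partway through, so it is worth confirming that all three files landed and are non-empty before moving on:

```shell
#!/bin/sh
# Sanity-check that each download exists and is non-empty.
for f in qwen2-0_5b-instruct-q2_k.gguf LICENSE README.md; do
  if [ -s "$f" ]; then
    echo "ok: $f"
  else
    echo "missing or empty: $f"
  fi
done
```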
Step 4: Pack Your ModelKit with KitOps
Organize your files and create a Kitfile to define your ModelKit.
Organize Files:
mkdir models docs
mv qwen2-0_5b-instruct-q2_k.gguf models/
mv LICENSE README.md docs/
Create a Kitfile:
manifestVersion: 1.0.0
package:
  name: qwen2-0.5B
  version: 2.0.0
  description: The instruction-tuned 0.5B Qwen2 large language model.
  authors: [Your Name]
model:
  name: qwen2-0_5b-instruct-q2_k
  path: models/qwen2-0_5b-instruct-q2_k.gguf
  description: The model downloaded from HuggingFace.
code:
  - path: docs/LICENSE
    description: License file.
  - path: docs/README.md
    description: Readme file.
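Before packing, it is worth confirming that every path the Kitfile references actually exists; kit pack will fail otherwise. The helper below is a hypothetical convenience, shown against a throwaway copy of the layout above so it is self-contained:

```shell
#!/bin/sh
# Hypothetical pre-pack check: every 'path:' entry in the Kitfile must
# point at a real file. Runs in a temp copy of the layout above.
dir=$(mktemp -d)
cd "$dir" || exit 1
mkdir -p models docs
touch models/qwen2-0_5b-instruct-q2_k.gguf docs/LICENSE docs/README.md
cat > Kitfile <<'EOF'
model:
  path: models/qwen2-0_5b-instruct-q2_k.gguf
code:
  - path: docs/LICENSE
  - path: docs/README.md
EOF

missing=0
for p in $(grep 'path:' Kitfile | awk '{print $NF}'); do
  [ -e "$p" ] || { echo "missing: $p"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all Kitfile paths present"
rm -rf "$dir"
```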
Pack the ModelKit:
kit pack . -t jozu.ml/<your-jozu-username>/<your-repo-name>:latest
Replace <your-jozu-username> and <your-repo-name> with your actual Jozu Hub username and repository name.
Step 5: Push Your ModelKit to the Container Registry
Push the packaged ModelKit to your container registry.
kit push jozu.ml/<your-jozu-username>/<your-repo-name>:latest
Verify the Push:
Log in to Jozu Hub and confirm that your ModelKit appears in your repository.
Step 6: Create an OpenShift Pipeline
We’ll create a Tekton pipeline in OpenShift to automate the deployment process.
Set Up Your GitHub Repository:
1. Initialize Git:
git init
git add .
git commit -m "Initial commit"
2. Create a GitHub Repository and push your code.
git remote add origin https://github.com/<your-github-username>/<your-repo-name>.git
git branch -M main
git push -u origin main
3. Create Tekton Pipeline Definition:
Create a directory named .tekton and add a file called pipeline.yaml.
mkdir .tekton
touch .tekton/pipeline.yaml
4. Contents of pipeline.yaml:
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: qwen-openshift-pipeline
spec:
  workspaces:
    - name: shared-workspace
  tasks:
    - name: git-clone
      taskRef:
        name: git-clone
      params:
        - name: url
          value: 'https://github.com/<your-github-username>/<your-repo-name>.git'
        - name: revision
          value: 'main'
      workspaces:
        - name: output
          workspace: shared-workspace
    - name: install-kitops
      runAfter:
        - git-clone
      taskSpec:
        workspaces:
          - name: shared-workspace
        steps:
          - name: install-kit
            image: docker.io/library/golang:1.17
            script: |
              #!/bin/bash
              # Install kit into the shared workspace: each task runs in its
              # own pod, so a binary placed in /usr/local/bin here would not
              # survive into the later tasks.
              wget https://github.com/jozu-ai/kitops/releases/latest/download/kitops-linux-x86_64.tar.gz
              tar -xzvf kitops-linux-x86_64.tar.gz
              mv kit $(workspaces.shared-workspace.path)/kit
              $(workspaces.shared-workspace.path)/kit version
      workspaces:
        - name: shared-workspace
          workspace: shared-workspace
    - name: login-jozu
      runAfter:
        - install-kitops
      taskSpec:
        workspaces:
          - name: shared-workspace
        steps:
          - name: login
            image: docker.io/library/golang:1.17
            script: |
              #!/bin/bash
              # kit keeps its credential store under the user home directory;
              # pointing HOME at the shared workspace lets the login carry
              # over into the next task's pod.
              export HOME=$(workspaces.shared-workspace.path)
              "$HOME/kit" login jozu.ml -u '<your-jozu-username>' -p '<your-jozu-password>'
      workspaces:
        - name: shared-workspace
          workspace: shared-workspace
    - name: pack-and-push
      runAfter:
        - login-jozu
      taskSpec:
        workspaces:
          - name: shared-workspace
        steps:
          - name: pack-and-push
            image: docker.io/library/golang:1.17
            script: |
              #!/bin/bash
              export HOME=$(workspaces.shared-workspace.path)
              cd "$HOME"
              ./kit pack . -t jozu.ml/<your-jozu-username>/<your-repo-name>:latest
              ./kit push jozu.ml/<your-jozu-username>/<your-repo-name>:latest
      workspaces:
        - name: shared-workspace
          workspace: shared-workspace
Note:
- Replace placeholders like <your-jozu-username> and <your-jozu-password> with your actual credentials.
- In a production environment, avoid hardcoding passwords. Use Kubernetes Secrets to manage sensitive information.
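One way to do that is a Kubernetes Secret mounted into the login step as environment variables. In the sketch below, the secret name (jozu-credentials), key names, and JOZU_USER/JOZU_PASS variable names are all illustrative placeholders, not KitOps or Tekton requirements:

```yaml
# Illustrative only - apply with: oc apply -f secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: jozu-credentials
type: Opaque
stringData:
  username: <your-jozu-username>
  password: <your-jozu-password>
---
# Fragment of the login step, consuming the secret via env vars:
steps:
  - name: login
    image: docker.io/library/golang:1.17
    env:
      - name: JOZU_USER
        valueFrom:
          secretKeyRef:
            name: jozu-credentials
            key: username
      - name: JOZU_PASS
        valueFrom:
          secretKeyRef:
            name: jozu-credentials
            key: password
    script: |
      #!/bin/bash
      kit login jozu.ml -u "$JOZU_USER" -p "$JOZU_PASS"
```

This keeps the credentials out of the pipeline definition and out of your Git history.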
Apply the Pipeline to OpenShift:
Log in to OpenShift CLI:
oc login
Apply the Pipeline:
oc apply -f .tekton/pipeline.yaml
Step 7: Run and Validate Your Pipeline
Start the Pipeline Run:
tkn pipeline start qwen-openshift-pipeline -w name=shared-workspace,volumeClaimTemplateFile=./pvc.yaml
Example pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-workspace-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
This command starts the pipeline and uses a Persistent Volume Claim (PVC) for shared storage.
Monitor the Pipeline:
You can monitor the pipeline run using:
tkn pipeline logs qwen-openshift-pipeline -f
Alternatively, use the OpenShift web console to view the pipeline run and logs.
Step 8: Deploy and Test Your Model
Create a Deployment in OpenShift:
- Navigate to the OpenShift Web Console and go to the Developer perspective.
- Add a Deployment:
- Click on +Add.
- Choose Container Image.
- Enter the image name: jozu.ml/<your-jozu-username>/<your-repo-name>:latest.
- Give the application a name, e.g., qwen-model.
- Optional: Expose a service by checking Create a route to the application.
Configure Environment Variables (If Needed):
If your application requires specific environment variables, configure them in the deployment settings.
Verify the Deployment:
Once deployed, you can:
- Check Pods: Ensure the pods are running without errors.
- Access the Application: Use the provided route to access the application.
Test the Model:
If your model exposes an API or UI:
- API Testing: Use tools like curl or Postman to send requests.
- Web Interface: Interact with the model via the web UI.
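For API testing, a one-shot curl smoke test can look like the following. The exact API depends on how you serve the GGUF model; the /v1/completions path assumes an OpenAI-compatible server (such as llama.cpp's), and the ROUTE value is a placeholder you must replace with the route OpenShift created:

```shell
#!/bin/sh
# Hedged smoke test for the deployed model. ROUTE and the
# /v1/completions path are assumptions - substitute your real route
# and whatever API your serving layer actually exposes.
ROUTE="${ROUTE:-https://qwen-model.example.invalid}"
curl -s --max-time 5 -X POST "$ROUTE/v1/completions" \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "Say hello.", "max_tokens": 16}' \
  || echo "no response from $ROUTE - set ROUTE to your real route"
```

A JSON body in the response means the model is up; the fallback message means the route is wrong or the pod is not serving yet.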
Conclusion
By following this guide, you’ve transformed your OpenShift Pipelines into an effective MLOps pipeline. You’ve learned how to:
- Automate Model Packaging: Using KitOps to package models and dependencies.
- Use Container Registries: Storing and retrieving ModelKits from Jozu Hub.
- Create CI/CD Pipelines: Automating the build and deployment process with OpenShift Pipelines.
- Deploy and Validate Models: Running your models in a production-like environment and interacting with them.
This setup enhances reproducibility, scalability, and maintainability of your machine learning workflows.
Additional Resources
- KitOps Documentation: KitOps GitHub Repository
- OpenShift Pipelines: OpenShift Pipelines Documentation
- Tekton Pipelines: Tekton Documentation
- Jozu Hub: Jozu Hub Website
- HuggingFace Models: HuggingFace Model Repository
- OpenShift CLI: OC CLI Installation
Note on Security: Always handle sensitive information like passwords and API keys securely. Use Kubernetes Secrets and avoid hardcoding credentials in scripts or configuration files.