How to Turn OpenShift Pipelines into an MLOps Pipeline with KitOps

Turning your OpenShift Pipelines into an MLOps pipeline involves adapting your existing CI/CD workflows to accommodate the unique requirements of machine learning (ML) model development, deployment, and monitoring. MLOps extends DevOps practices to include the full ML lifecycle, ensuring that models are reproducible, scalable, and maintainable.


Table of Contents

  1. Introduction
  2. Prerequisites
  3. Step-by-Step Guide
  4. Conclusion
  5. Additional Resources

Introduction

Machine Learning Operations (MLOps) is the practice of applying DevOps principles to machine learning workflows. It addresses common challenges such as:

  • Automation: Reducing manual efforts in data preprocessing, model training, and deployment.
  • Version Control: Tracking changes in models, code, and datasets.
  • Reproducibility: Ensuring consistent results across different environments.
  • Monitoring: Continuously monitoring model performance in production.

OpenShift Pipelines is a Kubernetes-native CI/CD framework based on Tekton. It allows you to build, test, and deploy across cloud providers or on-premise systems.

KitOps is a tool that simplifies the packaging and deployment of AI models and their dependencies into portable units called ModelKits.

By combining these tools, you can create a robust MLOps pipeline that automates your machine learning workflows.

Prerequisites

Before you begin, ensure you have the following:

  • OpenShift Account: Access to OpenShift with administrative privileges. You can use the OpenShift Developer Sandbox.
  • KitOps: Installed locally (installation is covered in Step 1 below).
  • Container Registry: Such as Jozu Hub, Docker Hub, or GitHub Container Registry.
  • GitHub Account: For hosting your code repository.
  • HuggingFace Account: To access pre-trained models.
  • Git CLI: Installed on your local machine.
  • Basic Knowledge: Familiarity with Kubernetes, Docker, and YAML files.
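You can confirm the CLI prerequisites quickly before starting. This assumes the `git`, `oc`, and `tkn` clients are already on your PATH; install the latter two from the OpenShift console's command-line tools page if not:

```shell
# Each command prints its client version if the tool is installed
git --version
oc version --client
tkn version
```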

Step-by-Step Guide

Step 1: Install KitOps

First, install KitOps on your local machine.

For Linux:

wget https://github.com/jozu-ai/kitops/releases/latest/download/kitops-linux-x86_64.tar.gz
tar -xzvf kitops-linux-x86_64.tar.gz
sudo mv kit /usr/local/bin/

Verify the Installation:

kit version

Step 2: Set Up a Container Registry

You’ll need a container registry to store your packaged models.

Using Jozu Hub:

  1. Create an Account: Sign up at Jozu Hub.
  2. Create a Repository: Name it appropriately, e.g., my-ml-model.
  3. Authenticate: Log in from your terminal.
kit login jozu.ml

You’ll be prompted for your username (email) and password.

Step 3: Download a Model from HuggingFace

Choose a pre-trained model from HuggingFace. For this guide, we’ll use the Qwen2-0.5B-Instruct-GGUF model.

Download the Model and Related Files:

mkdir my-ml-project && cd my-ml-project

# Download the model file
wget https://huggingface.co/Qwen/Qwen2-0.5B-Instruct-GGUF/resolve/main/qwen2-0_5b-instruct-q2_k.gguf

# Download the LICENSE and README
wget https://huggingface.co/Qwen/Qwen2-0.5B-Instruct-GGUF/resolve/main/LICENSE
wget https://huggingface.co/Qwen/Qwen2-0.5B-Instruct-GGUF/resolve/main/README.md

Step 4: Pack Your ModelKit with KitOps

Organize your files and create a Kitfile to define your ModelKit.

Organize Files:

mkdir models docs
mv qwen2-0_5b-instruct-q2_k.gguf models/
mv LICENSE README.md docs/

Create a Kitfile:

manifestVersion: 1.0.0
package:
  name: qwen2-0.5B
  version: 2.0.0
  description: The instruction-tuned 0.5B Qwen2 large language model.
  authors: [Your Name]
model:
  name: qwen2-0_5b-instruct-q2_k
  path: models/qwen2-0_5b-instruct-q2_k.gguf
  description: The model downloaded from HuggingFace.
docs:
  - path: docs/LICENSE
    description: License file.
  - path: docs/README.md
    description: Readme file.

Pack the ModelKit:

kit pack . -t jozu.ml/<your-jozu-username>/<your-repo-name>:latest

Replace <your-jozu-username> and <your-repo-name> with your actual Jozu Hub username and repository name.
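If packing succeeds, the ModelKit is stored locally and can be listed with the KitOps CLI's `kit list` subcommand:

```shell
# Shows locally stored ModelKits, including the tag you just packed
kit list
```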

Step 5: Push Your ModelKit to the Container Registry

Push the packaged ModelKit to your container registry.

kit push jozu.ml/<your-jozu-username>/<your-repo-name>:latest

Verify the Push:

Log in to Jozu Hub and confirm that your ModelKit appears in your repository.
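From the command line, pulling the ModelKit back down is an additional round-trip check (`kit pull` is part of the core KitOps CLI):

```shell
# Retrieves the ModelKit from the registry into local storage
kit pull jozu.ml/<your-jozu-username>/<your-repo-name>:latest
```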

Step 6: Create an OpenShift Pipeline

We’ll create a Tekton pipeline in OpenShift to automate the deployment process.

Set Up Your GitHub Repository:

1. Initialize Git:

git init
git add .
git commit -m "Initial commit"

2. Create a GitHub Repository and push your code.

git remote add origin https://github.com/<your-github-username>/<your-repo-name>.git
git branch -M main
git push -u origin main

3. Create Tekton Pipeline Definition:

Create a directory named .tekton and add a file called pipeline.yaml.

mkdir .tekton
touch .tekton/pipeline.yaml

4. Contents of pipeline.yaml:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: qwen-openshift-pipeline
spec:
  workspaces:
    - name: shared-workspace
  tasks:
    - name: git-clone
      taskRef:
        name: git-clone
      params:
        - name: url
          value: 'https://github.com/<your-github-username>/<your-repo-name>.git'
        - name: revision
          value: 'main'
      workspaces:
        - name: output
          workspace: shared-workspace

    - name: kit-pack-and-push
      runAfter:
        - git-clone
      taskSpec:
        steps:
          - name: pack-and-push
            image: docker.io/library/golang:1.17
            script: |
              #!/bin/bash
              set -e
              # Install the Kit CLI in this step's container
              wget -q https://github.com/jozu-ai/kitops/releases/latest/download/kitops-linux-x86_64.tar.gz
              tar -xzvf kitops-linux-x86_64.tar.gz
              mv kit /usr/local/bin/
              kit version
              # Authenticate, then pack and push the ModelKit from the cloned repo
              kit login jozu.ml -u '<your-jozu-username>' -p '<your-jozu-password>'
              cd $(workspaces.shared-workspace.path)
              kit pack . -t jozu.ml/<your-jozu-username>/<your-repo-name>:latest
              kit push jozu.ml/<your-jozu-username>/<your-repo-name>:latest
      workspaces:
        - name: shared-workspace
          workspace: shared-workspace

Note:

  • Replace placeholders like <your-jozu-username> and <your-jozu-password> with your actual values.
  • Each Tekton task runs in its own pod, so a tool installed or a login performed in one task does not carry over to later tasks. Make sure any task that calls kit has the CLI installed and valid credentials available within that same task.
  • In a production environment, avoid hardcoding passwords. Use Kubernetes Secrets to manage sensitive information.
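As a sketch of the Secrets approach, a generic secret can be created with the OpenShift CLI (the secret and key names here are illustrative, not required by KitOps):

```shell
# Stores the registry credentials in the current project's namespace
oc create secret generic jozu-credentials \
  --from-literal=username='<your-jozu-username>' \
  --from-literal=password='<your-jozu-password>'
```

The task step can then reference these keys as environment variables via `env` entries with `valueFrom.secretKeyRef`, instead of embedding the values in the script.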

Apply the Pipeline to OpenShift:

Log in to OpenShift CLI:

oc login

Apply the Pipeline:

oc apply -f .tekton/pipeline.yaml

Step 7: Run and Validate Your Pipeline

Start the Pipeline Run:

tkn pipeline start qwen-openshift-pipeline -w name=shared-workspace,volumeClaimTemplateFile=./pvc.yaml

Example pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-workspace-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

This command starts the pipeline and uses a Persistent Volume Claim (PVC) for shared storage.

Monitor the Pipeline:

You can monitor the pipeline run using:

tkn pipeline logs qwen-openshift-pipeline -f

Alternatively, use the OpenShift web console to view the pipeline run and logs.
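The `tkn` client can also list recent runs and follow the most recent one, which is handy when you re-run the pipeline repeatedly:

```shell
# Shows recent runs with their status and duration
tkn pipelinerun list
# Streams the logs of the most recent run
tkn pipelinerun logs --last -f
```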

Step 8: Deploy and Test Your Model

Create a Deployment in OpenShift:

  1. Navigate to the OpenShift Web Console and go to the Developer perspective.
  2. Add a Deployment:
    • Click on +Add.
    • Choose Container Image.
    • Enter the image name: jozu.ml/<your-jozu-username>/<your-repo-name>:latest.
    • Give the application a name, e.g., qwen-model.
    • Optional: Expose a service by checking Create a route to the application.

Configure Environment Variables (If Needed):

If your application requires specific environment variables, configure them in the deployment settings.

Verify the Deployment:

Once deployed, you can:

  • Check Pods: Ensure the pods are running without errors.
  • Access the Application: Use the provided route to access the application.

Test the Model:

If your model exposes an API or UI:

  • API Testing: Use tools like curl or Postman to send requests.
  • Web Interface: Interact with the model via the web UI.
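For example, if the deployed container serves the GGUF model behind an OpenAI-compatible endpoint, as llama.cpp-based servers commonly do (this is an assumption about your serving layer, and the route hostname is a placeholder):

```shell
# Sends a single chat request to the model's route
curl -s https://<your-route-host>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen2-0_5b-instruct-q2_k",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}]
      }'
```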

Conclusion

By following this guide, you’ve transformed your OpenShift Pipelines into an effective MLOps pipeline. You’ve learned how to:

  • Automate Model Packaging: Using KitOps to package models and dependencies.
  • Use Container Registries: Storing and retrieving ModelKits from Jozu Hub.
  • Create CI/CD Pipelines: Automating the build and deployment process with OpenShift Pipelines.
  • Deploy and Validate Models: Running your models in a production-like environment and interacting with them.

This setup enhances reproducibility, scalability, and maintainability of your machine learning workflows.

Additional Resources

Note on Security: Always handle sensitive information like passwords and API keys securely. Use Kubernetes Secrets and avoid hardcoding credentials in scripts or configuration files.
