Skip to content

dsp-developers/data-science-pipelines-operator

 
 

Repository files navigation

Data Science Pipelines Operator

The Data Science Pipelines Operator (DSPO) is an OpenShift Operator that is used to deploy single namespace scoped Data Science Pipeline stacks onto individual OCP namespaces.

Table of Contents

  1. Quickstart
    1. Pre-requisites
    2. Deploy the Operator via ODH
    3. Deploy the Operator standalone
    4. Deploy DSPA instance
      1. Deploy another DSPA instance
      2. Deploy a DSPA with custom credentials
      3. Deploy a DSPA with External Object Storage
  2. DataSciencePipelinesApplication Component Overview
  3. Using a DataSciencePipelinesApplication
    1. Using the Graphical UI
    2. Using the API
  4. Cleanup
    1. Cleanup ODH Installation
    2. Cleanup Standalone Installation
  5. Run tests
  6. Metrics

Quickstart

To get started you will first need to satisfy the following pre-requisites:

Pre-requisites

  1. An OpenShift cluster that is 4.9 or higher.
  2. You will need to be logged into this cluster as cluster admin via oc client.
  3. The OpenShift Cluster must have OpenShift Pipelines 1.8 or higher installed. We recommend channel pipelines-1.8 on OCP 4.10 and pipelines-1.9 or pipelines-1.10 for OCP 4.11, 4.12 and 4.13. Instructions here.
  4. Based on installation type you will need one of the following:
    1. For Standalone method: You will need to have Kustomize version 4.5+ installed
    2. For ODH method: The Open Data Hub operator needs to be installed. You can install it via OperatorHub.

Deploy the Operator via ODH

On a cluster with ODH installed, create a namespace where you would like to install DSPO:

DSPO_NS=data-science-pipelines-operator
oc new-project ${DSPO_NS}

Then deploy the following KfDef into the namespace created above:

cat <<EOF | oc apply -f -
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
   name: data-science-pipelines-operator
   namespace: ${DSPO_NS}
spec:
   applications:
      - kustomizeConfig:
           repoRef:
              name: manifests
              path: data-science-pipelines-operator/
        name: data-science-pipelines-operator
   repos:
      - name: manifests
        uri: "https://github.com/opendatahub-io/odh-manifests/tarball/master"
EOF

Confirm the pods are successfully deployed and reach running state:

oc get pods -n ${DSPO_NS}

Once all pods are ready, we can proceed to deploying the first Data Science Pipelines (DSP) instance. Instructions here.

Deploy the Operator standalone

First clone this repository:

WORKING_DIR=$(mktemp -d)
git clone https://github.com/opendatahub-io/data-science-pipelines-operator.git ${WORKING_DIR}

If you already have the repository cloned, set WORKING_DIR= to its absolute location accordingly.

DSPO can be installed in any namespace, we'll deploy it in the following namespace, you may update this environment variable accordingly.

DSPO_NS=data-science-pipelines-operator

# Create the namespace if it doesn't already exist
oc new-project ${DSPO_NS}

Now we will navigate to the DSPO manifests then build and deploy them to this namespace.

cd ${WORKING_DIR}
make deploy OPERATOR_NS=${DSPO_NS}

Confirm the pods are successfully deployed and reach running state:

oc get pods -n ${DSPO_NS}

Once all pods are ready, we can proceed to deploying the first Data Science Pipelines (DSP) instance. Instructions here.

Deploy DSP instance

We'll deploy the first instance in the following namespace.

# You may update this value to the namespace you would like to use.
DSP_Namespace=test-ds-project-1
oc new-project ${DSP_Namespace}

DSPO introduces a new custom resource to your cluster called DataSciencePipelinesApplication. This resource is where you configure your DSP components, your DB and Storage configurations, and so forth. For now, we'll use the sample one provided, feel free to inspect this sample resource to see other configurable options.

cd ${WORKING_DIR}/config/samples
kustomize build . | oc -n ${DSP_Namespace} apply -f -

Note: the sample CR used here deploys a minio instance so DSP may work out of the box this is unsupported in production environments and we recommend to provide your own object storage connection details via spec.objectStorage.externalStorage see ${WORKING_DIR}/config/samples/dspa_simple_external_storage.yaml for an example.

Confirm all pods reach ready state by running:

oc get pods -n ${DSP_Namespace}

For instructions on how to use this DSP instance refer to these instructions: here.

Deploy another DSP instance

You can use the DSPO to deploy multiple DataSciencePipelinesApplication instances in different OpenShift namespaces, for example earlier we deployed a DataSciencePipelinesApplication resource named sample. We can use this again and deploy it to a different namespace:

DSP_Namespace_2=test-ds-project-2
oc new-project ${DSP_Namespace_2}
cd ${WORKING_DIR}/config/samples
kustomize build . | oc -n ${DSP_Namespace_2} apply -f -

Deploy a DSP with custom credentials

Using DSPO you can specify custom credentials for Database and Object storage. If specifying external connections, this is required. You can also provide secrets for the built in MariaDB and Minio deployments. To see a sample configuration you can simply investigate and deploy the following path:

DSP_Namespace_3=test-ds-project-3
oc new-project ${DSP_Namespace_3}
cd ${WORKING_DIR}/config/samples/custom-configs
kustomize build . | oc -n ${DSP_Namespace_3} apply -f -

Notice the introduction of 2 secrets testdbsecret, teststoragesecret and 2 configmaps custom-ui-configmap and custom-artifact-script. The secrets allow you to provide your own credentials for the DB and MariaDB connections.

These can be configured by the end user as needed.

Deploy a DSP with external Object Storage

To specify a custom Object Storage (example an AWS s3 bucket) you will need to provide DSPO with your S3 credentials in the form of a k8s Secret, see an example of such a secret here config/samples/external-object-storage/storage-creds.yaml.

DSPO can deploy a DSPA instance and use this S3 bucket for storing its metadata and pipeline artifacts. A sample configuration for a DSPA that does this is found in config/samples/external-object-storage, you can update this as needed, and deploy this DSPA by running the following:

DSP_Namespace_3=test-ds-project-4
oc new-project ${DSP_Namespace_4}
cd ${WORKING_DIR}/config/samples/external-object-storage
kustomize build . | oc -n ${DSP_Namespace_3} apply -f -

DataSciencePipelinesApplication Component Overview

When a DataSciencePipelinesApplication is deployed, the following components are deployed in the target namespace:

  • APIServer
  • Persistence Agent
  • Scheduled Workflow controller

If specified in the DataSciencePipelinesApplication resource, the following components may also be additionally deployed:

  • MariaDB
  • Minio
  • MLPipelines UI
  • MLMD (ML Metadata)

To understand how these components interact with each other please refer to the upstream Kubeflow Pipelines Architectural Overview documentation.

Deploying Optional Components

MariaDB

To deploy a standalone MariaDB metadata database (rather than providing your own database connection details), simply add a mariaDB item under the spec.database in your DSPA definition with an deploy key set to true. All other fields are defaultable/optional, see All Fields DSPA Example for full details. Note that this component is mutually exclusive with externally-provided databases (defined by spec.database.externalDB).

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: sample
spec:
   ...
  database:
    mariaDB:   # mutually exclusive with externalDB
      deploy: true

Minio

To deploy a Minio Object Storage component (rather than providing your own object storage connection details), simply add a minio item under the spec.objectStorage in your DSPA definition with an image key set to a valid minio component container image. All other fields are defaultable/optional, see All Fields DSPA Example for full details. Note that this component is mutually exclusive with externally-provided object stores (defined by spec.objectStorage.externalStorage).

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: sample
spec:
   ...
  objectStorage:
    minio:  # mutually exclusive with externalStorage
      deploy: true
      # Image field is required
      image: 'quay.io/opendatahub/minio:RELEASE.2019-08-14T20-37-41Z-license-compliance'

ML Pipelines UI

To deploy the standalone DS Pipelines UI component, simply add a spec.mlpipelineUI item to your DSPA with an image key set to a valid ui component container image. All other fields are defaultable/optional, see All Fields DSPA Example for full details.

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: sample
spec:
   ...
  mlpipelineUI:
    deploy: true
    # Image field is required
    image: 'quay.io/opendatahub/odh-ml-pipelines-frontend-container:beta-ui'

ML Metadata

To deploy the ML Metadata artifact linage/metadata component, simply add a spec.mlmd item to your DSPA with deploy set to true. All other fields are defaultable/optional, see All Fields DSPA Example for full details.

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: sample
spec:
   ...
   mlmd:
      deploy: true

Using a DataSciencePipelinesApplication

When a DataSciencePipelinesApplication is deployed, use the MLPipelines UI endpoint to interact with DSP, either via a GUI or via API calls.

You can retrieve this route by running the following in your terminal:

DSP_CR_NAME=sample
DSP_Namespace=test-ds-project-1
echo https://$(oc get routes -n ${DSP_Namespace} ds-pipeline-ui-${DSP_CR_NAME} --template={{.spec.host}})

Using the Graphical UI

Note the UI presented below is the upstream Kubeflow Pipelines UI, this is not supported in DSP and will be replaced with the ODH Dashboard UI. Until then, this UI can be deployed via DSPO for experimentation/development purposes. Note however that this UI is not a supported feature of DSPO/ODH.

Navigate to the route retrieved in the last step. You will be presented with the MLPipelines UI. In this walkthrough we will upload a pipeline and start a run based off it.

To start, click the "Upload Pipeline" button.

Choose a file, you can use the flipcoin example. Download this example and select it for the first step. Then click "Create" for the second step.

Once done, you can now use this Pipeline to create a Run, do this by pressing "+ Create Run".

On this page specify the Pipeline we just uploaded if it's not already auto selected. Similarly with the version, with the example there will only be one version listed. Give this Run a name, or keep the default as is. Then click "Start".

Once you click start you will be navigated to the Runs page where you can see your previous runs that are have been executed, or currently executing. You can click the most recently started Run to view it's execution graph.

You should see something similar to the following once the Run completes.

Click the first "flip-coin" step. This step produces an output message either "heads" or "tails", confirm that you can see these logs after clicking this step and navigating to "Logs."

Using the API

Note: By default we use kfp-tekton 1.5.x for this section so you will need kfp-tekton v1.5.x sdk installed in your environment

In the previous step we submitted a generated Pipeline yaml via the GUI. We can also submit the Pipeline code directly either locally or via a notebook.

You can find the Pipeline code example here. We can submit this to the DSP API Server by including this code in the following Python script:

cd ${WORKING_DIR}/docs/example_pipelines
touch execute_pipeline.py

We can utilize the flip coin example in this location and submit it directly to the API server by doing the following:

# Add this to execute_pipeline.py we created earlier
import os
import kfp_tekton
from condition import flipcoin_pipeline
token = os.getenv("OCP_AUTH_TOKEN")
route = os.getenv("DSP_ROUTE")
client = kfp_tekton.TektonClient(host=route, existing_token=token)

client.create_run_from_pipeline_func(pipeline_func=flipcoin_pipeline, arguments={})

Note: If you are in an unsecured cluster, you may encounter CERTIFICATE_VERIFY_FAILED error, to work around this you can pass in the self-signed certs to the kfp-tekton client. For instance, if running inside a notebook or pod on the same cluster, you can do the following:

...
cert = "/run/secrets/kubernetes.io/serviceaccount/ca.crt"
client = kfp_tekton.TektonClient(host=route, existing_token=token, ssl_ca_cert=cert)
client.create_run_from_pipeline_func(pipeline_func=flipcoin_pipeline, arguments={})

Retrieve your token and DSP route:

# This is the namespace you deployed the DataSciencePipelinesApplication Custom Resource
DSP_Namespace=test-ds-project-1
# This is the metadata.name of that DataSciencePipelinesApplication Custom Resource
DSP_CR_NAME=sample
export DSP_ROUTE="https://$(oc get routes -n ${DSP_Namespace} ds-pipeline-ui-${DSP_CR_NAME} --template={{.spec.host}})"
export OCP_AUTH_TOKEN=$(oc whoami --show-token)

And finally execute this script and submit the flip coin example:

python execute_pipeline.py

You can navigate to the UI again and find your newly created run there, or you could amend the script above and list the runs via client.list_runs().

Cleanup

To remove a DataSciencePipelinesApplication from your custer, run:

# Replace environment variables accordingly
oc delete ${YOUR_DSPIPELINE_NAME} -n ${YOUR_DSPIPELINES_NAMESPACE}

The DSPO will clean up all manifests associated with each DataSciencePipelinesApplication instance.

Or you can remove all DataSciencePipelinesApplication instances in your whole cluster by running the following:

oc delete DataSciencePipelinesApplication --all -A

Depending on how you installed DSPO, follow the instructions below accordingly to remove the operator:

Cleanup ODH Installation

To uninstall DSPO via ODH run the following:

KFDEF_NAME=data-science-pipelines-operator
oc delete kfdef ${KFDEF_NAME} -n ${DSPO_NS}

Cleanup Standalone Installation

To clean up standalone DSPO deployment:

# WORKING_DIR must be the root of this repository's clone
cd ${WORKING_DIR}
make undeploy OPERATOR_NS=${DSPO_NS}
oc delete project ${DSPO_NS}

Run tests

Simply clone the directory and execute make test.

To run it without make you can run the following:

tmpFolder=$(mktemp -d)
go install sigs.k8s.io/controller-runtime/tools/setup-envtest@latest
export KUBEBUILDER_ASSETS=$(${GOPATH}/bin/setup-envtest use 1.25.0 --bin-dir ${tmpFolder}/bin -p path)
go test ./... -coverprofile cover.out

# once $KUBEBUILDER_ASSETS you can also run the full test suite successfully by running:
pre-commit run --all-files

You can find a more permanent location to install setup-envtest into on your local filesystem and export KUBEBUILDER_ASSETS into your .bashrc or equivalent. By doing this you can always run pre-commit run --all-files without having to repeat these steps.

Metrics

The Data Science Pipelines Operator exposes standard operator-sdk metrics for controller monitoring purposes.
In addition to these metrics, DSPO also exposes several custom metrics for monitoring the status of the DataSciencePipelinesApplications that it owns.

They are as follows:

  • data_science_pipelines_application_apiserver_ready - Gauge that indicates if the DSPA's APIServer is in a Ready state (1 => Ready, 0 => Not Ready)
  • data_science_pipelines_application_persistenceagent_ready - Gauge that indicates if the DSPA's PersistenceAgent is in a Ready state (1 => Ready, 0 => Not Ready)
  • data_science_pipelines_application_scheduledworkflow_ready - Gauge that indicates if the DSPA's ScheduledWorkflow manager is in a Ready state (1 => Ready, 0 => Not Ready)
  • data_science_pipelines_application_ready - Gauge that indicates if the DSPA is in a fully Ready state (1 => Ready, 0 => Not Ready)

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Go 70.2%
  • Shell 12.8%
  • Makefile 8.6%
  • Python 5.7%
  • Dockerfile 2.7%