Migrating Python model REST protocol test on triton for Kserve ( UI -> API ) #2133

Merged
merged 6 commits into from
Jan 10, 2025
61 changes: 60 additions & 1 deletion ods_ci/tests/Resources/CLI/ModelServing/llm.resource
@@ -27,6 +27,7 @@
... vllm-runtime=${LLM_RESOURCES_DIRPATH}/serving_runtimes/vllm_servingruntime_{{protocol}}.yaml
... ovms-runtime=${LLM_RESOURCES_DIRPATH}/serving_runtimes/ovms_servingruntime_{{protocol}}.yaml
... caikit-standalone-runtime=${LLM_RESOURCES_DIRPATH}/serving_runtimes/caikit_standalone_servingruntime_{{protocol}}.yaml # robocop: disable
... triton-kserve-runtime=${LLM_RESOURCES_DIRPATH}/serving_runtimes/triton_servingruntime_{{protocol}}.yaml # robocop: disable
${DOWNLOAD_PVC_FILEPATH}= ${LLM_RESOURCES_DIRPATH}/download_model_in_pvc.yaml
${DOWNLOAD_PVC_FILLED_FILEPATH}= ${LLM_RESOURCES_DIRPATH}/download_model_in_pvc_filled.yaml

@@ -140,7 +141,7 @@
[Arguments] ${isvc_name} ${model_storage_uri} ${model_format}=caikit ${serving_runtime}=caikit-tgis-runtime
... ${kserve_mode}=${NONE} ${sa_name}=${DEFAULT_BUCKET_SA_NAME} ${canaryTrafficPercent}=${EMPTY} ${min_replicas}=1
... ${scaleTarget}=1 ${scaleMetric}=concurrency ${auto_scale}=${NONE}
... ${requests_dict}=&{EMPTY} ${limits_dict}=&{EMPTY} ${overlays}=${EMPTY}
... ${requests_dict}=&{EMPTY} ${limits_dict}=&{EMPTY} ${overlays}=${EMPTY} ${version}=${EMPTY}

Check notice (Code scanning / Robocop): There is too many arguments per continuation line (4 / 1)
IF '${auto_scale}' == '${NONE}'
${scaleTarget}= Set Variable ${EMPTY}
${scaleMetric}= Set Variable ${EMPTY}
Expand All @@ -153,6 +154,7 @@
Set Test Variable ${scaleMetric}
Set Test Variable ${canaryTrafficPercent}
Set Test Variable ${model_format}
Set Test Variable ${version}

Check notice (Code scanning / Robocop): Set Test Variable can be replaced with VAR
Check warning (Code scanning / Robocop): Don't use test/task variables
Check warning (Code scanning / Robocop): Test, suite and global variables should be uppercase
Set Test Variable ${serving_runtime}
IF len($overlays) > 0
FOR ${index} ${overlay} IN ENUMERATE @{overlays}
Expand Down Expand Up @@ -414,6 +416,46 @@
END
END

Setup Test Variables # robocop: off=too-many-calls-in-keyword
Contributor:
The keyword already exists elsewhere (Setup Test Variables # robocop: off=too-many-calls-in-keyword); why not move it up and extend what is needed?

Contributor Author:
@rnetser It does not come under the scope of this PR; it will be checked and updated in upcoming PRs.

[Documentation] Sets up variables for the Suite
[Arguments] ${model_name} ${kserve_mode}=Serverless ${use_pvc}=${FALSE} ${use_gpu}=${FALSE}

... ${model_path}=${model_name}
Set Test Variable ${model_name}

${models_names}= Create List ${model_name}

Check notice (Code scanning / Robocop): Create List can be replaced with VAR
Set Test Variable ${models_names}

Set Test Variable ${model_path}

Set Test Variable ${test_namespace} ${TEST_NS}-${model_name}

IF ${use_pvc}
Set Test Variable ${storage_uri} pvc://${model_name}-claim/${model_path}

ELSE
Set Test Variable ${storage_uri} s3://${S3.BUCKET_1.NAME}/${model_path}

END
IF ${use_gpu}
${supported_gpu_type}= Convert To Lowercase ${GPU_TYPE}
Set Runtime Image ${supported_gpu_type}
IF "${supported_gpu_type}" == "nvidia"
${limits}= Create Dictionary nvidia.com/gpu=1

Check notice (Code scanning / Robocop): Create Dictionary can be replaced with VAR
ELSE IF "${supported_gpu_type}" == "amd"
${limits}= Create Dictionary amd.com/gpu=1

ELSE
FAIL msg=Provided GPU type is not yet supported. Only nvidia and amd gpu type are supported
END
Set Test Variable ${limits}

ELSE
Set Test Variable ${limits} &{EMPTY}

END
IF "${KSERVE_MODE}" == "RawDeployment" # robocop: off=inconsistent-variable-name
Set Test Variable ${use_port_forwarding} ${TRUE}

ELSE
Set Test Variable ${use_port_forwarding} ${FALSE}

END
Set Log Level NONE
Set Test Variable ${access_key_id} ${S3.AWS_ACCESS_KEY_ID}

Set Test Variable ${access_key} ${S3.AWS_SECRET_ACCESS_KEY}

Set Test Variable ${endpoint} ${MODELS_BUCKET.ENDPOINT}

Set Test Variable ${region} ${MODELS_BUCKET.REGION}

Set Log Level INFO

Compile Deploy And Query LLM model
[Documentation] Group together the test steps for preparing, deploying
... and querying a model
@@ -909,3 +951,20 @@
${rc} ${out}= Run And Return Rc And Output
... oc patch servingruntime ${runtime} -n ${namespace} --type='json' -p='[{"op": "remove", "path": "/spec/containers/0/args/1"}]'
Should Be Equal As Integers ${rc} ${0} msg=${out}


Set Runtime Image

Check warning (Code scanning / Robocop): Invalid number of empty lines between keywords (2/1)
[Documentation] Sets up runtime variables for the Suite
[Arguments] ${gpu_type}
IF "${RUNTIME_IMAGE}" == "${EMPTY}"
IF "${gpu_type}" == "nvidia"
Set Test Variable ${runtime_image} quay.io/modh/vllm@sha256:c86ff1e89c86bc9821b75d7f2bbc170b3c13e3ccf538bf543b1110f23e056316

Check warning (Code scanning / Robocop): Line is too long (142/120)
ELSE IF "${gpu_type}" == "amd"
Set Test Variable ${runtime_image} quay.io/modh/vllm@sha256:10f09eeca822ebe77e127aad7eca2571f859a5536a6023a1baffc6764bcadc6e

ELSE
FAIL msg=Provided GPU type is not yet supported. Only nvidia and amd gpu type are supported

Check warning (Code scanning / Robocop): Line is over-indented
END
ELSE
Log To Console msg= Using the image provided from terminal

Check warning (Code scanning / Robocop): Line is under-indented
END

Check warning (Code scanning / Robocop): File has too many lines (967/400)
@@ -20,6 +20,7 @@ spec:
volumeMounts: []
modelFormat:
name: ${model_format}
version: ${version}
runtime: ${serving_runtime}
storageUri: ${model_storage_uri}
volumes: []
@@ -0,0 +1,55 @@
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
name: triton-kserve-runtime
spec:
annotations:
prometheus.kserve.io/path: /metrics
prometheus.kserve.io/port: "8002"
containers:
- args:
- tritonserver
- --model-store=/mnt/models
- --grpc-port=9000
- --http-port=8080
- --allow-grpc=true
- --allow-http=true
image: nvcr.io/nvidia/tritonserver:23.05-py3
name: kserve-container
resources:
limits:
cpu: "1"
memory: 2Gi
requests:
cpu: "1"
memory: 2Gi
ports:
- containerPort: 8080
protocol: TCP
protocolVersions:
- v2
- grpc-v2
supportedModelFormats:
- autoSelect: true
name: tensorrt
version: "8"
- autoSelect: true
name: tensorflow
version: "1"
- autoSelect: true
name: tensorflow
version: "2"
- autoSelect: true
name: onnx
version: "1"
- name: pytorch
version: "1"
- autoSelect: true
name: triton
version: "2"
- autoSelect: true
name: xgboost
version: "1"
- autoSelect: true
name: python
version: "1"
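
Since this runtime serves the v2 REST protocol on port 8080, an inference call is a POST to /v2/models/<model name>/infer. The sketch below only builds the URL and JSON body for such a call. The tensor names (INPUT0, INPUT1), the 1-D FP32 layout, and the sample values are assumptions made for illustration; the contents of the input JSON file used by the test are not part of this diff.

```python
import json


def build_infer_request(base_url, model_name, inputs):
    """Return (url, body) for a v2 REST inference call.

    `inputs` maps a tensor name to a flat list of floats. Every tensor is
    sent as a 1-D FP32 tensor, which is an assumption of this sketch.
    """
    url = f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"
    payload = {
        "inputs": [
            {
                "name": name,
                "shape": [len(data)],
                "datatype": "FP32",
                "data": list(data),
            }
            for name, data in inputs.items()
        ]
    }
    return url, json.dumps(payload)


# Hypothetical values: the tensor names and data are placeholders.
url, body = build_infer_request(
    "http://localhost:8080",
    "python",
    {"INPUT0": [0.1, 0.2, 0.3, 0.4], "INPUT1": [0.5, 0.6, 0.7, 0.8]},
)
print(url)
```

The returned body can be passed to curl with -d, much as the suite does with its own input file.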
@@ -339,7 +339,7 @@
${rc} ${url}= Run And Return Rc And Output
... oc get ksvc ${model_name}-predictor -n ${project_title} -o jsonpath='{.status.url}'
Should Be Equal As Integers ${rc} 0
${curl_cmd}= Set Variable curl -s ${url}${end_point} -d ${inference_input}
${curl_cmd}= Set Variable curl -s ${url}${end_point} -d ${inference_input} --cacert openshift_ca_istio_knative.crt

Check warning (Code scanning / Robocop): Line is too long (132/120)
ELSE IF '${kserve_mode}' == 'RawDeployment'
${url}= Set Variable http://localhost:${service_port}${end_point}
${curl_cmd}= Set Variable curl -s ${url} -d ${inference_input} --cacert openshift_ca_istio_knative.crt
@@ -0,0 +1,90 @@
*** Settings ***
Documentation Suite of test cases for Triton in Kserve
Library OperatingSystem
Library ../../../../libs/Helpers.py
Resource ../../../Resources/Page/ODH/JupyterHub/HighAvailability.robot
Resource ../../../Resources/Page/ODH/ODHDashboard/ODHModelServing.resource
Resource ../../../Resources/Page/ODH/ODHDashboard/ODHDataScienceProject/Projects.resource
Resource ../../../Resources/Page/ODH/ODHDashboard/ODHDataScienceProject/DataConnections.resource
Resource ../../../Resources/Page/ODH/ODHDashboard/ODHDataScienceProject/ModelServer.resource
Resource ../../../Resources/Page/ODH/ODHDashboard/ODHDashboardSettingsRuntimes.resource
Resource ../../../Resources/Page/ODH/Monitoring/Monitoring.resource
Resource ../../../Resources/OCP.resource
Resource ../../../Resources/CLI/ModelServing/modelmesh.resource
Resource ../../../Resources/Common.robot
Resource ../../../Resources/CLI/ModelServing/llm.resource
Suite Setup Suite Setup
Suite Teardown Suite Teardown
Test Tags Kserve

Check warning (Code scanning / Robocop): Invalid number of empty lines between sections (1/2)
*** Variables ***
${PYTHON_MODEL_NAME}= python
${EXPECTED_INFERENCE_REST_OUTPUT_PYTHON}= {"model_name":"python","model_version":"1","outputs":[{"name":"OUTPUT0","datatype":"FP32","shape":[4],"data":[0.921442985534668,0.6223347187042236,0.8059385418891907,1.2578542232513428]},{"name":"OUTPUT1","datatype":"FP32","shape":[4],"data":[0.49091365933418274,-0.027157962322235107,-0.5641784071922302,0.6906309723854065]}]}

Check warning (Code scanning / Robocop): Line is too long (375/120)
${INFERENCE_REST_INPUT_PYTHON}= @tests/Resources/Files/triton/kserve-triton-python-rest-input.json
${KSERVE_MODE}= Serverless # Serverless
${PROTOCOL}= http
${TEST_NS}= tritonmodel

Check notice (Code scanning / Robocop): Variable '${TEST_NS}' is assigned but not used
${DOWNLOAD_IN_PVC}= ${FALSE}
${MODELS_BUCKET}= ${S3.BUCKET_1}

Check notice (Code scanning / Robocop): Variable '${MODELS_BUCKET}' is assigned but not used
${LLM_RESOURCES_DIRPATH}= tests/Resources/Files/llm
${INFERENCESERVICE_FILEPATH}= ${LLM_RESOURCES_DIRPATH}/serving_runtimes/base/isvc.yaml

Check notice (Code scanning / Robocop): Variable '${INFERENCESERVICE_FILEPATH}' is assigned but not used
${INFERENCESERVICE_FILEPATH_NEW}= ${LLM_RESOURCES_DIRPATH}/serving_runtimes/isvc
${INFERENCESERVICE_FILLED_FILEPATH}= ${INFERENCESERVICE_FILEPATH_NEW}/isvc_filled.yaml
${KSERVE_RUNTIME_REST_NAME}= triton-kserve-runtime


*** Test Cases ***
Test Python Model Rest Inference Via API (Triton on Kserve) # robocop: off=too-long-test-case

Check warning (Code scanning / Robocop): Test case 'Test Python Model Rest Inference Via API (Triton on Kserve)' has too many keywords inside (11/10)
[Documentation] Test the deployment of python model in Kserve using Triton
[Tags] Tier2 RHOAIENG-16912
Setup Test Variables model_name=${PYTHON_MODEL_NAME} use_pvc=${FALSE} use_gpu=${FALSE}
... kserve_mode=${KSERVE_MODE} model_path=triton/model_repository/
Set Project And Runtime runtime=${KSERVE_RUNTIME_REST_NAME} protocol=${PROTOCOL} namespace=${test_namespace}

Check warning (Code scanning / Robocop): Line is too long (123/120)
... download_in_pvc=${DOWNLOAD_IN_PVC} model_name=${PYTHON_MODEL_NAME}
... storage_size=100Mi memory_request=100Mi
${requests}= Create Dictionary memory=1Gi

Compile Inference Service YAML isvc_name=${PYTHON_MODEL_NAME}
... sa_name=models-bucket-sa
... model_storage_uri=${storage_uri}
... model_format=python serving_runtime=${KSERVE_RUNTIME_REST_NAME}
... version="1"
... limits_dict=${limits} requests_dict=${requests} kserve_mode=${KSERVE_MODE}
Deploy Model Via CLI isvc_filepath=${INFERENCESERVICE_FILLED_FILEPATH}
... namespace=${test_namespace}
# File is not needed anymore after applying
Remove File ${INFERENCESERVICE_FILLED_FILEPATH}
Wait For Pods To Be Ready label_selector=serving.kserve.io/inferenceservice=${PYTHON_MODEL_NAME}
... namespace=${test_namespace}
Comment on lines +40 to +57

Contributor:
This block is repeated in multiple tests; create a test template and reuse it.

Contributor Author:
I will discuss with @tarukumar and create a template.

${pod_name}= Get Pod Name namespace=${test_namespace}
... label_selector=serving.kserve.io/inferenceservice=${PYTHON_MODEL_NAME}
${service_port}= Extract Service Port service_name=${PYTHON_MODEL_NAME}-predictor protocol=TCP
... namespace=${test_namespace}
IF "${KSERVE_MODE}"=="RawDeployment"
Start Port-forwarding namespace=${test_namespace} pod_name=${pod_name} local_port=${service_port}
... remote_port=${service_port} process_alias=triton-process
END
Verify Model Inference With Retries model_name=${PYTHON_MODEL_NAME} inference_input=${INFERENCE_REST_INPUT_PYTHON}

... expected_inference_output=${EXPECTED_INFERENCE_REST_OUTPUT_PYTHON} project_title=${test_namespace}
... deployment_mode=Cli kserve_mode=${KSERVE_MODE} service_port=${service_port}
... end_point=/v2/models/${model_name}/infer retries=3
[Teardown] Run Keywords
... Clean Up Test Project test_ns=${test_namespace}
... isvc_names=${models_names} wait_prj_deletion=${FALSE} kserve_mode=${KSERVE_MODE}
... AND
... Run Keyword If "${KSERVE_MODE}"=="RawDeployment" Terminate Process triton-process kill=true


*** Keywords ***
Suite Setup
[Documentation] Suite setup keyword
Set Library Search Order SeleniumLibrary
Skip If Component Is Not Enabled kserve
RHOSi Setup
Load Expected Responses
Set Default Storage Class In GCP default=ssd-csi

Suite Teardown
[Documentation] Suite teardown keyword
Set Default Storage Class In GCP default=standard-csi
RHOSi Teardown
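
The test checks the REST response against ${EXPECTED_INFERENCE_REST_OUTPUT_PYTHON}, but this diff does not show how Verify Model Inference With Retries performs that comparison. As an illustration only, and not the suite's actual implementation, the sketch below compares two v2 inference responses tensor by tensor with a small floating-point tolerance. The function name and the tolerance values are hypothetical.

```python
import json
import math


def outputs_match(actual_json, expected_json, rel_tol=1e-6):
    """Compare two v2 inference responses, tolerating small float drift."""
    actual = json.loads(actual_json)
    expected = json.loads(expected_json)
    if actual.get("model_name") != expected.get("model_name"):
        return False
    # Index the actual outputs by tensor name so ordering does not matter.
    actual_by_name = {out["name"]: out for out in actual.get("outputs", [])}
    for exp in expected.get("outputs", []):
        got = actual_by_name.get(exp["name"])
        if got is None:
            return False
        if got["shape"] != exp["shape"] or got["datatype"] != exp["datatype"]:
            return False
        if len(got["data"]) != len(exp["data"]):
            return False
        for a, e in zip(got["data"], exp["data"]):
            if not math.isclose(a, e, rel_tol=rel_tol, abs_tol=1e-9):
                return False
    return True
```

A tolerant comparison like this is less sensitive to last-digit differences across hardware or runtime versions than an exact string match would be.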
