Convert ci-shark-ai.yml to use pkgci_shark_ai.yml so that we only build packages once #780

Merged on Jan 10, 2025 (29 commits)
Changes from 25 commits
Commits
6713ecb
initial commit using scotts code
renxida Jan 8, 2025
015b6ef
give build_packages call write permission
renxida Jan 8, 2025
9d4f2b6
make artifact versions match
renxida Jan 8, 2025
afa818f
use bash to do version string substitution instead
renxida Jan 8, 2025
6b52eff
move setup_venv.py to proper location
renxida Jan 8, 2025
bfaf0b8
run on default runners instead
renxida Jan 8, 2025
efb182e
enable exec on setup_venv.py
renxida Jan 8, 2025
5dde682
remove concurrency settings from callee workflow
renxida Jan 8, 2025
63ef830
add back shark-ai package build
renxida Jan 8, 2025
d7f5f4c
remove accidentally added-back concurrency settings
renxida Jan 8, 2025
f651dad
match cpu pytorch requirements.txt to iree-turbine's
renxida Jan 8, 2025
a1f10d6
match rocm pytorch requirements.txt to iree-turbine's
renxida Jan 8, 2025
6254c9f
remove --pre from pinned iree reqs
renxida Jan 8, 2025
cf81dfa
run on mi250 again
renxida Jan 8, 2025
f7b99f6
remove old shark-ai ci file
renxida Jan 8, 2025
35eb2d2
missed newline before EOF
renxida Jan 8, 2025
e4f0e48
remove hardcoded py3.11
renxida Jan 9, 2025
a970f52
ci job name: nightly -> pinned to match filename change
renxida Jan 9, 2025
1810850
Merge branch 'main' into sharkai-pkgci1
renxida Jan 9, 2025
6a336a3
pytorch-rocm-requirements match iree-turbine but no torch audio and v…
renxida Jan 9, 2025
92aa1a5
cache uv in same filesystem as workspace
renxida Jan 9, 2025
b1fa99a
test on regular ubuntu runners for a bit
renxida Jan 9, 2025
fe243d2
Merge branch 'main' into sharkai-pkgci1
renxida Jan 9, 2025
2d13751
back on mi250
renxida Jan 9, 2025
80ba233
comments
renxida Jan 9, 2025
9e5fbe0
Pin caching to commit used in rest of repo
renxida Jan 9, 2025
1397879
add some files to hash with uv caching
renxida Jan 10, 2025
0820813
clean up the clean up step
renxida Jan 10, 2025
4282730
try running on mi300x-4
renxida Jan 10, 2025
66 changes: 0 additions & 66 deletions .github/workflows/ci-shark-ai.yml

This file was deleted.

39 changes: 39 additions & 0 deletions .github/workflows/pkgci.yml
@@ -0,0 +1,39 @@
# Copyright 2024 Advanced Micro Devices, Inc.
#
# Licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

name: PkgCI

on:
  workflow_dispatch:
  pull_request:
  push:
    branches:
      - main

permissions:
  contents: read

concurrency:
  # A PR number if a pull request and otherwise the commit hash. This cancels
  # queued and in-progress runs for the same PR (presubmit) or commit
  # (postsubmit). The workflow name is prepended to avoid conflicts between
  # different workflows.
  group: ${{ github.workflow }}-${{ github.event.number || github.sha }}
  cancel-in-progress: true

jobs:
  build_packages:
    name: Build Packages
    uses: ./.github/workflows/build_packages.yml
    permissions:
      contents: write
    with:
      build_type: "dev"

  test_shark_ai:
    name: Test shark-ai
    needs: [build_packages]
    uses: ./.github/workflows/pkgci_shark_ai.yml
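
Note on the concurrency group in pkgci.yml: the group key is the workflow name followed by either the PR number or, when there is none, the commit hash. A rough bash sketch of that fallback is below. The function name `concurrency_group` and the sample hash `abc1234` are made up for illustration and are not part of the PR; the bash `:-` expansion only approximates the `||` operator in GitHub Actions expressions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): approximates how
#   ${{ github.workflow }}-${{ github.event.number || github.sha }}
# resolves for presubmit and postsubmit runs.
concurrency_group() {
  local workflow="$1" pr_number="$2" sha="$3"
  # ${pr_number:-$sha} falls back to the commit hash when the PR number
  # is unset or empty, which is the case for push events.
  printf '%s-%s\n' "$workflow" "${pr_number:-$sha}"
}

concurrency_group "PkgCI" "780" "abc1234"   # pull_request: keyed by PR number
concurrency_group "PkgCI" ""    "abc1234"   # push to main: keyed by commit hash
```

Under this reading, two pushes to the same PR share a group, so the older run is cancelled, while two different commits on main land in different groups.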
99 changes: 99 additions & 0 deletions .github/workflows/pkgci_shark_ai.yml
@@ -0,0 +1,99 @@
# Copyright 2024 Advanced Micro Devices, Inc.
#
# Licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

name: PkgCI - shark-ai

on:
  workflow_call:
    inputs:
      artifact_run_id:
        type: string
        default: ""
  workflow_dispatch:
    inputs:
      artifact_run_id:
        type: string
        description: "Id for a workflow run that produced dev packages"
        default: ""

jobs:
  test_shortfin_llm_server:
    name: "Integration Tests - Shortfin LLM Server"
    strategy:
      matrix:
        version: [3.11]
      fail-fast: false
    runs-on: nodai-amdgpu-mi250-x86-64
    # runs-on: ubuntu-latest # everything else works but this throws an "out of resources" during model loading
    # TODO: make a copy of this that runs on standard runners with tiny llama instead of a 8b model
    defaults:
      run:
        shell: bash
    env:
      PACKAGE_DOWNLOAD_DIR: ${{ github.workspace }}/.packages
      VENV_DIR: ${{ github.workspace }}/.venv
    steps:
      - name: "Checkout Code"
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
      - name: "Setting up Python"
        id: setup_python
        uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
        with:
          python-version: ${{matrix.version}}

      - name: Set Python version without dot
        run: |
          echo "PY_VERSION_NO_DOT=$(echo ${{ matrix.version }} | tr -d '.')" >> $GITHUB_ENV

      - name: Setup UV caching
        run: |
          CACHE_DIR="${GITHUB_WORKSPACE}/.uv-cache"
          echo "UV_CACHE_DIR=${CACHE_DIR}" >> $GITHUB_ENV
          mkdir -p "${CACHE_DIR}"

      - name: Cache UV packages
        uses: actions/cache@v3
        with:
          path: .uv-cache
          key: ${{ runner.os }}-uv-py${{ matrix.version }}-${{ hashFiles('requirements-iree-pinned.txt') }}

      - name: Download sharktank artifacts
        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
        with:
          name: snapshot-sharktank-linux-x86_64-cp${{ env.PY_VERSION_NO_DOT }}-cp${{ env.PY_VERSION_NO_DOT }}
          path: ${{ env.PACKAGE_DOWNLOAD_DIR }}

      - name: Download shortfin artifacts
        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
        with:
          name: snapshot-shortfin-linux-x86_64-cp${{ env.PY_VERSION_NO_DOT }}-cp${{ env.PY_VERSION_NO_DOT }}
          path: ${{ env.PACKAGE_DOWNLOAD_DIR }}

      - name: Download shark-ai artifacts
        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
        with:
          name: snapshot-shark-ai-linux-x86_64-cp${{ env.PY_VERSION_NO_DOT }}-cp${{ env.PY_VERSION_NO_DOT }}
          path: ${{ env.PACKAGE_DOWNLOAD_DIR }}

      - name: Setup venv
        run: |
          ./build_tools/pkgci/setup_venv.py ${VENV_DIR} \
            --artifact-path=${PACKAGE_DOWNLOAD_DIR} \
            --fetch-gh-workflow=${{ inputs.artifact_run_id }}

      - name: Install pinned IREE packages
        run: |
          source ${VENV_DIR}/bin/activate
          uv pip install -r requirements-iree-pinned.txt

      - name: Run LLM Integration Tests
        run: |
          source ${VENV_DIR}/bin/activate
          pytest -v -s app_tests/integration_tests/llm/shortfin --log-cli-level=INFO

      - name: Clean up repo to make next checkout faster
        run: |
          git clean -ffdx

Contributor Author

Gotcha. Going to remove this and collect this into a new issue and ask Sai for help.
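
Side note on the artifact names in pkgci_shark_ai.yml: the three download steps rely on the "Set Python version without dot" step so that the requested name matches what the package build uploaded (commit 9d4f2b6, "make artifact versions match"). A small bash sketch of that naming scheme follows. The helper names `py_version_no_dot` and `artifact_name` are hypothetical and only mirror the `tr -d '.'` substitution and the `snapshot-...-cpXY-cpXY` pattern visible in the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the PR) mirroring the workflow's
# version-string substitution and artifact naming.

# Strip the dot from a Python version, as `tr -d '.'` does in the workflow.
py_version_no_dot() {
  printf '%s' "$1" | tr -d '.'
}

# Build the artifact name the download steps ask for.
artifact_name() {
  local package="$1" tag
  tag="cp$(py_version_no_dot "$2")"
  printf 'snapshot-%s-linux-x86_64-%s-%s\n' "$package" "$tag" "$tag"
}

# The three packages downloaded by the test job, for the 3.11 matrix entry.
for package in sharktank shortfin shark-ai; do
  artifact_name "$package" "3.11"
done
```

If the matrix ever gains more Python versions, each entry would need a matching artifact from the build workflow; that is an inference from the naming pattern, not something the PR states.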