-
Notifications
You must be signed in to change notification settings - Fork 26
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add role to install NVIDIA DOCA on top of an existing "fat" image (#492)
* add doca role run by fatimage * add workflow to test doca build * make packer inventory groups clearer and allow defining no extra * update packer workflows for new packer config * define builds entirely via matrix * WIP: do DOCA CI build on top of current fat image * fixup matrix for changes * fix doca workflow typo * use current fatimage for doca test build * enable fatimage to be used for volume-backed builds * bump CI image * doca workflow: clean up image and only run on relevant changes * remove commented-out code * add DOCA README * fix DOCA role actually running * tidyup DOCA play * include doca packages in image summary * fix squid being selected for any stackhopc build VM * fix nightly build concurrency * re-add squid back to Stackhpc builder group * remove debugging exit * update image build docs * update packer docs
- Loading branch information
Showing
14 changed files
with
323 additions
and
155 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,132 @@ | ||
name: Test DOCA extra build | ||
on: | ||
workflow_dispatch: | ||
push: | ||
branches: | ||
- main | ||
paths: | ||
- 'environments/.stackhpc/terraform/cluster_image.auto.tfvars.json' | ||
- 'ansible/roles/doca/**' | ||
- '.github/workflows/doca' | ||
pull_request: | ||
paths: | ||
- 'environments/.stackhpc/terraform/cluster_image.auto.tfvars.json' | ||
- 'ansible/roles/doca/**' | ||
- '.github/workflows/doca' | ||
|
||
jobs: | ||
doca: | ||
name: doca-build | ||
concurrency: | ||
group: ${{ github.workflow }}-${{ github.ref }}-${{ matrix.build.image_name }} # to branch/PR + OS | ||
cancel-in-progress: true | ||
runs-on: ubuntu-22.04 | ||
strategy: | ||
fail-fast: false # allow other matrix jobs to continue even if one fails | ||
matrix: # build RL8, RL9 | ||
build: | ||
- image_name: openhpc-doca-RL8 | ||
source_image_name_key: RL8 # key into environments/.stackhpc/terraform/cluster_image.auto.tfvars.json | ||
inventory_groups: doca | ||
- image_name: openhpc-doca-RL9 | ||
source_image_name_key: RL9 | ||
inventory_groups: doca | ||
env: | ||
ANSIBLE_FORCE_COLOR: True | ||
OS_CLOUD: openstack | ||
CI_CLOUD: ${{ vars.CI_CLOUD }} # default from repo settings | ||
ARK_PASSWORD: ${{ secrets.ARK_PASSWORD }} | ||
|
||
steps: | ||
- uses: actions/checkout@v2 | ||
|
||
- name: Load current fat images into GITHUB_ENV | ||
# see https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#example-of-a-multiline-string | ||
run: | | ||
{ | ||
echo 'FAT_IMAGES<<EOF' | ||
cat environments/.stackhpc/terraform/cluster_image.auto.tfvars.json | ||
echo EOF | ||
} >> "$GITHUB_ENV" | ||
- name: Record settings | ||
run: | | ||
echo CI_CLOUD: ${{ env.CI_CLOUD }} | ||
echo FAT_IMAGES: ${FAT_IMAGES} | ||
- name: Setup ssh | ||
run: | | ||
set -x | ||
mkdir ~/.ssh | ||
echo "${{ secrets[format('{0}_SSH_KEY', env.CI_CLOUD)] }}" > ~/.ssh/id_rsa | ||
chmod 0600 ~/.ssh/id_rsa | ||
shell: bash | ||
|
||
- name: Add bastion's ssh key to known_hosts | ||
run: cat environments/.stackhpc/bastion_fingerprints >> ~/.ssh/known_hosts | ||
shell: bash | ||
|
||
- name: Install ansible etc | ||
run: dev/setup-env.sh | ||
|
||
- name: Write clouds.yaml | ||
run: | | ||
mkdir -p ~/.config/openstack/ | ||
echo "${{ secrets[format('{0}_CLOUDS_YAML', env.CI_CLOUD)] }}" > ~/.config/openstack/clouds.yaml | ||
shell: bash | ||
|
||
- name: Setup environment | ||
run: | | ||
. venv/bin/activate | ||
. environments/.stackhpc/activate | ||
- name: Build fat image with packer | ||
id: packer_build | ||
run: | | ||
set -x | ||
. venv/bin/activate | ||
. environments/.stackhpc/activate | ||
cd packer/ | ||
packer init . | ||
PACKER_LOG=1 packer build \ | ||
-on-error=${{ vars.PACKER_ON_ERROR }} \ | ||
-var-file=$PKR_VAR_environment_root/${{ env.CI_CLOUD }}.pkrvars.hcl \ | ||
-var "source_image_name=${{ fromJSON(env.FAT_IMAGES)['cluster_image'][matrix.build.source_image_name_key] }}" \ | ||
-var "image_name=${{ matrix.build.image_name }}" \ | ||
-var "inventory_groups=${{ matrix.build.inventory_groups }}" \ | ||
openstack.pkr.hcl | ||
- name: Get created image names from manifest | ||
id: manifest | ||
run: | | ||
. venv/bin/activate | ||
IMAGE_ID=$(jq --raw-output '.builds[-1].artifact_id' packer/packer-manifest.json) | ||
while ! openstack image show -f value -c name $IMAGE_ID; do | ||
sleep 5 | ||
done | ||
IMAGE_NAME=$(openstack image show -f value -c name $IMAGE_ID) | ||
echo "image-name=${IMAGE_NAME}" >> "$GITHUB_OUTPUT" | ||
echo "image-id=$IMAGE_ID" >> "$GITHUB_OUTPUT" | ||
echo $IMAGE_ID > image-id.txt | ||
echo $IMAGE_NAME > image-name.txt | ||
- name: Make image usable for further builds | ||
run: | | ||
. venv/bin/activate | ||
openstack image unset --property signature_verified "${{ steps.manifest.outputs.image-id }}" | ||
- name: Delete image for automatically-run workflows | ||
run: | | ||
. venv/bin/activate | ||
openstack image delete "${{ steps.manifest.outputs.image-id }}" | ||
if: ${{ github.event_name != 'workflow_dispatch' }} | ||
|
||
- name: Upload manifest artifact | ||
uses: actions/upload-artifact@v4 | ||
with: | ||
name: image-details-${{ matrix.build.image_name }} | ||
path: | | ||
./image-id.txt | ||
./image-name.txt | ||
overwrite: true |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -66,3 +66,5 @@ roles/* | |
!roles/lustre/** | ||
!roles/dnf_repos/ | ||
!roles/dnf_repos/** | ||
!roles/doca/ | ||
!roles/doca/** |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.