Move nccl-tests-on-k8s logic into a reusable workflow #76

Workflow file for this run

name: NCCL on Kubernetes
on:
  schedule:
    - cron: '30 8 * * *'
  pull_request:
    types:
      - opened
      - reopened
      - ready_for_review
      - synchronize
    paths-ignore:
      - '**.md'
  workflow_dispatch:
    inputs:
      # Note that cuda-dl-base installs the NCCL tests, while the vanilla nvidia/cuda
      # images do not; when JAX-Toolbox moves to using cuda-dl-base this workflow ought
      # to be modified to test one of the JAX-Toolbox containers.
      CONTAINER:
        type: string

Check failure on line 19 in .github/workflows/nccl-k8s.yaml

GitHub Actions / NCCL on Kubernetes

Invalid workflow file

The workflow is not valid. NVIDIA/JAX-Toolbox/.github/workflows/_test_nccl.yaml@fb699a07f0c1c5e1c064b24aeaa0d3ca1fb53f68 (Line: 19, Col: 3): Error calling workflow 'NVIDIA/JAX-Toolbox/.github/workflows/_build.yaml@fb699a07f0c1c5e1c064b24aeaa0d3ca1fb53f68'. The workflow is requesting 'actions: write', but is only allowed 'actions: none'.
        description: Container to test, this is assumed to already contain the NCCL tests e.g. cuda-dl-base or derived
        default: ''
        required: false
concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
permissions:
  actions: write # to cancel previous workflows
  contents: read # to fetch code
  packages: write # to upload container
jobs:
  nccl-tests:
    uses: ./.github/workflows/_test_nccl.yaml
    with:
      CONTAINER: ${{ inputs.CONTAINER || 'nvcr.io/nvidia/cuda-dl-base:24.12-cuda12.6-devel-ubuntu24.04' }}
    secrets: inherit
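
A note on the check failure reported above: in a chain of nested reusable workflows, GITHUB_TOKEN permissions can be kept or reduced at each level but never elevated, and any scope omitted from an explicit permissions block is set to none. The workflow shown here does grant actions: write, so one plausible cause is that _test_nccl.yaml declares its own permissions block without an actions entry, which would leave its call to _build.yaml with actions: none. The excerpt below is only a sketch under that assumption. The contents of _test_nccl.yaml and _build.yaml are not shown on this page, so the job name, the input declaration, and the omitted with: inputs are hypothetical.

```yaml
# Hypothetical excerpt of .github/workflows/_test_nccl.yaml (the real file is not shown here).
on:
  workflow_call:
    inputs:
      CONTAINER:
        type: string
        required: true

# Assumption: list every scope that _build.yaml requests. A scope left out of
# this block is passed down as 'none', which matches the reported error.
permissions:
  actions: write   # requested by _build.yaml according to the error message
  contents: read
  packages: write

jobs:
  build:           # hypothetical job name
    uses: ./.github/workflows/_build.yaml
    # 'with:' inputs for _build.yaml are omitted because they are not known here.
    secrets: inherit
```

If _test_nccl.yaml instead sets permissions on the individual job that calls _build.yaml, the same rule applies at job level: the actions scope has to appear there as well.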