Actions: NVIDIA/JAX-Toolbox

NCCL on Kubernetes

124 workflow runs

Remove V100 from test environment
NCCL on Kubernetes #99: Pull request #1238 synchronize by DwarKapex
January 10, 2025 21:57 32s vkozlov/remove-v100
[nsys-jax] Add ratio of hidden communication time to total communication time
NCCL on Kubernetes #98: Pull request #1241 opened by sfvaroglu
January 10, 2025 18:07 12m 54s sevin/comm_time
Add JetStream to MaxText container
NCCL on Kubernetes #97: Pull request #1058 synchronize by DwarKapex
January 10, 2025 16:58 7m 27s vkozlov/jetstream-4-maxtext
Remove deprecated flag xla_gpu_enable_triton_softmax_fusion.
NCCL on Kubernetes #96: Pull request #1240 opened by sergachev
January 10, 2025 16:23 7m 14s triton_softmax_fusion_flag
nsys-jax: optimise data loading and .zip creation
NCCL on Kubernetes #94: Pull request #1193 synchronize by olupton
January 10, 2025 11:09 8m 14s olupton/nsys-jax-python-opt
Allow nsys 2024.7 installation.
NCCL on Kubernetes #93: Pull request #1176 synchronize by olupton
January 10, 2025 11:04 7m 18s olupton/nsys-2024.7
Allow nsys 2024.7 installation.
NCCL on Kubernetes #92: Pull request #1176 synchronize by olupton
January 10, 2025 10:56 7m 19s olupton/nsys-2024.7
NCCL on Kubernetes
NCCL on Kubernetes #91: Scheduled
January 10, 2025 08:36 7m 3s main
Remove V100 from test environment
NCCL on Kubernetes #88: Pull request #1238 opened by DwarKapex
January 9, 2025 22:50 11m 52s vkozlov/remove-v100
Add JetStream to MaxText container
NCCL on Kubernetes #87: Pull request #1058 synchronize by DwarKapex
January 9, 2025 21:44 7m 12s vkozlov/jetstream-4-maxtext
Add JetStream to MaxText container
NCCL on Kubernetes #86: Pull request #1058 synchronize by DwarKapex
January 9, 2025 19:27 8m 46s vkozlov/jetstream-4-maxtext
CI: run MaxText tests on AWS with NGC release candidate images
NCCL on Kubernetes #85: Pull request #1237 opened by olupton
January 9, 2025 15:45 36m 36s olupton/eks-maxtext-25.01
Only set CUDA_DEVICE_MAX_CONNECTIONS=1 for Hopper/cc9.0 runs
NCCL on Kubernetes #84: Pull request #1236 synchronize by olupton
January 9, 2025 15:43 32m 15s olupton/25.01/max-connections
CI: run NCCL tests on AWS with NGC release candidate images
NCCL on Kubernetes #83: Pull request #1234 synchronize by olupton
January 9, 2025 15:15 52m 56s olupton/reusable-k8s-nccl-25.01
ruff: format with 0.9.0
NCCL on Kubernetes #81: Pull request #1235 opened by olupton
January 9, 2025 14:14 7m 52s olupton/ruff-0.9.0
Move nccl-tests-on-k8s logic into a reusable workflow
NCCL on Kubernetes #80: Pull request #1233 synchronize by olupton
January 9, 2025 14:06 10m 34s olupton/reusable-k8s-nccl
Move nccl-tests-on-k8s logic into a reusable workflow
NCCL on Kubernetes #79: Pull request #1233 ready_for_review by olupton
January 9, 2025 14:05 1m 0s olupton/reusable-k8s-nccl
Move nccl-tests-on-k8s logic into a reusable workflow
NCCL on Kubernetes #77: Pull request #1233 synchronize by olupton
January 9, 2025 11:17 8m 40s olupton/reusable-k8s-nccl
Move nccl-tests-on-k8s logic into a reusable workflow
NCCL on Kubernetes #76: Pull request #1233 synchronize by olupton
January 9, 2025 11:17 3s olupton/reusable-k8s-nccl