Add remote-test option for E2E #876

Open
wants to merge 1 commit into
base: main
from

Conversation

ArangoGutierrez
Collaborator

No description provided.

@ArangoGutierrez ArangoGutierrez self-assigned this Jan 23, 2025
@ArangoGutierrez ArangoGutierrez added the testing issue/PR to fix/edit/create/enhance a project unit/e2e test label Jan 23, 2025
@ArangoGutierrez
Collaborator Author

The next step is to have this run on PRs, merge events, and tag cuts.

return localRunScript(script)
}

func localRunScript(script string) (string, error) {
Collaborator Author

I wonder if I should move everything from this line forward to the utils.go file

Member

I think we could move all the script specifics to a separate file, but not utils.go.

Member

Sorry, I should have been more clear. I think we should separate logic according to functionality / domain and name the files accordingly. Using tools.go as a catch-all does not achieve this.

set -xe
`

var dockerInstallTemplate = `
Collaborator Author

The idea here is that in the future, we will have different methods of installation to test, for example, CDI vs non-CDI.


: ${IMAGE:={{.Image}}}

sudo ln -s /var/run/nvidia-container-toolkit/toolkit/nvidia-container-runtime-hook /usr/bin/nvidia-container-runtime-hook
Member

Do we want to add a comment as to why this is required?

Collaborator Author

Sure, more documentation never hurts.

.PHONY: test
test:
E2E_IMAGE_NAME ?= ghcr.io/nvidia/container-toolkit
E2E_IMAGE_TAG ?= latest
Member

@elezar elezar Jan 23, 2025

latest is never a valid tag in this repo. Can we instead error out if this isn't set for the remote case?
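
One way to act on this on the Go side, as a minimal sketch: validate the image settings before the suite runs and fail when the tag is empty or latest. The helper name validateImage and the error wording are hypothetical and do not come from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// validateImage is a hypothetical helper (not part of the PR). It rejects an
// unset tag and the "latest" tag, which the review says is never valid here.
func validateImage(repo, tag string) (string, error) {
	if repo == "" {
		return "", errors.New("image repository must be set")
	}
	if tag == "" || tag == "latest" {
		return "", fmt.Errorf("image tag must be set explicitly, got %q", tag)
	}
	return repo + ":" + tag, nil
}

func main() {
	// With the default from the Makefile fragment, validation fails early.
	if _, err := validateImage("ghcr.io/nvidia/container-toolkit", "latest"); err != nil {
		fmt.Println("error:", err)
	}
}
```

The same check could instead live in the Makefile; the Go version is shown only because the tests already receive the repository and tag as strings.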

E2E_SSH_HOST ?=

.PHONY: local-test remote-test
# Local test assumes that the container-toolkit is already installed on the host
Member

@elezar elezar Jan 23, 2025

Do we need the distinction? Even when running locally, I may want to install the toolkit from an image that I've just built.

Comment on lines 39 to 40
imageRepo string
imageTag string
Member

Does this warrant a type (and the same for the SSH info below)?
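
A rough sketch of what such types might look like. The type names imageRef and sshConfig, and every field other than the repo and tag, are assumptions made for illustration; the PR only shows imageRepo, imageTag, and sshKey.

```go
package main

import "fmt"

// imageRef is a hypothetical type grouping the two image variables.
type imageRef struct {
	Repo string
	Tag  string
}

// String renders the reference in repo:tag form.
func (i imageRef) String() string {
	return fmt.Sprintf("%s:%s", i.Repo, i.Tag)
}

// sshConfig is a hypothetical type grouping the SSH settings.
// Only Key corresponds to a variable visible in the PR (sshKey).
type sshConfig struct {
	Key  string
	User string
	Host string
	Port string
}

// Enabled reports whether enough SSH information is present to run remotely.
func (s sshConfig) Enabled() bool {
	return s.Key != "" && s.Host != ""
}

func main() {
	img := imageRef{Repo: "ghcr.io/nvidia/container-toolkit", Tag: "v0.0.0-example"}
	fmt.Println(img, sshConfig{}.Enabled())
}
```

Grouping the values this way would also give the fmt.Sprintf("%s:%s", imageRepo, imageTag) call a single home.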

var _ = Describe("docker", Ordered, func() {
// Install the NVIDIA Container Toolkit
BeforeAll(func(ctx context.Context) {
if sshKey != "" {
Member

As I mentioned, we should probably allow the installation logic to be run locally too.
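
One possible shape for this, sketched under assumptions: hide local versus remote execution behind a small interface so the install step no longer branches on sshKey. The Runner interface, both implementations, and newRunner are hypothetical; the remote variant is a stub, since a real one would wrap an SSH session.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// Runner executes a script and returns its combined output.
// Hypothetical interface, not code from the PR.
type Runner interface {
	Run(script string) (string, error)
}

// localRunner feeds the script to a local shell on stdin.
type localRunner struct{ shell string }

func (l localRunner) Run(script string) (string, error) {
	cmd := exec.Command(l.shell)
	cmd.Stdin = bytes.NewBufferString(script)
	out, err := cmd.CombinedOutput()
	return string(out), err
}

// remoteRunner is a placeholder for an SSH-backed implementation.
type remoteRunner struct{ host string }

func (r remoteRunner) Run(script string) (string, error) {
	return "", fmt.Errorf("remote execution on %q is not implemented in this sketch", r.host)
}

// newRunner selects the runner once; callers then install through it
// regardless of where the script ends up running.
func newRunner(sshKey, host string) Runner {
	if sshKey != "" {
		return remoteRunner{host: host}
	}
	return localRunner{shell: "sh"}
}

func main() {
	out, err := newRunner("", "").Run("echo install step\n")
	fmt.Print(out)
	if err != nil {
		fmt.Println("error:", err)
	}
}
```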

// Install the NVIDIA Container Toolkit
BeforeAll(func(ctx context.Context) {
if sshKey != "" {
installScript, err := getInstallScript(dockerInstallTemplate, fmt.Sprintf("%s:%s", imageRepo, imageTag))
Member

Does it make sense to return an Installer instead, which implements Install() error?
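
A minimal sketch of that idea. Only the Installer interface with Install() error comes from the comment; the scriptInstaller type, its fields, and the stand-in run function are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Installer is the interface proposed in the review comment.
type Installer interface {
	Install() error
}

// scriptInstaller is a hypothetical implementation: it renders a script
// template for an image and hands the result to a run function.
type scriptInstaller struct {
	tmpl  string
	image string
	run   func(script string) (string, error)
}

func (s scriptInstaller) Install() error {
	t, err := template.New("install").Parse(s.tmpl)
	if err != nil {
		return fmt.Errorf("parsing template: %w", err)
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, struct{ Image string }{s.image}); err != nil {
		return fmt.Errorf("rendering template: %w", err)
	}
	if _, err := s.run(buf.String()); err != nil {
		return fmt.Errorf("running install script: %w", err)
	}
	return nil
}

func main() {
	var i Installer = scriptInstaller{
		tmpl:  "echo installing {{.Image}}\n",
		image: "example.com/toolkit:v0.0.0",
		// Stand-in runner that only prints the rendered script.
		run: func(script string) (string, error) { fmt.Print(script); return "", nil },
	}
	if err := i.Install(); err != nil {
		fmt.Println("install failed:", err)
	}
}
```

With this shape, the BeforeAll block would only need to call Install() and check the error, and rendering would stay an implementation detail.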

--restart-mode=systemd
`

type DockerInstall struct {
Member

This type isn't used.

"text/template"
)

const Shebang = `#! /usr/bin/env bash
Member

Isn't it more confusing to have to remember to add the Shebang each time? Should we not include this in the template / script?
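
As a sketch of one way to do this with text/template: define the shebang and shell options once as a named template and pull it into each script template, so callers never prepend it by hand. The names header, installTemplate, and renderScript are hypothetical, and the script body is shortened for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// header is defined once and included by every script template.
const header = `{{define "header"}}#! /usr/bin/env bash
set -xe
{{end}}`

// installTemplate is a shortened, illustrative script body.
const installTemplate = `{{template "header"}}
: ${IMAGE:={{.Image}}}
echo "installing from ${IMAGE}"
`

// renderScript parses the shared header, then the script body in the same
// template namespace, and renders the body for the given image.
func renderScript(body, image string) (string, error) {
	root, err := template.New("root").Parse(header)
	if err != nil {
		return "", err
	}
	script, err := root.New("script").Parse(body)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := script.Execute(&sb, struct{ Image string }{image}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderScript(installTemplate, "example.com/toolkit:v0.0.0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```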

docker run --pid=host --rm -i --privileged \
-v /:/host \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /var/run/nvidia-container-toolkit:/var/run/nvidia-container-toolkit \
Member

I think we should use a temporary path here since we ideally want to remove the config again after the tests.
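
For the local case, this might look like the following sketch, which creates a throwaway directory and returns a cleanup function to call after the tests. The helper names and the directory prefix are hypothetical; for a remote host the directory would have to be created on that host instead, which this sketch does not cover.

```go
package main

import (
	"fmt"
	"os"
)

// newToolkitRoot is a hypothetical helper: it creates a temporary directory
// to mount in place of a fixed host path and returns a cleanup function.
func newToolkitRoot() (string, func() error, error) {
	dir, err := os.MkdirTemp("", "nvidia-container-toolkit-e2e-")
	if err != nil {
		return "", nil, fmt.Errorf("creating temp dir: %w", err)
	}
	cleanup := func() error { return os.RemoveAll(dir) }
	return dir, cleanup, nil
}

// pathExists reports whether the path can be stat'ed.
func pathExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	dir, cleanup, err := newToolkitRoot()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer cleanup()
	// The mount argument would then reference the temporary directory.
	fmt.Printf("-v %s:%s\n", dir, dir)
}
```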

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Labels
testing issue/PR to fix/edit/create/enhance a project unit/e2e test
2 participants