Allow Nvidia driver script to set recommendations for LD_PRELOAD
#754
base: 2023.06-software.eessi.io
Example output:
[rocky@ip-172-31-27-81 software-layer]$ ./scripts/gpu_support/nvidia/link_nvidia_host_libraries.sh --ld-preload --no-download
Found NVIDIA GPU driver version 545.23.08
Found host CUDA version 12.3
Using default list of libraries
Matched 48 CUDA Libraries
When attempting to use LD_PRELOAD we exclude anything related to graphics
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libGL.so.1.
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libGL.so.
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libGLX_nvidia.so.0.
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libGLX.so.0.
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libGLX.so.
libwayland-server.so.0 is NOT in the provided preload list, filtering /lib64/libnvidia-egl-wayland.so.1.
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libnvidia-fbc.so.1.
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libnvidia-fbc.so.
libXNVCtrl.so.0 is NOT in the provided preload list, filtering /lib64/libnvidia-gtk3.so.545.23.08.
The recommended way to use LD_PRELOAD is to only use it when you need to:
export EESSI_GPU_LD_PRELOAD="/lib64/libcuda.so.1:/lib64/libcuda.so:/lib64/libcudadebugger.so.1:/lib64/libnvcuvid.so.1:/lib64/libnvcuvid.so:/lib64/libnvidia-cfg.so.1:/lib64/libnvidia-cfg.so:/lib64/libnvidia-eglcore.so.545.23.08:/lib64/libnvidia-encode.so.1:/lib64/libnvidia-encode.so:/lib64/libnvidia-glcore.so.545.23.08:/lib64/libnvidia-glsi.so.545.23.08:/lib64/libnvidia-glvkspirv.so.545.23.08:/lib64/libnvidia-gpucomp.so.545.23.08:/lib64/libnvidia-ml.so.1:/lib64/libnvidia-ml.so:/lib64/libnvidia-nvvm.so.4:/lib64/libnvidia-nvvm.so:/lib64/libnvidia-opencl.so.1:/lib64/libnvidia-opticalflow.so.1:/lib64/libnvidia-ptxjitcompiler.so.1:/lib64/libnvidia-ptxjitcompiler.so:/lib64/libnvidia-rtcore.so.545.23.08:/lib64/libnvidia-tls.so.545.23.08:/lib64/libnvoptix.so.1:/lib64/libOpenCL.so.1"
export EESSI_OVERRIDE_GPU_CHECK="1"
Then you can set LD_PRELOAD only when you want to run a GPU application, e.g.,
LD_PRELOAD="$EESSI_GPU_LD_PRELOAD" device_query
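The "is NOT in the provided preload list, filtering ..." messages above suggest that a library is dropped when one of its dependencies is neither available nor part of the preload list. Below is a minimal sketch of that idea, not the script's actual implementation. The names `needed_sonames` and `filter_preload_candidates` are hypothetical, and reading dependencies with `readelf -d` is an assumption about how they could be obtained.

```shell
# Hypothetical helper: print the sonames a library needs, one per line.
# Assumes binutils' readelf is installed; the real script may do this differently.
needed_sonames() {
    readelf -d "$1" 2>/dev/null | sed -n 's/.*(NEEDED).*\[\(.*\)\]/\1/p'
}

# filter_preload_candidates AVAILABLE PRELOAD LIB...
#   AVAILABLE: space-separated sonames that can be resolved without preloading
#   PRELOAD:   space-separated sonames that are in the preload list
# Prints the libraries whose dependencies are all covered by one of the two lists.
filter_preload_candidates() {
    local available=" $1 " preload=" $2 "
    shift 2
    local library dep keep
    for library in "$@"; do
        keep=1
        while IFS= read -r dep; do
            [ -n "$dep" ] || continue
            case "$available$preload" in
                *" $dep "*) ;;  # dependency is covered, check the next one
                *)
                    echo "$dep is NOT in the provided preload list, filtering $library" >&2
                    keep=0
                    break
                    ;;
            esac
        done < <(needed_sonames "$library")
        if [ "$keep" -eq 1 ]; then
            printf '%s\n' "$library"
        fi
    done
}
```

Kept libraries go to stdout and the filtering messages go to stderr, so the result can be captured into an array without the log lines mixed in.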
@ocaisa There are duplicate entries here,
# Filter out all symlinks and libraries that have missing library dependencies under EESSI
filtered_libraries=()
for library in "${matched_libraries[@]}"; do
    if [ ! -L "$library" ]; then
This is too aggressive; instead, we should just resolve the symlinks and remove duplicate entries.
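A minimal sketch of that suggestion, assuming bash 4+ (for associative arrays) and a `readlink` that supports `-f`. The function name `resolve_and_dedupe` is hypothetical and is not taken from the script.

```shell
# Resolve every library to the real file behind it and drop duplicates,
# keeping the order in which each resolved path was first seen.
resolve_and_dedupe() {
    local -A seen=()
    local library resolved
    for library in "$@"; do
        # readlink -f follows the whole symlink chain to the final target.
        resolved=$(readlink -f -- "$library") || continue
        # Skip dangling links and paths that do not exist.
        [ -e "$resolved" ] || continue
        if [ -z "${seen[$resolved]:-}" ]; then
            seen[$resolved]=1
            printf '%s\n' "$resolved"
        fi
    done
}
```

It could be used in place of the `-L` test, for example `mapfile -t filtered_libraries < <(resolve_and_dedupe "${matched_libraries[@]}")`, so that `libcuda.so` and `libcuda.so.1` collapse into one entry for the versioned file.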
This results in about 400 MB of preload:
{EESSI 2023.06} [rocky@ip-172-31-20-85 software-layer]$ IFS=':'; for path in $EESSI_GPU_LD_PRELOAD; do ls -lh $path; done; unset IFS
-rwxr-xr-x 1 root root 29M Nov 6 2023 /usr/lib64/libcuda.so.545.23.08
-rwxr-xr-x 1 root root 11M Nov 6 2023 /usr/lib64/libcudadebugger.so.545.23.08
-rwxr-xr-x 1 root root 9.6M Nov 6 2023 /usr/lib64/libnvcuvid.so.545.23.08
-rwxr-xr-x 1 root root 269K Nov 6 2023 /usr/lib64/libnvidia-cfg.so.545.23.08
-rwxr-xr-x 1 root root 566K Nov 6 2023 /usr/lib64/libnvidia-glsi.so.545.23.08
-rwxr-xr-x 1 root root 8.7M Nov 6 2023 /usr/lib64/libnvidia-glvkspirv.so.545.23.08
-rwxr-xr-x 1 root root 42M Nov 7 2023 /usr/lib64/libnvidia-gpucomp.so.545.23.08
-rwxr-xr-x 1 root root 1.9M Nov 6 2023 /usr/lib64/libnvidia-ml.so.545.23.08
-rwxr-xr-x 1 root root 83M Nov 7 2023 /usr/lib64/libnvidia-nvvm.so.545.23.08
-rwxr-xr-x 1 root root 24M Nov 6 2023 /usr/lib64/libnvidia-opencl.so.545.23.08
-rwxr-xr-x 1 root root 26M Nov 6 2023 /usr/lib64/libnvidia-ptxjitcompiler.so.545.23.08
-rwxr-xr-x 1 root root 103M Nov 7 2023 /usr/lib64/libnvidia-rtcore.so.545.23.08
-rwxr-xr-x 1 root root 19K Nov 6 2023 /usr/lib64/libnvidia-tls.so.545.23.08
-rwxr-xr-x 1 root root 58M Nov 7 2023 /usr/lib64/libnvoptix.so.545.23.08
-rwxr-xr-x 1 root root 131K Apr 12 2021 /usr/lib64/libOpenCL.so.1.0.0
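To get the total as a number rather than adding up the listing by eye, something like the following sketch could be used. The function name `preload_total_bytes` is hypothetical. It splits the list with a `read`-scoped `IFS`, so the shell's global `IFS` is left untouched.

```shell
# Print the combined size in bytes of all files in a colon-separated list.
preload_total_bytes() {
    local list=$1 total=0 size path
    local -a paths
    # IFS is changed only for this read, not for the rest of the shell.
    IFS=':' read -r -a paths <<< "$list"
    for path in "${paths[@]}"; do
        # Ignore empty entries and anything that is not a regular file.
        [ -f "$path" ] || continue
        # wc -c on redirected input prints just the byte count.
        size=$(wc -c < "$path")
        total=$((total + size))
    done
    printf '%s\n' "$total"
}
```

For example, `preload_total_bytes "$EESSI_GPU_LD_PRELOAD"` prints the byte count for the full list, and piping it through `numfmt --to=iec` (GNU coreutils) gives a human-readable figure.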
…er into update_driver_script
@boegel I've played with this a lot today and I'm happy with the functionality now:
{EESSI 2023.06} [rocky@ip-172-31-20-85 software-layer]$ ./scripts/gpu_support/nvidia/link_nvidia_host_libraries.sh --no-download --ld-preload
Found host CUDA version 7.5
Found NVIDIA GPU driver version 545.23.08
Using default list of libraries
Matched 48 CUDA Libraries
When attempting to use LD_PRELOAD we exclude anything related to graphics
Match found for libcuda.so for CUDA compat libraries
Match found for libcudadebugger.so for CUDA compat libraries
libGLdispatch.so.0 is NOT in the provided preload list, filtering /lib64/libEGL.so.1
libGLdispatch.so.0 is NOT in the provided preload list, filtering /lib64/libEGL.so
libGLdispatch.so.0 is NOT in the provided preload list, filtering /lib64/libGLESv1_CM.so.1
libGLdispatch.so.0 is NOT in the provided preload list, filtering /lib64/libGLESv1_CM.so
libGLdispatch.so.0 is NOT in the provided preload list, filtering /lib64/libGLESv2.so.2
libGLdispatch.so.0 is NOT in the provided preload list, filtering /lib64/libGLESv2.so
libGLX.so.0 is NOT in the provided preload list, filtering /lib64/libGL.so.1
libGLX.so.0 is NOT in the provided preload list, filtering /lib64/libGL.so
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libGLX_nvidia.so.0
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libGLX.so.0
libXext.so.6 is NOT in the provided preload list, filtering /lib64/libGLX.so
libwayland-server.so.0 is NOT in the provided preload list, filtering /lib64/libnvidia-egl-wayland.so.1
libnvcuvid.so.1 is NOT in the provided preload list, filtering /lib64/libnvidia-encode.so.1
libnvcuvid.so.1 is NOT in the provided preload list, filtering /lib64/libnvidia-encode.so
libGL.so.1 is NOT in the provided preload list, filtering /lib64/libnvidia-fbc.so.1
libGL.so.1 is NOT in the provided preload list, filtering /lib64/libnvidia-fbc.so
libXNVCtrl.so.0 is NOT in the provided preload list, filtering /lib64/libnvidia-gtk3.so.545.23.08
Match found for libnvidia-nvvm.so for CUDA compat libraries
libnvcuvid.so.1 is NOT in the provided preload list, filtering /lib64/libnvidia-opticalflow.so.1
Match found for libnvidia-ptxjitcompiler.so for CUDA compat libraries
libGLdispatch.so.0 is NOT in the provided preload list, filtering /lib64/libOpenGL.so.0
libGLdispatch.so.0 is NOT in the provided preload list, filtering /lib64/libOpenGL.so
The recommended way to use LD_PRELOAD is to only use it when you need to.
A minimal preload which should work in most cases:
export EESSI_GPU_COMPAT_LD_PRELOAD="/usr/lib64/libcuda.so.545.23.08:/usr/lib64/libcudadebugger.so.545.23.08:/usr/lib64/libnvidia-nvvm.so.545.23.08:/usr/lib64/libnvidia-ptxjitcompiler.so.545.23.08"
A corner-case full preload (which is hard on memory) for exceptional use:
export EESSI_GPU_LD_PRELOAD="/usr/lib64/libcuda.so.545.23.08:/usr/lib64/libcudadebugger.so.545.23.08:/usr/lib64/libEGL_nvidia.so.545.23.08:/usr/lib64/libGLdispatch.so.0.0.0:/usr/lib64/libGLESv1_CM_nvidia.so.545.23.08:/usr/lib64/libGLESv2_nvidia.so.545.23.08:/usr/lib64/libnvcuvid.so.545.23.08:/usr/lib64/libnvidia-cfg.so.545.23.08:/usr/lib64/libnvidia-eglcore.so.545.23.08:/usr/lib64/libnvidia-glcore.so.545.23.08:/usr/lib64/libnvidia-glsi.so.545.23.08:/usr/lib64/libnvidia-glvkspirv.so.545.23.08:/usr/lib64/libnvidia-gpucomp.so.545.23.08:/usr/lib64/libnvidia-ml.so.545.23.08:/usr/lib64/libnvidia-nvvm.so.545.23.08:/usr/lib64/libnvidia-opencl.so.545.23.08:/usr/lib64/libnvidia-ptxjitcompiler.so.545.23.08:/usr/lib64/libnvidia-rtcore.so.545.23.08:/usr/lib64/libnvidia-tls.so.545.23.08:/usr/lib64/libnvoptix.so.545.23.08:/usr/lib64/libOpenCL.so.1.0.0"
export EESSI_OVERRIDE_GPU_CHECK="1"
Then you can set LD_PRELOAD only when you want to run a GPU application, e.g.,
LD_PRELOAD="$EESSI_GPU_COMPAT_LD_PRELOAD" device_query
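The minimal list above contains exactly the four libraries reported as "Match found ... for CUDA compat libraries". Here is a rough sketch of how such a list could be assembled from already-resolved paths. The `compat_names` array and the `build_compat_preload` function are hypothetical names, and matching on the file name prefix is an assumption rather than the script's confirmed logic.

```shell
# Base names taken from the "Match found ... for CUDA compat libraries" lines.
compat_names=(libcuda.so libcudadebugger.so libnvidia-nvvm.so libnvidia-ptxjitcompiler.so)

# build_compat_preload LIB...
# Prints a colon-separated list of the given libraries whose file name is a
# compat base name, optionally followed by a version suffix such as .545.23.08.
build_compat_preload() {
    local library name result=""
    for library in "$@"; do
        for name in "${compat_names[@]}"; do
            case "${library##*/}" in
                "$name"|"$name".*)
                    # Append with a ':' separator unless this is the first entry.
                    result="${result:+$result:}$library"
                    break
                    ;;
            esac
        done
    done
    printf '%s\n' "$result"
}
```

The result could then be exported, for example `export EESSI_GPU_COMPAT_LD_PRELOAD="$(build_compat_preload "${filtered_libraries[@]}")"`, where `filtered_libraries` is assumed to hold the resolved paths.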
bot: build repo:eessi.io-2023.06-software arch:x86_64/generic
LD_PRELOAD
Also tested the script within
Accepted all except one
Co-authored-by: TopRichard <121792457+TopRichard@users.noreply.github.com>
@TopRichard This will need to be re-tested now to make sure the changes haven't had an unintended impact.