CUDA error causes the micp_localization node to die #2
The complete configuration file, for reference: base_frame: base_link sensors:
I changed the CUDA toolkit version from 12.6 to 11.8 and the issue disappeared.
Hi @Mh-Magdy, thanks for testing. That's odd, though: normally it should run with any CUDA version, so I would say it's still an issue. I will reopen it as a reminder for me to check this. Could you give me some more info about the setup you used?
With this I think I could reproduce the error and hopefully fix it soon (or someone else could).
Hiii @amock 👋
There are a few other minor issues that I have run into recently after the update. I will report them to you in more detail later, but here is a hint for now: when I edit the sensor parameters, for example changing the number of horizontal samples to match my real sensor (theta_inc, theta_N), the package fails to run if the backend is optix, whereas it works fine if I change it to embree. I will capture any issues like these and give you details on each issue, my environment/setup, and the config to help you reproduce the errors. Thank you, Alexander.
Hi @Mh-Magdy, I have finally found some time to deal with your issue. First I tried to replicate your setup:
In the first terminal I started the example simulation by executing: roslaunch rmcl_example start_robot.launch then I changed the roslaunch rmcl_example rmcl_micp.launch. In the RViz window I set an initial pose guess and everything went fine. So unfortunately, I could not reproduce the error you described. Could you maybe try the exact same procedure on your system? Otherwise I am not sure what is wrong on your system :/ Perhaps you could also check whether rmagine alone is working; there are some benchmark executables in it. Or perhaps you could check whether CUDA is working for other projects. Best
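For anyone who wants to run that last check ("is CUDA working at all?") without building another project, here is a minimal sketch in Python. It is not part of rmcl or rmagine. It assumes a Linux system where the NVIDIA driver library is named libcuda.so.1, and it only calls the CUDA driver API functions cuInit, cuDriverGetVersion and cuDeviceGetCount through ctypes. The function names probe_cuda_driver and decode_cuda_version are made up for this example. Note that it reports the highest CUDA version the installed driver supports, not the toolkit version (12.6 vs. 11.8) that the code was compiled with.

```python
import ctypes
from typing import Optional, Tuple


def decode_cuda_version(v: int) -> Tuple[int, int]:
    """CUDA encodes versions as 1000 * major + 10 * minor (12060 means 12.6)."""
    return v // 1000, (v % 1000) // 10


def probe_cuda_driver(libname: str = "libcuda.so.1") -> Optional[Tuple[Tuple[int, int], int]]:
    """Return ((major, minor), device_count), or None if the driver is unusable.

    libname is an assumption: libcuda.so.1 is the usual driver library
    name on Linux, other platforms use a different name.
    """
    try:
        cuda = ctypes.CDLL(libname)
    except OSError:
        # Driver library not visible, e.g. a container started without GPU access.
        return None

    # Every driver API call returns 0 (CUDA_SUCCESS) on success.
    if cuda.cuInit(0) != 0:
        return None

    version = ctypes.c_int(0)
    if cuda.cuDriverGetVersion(ctypes.byref(version)) != 0:
        return None

    count = ctypes.c_int(0)
    if cuda.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None

    return decode_cuda_version(version.value), count.value


if __name__ == "__main__":
    result = probe_cuda_driver()
    if result is None:
        print("CUDA driver not usable from this environment")
    else:
        (major, minor), n_devices = result
        print(f"driver supports CUDA {major}.{minor}, {n_devices} device(s) visible")
```

If this already reports that the driver is not usable, the problem is probably below rmcl and rmagine (driver installation or container setup) rather than in the MICP configuration.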
Hi Alexander @amock, I hope you are doing well. I'd like to share some observations from a recent experiment related to this issue. In the initial experiments that led to opening this issue, I ran the entire software suite (rmcl, mesh_nav, and Gazebo) inside a Docker container that used the host machine's GPU driver. However, due to my limited experience with Docker, I occasionally misconfigured environment variables and mismanaged Docker layers. As a result, the container did not properly pick up the NVIDIA GPU driver libraries, in particular libnvoptix.so, which rmagine requires in addition to the downloaded OptiX headers to generate the rmagine-optix executable. Although the rmagine-optix executable appeared to build successfully without errors, it actually caused multiple runtime issues, including the errors discussed above. I hope these observations help clarify the challenges and contribute to finding a solution. Thank you for your collaboration and support. Best regards,
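A quick way to spot this kind of container misconfiguration before starting the node is to test whether the driver libraries can be loaded at all. The following is a minimal sketch, not something shipped with rmagine. It assumes Linux and assumes that the OptiX driver library is exposed as libnvoptix.so.1 and the CUDA driver as libcuda.so.1. The helper name check_libraries is made up for this example. It also prints NVIDIA_DRIVER_CAPABILITIES, the variable read by the NVIDIA Container Toolkit; which value is needed to expose OptiX may depend on the toolkit version, so treat that part as a hint only.

```python
import ctypes
import os


def check_libraries(names):
    """Try to dlopen each library and report whether loading succeeded."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = True
        except OSError:
            # Library is missing or cannot be resolved inside this environment.
            results[name] = False
    return results


if __name__ == "__main__":
    # Assumed library names (Linux); adjust them if your driver differs.
    wanted = ["libcuda.so.1", "libnvoptix.so.1"]
    for lib, ok in check_libraries(wanted).items():
        print(f"{lib}: {'found' if ok else 'NOT found'}")

    # Only relevant inside containers started via the NVIDIA Container Toolkit.
    caps = os.environ.get("NVIDIA_DRIVER_CAPABILITIES", "<unset>")
    print(f"NVIDIA_DRIVER_CAPABILITIES={caps}")
```

Run inside the container, a "NOT found" for the OptiX library would match the situation described above: the build succeeds because only the headers are needed at compile time, while the driver library is looked up at runtime.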
Hi Muhammad @Mh-Magdy, over the last few months we have been working on putting everything into Docker images as well, partly because our situation forced us to do so. As a nice side effect, we are planning to upload preconfigured Docker files and provide instructions on how to use them. This will come with the next update, which brings features like object pose tracking and convenience tools for scan filtering. Preview: https://www.youtube.com/watch?v=9i3B1ayvMn4 . Thanks for sharing your insights; they might help us with our Docker setup! And yes, this libnvoptix thing is quite important. (If anyone from NVIDIA is reading this, please consider integrating this library into JetPack.) Best
First of all thank you for this great package and the amazing work.
I'm running the micp node with the combining unit set to cpu and the backend set to optix, and everything is OK. When I changed the combining unit to gpu, I got the following error:
I edited it back to cpu and got the same error; changing the backend to embree also gave the same error :(
Could you please help me get past this error?