In the drone project, hover_eval.py is extremely slow #530

Open
zhrli opened this issue Jan 10, 2025 · 11 comments

Comments

@zhrli

zhrli commented Jan 10, 2025

(image)
After training, I ran 'python hover_eval.py -e drone-hovering --ckpt 500 --record' in Docker, and it is very slow.
By the way, I also ran the locomotion example in Docker, and it works normally.

@KafuuChikai
Contributor

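Could you try modifying hover_eval.py?

  • Comment out the line env_cfg["visualize_target"] = True
  • Remove the --record argument

These two features are what differ from locomotion; everything else should be the same. Then run

python hover_eval.py -e drone-hovering --ckpt 500

If that still doesn't work, could you try running without Docker?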
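
The two toggles above could also be exposed on the command line instead of editing the file each time. The following is a minimal sketch, not the actual hover_eval.py: the long option name `--exp_name`, the flag `--no-visualize-target` and the helper `apply_flags` are hypothetical; only `-e`, `--ckpt`, `--record` and the `visualize_target` key appear in this thread.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # "-e", "--ckpt" and "--record" mirror the command used in this thread;
    # the long name "--exp_name" is an assumption.
    parser.add_argument("-e", "--exp_name", default="drone-hovering")
    parser.add_argument("--ckpt", type=int, default=500)
    parser.add_argument("--record", action="store_true")
    # Hypothetical flag: lets you disable target visualization without editing code.
    parser.add_argument("--no-visualize-target", action="store_true")
    return parser


def apply_flags(env_cfg, args):
    """Return a copy of env_cfg with visualize_target set from the CLI flag."""
    cfg = dict(env_cfg)
    cfg["visualize_target"] = not args.no_visualize_target
    return cfg


args = build_parser().parse_args(
    ["-e", "drone-hovering", "--ckpt", "500", "--no-visualize-target"]
)
env_cfg = apply_flags({"visualize_target": True}, args)
print(args.record, env_cfg["visualize_target"])  # False False
```

With flags like these, each feature can be switched off on its own, which makes it easier to tell which one is responsible for the slowdown.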

@KafuuChikai
Contributor

KafuuChikai commented Jan 10, 2025

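Could you share your numbers from training?

I'd like to check whether your computational efficiency during training matches mine. My test platform:

  • Ubuntu 20.04
  • AMD Ryzen 7950X
  • RTX 4090 D
  • DDR5 64 GB

Simulation speed is 1,200,000 steps/s, and training finishes in about 1 minute 30 seconds.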
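
For comparing numbers like the one above, a rough way to measure throughput is to time a fixed number of step calls. This is only a sketch: `measure_throughput` and `dummy_step` are hypothetical names, `num_envs=4096` is an arbitrary illustrative value, and it assumes the figure counts steps summed over all parallel environments, which the thread does not confirm.

```python
import time


def measure_throughput(step_fn, num_envs, num_steps):
    """Return environment steps per second over num_steps calls of step_fn."""
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()  # one call is assumed to advance all num_envs environments
    elapsed = time.perf_counter() - start
    return num_envs * num_steps / elapsed


def dummy_step():
    # Stand-in for env.step(actions); sleeps instead of simulating.
    time.sleep(0.001)


rate = measure_throughput(dummy_step, num_envs=4096, num_steps=50)
print(f"{rate:,.0f} env steps/s")
```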

@zhrli
Author

zhrli commented Jan 10, 2025

  • env_cfg["visualize_target"]

(image)
It runs at normal speed now.

Adding --record makes it slow.

@KafuuChikai
Contributor

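  • env_cfg["visualize_target"]

(image) It runs at normal speed now.

Adding --record makes it slow.

So is --record the cause?

Here are two directions for tracking down the specific problem:

  1. The video recording feature I wrote is based on examples/tutorials/visualization.py. Can you check whether that example runs properly?

  2. In hover_eval.py, take a look at this code block: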

        with torch.no_grad():
            if args.record:
                env.cam.start_recording()
                for _ in range(max_sim_step):
                    actions = policy(obs)
                    obs, _, rews, dones, infos = env.step(actions)
                    env.cam.render()
                env.cam.stop_recording(save_to_filename="video.mp4", fps=env_cfg["max_visualize_FPS"])
            else:
                for _ in range(max_sim_step):
                    actions = policy(obs)
                    obs, _, rews, dones, infos = env.step(actions)

    This is where the record flag takes effect. Can you check which specific line is causing the slowdown?
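
One way to narrow this down is to time each call in the recording loop separately. The sketch below is a generic helper, not part of hover_eval.py: `SectionTimer` and the `fake_*` functions are hypothetical names, and the sleeps only simulate work. In the real script the three timed sections would wrap `policy(obs)`, `env.step(actions)` and `env.cam.render()`. Since GPU work is typically asynchronous, a synchronization call before reading the clock would probably be needed for accurate numbers; that is an assumption, not something verified here.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SectionTimer:
    """Accumulates wall-clock time per named section."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def section(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        # Slowest section first: (name, total seconds, mean seconds per call).
        rows = sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
        return [(name, total, total / self.counts[name]) for name, total in rows]


# Stand-ins for policy(obs), env.step(actions) and env.cam.render().
def fake_policy():
    time.sleep(0.001)


def fake_step():
    time.sleep(0.001)


def fake_render():
    time.sleep(0.01)  # pretend rendering is the expensive call


timer = SectionTimer()
for _ in range(5):
    with timer.section("policy"):
        fake_policy()
    with timer.section("step"):
        fake_step()
    with timer.section("render"):
        fake_render()

for name, total, mean in timer.report():
    print(f"{name:>8}: total {total:.4f}s, mean {mean * 1000:.2f} ms/call")
```

If the render section dominates, the slowdown comes from camera rendering rather than from the policy or the physics step.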

@zhrli
Author

zhrli commented Jan 10, 2025

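Could you share your numbers from training?

I'd like to check whether your computational efficiency during training matches mine. My test platform:

  • Ubuntu 20.04
  • AMD Ryzen 7950X
  • RTX 4090 D
  • DDR5 64 GB

Simulation speed is 1,200,000 steps/s, and training finishes in about 1 minute 30 seconds.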

(base) lizhaorui@ubuntu:~/DL/Genesis/examples$ nvidia-smi
Fri Jan 10 16:02:14 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.113.01             Driver Version: 535.113.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:18:00.0 Off |                  Off |
| 30%   30C    P2              57W / 450W |    652MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        Off | 00000000:C3:00.0  On |                  Off |
| 58%   31C    P8              31W / 450W |    221MiB / 24564MiB |     17%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A   1749295      G   /usr/lib/xorg/Xorg                            4MiB |
|    0   N/A  N/A   2623347      C   python                                      632MiB |
|    1   N/A  N/A   1749295      G   /usr/lib/xorg/Xorg                          209MiB |
+---------------------------------------------------------------------------------------+
(base) lizhaorui@ubuntu:~/DL/Genesis/examples$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0

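It looks like quite a few people have run into this problem; some updated CUDA, others added code. I couldn't get this example to run on an M3 Mac: it fails with a segmentation fault.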

@KafuuChikai
Contributor


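My CUDA version is even newer than yours: 12.4.

So this comes down to the engine. It's best to wait for future updates; compatibility still needs to improve.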

@formoree

Contributor

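Does the CUDA version have a big impact on how Genesis runs?
I previously rented a GPU: with CUDA 11.8 Genesis could not use the GPU at all, but with 12.2 or later it worked.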
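
To compare an installed toolkit against that observation, the release number can be read from `nvcc -V` output. This is a minimal sketch: `parse_nvcc_release` and `MIN_WORKING` are hypothetical names, and the (12, 2) threshold comes only from the single report above, not from any documented Genesis requirement. Note also that `nvcc -V` reports the installed toolkit, while the "CUDA Version" shown by nvidia-smi is the highest version the driver supports, so the two can differ.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) parsed from `nvcc -V` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Lines taken from the nvcc output pasted earlier in this thread.
sample = """nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 12.2, V12.2.91
"""

MIN_WORKING = (12, 2)  # assumption based on one report in this thread
version = parse_nvcc_release(sample)
print(version, version >= MIN_WORKING)  # (12, 2) True
```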

@formoree

#203 is the issue that describes the specific problem.

@zhrli
Author

zhrli commented Jan 10, 2025

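Does the CUDA version have a big impact on how Genesis runs? I previously rented a GPU: with CUDA 11.8 Genesis could not use the GPU at all, but with 12.2 or later it worked.

That's why I pulled the Docker image, but it seems some things don't run properly in Docker either.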

@zhrli
Author

zhrli commented Jan 10, 2025

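visualization.py does not run properly either. I tried it on both Linux and macOS: on Linux it is slow, and on the Mac it fails with an OpenCV error. My guess is that the CUDA version is too old.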

@KafuuChikai
Contributor

Then it may be a problem with the engine or with Docker. Could you open a new issue about camera.render()? @zhouxian
