(Feature request) Select GPU #9

Closed
Astatin3 opened this issue Oct 28, 2024 · 7 comments

Comments

@Astatin3
Contributor

With this project, is it possible to choose which GPU the training runs on? It seems to default to the 0th GPU on the computer, which is not always the most performant one.

@ArthurBrussee
Owner

Ah yes, that would be good. I think ideally this would be handled by Burn (or rather https://github.com/tracel-ai/cubecl), as more people would need this.

That will require some more backend work:

When those have made some progress I'll update Brush to use this, and allow you to select a device. Hopefully that works!

PS: Can you let me know what devices you have? Do you have multiple dedicated GPUs? Or is there a bug where it selects an integrated GPU over a dedicated one?

@Astatin3
Contributor Author

Astatin3 commented Oct 29, 2024

I have an Intel GPU at index 0 and an NVIDIA GPU at index 1. This was actually a problem I had with Windows as well: Windows thought the integrated GPU was the high-performance one and the dedicated one was the more power-efficient one.

Thank you so much!

@ArthurBrussee
Owner

Ok! I think that's technically more a bug with wgpu (or Windows?), as the logic should already be to pick the dedicated GPU, but either way it's good to have an override! The egui PR just landed, so now it's just the Burn one. Burn is having some issues at head, but I hope to have those resolved soon.
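
As a rough illustration of the "pick the dedicated GPU" logic mentioned above, here is a minimal sketch in plain Rust. It is not wgpu's or Brush's actual selection code: `DeviceKind`, `AdapterInfo`, `rank`, and `pick_default` are hypothetical names made up for this example, and the ordering (discrete, then integrated, then anything else) is an assumption.

```rust
/// Hypothetical device categories, loosely modelled on how graphics APIs
/// classify adapters. Not taken from wgpu.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DeviceKind {
    Discrete,
    Integrated,
    Other,
}

/// Hypothetical description of one enumerated adapter.
#[derive(Debug, Clone, PartialEq, Eq)]
struct AdapterInfo {
    name: String,
    kind: DeviceKind,
}

/// Lower rank means more preferred (assumed ordering).
fn rank(kind: DeviceKind) -> u8 {
    match kind {
        DeviceKind::Discrete => 0,
        DeviceKind::Integrated => 1,
        DeviceKind::Other => 2,
    }
}

/// Returns the index of the preferred adapter, or `None` if the list is empty.
/// On ties, the earliest adapter in enumeration order wins.
fn pick_default(adapters: &[AdapterInfo]) -> Option<usize> {
    adapters
        .iter()
        .enumerate()
        .min_by_key(|(_, adapter)| rank(adapter.kind))
        .map(|(index, _)| index)
}
```

Under these assumptions, an Intel integrated GPU at index 0 and an NVIDIA discrete GPU at index 1 would yield index 1. If the driver reported both adapters with the same kind, the tie would fall back to index 0, which would match the symptom described in this thread.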

@ArthurBrussee
Owner

Ok, I resolved the last few issues in the Burn update and landed the selection! After this PR (820deb5), you can set the CUBECL_WGPU_DEFAULT_DEVICE env variable to pick between GPUs like this:

CUBECL_WGPU_DEFAULT_DEVICE="DiscreteGpu(1)"   # Use discrete GPU index 1
CUBECL_WGPU_DEFAULT_DEVICE="IntegratedGpu(0)" # Use integrated GPU index 0
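
To make the value format concrete, here is a minimal sketch of how a string such as `DiscreteGpu(1)` could be parsed into a device kind and an index. This is not CubeCL's actual parser: `DeviceSpec`, `parse_device_spec`, and `device_spec_from_env` are hypothetical names, and only the two value forms shown in this thread are handled (the real variable may accept others).

```rust
use std::env;

/// Hypothetical parsed form of the two values shown in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DeviceSpec {
    DiscreteGpu(usize),
    IntegratedGpu(usize),
}

/// Parses strings of the form `Name(index)`. Returns `None` for anything
/// that does not match one of the two assumed forms.
fn parse_device_spec(raw: &str) -> Option<DeviceSpec> {
    let (name, rest) = raw.trim().split_once('(')?;
    let index: usize = rest.strip_suffix(')')?.trim().parse().ok()?;
    match name.trim() {
        "DiscreteGpu" => Some(DeviceSpec::DiscreteGpu(index)),
        "IntegratedGpu" => Some(DeviceSpec::IntegratedGpu(index)),
        _ => None,
    }
}

/// Reads the variable named in this thread and parses it, if it is set.
fn device_spec_from_env() -> Option<DeviceSpec> {
    env::var("CUBECL_WGPU_DEFAULT_DEVICE")
        .ok()
        .and_then(|value| parse_device_spec(&value))
}
```

Note that the index counts devices within one kind, so `DiscreteGpu(0)` means the first discrete GPU rather than adapter 0 overall. When setting the variable in a shell, quote the value because unquoted parentheses are shell syntax.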

Again, I suspect there's a bug in wgpu, so they might both be reported as dedicated or both as integrated; not sure.

I've added a display of what GPU is being used: 809c1d8

Lmk if that works!

@Astatin3
Contributor Author

Astatin3 commented Nov 1, 2024

Thank you! It appears to select the correct device now. Unfortunately, it now hits another error. It fails with: "There was no valid format for the surface at all."

(base) astatin3@acer:~/Documents/GitHub/brush$ CUBECL_WGPU_DEFAULT_DEVICE="DiscreteGpu(0)" RUST_BACKTRACE=1 cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.26s
     Running `target/debug/brush_bin`
[2024-11-01T17:21:08Z ERROR wgpu_hal::vulkan::adapter] get_physical_device_surface_support: Initialization of an object has failed
[2024-11-01T17:21:08Z ERROR eframe::native::run] Exiting because of error: WGPU error: There was no valid format for the surface at all.
thread 'main' panicked at crates/brush-desktop/src/main.rs:33:10:
called `Result::unwrap()` on an `Err` value: Wgpu(NoSurfaceFormatsAvailable)
stack backtrace:
   0: rust_begin_unwind
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:662:5
   1: core::panicking::panic_fmt
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/panicking.rs:74:14
   2: core::result::unwrap_failed
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/result.rs:1677:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/result.rs:1102:23
   4: brush_bin::main
             at ./crates/brush-desktop/src/main.rs:28:9
   5: core::ops::function::FnOnce::call_once
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

@ArthurBrussee
Owner

Hmmm, well, that would also explain why it wasn't picked before: it seems wgpu doesn't believe it can support rendering the surface / window. Do you have other graphical applications that do run on this GPU? I'm not sure how integrated + dedicated GPU stuff works on Linux.
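
The difference between automatic selection and a forced override can be sketched as follows. This is an illustration only, not wgpu's or eframe's real logic: `Adapter`, `SelectError`, `select_auto`, and `select_forced` are hypothetical, and the `can_present` flag stands in for whatever surface-support query the real backend performs.

```rust
/// Hypothetical adapter record; `can_present` is an assumed stand-in for
/// "this adapter can render to the window surface".
#[derive(Debug, Clone, PartialEq, Eq)]
struct Adapter {
    name: String,
    can_present: bool,
}

/// Hypothetical error type for a forced selection.
#[derive(Debug, PartialEq, Eq)]
enum SelectError {
    IndexOutOfRange(usize),
    CannotPresent(String),
}

/// Automatic selection silently skips adapters that cannot present.
fn select_auto(adapters: &[Adapter]) -> Option<&Adapter> {
    adapters.iter().find(|adapter| adapter.can_present)
}

/// A forced override has nothing to fall back to, so an unusable adapter
/// surfaces as an error the caller can report instead of unwrapping.
fn select_forced(adapters: &[Adapter], index: usize) -> Result<&Adapter, SelectError> {
    let adapter = adapters
        .get(index)
        .ok_or(SelectError::IndexOutOfRange(index))?;
    if !adapter.can_present {
        return Err(SelectError::CannotPresent(adapter.name.clone()));
    }
    Ok(adapter)
}
```

With an Intel adapter that can present and an NVIDIA adapter that cannot, `select_auto` would quietly return the Intel one, while `select_forced` with index 1 would return an error. That mirrors the sequence reported in this thread, assuming the surface check is what fails.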

@Astatin3
Contributor Author

Astatin3 commented Nov 3, 2024

Sorry for the late response.
This should be the only program that is running on that GPU.

I believe this is solved; what I have run into are Linux support problems.

Astatin3 closed this as completed Nov 3, 2024