Add parallelismFactorHint to Spawn[F] #4199

Open
bpholt opened this issue Dec 12, 2024 · 4 comments

@bpholt
Member

bpholt commented Dec 12, 2024

Someone asked in Discord if there is a way to ask the CE runtime how many CPUs/compute threads it has access to, so that they can set parallelism factors appropriately. It's available in IOApp as computeWorkerThreadCount but this is specific to IOApp and therefore not easily usable from all the places in c.e.std that could benefit from it.

Once it's added to Spawn, new methods or overrides should be added setting default parallelism factors accordingly. For example, in addition to Random.scalaUtilRandomN, there should be a variant that sets N to the computeWorkerThreadCount.

There was some discussion of this in Discord, which I will attempt to summarize:

  1. Daniel originally suggested it as concurrencyFactorHint: Option[Int] (or parallelismFactorHint: Option[Int] in the interest of consistent terminology).
  2. Arman suggested avoiding the Option box by defaulting to 0, to which Daniel "didn't totally object," because "technically anything ≤ 0 is semantically invalid anyway" and "if we go with 0 as the default then the fallback could be to tap the runtime anyway."
  3. There was also some discussion about whether this should be F[Int] to reflect the reality that the number can change in some circumstances, but that opens up quite a rabbit hole, so it may not be worth it? If F[Int] is used, should there be some kind of notification protocol to let data structures optimized for a given value rebalance themselves if the value changes?
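
To make the three options above concrete, here is a rough sketch of how each signature might look. These are standalone traits rather than the real Spawn, every name other than parallelismFactorHint is hypothetical, and the resolveHint helper only illustrates the "anything ≤ 0 is invalid" convention with a caller-supplied fallback. None of this exists in Cats Effect today.

```scala
// Hypothetical sketch: standalone traits, not the real cats.effect.kernel.Spawn.

// Option 1: absence of a hint is explicit, at the cost of boxing.
trait HintAsOption {
  def parallelismFactorHint: Option[Int]
}

// Option 2: no box; any value <= 0 means "no hint available".
trait HintAsInt {
  def parallelismFactorHint: Int = 0
}

// Option 3: effectful, so the value may differ between evaluations.
trait HintAsEffect[F[_]] {
  def parallelismFactorHint: F[Int]
}

object HintFallback {
  // Assumed helper for option 2: use the hint when it is positive,
  // otherwise evaluate a caller-supplied fallback (for example, a value
  // obtained from the runtime).
  def resolveHint(hint: Int)(fallback: => Int): Int =
    if (hint > 0) hint else fallback
}
```

With option 2, a caller would write something like HintFallback.resolveHint(hint)(someRuntimeValue), so the fallback is only evaluated when the hint is missing.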
@durban
Contributor

durban commented Dec 12, 2024

For Spawn[IO], would this return availableProcessors() or the size of the WSTP? (Because the two are not necessarily the same.)

@djspiewak
Member

The latter. The idea here is that the hint would help the user build downstream data structures which have striping strategies which are sensitive to the maximum true parallelism. Dispatcher and Random are two decent examples within std.
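
As a rough illustration of what a striping strategy sensitive to parallelism can look like, here is a minimal striped counter in plain Scala. StripedCounter is a hypothetical class, not code from Dispatcher or Random, and the stripes parameter stands in for whatever value a parallelism hint would supply.

```scala
import java.util.concurrent.ThreadLocalRandom
import java.util.concurrent.atomic.AtomicLongArray

// Hypothetical illustration: spreads updates over `stripes` cells so that
// concurrent writers are less likely to contend on the same cell.
final class StripedCounter(stripes: Int) {
  require(stripes > 0, "stripes must be positive")

  private val cells = new AtomicLongArray(stripes)

  // Pick a cell at random; with roughly one stripe per compute thread,
  // collisions between simultaneous writers become less frequent.
  def increment(): Unit = {
    val idx = ThreadLocalRandom.current().nextInt(stripes)
    cells.incrementAndGet(idx)
    ()
  }

  // Reading requires visiting every stripe, which is the usual trade-off.
  def sum: Long = (0 until stripes).map(i => cells.get(i)).sum
}
```

Too few stripes leaves writers contending; too many wastes memory and slows reads, which is why a hint tied to the maximum true parallelism would be a reasonable default.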

@durban
Contributor

durban commented Dec 12, 2024

In that case, I can't really see how to implement it as a : Int... return the size of which WSTP? There might not even be one. (Whereas if we're doing it as a : F[Int], we'd use the one we're running on. Although that could still be a non-WSTP Executor.)

@djspiewak
Member

This is a good point. I think this probably needs to be F[Int], and we should just blindly use the runtime that we're running on rather than trying to account for evalOn.

@djspiewak djspiewak added this to the v3.7.0 milestone Dec 20, 2024