WIP: Parallel SAMIN #843

Draft: wants to merge 2 commits into base: master


@MatFi MatFi commented Aug 14, 2020

My problems often involve functions that are expensive to evaluate. However, Julia IMHO offers hardly any native packages with parallel methods for (global) optimization (besides BlackBoxOptim.jl). I have now made your SAMIN method cluster-compatible and would like to contribute this to Optim.jl. What do you think?

To do:

  • Worker-based parallelization
  • Threaded parallelization (needed?)
  • Correct implementation of f_calls-stats (now it just uses the iteration counter)

From my first benchmarks I can see that the pay-off comes when a single f_call costs more than 1e-5 s (all workers on the same machine).

[image]

@ChrisRackauckas
Contributor

This is great! However, you're doing too much work. Instead of trying to implement every form of parallelism, which you won't manage (what about GPUs? TPUs? ...), it would be good if this was just a batch interface, i.e. give the user an array of arrays or a matrix of x and have them return a whole vector of objective function values. If you do it like that, all forms of parallelism are implemented.
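
A minimal sketch of what such a batch interface could look like, assuming the optimizer hands over a vector of candidate points and expects a vector of objective values back. The names `batch_serial` and `best_of_batch` are hypothetical and not part of Optim.jl, and the "optimizer step" is a toy stand-in rather than SAMIN:

```julia
# Test objective (Rosenbrock); stands in for an expensive user function.
rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

# Hypothetical batch objective: takes a vector of points, returns a vector of values.
# This serial version can be swapped for any parallel one without touching the solver.
batch_serial(xs) = map(rosenbrock, xs)

# Toy solver step (NOT SAMIN): evaluate all candidates as one batch, keep the best.
function best_of_batch(batch_f, xs)
    fs = batch_f(xs)
    i = argmin(fs)
    return xs[i], fs[i]
end

xs = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
xbest, fbest = best_of_batch(batch_serial, xs)
```

The solver only ever calls `batch_f(xs)`, so how the batch is evaluated (threads, processes, GPU) stays entirely the caller's decision.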

@codecov

codecov bot commented Aug 14, 2020

Codecov Report

Attention: Patch coverage is 82.71605% with 14 lines in your changes missing coverage. Please review.

Project coverage is 81.65%. Comparing base (4497296) to head (9ddf656).
Report is 139 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| src/multivariate/solvers/constrained/samin.jl | 82.71% | 14 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #843      +/-   ##
==========================================
+ Coverage   81.48%   81.65%   +0.17%     
==========================================
  Files          43       43              
  Lines        2684     2720      +36     
==========================================
+ Hits         2187     2221      +34     
- Misses        497      499       +2     


@pkofod
Member

pkofod commented Aug 14, 2020

I have made something like what Chris mentions for Particle Swarm that you can have a look at https://gist.github.com/pkofod/c6f0dc28588e1d65b521ba785405aff2

@pkofod
Member

pkofod commented Aug 14, 2020

I also have the same algorithm in a slightly different flavor here: https://github.com/pkofod/NLSolvers.jl/blob/bc8b1941218fea9349dc9797ed32bb577c59289c/src/optimize/randomsearch/particleswarm.jl#L48. The idea there is that the objective type determines how the objective receives the states, either as a batch or one at a time, through the batched_value call.
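
A rough sketch of the dispatch idea described here, under the assumption that the objective is wrapped in a type that says how it wants to receive states. The types `PointwiseObjective` and `BatchedObjective` and the function `evaluate_all` are hypothetical names chosen for illustration; they are not the NLSolvers.jl API and do not reproduce its `batched_value` signature:

```julia
# Hypothetical wrapper: the objective takes one state at a time.
struct PointwiseObjective{F}
    f::F   # f(x) -> value for a single state x
end

# Hypothetical wrapper: the objective takes all states at once.
struct BatchedObjective{F}
    f::F   # f(xs) -> vector of values for a collection of states xs
end

# The solver calls `evaluate_all`; dispatch on the wrapper type decides
# whether states are passed one by one or as a whole batch.
evaluate_all(obj::PointwiseObjective, xs) = [obj.f(x) for x in xs]
evaluate_all(obj::BatchedObjective, xs) = obj.f(xs)

sphere(x) = sum(abs2, x)
xs = [[1.0, 2.0], [0.0, 0.0], [3.0, 4.0]]

vals_pointwise = evaluate_all(PointwiseObjective(sphere), xs)
vals_batched = evaluate_all(BatchedObjective(ys -> map(sphere, ys)), xs)
```

With this layout, existing single-point objectives keep working unchanged, while users who want their own parallel scheme only need to supply a batched objective.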

@MatFi
Author

MatFi commented Aug 14, 2020

> it would be good if this was just a batch interface

I definitely see the advantages of doing this, but keep in mind that some functions can vary a lot in timing (e.g. integration of stiff differential equations). Therefore I would like to keep the possibility of asynchronous evaluation, because a batch-based implementation idles too much in such cases. (Of course, it is always possible to synchronize the asynchronous procedure with a full batch to allow for both.)

> If you do it like that, all forms of parallelism are implemented.

Actually the implementation is essentially shifted to the user then, but of course one can cover the most common cases using a suitable dispatch.

> I have made something like what Chris mentions for Particle Swarm that you can have a look at

Thanks, I was already about to ask for examples... Extremely helpful 👍

@ChrisRackauckas
Contributor

> I definitely see the advantages of doing this, but keep in mind that some functions can vary a lot in timing (e.g. integration of stiff differential equations). Therefore I would like to keep the possibility of asynchronous evaluation, because a batch-based implementation idles too much in such cases. (Of course, it is always possible to synchronize the asynchronous procedure with a full batch to allow for both.)

That's exactly the reason why it should be on the caller's side. Sometimes that's the case: in the integration of stiff stochastic differential equations I've seen a whopping 3 orders of magnitude timing difference with the same equation (due to switching off a steady state). However, you can't rely on this, because the spawn cost for threads is high: if you have an ODE that finishes in about 1 ms, then spawning a task per thread for dynamic scheduling or using pmap is far too slow. So you can't just have "threading", you need:

  1. threading with dynamic scheduling
  2. threading with static scheduling
  3. threading with partial static scheduling (clumping but not the number of threads)
  4. distributed dynamic
  5. distributed static
  6. distributed + threads in all combinations
  7. GPU via CuArray
  8. GPU via DiffEqGPU
  9. GPU via KernelAbstractions
  10. MPI (since Distributed doesn't scale all that well)
  11. MPI+CUDA (the Clima setup)

Those are 11 forms of parallelism we are actively using in projects right now with stiff differential equations, all for different purposes due to the trade-offs. It's just so much easier to tell users who want asynchronous evaluation to loop over @spawn than to try to handle parallelism efficiently, since that means something different to every user.
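
To illustrate the point about leaving scheduling to the caller, here is a sketch of two caller-side batch functions built on `Threads.@spawn`: one task per candidate (dynamic scheduling) and one task per chunk (closer to static scheduling). The function names and the `expensive` objective are assumptions made for this example, not part of Optim.jl:

```julia
# Stand-in for an expensive objective (e.g. an ODE solve); purely illustrative.
expensive(x) = sum(abs2, x)

# Dynamic scheduling: one task per candidate. Good when run times vary a lot,
# wasteful when each evaluation is cheaper than the cost of spawning a task.
function batch_spawn(xs)
    tasks = map(xs) do x
        Threads.@spawn expensive(x)
    end
    return fetch.(tasks)   # results are returned in the order of xs
end

# Chunked scheduling: one task per chunk of candidates, which amortizes the
# spawn cost over several evaluations. Assumes xs is non-empty.
function batch_chunked(xs; nchunks = Threads.nthreads())
    chunks = Iterators.partition(xs, cld(length(xs), nchunks))
    tasks = map(chunks) do chunk
        Threads.@spawn map(expensive, chunk)
    end
    return reduce(vcat, fetch.(tasks))
end

xs = [[1.0, 2.0], [0.0, 0.0], [3.0, 4.0]]
vals_spawn = batch_spawn(xs)
vals_chunked = batch_chunked(xs)
```

Both functions have the same shape as a plain batch objective, so the optimizer would not need to know which one the user picked.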

@lrnv

lrnv commented Nov 12, 2020

Hi,

Would it be difficult to integrate @pkofod's gist into Optim.jl? I don't think I understand Julia's internals well enough to do it myself, as the code @pkofod produced is quite different from the content of the ParticleSwarm.jl file in this repo...

But if you think it is easily doable and point me in the right direction, I might do it :)
