
Laser envelope solver (finally!) #743

Merged 81 commits into Hi-PACE:development on Oct 28, 2022
Conversation

@MaxThevenet (Member) commented on May 23, 2022

This PR proposes an implementation of a laser envelope solver. The plasma response to an analytic laser pulse was already implemented. Now, the propagation of the laser pulse in a plasma is also included. The model is based on Benedetti's 2017/2018 article.

Features

  • The complex envelope at the current and previous time steps is stored as a 3D array (currently with the same bounds and resolution).
  • The 3D array can be stored in host or device memory, as chosen by a runtime parameter; the copies from/to the 3D array are implemented (see the sketch after this list).
  • Runs on Nvidia GPUs (and CPUs).
  • Runs in parallel.
  • The envelope is dumped in openPMD files.
  • A CI test checks the evolution of a laser pulse in vacuum (and compares with theory).
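
For illustration, here is a minimal sketch of the host-vs-device choice and of the slice copies. All names are hypothetical and plain C++ containers stand in for the AMReX containers and memory arenas used in the actual code:

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical sketch: a runtime flag (an input parameter selecting "3D array on host")
// decides where the full envelope is allocated.
struct LaserEnvelope3D {
    std::size_t nx, ny, nz;
    bool on_host; // runtime choice; the device path is only hinted at in this sketch
    std::vector<std::complex<double>> data; // host storage of size nx*ny*nz

    LaserEnvelope3D (std::size_t nx_, std::size_t ny_, std::size_t nz_, bool on_host_)
        : nx(nx_), ny(ny_), nz(nz_), on_host(on_host_), data(nx_*ny_*nz_)
    {
        // If !on_host, the real code allocates this array in device memory instead,
        // avoiding host<->device copies at the cost of a larger GPU memory footprint.
    }

    // Copy one longitudinal slice iz of the 3D array into a 2D working slice,
    // mirroring the copies to/from the 3D array mentioned above.
    void copy_slice_out (std::size_t iz, std::vector<std::complex<double>>& slice) const {
        slice.assign(data.begin() + iz*nx*ny, data.begin() + (iz+1)*nx*ny);
    }
    void copy_slice_in (std::size_t iz, const std::vector<std::complex<double>>& slice) {
        std::copy(slice.begin(), slice.end(), data.begin() + iz*nx*ny);
    }
};
```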

Structure

The 3D array is Laser::m_F; 2D slices (FArrayBoxes) are stored in Laser::m_slices, labelled with a time-step tag and a longitudinal-slice tag: n00 is time step n, nm1 is n-1, np1 is n+1, and a similar notation is used for the slice index j. In general, fields are stored with 2 Real components, one for the real part and one for the imaginary part, except inside the FFT solver (where fields are stored directly as Complex arrays).
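
As a minimal sketch of the two-Real-component layout (hypothetical names; the actual slices are FArrayBoxes with one component per real/imaginary part), packing and unpacking a complex value on a slice could look like:

```cpp
#include <complex>
#include <vector>

// Sketch of a 2D slice holding a complex envelope as two Real components:
// component 0 is the real part, component 1 the imaginary part.
struct Slice2D {
    int nx, ny;
    std::vector<double> re, im; // in the real code, two components of one FArrayBox

    Slice2D (int nx_, int ny_) : nx(nx_), ny(ny_), re(nx_*ny_, 0.0), im(nx_*ny_, 0.0) {}

    std::complex<double> get (int i, int j) const { return {re[j*nx + i], im[j*nx + i]}; }
    void set (int i, int j, std::complex<double> a) {
        re[j*nx + i] = a.real();
        im[j*nx + i] = a.imag();
    }
};

// The FFT solver works directly on Complex arrays, so values are converted
// to/from this two-component layout only at its boundary.
```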

A new Poisson solver is implemented to solve the C2C Poisson equation with periodic boundary conditions. Although this duplicates part of the existing solver, it keeps the code simpler: abstracting the type of the source array of the forward FFT so that it can be either Real (as needed everywhere else) or Complex (as needed for the laser pulse) would require significant templating and obfuscate the code. This could be reconsidered. The solver is implemented directly in Laser.cpp in Laser::AdvanceSliceFFT, with the required abstraction for portability in Laser.H.
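
For reference, a C2C spectral Poisson solve with periodic boundary conditions follows the usual pattern: forward FFT of the complex source, multiplication by $-1/(k_x^2+k_y^2)$, inverse FFT, and normalization. Below is a plain serial FFTW sketch of that pattern, not the actual Laser::AdvanceSliceFFT implementation (which goes through the portability layer in Laser.H):

```cpp
#include <fftw3.h>

// Solve lap(u) = rhs for a complex field with periodic BCs on an nx x ny grid
// with spacings dx, dy. Serial illustration only; the zero mode is set to zero.
void solve_poisson_c2c (int nx, int ny, double dx, double dy,
                        fftw_complex* rhs, fftw_complex* u)
{
    const double pi = 3.14159265358979323846;
    fftw_plan fwd = fftw_plan_dft_2d(ny, nx, rhs, u, FFTW_FORWARD,  FFTW_ESTIMATE);
    fftw_plan bwd = fftw_plan_dft_2d(ny, nx, u,   u, FFTW_BACKWARD, FFTW_ESTIMATE);

    fftw_execute(fwd); // u now holds the Fourier transform of rhs

    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            const int iw = (i <= nx/2) ? i : i - nx; // wrap to negative frequencies
            const int jw = (j <= ny/2) ? j : j - ny;
            const double kx = 2.0 * pi * iw / (nx * dx);
            const double ky = 2.0 * pi * jw / (ny * dy);
            const double k2 = kx*kx + ky*ky;
            // -1/k^2, plus 1/(nx*ny) because FFTW transforms are unnormalized
            const double fac = (k2 == 0.0) ? 0.0 : -1.0 / (k2 * nx * ny);
            u[j*nx + i][0] *= fac; // real part
            u[j*nx + i][1] *= fac; // imaginary part
        }
    }

    fftw_execute(bwd); // inverse FFT: u now holds the solution in real space
    fftw_destroy_plan(fwd);
    fftw_destroy_plan(bwd);
}
```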

Two solvers are implemented, an MG solver and an FFT solver, to advance a laser slice by 1 time step. This operation computes slice j at step n+1 using the slices ahead of it at step n+1 and the neighbouring slices at the two previous time steps: $s_{j}^{n+1} = f(s_{j+1,j+2}^{n+1}, s_{j,j+1,j+2}^{n}, s_{j,j+1,j+2}^{n-1})$. This is integrated within the loop over slices. The management of these slices is largely done in Laser::Copy.
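
Schematically, the loop over slices looks as follows. This is a pseudocode-level sketch with each slice reduced to a single value; the function and variable names are hypothetical, the update formula is a placeholder, and the slice bookkeeping is what Laser::Copy handles in the actual code:

```cpp
#include <vector>

// Placeholder for the MG/FFT slice advance; not the actual discretization.
double advance_slice (double np1_jp1, double np1_jp2,
                      double n00_j,   double n00_jp1, double n00_jp2,
                      double nm1_j,   double nm1_jp1, double nm1_jp2)
{
    // Dummy combination standing in for the discretized envelope equation:
    // in the real solver all eight inputs enter the update of slice j at step n+1.
    (void)nm1_jp1; (void)nm1_jp2;
    return 2.0*n00_j - nm1_j + 0.1*(np1_jp1 - n00_jp1) + 0.1*(np1_jp2 - n00_jp2);
}

// Advance the whole envelope by one time step: fill np1 from n00 and nm1.
void advance_time_step (std::vector<double>& np1, const std::vector<double>& n00,
                        const std::vector<double>& nm1)
{
    const int nz = static_cast<int>(n00.size());
    // March from the head of the box (largest j) to the tail, so that slices j+1 and
    // j+2 at step n+1 are already available when slice j is computed.
    for (int j = nz - 3; j >= 0; --j) {
        np1[j] = advance_slice(np1[j+1], np1[j+2],
                               n00[j], n00[j+1], n00[j+2],
                               nm1[j], nm1[j+1], nm1[j+2]);
    }
    // The two head slices (j = nz-1, nz-2) take boundary values, omitted here.
    // After the step, slices are rotated (n+1 -> n, n -> n-1) for the next iteration.
}
```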

For parallel runs, the whole 3D array on the current box has to be communicated. This is done similarly to the beam communication.

A new quantity chi has to be deposited (in PlasmaCurrentDepositionInner.H) for the plasma response. This is essentially the plasma density divided by the Lorentz factor. It is used in the laser solver.
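
As a rough sketch of the chi deposition (hypothetical names; normalization omitted, and nearest-grid-point deposition used instead of the shape factors of the actual current deposition), each plasma particle contributes its weight divided by its Lorentz factor:

```cpp
#include <cmath>
#include <vector>

// Simplified plasma particle; the quasi-static code computes gamma from its own
// variables, a generic Lorentz factor is used here instead.
struct PlasmaParticle { double x, y, ux, uy, uz, w; };

// Deposit chi ~ density / gamma on a 2D transverse grid of nx x ny cells.
void deposit_chi (const std::vector<PlasmaParticle>& particles,
                  std::vector<double>& chi, int nx, int ny,
                  double dx, double dy, double xmin, double ymin)
{
    for (const auto& p : particles) {
        const double gamma = std::sqrt(1.0 + p.ux*p.ux + p.uy*p.uy + p.uz*p.uz);
        const int i = static_cast<int>((p.x - xmin) / dx); // nearest-grid-point cell
        const int j = static_cast<int>((p.y - ymin) / dy);
        if (i < 0 || i >= nx || j < 0 || j >= ny) continue;
        chi[j*nx + i] += p.w / gamma; // weight over gamma: density over Lorentz factor
    }
}
```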

Performance

The CI test with resolution 1024 x 1024 x 500 for 7 time steps gives the following run times (in vacuum, so the differences would not be as dramatic with a plasma):

$ grep "total time" output*txt
output.old_ranks.1_host.0.txt:TinyProfiler total time across processes [min...avg...max]: 17.49 ... 17.49 ... 17.49
output.old_ranks.1_host.1.txt:TinyProfiler total time across processes [min...avg...max]: 35.19 ... 35.19 ... 35.19
output.old_ranks.4_host.0.txt:TinyProfiler total time across processes [min...avg...max]: 20.34 ... 21.88 ... 23.19
output.old_ranks.4_host.1.txt:TinyProfiler total time across processes [min...avg...max]: 26.69 ... 28.66 ... 30.36

and memory usage

$ grep "Free  GPU global memory" output*txt
output.old_ranks.1_host.0.txt:Free  GPU global memory (MB) spread across MPI: [23391 ... 23391]
output.old_ranks.1_host.1.txt:Free  GPU global memory (MB) spread across MPI: [39361 ... 39361]
output.old_ranks.4_host.0.txt:Free  GPU global memory (MB) spread across MPI: [35265 ... 35265]
output.old_ranks.4_host.1.txt:Free  GPU global memory (MB) spread across MPI: [39225 ... 39353]

when changing the number of ranks and whether the laser envelope is stored on the host or on the device. As expected, storing the 3D laser envelope on the host makes the code slower, but uses less GPU memory.

Remains to be done

See #804.

@MaxThevenet added the labels GPU (Related to GPU acceleration), Parallelization (Longitudinal and transverse MPI decomposition), pipeline (Specific to the implementation of the new pipeline) and component: laser envelope (About the laser envelope solver) on May 23, 2022
@MaxThevenet changed the title from [WIP] Laser envelope solver to Laser envelope solver (finally!) on Oct 24, 2022
@SeverinDiederichs (Member) left a comment:

Great! See a couple of comments below.

Further problems discussed offline.

A few more items to be added to the to-do list:

  1. Initialization of a laser profile via the parser (non-Gaussian),
  2. the possibility to propagate the laser backwards,
  3. the possibility to load and restart the laser.

Review comments (since resolved) on: docs/source/run/parameters.rst, examples/laser/inputs_SI, src/laser/Laser.cpp, src/particles/pusher/FieldGather.H, tests/laser_blowout_wake_explicit.1Rank.sh, tests/laser_blowout_wake_explicit.SI.1Rank.sh, src/utils/AdaptiveTimeStep.H, src/laser/Laser.H
@MaxThevenet (Member, Author):

Alright, thanks for all the comments! I believe all of them are either resolved or listed in the to-do list above.

@SeverinDiederichs (Member):

Could you please comment on the status of point 1 on the to-do list? Was this resolved, at least for serial runs? Or is it still present?

Otherwise, I think we can merge soon and move the to-do list to an issue.

@MaxThevenet (Member, Author) commented on Oct 28, 2022

Point 1 above is still relevant. This is not a surprise: the Notify/Wait code is fully written for 3d_on_host = 0, and makes no sense for 3d_on_host = 1. I'll try to dedicate some time today to fix it. If not, we can merge and take care of it in a subsequent PR.
For serial runs, it is indeed fixed with an exit condition in the Wait/Notify functions.

@MaxThevenet (Member, Author):

The 3d_on_host option could be made faster, but it behaves as expected (it significantly reduces the memory footprint). I added some information on that in the PR description. Therefore, I think this PR is good to go. I split point 1 in the to-do list into 2 more detailed points. I also updated the docs to mention that the MG solver is currently less stable (offline chat with @SeverinDiederichs).

@SeverinDiederichs (Member) left a comment:

🎉 Awesome!
Let's merge this now, move the to-do list to an issue, and work on it in separate PRs 🚀

@MaxThevenet MaxThevenet merged commit 0741836 into Hi-PACE:development Oct 28, 2022
@MaxThevenet MaxThevenet deleted the laser branch October 28, 2022 09:02