Use the openPMD API for the PhaseSpace plugin (formerly libsplash) #3468
Conversation
For the file naming: can we close the file each time and reopen it if it already exists, when required? As you wrote, this PR can be called for each species and for different directions. Since each species is an independent plugin instance, it is not possible in PIConGPU to figure out which instance must close the file. If this would be too complicated, we should keep the current file naming, even if the names contain redundant information that is already in the file.
Then that's probably what we are going to have to do. Not all our backends support a ReadWrite mode, so this would not work in general. |
@PrometheusPi @steindev @wmles @Anton-Le Could you please help and decipher the meanings of the I would say
Phase space information is particle information which combines one component of a particle's momentum vector with one component of its position vector. The phase space is the aggregation of this information for all particles for the chosen momentum and position components. But the phase space is not a pure scatter plot of (r_i, p_j) for all particles.
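To make the distinction concrete, here is a minimal, hedged sketch (function name, bin counts, and particle values are invented for illustration, not taken from the plugin): the phase space output is a binned 2D density over one position component and one momentum component, rather than a scatter plot of individual particles.

```python
# Hypothetical illustration: accumulate particles into a 2D (position, momentum)
# histogram, which is the kind of aggregation a phase space output represents.
def phase_space_histogram(positions, momenta, r_range, p_range, r_bins, p_bins):
    """Bin (r, p) particle pairs into a r_bins x p_bins density grid."""
    hist = [[0.0] * p_bins for _ in range(r_bins)]
    r_lo, r_hi = r_range
    p_lo, p_hi = p_range
    for r, p in zip(positions, momenta):
        if not (r_lo <= r < r_hi and p_lo <= p < p_hi):
            continue  # particle outside the chosen window
        i = int((r - r_lo) / (r_hi - r_lo) * r_bins)
        j = int((p - p_lo) / (p_hi - p_lo) * p_bins)
        hist[i][j] += 1.0  # a per-particle weight/charge could be added here
    return hist
```

A real implementation would additionally weight each particle by its macro-particle charge, so the resulting density carries physical units.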
Thanks for the detailed description!
In combination with the openPMD standard files, this would then mean that we would set:
Hence,
I was not talking about the simulation dimensions, but the dimensions of the dataset (i.e. the momentum and the position dimension).
From the implementation, it looks like you're right: `const float_64 unit = UNIT_CHARGE / UNIT_VOLUME;` I should additionally use custom attributes for
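The role of that `unit` factor can be sketched as follows. This is a hedged illustration: the numeric values below are invented, not PIConGPU's actual unit system. The idea is that stored phase-space values are a charge per volume in internal units, and a single scalar factor converts them to SI, which openPMD expresses as a record component's `unitSI` attribute.

```python
# Hypothetical values for illustration only:
UNIT_CHARGE = 4.8e-15   # assumed: charge of one macro-particle, in Coulomb
UNIT_VOLUME = 1.2e-22   # assumed: volume of one cell, in cubic metres

# Mirrors `const float_64 unit = UNIT_CHARGE / UNIT_VOLUME;` from the comment above.
unit_si = UNIT_CHARGE / UNIT_VOLUME  # C/m^3 per internal unit

def to_si(value_internal):
    """Convert an internal-unit charge density to SI (C/m^3)."""
    return value_internal * unit_si
```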
This requires to know
The upcoming 2.0.0 standard allows
If
I would love to have the original window size in the data. The Offset is already in |
The
Yep, I'll leave them in then. Although I need to touch those Python scripts anyway ;)
Ah, that's good to know. I will need to check whether the openPMD API will let me implement it in that way ahead of time. If it doesn't (which I'm afraid it won't), some custom solution will have to do in the meantime.
Can you specify what you mean by this? Make a global (non-window) dataset, but only write to those positions that are covered by the window?
The |
Just the extent/size of the spatial grid. In order to set the extent of the axis of the spatial component to the original simulation window extent when plotting. |
Ah, alright. Yeah that should be no problem. |
To get back to this again, since I don't have such in-depth knowledge on the internals of PIConGPU:

```cpp
const SubGrid<simDim>& subGrid = Environment<simDim>::get().SubGrid();
const std::uint64_t rGlobalSize = subGrid.getGlobalDomain().size[axis_element.space];
splash::Dimensions globalPhaseSpace_size(hBuffer.size().x(), rGlobalSize, 1);
```

If I'm interpreting this correctly, there is no moving window involved in that?
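The parallel layout implied by that snippet can be sketched as follows (a hedged illustration with invented helper names and numbers, not the plugin's code): each MPI rank holds the full momentum axis but only its local slab of the spatial axis, and writes that chunk at its global spatial offset into the combined 2D dataset.

```python
def local_chunk(num_momentum_bins, r_local_offset, r_local_size):
    """(offset, extent) of one rank's chunk in the global (p, r) dataset."""
    offset = (0, r_local_offset)
    extent = (num_momentum_bins, r_local_size)
    return offset, extent

def chunks_cover_domain(chunks, r_global_size):
    """Check that the ranks' spatial slabs tile [0, r_global_size) exactly."""
    covered = sorted((off[1], off[1] + ext[1]) for off, ext in chunks)
    pos = 0
    for lo, hi in covered:
        if lo != pos:
            return False  # gap or overlap between neighbouring slabs
        pos = hi
    return pos == r_global_size
```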
Update: It will not allow me to do this:
Yes, together with the
Now I am not sure what you mean. As far as I remember, information on the simulation grid size has never been part of the phase space output. One could just infer the extent of the phase space's spatial axis from the data, and this extent is given by the positions of the particles. But these will usually not be at the simulation volume boundary. Therefore, there was no chance to infer the simulation grid size (along the chosen spatial axis) from the phase space output. Your question sounds like the simulation grid size along the chosen spatial axis will be in the phase space output anyway when using openPMD-api. If this is the case, this would be all I asked for.
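The point about inferring the extent from particle positions can be illustrated with a short sketch (particle values invented): the min/max of the positions underestimates the true simulation volume, since particles rarely sit exactly on the boundary, which is why an explicit domain-size attribute is useful.

```python
def inferred_spatial_extent(positions):
    """Extent of the spatial axis as guessed from particle positions alone.

    This is an underestimate of the true grid extent whenever no particle
    sits exactly on the simulation volume boundary.
    """
    return min(positions), max(positions)
```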
(force-pushed from 0eb22e4 to a040766)
I have now added four new attributes: globalDomainOffset, globalDomainSize, totalDomainOffset and totalDomainSize, as found in A current comparison of the attributes written by the old PhaseSpace plugin: As stated in the updated documentation on the PhaseSpace plugin, the attributes |
Excellent! I don't understand the difference between the pair Btw., I will do a test simulation with the new plugin this week. Besides the source code from this PR, do I need anything else to run this updated plugin on hemera at HZDR?
To be fair, I don't know what the difference is either. Hence why I simply forwarded the terms that PIConGPU uses there. If someone with more knowledge on the internals of PIConGPU could quickly check (@psychocoderHPC ?), that would be great.
More custom attributes can always be added, sure ;)
Great! You will need openPMD API 0.12.0 or newer, and either ADIOS2 (2.6.0 or newer) or Parallel HDF5. HDF5 is the default output, so I suggest you use that (otherwise you will have to use |
@steindev @franzpoeschel PIConGPU domain notation is explained on this Wiki page. |
Thanks for clarifying! I think I'll leave both naming levels in the output then and link the Wiki page in the documentation?
@franzpoeschel Are you currently waiting for more input from our side? If not, I would suggest cleaning up the PR to bring it into a final state.
Yep, I'd need some feedback from testers @psychocoderHPC. @steindev Did you do a test run yet? |
@wmles @Anton-Le @PrometheusPi Please help with testing. If there is no feedback by 03.02.2021, we will merge this PR with the risk of breaking the plugin.
This branch removes the Splash output. |
Thanks, we need to push the changes into this PR. |
(force-pushed from eaf0ec9 to e67699d)
@franzpoeschel before we merge this PR we need to wait until #3516 is in the dev branch and rebase this PR against dev. Currently, the CI is not activating OpenPMD. |
@franzpoeschel #3516 is now merged, so no issues there. |
Commits:
- Basic dataset dumping via openPMD
- Fix dataset dimensions const
- Write some more attributes
- Build PhaseSpace only if openPMD is available
- Fix integer signedness
- Use using
- Create a new file for each species, enforce file-based iteration layout
- Update naming once more
- Make filenames dumped by openPMD identical to the ones formerly dumped by splash. Give splash some legacy filename.
- Move Python scripts to openPMD: get_iterations()
- Move Python scripts to openPMD: _get_for_iteration
- Cleanup python to make CI happy
- Fix indexing
- Update attributes naming
- Formatting
- Write _global_start and _global_offset
- Go back to using scalar record components
- Write some otherwise-defaulted openPMD attributes
- Update documentation
- Allow reading from any openPMD backend in the python scripts
- Python formatting
- Add (global|total)Domain(Size|Offset) Attributes
- Write globalDomainAxisLabels
- Document custom attributes and fix some errors
- Remove libsplash output from phasespace plugin
(force-pushed from e67699d to 325032b)
Ok, I rebased it |
Coming back to the error where h5 files are not written with the new plugin. (The one I received earlier.) Do you have a way to check whether this is an issue with the openPMD plugin, @franzpoeschel? The error only occurs when writing phase space output with the new plugin to The Is there anything else I could try to pinpoint the problem? |
I'll try to see whether I can reproduce the error with this |
I can reproduce the file-access issues with the Hemera openPMD module; a self-built version runs fine. I will try to locate the issue more precisely. EDIT: Self-built openPMD API + HDF5 from the module system crashes as well. Trying completely self-built next. Thanks for the measurements, I will have a look at them.
Alright, let's talk about debugging for a day just to find out I forgot to pass the MPI communicator. At least this should also explain the differences you saw: you only ever saw parts of your data at once.
(force-pushed from 74a3610 to 8034b4f)
@franzpoeschel I think you need to trigger the CI again by |
(force-pushed from 8034b4f to 65891d3)
Thanks for the hint, trying that again |
@steindev feel free to merge this PR after your final review |
See issue #3357.
TODO:
Remove libsplash output. Currently, both versions of output are produced. I'll keep libsplash around for a while; this makes for easier comparison.
File naming? The current plugin encodes the species name in the file name and writes separate files per species (e.g. `PhaseSpace_e_all_ypy_100.h5`). Since that information is classically encoded in the dataset path in openPMD, I think we could also write everything into one file (e.g. `PhaseSpace_400.h5`). Would require some small refactoring to keep the `Series` handle open between different calls to the PhaseSpace plugin, much like the openPMD plugin already does. (Which is also necessary to support group-based iteration layout, e.g. for streaming.) EDIT: As discussed below, we'll stick with the old file naming.
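The two naming schemes under discussion can be contrasted in a short sketch. This is a hedged illustration: the helper names and the single-file path layout are invented here, not taken from PIConGPU; only the per-species filename pattern follows the example given above.

```python
def per_species_filename(species, flt, axes, iteration):
    """Old scheme (kept): one file per species/filter/axes combination."""
    return f"PhaseSpace_{species}_{flt}_{axes}_{iteration}.h5"

def single_file_dataset_path(species, axes, iteration):
    """Alternative scheme: one file, species disambiguated by dataset path.

    The path layout here is purely illustrative of "encode it in the
    dataset path", not the openPMD standard's actual hierarchy.
    """
    return f"/data/{iteration}/phaseSpace/{species}/{axes}"
```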
Attributes. The PhaseSpace plugin writes some attributes, and for some of them I had to take a guess concerning their meaning.
Some, I just copied verbatim (sim_unit, p_unit, p_min, p_max, dr, dV).
Others, I took a guess on what they mean within openPMD:
Some support by an actual user of the PhaseSpace plugin would be appreciated.
Also, some openPMD attributes are still left as default, need to fill in those, too.
I'll need to figure out how to express datasets with different physical units and scalings per dimension.
Old:
New:
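The per-dimension unit problem above can be made concrete with a hedged sketch (not the openPMD API; the attribute names and numeric factors are invented): a phase-space dataset mixes units per axis, e.g. a spatial axis in metres and a momentum axis in kg·m/s, so each axis needs its own label, SI scaling, and unit dimension.

```python
# Modelling per-axis metadata as plain dicts, purely for illustration.
# unit_dimension uses powers of SI base quantities: L = length, M = mass, T = time.
phase_space_axes = [
    {"label": "y",  "unit_si": 1.0e-9,  "unit_dimension": {"L": 1}},                  # metres
    {"label": "py", "unit_si": 5.0e-23, "unit_dimension": {"L": 1, "M": 1, "T": -1}}, # kg*m/s
]

def axis_labels(axes):
    """Collect the axis labels in dataset order."""
    return [a["label"] for a in axes]
```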
Update the Python analysis scripts. EDIT: They seem to be fundamentally working, but I'll need to re-apply any changes there that are done to the attribute layout. Also some todos and further testing. Also, read from backends other than HDF5.
Documentation.
Cleanup
Related: openPMD/openPMD-standard#193. Necessary for expressing this data layout properly in openPMD; already merged in the standard, but not yet available in the API.