Conference call notes 20201223
Kenneth Hoste edited this page Dec 23, 2020 · 5 revisions
Notes on the 163rd EasyBuild conference call, Wednesday December 23rd 2020 (9:00 UTC - 10:00 CET)
Alphabetical list of attendees (13):
- Sebastian Achilles (Jülich Supercomputing Centre, Germany)
- Damian Alvarez (Jülich Supercomputing Centre, Germany)
- Simon Branford (University of Birmingham, UK)
- Miguel Dias Costa (National University of Singapore)
- Alex Domingo (Vrije Universiteit Brussel, Belgium)
- Victor Holanda (CSCS, Switzerland)
- Kenneth Hoste (HPC-UGent, Belgium)
- Samuel Moors (Vrije Universiteit Brussel, Belgium)
- Terje Kvernes (University of Oslo, Norway)
- Mikael Öhman (Chalmers University of Technology, Sweden)
- Åke Sandgren (Umeå University, Sweden)
- Jörg Saßmannshausen (NIHR Biomedical Research Centre, UK)
- Lars Viklund (Umeå University, Sweden)
Agenda:
- update on recent developments
- support for installing/using a toolchain based on Intel oneAPI
- compilers and libraries for AMD toolchain (AOCC & co)
- pros and cons for merging foss and fosscuda toolchains
- Q&A
- recent changes
  - framework
    - bug fixes
      - (none)
    - enhancements
      - (none)
  - easyblocks
    - bug fixes
      - (nothing special)
    - enhancements
      - create versioned symlinks (cmake3) for CMake commands (PR #2259)
        - to avoid that PyTorch picks up cmake3 from system...
        - should we provide a function in framework to create symlinks like this?
      - unify handling of pylibdirs and don't add duplicated $PYTHONPATH in PythonBundle (PR #2281)
      - add options to run unit tests to TensorFlow EasyBlock (PR #2263)
    - new software
      - (nothing special)
  - easyconfigs
    - bug fixes
      - fix name of source file for GDRCopy v2.1 (PR #11887)
      - add patch to fix miscompilation bug on POWER for GCC 8.x and 9.x (PR #11837)
      - fix compilation of TensorFlow 2.3.1 with CUDA and glibc 2.26 on POWER (PR #11859)
        - this is a broader issue (see #11913)
        - also affects CuPy, magma, PyTorch, etc.
        - 2019a toolchains or more recent with CUDA 10.x on RHEL8 (fixed in CUDA 11.0)
        - support for arch-specific patches could come in useful here
      - there's a workaround for the segfault with impi on CentOS 8
        - see https://github.com/easybuilders/easybuild-easyconfigs/issues/11762
        - maybe this issue is fixed in impi 2018 update 8
    - enhancements
      - (nothing special)
    - new software
      - SeisSol (PR #7194)
    - software updates
    - changes
      - replace easyconfigs for bpp-core/bpp-phyl/bpp-seq v2.4.1 with a single easyconfig for BioPP v2.4.1 (using Bundle easyblock) (PR #11609)
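The easyblocks item on versioned symlinks for CMake commands asks whether the framework should offer a function for this. A minimal sketch of what such a helper could look like is given here; the function name `create_versioned_symlinks` and its arguments are hypothetical, and this is not the code from PR #2259 nor an existing framework function.

```python
import os


def create_versioned_symlinks(bindir, commands, suffix):
    """Create '<cmd><suffix>' symlinks next to existing commands in bindir.

    Hypothetical helper for illustration only (not an EasyBuild framework API).
    Returns the list of symlink paths that were created.
    """
    created = []
    for cmd in commands:
        target = os.path.join(bindir, cmd)
        link = os.path.join(bindir, cmd + suffix)
        # only link commands that exist, and never overwrite an existing entry
        if os.path.exists(target) and not os.path.lexists(link):
            # relative symlink, so the link stays valid if the install prefix moves
            os.symlink(cmd, link)
            created.append(link)
    return created
```

Calling `create_versioned_symlinks(bindir, ['cmake', 'ctest'], '3')` on an installation's `bin` directory would add `cmake3` and `ctest3` next to the real commands, so a build looking for `cmake3` finds the EasyBuild-provided CMake before one from the system.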
- to merge/fix/tackle soon
  - framework (v4.4.0 milestone)
    - support additional features in easystack files (see issues #3468, #3512, #3513, #3516)
    - directories that don't contain any library files shouldn't be added to $LD_LIBRARY_PATH (issue #3504)
    - EasyBuild may loop forever when out of disk space (issue #3531)
    - log files leaking into each other when using --robot (issue #3533)
    - avoid duplicate or useless entries in RPATH (issue #3534)
  - easyblocks (v4.4.0 milestone)
    - bug fixes
    - enhancements
      - enhance OpenBLAS easyblock to make it aware of optarch (PR #1946)
      - run motorBike tutorial case for recent (community) OpenFOAM versions (PR #2201)
        - needs testing...
      - add support for statically linking Bazel (PR #2272)
      - set $PYTHONNOUSERSITE in PythonBundle.extensions_step (PR #2272)
      - improve Bazel EasyBlock (PR #2285)
      - add support for skipping steps in Extension PythonPackages (PR #2290)
      - set $TF_GPU_COUNT and $TF_TESTS_PER_GPU for TensorFlow tests (PR #2292)
    - new software
  - easyconfigs (v4.4.0 milestone)
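The framework items on $LD_LIBRARY_PATH (issue #3504) and on duplicate or useless RPATH entries (issue #3534) both come down to filtering a list of candidate directories. A rough sketch of such a filter follows; the function names and the set of suffixes treated as "library files" are assumptions made for illustration, not how the framework implements or plans to implement this.

```python
import os

# assumption: which file names count as libraries
LIB_SUFFIXES = ('.so', '.a', '.dylib')


def has_library_files(path):
    """Return True if the directory directly contains at least one library file."""
    if not os.path.isdir(path):
        return False
    for name in os.listdir(path):
        # matches 'libfoo.so' and 'libfoo.a', plus versioned names like 'libfoo.so.1.2'
        if name.endswith(LIB_SUFFIXES) or '.so.' in name:
            return True
    return False


def filter_library_dirs(candidates):
    """Keep candidate directories that contain library files, dropping duplicates."""
    kept = []
    for path in candidates:
        if path not in kept and has_library_files(path):
            kept.append(path)
    return kept
```

The check is deliberately non-recursive: a directory only qualifies if the library files sit directly in it, which is what the dynamic linker needs.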
- separate components
- enhance IntelBase
- diff between old & new compiler commands?
  - new compilers (icx, icpx, ifx) are also based on Clang
- diff in linking to MKL?
  - link advisor suggests no
- Sebastian's open PR for AOCC (https://github.com/easybuilders/easybuild-easyconfigs/pull/11868)
  - what do we do with the EULA?
    - environment variable to accept EULA?
      - export EASYBUILD_ACCEPT_EULA=AOCC,oneAPI (eb --accept-eula=AOCC)
  - do we need a custom easyblock for AOCC?
    - for example to specify version of AMDlibm? (currently we use latest)
    - Miguel: we also need a toolchain option to control which libm is being linked (-lamdm -lm)
      - with GCC you also need to make code changes when you want to link to libamdm.so
  - framework support for using AOCC compiler
    - AOCC depends on GCC, so should sit on top of GCCcore
    - pre-compiled but it looks in system paths by default
    - which GCC is used can be controlled via an environment variable
  - Sebastian: article comparing Clang and AOCC showed little performance benefit
    - see https://www.phoronix.com/scan.php?page=article&item=amd-aocc-22
    - maybe we should use standard Clang + libamdm?
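The EULA proposal for AOCC (a comma-separated $EASYBUILD_ACCEPT_EULA value, or the equivalent eb --accept-eula option) could be checked along these lines. This is only a sketch of the idea as discussed on the call: the variable name is taken from the notes, while the helper names `accepted_eulas` and `check_eula_accepted` are hypothetical and not part of the framework.

```python
import os


def accepted_eulas(environ=None):
    """Parse the comma-separated $EASYBUILD_ACCEPT_EULA value into a set of names."""
    environ = os.environ if environ is None else environ
    raw = environ.get('EASYBUILD_ACCEPT_EULA', '')
    # ignore empty entries and surrounding whitespace, e.g. 'AOCC, oneAPI,'
    return {name.strip() for name in raw.split(',') if name.strip()}


def check_eula_accepted(software, environ=None):
    """Raise an error unless the EULA for the given software was explicitly accepted."""
    if software not in accepted_eulas(environ):
        msg = "EULA for %s not accepted; add it to $EASYBUILD_ACCEPT_EULA" % software
        raise RuntimeError(msg)
```

An easyblock for AOCC or oneAPI could call something like `check_eula_accepted('AOCC')` before unpacking sources, so an installation never proceeds silently without the site having opted in.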
- What about support for ROCm for AMD GPUs?
  - see upcoming talk at FOSDEM HPC devroom
  - this could be quite a bit of work, looks like a complex ecosystem
  - who has time for this?
- pain point is OpenMPI+UCX with/without CUDA support
  - OpenMPI with CUDA support can be used on non-GPU systems
- compatibility of CUDA with recent GCC versions is holding things back a bit
- benefit would be reducing easyconfigs we need, easier to combine software
- Alex: weird situation now is having both toolchains like gcccuda + CUDA included as a dep (and mentioned in versionsuffix)
- for foss/2021a we could start without CUDA support
  - when there's a release of CUDA that's compatible with the GCC in foss/2021a, we could start looking into stuff that requires CUDA
  - that implies being able to swap things like UCX/OpenMPI with a CUDA-capable alternative via dependencies
- %(cudaver)s template should be set when having CUDA in toolchain, or when including CUDAcore as a dependency
  - currently only set when CUDA is a direct dep
  - Mikael may look into this
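The requested behaviour for the %(cudaver)s template can be stated as a small lookup: take the CUDA version from the toolchain if CUDA is part of it, and otherwise from a CUDA or CUDAcore dependency. The sketch below uses plain (name, version) tuples as a simplified stand-in for toolchain components and dependencies; the function name and data layout are assumptions for illustration, not the framework's internal representation.

```python
def determine_cudaver(toolchain_components, dependencies):
    """Return the CUDA version from the toolchain or the dependencies, else None.

    Both arguments are lists of (name, version) tuples (simplified, hypothetical layout).
    Toolchain components are checked first, then dependencies.
    """
    cuda_names = ('CUDA', 'CUDAcore')
    for name, version in list(toolchain_components) + list(dependencies):
        if name in cuda_names:
            return version
    return None
```

With this logic an easyconfig using a CUDA-aware toolchain, or one listing CUDAcore as a dependency, would get a value for the template, rather than only easyconfigs with CUDA as a direct dependency.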
- Damian: updating from CentOS 7 to 8
  - haven't seen too many issues
  - installing compatibility libraries for openssl and co helps
- Simon: ABAQUS & ANSYS don't support RHEL8 yet
  - UiO has already been using ANSYS on CentOS 8 for a while
- Victor: more automation w.r.t. generating easyconfigs for new toolchains