Merge pull request #610 from mbareford/mbareford/linaro-forge-24.0
Mbareford/linaro forge 24.0
markgbeckett authored May 27, 2024
2 parents 9408fb4 + 474bb47 commit 23ddc2e
Showing 9 changed files with 72 additions and 84 deletions.
Binary file modified docs/data-tools/forge-ddt.png
125 changes: 60 additions & 65 deletions docs/data-tools/arm-forge.md → docs/data-tools/forge.md
@@ -1,79 +1,73 @@
## Linaro Forge


[Linaro Forge](https://www.linaroforge.com/) provides debugging and profiling tools
for MPI parallel applications, and OpenMP or pthreads multi-threaded applications
(and also hybrid MPI/OpenMP). Forge DDT is the debugger and MAP is the profiler.
### User interface

There are two ways of running the Forge user interface. If you have a good internet
connection to ARCHER2, the GUI can be run on the front-end (with an X-connection).
Alternatively, you can download a copy of the Forge remote client to your laptop or
desktop and run it locally. The remote client should be used if at all possible.
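
For the front-end route, connect to ARCHER2 with X forwarding enabled so that the
GUI can display locally, e.g.:

```bash
ssh -X <username>@login.archer2.ac.uk
```
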
To download the remote client, see the [Forge download pages](https://www.linaroforge.com/downloadForge/).
Version 24.0 is known to work at the time of writing. The section
[Connecting with the remote client](#connecting-with-the-remote-client) further down
this page explains how to use the remote client.

### Licensing

ARCHER2 has a licence for up to 2080 tokens, where a token represents an MPI parallel process.
Running Forge DDT/MAP to debug/profile a code running across 16 nodes using 128 MPI ranks per
node would require 2048 tokens. If you wish to run on more nodes, say 32, then it will be
necessary to reduce the number of tasks per node so as to fall below the maximum number of
tokens allowed.
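
For example, the following (hypothetical) Slurm directives would keep a 32-node job
within the licence by halving the number of MPI ranks per node:

```slurm
#SBATCH --nodes=32
#SBATCH --ntasks-per-node=64
# 32 nodes x 64 ranks per node = 2048 tokens, within the 2080-token licence
```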

Please note, Forge licence tokens are shared by all ARCHER2 (and [Cirrus](https://www.cirrus.ac.uk/)) users.

To see how many tokens are in use, you can view the licence server status page by first
setting up an SSH tunnel to the node hosting the licence server.

```bash
ssh <username>@login.archer2.ac.uk -L 4241:dvn04:4241
```

You can then view the status page in a local browser at [http://localhost:4241/status.html](http://localhost:4241/status.html).
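
If you prefer the command line, the same status page can be fetched through the
tunnel (assuming `curl` is available on your local machine):

```bash
curl http://localhost:4241/status.html
```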


### One time set-up for using Forge

A preliminary step is required to set up the necessary Forge configuration files
that allow DDT and MAP to initialise their environment correctly so that they can,
for example, interact with the Slurm queue system. These steps should be performed
in the `/work` file system on ARCHER2.

It is recommended that these commands are performed in the top-level work
file system directory for the user account, i.e., `${HOME/home/work}`.

```bash
module load forge
cd ${HOME/home/work}
source ${FORGE_DIR}/config-init
```

Running the `source` command will create a directory `${HOME/home/work}/.forge` that contains the
following files.

```output
system.config user.config
```


!!! warning
    The `config-init` script may print `Warning: failed to read system config`.
    Please ignore this: subsequent messages should indicate that the new
    configuration files have been created.

Within the `system.config` file you should find that `shared directory` is set to
the equivalent of `${HOME/home/work}/.forge`. That directory will also store other
relevant files when Forge is run.
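
For instance, for a user in the `z19` project, the relevant line of `system.config`
should look something like this:

```output
shared directory = /work/z19/z19/$USER/.forge
```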

### Using DDT

@@ -101,9 +95,9 @@
Such a job can be submitted to the batch system in the usual way. The
relevant command to start the executable is as follows.

```slurm
# ... Slurm batch commands as usual ...
module load forge
export OMP_NUM_THREADS=16
export OMP_PLACES=cores
# @@ -117,7 +111,7 @@
ddt --verbose --offline --mpi=slurm --np 8 \
```

The parallel launch is delegated to `ddt` and the `--mpi=slurm` option
indicates to `ddt` that the relevant queue system is Slurm
(there is no explicit `srun`). It will also be
necessary to state explicitly to `ddt` the number of processes
required (here `--np 8`). For other options see, e.g., `ddt --help`.
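
If you wish to name the report file explicitly, offline mode also accepts an output
option. A minimal sketch (assuming the generic Forge `--output` option and a
placeholder executable name):

```bash
# write the offline debugging report to a named HTML file
ddt --offline --mpi=slurm --np 8 --output=ddt-report.html ./my_executable
```
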
@@ -132,10 +126,10 @@
to examine the state of execution at the point of failure.

#### Interactive debugging: using the client to submit a batch job

You can also start the client interactively (for details of remote launch, see [Connecting with the remote client](#connecting-with-the-remote-client)).

```bash
module load forge
ddt
```

@@ -177,31 +171,33 @@
section and specify the relevant project budget, see the ***Account*** entry.
The default queue template file configuration uses the short QoS with the
standard time limit of 20 minutes. If something different is required,
one can edit the settings. Alternatively, one can copy the `archer2.qtf` file
(to `${HOME/home/work}/.forge`) and make the relevant changes. This new
template file can then be specified in the dialog window.
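
A sketch of that workflow, assuming the template ships in `${FORGE_DIR}/templates`
(the exact location may differ):

```bash
module load forge
mkdir -p ${HOME/home/work}/.forge
# copy the default queue template and edit the copy as required
cp ${FORGE_DIR}/templates/archer2.qtf ${HOME/home/work}/.forge/
```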

There may be a short delay while the sbatch job starts. Debugging should
then proceed as described in the [Linaro Forge documentation](https://docs.linaroforge.com/24.0/html/forge/ddt/index.html).


### Using MAP

Load the `forge` module:

```bash
module load forge
```

#### Linking

MAP uses two small libraries to collect data from your program. These
are called `map-sampler` and `map-sampler-pmpi`. On ARCHER2, the linking
of these libraries is usually done automatically via the `LD_PRELOAD`
mechanism, but only if your program is dynamically linked. Otherwise, you
will need to link the MAP libraries manually by providing explicit link options.

The library paths specified in the link options will depend on the programming
environment you are using as well as the Cray programming release. Here are the
paths for each of the compiler environments consistent with the Cray Programming
Release (CPE) 22.12 using the default OFI as the low-level comms protocol:

- `PrgEnv-cray`: `${FORGE_DIR}/map/libs/default/cray/ofi`
- `PrgEnv-gnu`: `${FORGE_DIR}/map/libs/default/gnu/ofi`
@@ -224,9 +220,9 @@
If you are using UCX as the low-level comms protocol, simply replace `ofi` with `ucx` in the library path.
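
As an illustration, a link line for `PrgEnv-gnu` with OFI might include options along
these lines (a sketch following the generic MAP manual-linking recipe; check the
Linaro Forge documentation for the definitive options):

```bash
# explicit MAP sampler linking (sketch; my_prog.o is a placeholder)
MAP_LIBS=${FORGE_DIR}/map/libs/default/gnu/ofi
cc my_prog.o -o my_prog \
   -L${MAP_LIBS} -lmap-sampler-pmpi -lmap-sampler \
   -Wl,--eh-frame-hdr -Wl,-rpath=${MAP_LIBS}
```
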
Submit a batch job in the usual way, and include the lines:

```slurm
# ... Slurm batch commands as usual ...
module load forge
# Ensure the cpus-per-task option is propagated to srun commands
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
```

@@ -245,8 +241,7 @@
file selection dialog box can then be used to locate the `.map` file.
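
Alternatively, a `.map` file can be passed to the client directly on the command
line (the file name below is hypothetical; MAP builds names from the executable,
process count, node count, and date):

```bash
map my_executable_8p_1n_2024-05-27_12-00.map
```
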
### Connecting with the remote client

If one starts the Forge client on, e.g., a laptop, one should see the
main window as shown above. Select ***Remote Launch*** and then ***Configure*** from the
drop-down menu. In the ***Configure Remote Connections*** dialog box
click ***Add***. The following window should be displayed. Fill
in the fields as shown. The ***Connection Name*** is just a tag
@@ -258,7 +253,7 @@
commands on connection. A default script is provided in the location
shown.

```output
/work/y07/shared/utils/core/forge/latest/remote-init
```

Other settings can be as shown. Remember to click ***OK*** when done.
@@ -272,7 +267,7 @@
password to connect. A remote connection will allow you to debug,
or view a profile, as discussed above.

If different commands are required on connection, a copy of the
`remote-init` script can be placed in, e.g., `${HOME/home/work}/.forge`
and edited as necessary. The full path of the new script should then be
specified in the remote launch settings dialog box.
Note that the script changes the directory to the `/work/` file system, as
`/home/` is not available on the compute nodes.
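
For example (a sketch using the paths given above):

```bash
mkdir -p ${HOME/home/work}/.forge
cp /work/y07/shared/utils/core/forge/latest/remote-init ${HOME/home/work}/.forge/
# ... then edit ${HOME/home/work}/.forge/remote-init as required ...
```
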
2 changes: 1 addition & 1 deletion docs/data-tools/index.md
@@ -7,7 +7,7 @@
The tools currently available in this section are (software that is installed or supported
by third-parties rather than the ARCHER2 service is marked with *):

- [AMD &mu;Prof](amd-uprof.md): Profiling tools provided by AMD
- [Linaro Forge](forge.md): Provides debugging and profiling tools for MPI parallel applications, and
  OpenMP or pthreads multi-threaded applications (and also hybrid MPI/OpenMP)
- [Darshan](darshan.md): Lightweight IO characterisation and profiling tool
- [Energy Counters](pm-mpi-lib.md): MPI-based library for reading energy counters
Expand Down
8 changes: 4 additions & 4 deletions docs/user-guide/debug.md
@@ -2,7 +2,7 @@

The following debugging tools are available on ARCHER2:

- [Linaro Forge (DDT)](../data-tools/forge.md) is an easy-to-use graphical
interface for source-level debugging of compiled C/C++ or Fortran codes.
It can also be used for non-interactive debugging, and there
is also some limited support for python debugging.
@@ -27,11 +27,11 @@
side-by-side to analyse differences. (Not currently described in this
documentation.)

## Linaro Forge

The Linaro Forge tool provides the DDT parallel debugger. See:

- [ARCHER2 Linaro Forge documentation](../data-tools/forge.md)

## gdb4hpc

2 changes: 1 addition & 1 deletion docs/user-guide/dev-environment.md
@@ -58,7 +58,7 @@
The following is a list of the modules and extensions currently available:
aocl: aocl/3.1, aocl/4.0
forge: forge/24.0
atp: atp/3.14.16
7 changes: 0 additions & 7 deletions docs/user-guide/gpu.md
@@ -871,10 +871,6 @@
https://rocm.docs.amd.com/projects/ROCgdb/en/docs-5.2.3/index.html
https://docs.amd.com/projects/HIP/en/docs-5.2.3/how_to_guides/debugging.html#using-rocgdb


## Profiling

An initial profiling capability is provided via `rocprof` which is part of the `rocm` module.
@@ -888,9 +884,6 @@
srun -n 2 --exclusive --nodes=1 --time=00:20:00 --partition=gpu --qos=gpu-exc --

to profile your application. More detail on the use of rocprof can be found [here](https://github.com/ROCm/rocprofiler/tree/rocm-5.2.3).


## Performance tuning

8 changes: 4 additions & 4 deletions docs/user-guide/profile.md
@@ -5,7 +5,7 @@
ARCHER2. In this section, we discuss the HPE Cray profiling tools,
CrayPat-lite and CrayPat. We also show how to get usage data
on currently running jobs from the Slurm batch system.

You can also use [the Linaro Forge tool](../data-tools/forge.md)
to profile applications on ARCHER2.

If you are specifically interested in profiling IO, then you
@@ -575,11 +575,11 @@
The AMD &mu;Prof tool provides capabilities for low-level profiling on AMD processors. See:

- [AMD &mu;Prof](../data-tools/amd-uprof.md)

## Linaro Forge

The Linaro Forge tool also provides profiling capabilities. See:

- [ARCHER2 Linaro Forge documentation](../data-tools/forge.md)

## Darshan IO profiling

2 changes: 1 addition & 1 deletion docs/user-guide/sw-environment.md
@@ -138,7 +138,7 @@
auser@uan01:~> module avail
------------------------------------ /work/y07/shared/archer2-lmod/utils/core -------------------------------------
amd-uprof/3.6.449 darshan-util/3.3.1 imagemagick/7.1.0 reframe/4.1.0
forge/24.0 epcc-reframe/0.2 ncl/6.6.2 tcl/8.6.13
bolt/0.7 epcc-setup-env (L) nco/5.0.3 (D) tk/8.6.13
bolt/0.8 (L,D) gct/v6.2.20201212 nco/5.0.5 usage-analysis/1.2
cdo/1.9.9rc1 genmaskcpu/1.0 ncview/2.1.7 visidata/2.1
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -120,7 +120,7 @@
nav:
- "Data Analysis and Tools":
- "Overview": data-tools/index.md
- "AMD uProf": data-tools/amd-uprof.md
- "Arm Forge": data-tools/arm-forge.md
- "Linaro Forge": data-tools/forge.md
- "Darshan": data-tools/darshan.md
- "Energy Counters": data-tools/pm-mpi-lib.md
- "Globus": data-tools/globus.md
