TL/UCP: Add linear alltoall and allgather algorithms based on xgvmi ucp get #992
base: master
Conversation
Force-pushed from 3b8cbf2 to ecbd9e1
Force-pushed from ecbd9e1 to a362467
Went over the code with Nick, LGTM
I don't think we should do this now, but these algorithms, including sliding-window AR, will not need the allgather in the init function once #909 is merged.
LGTM overall, thanks! I still need to go over the main file tl_ucp_dpu_offload.c. Just a first round of minor review in the meantime.
req_param.op_attr_mask |= UCP_OP_ATTR_FIELD_MEMH;

for (i = *posted; i < host_team_size; i++) {
is it possible that posted is not 0 when entering this function?
Would it make sense to put this first loop in a "start" function instead?
On the first entry *posted will be 0, but after that it will be equal to host_team_size. It's possible to make a standalone start function, but I thought it would be less code to reuse ucc_tl_ucp_dpu_xgvmi_rdma_task_post for both alltoall and allgather, even though it isn't that long (10 lines).
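For context, here is a minimal sketch of what such a resumable post routine could look like, written against plain UCX calls (ucp_get_nbx, UCP_OP_ATTR_FIELD_MEMH). Everything else (the function name and the reqs, remote_addr, rkeys and memh parameters) is a hypothetical placeholder for illustration, not the code in this PR.

#include <ucp/api/ucp.h>

/* Hedged sketch of a resumable "post all gets" routine. On the first call
 * *posted is 0; after the loop it equals host_team_size, so re-entering the
 * routine is a no-op. The reqs/remote_addr/rkeys/memh bookkeeping is
 * hypothetical, not the PR's actual task fields. */
static ucs_status_t post_gets_sketch(ucp_ep_h *eps, int host_team_size,
                                     int *posted, void *dst, size_t seg_len,
                                     const uint64_t *remote_addr,
                                     ucp_rkey_h *rkeys, ucp_mem_h memh,
                                     ucs_status_ptr_t *reqs)
{
    ucp_request_param_t req_param = {0};
    int i;

    /* pass the local memory handle with the request, as in the diff line
     * req_param.op_attr_mask |= UCP_OP_ATTR_FIELD_MEMH; above */
    req_param.op_attr_mask |= UCP_OP_ATTR_FIELD_MEMH;
    req_param.memh          = memh;

    for (i = *posted; i < host_team_size; i++) {
        reqs[i] = ucp_get_nbx(eps[i], (char *)dst + (size_t)i * seg_len,
                              seg_len, remote_addr[i], rkeys[i], &req_param);
        if (UCS_PTR_IS_ERR(reqs[i])) {
            return UCS_PTR_STATUS(reqs[i]);
        }
        (*posted)++;
    }
    return UCS_OK;
}

Because both linear collectives reduce to "get one segment from every peer", a routine of this shape can serve alltoall and allgather alike, which matches the reuse argument above; only the per-peer offsets and lengths would differ.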
ucp_worker_progress(tl_ctx->worker.ucp_worker);

for (i = *completed; i < *posted; i++) {
Here *posted is necessarily equal to host_team_size, right?
Yes, when it gets to this point all the gets will be posted. I think *posted might be clearer, just because we want *posted == *completed at the end. What do you think?
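To complete the picture, a hedged sketch of the matching completion side, using only standard UCX calls (ucp_worker_progress, ucp_request_check_status, ucp_request_free); the reqs array and the *posted/*completed counters are the same hypothetical bookkeeping as in the posting sketch above, not the PR's actual fields.

#include <ucp/api/ucp.h>

/* Hedged sketch of the progress/test side: advance *completed toward
 * *posted (which equals host_team_size once everything is issued). The
 * collective is done when *completed == *posted. */
static ucs_status_t test_gets_sketch(ucp_worker_h worker, int *posted,
                                     int *completed, ucs_status_ptr_t *reqs)
{
    int i;

    ucp_worker_progress(worker);

    for (i = *completed; i < *posted; i++) {
        ucs_status_t status;

        if (reqs[i] == NULL) {
            (*completed)++;          /* this get completed immediately */
            continue;
        }
        status = ucp_request_check_status(reqs[i]);
        if (status == UCS_INPROGRESS) {
            return UCS_INPROGRESS;   /* try again on the next progress call */
        }
        ucp_request_free(reqs[i]);
        (*completed)++;
        if (status != UCS_OK) {
            return status;           /* propagate a failed get */
        }
    }
    return UCS_OK;
}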
Can you update the tests as well?
Force-pushed from 8e17a30 to 419a128
I updated the gtests to test the linear allgather/alltoall.
Force-pushed from 873b576 to ea42acf
Co-authored-by: samnordmann <snordmann@nvidia.com>
Force-pushed from ea42acf to 925c3bb
Can one of the admins verify this patch?
This PR is a follow-up to allreduce sliding window. It adds linear alltoall and allgather algorithms based on XGVMI. They post ucp gets from host to host in a round-robin fashion.
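As a rough illustration of the round-robin ordering (a sketch only; the starting offset and the helper name are assumptions, not taken from this PR), each rank could start its gets at its right neighbor and wrap around so that not every rank targets the same peer first.

/* Hypothetical helper: maps iteration i on rank `rank` to the peer whose
 * buffer is fetched in that iteration, starting at the right neighbor and
 * wrapping around the host team. */
static inline int round_robin_peer(int rank, int i, int host_team_size)
{
    return (rank + 1 + i) % host_team_size;
}

Roughly speaking, for allgather each get would fetch a peer's contribution into that peer's segment of the receive buffer, while for alltoall it would fetch only the block that peer holds for this rank.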