Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CeedBasisApplyAdd #1644

Merged
merged 9 commits into from
Aug 13, 2024
Merged

Add CeedBasisApplyAdd #1644

merged 9 commits into from
Aug 13, 2024

Conversation

jeremylt
Copy link
Member

@jeremylt jeremylt commented Aug 6, 2024

WIP, GPU impl and CeedBasisApplyAddAtPoints to come

@jeremylt jeremylt self-assigned this Aug 6, 2024
@jeremylt jeremylt force-pushed the jeremy/basis-apply-add branch 2 times, most recently from 9aa6d80 to 855975c Compare August 6, 2024 22:34
@jeremylt
Copy link
Member Author

jeremylt commented Aug 6, 2024

Ok, design decision point - the usage really only makes sense for CEED_TRANSPOSE especially for why I need this, so I made that an explicit requirement.

I think for GPU I think I'll want separate Apply vs ApplyTranspose kernels in some cases?

@jeremylt jeremylt force-pushed the jeremy/basis-apply-add branch 2 times, most recently from 8178602 to a322f13 Compare August 7, 2024 15:15
@jrwrigh
Copy link
Collaborator

jrwrigh commented Aug 7, 2024

the usage really only makes sense for CEED_TRANSPOSE especially for why I need this, so I made that an explicit requirement.

Agreed that having the transpose as an input, but verifying it to always be transpose works well. Might also include that in the docstring for it, ie:

  @param[in]  t_mode    @ref CEED_TRANSPOSE to apply the transpose, mapping from quadrature points to nodes. CEED_NOTRANSPOSE is not valid for CeedBasisApplyAdd.

@jeremylt jeremylt force-pushed the jeremy/basis-apply-add branch from a322f13 to 652a51b Compare August 7, 2024 18:23
@jeremylt jeremylt force-pushed the jeremy/basis-apply-add branch from eac281b to 5e414ad Compare August 8, 2024 22:06
@jeremylt
Copy link
Member Author

jeremylt commented Aug 8, 2024

Update - CUDA/HIP ref and shared are good, need to tackle MAGMA

@nbeams
Copy link
Contributor

nbeams commented Aug 8, 2024

I'm not able to follow libCEED activity as closely these days, but please ping me if you run into any issues with the MAGMA backend.

@jeremylt
Copy link
Member Author

jeremylt commented Aug 8, 2024

Thanks! The TLDR here is that I want to sum into the target vector with CeedBasisApplyAdd for CEED_TRANSPOSE for CEED_EVAL_INTERP and CEED_EVAL_GRAD. It looks not too tricky for MAGMA at a glance but I haven't dug in on it

@jeremylt jeremylt force-pushed the jeremy/basis-apply-add branch 2 times, most recently from 30a674e to bd66db0 Compare August 9, 2024 20:02
@jeremylt
Copy link
Member Author

jeremylt commented Aug 9, 2024

Only tensor product grad is correctly summing into the target vector with my changes for MAGMA, unclear why but I'll dig more Monday

Copy link
Contributor

@nbeams nbeams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick review, found some copy-paste bugs

backends/magma/ceed-magma-basis.c Outdated Show resolved Hide resolved
backends/magma/ceed-magma-basis.c Outdated Show resolved Hide resolved
backends/magma/ceed-magma-basis.c Outdated Show resolved Hide resolved
@jeremylt jeremylt force-pushed the jeremy/basis-apply-add branch from bd66db0 to 18d93ab Compare August 9, 2024 23:01
@nbeams
Copy link
Contributor

nbeams commented Aug 10, 2024

As a side note, looking at the new tests in this PR reminded me that we might want to check on the test coverage for the non-tensor basis in MAGMA. Now that we have two options -- the RTC kernels for lower orders and standard library calls for the rest -- I'm not sure how often, if at all, the "standard" route is being tested. (For t363 I manually lowered MAGMA_NONTENSOR_CUSTOM_KERNEL_MAX_P/MAGMA_NONTENSOR_CUSTOM_KERNEL_MAX_Q to make sure it also passed with the non-custom-kernel path -- and it did -- but of course that won't happen in the usual testing).

@jeremylt jeremylt force-pushed the jeremy/basis-apply-add branch 2 times, most recently from 56c8d78 to f5cd871 Compare August 12, 2024 21:50
@jeremylt jeremylt force-pushed the jeremy/basis-apply-add branch from f5cd871 to 311be37 Compare August 12, 2024 22:06
Copy link
Collaborator

@zatkins-dev zatkins-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like this refactor. I noticed a few places where it is probably more clear to index an array instead of doing pointer arithmetic, but if you disagree that's fine.

jeremylt and others added 2 commits August 13, 2024 15:08
style - consistently use indexing over pointer arithmatic

Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>
@jeremylt jeremylt force-pushed the jeremy/basis-apply-add branch from 7b32c88 to fbbb68f Compare August 13, 2024 22:03
@jeremylt jeremylt merged commit db2becc into main Aug 13, 2024
23 of 24 checks passed
@jeremylt jeremylt deleted the jeremy/basis-apply-add branch August 13, 2024 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants