Enable non-packed inputs for mlir #3541
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## develop #3541 +/- ##
========================================
Coverage 92.17% 92.17%
========================================
Files 512 512
Lines 21393 21393
========================================
Hits 19720 19720
Misses 1673 1673
This build is not recommended to merge 🔴
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
stride_ratios.end(),
ns.lens().begin() + 1,
[](auto ratio, auto len) { return ratio >= len; });
}
@krzysz00 This is checking whether the shapes can be used by mlir (i.e., whether they can be constructed from slice/reshape/transpose/broadcast). I know there are similar checks in mlir. Are you able to confirm that this checks the shapes correctly?
https://github.com/ROCm/rocMLIR/blob/develop/mlir/lib/Dialect/MIGraphX/IR/MIGraphX.cpp#L182 is our logic.
This code feels similar, though I'd need to run through some examples with broadcast dimensions
Assuming this doesn't lead to non-packed outputs, this seems fine to me.
auto last = std::find(ns.strides().begin(), ns.strides().end(), 0);
if(*std::prev(last) != 1)
return false;
std::adjacent_difference(ns.strides().begin(),
Do we want to add a parallel execution policy here, or are we not concerned about this difference being too large? It looks like adjacent_difference can be parallelized. Or is the intent that, since shape is const, we want this expression to be const, so the execution policy doesn't matter?
Disregard: stride_ratios shouldn't be const, and order matters here because a back inserter isn't valid for parallel execution.
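For readers following the thread, here is a minimal standalone sketch of the kind of stride check being discussed, pieced together from the fragments quoted above. It is not the PR's actual function. The name `strides_look_mlir_compatible`, the plain `std::vector` inputs, the divisibility handling inside the lambda, and the early return for a leading zero stride are all assumptions made for illustration. It also assumes the dimensions have already been reordered so that strides are non-increasing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the PR). Assumes lens/strides are already
// permuted so that strides are in non-increasing order.
bool strides_look_mlir_compatible(const std::vector<std::size_t>& lens,
                                  const std::vector<std::size_t>& strides)
{
    if(lens.size() != strides.size() || strides.empty())
        return false;
    // Zero strides are treated as broadcast dimensions; stop at the first one.
    auto last = std::find(strides.begin(), strides.end(), std::size_t{0});
    // Assumption: reject the case where the first stride is already zero,
    // so that std::prev(last) below is always valid.
    if(last == strides.begin())
        return false;
    // The innermost non-broadcast stride must be 1.
    if(*std::prev(last) != 1)
        return false;
    // stride_ratios[0] is a copy of strides[0]; for i >= 1 it holds
    // strides[i - 1] / strides[i], or 0 if the division is not exact.
    std::vector<std::size_t> stride_ratios;
    std::adjacent_difference(strides.begin(),
                             last,
                             std::back_inserter(stride_ratios),
                             [](std::size_t cur, std::size_t prev) -> std::size_t {
                                 assert(cur != 0);
                                 if(prev % cur != 0)
                                     return 0;
                                 return prev / cur;
                             });
    // Each ratio must be at least the length of the dimension it steps over;
    // equality means packed, a larger ratio means a slice.
    return std::equal(stride_ratios.begin() + 1,
                      stride_ratios.end(),
                      lens.begin() + 1,
                      [](std::size_t ratio, std::size_t len) { return ratio >= len; });
}
```

Under these assumptions, a packed shape such as lens {2, 3, 4} with strides {12, 4, 1} passes, as does a slice such as lens {2, 3, 2} with the same strides, while lens {2, 3, 4} with strides {6, 2, 1} is rejected because the elements would overlap. On the execution-policy question: the `std::adjacent_difference` overload that takes an execution policy requires forward iterators for the destination, which a `std::back_inserter` does not provide, so the sequential overload is the one that applies when appending to a vector like this.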
Looks good unless @krzysz00 has any concerns
CI is all green; not sure why GitHub isn't reporting that correctly in Jenkins here.
No description provided.