Performance issues with deepMap #3266
base: develop
LGTM
This looks really promising, @dvd101x! Some thoughts:
|
Thanks Jos!
|
While working on tests I found something very interesting: if a typed function is wrapped in a regular function, the map and forEach functions seem to gain significant speed.
It might be due to the |
In the latest version some big improvements were made regarding speed, even in a few cases surpassing By running benchmarks on a MacBook Air M1 with 8 GB the following results were found comparing develop vs this.
The next step is to try to reduce the lines of code while trying to maintain performance as previously commented. Also I will try merging the |
😎 thanks for the updates |
@dvd101x I've just merged #3251, which improves For reference: I compared the current
|
Hi Jos, thanks for the update. In this PR the intention was to see how much performance could be gained while maintaining recursion; as I understand it, the main improvement of #3251 is the new algorithm that eliminates recursion. I wasn't sure at the beginning but I'm glad that even with all the attempts to fix the issues with I agree that we should merge some of the logic so that arrays can also benefit from this new algorithm (if possible). I did test using I'm still reviewing the new algorithm and trying to understand how it manages the differences when a callback has one, two or three arguments, and its effect on performance. I'm also not sure what the best way to proceed is. I can try something in the coming weeks. |
Maybe it is easiest to start in a new PR and split the task into two steps:
|
Maybe you are right. I found another bottleneck in this code and fixing it made the recursion functions faster, but they were still slower than develop. I will first attempt to unify the logic into a utility; the skip-zeros option may have to be removed or implemented with a special callback. Further ahead, the unification will eventually be lost once the dense matrix is stored as a flat array, which I assume might be faster even with the index conversions. I think there is another area of opportunity in the optimize callback utility: an option to also simplify when the callback has only one signature with the right number of arguments. Please let me give it a try and eventually close this PR. |
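To make the flat-array idea above concrete, here is a minimal sketch of what "index conversions" could look like for a dense, row-major layout. The names `flatIndex` and `mapFlat` are hypothetical and are not part of mathjs; this only illustrates the trade-off being discussed, not how `DenseMatrix` is or will be implemented.

```javascript
// Hypothetical helpers, not mathjs API: a dense matrix stored as one flat,
// row-major array plus a size vector such as [rows, cols].

// Convert a multi-dimensional index into an offset in the flat array.
function flatIndex (index, size) {
  let offset = 0
  for (let i = 0; i < size.length; i++) {
    offset = offset * size[i] + index[i]
  }
  return offset
}

// Map over the flat data with a single loop (no recursion), while still
// handing the callback a multi-dimensional index.
function mapFlat (data, size, callback) {
  const result = new Array(data.length)
  const index = new Array(size.length).fill(0)
  for (let k = 0; k < data.length; k++) {
    result[k] = callback(data[k], index.slice())
    // Advance the index like an odometer, last dimension first.
    for (let d = size.length - 1; d >= 0; d--) {
      if (++index[d] < size[d]) break
      index[d] = 0
    }
  }
  return result
}

// A 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored flat:
const data = [1, 2, 3, 4, 5, 6]
console.log(flatIndex([1, 2], [2, 3])) // 5
console.log(mapFlat(data, [2, 3], (value) => value * 2))
```

The loop itself is trivial; the extra cost sits in maintaining the index for callbacks that ask for it, which is why the number of callback arguments matters for performance.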
Sounds good, thanks! |
Hi Jos, just to let you know, I did make some progress with this, but couldn't make it work for all tests. I was thinking of some options to move forward:
I don't think everything needs to use the same logic, since I think the main goal is for matrices to use flat arrays internally (as tensorflowjs does). |
Thanks for the update David! Good to hear you're planning to work on this next month. |
Hi Jos, this branch maintains the current mapping algorithm for matrices, but for arrays it implements the fixes in the recursion functions and addresses the issues in deep map. It is currently passing tests, but I would like to add more benchmarks for map and forEach and address a reduction of logic following your comment regarding
Previously we discussed unifying the logic for DenseMatrix._forEach, which I might be able to resolve in the future, but I would like to include this proposal as it is in case it's desirable.
|
Hi Jos,
This PR addresses issue #3262 and introduces improvements to our recursion functions which increase performance in general:
Key changes:
- `collection.deepMap` and `collection.deepForEach`: are optimized to run the callback with only one argument, thus avoiding the use of `matrix.map`.

Other changes:
- `collection.deepMap`: Uses `skipZeros` argument
- `map` and `forEach` recursions for improved speed.

Areas for Future Consideration:
- (`func.length`)
- `(array, array, callback)` but for matrices is `(matrix.valueOf(), matrix, callback)`.
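For readers following along, a minimal sketch of the "callback with only one argument" idea for nested arrays might look like this. `deepMapSketch` and `callbackArity` are hypothetical names used only for illustration; this is not the code in this PR nor the mathjs implementation.

```javascript
// Hypothetical sketch, not the mathjs implementation.
// Recursively map a nested array, always calling the callback with the
// value only (no index, no reference to the array).
function deepMapSketch (array, callback) {
  const recurse = (value) =>
    Array.isArray(value) ? value.map(recurse) : callback(value)
  return recurse(array)
}

// Function.prototype.length reports how many parameters a plain function
// declares, which is one way to guess how many arguments a callback wants.
function callbackArity (callback) {
  return callback.length
}

console.log(deepMapSketch([[1, 2], [3, [4]]], (x) => x * 2))
console.log(callbackArity((x) => x)) // 1
console.log(callbackArity((x, index, array) => x)) // 3
```

Note that `length` only counts declared parameters before any default or rest parameter, so it is a heuristic rather than a guarantee of how a callback uses its arguments.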
While I previously suggested discussing these changes before submitting a PR, I decided to proceed directly, without breaking changes, to facilitate further discussion based on the concrete implementation.
Please review the changes and provide feedback. I'm open to discussing potential modifications or alternative approaches.