ENH: Dask: sort and argsort #239

Open
wants to merge 1 commit into
base: main

Conversation

crusaderky
Contributor

@crusaderky crusaderky commented Jan 23, 2025

Crude implementation of sort and argsort for dask.array. It is functionally correct but can be extremely memory- and network-intensive.

A better solution would be to implement these two functions in dask.array itself, on top of the shuffle subsystem which is already used for dask.dataframe.DataFrame.sort_values.
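
For illustration only, here is a minimal sketch of why a naive approach can be so memory-hungry. It assumes (the description above does not say this) that the whole sort axis is gathered into a single chunk before sorting, and it is not the code in this PR. `naive_sort` and `naive_argsort` are hypothetical names.

```python
import numpy as np
import dask.array as da


def naive_sort(x: da.Array, axis: int = -1) -> da.Array:
    """Hypothetical helper: sort a dask array along one axis."""
    axis = axis % x.ndim
    # Merge every chunk along `axis` into one. Each output block then
    # needs all input blocks along that axis in memory at once, which is
    # where the memory and network cost comes from.
    x = x.rechunk({axis: -1})
    # With a single chunk along `axis`, a blockwise np.sort is a full sort.
    return x.map_blocks(np.sort, axis=axis, dtype=x.dtype)


def naive_argsort(x: da.Array, axis: int = -1) -> da.Array:
    """Hypothetical helper: argsort a dask array along one axis."""
    axis = axis % x.ndim
    x = x.rechunk({axis: -1})
    # Indices are valid globally only because `axis` is a single chunk.
    return x.map_blocks(np.argsort, axis=axis, dtype=np.intp)


if __name__ == "__main__":
    data = np.random.default_rng(0).integers(0, 100, size=(4, 10))
    x = da.from_array(data, chunks=(2, 3))
    print(naive_sort(x).compute())
```

Chunking along the other axes is preserved, so this only scales when the sort axis itself fits comfortably in one worker's memory.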

FYI @fjetter @phofl @hendrikmakait

@crusaderky
Contributor Author

FYI @lucascolley @lithomas1

@crusaderky crusaderky force-pushed the dask_sort branch 3 times, most recently from c0f8617 to 8500867 Compare January 23, 2025 11:21
@crusaderky
Contributor Author

@ev-br @lucascolley ready for review and merge.

@crusaderky crusaderky mentioned this pull request Jan 23, 2025
Member

@lucascolley lucascolley left a comment

Would you like to get feedback from a Dask expert before we merge this? Or are you confident that it is at least good enough for now?

2 participants