Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: astype(..., copy=True) doesn't copy on dask #234

Merged
merged 3 commits into from
Jan 16, 2025

Conversation

lithomas1
Copy link
Contributor

dask.array.astype doesn't respect copy=True, so we should copy manually.

@lithomas1
Copy link
Contributor Author

oops, snuck some misc changes in.

Going to revert, sorry!

@lithomas1
Copy link
Contributor Author

cc @crusaderky @ev-br for review

Copy link
Contributor

@crusaderky crusaderky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also: #235

@@ -39,7 +39,14 @@

isdtype = get_xp(np)(_aliases.isdtype)
unstack = get_xp(da)(_aliases.unstack)
astype = _aliases.astype

def astype(x: Array, dtype: Dtype, /, *, copy: bool = True) -> Array:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks,
I just raised Notimplemented for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please ignore it. astype(x, dtype, device=device(x) must work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, PTAL.

array_api_compat/dask/array/_aliases.py Outdated Show resolved Hide resolved
Comment on lines 275 to 287

@pytest.mark.parametrize("library", wrapped_libraries)
def test_astype_copy(library):
# array-api-tests currently doesn't check copy=True
# makes a copy when dtypes are the same
# so we check that here
xp = import_(library, wrapper=True)
a = xp.asarray([1])
b = xp.astype(a, a.dtype, copy=True)

a[0] = 10

assert b[0] == 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this test to array-api-tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lithomas1 and others added 2 commits January 15, 2025 13:44
Co-Authored-By: Guido Imperiale <6213168+crusaderky@users.noreply.github.com>
@crusaderky
Copy link
Contributor

@ev-br this looks ready to be merged to me

Copy link
Contributor

@ev-br ev-br left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sequence of x.astype, then x.copy looks like making two copies to a non-expert; that said I'm ready to believe it's a right dask-specific pattern, and the PR is okay'd by the subject matter expert. Let's roll with this then. Thank you @lithomas1 , @crusaderky

@ev-br ev-br merged commit adbb6ef into data-apis:main Jan 16, 2025
42 checks passed
@crusaderky
Copy link
Contributor

crusaderky commented Jan 16, 2025

The sequence of x.astype, then x.copy looks like making two copies to a non-expert

It does make two copies of the dask collection - which is a thin python object. The dask graph wrapped internally is the same.

@lithomas1 lithomas1 deleted the dask-astype-copy branch January 16, 2025 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants