-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add zarr v2 endpoints to Tiled #774
base: main
Are you sure you want to change the base?
Conversation
FIX: Recursion error when pickling with dill TST: Initial tests for zarr endpoints Clean, refactor, and lint TST: tests for arrays and tables TST: tests for arrays and tables ENH: restructure demo examples ENH: (partial) support for StructDtype TST: fix tests ENH: support for datetime types
pyproject.toml
Outdated
@@ -44,6 +44,7 @@ tiled = "tiled.commandline.main:main" | |||
|
|||
# This is the union of all optional dependencies. | |||
all = [ | |||
"aiohttp", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot find where this is used. Maybe this was needed in some transient state of the PR but no longer is needed.
$ git grep aiohttp
pyproject.toml: "aiohttp",
pyproject.toml: "aiohttp",
pyproject.toml: "aiohttp",
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, since we use starlette
for the server and httpx
for the client, it would be somewhat odd and redundant to use aiohttp
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used by fsspec.implementations.http.HTTPFileSystem
, which is needed to connect to a tiled server that requires authentication. I had the same thought yesterday, that this was something I used before but no longer need, but unfortunately it's not the case. We don't need it in all requirements though (only for testing), which I have fixed now.
arr = zarr.open(fs.get_mapper(url), mode="r") | ||
actual = arr[...] | ||
expected = df[col] | ||
assert numpy.array_equal(actual, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Attempting to write raises a helpful error message:
ReadOnlyError: object is read-only
This behavior should be tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a couple of test cases here. Just to note, those errors are raised by fsspec
and zarr
client objects, even before the request reaches Tiled.
EDIT: I have included my code from this fix here: https://github.com/davramov/tiled/tree/add-zarr-forked Hi, I am working on serving zarr files to the standalone itkwidgets web browser visualization tool via Tiled using this PR. I am serving the zarr via tiled using this command: TILED_CONFIG=config.yaml TILED_ALLOW_ORIGINS="http://localhost:3000" tiled serve directory "../../data/tomo/scratch/" --public --verbose I ran into an issue where when I pass the tiled path into the itk-vtk-viewer url, itkVtkViewer.js:2
GET http://127.0.0.1:8000/zarr/v2/rec20230606_152011_jong-seto_fungal-mycelia_flat-AQ_fungi2_fast/.zattrs 404 (Not Found) It doesn't seem like this file is getting served as part of the directory (I noticed in To solve this I added a class ZarrAttrsAdapter:
"""
Adapter that exposes a Zarr node's .attrs as JSON.
"""
structure_family = "node" # or "container" if you prefer
specs: List[Spec] = []
def __init__(self, node: Union[zarr.hierarchy.Group, zarr.core.Array]):
"""
node: Zarr Group or Array whose attributes we want to serve
"""
self._node = node
def metadata(self) -> JSON:
"""
Return any extra metadata. In this example, it's empty.
"""
return {}
def structure(self):
"""
We have no numeric array data to describe, just JSON attributes.
"""
return None
def read(self) -> JSON:
"""
Return the node's attrs as a plain dictionary.
"""
return dict(self._node.attrs)
def __repr__(self) -> str:
return f"<ZarrAttrsAdapter attrs_keys={list(self._node.attrs.keys())}>" I also added an additional router in @router.get("{path:path}.zattrs", name="Zarr .zattrs metadata")
@router.get("/{path:path}/.zattrs", name="Zarr .zattrs metadata")
async def get_zarr_attrs(
request: Request,
entry=SecureEntry(
scopes=["read:data", "read:metadata"],
structure_families={
StructureFamily.table,
StructureFamily.container,
StructureFamily.array,
},
),
):
"""
Return Zarr attributes metadata (.zattrs).
If entry.metadata() (or entry.metadata) includes "zattrs", return them.
"""
# If it's an unstructured array, we do not treat it as a group for .zattrs
if entry.structure_family == StructureFamily.array and not isinstance(
entry.structure().data_type, StructDtype
):
raise HTTPException(status_code=HTTP_404_NOT_FOUND)
# Attempt to retrieve .zattrs from entry.metadata
try:
metadata_dict = entry.metadata() # if it's callable
except TypeError:
metadata_dict = entry.metadata # if it's a property
return Response(
json.dumps(metadata_dict),
status_code=200,
media_type="application/json",
) With these changes, I was able to successfully load the zarr volume in the itk-vtk-viewer web app. |
This PR exposes Tiled data as a zarr collection on a set of new api endpoints,
/zarr/v2/...
. This allows one to use zarr clients directly with Tiled, as if it was an external filesystem accessed throughfsspec
.Assuming a demo Tiled server is running on
127.0.0.1:8000
(e.g. started withtiled serve demo
), one can read its contents into zarr by first specifying a file system mapper and then passing it to zarr:The resulting object is a zarr.Group, which represents the root of the Tiled catalog tree and supports (most) of the usual operations on zarr groups:
The native tiled datastructures are mapped to zarr as follows:
Addresses the Issue #562.
Checklist