MNT: Refactoring changes to CSV adapter + CSVArrayAdapter #803
I see two structural changes here:
Some time ago, we encountered these examples and some others in the wild. We did not like the look of these Adapters: Thus, we added additional columns to the Asset--DataSource many-to-many relation table which encode which role and order each Asset plays. To be specific, with each row in the many-to-many relation table we also store, in addition to
I really like having standard separate paths for introspection (ii) and construction from DataSource/Asset (iii). That's a huge improvement. I see in the implementation of

Now on shakier ground, just brainstorming... Can we get the best of both worlds? As a starting point, a convenience function like this might work:

    from collections import defaultdict

    def from_node_and_data_source(adapter_cls: type[Adapter], node: Node, data_source: DataSource) -> Adapter:
        "Usage example: from_node_and_data_source(CSVAdapter, node, data_source)"
        parameters = defaultdict(list)
        for asset in data_source.assets:
            if asset.num is None:
                # This asset is associated with a parameter that takes a single URI.
                parameters[asset.parameter] = asset.data_uri
            else:
                # This asset is associated with a parameter that takes a list of URIs.
                parameters[asset.parameter].append(asset.data_uri)
        return adapter_cls(metadata=node.metadata, specs=node.specs, structure=data_source.structure, **parameters)

(A further convenience could wrap this and figure out the

Thus, we would drop the

We would retain two great aspects of this PR:
There may be better implementations of these goals available, but I hope this is a promising example.
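To make the proposed mapping from assets to constructor parameters concrete, here is a minimal, self-contained sketch. The `Asset`, `DataSource`, `Node`, and `DemoAdapter` classes are hypothetical stand-ins written for this illustration and are not the project's real classes; the sketch also assumes that `data_source.assets` is already ordered by `num`.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Asset:
    # Hypothetical stand-in for one row of the Asset--DataSource relation.
    data_uri: str
    parameter: str  # name of the adapter parameter this asset feeds
    num: Optional[int] = None  # None: single URI; int: position in a list


@dataclass
class DataSource:
    # Hypothetical stand-in; only the fields used by the helper.
    structure: Any
    assets: List[Asset] = field(default_factory=list)


@dataclass
class Node:
    # Hypothetical stand-in; only the fields used by the helper.
    metadata: dict
    specs: list


class DemoAdapter:
    # Hypothetical adapter that simply records the URI parameters it receives.
    def __init__(self, *, metadata, specs, structure, **uris):
        self.metadata = metadata
        self.specs = specs
        self.structure = structure
        self.uris = uris


def from_node_and_data_source(adapter_cls, node, data_source):
    """Build an adapter by grouping asset URIs under their parameter names."""
    parameters = defaultdict(list)
    for asset in data_source.assets:
        if asset.num is None:
            # Parameter that takes a single URI.
            parameters[asset.parameter] = asset.data_uri
        else:
            # Parameter that takes a list of URIs (assumed pre-sorted by num).
            parameters[asset.parameter].append(asset.data_uri)
    return adapter_cls(
        metadata=node.metadata,
        specs=node.specs,
        structure=data_source.structure,
        **parameters,
    )


data_source = DataSource(
    structure=None,
    assets=[
        Asset("file:///data/part0.csv", parameter="data_uris", num=0),
        Asset("file:///data/part1.csv", parameter="data_uris", num=1),
        Asset("file:///data/meta.json", parameter="metadata_uri"),
    ],
)
adapter = from_node_and_data_source(DemoAdapter, Node({}, []), data_source)
print(adapter.uris)
```

In this sketch the two `data_uris` assets are collected into a list, while `metadata_uri` is passed as a single string, so the adapter's constructor signature alone determines how the assets are interpreted.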
CSVAdapter to accept kwargs for pd.read_csv (e.g. separator)
dataframe_adapter property from CSVAdapter
multipart/related;type=text/csv mimetype
CSVArrayAdapter backed by an ArrayAdapter (instead of TableAdapter). It can be used to load homogeneous numerical arrays stored as CSV files. The distinction between the two is intended to be made by the mimetype: "text/csv;header=present" for tables and "text/csv;header=absent" for arrays.
from_assets and from_uris being the two primary methods).
Checklist
Add the ticket number which this PR closes to the comment section
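The mimetype-based distinction between tables and arrays described in the PR summary could be sketched roughly as follows. This is only an illustration: `parse_mimetype` and `choose_csv_adapter` are hypothetical helper names, the function returns the adapter class names as plain strings rather than the real classes, and treating a missing `header` parameter as "present" is an assumption made for this sketch.

```python
from typing import Dict, Tuple


def parse_mimetype(mimetype: str) -> Tuple[str, Dict[str, str]]:
    """Split 'type/subtype;key=value;...' into a base type and a parameter dict."""
    base, *raw_params = [part.strip() for part in mimetype.split(";")]
    params = {}
    for raw in raw_params:
        key, _, value = raw.partition("=")
        params[key.strip().lower()] = value.strip().lower()
    return base.lower(), params


def choose_csv_adapter(mimetype: str) -> str:
    """Return the name of the adapter that would handle this CSV mimetype."""
    base, params = parse_mimetype(mimetype)
    if base != "text/csv":
        raise ValueError(f"Not a CSV mimetype: {mimetype!r}")
    # Assumption for this sketch: no header parameter means a table with a header.
    header = params.get("header", "present")
    if header == "present":
        return "CSVAdapter"  # tabular data with named columns
    if header == "absent":
        return "CSVArrayAdapter"  # homogeneous numerical array
    raise ValueError(f"Unsupported header parameter: {header!r}")


print(choose_csv_adapter("text/csv;header=present"))
print(choose_csv_adapter("text/csv;header=absent"))
```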