From a3c1dc6904de8d39603748e75a0d5ced619528b1 Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 12:00:36 +0100 Subject: [PATCH 01/27] Add placeholders for additional documentation --- docs/howto/data_rw.rst | 6 ++++++ docs/howto/sdmx_rest.rst | 6 ++++++ docs/index.rst | 14 +++++++++++++- docs/start.rst | 11 ++++++++++- 4 files changed, 35 insertions(+), 2 deletions(-) create mode 100644 docs/howto/data_rw.rst create mode 100644 docs/howto/sdmx_rest.rst diff --git a/docs/howto/data_rw.rst b/docs/howto/data_rw.rst new file mode 100644 index 00000000..39771794 --- /dev/null +++ b/docs/howto/data_rw.rst @@ -0,0 +1,6 @@ +.. _data-rw: + +Reading and writing SDMX datasets +================================= + +To be defined \ No newline at end of file diff --git a/docs/howto/sdmx_rest.rst b/docs/howto/sdmx_rest.rst new file mode 100644 index 00000000..a558069c --- /dev/null +++ b/docs/howto/sdmx_rest.rst @@ -0,0 +1,6 @@ +.. _sdmx-rest: + +SDMX-REST Queries +================= + +To be defined \ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst index cca977fc..a8712193 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -13,7 +13,7 @@ Your opinionated Python SDMX library. .. toctree:: :maxdepth: 1 - :caption: How to + :caption: Metadata-driven processes howto/structure_fs howto/structure_db @@ -21,6 +21,18 @@ Your opinionated Python SDMX library. howto/map howto/config +.. toctree:: + :maxdepth: 1 + :caption: Data discovery + + howto/sdmx_rest + +.. toctree:: + :maxdepth: 2 + :caption: Handling SDMX datasets + + howto/data_rw + .. toctree:: :maxdepth: 1 diff --git a/docs/start.rst b/docs/start.rst index f04a9d10..8e3d89e9 100644 --- a/docs/start.rst +++ b/docs/start.rst @@ -56,7 +56,7 @@ However, metadata can do so much more than that, i.e. they can be "active" and - :ref:`config` ``pysdmx`` supports retrieving metadata from an SDMX Registry or any service -compliant with the SDMX-REST 2.0.0 API. 
+compliant with the SDMX-REST 2.0.0 (or above) API. Install ``pysdmx`` with the ``fmr`` extra to enable this functionality: @@ -74,6 +74,15 @@ allow: - Discovering data available in these services. - Retrieving data from these services. +Although this functionality is still under development, it is already +possible to :ref:`build SDMX-REST queries and execute them against a +web service`. + +Reading and writing SDMX datasets +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +[TO BE COMPLETED] + How can I get it? ----------------- From 612feaf52aff69a3cd5390272538dce4c10d841b Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 13:22:33 +0100 Subject: [PATCH 02/27] Fix reference issues --- docs/api/fmr/async.rst | 2 +- docs/api/fmr/sync.rst | 2 +- docs/howto/config.rst | 4 ++-- docs/howto/map.rst | 4 ++-- docs/howto/structure_db.rst | 4 ++-- docs/howto/structure_fs.rst | 6 +++--- docs/howto/validate.rst | 4 ++-- 7 files changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/api/fmr/async.rst b/docs/api/fmr/async.rst index 595fbfb2..34d5bbcf 100644 --- a/docs/api/fmr/async.rst +++ b/docs/api/fmr/async.rst @@ -12,5 +12,5 @@ SDMX service in an asynchronous (i.e. non-blocking fashion). >>> print(mapping) >>> asyncio.run(main()) -.. autoclass:: pysdmx.fmr.AsyncRegistryClient +.. autoclass:: pysdmx.api.fmr.AsyncRegistryClient :members: \ No newline at end of file diff --git a/docs/api/fmr/sync.rst b/docs/api/fmr/sync.rst index b84b8146..5e5f62a4 100644 --- a/docs/api/fmr/sync.rst +++ b/docs/api/fmr/sync.rst @@ -8,5 +8,5 @@ SDMX service in a synchronous (i.e. blocking fashion). >>> gr = RegistryClient("https://registry.sdmx.org/sdmx/v2/") >>> schema = gr.get_schema("dataflow", "UIS", "EDUCAT_CLASS_A", "1.0") -.. autoclass:: pysdmx.fmr.RegistryClient +.. 
autoclass:: pysdmx.api.fmr.RegistryClient :members: \ No newline at end of file diff --git a/docs/howto/config.rst b/docs/howto/config.rst index 5b45b6c7..94f02416 100644 --- a/docs/howto/config.rst +++ b/docs/howto/config.rst @@ -95,8 +95,8 @@ Connecting to a Registry ^^^^^^^^^^^^^^^^^^^^^^^^ ``pysdmx`` allows retrieving metadata from an SDMX Registry in either a -synchronous (via ``pymedal.fmr.RegistryClient``) or asynchronous fashion -(via ``pymedal.fmr.AsyncRegistryClient``). The choice depends on your use +synchronous (via ``pysdmx.api.fmr.RegistryClient``) or asynchronous fashion +(via ``pysdmx.api.fmr.AsyncRegistryClient``). The choice depends on your use case. The asynchronous client is often preferred as it is non-blocking. To connect to your target Registry, instantiate the client by passing the diff --git a/docs/howto/map.rst b/docs/howto/map.rst index f41e4d51..e1c7176d 100644 --- a/docs/howto/map.rst +++ b/docs/howto/map.rst @@ -26,8 +26,8 @@ Step-by-step Solution --------------------- ``pysdmx`` allows retrieving metadata from an SDMX Registry in either a -synchronous manner (via ``pymedal.fmr.RegistryClient``) or asynchronously -(via ``pymedal.fmr.AsyncRegistryClient``). The choice depends on the use case +synchronous manner (via ``pysdmx.api.fmr.RegistryClient``) or asynchronously +(via ``pysdmx.api.fmr.AsyncRegistryClient``). The choice depends on the use case (and preference), but we tend to use the asynchronous client by default as it is non-blocking. diff --git a/docs/howto/structure_db.rst b/docs/howto/structure_db.rst index c485f195..931793b7 100644 --- a/docs/howto/structure_db.rst +++ b/docs/howto/structure_db.rst @@ -30,8 +30,8 @@ Step-by-step Solution --------------------- ``pysdmx`` allows retrieving metadata from an SDMX Registry either -synchronously (via ``pymedal.fmr.RegistryClient``) or asynchronously -(via ``pymedal.fmr.AsyncRegistryClient``). 
The choice depends on the use case +synchronously (via ``pysdmx.api.fmr.RegistryClient``) or asynchronously +(via ``pysdmx.api.fmr.AsyncRegistryClient``). The choice depends on the use case and preference, but we use the asynchronous client by default as it is non-blocking. diff --git a/docs/howto/structure_fs.rst b/docs/howto/structure_fs.rst index d6a12ed7..28043344 100644 --- a/docs/howto/structure_fs.rst +++ b/docs/howto/structure_fs.rst @@ -42,8 +42,8 @@ Step-by-step Solution --------------------- ``pysdmx`` allows retrieving metadata from an SDMX Registry either -synchronously (via ``pymedal.fmr.RegistryClient``) or asynchronously -(via ``pymedal.fmr.AsyncRegistryClient``). +synchronously (via ``pysdmx.api.fmr.RegistryClient``) or asynchronously +(via ``pysdmx.api.fmr.AsyncRegistryClient``). Connecting to a Registry ^^^^^^^^^^^^^^^^^^^^^^^^ @@ -67,7 +67,7 @@ in a **category scheme** and related **categorizations**. .. code-block:: python - cs = await client.get_categories("MY_AGENCY", "MY_DATAFLOWS") + cs = await client.get_categories("MY_AGENCY", "MY_CATEGORY_SCHEME") Now we iterate over the categories (and their sub-categories) to find the dataflows attached to them. Use the ``dataflows`` property to get a set diff --git a/docs/howto/validate.rst b/docs/howto/validate.rst index 5f5be7bb..d94a8eb9 100644 --- a/docs/howto/validate.rst +++ b/docs/howto/validate.rst @@ -66,8 +66,8 @@ Step-by-step solution --------------------- ``pysdmx`` allows retrieving metadata from an SDMX Registry in either a -synchronous (via ``pymedal.fmr.RegistryClient``) or asynchronous fashion -(via ``pymedal.fmr.AsyncRegistryClient``). Which one to use depends on the +synchronous (via ``pysdmx.api.fmr.RegistryClient``) or asynchronous fashion +(via ``pysdmx.api.fmr.AsyncRegistryClient``). Which one to use depends on the use case (and taste), but we tend to use the asynchronous client by default, as it is non-blocking. 
From e332185cd20aca7965cae400f3920ab9004667c7 Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 13:27:13 +0100 Subject: [PATCH 03/27] Update documentation for ValueMap --- docs/howto/map.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/howto/map.rst b/docs/howto/map.rst index e1c7176d..6ec43e60 100644 --- a/docs/howto/map.rst +++ b/docs/howto/map.rst @@ -170,7 +170,7 @@ mappings can be retrieved via the ``component_maps`` property: # target='CONTRACT', # values=[ # ValueMap(source='PROD TYPE', target='_T', valid_from=None, valid_to=None), - # ValueMap(source=re.compile('^([A-Z0-9]+)$'), target='\\1', valid_from=None, valid_to=None) + # ValueMap(source='regex:^([A-Z0-9]+)$', target='\\1', valid_from=None, valid_to=None) # ] # ) From d60689ff3fdfcaa9b5707de2ac4d34a273500343 Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 13:36:26 +0100 Subject: [PATCH 04/27] Add classes to API documentation --- docs/api/model/code.rst | 2 +- docs/api/model/dataflow.rst | 2 +- docs/api/model/map.rst | 2 +- docs/api/model/organisation.rst | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/api/model/code.rst b/docs/api/model/code.rst index 0e31c08b..3d446bdd 100644 --- a/docs/api/model/code.rst +++ b/docs/api/model/code.rst @@ -2,4 +2,4 @@ Codelists, hierarchies and value lists ====================================== .. automodule:: pysdmx.model.code - :members: Code, Codelist, HierarchicalCode, Hierarchy \ No newline at end of file + :members: Code, Codelist, HierarchicalCode, Hierarchy, HierarchyAssociation \ No newline at end of file diff --git a/docs/api/model/dataflow.rst b/docs/api/model/dataflow.rst index f9e1efdf..55c7626b 100644 --- a/docs/api/model/dataflow.rst +++ b/docs/api/model/dataflow.rst @@ -10,4 +10,4 @@ Dataflows and data structures - :ref:`validate`. .. 
automodule:: pysdmx.model.dataflow - :members: Component, Components, DataflowInfo, Role, Schema \ No newline at end of file + :members: Component, Components, Dataflow, DataflowInfo, Role, Schema \ No newline at end of file diff --git a/docs/api/model/map.rst b/docs/api/model/map.rst index 799eceb8..4f3368c2 100644 --- a/docs/api/model/map.rst +++ b/docs/api/model/map.rst @@ -9,4 +9,4 @@ Mapping definitions - :ref:`map`. .. automodule:: pysdmx.model.map - :members: StructureMap, ComponentMap, MultiComponentMap, ValueMap, MultiValueMap, ImplicitComponentMap, FixedValueMap, DatePatternMap \ No newline at end of file + :members: StructureMap, ComponentMap, RepresentationMap, MultiComponentMap, MultiRepresentationMap, ValueMap, MultiValueMap, ImplicitComponentMap, FixedValueMap, DatePatternMap \ No newline at end of file diff --git a/docs/api/model/organisation.rst b/docs/api/model/organisation.rst index 8983dd87..881e0c18 100644 --- a/docs/api/model/organisation.rst +++ b/docs/api/model/organisation.rst @@ -2,4 +2,4 @@ Organisations ============= .. automodule:: pysdmx.model.organisation - :members: Organisation, Contact, DataflowRef \ No newline at end of file + :members: Agency, AgencyScheme, DataConsumer, DataConsumerScheme, DataProvider, DataProviderScheme, MetadataProvider, MetadataProviderScheme \ No newline at end of file From 8190a3cc2cac83ed71100db403a8d2375fafb704 Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 13:37:23 +0100 Subject: [PATCH 05/27] Reorganize docs --- docs/api/model.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/api/model.rst b/docs/api/model.rst index a355c223..c2608579 100644 --- a/docs/api/model.rst +++ b/docs/api/model.rst @@ -61,10 +61,10 @@ API Reference .. 
toctree:: :maxdepth: 1 - model/organisation model/code model/concept model/dataflow model/category model/map - model/refmeta \ No newline at end of file + model/refmeta + model/organisation \ No newline at end of file From e848a41ad4b3bf1f6d3fd40e1a9c7c43df49fb4c Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 13:38:34 +0100 Subject: [PATCH 06/27] Add documentation for DataflowRef --- docs/api/model/category.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/api/model/category.rst b/docs/api/model/category.rst index 35941e27..41e846a0 100644 --- a/docs/api/model/category.rst +++ b/docs/api/model/category.rst @@ -9,4 +9,4 @@ Category schemes - :ref:`fs`. .. automodule:: pysdmx.model.category - :members: Category, CategoryScheme \ No newline at end of file + :members: Category, CategoryScheme, DataflowRef \ No newline at end of file From c822446f19be426b8c8e1947d47d005a1808d30e Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 14:18:29 +0100 Subject: [PATCH 07/27] Add API documentation for SDMX-REST --- docs/api/query/availability.rst | 11 +++++++++++ docs/api/query/data.rst | 11 +++++++++++ docs/api/query/refmeta.rst | 11 +++++++++++ docs/api/query/schema.rst | 11 +++++++++++ docs/api/query/service.rst | 11 +++++++++++ docs/api/query/structure.rst | 11 +++++++++++ docs/api/rest.rst | 26 ++++++++++++++++++++++++++ docs/index.rst | 1 + 8 files changed, 93 insertions(+) create mode 100644 docs/api/query/availability.rst create mode 100644 docs/api/query/data.rst create mode 100644 docs/api/query/refmeta.rst create mode 100644 docs/api/query/schema.rst create mode 100644 docs/api/query/service.rst create mode 100644 docs/api/query/structure.rst create mode 100644 docs/api/rest.rst diff --git a/docs/api/query/availability.rst b/docs/api/query/availability.rst new file mode 100644 index 00000000..c07ffd9f --- /dev/null +++ b/docs/api/query/availability.rst @@ -0,0 +1,11 @@ +SDMX-REST Availability Queries 
+============================== + +.. note:: + Additional information about how to build SDMX-REST queries + can be found in the following tutorial: + + - :ref:`sdmx-rest`. + +.. automodule:: pysdmx.api.qb.availability + :members: AvailabilityFormat, AvailabilityMode, AvailabilityQuery diff --git a/docs/api/query/data.rst b/docs/api/query/data.rst new file mode 100644 index 00000000..51b60d1d --- /dev/null +++ b/docs/api/query/data.rst @@ -0,0 +1,11 @@ +SDMX-REST Data Queries +====================== + +.. note:: + Additional information about how to build SDMX-REST queries + can be found in the following tutorial: + + - :ref:`sdmx-rest`. + +.. automodule:: pysdmx.api.qb.data + :members: DataContext, DataFormat, DataQuery diff --git a/docs/api/query/refmeta.rst b/docs/api/query/refmeta.rst new file mode 100644 index 00000000..6ed22315 --- /dev/null +++ b/docs/api/query/refmeta.rst @@ -0,0 +1,11 @@ +SDMX-REST Reference Metadata Queries +==================================== + +.. note:: + Additional information about how to build SDMX-REST queries + can be found in the following tutorial: + + - :ref:`sdmx-rest`. + +.. automodule:: pysdmx.api.qb.refmeta + :members: RefMetaByMetadataflowQuery, RefMetaByMetadatasetQuery, RefMetaByStructureQuery, RefMetaDetail, RefMetaFormat \ No newline at end of file diff --git a/docs/api/query/schema.rst b/docs/api/query/schema.rst new file mode 100644 index 00000000..a52be140 --- /dev/null +++ b/docs/api/query/schema.rst @@ -0,0 +1,11 @@ +SDMX-REST Schema Queries +======================== + +.. note:: + Additional information about how to build SDMX-REST queries + can be found in the following tutorial: + + - :ref:`sdmx-rest`. + +.. 
automodule:: pysdmx.api.qb.schema + :members: SchemaContext, SchemaFormat, SchemaQuery \ No newline at end of file diff --git a/docs/api/query/service.rst b/docs/api/query/service.rst new file mode 100644 index 00000000..8beded84 --- /dev/null +++ b/docs/api/query/service.rst @@ -0,0 +1,11 @@ +SDMX-REST Service Clients +========================= + +.. note:: + Additional information about how to execute SDMX-REST queries + against a specific service can be found in the following tutorial: + + - :ref:`sdmx-rest`. + +.. automodule:: pysdmx.api.qb + :members: ApiVersion, RestService diff --git a/docs/api/query/structure.rst b/docs/api/query/structure.rst new file mode 100644 index 00000000..9cb88328 --- /dev/null +++ b/docs/api/query/structure.rst @@ -0,0 +1,11 @@ +SDMX-REST Structure Queries +=========================== + +.. note:: + Additional information about how to build SDMX-REST queries + can be found in the following tutorial: + + - :ref:`sdmx-rest`. + +.. automodule:: pysdmx.api.qb.structure + :members: StructureDetail, StructureFormat, StructureQuery, StructureReference, StructureType diff --git a/docs/api/rest.rst b/docs/api/rest.rst new file mode 100644 index 00000000..1beb5506 --- /dev/null +++ b/docs/api/rest.rst @@ -0,0 +1,26 @@ +SDMX-REST queries +================= + +Overview +-------- + +``pysdmx`` allows **building SDMX-REST queries** and **executing them** +against an SDMX-REST compliant service. + +.. note:: + Discover how to execute SDMX-REST queries in the following tutorial: + + - :ref:`sdmx-rest`. + +API Reference +------------- + +.. toctree:: + :maxdepth: 1 + + query/service + query/availability + query/data + query/refmeta + query/schema + query/structure diff --git a/docs/index.rst b/docs/index.rst index a8712193..21cc0d51 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -40,6 +40,7 @@ Your opinionated Python SDMX library. 
api/model api/fmr + api/rest api/helper Indices and tables From 19658323282da77a4a1a98f7a1c2851f1fb264b6 Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 14:21:26 +0100 Subject: [PATCH 08/27] Change name --- docs/api/rest.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/api/rest.rst b/docs/api/rest.rst index 1beb5506..82e8522e 100644 --- a/docs/api/rest.rst +++ b/docs/api/rest.rst @@ -1,5 +1,5 @@ -SDMX-REST queries -================= +SDMX-REST services +================== Overview -------- From 821dcc4d1e9b721e2f4a899006d932f03b7027bb Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 15:13:26 +0100 Subject: [PATCH 09/27] Complete SDMX query builder documentation --- docs/api/rest.rst | 2 + docs/howto/sdmx_rest.rst | 116 ++++++++++++++++++++++++++++++++++++++- 2 files changed, 115 insertions(+), 3 deletions(-) diff --git a/docs/api/rest.rst b/docs/api/rest.rst index 82e8522e..6a253544 100644 --- a/docs/api/rest.rst +++ b/docs/api/rest.rst @@ -1,3 +1,5 @@ +.. _qb_api: + SDMX-REST services ================== diff --git a/docs/howto/sdmx_rest.rst b/docs/howto/sdmx_rest.rst index a558069c..8b73b131 100644 --- a/docs/howto/sdmx_rest.rst +++ b/docs/howto/sdmx_rest.rst @@ -1,6 +1,116 @@ .. _sdmx-rest: -SDMX-REST Queries -================= +SDMX-REST services +================== -To be defined \ No newline at end of file +``pysdmx`` allows **building SDMX-REST queries** and **executing them** +against an SDMX-REST compliant web service. + +For additional information about the SDMX-REST API, please refer to the +`SDMX documentation `_. + +SDMX-REST queries +----------------- + +The SDMX-REST API allows defining queries to retrieve +`data `_, +`structural `_ and +`reference metadata `_, +`schemas `_ +and `data availability `_. + +``pysdmx`` offers **query builders** for these different types of queries, as well as +enumerations for some of the parameters available in the SDMX-REST API. 
+ +For example, the following can be used to retrieve information about a dataflow and +all the artefacts referenced directly or indirectly by this dataflow. + +.. code-block:: python + + from pysdmx.api.qb import ( + StructureDetail, + StructureQuery, + StructureReference, + StructureType, + ) + + query = StructureQuery( + StructureType.DATAFLOW, + "SDMX", + "NAMAIN_IDC_N", + detail=StructureDetail.REFERENCE_PARTIAL, + references=StructureReference.DESCENDANTS, + ) + +SDMX services +------------- + +Now that we have a query, we can execute it against the desired SDMX-REST service, +using the ``RestService`` class. + +The ``RestService`` requires an endpoint to which the query will be sent, +as well as the version of the SDMX-REST API that the endpoint supports. + +.. code-block:: python + + from pysdmx.api.qb import ApiVersion, RestService + + endpoint = "https://registry.sdmx.org/sdmx/v2/" + version = ApiVersion.V2_0_0 + service = RestService(endpoint, version) + resp = service.structure(query) + +In case the query requires features that are not available in the version +of the API supported by the endpoint, an error will be raised. + +Deserializing the response +-------------------------- + +The response of the web service will be returned as a sequence of bytes. +By default, the returned responses will be in the SDMX-JSON format, but +this can be configured when instantiating the ``RestService``. + +You can then process the response using your preferred library for the +requested format, such as, for example, Python ``json`` module, for SDMX-JSON +responses. + +Alternatively, for some messages, ``pysdmx.io.json.sdmxjson2`` deserializers +can be used. This is not well documented yet, as only a subset of messages +is currently supported, but further work will take place in this space. + +The code below shows how to do that, using one of the supported messages. + +.. 
code-block:: python + + import msgspec + + from pysdmx.api.qb import ( + ApiVersion, + RestService, + StructureQuery, + StructureType, + ) + from pysdmx.io.json.sdmxjson2.messages.code import JsonCodelistMessage + + # Step 1: Build your query + query = StructureQuery(StructureType.CODELIST, "SDMX", "CL_FREQ") + + # Step 2: Execute the query against your desired service + endpoint = "https://registry.sdmx.org/sdmx/v2/" + version = ApiVersion.V2_0_0 + service = RestService(endpoint, version) + resp = service.structure(query) + + # Step 3: Deserialize the response into a domain object + decoder = msgspec.json.Decoder(JsonCodelistMessage) + cl = decoder.decode(resp).to_model() + + # Step 4: Use the object the way you see fit + print(f"There are {len(cl.codes)} codes in the codelist") + + # Example output + # There are 34 codes in the codelist + + +For additional information about the query builders and the SDMX-REST service +class, please refer to the :ref:`API documentation`. \ No newline at end of file From 0c2defc70922d83271f777b26d3b7972ebac8417 Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 16:11:10 +0100 Subject: [PATCH 10/27] Update release note --- docs/release.rst | 29 +++++++++++------------------ 1 file changed, 11 insertions(+), 18 deletions(-) diff --git a/docs/release.rst b/docs/release.rst index 149617a8..bd769e7c 100644 --- a/docs/release.rst +++ b/docs/release.rst @@ -1,23 +1,16 @@ Release notes ============= -1.0.0-beta-10 ------------- +1.0.0 (2024-12) +--------------- -Added -^^^^^ +Features +^^^^^^^^ -- Schema generation supports Hierarchical Associations - referencing Provision Agreements - - -1.0.0-beta-1 ------------- - -Added -^^^^^ - -- Core SDMX model classes -- Sync and async clients to retrieve metadata - from an SDMX Registry or SDMX-REST service -- Helper functions to handle SDMX URNs +- Offer core domain classes for the SDMX information model. 
+- Offer sync and async clients to retrieve metadata + from an SDMX Registry or SDMX-REST service, and use them to + drive statistical business processes. +- Offer SDMX-REST query builders and a service client to execute + queries against SDMX-REST services. +- Offer functions to handle SDMX URNs. From e2c7a4852faaf1b9e3ce93aae5375f2a999ec5f8 Mon Sep 17 00:00:00 2001 From: Xavier Sosnovsky Date: Thu, 12 Dec 2024 16:19:53 +0100 Subject: [PATCH 11/27] Add documentation for VTL artefacts --- docs/api/model.rst | 3 ++- docs/api/model/vtl.rst | 12 ++++++++++++ docs/howto/vtl.rst | 6 ++++++ docs/index.rst | 1 + 4 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 docs/api/model/vtl.rst create mode 100644 docs/howto/vtl.rst diff --git a/docs/api/model.rst b/docs/api/model.rst index c2608579..a9a699c0 100644 --- a/docs/api/model.rst +++ b/docs/api/model.rst @@ -67,4 +67,5 @@ API Reference model/category model/map model/refmeta - model/organisation \ No newline at end of file + model/organisation + model/vtl \ No newline at end of file diff --git a/docs/api/model/vtl.rst b/docs/api/model/vtl.rst new file mode 100644 index 00000000..d2a0ff62 --- /dev/null +++ b/docs/api/model/vtl.rst @@ -0,0 +1,12 @@ +VTL artefacts +============= + +.. note:: + Additional information about how VTL artefacts can be used for + validating and transforming SDMX data is available in the + following tutorial: + + - :ref:`vtl`. + +.. automodule:: pysdmx.model.vtl + :members: CustomType, CustomTypeScheme, FromVtlMapping, NamePersonalisation, NamePersonalisationScheme, Ruleset, RulesetScheme, ToVtlMapping, Transformation, TransformationScheme, UserDefinedOperator, UserDefinedOperatorScheme, VtlCodelistMapping, VtlConceptMapping, VtlDataflowMapping, VtlMapping, VtlMappingScheme \ No newline at end of file diff --git a/docs/howto/vtl.rst b/docs/howto/vtl.rst new file mode 100644 index 00000000..980e6347 --- /dev/null +++ b/docs/howto/vtl.rst @@ -0,0 +1,6 @@ +.. 
_vtl: + +Using VTL for validation +======================== + +To be defined \ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst index 21cc0d51..ce7948f7 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -32,6 +32,7 @@ Your opinionated Python SDMX library. :caption: Handling SDMX datasets howto/data_rw + howto/vtl .. toctree:: From 742eb62e3be04f49ab5f6945abf217d45c4e3be6 Mon Sep 17 00:00:00 2001 From: "javier.hernandez" Date: Sat, 14 Dec 2024 08:55:01 +0100 Subject: [PATCH 12/27] Draft code for data reading and writing docs. Fixed relevant docstrings. Changed authors and copyright on relevant files. Signed-off-by: javier.hernandez --- docs/conf.py | 4 +- docs/howto/data_rw.rst | 218 ++++++++++++++++++- pyproject.toml | 4 +- src/pysdmx/io/csv/sdmx10/reader/__init__.py | 6 +- src/pysdmx/io/csv/sdmx10/writer/__init__.py | 6 +- src/pysdmx/io/csv/sdmx20/reader/__init__.py | 6 +- src/pysdmx/io/csv/sdmx20/writer/__init__.py | 3 +- src/pysdmx/io/pd.py | 9 +- src/pysdmx/io/xml/sdmx21/reader/__init__.py | 17 +- src/pysdmx/io/xml/sdmx21/writer/__init__.py | 6 +- tests/io/csv/sdmx10/writer/test_writer_v1.py | 17 ++ tests/io/csv/sdmx20/writer/test_writer_v2.py | 21 ++ 12 files changed, 288 insertions(+), 29 deletions(-) diff --git a/docs/conf.py b/docs/conf.py index 8f36501c..4e5fd088 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -19,8 +19,8 @@ # -- Project information ----------------------------------------------------- project = "pysdmx" -copyright = "2023, BIS" -author = "BIS" +copyright = "2024, BIS, MeaningfulData" +author = "BIS, MeaningfulData" # -- General configuration --------------------------------------------------- diff --git a/docs/howto/data_rw.rst b/docs/howto/data_rw.rst index 39771794..70e60631 100644 --- a/docs/howto/data_rw.rst +++ b/docs/howto/data_rw.rst @@ -3,4 +3,220 @@ Reading and writing SDMX datasets ================================= -To be defined \ No newline at end of file +``pysdmx`` allows reading and writing SDMX
datasets in the following formats: + +- SDMX-CSV 1.0 (located in ``pysdmx.io.csv.sdmx10``) +- SDMX-CSV 2.0 (located in ``pysdmx.io.csv.sdmx20``) +- SDMX-ML 2.1 (located in ``pysdmx.io.xml.sdmx21``) + - SDMX-ML 2.1 Generic + - SDMX-ML 2.1 Structure Specific + +Currently, all data-related readers and writers are based on the ``PandasDataset`` class. + +.. autoclass:: pysdmx.io.pd.PandasDataset + :show-inheritance: + :undoc-members: + +Reading data +------------ + +To read data, we may pass the string to the reading functions or use the input processor: + +.. automodule:: pysdmx.io.input_processor + :members: process_string_to_read + +A typical example of reading data from a file, a string or a buffer: + +.. code-block:: python + + from pysdmx.io.input_processor import process_string_to_read + # Import from desired reader + + # Read file sample.csv from the same folder as this code + file_path = Path(__file__).parent / "sample.csv" + input_str, extension = process_string_to_read(file_path) + + # Using reader, result will be a dictionary (key: dataset.short_urn, value: dataset) + datasets = read(input_str) + # Accessing the data of the test dataset + df = datasets["DataStructure=TEST_AGENCY:TEST_ID(1.0)"].data + +SDMX-CSV 1.0 +^^^^^^^^^^^^ + +`SDMX-CSV 1.0 specification `_ + +.. warning:: + + The SDMX-CSV 1.0 format is deprecated and should not be used for new implementations. + It only allows a dataflow to be represented, which is not enough for most use cases. + +.. autofunction:: pysdmx.io.csv.sdmx10.reader.read + +.. 
code-block:: python + + from pysdmx.io.input_processor import process_string_to_read + from pysdmx.io.csv.sdmx10.reader import read + from pathlib import Path + + # Read file sample10.csv from the same folder as this code + file_path = Path(__file__).parent / "sample10.csv" + input_str, extension = process_string_to_read(file_path) + + # Using reader, result will be a dictionary (key: dataset.short_urn, value: dataset) + datasets = read(input_str) + # Accessing the data of the test dataset + df = datasets["Dataflow=TEST_AGENCY:TEST_ID(1.0)"].data + +SDMX-CSV 2.0 +^^^^^^^^^^^^ + +`SDMX-CSV 2.0 specification `_ + +.. autofunction:: pysdmx.io.csv.sdmx20.reader.read + +We currently support only comma as the delimiter. +Only the `ordinary case `_ is supported. + +You may use a custom script for the remaining use cases. If you are interested in them, please +raise an issue in `GitHub `_. + +.. code-block:: python + + from pysdmx.io.input_processor import process_string_to_read + from pysdmx.io.csv.sdmx20.reader import read + from pathlib import Path + + # Read file from the same folder as this code + file_path = Path(__file__).parent / "sample20.csv" + input_str, extension = process_string_to_read(file_path) + + # Using reader, result will be a dictionary (key: dataset.short_urn, value: dataset) + datasets = read(input_str) + # Accessing the data of the test dataset + df = datasets["DataStructure=TEST_AGENCY:TEST_ID(1.0)"].data + +SDMX-ML 2.1 +^^^^^^^^^^^ + +SDMX-ML 2.1 format is described +`here (pdf file for IM) `_ + +.. autofunction:: pysdmx.io.xml.sdmx21.reader.read_xml + +We do not support the following elements: + +- Dimension Group +- Reference to Provision Agreement + +The reader supports both Generic and Structure Specific SDMX-ML 2.1. +It will automatically detect any structural validation errors (if ``validate=True``) and raise an exception. + +.. warning:: + + The SDMX-ML 2.1 Generic format is deprecated and should not be used for new implementations. 
+ SDMX-ML 3.0 only uses the Structure Specific format, which is more efficient and easier to use. + +.. code-block:: python + + from pysdmx.io.input_processor import process_string_to_read + from pysdmx.io.xml.sdmx21.reader import read_xml + from pathlib import Path + + # Read file from the same folder as this code + file_path = Path(__file__).parent / "sample21.xml" + input_str, extension = process_string_to_read(file_path) + + # Using reader, result will be a dictionary (key: dataset.short_urn, value: dataset) + datasets = read_xml(input_str, validate=True) + + # Accessing the data of the test dataset + df = datasets["DataStructure=TEST_AGENCY:TEST_ID(1.0)"].data + +Writing data +------------ + +``pysdmx`` can return the written data as a string or write it to a file. SDMX-CSV writers only allow one dataset to be written at a time, while SDMX-ML writers allow multiple datasets to be written at once. + +SDMX-CSV 1.0 +^^^^^^^^^^^^ + +`SDMX-CSV 1.0 specification `_ + +.. warning:: + + The SDMX-CSV 1.0 format is deprecated and should not be used for new implementations. + It only allows a dataflow to be represented, which is not enough for most use cases. + +.. autofunction:: pysdmx.io.csv.sdmx10.writer.writer + +.. code-block:: python + + from pysdmx.io.csv.sdmx10.writer import writer + from pathlib import Path + + # Write to file sample.csv in the same folder as this code + file_path = Path(__file__).parent / "sample.csv" + writer(dataset, file_path) + + +SDMX-CSV 2.0 +^^^^^^^^^^^^ + +`SDMX-CSV 2.0 specification `_ + +.. note:: + + The SDMX-CSV 2.0 writer will write the data as the `ordinary case `_. If you need to write data in other cases, you may need to write a custom script. + +.. warning:: + + Only the comma is supported as the delimiter. + +.. autofunction:: pysdmx.io.csv.sdmx20.writer.writer + +.. 
code-block:: python + + from pysdmx.io.csv.sdmx20.writer import writer + from pathlib import Path + + # Write to file sample.csv in the same folder as this code + file_path = Path(__file__).parent / "sample.csv" + writer(dataset, file_path) + +SDMX-ML 2.1 +^^^^^^^^^^^ + +SDMX-ML 2.1 format is described +`here (pdf file for IM) `_ + +The SDMX-ML 2.1 format allows writing multiple datasets at once. To use the Series format, pass a dimension-at-observation dictionary, where the key is the dataset short URN and the value is the ID of the dimension at observation. + +.. important:: + + For each dataset, if ``dataset.structure`` is not a Schema, the writer can only write in the Structure Specific All Dimensions format. + We perform a check to ensure that the dataset has a Schema structure for the remaining formats, as we need to know the roles for each component. This check also ensures that ``dataset.structure`` has at least one dimension and one measure defined. + +.. autofunction:: pysdmx.io.xml.sdmx21.writer.writer + +.. 
code-block:: python + + from pysdmx.io.xml.sdmx21.writer import writer + from pysdmx.io.xml.enums import MessageType + from pathlib import Path + + # Dictionary of datasets to write + datasets = { + "DataStructure=TEST_AGENCY:TEST_ID(1.0)": dataset + } + + # Dimension at observation mapping + dim_mapping = { + "DataStructure=TEST_AGENCY:TEST_ID(1.0)": "TIME_PERIOD" + } + + # Write to file sample.xml in the same folder as this code + file_path = Path(__file__).parent / "sample.xml" + writer(content=datasets, type_=MessageType.StructureSpecificDataSet, path=file_path, dimension_at_observation=dim_mapping) + diff --git a/pyproject.toml b/pyproject.toml index fb385b18..57909a8c 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,9 @@ version = "1.0.0-rc-5" description = "Your opinionated Python SDMX library" authors = [ "Xavier Sosnovsky ", - "Stratos Nikoloutsos " + "Stratos Nikoloutsos ", + "Francisco Javier Hernandez del Caño ", + "Mateo de Lorenzo Argeles " ] readme = "README.rst" documentation = "https://bis-med-it.github.io/pysdmx" diff --git a/src/pysdmx/io/csv/sdmx10/reader/__init__.py b/src/pysdmx/io/csv/sdmx10/reader/__init__.py index 2e43a9fe..ebdea942 100644 --- a/src/pysdmx/io/csv/sdmx10/reader/__init__.py +++ b/src/pysdmx/io/csv/sdmx10/reader/__init__.py @@ -37,13 +37,13 @@ def __generate_dataset_from_sdmx_csv(data: pd.DataFrame) -> PandasDataset: def read(infile: str) -> Dict[str, PandasDataset]: - """Reads csv file and returns a payload dictionary. + """Reads csv file and returns a dictionary. Args: - infile: Path to file, str. + infile: A string containing the CSV file, comma separated. Returns: - payload: dict. + A dictionary containing the datasets with the short URN as keys. Raises: Invalid: If it is an invalid CSV file. 
diff --git a/src/pysdmx/io/csv/sdmx10/writer/__init__.py b/src/pysdmx/io/csv/sdmx10/writer/__init__.py index a740e302..fb72149a 100644 --- a/src/pysdmx/io/csv/sdmx10/writer/__init__.py +++ b/src/pysdmx/io/csv/sdmx10/writer/__init__.py @@ -1,6 +1,7 @@ """SDMX 1.0 CSV writer module.""" from copy import copy +from pathlib import Path from typing import Optional import pandas as pd @@ -9,13 +10,14 @@ def writer( - dataset: PandasDataset, output_path: Optional[str] = None + dataset: PandasDataset, output_path: Optional[Path] = None ) -> Optional[str]: """Converts a dataset to an SDMX CSV format. Args: dataset: dataset - output_path: output_path + output_path: Path to file, if None, returns the + SDMX CSV data as a string Returns: SDMX CSV data as a string diff --git a/src/pysdmx/io/csv/sdmx20/reader/__init__.py b/src/pysdmx/io/csv/sdmx20/reader/__init__.py index 4f4839d1..cff83773 100644 --- a/src/pysdmx/io/csv/sdmx20/reader/__init__.py +++ b/src/pysdmx/io/csv/sdmx20/reader/__init__.py @@ -88,13 +88,13 @@ def __generate_dataset_from_sdmx_csv(data: pd.DataFrame) -> PandasDataset: def read(infile: str) -> Dict[str, PandasDataset]: - """Reads csv file and returns a payload dictionary. + """Reads csv file and returns a dictionary. Args: - infile: Path to file, str. + infile: A string containing the CSV file, comma separated. Returns: - payload: dict. + A dictionary containing the datasets with the short URN as keys. Raises: Invalid: If it is an invalid CSV file. 
diff --git a/src/pysdmx/io/csv/sdmx20/writer/__init__.py b/src/pysdmx/io/csv/sdmx20/writer/__init__.py index ea97d4c7..f1f90deb 100644 --- a/src/pysdmx/io/csv/sdmx20/writer/__init__.py +++ b/src/pysdmx/io/csv/sdmx20/writer/__init__.py @@ -1,6 +1,7 @@ """SDMX 2.0 CSV writer module.""" from copy import copy +from pathlib import Path from typing import Optional import pandas as pd @@ -10,7 +11,7 @@ def writer( - dataset: PandasDataset, output_path: Optional[str] = None + dataset: PandasDataset, output_path: Optional[Path] = None ) -> Optional[str]: """Converts a dataset to an SDMX CSV format. diff --git a/src/pysdmx/io/pd.py b/src/pysdmx/io/pd.py index ae92fd18..93323228 100644 --- a/src/pysdmx/io/pd.py +++ b/src/pysdmx/io/pd.py @@ -12,12 +12,11 @@ class PandasDataset(Dataset, frozen=False, kw_only=True): withhold data. Args: - attributes: Attributes at dataset level- - data: Dataframe. + attributes: Attributes at dataset level. + data: Pandas Dataframe. structure: - URN or Schema related to this Dataset - (DSD, Dataflow, ProvisionAgreement) - short_urn: Combination of Agency_id, Id and Version. + URN or Schema related to this Dataset + (DSD, Dataflow, ProvisionAgreement) """ data: pd.DataFrame diff --git a/src/pysdmx/io/xml/sdmx21/reader/__init__.py b/src/pysdmx/io/xml/sdmx21/reader/__init__.py index 3173ea90..edf2291d 100644 --- a/src/pysdmx/io/xml/sdmx21/reader/__init__.py +++ b/src/pysdmx/io/xml/sdmx21/reader/__init__.py @@ -1,4 +1,4 @@ -"""SDMX 2.1 XML reader package.""" +"""SDMX 2.1 XML reader module.""" from typing import Any, Dict, Optional @@ -41,7 +41,7 @@ def read_xml( - infile: str, + input: str, validate: bool = True, mode: Optional[MessageType] = None, use_dataset_id: bool = False, @@ -49,25 +49,26 @@ def read_xml( """Reads an SDMX-ML file and returns a dictionary with the parsed data. Args: - infile: Path to file, URL, or string. + input: String to parse. validate: If True, the XML data will be validated against the XSD. 
mode: The type of message to parse. use_dataset_id: If True, the dataset ID will be used as the key in the resulting dictionary. + If False, the short URN will be used as the key. Returns: - dict: Dictionary with the parsed data. + A dictionary containing the datasets. Raises: - Invalid: If the SDMX data cannot be parsed. + Invalid: If the SDMX-ML 2.1 data message cannot be parsed. """ if validate: - validate_doc(infile) + validate_doc(input) dict_info = xmltodict.parse( - infile, **XML_OPTIONS # type: ignore[arg-type] + input, **XML_OPTIONS # type: ignore[arg-type] ) - del infile + del input if mode is not None and MODES[mode.value] not in dict_info: raise Invalid( diff --git a/src/pysdmx/io/xml/sdmx21/writer/__init__.py b/src/pysdmx/io/xml/sdmx21/writer/__init__.py index 13744e0a..142210f5 100644 --- a/src/pysdmx/io/xml/sdmx21/writer/__init__.py +++ b/src/pysdmx/io/xml/sdmx21/writer/__init__.py @@ -1,5 +1,5 @@ """SDMX 2.1 writer package.""" - +from pathlib import Path from typing import Any, Dict, Optional from pysdmx.errors import NotImplemented @@ -18,7 +18,7 @@ def writer( content: Dict[str, Any], type_: MessageType, - path: str = "", + path: Optional[Path] = None, prettyprint: bool = True, header: Optional[Header] = None, ) -> Optional[str]: @@ -52,7 +52,7 @@ def writer( outfile += get_end_message(type_, prettyprint) - if path == "": + if path is None: return outfile with open(path, "w", encoding="UTF-8", errors="replace") as f: diff --git a/tests/io/csv/sdmx10/writer/test_writer_v1.py b/tests/io/csv/sdmx10/writer/test_writer_v1.py index 2113db24..a66328ba 100644 --- a/tests/io/csv/sdmx10/writer/test_writer_v1.py +++ b/tests/io/csv/sdmx10/writer/test_writer_v1.py @@ -43,6 +43,23 @@ def test_to_sdmx_csv_writing(data_path, data_path_reference): check_like=True, ) +def test_to_sdmx_csv_writing_to_file(data_path, data_path_reference, tmpdir): + urn = "urn:sdmx:org.sdmx.infomodel.datastructure.DataFlow=MD:DS1(1.0)" + + dataset = PandasDataset( + attributes={}, + 
data=pd.read_json(data_path, orient="records"), + structure=urn, + ) + dataset.data = dataset.data.astype("str") + writer(dataset, output_path=tmpdir / "output.csv") + result_df = pd.read_csv(tmpdir / "output.csv").astype(str) + reference_df = pd.read_csv(data_path_reference).astype(str) + pd.testing.assert_frame_equal( + result_df.fillna("").replace("nan", ""), + reference_df.replace("nan", ""), + check_like=True, + ) def test_writer_attached_attrs(data_path, data_path_reference_atch_atts): urn = "urn:sdmx:org.sdmx.infomodel.datastructure.DataFlow=MD:DS1(1.0)" diff --git a/tests/io/csv/sdmx20/writer/test_writer_v2.py b/tests/io/csv/sdmx20/writer/test_writer_v2.py index 5e7d246e..53d1211d 100644 --- a/tests/io/csv/sdmx20/writer/test_writer_v2.py +++ b/tests/io/csv/sdmx20/writer/test_writer_v2.py @@ -53,6 +53,27 @@ def test_to_sdmx_csv_writing(data_path, data_path_reference): check_like=True, ) +def test_to_sdmx_csv_writing_to_file(data_path, data_path_reference, tmpdir): + urn = ( + "urn:sdmx:org.sdmx.infomodel.registry." + "ProvisionAgreement=MD:PA1(1.0)" + ) + dataset = PandasDataset( + attributes={}, + data=pd.read_json(data_path, orient="records"), + structure=urn, + ) + dataset.data = dataset.data.astype("str") + writer(dataset, output_path=tmpdir / "output.csv") + result_df = pd.read_csv(tmpdir / "output.csv").astype(str) + reference_df = pd.read_csv(data_path_reference).astype(str) + pd.testing.assert_frame_equal( + result_df.fillna("").replace("nan", ""), + reference_df.replace("nan", ""), + check_like=True, + ) + + def test_writer_attached_attrs(data_path, data_path_reference_attch_atts): dataset = PandasDataset( From 1b8535f08c95aad4971af3b88d41d9ce94c94cee Mon Sep 17 00:00:00 2001 From: "javier.hernandez" Date: Thu, 19 Dec 2024 18:06:04 +0100 Subject: [PATCH 13/27] Linting changes. 
Signed-off-by: javier.hernandez --- src/pysdmx/io/xml/sdmx21/reader/__init__.py | 3 ++- src/pysdmx/io/xml/sdmx21/writer/__init__.py | 1 + tests/io/csv/sdmx10/writer/test_writer_v1.py | 2 ++ tests/io/csv/sdmx20/writer/test_writer_v2.py | 2 +- 4 files changed, 6 insertions(+), 2 deletions(-) diff --git a/src/pysdmx/io/xml/sdmx21/reader/__init__.py b/src/pysdmx/io/xml/sdmx21/reader/__init__.py index edf2291d..33aaf8c8 100644 --- a/src/pysdmx/io/xml/sdmx21/reader/__init__.py +++ b/src/pysdmx/io/xml/sdmx21/reader/__init__.py @@ -65,7 +65,8 @@ def read_xml( if validate: validate_doc(input) dict_info = xmltodict.parse( - input, **XML_OPTIONS # type: ignore[arg-type] + input, + **XML_OPTIONS, # type: ignore[arg-type] ) del input diff --git a/src/pysdmx/io/xml/sdmx21/writer/__init__.py b/src/pysdmx/io/xml/sdmx21/writer/__init__.py index 3270698d..c9b54d60 100644 --- a/src/pysdmx/io/xml/sdmx21/writer/__init__.py +++ b/src/pysdmx/io/xml/sdmx21/writer/__init__.py @@ -1,4 +1,5 @@ """SDMX 2.1 writer package.""" + from pathlib import Path from typing import Any, Dict, Optional diff --git a/tests/io/csv/sdmx10/writer/test_writer_v1.py b/tests/io/csv/sdmx10/writer/test_writer_v1.py index bbc636e8..615ea072 100644 --- a/tests/io/csv/sdmx10/writer/test_writer_v1.py +++ b/tests/io/csv/sdmx10/writer/test_writer_v1.py @@ -43,6 +43,7 @@ def test_to_sdmx_csv_writing(data_path, data_path_reference): check_like=True, ) + def test_to_sdmx_csv_writing_to_file(data_path, data_path_reference, tmpdir): urn = "urn:sdmx:org.sdmx.infomodel.datastructure.DataFlow=MD:DS1(1.0)" @@ -61,6 +62,7 @@ def test_to_sdmx_csv_writing_to_file(data_path, data_path_reference, tmpdir): check_like=True, ) + def test_writer_attached_attrs(data_path, data_path_reference_atch_atts): urn = "urn:sdmx:org.sdmx.infomodel.datastructure.DataFlow=MD:DS1(1.0)" dataset = PandasDataset( diff --git a/tests/io/csv/sdmx20/writer/test_writer_v2.py b/tests/io/csv/sdmx20/writer/test_writer_v2.py index a4ca02b4..8425bba9 100644 --- 
a/tests/io/csv/sdmx20/writer/test_writer_v2.py +++ b/tests/io/csv/sdmx20/writer/test_writer_v2.py @@ -53,6 +53,7 @@ def test_to_sdmx_csv_writing(data_path, data_path_reference): check_like=True, ) + def test_to_sdmx_csv_writing_to_file(data_path, data_path_reference, tmpdir): urn = ( "urn:sdmx:org.sdmx.infomodel.registry." @@ -74,7 +75,6 @@ def test_to_sdmx_csv_writing_to_file(data_path, data_path_reference, tmpdir): ) - def test_writer_attached_attrs(data_path, data_path_reference_attch_atts): dataset = PandasDataset( attributes={"DECIMALS": 3}, From 9c629b5452e2d9347d95d2ba659328a37e3a8cd1 Mon Sep 17 00:00:00 2001 From: "javier.hernandez" Date: Thu, 19 Dec 2024 18:07:21 +0100 Subject: [PATCH 14/27] Added VTL example. Signed-off-by: javier.hernandez --- docs/howto/vtl.rst | 129 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 127 insertions(+), 2 deletions(-) diff --git a/docs/howto/vtl.rst b/docs/howto/vtl.rst index 980e6347..4a684c9b 100644 --- a/docs/howto/vtl.rst +++ b/docs/howto/vtl.rst @@ -1,6 +1,131 @@ .. _vtl: -Using VTL for validation -======================== +Using VTL for Validation +^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. important:: + A seamless integration of ``pysdmx`` and ``vtlengine`` will modify this + tutorial. The current version is a placeholder for the upcoming changes. + For the latest updates, please check + `issue #158 `_. + +In this tutorial, we shall examine the utilization of ``pysdmx`` +for reading **data** and **metadata** to generate and operate on +datapoints using ``vtlengine``. + +Numerous types of operations can be performed; however, this +tutorial will focus exclusively on the fundamental ones. + +.. contents:: + :local: + :depth: 2 + +Required Metadata +----------------- + +For the present scenario, the requisite metadata is contingent +upon the desired operations. 
For reference please check +`sdmx to vtl documentation `_ + +Step-by-Step Solution +--------------------- + +``pysdmx`` facilitates the reading of data and metadata from an SDMX +file. For the purpose of this tutorial, we shall employ the XML files +``metadata.xml`` (data structure) and ``data.xml`` (data). + +Reading the Data +~~~~~~~~~~~~~~~~ + +The initial step involves reading the data structure and data from the +SDMX files: + +.. code-block:: python + + def read_sample(path: Path): + with open(path, "r") as f: + return f.read() + + # Read metadata + metadata_sample = read_sample(Path("metadata.xml")) + meta_content, filetype = process_string_to_read(metadata_sample) + metadata_result = read_xml(meta_content, validate=True) + + # Read data + data_sample = read_sample(Path("data.xml")) + data_content, filetype = process_string_to_read(data_sample) + data_result = read_xml(data_content, validate=True) + +Filtering the Data +~~~~~~~~~~~~~~~~~~ + +Subsequent to obtaining the metadata and data, the desired dataflows and +data structures must be filtered: + +.. code-block:: python + + data_structure_1 = metadata_result["DataStructures"]["DS_1"] + data_1 = data_result["DS_1"].data + + data_structure_2 = metadata_result["DataStructures"]["DS_2"] + data_2 = data_result["DS_2"].data + +Parsing the Metadata +~~~~~~~~~~~~~~~~~~~~ + +To construct the datapoint, the metadata must be converted to the VTL +format using the ``to_vtl_json`` upcoming **DataStructureDefinition** method: + +.. 
code-block:: python + + from pysdmx.model.dataflow import Component, DataStructureDefinition, Role + from pysdmx.model.__utils import VTL_DTYPES_MAPPING, VTL_ROLE_MAPPING + + def to_vtl_json( + dsd: DataStructureDefinition, path: Optional[str] = None + ) -> Optional[Dict[str, Any]]: + """Formats the DataStructureDefinition as a VTL DataStructure.""" + + dataset_name = dsd.id + components = [] + NAME = "name" + ROLE = "role" + TYPE = "type" + NULLABLE = "nullable" + + _components: List[Component] = [] + _components.extend(dsd.components.dimensions) + _components.extend(dsd.components.measures) + _components.extend(dsd.components.attributes) + + for c in _components: + _type = VTL_DTYPES_MAPPING[c.dtype] + _nullability = c.role != Role.DIMENSION + _role = VTL_ROLE_MAPPING[c.role] + + component = { + NAME: c.id, + ROLE: _role, + TYPE: _type, + NULLABLE: _nullability, + } + + components.append(component) + + result = { + "datasets": [{"name": dataset_name, "DataStructure": components}] + } + if path is not None: + with open(path, "w") as fp: + json.dump(result, fp) + return None + + return result + + vtl_data_structure_1 = to_vtl_json(data_structure_1) + vtl_data_structure_2 = to_vtl_json(data_structure_2) + +Preparing the Dictionary +~~~~~~~~~~~~~~~~~~~~~~~~ To be defined \ No newline at end of file From 535ca867ad7bbee859fa66aae0a5b1c7a7b5ad27 Mon Sep 17 00:00:00 2001 From: "javier.hernandez" Date: Thu, 19 Dec 2024 18:26:07 +0100 Subject: [PATCH 15/27] Updated VTL example. 
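For context, the layout that the ``to_vtl_json`` helper from the previous patch produces can be sketched without pysdmx installed. This is only an illustration under stated assumptions: ``SimpleComponent``, ``ROLE_MAPPING``, ``DTYPE_MAPPING`` and ``to_vtl_structure`` are hypothetical stand-ins, not pysdmx names, and the mapped role and type labels are assumed rather than taken from the library.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List

# Hypothetical stand-ins for the pysdmx mappings, which this patch does not show.
ROLE_MAPPING = {
    "dimension": "Identifier",
    "measure": "Measure",
    "attribute": "Attribute",
}
DTYPE_MAPPING = {"string": "String", "integer": "Integer", "double": "Number"}


@dataclass
class SimpleComponent:
    """Hypothetical, simplified replacement for a pysdmx Component."""

    id: str
    role: str
    dtype: str


def to_vtl_structure(
    name: str, components: List[SimpleComponent]
) -> Dict[str, Any]:
    """Build a VTL-style data structure dictionary from simple components."""
    vtl_components = [
        {
            "name": c.id,
            "role": ROLE_MAPPING[c.role],
            "type": DTYPE_MAPPING[c.dtype],
            # As in the patch, only dimensions are treated as non-nullable.
            "nullable": c.role != "dimension",
        }
        for c in components
    ]
    return {"datasets": [{"name": name, "DataStructure": vtl_components}]}


example = to_vtl_structure(
    "DS_1",
    [
        SimpleComponent("REF_AREA", "dimension", "string"),
        SimpleComponent("OBS_VALUE", "measure", "double"),
    ],
)
print(json.dumps(example, indent=2))
```

Each dataset entry carries its name plus one dictionary per component, with dimensions first, which mirrors the ordering used by the helper in the patch.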
Signed-off-by: javier.hernandez
---
 docs/howto/vtl.rst | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/docs/howto/vtl.rst b/docs/howto/vtl.rst
index 4a684c9b..6b1c83cb 100644
--- a/docs/howto/vtl.rst
+++ b/docs/howto/vtl.rst
@@ -128,4 +128,48 @@ format using the ``to_vtl_json`` upcoming **DataStructureDefinition** method:
 Preparing the Dictionary
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
-To be defined
\ No newline at end of file
+To create the datapoint, a dictionary containing the required data and
+structures must first be prepared. The arguments `data_structures` and
+`datapoints` support the following types:
+
+- `Dict[str, Any]`
+- `Path`
+- `List[Union[Dict[str, Any], Path]]`
+
+The example below uses dictionaries for simplicity:
+
+.. code-block:: python
+
+    vtl_data_structures = {
+        "DS_1": vtl_data_structure_1,
+        "DS_2": vtl_data_structure_2,
+    }
+
+    datapoints = {
+        "DS_1": data_1,
+        "DS_2": data_2,
+    }
+
+Defining the Expression and Execution
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Next, define the expression to be executed and utilize the ``run``
+function of ``vtlengine`` to perform the operation. The following example
+demonstrates the addition of the datapoints `DS_1` and `DS_2`, with the
+result assigned to a new datapoint `DS_r`:
+
+For reference please check
+`vtlengine run documentation `_
+
+.. code-block:: python
+
+    from vtlengine import run
+
+    expression = "DS_r <- DS_1 + DS_2;"
+
+    run_result = run(
+        script=expression,
+        data_structures=vtl_data_structures,
+        datapoints=datapoints,
+        return_only_persistent=True,
+    )

From 6b2c025de639cc40aadd1c3279c581ac17a3ba68 Mon Sep 17 00:00:00 2001
From: "javier.hernandez"
Date: Fri, 20 Dec 2024 09:38:29 +0100
Subject: [PATCH 16/27] Updated release notes.
Signed-off-by: javier.hernandez
---
 docs/release.rst | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/docs/release.rst b/docs/release.rst
index bd769e7c..31bab2c6 100644
--- a/docs/release.rst
+++ b/docs/release.rst
@@ -1,7 +1,7 @@
 Release notes
 =============
 
-1.0.0 (2024-12)
+1.0.0 (2024-12-20)
 ---------------
 
 Features
@@ -14,3 +14,13 @@ Features
 - Offer SDMX-REST query builders and a service client to execute queries
   against SDMX-REST services.
 - Offer functions to handle SDMX URNs.
+- Offer data readers and writers for the following formats:
+  - SDMX-ML 2.1
+    - GenericData (Series & AllDimensions)
+    - StructureSpecificData (Series & AllDimensions)
+  - SDMX-CSV 2.0
+  - SDMX-CSV 1.0
+- Offer structure readers and writers for the following formats:
+  - SDMX-ML 2.1
+  - SDMX-JSON 2.0
+  - FusionJSON

From b4a34aa013ea0bee45a08a421a81b946763b0107 Mon Sep 17 00:00:00 2001
From: "javier.hernandez"
Date: Tue, 14 Jan 2025 09:51:06 +0100
Subject: [PATCH 17/27] Updated copyright on conf.

Signed-off-by: javier.hernandez
---
 docs/conf.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/conf.py b/docs/conf.py
index db29a8c5..29dc8523 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -21,7 +21,7 @@
 
 # -- Project information -----------------------------------------------------
 
 project = "pysdmx"
-copyright = "2024, BIS, MeaningfulData"
+copyright = "2025, BIS"
 author = "BIS, MeaningfulData"
 

From 2403d815318b5abb6c177e093ee4947942fb339f Mon Sep 17 00:00:00 2001
From: "javier.hernandez"
Date: Tue, 14 Jan 2025 09:51:49 +0100
Subject: [PATCH 18/27] Updated vtl docs with better wording.

Signed-off-by: javier.hernandez
---
 docs/howto/vtl.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/howto/vtl.rst b/docs/howto/vtl.rst
index 6b1c83cb..4dee0de5 100644
--- a/docs/howto/vtl.rst
+++ b/docs/howto/vtl.rst
@@ -23,7 +23,7 @@ tutorial will focus exclusively on the fundamental ones.
Required Metadata ----------------- -For the present scenario, the requisite metadata is contingent +For the present scenario, the required metadata is contingent upon the desired operations. For reference please check `sdmx to vtl documentation `_ @@ -31,7 +31,7 @@ Step-by-Step Solution --------------------- ``pysdmx`` facilitates the reading of data and metadata from an SDMX -file. For the purpose of this tutorial, we shall employ the XML files +file or service. For the purpose of this tutorial, we shall employ the XML files ``metadata.xml`` (data structure) and ``data.xml`` (data). Reading the Data From 90e6d5bef035d2dc2cd386d00009489defd52d6d Mon Sep 17 00:00:00 2001 From: "javier.hernandez" Date: Tue, 14 Jan 2025 10:47:57 +0100 Subject: [PATCH 19/27] Updated vtl docs with new io package functions. Signed-off-by: javier.hernandez --- docs/howto/vtl.rst | 54 +++++++++++++++++++++++++--------------------- 1 file changed, 29 insertions(+), 25 deletions(-) diff --git a/docs/howto/vtl.rst b/docs/howto/vtl.rst index 4dee0de5..1bb9463a 100644 --- a/docs/howto/vtl.rst +++ b/docs/howto/vtl.rst @@ -5,8 +5,9 @@ Using VTL for Validation .. important:: A seamless integration of ``pysdmx`` and ``vtlengine`` will modify this - tutorial. The current version is a placeholder for the upcoming changes. - For the latest updates, please check + tutorial. The current version is a placeholder for the upcoming changes + showing the use of both libraries separated. + For the latest updates on VTL usage, please check `issue #158 `_. In this tutorial, we shall examine the utilization of ``pysdmx`` @@ -32,46 +33,49 @@ Step-by-Step Solution ``pysdmx`` facilitates the reading of data and metadata from an SDMX file or service. For the purpose of this tutorial, we shall employ the XML files -``metadata.xml`` (data structure) and ``data.xml`` (data). +``structures.xml`` (data structure) and ``data.csv`` (data). 
Reading the Data ~~~~~~~~~~~~~~~~ The initial step involves reading the data structure and data from the -SDMX files: +SDMX files. The following code snippet demonstrates the process: .. code-block:: python - def read_sample(path: Path): - with open(path, "r") as f: - return f.read() + from pathlib import Path - # Read metadata - metadata_sample = read_sample(Path("metadata.xml")) - meta_content, filetype = process_string_to_read(metadata_sample) - metadata_result = read_xml(meta_content, validate=True) + # Path to the structures file in SDMX-ML 2.1 (same directory as this script) + path_to_structures = Path(__file__).parent / "structures.xml" - # Read data - data_sample = read_sample(Path("data.xml")) - data_content, filetype = process_string_to_read(data_sample) - data_result = read_xml(data_content, validate=True) + # Path to the data file + path_to_data = Path(__file__).parent / "data.csv" -Filtering the Data -~~~~~~~~~~~~~~~~~~ + # Get Structures SDMX Message + structures_msg = read_sdmx(path_to_structures) -Subsequent to obtaining the metadata and data, the desired dataflows and -data structures must be filtered: + # Get Data message + data_msg = read_sdmx(path_to_data) + + +Extracting the Data and Data Structure +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +After reading the data and metadata, the next step is to extract the +data and data structure from the SDMX messages. The following code snippet demonstrates +the process using the Short URN ``SDMX_TYPE=AGENCY_ID:ID(VERSION)`` .. 
code-block:: python - data_structure_1 = metadata_result["DataStructures"]["DS_1"] - data_1 = data_result["DS_1"].data + # Extract the data structure and data for DS_1 + data_structure_1 = structures_msg.get_data_structure_definition("DataStructure=MD:DS_1(1.0)") + data_1 = data_msg.get_data("DataStructure=MD:DS_1(1.0)") + + # Extract the data structure and data for DS_2 + data_structure_2 = structures_msg.get_data_structure_definition("DataStructure=BIS:DS_2(1.0)") + data_2 = data_msg.get_data("DataStructure=BIS:DS_2(1.0)") - data_structure_2 = metadata_result["DataStructures"]["DS_2"] - data_2 = data_result["DS_2"].data -Parsing the Metadata -~~~~~~~~~~~~~~~~~~~~ To construct the datapoint, the metadata must be converted to the VTL format using the ``to_vtl_json`` upcoming **DataStructureDefinition** method: From d857a0427308408a80cb2f6179e8ae10824146e7 Mon Sep 17 00:00:00 2001 From: "javier.hernandez" Date: Tue, 14 Jan 2025 10:49:04 +0100 Subject: [PATCH 20/27] Updated rw functions with new io functions. Signed-off-by: javier.hernandez --- docs/howto/data_rw.rst | 79 ++++++++++++++++++++++++++++++++---------- 1 file changed, 60 insertions(+), 19 deletions(-) diff --git a/docs/howto/data_rw.rst b/docs/howto/data_rw.rst index 70e60631..b0b98bae 100644 --- a/docs/howto/data_rw.rst +++ b/docs/howto/data_rw.rst @@ -20,26 +20,66 @@ Currently, all data-related readers and writers are based on PandasDataset class Reading data ------------ -To read data, we may pass the string to the reading functions or use the input processor: +To read data, we recommend using the read_sdmx function or the get_datasets function: -.. automodule:: pysdmx.io.input_processor - :members: process_string_to_read +.. automodule:: pysdmx.io.reader + :members: read_sdmx -A typical example to read data from a file, a string or a buffer +A typical example to read data from a file, a string or a buffer, using read_sdmx .. 
code-block:: python - from pysdmx.io.input_processor import process_string_to_read - # Import from desired reader + from pysdmx.io import read_sdmx - # Read file sample.csv from the same folder as this code - file_path = Path(__file__).parent / "sample.csv" - input_str, extension = process_string_to_read(file_path) + # Read file from the same folder as this code + file_path = Path(__file__).parent / "sample.csv" - # Using reader, result will be a dictionary (key: dataset.short_urn, value: dataset) - datasets = read(input_str) - # Accessing the data of the test dataset - df = dataset["DataStructure=TEST_AGENCY:TEST_ID(1.0)"].data + # Read from file + data_msg = read_sdmx(file_path) + + # Read from URL + data_msg = read_sdmx("https://example.com/sample.csv") + + # Extracting the datasets (list of Dataset) + datasets = data_msg.data + + # Accessing the data of the test dataset by its Short URN + df = data_msg.get_dataset("DataStructure=TEST_AGENCY:TEST_ID(1.0)").data + + # Accessing the data of the test dataset by its position in the SDMX Message + df = data_msg.data[0].data + + +By default, the read_sdmx function will automatically detect the format of the file and use the appropriate reader. We may as well use the get_datasets to associate a dataset to its Schema: + +.. automodule:: pysdmx.io.reader + :members: get_datasets + +.. important:: + + If the structures message is used, the get_datasets function will associate the dataset to its Schema. If the structures message is not used, the get_datasets function will return a list of datasets without any Schema association. + If a dataset references a dataflow, the structure message requires to have the dataflow children (or all descendants), i.e. the DataStructureDefinitions associated to this Dataflow in the same SDMX Message (with or without referenced artefacts like Codelists, ConceptSchemes, etc). + +.. 
code-block:: python + + from pysdmx.io import get_datasets + + # Read file from the same folder as this code (SDMX-CSV 2.0) + data_path = Path(__file__).parent / "sample.csv" + + # Data contains a reference to the dataflow ``Dataflow=MD:TEST(1.0)`` + datasets = get_datasets(data_path) + + print(datasets[0].structure) # Outputs a string with the Schema Short URN -> "Dataflow=MD:TEST(1.0)" + + # Reading the datasets and associating the schema + datasets = get_datasets(data_path, "https://example.com/dataflow/MD/TEST/1.0?references=descendants") + + print(datasets[0].structure) # Outputs a Schema object with the associated components + + +Both methods are based on the individual readers for each format supported, which are described below. +All individual readers will have as input a string. SDMX-CSV 1.0 ^^^^^^^^^^^^ @@ -57,16 +97,17 @@ SDMX-CSV 1.0 from pysdmx.io.input_processor import process_string_to_read from pysdmx.io.csv.sdmx10.reader import read + from pathlib import Path # Read file sample.csv from the same folder as this code file_path = Path(__file__).parent / "sample10.csv" - input_str, extension = process_string_to_read(file_path) + input_str, format = process_string_to_read(file_path) - # Using reader, result will be a dictionary (key: dataset.short_urn, value: dataset) + # Using reader, result will be a list of datasets datasets = read(input_str) # Accessing the data of the test dataset - df = dataset["Dataflow=TEST_AGENCY:TEST_ID(1.0)"].data + df = dataset[0].data SDMX-CSV 2.0 ^^^^^^^^^^^^ @@ -90,12 +131,12 @@ raise an issue in `GitHub `_. 
     # Read file from the same folder as this code
     file_path = Path(__file__).parent / "sample20.csv"
-    input_str, extension = process_string_to_read(file_path)
+    input_str, format = process_string_to_read(file_path)
 
-    # Using reader, result will be a dictionary (key: dataset.short_urn, value: dataset)
+    # Using reader, result will be a list of datasets
     datasets = read(input_str)
 
     # Accessing the data of the test dataset
-    df = dataset["DataStructure=TEST_AGENCY:TEST_ID(1.0)"].data
+    df = datasets[0].data
 
 SDMX-ML 2.1
 ^^^^^^^^^^^

From c59e7d2a772c8d49b54904e3bed76e70f02a89bb Mon Sep 17 00:00:00 2001
From: "javier.hernandez"
Date: Tue, 14 Jan 2025 16:10:47 +0100
Subject: [PATCH 21/27] Updated data_rw documentation.

Signed-off-by: javier.hernandez
---
 docs/howto/data_rw.rst | 52 ++++++++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/docs/howto/data_rw.rst b/docs/howto/data_rw.rst
index b0b98bae..09b8fa04 100644
--- a/docs/howto/data_rw.rst
+++ b/docs/howto/data_rw.rst
@@ -138,13 +138,17 @@ raise an issue in `GitHub `_.
     # Accessing the data of the test dataset
     df = datasets[0].data
 
-SDMX-ML 2.1
-^^^^^^^^^^^
+SDMX-ML 2.1 Data Readers
+^^^^^^^^^^^^^^^^^^^^^^^^
 
 SDMX-ML 2.1 format is described
 `here (pdf file for IM) `_
 
-.. autofunction:: pysdmx.io.xml.sdmx21.reader.read_xml
+``pysdmx`` supports both the Generic and Structure Specific SDMX-ML 2.1 data formats, each in either All Dimensions or Series layout.
+
+.. autofunction:: pysdmx.io.xml.sdmx21.reader.generic.read
+
+.. autofunction:: pysdmx.io.xml.sdmx21.reader.structure_specific.read
 
 We do not support the following elements:
 
@@ -156,24 +160,25 @@ It will automatically detect any structural validation errors (if validate=True)
 
 .. warning::
 
-    The SDMX-ML 2.1 Generic format is deprecated and should not be used for new implementations.
-    SDMX-ML 3.0 only uses the Structure Specific format, which is more efficient and easier to use.
+ The SDMX-ML 2.1 Generic format is deprecated and should not be used for new implementations. SDMX-ML 3.0 only uses the Structure Specific format, which is more efficient and easier to use. .. code-block:: python from pysdmx.io.input_processor import process_string_to_read - from pysdmx.io.xml.sdmx21.reader import read_xml + from pysdmx.io.xml.sdmx21.reader.generic import read as read_generic # For Generic format + from pysdmx.io.xml.sdmx21.reader.structure_specific import read # For Structure Specific format from pathlib import Path # Read file from the same folder as this code file_path = Path(__file__).parent / "sample21.xml" - input_str, extension = process_string_to_read(file_path) + input_str, format = process_string_to_read(file_path) - # Using reader, result will be a dictionary (key: dataset.short_urn, value: dataset) - datasets = read_xml(input_str, validate=True) + # Using reader, result will be a list of datasets + datasets = read(input_str, validate=True) # Accessing the data of the test dataset - df = dataset["DataStructure=TEST_AGENCY:TEST_ID(1.0)"].data + df = dataset[0].data + Writing data ------------ @@ -199,7 +204,9 @@ SDMX-CSV 1.0 # Write to file sample.csv in the same folder as this code file_path = Path(__file__).parent / "sample.csv" - writer(dataset, file_path) + + # Write the datasets (list of Dataset or PandasDataset) to the file + writer(datasets, file_path) SDMX-CSV 2.0 @@ -226,8 +233,8 @@ SDMX-CSV 2.0 file_path = Path(__file__).parent / "sample.csv" writer(dataset, file_path) -SDMX-ML 2.1 -^^^^^^^^^^^ +SDMX-ML 2.1 Data Writers +^^^^^^^^^^^^^^^^^^^^^^^^ SDMX-ML 2.1 format is described `here (pdf file for IM) `_ @@ -237,27 +244,28 @@ SDMX-ML 2.1 format allows to write multiple datasets at once. To use the Series .. important:: For each dataset, if dataset.structure is not a Schema, the writer can only write in the Structure Specific All Dimensions format. 
-    We perform a check to ensure that the dataset has a Schema structure for the remaining formats as we need to know the roles for each component. This check also ensures that the dataset.structure has at least one dimension and one measure defined.
+    We perform a check to ensure that the dataset has a Schema structure for the remaining formats as we need to know the roles for each component.
+    This check also ensures that the dataset.structure has at least one dimension and one measure defined.


-.. autofunction:: pysdmx.io.xml.sdmx21.writer.writer
+.. autofunction:: pysdmx.io.xml.sdmx21.writer.generic.write
+.. autofunction:: pysdmx.io.xml.sdmx21.writer.structure_specific.write

 .. code-block:: python

-    from pysdmx.io.xml.sdmx21.writer import writer
+    from pysdmx.io.xml.sdmx21.writer.generic import write as write_generic  # For Generic format
+    from pysdmx.io.xml.sdmx21.writer.structure_specific import write  # For StructureSpecific format
     from pysdmx.io.xml.enums import MessageType
     from pathlib import Path

-    # Dictionary of datasets to write
-    datasets = {
-        "DataStructure=TEST_AGENCY:TEST_ID(1.0)": dataset
-    }
+    # List of datasets to write
+    datasets = [dataset1, dataset2]

-    # Dimension at observation mapping
+    # Dimension at observation mapping (do not need to set them all if not needed)
     dim_mapping = {
         "DataStructure=TEST_AGENCY:TEST_ID(1.0)": "TIME_PERIOD"
     }

     # Write to file sample.xml in the same folder as this code
     file_path = Path(__file__).parent / "sample.xml"
-    writer(content=datasets, type_=MessageType.StructureSpecificDataSet, path=file_path, dimension_at_observation=dim_mapping)
+    write(datasets, file_path, dimension_at_observation=dim_mapping)  # This will write a Dataset in Series and another in AllDimensions format

From de1cf0493689897789330d78990f942bf6d70faa Mon Sep 17 00:00:00 2001
From: "javier.hernandez"
Date: Wed, 15 Jan 2025 10:29:15 +0100
Subject: [PATCH 22/27] Updated data_rw reference.
Signed-off-by: javier.hernandez
---
 docs/howto/data_rw.rst | 6 ++++++
 docs/start.rst         | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/docs/howto/data_rw.rst b/docs/howto/data_rw.rst
index 09b8fa04..6d104699 100644
--- a/docs/howto/data_rw.rst
+++ b/docs/howto/data_rw.rst
@@ -3,6 +3,12 @@
 Reading and writing SDMX datasets
 =================================

+.. note::
+
+    This tutorial shows how to read and write SDMX datasets using ``pysdmx``.
+
+    - :ref:`data-rw`.
+
 ``pysdmx`` allows to read and write SDMX datasets in the following formats:

 - SDMX-CSV 1.0 (located in ``pysdmx.io.csv.sdmx10``)
diff --git a/docs/start.rst b/docs/start.rst
index 8e3d89e9..09c93728 100644
--- a/docs/start.rst
+++ b/docs/start.rst
@@ -81,7 +81,7 @@ web service`.

 Reading and writing SDMX datasets
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-[TO BE COMPLETED]
+Head to the :ref:`how-to guide` to learn how to read and write SDMX datasets

 How can I get it?
 -----------------

From b17913e880d7cd3d1ded501aacce36f2a0e4438c Mon Sep 17 00:00:00 2001
From: Xavier Sosnovsky
Date: Thu, 16 Jan 2025 10:45:00 +0100
Subject: [PATCH 23/27] Exclude _site folder generated by sphinx

---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index cde3dee8..a7d9e105 100644
--- a/.gitignore
+++ b/.gitignore
@@ -32,6 +32,7 @@ coverage-report.xml

 # Ignore Sphinx doc build result
 **/_build
+**/_site

 # Ignore tox related folders/files
 .tox/**

From f486bf75c3accbdf49f797e1f8c6bd81809bf9f8 Mon Sep 17 00:00:00 2001
From: Xavier Sosnovsky
Date: Thu, 16 Jan 2025 10:45:41 +0100
Subject: [PATCH 24/27] Add details about extras

---
 docs/start.rst | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/docs/start.rst b/docs/start.rst
index 09c93728..f5f91723 100644
--- a/docs/start.rst
+++ b/docs/start.rst
@@ -58,11 +58,8 @@ However, metadata can do so much more than that, i.e.
they can be "active" and
 ``pysdmx`` supports retrieving metadata from an SDMX Registry or any service
 compliant with the SDMX-REST 2.0.0 (or above) API.

-Install ``pysdmx`` with the ``fmr`` extra to enable this functionality:
-
-.. code:: bash
-
-    pip install pysdmx[fmr]
+These classes are part of the core functionality and don't require additional
+installations.

 Data discovery and data retrieval
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -81,7 +78,7 @@ web service`.

 Reading and writing SDMX datasets
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Head to the :ref:`how-to guide` to learn how to read and write SDMX datasets
+Head to the :ref:`how-to guide` to learn how to read and write SDMX datasets.

 How can I get it?
 -----------------
@@ -98,9 +95,21 @@ For the core functionality, use:

 Some use cases require additional dependencies, which can be installed using
 `extras `_. For example,
-to retrieve metadata from an SDMX Registry, install the ``fmr``
-extra:
+to parse SDMX-ML messages, install the ``xml`` extra:

 .. code:: bash

-    pip install pysdmx[fmr]
+    pip install pysdmx[xml]
+
+The following extras are available:
+
+.. list-table:: Available extras
+   :widths: 25 50
+   :header-rows: 1
+
+   * - Name
+     - Purpose
+   * - ``xml``
+     - Read and Write SDMX-ML messages
+   * - ``pandas``
+     - Handle SDMX datasets as Pandas data frames

From 025819020153c6659cb703c339975bc94ead6a37 Mon Sep 17 00:00:00 2001
From: Xavier Sosnovsky
Date: Fri, 17 Jan 2025 09:47:04 +0100
Subject: [PATCH 25/27] Update readme date

---
 docs/release.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/release.rst b/docs/release.rst
index 31bab2c6..c220e607 100644
--- a/docs/release.rst
+++ b/docs/release.rst
@@ -1,11 +1,11 @@
 Release notes
 =============

-1.0.0 (2024-12-20)
----------------
+1.0.0 (2025-01-20)
+------------------

-Features
-^^^^^^^^
+Added
+^^^^^

 - Offer core domain classes for the SDMX information model.
 - Offer sync and async clients to retrieve metadata

From 9db62476b336f4d1bfc1ee9ae4a8190d4337aae7 Mon Sep 17 00:00:00 2001
From: Xavier Sosnovsky
Date: Fri, 17 Jan 2025 09:49:09 +0100
Subject: [PATCH 26/27] Update readme to include reading and writing data

---
 README.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.rst b/README.rst
index 028961b7..f44a41f7 100644
--- a/README.rst
+++ b/README.rst
@@ -22,6 +22,8 @@ Key features:
 - **Metadata in action**: ``pysdmx`` supports retrieving metadata from an
   SDMX Registry or any service compliant with the SDMX-REST 2.0.0 API. Use
   these metadata to power statistical processes.
+- **Reading and writing SDMX files**: ``pysdmx`` supports reading and writing
+  SDMX data and structure messages, in various formats.
 - **Data discovery and retrieval**: This functionality is under development.
   ``pysdmx`` will enable listing public SDMX services, discovering available
   data available, and retrieving data from these services.

From 8e919579c909ecb0b77fa91e79a7b41458dba3cf Mon Sep 17 00:00:00 2001
From: "javier.hernandez"
Date: Fri, 17 Jan 2025 12:43:28 +0100
Subject: [PATCH 27/27] Added warning on data reading and writing tutorial.

Signed-off-by: javier.hernandez
---
 docs/howto/data_rw.rst | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/docs/howto/data_rw.rst b/docs/howto/data_rw.rst
index 6d104699..27f0f290 100644
--- a/docs/howto/data_rw.rst
+++ b/docs/howto/data_rw.rst
@@ -9,6 +9,21 @@ Reading and writing SDMX datasets

     - :ref:`data-rw`.

+.. warning::
+    To read and write data, you must use the extra "data". You may need to install it using the following command:
+
+    .. code-block:: bash
+
+        pip install pysdmx[data]
+
+    For SDMX-ML format, you need to install the extra "xml" as well:
+
+    .. code-block:: bash
+
+        pip install pysdmx[data,xml]
+
+
+
 ``pysdmx`` allows to read and write SDMX datasets in the following formats:

 - SDMX-CSV 1.0 (located in ``pysdmx.io.csv.sdmx10``)