The extractor-utils package is an extension of the Cognite Python SDK, intended to simplify the development of data
extractors and other integrations for Cognite Data Fusion.
Documentation is hosted here, including a quickstart tutorial.
The changelog is found here.
The best way to start a new extractor project is to use the cogex CLI. You can install it from PyPI:
pip install cognite-extractor-manager
To initialize a new extractor project, run
cogex init
in the directory where you want your extractor project. The cogex CLI will prompt you for some information about your
project, and then set up a poetry environment, a git repository, and commit hooks with type and style checks, and load a
template.
Some source systems have a lot in common, such as RESTful APIs or systems exposing data over MQTT. We therefore have
extensions to extractor-utils tailored to these protocols. These can be found in separate packages:
The package is open source under the Apache 2.0 license, and contributions are welcome.
This project adheres to the Contributor Covenant v2.0 as a code of conduct.
We use uv to manage dependencies and virtual environments. To develop extractor-utils, follow these steps to set up
your local environment:
- Install uv if you haven't already.
- Clone the repository:
  $ git clone git@github.com:cognitedata/python-extractor-utils.git
- Move into the newly created local repository:
  $ cd python-extractor-utils
- Create a virtual environment and install dependencies:
  $ uv sync
All code must pass black and isort style checks to be merged. It is recommended to install pre-commit hooks to run these checks locally before committing code:
$ uv run pre-commit install
Each public method, class and module should have docstrings. Docstrings are written in the Google style. Please include unit and/or integration tests for submitted code, and remember to update the changelog.
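As a rough illustration of the docstring and test expectations above, here is a minimal sketch of a Google-style docstring with a matching unit test. The `chunk` helper and `test_chunk` are hypothetical names invented for this example; they are not part of extractor-utils, and the project's own conventions may differ in detail.

```python
def chunk(items: list[int], size: int) -> list[list[int]]:
    """Split a list into consecutive chunks of at most ``size`` items.

    Hypothetical helper, used only to illustrate the docstring layout.
    It is not part of extractor-utils.

    Args:
        items: The values to split.
        size: Maximum number of items per chunk. Must be positive.

    Returns:
        A list of chunks, in the original order. The last chunk may be
        shorter than ``size``.

    Raises:
        ValueError: If ``size`` is not positive.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the list in strides of ``size`` and slice out each chunk.
    return [items[i : i + size] for i in range(0, len(items), size)]


def test_chunk() -> None:
    """Check that a trailing partial chunk is kept."""
    assert chunk([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]
```

The `Args`, `Returns` and `Raises` sections follow the Google style layout; a test such as `test_chunk` would typically live in the test suite next to the code it covers.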