Releases: kamu-data/kamu-cli

Release v0.199.1

06 Sep 19:38
9d1700f

[0.199.1] - 2024-09-06

Fixed

  • Fixed crash when a derived dataset is manually forced to update while an existing flow
    for this dataset is already waiting for a batching condition

Release v0.199.0

06 Sep 16:28
43de6c2

[0.199.0] - 2024-09-06

Added

  • Persistency has been enabled for Task and Flow domains.
    Both TaskExecutor and FlowExecutor now fully support transactional processing mode,
    and save state in a Postgres or SQLite database.
  • Tasks now support attaching metadata properties. The task->flow association is stored as this type of metadata.
  • Flows and Tasks now properly recover unfinished requests after a server restart

Changed

  • Simplified database schema for flow configurations and minimized number of migrations
    (breaking change of the database schema)
  • Introduced pre_run() phase in flow executor, task executor & outbox processor to avoid startup races
  • Explicit in-memory task queue has been eliminated and replaced with event store queries
  • Get Data Panel: use SmTP for pull & push links
  • GQL API method setConfigCompaction allows setting the metadataOnly configuration for both root and derived datasets
  • GQL API method triggerFlow allows triggering a HARD_COMPACTION flow in metadataOnly mode for both root and derived datasets

Release v0.198.2

30 Aug 14:06
c2bbe8c

[0.198.2] - 2024-08-30

Added

  • Container sources allow string interpolation in env vars and command
  • Private Datasets, changes related to Smart Transfer Protocol:
    • kamu push: added --visibility private|public argument to specify the created dataset visibility
    • Send the visibility attribute in the initial request of the push flow

Changed

  • Schema propagation improvements:
    • Dataset schema will be defined upon first ingest, even if no records were returned by the source
    • Schema will also be defined for derivative datasets even if no records are produced by the transformation
    • This ensures that datasets that produce no data for a long time will not block data pipelines
  • Smart Transfer Protocol:
    • Use CreateDatasetUseCase when a dataset is created during a pull
    • Now requires the x-odf-smtp-version header, which is used to compare client and server versions to prevent issues with outdated clients
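
To illustrate the version header requirement above, here is a minimal server-side sketch. It is not taken from the kamu code base: the function name, the integer version format, the exact-match rule and the SERVER_SMTP_VERSION value are all assumptions made for illustration. Only the header name x-odf-smtp-version comes from the release notes.

```python
# Hypothetical protocol version supported by this server (assumption).
SERVER_SMTP_VERSION = 1


def check_smtp_version(headers, server_version=SERVER_SMTP_VERSION):
    """Return (accepted, reason) for a request's header mapping.

    Assumes the header carries a plain integer and that client and
    server must match exactly; the real comparison rule may differ.
    """
    # HTTP header names are case-insensitive, so normalize the keys first.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("x-odf-smtp-version")
    if raw is None:
        return (False, "missing x-odf-smtp-version header")
    try:
        client_version = int(raw)
    except ValueError:
        return (False, "malformed x-odf-smtp-version header")
    if client_version != server_version:
        return (
            False,
            f"client version {client_version} does not match "
            f"server version {server_version}",
        )
    return (True, "ok")
```

A client that omits the header is rejected up front, which is how an outdated client would be caught before any data transfer starts.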

Release v0.198.1

28 Aug 11:22
d89741a

[0.198.1] - 2024-08-28

Added

  • Private Datasets, ReBAC integration:
    • ReBAC properties are updated based on DatasetLifecycleMessage messages
    • kamu add: added hidden --visibility private|public argument, assumed to be used in multi-tenant case
    • GQL: DatasetsMut:
      • createEmpty(): added optional datasetVisibility argument
      • createFromSnapshot(): added optional datasetVisibility argument

Release v0.198.0

27 Aug 14:56
81f2327

[0.198.0] - 2024-08-27

Changed

  • If a polling/push source does not declare a read schema or a preprocess step (which is the case when ingesting data from a file upload) we apply the following new inference rules:
    • If an event_time column is present, we will try to coerce it into a timestamp:
      • strings will be parsed as RFC3339 date-times
      • integers will be treated as UNIX timestamps in seconds
    • Columns with names that conflict with system columns will get renamed
  • All tests related to databases use the database_transactional_test macro
  • Some skipped tests will now also be run
  • An access token with a duplicate name can be created if the existing token with that name was revoked (now for MySQL as well)
  • Updated sqlx crate to address RUSTSEC-2024-0363
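
The event_time coercion rules listed above can be sketched as follows. This is an illustrative Python approximation, not the actual ingest code: the function name coerce_event_time is hypothetical, and normalizing every result to UTC is an assumption.

```python
from datetime import datetime, timezone


def coerce_event_time(value):
    """Coerce a raw event_time value into a timezone-aware UTC datetime."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("event_time must be a string or an integer")
    if isinstance(value, int):
        # Integers are treated as UNIX timestamps in seconds.
        return datetime.fromtimestamp(value, tz=timezone.utc)
    if isinstance(value, str):
        # Strings are parsed as RFC3339 date-times. Older Python versions
        # do not accept a trailing "Z", so rewrite it as an explicit offset.
        text = value[:-1] + "+00:00" if value.endswith(("Z", "z")) else value
        parsed = datetime.fromisoformat(text)
        if parsed.tzinfo is None:
            # RFC3339 requires an offset; refuse to guess a time zone.
            raise ValueError("RFC3339 date-time must include a UTC offset")
        return parsed.astimezone(timezone.utc)
    raise TypeError("event_time must be a string or an integer")
```

For example, coerce_event_time("2024-08-27T16:56:00+02:00") and coerce_event_time("2024-08-27T14:56:00Z") yield the same instant.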

Fixed

  • Derivative transform crash when input datasets have AddData events but don't have any Parquet files yet

Release v0.197.0

22 Aug 09:01
3009ff0

[0.197.0] - 2024-08-22

Changed

  • Breaking: Using DataFusion's enable_ident_normalization = false setting to work with upper case identifiers without needing to put quotes everywhere. This may impact your root and derivative datasets.
  • DataFusion transform engine was updated to the latest version and includes JSON extensions
  • Breaking: Push ingest from csv format will default to header: true if the schema was not explicitly provided
  • An access token with a duplicate name can be created if the existing token with that name was revoked
  • Many examples were simplified due to ident normalization changes

Fixed

  • Crash in kamu login command on 5XX server responses

Added

  • HTTP sources now include User-Agent header that defaults to kamu-cli/{major}.{minor}.{patch}
  • Externalized configuration of HTTP source parameters like timeouts and redirects
  • CI: build sqlx-cli image if it is missing

Release v0.196.0

19 Aug 11:22
4570bbe

[0.196.0] - 2024-08-19

Added

  • The /ingest endpoint will try to infer the media type of a file from its extension if not specified explicitly during upload.
    This resolves the problem with 415 Unsupported Media Type errors when uploading .ndjson files from the Web UI.
  • Private Datasets, preparation work:
    • Added SQLite-specific implementation of ReBAC repository
    • Added SQLite-specific implementation of DatasetEntryRepository
  • internal-error crate:
    • Added InternalError::reason() to get the cause of an error
    • Added methods to ResultIntoInternal:
      • map_int_err() - shortcut for result.int_err().map_err(...) combination
      • context_int_err() - ability to add a context message to an error
  • Added macro database_transactional_test!() to minimize boilerplate code
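
As an illustration of the /ingest media type inference mentioned above, a minimal sketch might look like this. The extension table and the function name resolve_media_type are assumptions made for illustration; the set of extensions and media types actually recognized by the endpoint is not stated in these notes.

```python
import os

# Hypothetical extension-to-media-type table (assumption, not the real list).
MEDIA_TYPES_BY_EXTENSION = {
    ".csv": "text/csv",
    ".json": "application/json",
    ".ndjson": "application/x-ndjson",
}


def resolve_media_type(file_name, declared=None):
    """Prefer the explicitly declared media type, else infer it by extension.

    Returns None when nothing can be inferred, in which case a server
    would typically answer with 415 Unsupported Media Type.
    """
    if declared:
        return declared
    _, extension = os.path.splitext(file_name)
    # Compare case-insensitively so "DATA.NDJSON" is also recognized.
    return MEDIA_TYPES_BY_EXTENSION.get(extension.lower())
```

An explicitly declared type always wins, so the inference only fills the gap for uploads that arrive without one.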

Changed

  • sqlx v0.8
  • Renamed setConfigSchedule GQL API to setConfigIngest. Also extended
    setConfigIngest with a new field fetchUncacheable, which indicates that the cache
    should be ignored during the ingest step

Release v0.195.1

16 Aug 12:40
bf7229b

[0.195.1] - 2024-08-16

Fixed

  • Add reset ENUM variant to dataset_flow_type in postgres migration

Release v0.195.0

16 Aug 09:28
da0db9b

[0.195.0] - 2024-08-16

Added

  • Reliable transaction-based internal cross-domain message passing component (MessageOutbox), replacing EventBus
    • Metadata-driven producer/consumer annotations
    • Immediate and transaction-backed message delivery
    • Background transactional message processor, respecting client idempotence
  • Persistent storage for flow configuration events
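
For readers unfamiliar with the idea behind a transaction-backed outbox, the following is a minimal sketch of the general transactional outbox pattern using SQLite. It is not the MessageOutbox implementation: the table layout and the names open_db, create_dataset and process_outbox are hypothetical, and idempotence is approximated by tracking the last consumed message per consumer.

```python
import json
import sqlite3


def open_db():
    """Create an in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE datasets (id TEXT PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE outbox (
            message_id INTEGER PRIMARY KEY AUTOINCREMENT,
            producer   TEXT NOT NULL,
            payload    TEXT NOT NULL
        );
        CREATE TABLE consumptions (
            consumer        TEXT PRIMARY KEY,
            last_message_id INTEGER NOT NULL
        );
        """
    )
    return conn


def create_dataset(conn, dataset_id, name):
    """Write the domain change and its message in one transaction."""
    # "with conn" commits on success and rolls back on any exception,
    # so the message exists if and only if the dataset row exists.
    with conn:
        conn.execute(
            "INSERT INTO datasets (id, name) VALUES (?, ?)", (dataset_id, name)
        )
        payload = json.dumps({"event": "created", "dataset_id": dataset_id})
        conn.execute(
            "INSERT INTO outbox (producer, payload) VALUES (?, ?)",
            ("datasets", payload),
        )


def process_outbox(conn, consumer, handler):
    """Deliver messages the consumer has not seen yet; return how many."""
    row = conn.execute(
        "SELECT last_message_id FROM consumptions WHERE consumer = ?",
        (consumer,),
    ).fetchone()
    last_seen = row[0] if row else 0
    pending = conn.execute(
        "SELECT message_id, payload FROM outbox "
        "WHERE message_id > ? ORDER BY message_id",
        (last_seen,),
    ).fetchall()
    for message_id, payload in pending:
        # Record progress in the same transaction as the handler call, so a
        # failing handler leaves the message pending for the next run.
        with conn:
            handler(json.loads(payload))
            conn.execute(
                "INSERT OR REPLACE INTO consumptions "
                "(consumer, last_message_id) VALUES (?, ?)",
                (consumer, message_id),
            )
    return len(pending)
```

Running process_outbox twice for the same consumer delivers each message only once, and a failed create_dataset call leaves no orphan message behind.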

Changed

  • Upgraded to datafusion v41 (#713)
  • Introduced a use case layer, encapsulating authorization checks and action validations, for the first 6 basic dataset scenarios
    (creating, creating from snapshot, deleting, renaming, committing an event, syncing a batch of events)
  • Separated DatasetRepository into read-only and read-write parts
  • Isolated time-source library

Fixed

  • E2E: additionally force colors off to exclude ANSI color sequences that occasionally appear
  • E2E: modify a workaround for MySQL tests

Release v0.194.1

14 Aug 14:05
5402c33

[0.194.1] - 2024-08-14

Fixed

  • Add recursive field to Reset flow configurations in GQL API, which triggers a HardCompaction flow in KeepMetadataOnly mode for each owned downstream dependency