Releases: kamu-data/kamu-cli
Release v0.199.1
[0.199.1] - 2024-09-06
Fixed
- Fixed crash when a derived dataset is manually forced to update while an existing flow
for this dataset is already waiting for a batching condition
Release v0.199.0
[0.199.0] - 2024-09-06
Added
- Persistency has been enabled for Task and Flow domains. Both `TaskExecutor` and `FlowExecutor` now fully support transactional processing mode, and save state in Postgres or Sqlite database.
- Tasks now support attaching metadata properties. Storing task->flow association as this type of metadata.
- Flows and Tasks now properly recover the unfinished requests after server restart
Changed
- Simplified database schema for flow configurations and minimized number of migrations (breaking change of the database schema)
- Introduced `pre_run()` phase in flow executor, task executor & outbox processor to avoid startup races
- Explicit in-memory task queue has been eliminated and replaced with event store queries
- Get Data Panel: use SmTP for pull & push links
- GQL api method `setConfigCompaction` allows setting `metadataOnly` configuration for both root and derived datasets
- GQL api `triggerFlow` allows triggering `HARD_COMPACTION` flow in `metadataOnly` mode for both root and derived datasets
Release v0.198.2
[0.198.2] - 2024-08-30
Added
- Container sources allow string interpolation in env vars and command
- Private Datasets, changes related to Smart Transfer Protocol:
  - `kamu push`: added `--visibility private|public` argument to specify the created dataset visibility
  - Send the visibility attribute in the initial request of the push flow
Changed
- Schema propagation improvements:
  - Dataset schema will be defined upon first ingest, even if no records were returned by the source
  - Schema will also be defined for derivative datasets even if no records were produced by the transformation
  - Above ensures that datasets that don't produce any data for a long time will not block data pipelines
- Smart Transfer Protocol:
  - Use `CreateDatasetUseCase` in case of creation at the time of the dataset pulling
  - Now requires the `x-odf-smtp-version` header, which is used to compare client and server versions to prevent issues with outdated clients
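The version check behind the `x-odf-smtp-version` header could look roughly like the sketch below. Only the header name comes from the notes above; the `VersionCheck` enum, the `check_smtp_version` function, the integer version format, and the exact-match rule are assumptions made for illustration, not the project's actual implementation.

```rust
use std::collections::HashMap;

/// Header name taken from the release note.
const SMTP_VERSION_HEADER: &str = "x-odf-smtp-version";

/// Hypothetical protocol version spoken by the server (illustrative value).
const SERVER_SMTP_VERSION: u32 = 1;

/// Hypothetical outcome of the check.
#[derive(Debug, PartialEq)]
enum VersionCheck {
    Ok,
    MissingHeader,
    Malformed(String),
    Mismatch { client: u32, server: u32 },
}

/// Compares the version declared by the client with the server's version.
/// Assumptions: header names are already lower-cased, versions are plain
/// integers, and client and server must match exactly.
fn check_smtp_version(headers: &HashMap<String, String>) -> VersionCheck {
    let Some(raw) = headers.get(SMTP_VERSION_HEADER) else {
        return VersionCheck::MissingHeader;
    };
    match raw.trim().parse::<u32>() {
        Ok(client) if client == SERVER_SMTP_VERSION => VersionCheck::Ok,
        Ok(client) => VersionCheck::Mismatch {
            client,
            server: SERVER_SMTP_VERSION,
        },
        Err(_) => VersionCheck::Malformed(raw.clone()),
    }
}
```

Rejecting a request that lacks the header, rather than guessing a version, is what makes the header "required": an outdated client fails early with a clear reason instead of partway through a transfer.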
Release v0.198.1
[0.198.1] - 2024-08-28
Added
- Private Datasets, ReBAC integration:
  - ReBAC properties update based on `DatasetLifecycleMessage`'s
  - `kamu add`: added hidden `--visibility private|public` argument, assumed to be used in multi-tenant case
  - GQL: `DatasetsMut`:
    - `createEmpty()`: added optional `datasetVisibility` argument
    - `createFromSnapshot()`: added optional `datasetVisibility` argument
Release v0.198.0
[0.198.0] - 2024-08-27
Changed
- If a polling/push source does not declare a `read` schema or a `preprocess` step (which is the case when ingesting data from a file upload) we apply the following new inference rules:
  - If `event_time` column is present - we will try to coerce it into a timestamp:
    - strings will be parsed as RFC3339 date-times
    - integers will be treated as UNIX timestamps in seconds
  - Columns with names that conflict with system columns will get renamed
- All tests related to databases use the `database_transactional_test` macro
- Some skipped tests will now also be run
- Access token with duplicate names can be created if such name exists but was revoked (now for MySQL as well)
- Updated `sqlx` crate to address RUSTSEC-2024-0363
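Two of the inference rules above (renaming conflicting columns and treating integer `event_time` values as UNIX seconds) can be sketched as follows. The list of system column names, the underscore-prefix renaming scheme, and both function names are assumptions chosen for illustration; the notes do not state how renamed columns are actually named. RFC3339 string parsing is left out because it is best delegated to a date-time library.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// System column names assumed for this sketch. `event_time` is not listed
/// because the rules coerce it rather than rename it.
const SYSTEM_COLUMNS: [&str; 3] = ["offset", "op", "system_time"];

/// Renames columns that clash with system columns by prefixing underscores
/// until the name differs from every system column and every column already
/// emitted. The prefix scheme is hypothetical.
fn rename_conflicting(columns: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = Vec::with_capacity(columns.len());
    for &col in columns {
        let mut name = col.to_string();
        while SYSTEM_COLUMNS.contains(&name.as_str()) || out.contains(&name) {
            name.insert(0, '_');
        }
        out.push(name);
    }
    out
}

/// Interprets an integer `event_time` value as seconds since the UNIX epoch.
fn unix_seconds_to_time(secs: i64) -> SystemTime {
    if secs >= 0 {
        UNIX_EPOCH + Duration::from_secs(secs as u64)
    } else {
        UNIX_EPOCH - Duration::from_secs(secs.unsigned_abs())
    }
}
```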
Fixed
- Derivative transform crash when input datasets have `AddData` events but don't have any Parquet files yet
Release v0.197.0
[0.197.0] - 2024-08-22
Changed
- Breaking: Using DataFusion's `enable_ident_normalization = false` setting to work with upper case identifiers without needing to put quotes everywhere. This may impact your root and derivative datasets.
- Datafusion transform engine was updated to latest version and includes JSON extensions
- Breaking: Push ingest from `csv` format will default to `header: true` in case schema was not explicitly provided
- Access token with duplicate names can be created if such name exists but was revoked
- Many examples were simplified due to ident normalization changes
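To see why the identifier normalization change is breaking, the toy resolver below models the behaviour: with normalization enabled, unquoted identifiers are folded to lower case, while with it disabled they keep their case. The function `resolve_ident` is a hypothetical illustration of the rule, not DataFusion's parser code.

```rust
/// Resolves a SQL identifier as written in a query into the column name
/// that will be looked up. Double-quoted identifiers always keep their case.
fn resolve_ident(raw: &str, enable_ident_normalization: bool) -> String {
    // Quoted identifier: strip the quotes and preserve the case.
    if raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"') {
        return raw[1..raw.len() - 1].to_string();
    }
    if enable_ident_normalization {
        // Unquoted identifiers are lower-cased when normalization is on.
        raw.to_lowercase()
    } else {
        // With normalization off, the identifier is used exactly as written.
        raw.to_string()
    }
}
```

Under this model a query that selects `Price` without quotes resolves to `price` when normalization is on and to `Price` when it is off, so queries written against the old behaviour may stop matching their columns.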
Fixed
- Crash in `kamu login` command on 5XX server responses
Added
- HTTP sources now include `User-Agent` header that defaults to `kamu-cli/{major}.{minor}.{patch}`
- Externalized configuration of HTTP source parameters like timeouts and redirects
- CI: build `sqlx-cli` image if it is missing
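As a small illustration of the default `User-Agent` format, the sketch below builds the header value from version components and lets a configured value take precedence. Both function names and the idea of passing the override as an `Option` are assumptions; only the `kamu-cli/{major}.{minor}.{patch}` pattern comes from the note above.

```rust
/// Builds the default header value in the documented
/// `kamu-cli/{major}.{minor}.{patch}` form.
fn default_user_agent(major: u32, minor: u32, patch: u32) -> String {
    format!("kamu-cli/{major}.{minor}.{patch}")
}

/// Picks the configured value when present, otherwise falls back to the
/// default. The `configured` parameter is a hypothetical stand-in for
/// whatever the externalized HTTP source configuration provides.
fn effective_user_agent(configured: Option<&str>, version: (u32, u32, u32)) -> String {
    match configured {
        Some(value) => value.to_string(),
        None => default_user_agent(version.0, version.1, version.2),
    }
}
```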
Release v0.196.0
[0.196.0] - 2024-08-19
Added
- The `/ingest` endpoint will try to infer the media type of file by extension if not specified explicitly during upload. This resolves the problem with `415 Unsupported Media Type` errors when uploading `.ndjson` files from the Web UI.
- Private Datasets, preparation work:
  - Added SQLite-specific implementation of ReBAC repository
  - Added SQLite-specific implementation of `DatasetEntryRepository`
- `internal-error` crate:
  - Added `InternalError::reason()` to get the cause of an error
  - Added methods to `ResultIntoInternal`:
    - `map_int_err()` - shortcut for `result.int_err().map_err(...)` combination
    - `context_int_err()` - ability to add a context message to an error
- Added macro `database_transactional_test!()` to minimize boilerplate code
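The extension-based fallback described for the `/ingest` endpoint might be sketched like this. The function name and the extension-to-media-type table are illustrative assumptions; the notes only state that the media type is inferred from the file extension when it is not given explicitly.

```rust
use std::path::Path;

/// Returns the explicitly declared media type if there is one, otherwise
/// tries to infer it from the file extension. The mapping below is an
/// illustrative assumption, not the endpoint's actual table.
fn infer_media_type(declared: Option<&str>, file_name: &str) -> Option<String> {
    if let Some(media_type) = declared {
        return Some(media_type.to_string());
    }
    let ext = Path::new(file_name)
        .extension()?
        .to_str()?
        .to_ascii_lowercase();
    let media_type = match ext.as_str() {
        "csv" => "text/csv",
        "json" => "application/json",
        "ndjson" => "application/x-ndjson",
        "parquet" => "application/vnd.apache.parquet",
        // Unknown extension: the caller can still answer with 415.
        _ => return None,
    };
    Some(media_type.to_string())
}
```

Returning `None` for unknown extensions keeps the `415 Unsupported Media Type` response available for files the server genuinely cannot read.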
Changed
- Upgraded to `sqlx` v0.8
- Renamed `setConfigSchedule` GQL api to `setConfigIngest`. Also extended `setConfigIngest` with new field `fetchUncacheable` which indicates to ignore cache during ingest step
Release v0.195.1
[0.195.1] - 2024-08-16
Fixed
- Add `reset` ENUM variant to `dataset_flow_type` in postgres migration
Release v0.195.0
[0.195.0] - 2024-08-16
Added
- Reliable transaction-based internal cross-domain message passing component (`MessageOutbox`), replacing `EventBus`
  - Metadata-driven producer/consumer annotations
  - Immediate and transaction-backed message delivery
  - Background transactional message processor, respecting client idempotence
- Persistent storage for flow configuration events
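The general outbox idea behind a component like `MessageOutbox` can be sketched in memory as below: producers append messages, and a processor delivers each message to a consumer at most once by remembering how far that consumer has read. All types and method names here are hypothetical, and the `Vec` and `HashMap` stand in for database tables; this is a sketch of the pattern, not the project's implementation.

```rust
use std::collections::HashMap;

/// A message recorded by a producer (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct OutboxMessage {
    id: u64,
    producer: String,
    payload: String,
}

/// In-memory stand-in for the outbox tables.
#[derive(Default)]
struct Outbox {
    /// Stands in for a durable message table.
    messages: Vec<OutboxMessage>,
    /// Consumer name -> id of the last message it processed.
    consumed_up_to: HashMap<String, u64>,
}

impl Outbox {
    /// Appends a message. In a real system this write would share the
    /// database transaction with the producer's own state change.
    fn post(&mut self, producer: &str, payload: &str) -> u64 {
        let id = self.messages.len() as u64 + 1;
        self.messages.push(OutboxMessage {
            id,
            producer: producer.to_string(),
            payload: payload.to_string(),
        });
        id
    }

    /// Delivers messages the consumer has not seen yet and advances its
    /// boundary, so a repeated run does not deliver a message twice.
    fn process<F: FnMut(&OutboxMessage)>(&mut self, consumer: &str, mut handle: F) -> usize {
        let last = self.consumed_up_to.get(consumer).copied().unwrap_or(0);
        let mut boundary = last;
        let mut delivered = 0;
        for msg in self.messages.iter().filter(|m| m.id > last) {
            handle(msg);
            boundary = msg.id;
            delivered += 1;
        }
        self.consumed_up_to.insert(consumer.to_string(), boundary);
        delivered
    }
}
```

Tracking a boundary per consumer lets the processor be restarted safely: already handled messages are skipped, while a newly registered consumer still receives the full backlog.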
Changed
- Upgraded to `datafusion v41` (#713)
- Introduced use case layer, encapsulating authorization checks and action validations, for first 6 basic dataset scenarios (creating, creating from snapshot, deleting, renaming, committing an event, syncing a batch of events)
- Separated `DatasetRepository` into read-only and read-write parts
- Isolated `time-source` library
Fixed
- E2E: additionally forced colors off to exclude ANSI color sequences that occasionally appeared
- E2E: modify a workaround for MySQL tests
Release v0.194.1
[0.194.1] - 2024-08-14
Fixed
- Add `recursive` field to `Reset` flow configurations in GQL Api which triggers `HardCompaction` in `KeepMetadataOnly` mode flow for each owned downstream dependency