From 10b966fc8402af7388448dfcb453c8b53651e112 Mon Sep 17 00:00:00 2001
From: Beno!t POLASZEK
Date: Fri, 17 Nov 2023 14:40:30 +0100
Subject: [PATCH] doc: improve doc (using Github alerts)

---
 README.md              |  6 ++++--
 doc/advanced_usage.md  |  8 ++++----
 doc/getting-started.md | 26 ++++++++++++++++++--------
 3 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/README.md b/README.md
index 13ef6d0..88cc77a 100644
--- a/README.md
+++ b/README.md
@@ -48,10 +48,12 @@ Installation
 composer require bentools/etl:^4.0@alpha
 ```
 
-> **Warning #1**: Version 4.0 is a complete rewrite and introduces significant BC (backward compatibility) breaks.
+> [!WARNING]
+> Version 4.0 is a complete rewrite and introduces significant BC (backward compatibility) breaks.
 > Avoid upgrading from `^2.0` or `^3.0` unless you're fully aware of the changes.
 
-> **Warning #2**: Version 4.0 is still at an alpha stage. BC breaks might occur between alpha releases.
+> [!IMPORTANT]
+> Version 4.0 is still at an alpha stage. BC breaks might occur between alpha releases.
 
 Usage
 -----
diff --git a/doc/advanced_usage.md b/doc/advanced_usage.md
index a780787..5575d8c 100644
--- a/doc/advanced_usage.md
+++ b/doc/advanced_usage.md
@@ -31,9 +31,10 @@ $etl = (new EtlExecutor())
 $etl->process('file:///tmp/cities.csv', $pdo);
 ```
 
-As you can see:
-- Your transformer can _yield_ values, in case 1 extracted item becomes several items to load
-- You can use `EtlState.destination` to retrieve the second argument you passed yo `$etl->process()`.
+> [!IMPORTANT]
+> As you can see:
+> - Your transformer can _yield_ values, in case 1 extracted item becomes several items to load
+> - You can use `EtlState.destination` to retrieve the second argument you passed yo `$etl->process()`.
 
 The `EtlState` object contains all elements relative to the state of your ETL workflow being running.
 
@@ -54,7 +55,6 @@ But the last transformer of the chain (or your only one transformer) is determin
 
 - If your transformer `yields` values, each yielded value will be passed to the loader (and the loader will be called for each yielded value).
 
-
 Next tick
 ---------
 
diff --git a/doc/getting-started.md b/doc/getting-started.md
index e11d920..f67ab0c 100644
--- a/doc/getting-started.md
+++ b/doc/getting-started.md
@@ -48,7 +48,8 @@ Then, let's have a look at `/tmp/cities.json`:
 ]
 ```
 
-Notice that we didn't _transform_ anything here, we just denormalized the CSV file to an array, then serialized that array to a JSON file.
+> [!NOTE]
+> We didn't _transform_ anything here, we just denormalized the CSV file to an array, then serialized that array to a JSON file.
 
 The `CSVExtractor` has some options to _read_ the data, such as considering that the 1st row is the column keys.
 
@@ -92,14 +93,16 @@ Skipping items
 
 You can skip items at any time.
 
-Use the `$state->skip()` method from the `EtlState` object as soon as your business logic requires it.
+> [!TIP]
+> Use the `skip()` method from the `EtlState` object as soon as your business logic requires it.
 
 Stopping the workflow
 ---------------------
 
 You can stop the workflow at any time.
 
-Use the `$state->stop()` method from the `EtlState` object as soon as your business logic requires it.
+> [!TIP]
+> Use the `stop()` method from the `EtlState` object as soon as your business logic requires it.
 
 Using Events
 ------------
@@ -119,9 +122,15 @@ The `EtlExecutor` emits a variety of events during the ETL workflow, providing i
 - `FlushExceptionEvent` when something wrong occured during flush (the exception can be dismissed)
 - `EndEvent` whenever the workflow is complete.
 
-All events give you access to the `EtlState` object, the state of the running ETL process, which allows you to read what's going on
-(total number of items, number of loaded items, current extracted item index), write any arbitrary data into the `$state->context` array,
-[skip items](#skipping-items), [stop the workflow](#stopping-the-workflow), and [trigger an early flush](#flush-frequency-and-early-flushes).
+> [!IMPORTANT]
+> All events give you access to the `EtlState` object, the state of the running ETL process.
+
+Accessing `$event->state` allows you to:
+- Read what's going on (total number of items, number of loaded items, current extracted item index)
+- Write any arbitrary data into the `$state->context` array
+- [Skip items](#skipping-items)
+- [Stop the workflow](#stopping-the-workflow)
+- [Trigger an early flush](#flush-frequency-and-early-flushes).
 
 You can hook to those events during `EtlExecutor` instantiation, i.e.:
 
@@ -138,8 +147,9 @@ Flush frequency and early flushes
 By default, the `flush()` method of your loader will be invoked at the end of the ETL,
 meaning it will likely keep all loaded items in memory before dumping them to their final destination.
 
-Feel free to adjust a `flushFrequency` that fits your needs to manage memory usage and data processing efficiency
-and optionally trigger an early flush at any time during the ETL process:
+> [!TIP]
+> - Feel free to adjust a `flushFrequency` that fits your needs to manage memory usage and data processing efficiency
+> - Optionally, trigger an early flush at any time during the ETL process.
 
 ```php
 $etl = (new EtlExecutor(options: new EtlConfiguration(flushFrequency: 10)))