Doc: Improve doc (using Github alerts) #46

Merged
merged 1 commit into from
Nov 17, 2023
6 changes: 4 additions & 2 deletions README.md
@@ -48,10 +48,12 @@ Installation
composer require bentools/etl:^4.0@alpha
```

> **Warning #1**: Version 4.0 is a complete rewrite and introduces significant BC (backward compatibility) breaks.
> [!WARNING]
> Version 4.0 is a complete rewrite and introduces significant BC (backward compatibility) breaks.
> Avoid upgrading from `^2.0` or `^3.0` unless you're fully aware of the changes.

> **Warning #2**: Version 4.0 is still at an alpha stage. BC breaks might occur between alpha releases.
> [!IMPORTANT]
> Version 4.0 is still at an alpha stage. BC breaks might occur between alpha releases.

Usage
-----
8 changes: 4 additions & 4 deletions doc/advanced_usage.md
@@ -31,9 +31,10 @@ $etl = (new EtlExecutor())
$etl->process('file:///tmp/cities.csv', $pdo);
```

As you can see:
- Your transformer can _yield_ values, in case 1 extracted item becomes several items to load
- You can use `EtlState.destination` to retrieve the second argument you passed to `$etl->process()`.
> [!IMPORTANT]
> As you can see:
> - Your transformer can _yield_ values, in case 1 extracted item becomes several items to load
> - You can use `EtlState.destination` to retrieve the second argument you passed to `$etl->process()`.

The `EtlState` object contains all elements related to the state of your running ETL workflow.

@@ -54,7 +55,6 @@ But the last transformer of the chain (or your only one transformer) is determin
- If your transformer `yields` values, each yielded value will be passed to the loader (and the loader will be called for each yielded value).
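
The yielding behaviour described above can be sketched in plain PHP. This is only an illustration under stated assumptions: `splitCityNames()` and `runSketch()` are hypothetical names, and the loop is a stand-in for the idea, not the library's actual implementation.

```php
<?php

/**
 * Hypothetical transformer: one extracted row becomes several items to load.
 * Plain PHP generator, not part of the library's API.
 */
function splitCityNames(array $row): Generator
{
    foreach (explode('|', $row['names']) as $name) {
        yield ['country' => $row['country'], 'name' => trim($name)];
    }
}

/**
 * Hypothetical driver loop: calls the loader once per yielded value.
 */
function runSketch(iterable $rows, callable $transform, callable $load): void
{
    foreach ($rows as $row) {
        $result = $transform($row);
        // A generator result means "several items"; anything else is a single item.
        $items = $result instanceof Generator ? $result : [$result];
        foreach ($items as $item) {
            $load($item);
        }
    }
}

$loaded = [];
runSketch(
    [
        ['country' => 'FR', 'names' => 'Paris | Lyon'],
        ['country' => 'IT', 'names' => 'Rome'],
    ],
    'splitCityNames',
    function (array $item) use (&$loaded): void {
        // Stand-in loader: collect items in memory.
        $loaded[] = $item;
    },
);
```

Here two extracted rows produce three loader calls, because the first row yields two items.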



Next tick
---------

26 changes: 18 additions & 8 deletions doc/getting-started.md
@@ -48,7 +48,8 @@ Then, let's have a look at `/tmp/cities.json`:
]
```

Notice that we didn't _transform_ anything here; we just denormalized the CSV file to an array, then serialized that array to a JSON file.
> [!NOTE]
> We didn't _transform_ anything here; we just denormalized the CSV file to an array, then serialized that array to a JSON file.

The `CSVExtractor` has some options to _read_ the data, such as treating the first row as the column keys.

@@ -92,14 +93,16 @@ Skipping items

You can skip items at any time.

Use the `$state->skip()` method from the `EtlState` object as soon as your business logic requires it.
> [!TIP]
> Use the `skip()` method from the `EtlState` object as soon as your business logic requires it.

Stopping the workflow
---------------------

You can stop the workflow at any time.

Use the `$state->stop()` method from the `EtlState` object as soon as your business logic requires it.
> [!TIP]
> Use the `stop()` method from the `EtlState` object as soon as your business logic requires it.

Using Events
------------
@@ -119,9 +122,15 @@ The `EtlExecutor` emits a variety of events during the ETL workflow, providing i
- `FlushExceptionEvent` when something went wrong during flush (the exception can be dismissed)
- `EndEvent` whenever the workflow is complete.

All events give you access to the `EtlState` object, the state of the running ETL process, which allows you to read what's going on
(total number of items, number of loaded items, current extracted item index), write any arbitrary data into the `$state->context` array,
[skip items](#skipping-items), [stop the workflow](#stopping-the-workflow), and [trigger an early flush](#flush-frequency-and-early-flushes).
> [!IMPORTANT]
> All events give you access to the `EtlState` object, the state of the running ETL process.

Accessing `$event->state` allows you to:
- Read what's going on (total number of items, number of loaded items, current extracted item index)
- Write any arbitrary data into the `$state->context` array
- [Skip items](#skipping-items)
- [Stop the workflow](#stopping-the-workflow)
- [Trigger an early flush](#flush-frequency-and-early-flushes).

You can hook into those events when instantiating `EtlExecutor`, e.g.:

@@ -138,8 +147,9 @@ Flush frequency and early flushes
By default, the `flush()` method of your loader will be invoked at the end of the ETL,
meaning it will likely keep all loaded items in memory before dumping them to their final destination.
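
The difference between flushing only at the end and flushing periodically can be sketched in plain PHP. This is a minimal illustration, assuming a hypothetical `BufferingLoaderSketch` class with a `$flushEvery` parameter; it does not reproduce the library's loader interface or internals.

```php
<?php

/**
 * Hypothetical buffering loader (plain PHP, not the library's implementation).
 * Items accumulate in memory until flush() writes them out.
 */
final class BufferingLoaderSketch
{
    /** @var array<int, mixed> Items waiting to be flushed. */
    private array $pending = [];

    /** @var array<int, array<int, mixed>> One entry per flush, standing in for the real destination. */
    public array $flushedBatches = [];

    public function __construct(private ?int $flushEvery = null)
    {
    }

    public function load(mixed $item): void
    {
        $this->pending[] = $item;
        // With a frequency set, flush as soon as the buffer reaches that size.
        if ($this->flushEvery !== null && count($this->pending) >= $this->flushEvery) {
            $this->flush();
        }
    }

    public function flush(): void
    {
        if ($this->pending === []) {
            return; // Nothing buffered, nothing to write.
        }
        $this->flushedBatches[] = $this->pending;
        $this->pending = [];
    }
}

// Default behaviour: everything stays in memory until the final flush.
$atEnd = new BufferingLoaderSketch();
// Periodic flush: at most 10 items are held at once.
$everyTen = new BufferingLoaderSketch(flushEvery: 10);

foreach (range(1, 25) as $item) {
    $atEnd->load($item);
    $everyTen->load($item);
}

// Final flush at the end of the run, for whatever is still buffered.
$atEnd->flush();
$everyTen->flush();
```

With 25 items, the first loader writes a single batch of 25, while the second writes batches of 10, 10, and 5, trading more writes for a smaller memory footprint.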

Feel free to adjust a `flushFrequency` that fits your needs to manage memory usage and data processing efficiency
and optionally trigger an early flush at any time during the ETL process:
> [!TIP]
> - Feel free to adjust a `flushFrequency` that fits your needs to manage memory usage and data processing efficiency
> - Optionally, trigger an early flush at any time during the ETL process.

```php
$etl = (new EtlExecutor(options: new EtlConfiguration(flushFrequency: 10)))