[Exporter] Refactor jobs implementation to use only Go SDK
alexott committed Jan 8, 2025
1 parent 01b63ac commit c0cd3c9
Showing 7 changed files with 330 additions and 375 deletions.
2 changes: 1 addition & 1 deletion docs/guides/experimental-exporter.md
@@ -163,7 +163,7 @@ For security reasons, [databricks_secret](../resources/secret.md) cannot contain
To speed up export, Terraform Exporter performs many operations, such as listing & actual data exporting, in parallel using Goroutines. Built-in defaults control the parallelism, but it's also possible to tune some parameters using environment variables specific to the exporter:

* `EXPORTER_WS_LIST_PARALLELISM` (default: `5`) controls how many Goroutines are used to perform parallel listing of Databricks Workspace objects (notebooks, directories, workspace files, ...).
-* `EXPORTER_DIRECTORIES_CHANNEL_SIZE` (default: `100000`) controls the channel's capacity when listing workspace objects. Please ensure that this value is big enough (greater than the number of directories in the workspace; default value should be ok for most cases); otherwise, there is a chance of deadlock.
+* `EXPORTER_DIRECTORIES_CHANNEL_SIZE` (default: `300000`) controls the channel's capacity when listing workspace objects. Please ensure that this value is big enough (greater than the number of directories in the workspace; default value should be ok for most cases); otherwise, there is a chance of deadlock.
* `EXPORTER_DEDICATED_RESOUSE_CHANNELS` - by default, only specific resources (`databricks_user`, `databricks_service_principal`, `databricks_group`) have dedicated channels - the rest are handled by the shared channel. This is done to prevent throttling by specific APIs. You can override this by providing a comma-separated list of resources as this environment variable.
* `EXPORTER_PARALLELISM_NNN` - number of Goroutines used to process resources of a specific type (replace `NNN` with the exact resource name, for example, `EXPORTER_PARALLELISM_databricks_notebook=10` sets the number of Goroutines for `databricks_notebook` resource to `10`). There is a shared channel (with name `default`) for handling of resources for which there are no dedicated channels - use `EXPORTER_PARALLELISM_default` to increase its size (default size is `15`). Defaults for some resources are defined by the `goroutinesNumber` map in `exporter/context.go` or equal to `2` if there is no value. *Don't increase default values too much to avoid REST API throttling!*
* `EXPORTER_DEFAULT_HANDLER_CHANNEL_SIZE` is the size of the shared channel (default: `200000`). You may need to increase it if you have a huge workspace.