Add documentation for Settings and Constants management #1521
base: main
Signed-off-by: elronbandel <elronbandel@gmail.com>
…and environment variable details Signed-off-by: elronbandel <elronbandel@gmail.com>
…es and descriptions Signed-off-by: elronbandel <elronbandel@gmail.com>
.. _settings:

=====================================
Library Settings and Constants
Will any user need to access constants? Judging from the list, they seem to be internal only.
These could be part of the code documentation. Putting them here complicates the explanation for most users.
- Simplify debugging and testing.
- Enable dynamic configuration using environment variables or runtime contexts.

Adding New Settings
This is not for users, only for contributors. I don't think we should document it in this tutorial, only in the code (or at most as a final section, "for developers").
- Use a clear and descriptive name for the setting.
- Always specify the type as one of `int`, `float`, or `bool`.

Adding New Constants
Even more so here: constants are not meant to be added by users.
Using Settings Context
======================

The :class:`Settings <settings_utils.Settings>` class provides a `context` manager to temporarily override settings within a specific block of code. After exiting the block, the settings revert to their original values.
This is important.
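To make the override-and-revert behavior concrete, here is a minimal, self-contained sketch. `ToySettings` is a hypothetical stand-in written for illustration and is not the library's `Settings` class; only the idea of a `context(...)` block that restores the previous values on exit is taken from the text above.

```python
from contextlib import contextmanager


class ToySettings:
    """Toy stand-in for a settings object (an assumption, not library code)."""

    def __init__(self, **defaults):
        self._values = dict(defaults)

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails, so it serves
        # reads of setting names such as `allow_unverified_code`.
        try:
            return self.__dict__["_values"][name]
        except KeyError:
            raise AttributeError(name) from None

    @contextmanager
    def context(self, **overrides):
        unknown = set(overrides) - set(self._values)
        if unknown:
            raise KeyError(f"unknown settings: {sorted(unknown)}")
        # Remember the current values of only the keys being overridden.
        saved = {key: self._values[key] for key in overrides}
        self._values.update(overrides)
        try:
            yield self
        finally:
            # Restore the saved values even if the block raised.
            self._values.update(saved)


settings = ToySettings(allow_unverified_code=False)

with settings.context(allow_unverified_code=True):
    inside = settings.allow_unverified_code  # overridden inside the block

after = settings.allow_unverified_code  # reverted after the block
print(inside, after)
```

The `try`/`finally` is the design point worth documenting: the override is undone whether the block finishes normally or raises, so a failing test cannot leak a changed setting into later code.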
- bool
- False
- UNITXT_ALLOW_UNVERIFIED_CODE
- Enables or disables execution of unverified code.
- Enables or disables execution of unverified code.
- Enables or disables execution of unverified code. Unverified code includes executable code from HF datasets and calls to ExecuteExpressions or other operators that run user code. This ensures only trusted code is executed.
- bool
- False
- UNITXT_USE_ONLY_LOCAL_CATALOGS
- Restricts operations to use only local catalogs.
- Restricts operations to use only local catalogs.
- Restricts loading of artifacts to only use local catalogs on local filesystems (and not remote GitHub repos).
- int
- None
- UNITXT_GLOBAL_LOADER_LIMIT
- Sets a limit on the number of global data loaders.
Is it the default value for "loader_limit"?
- None
- None
- UNITXT_CATALOGS
- Specifies the catalogs configuration.
Not clear.
- None
- None
- UNITXT_ARTIFACTORIES
- Defines the artifact storage configuration.
Also not clear.
- str
- "dataset_recipe"
- UNITXT_DEFAULT_RECIPE
- Specifies the default recipe for datasets.
Is it needed? What can it be set to?
- bool
- False
- UNITXT_USE_EAGER_EXECUTION
- Enables eager execution for tasks.
Describe what it is.
- list
- []
- UNITXT_REMOTE_METRICS
- Defines a list of configurations for remote metrics.
This is not really checked. Should we keep it?
- bool
- False
- UNITXT_TEST_CARD_DISABLE
- Disables the use of test cards when enabled.
Why use it?
- bool
- False
- UNITXT_TEST_METRIC_DISABLE
- Disables the use of test metrics when enabled.
Why use it?
- bool
- False
- UNITXT_SKIP_ARTIFACTS_PREPARE_AND_VERIFY
- Skips preparation and verification of artifacts.
Why use it?
- bool
- True
- UNITXT_DISABLE_HF_DATASETS_CACHE
- Disables caching for Hugging Face datasets.
This is an important one. We need to describe the behavior, why caching is disabled by default, and what changing it means.
- int
- 1
- UNITXT_LOADER_CACHE_SIZE
- Sets the cache size for data loaders.
When is it used?
- bool
- True
- UNITXT_TASK_DATA_AS_TEXT
- Enables representation of task data as plain text.
Why set it?
- None
- None
- UNITXT_DEFAULT_FORMAT
- Defines the default format for data processing.
This is important.
- str
- "watsonx"
- UNITXT_DEFAULT_PROVIDER
- Specifies the default provider for tasks.
- Specifies the default provider for tasks.
- Defines the default provider used by CrossProviderInferenceEngine. Used to change the platform (OpenAI, HF, Watson) used for inference calls and LLM as Judges without changing code.
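Since each row pairs a type and a default with a `UNITXT_*` environment variable, a short sketch of that mapping might help readers of the table. Everything here is an assumption made for illustration: `read_setting` is a hypothetical helper, the accepted truthy spellings are a guess, and `"some-provider"` is a placeholder value. It does not show how the library itself parses its environment.

```python
import os


def read_setting(name, kind, default, environ=None):
    """Hypothetical helper: read UNITXT_<NAME> and convert it to `kind`."""
    environ = os.environ if environ is None else environ
    raw = environ.get("UNITXT_" + name.upper())
    if raw is None:
        # Variable not set: fall back to the documented default.
        return default
    if kind is bool:
        # Assumed truthy spellings; bool("false") would be True otherwise.
        return raw.strip().lower() in {"1", "true", "yes", "on"}
    return kind(raw)


# A plain dict stands in for os.environ so the example has no side effects.
env = {
    "UNITXT_DEFAULT_PROVIDER": "some-provider",
    "UNITXT_ALLOW_UNVERIFIED_CODE": "true",
}

provider = read_setting("default_provider", str, "watsonx", env)
allow = read_setting("allow_unverified_code", bool, False, env)
cache_size = read_setting("loader_cache_size", int, 1, env)  # unset, so default
```

The explicit `bool` branch is the reason the table's type column matters: environment values are always strings, so each setting needs a declared type to be converted correctly.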
Closes: #1517