components dataset_downloader

github-actions[bot] edited this page Sep 9, 2023 · 19 revisions

Dataset Downloader

dataset_downloader

Overview

Description: Downloads the dataset to the blob store.

Version: 0.0.1

View in Studio: https://ml.azure.com/registries/azureml/components/dataset_downloader/version/0.0.1

Inputs

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| dataset_name | Name of the dataset to download from HuggingFace. Must be null if script_path is specified. | string | | True | |
| configuration | To download a specific sub-dataset, specify its configuration name; specify 'all' to download all configurations. Otherwise, leave it null. | string | | True | |
| split | To download a specific split, specify its name; specify 'all' to download all splits. | string | | False | |
| script_path | Path to the dataset loading script, which must follow the HuggingFace dataset loading script template. For examples, see https://github.com/Azure/azureml-assets/tree/main/assets/aml-benchmark/scripts/data_loaders. | uri_file | | True | |
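
The input rules in the table above can be checked before submitting a job. The following is a minimal sketch, not part of the component: the helper name `validate_downloader_inputs` is hypothetical, and the rule that at least one of dataset_name and script_path must be given is an assumption inferred from the descriptions rather than something the component documents.

```python
from typing import Optional


def validate_downloader_inputs(
    split: str,
    dataset_name: Optional[str] = None,
    configuration: Optional[str] = None,
    script_path: Optional[str] = None,
) -> dict:
    """Check the documented input rules and return the inputs as a dict.

    Hypothetical helper for illustration only; not the component's own code.
    """
    # Documented rule: dataset_name must be null when a loading script is given.
    if script_path is not None and dataset_name is not None:
        raise ValueError("dataset_name must be null when script_path is specified")
    # Assumption (not stated in the table): one of the two must identify the dataset.
    if script_path is None and dataset_name is None:
        raise ValueError("specify either dataset_name or script_path")
    # Documented rule: split is the only non-optional input.
    if not split:
        raise ValueError("split is required; use 'all' to download every split")
    return {
        "dataset_name": dataset_name,
        "configuration": configuration,
        "split": split,
        "script_path": script_path,
    }


# Example values are illustrative; "squad" is used only as a sample dataset name.
inputs = validate_downloader_inputs(split="all", dataset_name="squad")
```

Leaving configuration unset mirrors the table's guidance to leave it null when no specific sub-dataset is wanted.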

Outputs

| Name | Description | Type |
| --- | --- | --- |
| output_dataset | Path to the directory where the dataset will be downloaded. | uri_folder |

Environment

azureml://registries/azureml/environments/model-evaluation/labels/latest
