
github-actions[bot] edited this page Oct 23, 2023 · 20 revisions

distilbert-base-cased-distilled-squad

Overview

The DistilBERT model is a distilled version of BERT: a smaller, faster, and cheaper Transformer-based language model. This checkpoint is fine-tuned for extractive question answering in English via knowledge distillation on SQuAD v1.1. It has 40% fewer parameters than bert-base-uncased and runs 60% faster, while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark; it reaches an F1 score of 87.1 on the SQuAD v1.1 dev set.

The model has known limitations and biases and should be used with caution: it should not be used to intentionally create hostile or alienating environments for people, and it was not trained to produce factual or true representations of people or events. Its environmental impact is unknown, since the carbon emissions and cloud provider were not reported. Training required significant computational power (8 × 16GB V100 GPUs for 90 hours), which is worth keeping in mind when developing applications that build on this model. The model was developed by Hugging Face and is licensed under Apache 2.0; the authors of the Model Card are the Hugging Face team.

The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.
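Before deploying, the model can also be tried locally through the Hugging Face `transformers` question-answering pipeline. The sketch below assumes the `transformers` package (with a backend such as `torch`) is installed; `build_samples` is a small hypothetical helper, introduced here only to adapt the endpoint's parallel-list input schema into per-sample pipeline calls.

```python
def build_samples(input_data):
    """Pair the parallel "question"/"context" lists from the endpoint's
    input_data schema into per-sample dicts (hypothetical helper)."""
    return [
        {"question": q, "context": c}
        for q, c in zip(input_data["question"], input_data["context"])
    ]


if __name__ == "__main__":
    # Downloads the checkpoint from the Hugging Face Hub on first use.
    from transformers import pipeline

    qa = pipeline("question-answering",
                  model="distilbert-base-cased-distilled-squad")
    payload = {
        "question": ["What is my name?", "Where do I live?"],
        "context": ["My name is John and I live in Seattle.",
                    "My name is Ravi and I live in Hyderabad."],
    }
    for sample in build_samples(payload):
        # The model card's sample output reports "John" and "Hyderabad"
        # for these two examples.
        print(qa(**sample)["answer"])
```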

Inference samples

| Inference type | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- |
| Real time | question-answering-online-endpoint.ipynb | question-answering-online-endpoint.sh |
| Batch | question-answering-batch-endpoint.ipynb | coming soon |

Finetuning samples

| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- | --- | --- |
| Text Classification | Emotion Detection | Emotion | emotion-detection.ipynb | emotion-detection.sh |
| Token Classification | Named Entity Recognition | CoNLL-2003 | named-entity-recognition.ipynb | named-entity-recognition.sh |
| Question Answering | Extractive Q&A | SQuAD (Wikipedia) | extractive-qa.ipynb | extractive-qa.sh |

Model Evaluation

| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- | --- | --- |
| Question Answering | Extractive Q&A | SQuAD v2 | evaluate-model-question-answering.ipynb | evaluate-model-question-answering.yml |

Sample input

```json
{
    "input_data": {
        "question": ["What is my name?", "Where do I live?"],
        "context": ["My name is John and I live in Seattle.", "My name is Ravi and I live in Hyderabad."]
    }
}
```

Sample output

```json
[
    {
        "0": "John"
    },
    {
        "0": "Hyderabad"
    }
]
```
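The request/response round trip above can be scripted against a deployed online endpoint using only the Python standard library. This is a sketch under stated assumptions: the scoring URI and API key are placeholders you obtain from your own deployment, and `extract_answers` is a hypothetical helper for the `[{"0": answer}, ...]` response shape shown above.

```python
import json
import urllib.request


def invoke_endpoint(scoring_uri, api_key, payload):
    """POST an input_data payload to an online endpoint and decode the reply.
    scoring_uri and api_key are placeholders from your own deployment."""
    req = urllib.request.Request(
        scoring_uri,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def extract_answers(response):
    """Flatten the [{"0": answer}, ...] sample-output shape into a plain list."""
    return [next(iter(item.values())) for item in response]


# extract_answers([{"0": "John"}, {"0": "Hyderabad"}]) -> ["John", "Hyderabad"]
```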

Version: 9

Tags

Preview

- computes_allow_list: ['Standard_NV12s_v3', 'Standard_NV24s_v3', 'Standard_NV48s_v3', 'Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_NC24rs_v3', 'Standard_NC6s_v2', 'Standard_NC12s_v2', 'Standard_NC24s_v2', 'Standard_NC24rs_v2', 'Standard_NC4as_T4_v3', 'Standard_NC8as_T4_v3', 'Standard_NC16as_T4_v3', 'Standard_NC64as_T4_v3', 'Standard_ND6s', 'Standard_ND12s', 'Standard_ND24s', 'Standard_ND24rs', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4']
- license: apache-2.0
- model_specific_defaults: ordereddict([('apply_deepspeed', 'true'), ('apply_lora', 'true'), ('apply_ort', 'true')])
- task: question-answering

View in Studio: https://ml.azure.com/registries/azureml/models/distilbert-base-cased-distilled-squad/version/9

License: apache-2.0

Properties

SHA: 2d81d2eeffc5be1dad0839e48c04dd02952f8606

datasets: squad

evaluation-min-sku-spec: 8|0|28|56

evaluation-recommended-sku: Standard_DS4_v2

finetune-min-sku-spec: 4|1|28|176

finetune-recommended-sku: Standard_NC24rs_v3

finetuning-tasks: text-classification, token-classification, question-answering

inference-min-sku-spec: 2|0|7|14

inference-recommended-sku: Standard_DS2_v2, Standard_D2a_v4, Standard_D2as_v4, Standard_DS3_v2, Standard_D4a_v4, Standard_D4as_v4, Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_F4s_v2, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E2s_v3, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2

languages: en
