models distilbert base cased distilled squad
Description: DistilBERT is a smaller, faster, and cheaper Transformer-based language model distilled from BERT. This checkpoint targets question answering in English and was fine-tuned using knowledge distillation on SQuAD v1.1. It has 40% fewer parameters than bert-base-uncased and runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark. The model has an F1 score of 87.1 on the SQuAD v1.1 dev set.

The model has limitations and biases and should be used with caution. It should not be used to intentionally create hostile or alienating environments for people, and it was not trained to produce factual or true representations of people or events. Its environmental impact is unknown because the carbon emissions and cloud provider were not reported. Training is computationally expensive: the model was trained on eight 16 GB V100 GPUs for 90 hours, which is worth keeping in mind when developing applications that use it.

The model was developed by Hugging Face and is licensed under Apache 2.0. The Hugging Face team authored the model card.

> The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.
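The F1 score quoted for SQuAD is a token-overlap measure between a predicted answer and a reference answer. The following is a simplified sketch of that idea, not the official evaluation script; the function names `normalize` and `token_f1` are illustrative. A reported dataset score would additionally take the best match over all reference answers for each question and average across questions.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall for one answer pair."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, a prediction of "in Seattle" against the reference "Seattle" has precision 0.5 and recall 1.0, so it earns partial credit rather than zero.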
### Inference samples

| Inference type | Python sample (Notebook) | CLI with YAML |
|--|--|--|
| Real time | question-answering-online-endpoint.ipynb | question-answering-online-endpoint.sh |
| Batch | question-answering-batch-endpoint.ipynb | coming soon |

### Finetuning samples

| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
|--|--|--|--|--|
| Text Classification | Emotion Detection | Emotion | emotion-detection.ipynb | emotion-detection.sh |
| Token Classification | Named Entity Recognition | Conll2003 | named-entity-recognition.ipynb | named-entity-recognition.sh |
| Question Answering | Extractive Q&A | SQUAD (Wikipedia) | extractive-qa.ipynb | extractive-qa.sh |

### Model Evaluation

| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
|--|--|--|--|--|
| Question Answering | Extractive Q&A | Squad v2 | evaluate-model-question-answering.ipynb | evaluate-model-question-answering.yml |

#### Sample input

```json
{
  "inputs": {
    "question": ["What is my name?", "Where do I live?"],
    "context": ["My name is John and I live in Seattle.", "My name is Ravi and I live in Hyderabad."]
  }
}
```
#### Sample output

```json
[
  { "0": "John" },
  { "0": "Hyderabad" }
]
```
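The request and response shapes in the samples can be produced and consumed with a few lines of Python. This is a minimal sketch that assumes a deployed endpoint returns exactly the output shape shown in the sample; `build_payload` and `parse_answers` are hypothetical helper names, and sending the request is omitted because the endpoint URL and authentication are deployment-specific.

```python
import json


def build_payload(questions, contexts):
    """Pair each question with its context in the sample input shape."""
    if len(questions) != len(contexts):
        raise ValueError("questions and contexts must have the same length")
    return {"inputs": {"question": list(questions), "context": list(contexts)}}


def parse_answers(response_text):
    """Extract answer strings, assuming the sample output shape:
    a JSON list of objects that each hold the answer under the key "0"."""
    rows = json.loads(response_text)
    return [row["0"] for row in rows]


payload = build_payload(
    ["What is my name?", "Where do I live?"],
    ["My name is John and I live in Seattle.",
     "My name is Ravi and I live in Hyderabad."],
)
request_body = json.dumps(payload)  # string to send as the request body
```

Questions and contexts are matched by position, so the first question is answered from the first context, and so on.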
Version: 5
Preview
license : apache-2.0
task : question-answering
View in Studio: https://ml.azure.com/registries/azureml/models/distilbert-base-cased-distilled-squad/version/5
License: apache-2.0
SHA: 2d81d2eeffc5be1dad0839e48c04dd02952f8606
datasets: squad
evaluation-min-sku-spec: 2|0|7|14
evaluation-recommended-sku: Standard_DS2_v2
finetune-min-sku-spec: 4|1|28|176
finetune-recommended-sku: Standard_NC24rs_v3
finetuning-tasks: text-classification, token-classification, question-answering
inference-min-sku-spec: 2|0|7|14
inference-recommended-sku: Standard_DS2_v2
languages: en
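
The `*-min-sku-spec` properties are pipe-separated integers. The sketch below splits such a value into named fields. The field meanings (CPU cores, GPUs, memory in GB, storage in GB) are an assumption that this page does not state, and `SkuSpec` and `parse_sku_spec` are hypothetical names.

```python
from typing import NamedTuple


class SkuSpec(NamedTuple):
    # Field meanings are assumed, not documented on this page.
    cpu_cores: int
    gpus: int
    memory_gb: int
    storage_gb: int


def parse_sku_spec(spec: str) -> SkuSpec:
    """Split a value such as "2|0|7|14" into four integer fields."""
    parts = spec.split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 pipe-separated fields, got {len(parts)}")
    return SkuSpec(*(int(part) for part in parts))


inference_min = parse_sku_spec("2|0|7|14")
finetune_min = parse_sku_spec("4|1|28|176")
```

Under this reading, the inference and evaluation minimums need no GPU, while the finetuning minimum asks for one.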