Add Model Zoo testing support (#2990)
1 parent 006dec2, commit 497c277
Showing 28 changed files with 2,418 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Model Zoo | ||
|
||
- [Test Generator with Datasets](./test_generator/) | ||
- [ONNX Zoo](./onnx_zoo/) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
# ONNX Zoo model tester | ||
|
||
Helper script that runs the [`ONNX Zoo models`](https://onnx.ai/models/) that ship with test data through [`test_runner.py`](../../test_runner.py). | ||
|
||
## Getting the repository | ||
|
||
> [!IMPORTANT] | ||
> Make sure to enable git-lfs. | ||
```bash | ||
git clone https://github.com/onnx/models.git --depth 1 | ||
``` | ||
|
||
## Running the tests | ||
|
||
> [!IMPORTANT] | ||
> The argument must point to a folder, not a file. | ||
```bash | ||
# Optional: prefix the command with VERBOSE=1 and/or DEBUG=1 for more logging. | ||
# ATOL=0.001, RTOL=0.001 and TARGET=gpu are the defaults; override them the same way. | ||
./test_models.sh models/validated | ||
``` | ||
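How `test_runner.py` applies `ATOL` and `RTOL` is not shown in this change. As a rough sketch, assuming it follows the common NumPy-style `allclose` convention (an assumption, not confirmed by the source), an output passes when every element satisfies `|actual - expected| <= atol + rtol * |expected|`. The helper name `within_tolerance` is hypothetical:

```python
import math


def within_tolerance(actual, expected, atol=0.001, rtol=0.001):
    """Return True if every pair satisfies |a - e| <= atol + rtol * |e|.

    Hypothetical helper: assumes the NumPy-style allclose convention,
    which may differ from what test_runner.py actually does.
    """
    if len(actual) != len(expected):
        return False
    return all(
        math.fabs(a - e) <= atol + rtol * math.fabs(e)
        for a, e in zip(actual, expected)
    )
```

With the default values, an output of `1.0005` against a reference of `1.0` would be accepted under this convention, while `1.01` would not.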
|
||
You can also pass multiple folders, e.g.: | ||
|
||
```bash | ||
./test_models.sh models/validated/text/machine_comprehension/t5/ models/validated/vision/classification/shufflenet/ | ||
``` | ||
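The script walks each folder argument recursively and processes every `*.tar.gz` file it finds. The discovery step alone can be sketched in Python as below. The function name `find_archives` is hypothetical, and unlike the shell glob this version also descends into hidden folders and raises an error when given a file instead of silently finding nothing:

```python
from pathlib import Path


def find_archives(folder):
    """Return every *.tar.gz file below `folder`, sorted by path.

    Hypothetical helper mirroring the recursive walk in test_models.sh.
    """
    root = Path(folder)
    if not root.is_dir():
        # The argument must point to a folder, not a file.
        raise NotADirectoryError(f"{folder} is not a folder")
    return sorted(p for p in root.rglob("*.tar.gz") if p.is_file())
```

Calling `find_archives("models/validated")` would list the archives that a run over the same folder is expected to visit.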
|
||
## Results | ||
|
||
Results are separated by dtype: `logs/fp32` and `logs/fp16`. | ||
|
||
### Helpers | ||
|
||
```bash | ||
# Something went wrong | ||
grep -HRL PASSED logs | ||
# Runtime error | ||
grep -HRi RuntimeError logs/ | ||
# Accuracy issue | ||
grep -HRl FAILED logs | ||
``` | ||
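The same triage can be sketched in Python to get a per-category summary. This assumes the logs contain the `PASSED`, `FAILED`, and `RuntimeError` markers that the grep helpers look for. The function names and the priority order (runtime error first, then accuracy issue) are choices of this sketch, not part of the script:

```python
from collections import defaultdict
from pathlib import Path


def classify_log(text):
    """Map one log's text to a category, mirroring the grep helpers."""
    if "runtimeerror" in text.lower():  # like: grep -i RuntimeError
        return "runtime_error"
    if "FAILED" in text:                # like: grep -l FAILED
        return "accuracy_issue"
    if "PASSED" not in text:            # like: grep -L PASSED
        return "no_pass_marker"
    return "passed"


def summarize(log_dir):
    """Group every *.log file below `log_dir` by category."""
    summary = defaultdict(list)
    for path in sorted(Path(log_dir).rglob("*.log")):
        text = path.read_text(errors="replace")
        summary[classify_log(text)].append(str(path))
    return dict(summary)
```

For example, `summarize("logs/fp16")` returns a dictionary whose keys are the categories that actually occurred.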
|
||
## Cleanup | ||
|
||
If at any point something fails, the following things might need cleanup: | ||
- Remove the `tmp_model` folder | ||
- Run `git lfs prune` in the `models` checkout |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
#!/bin/bash | ||
|
||
##################################################################################### | ||
# The MIT License (MIT) | ||
# | ||
# Copyright (c) 2015-2024 Advanced Micro Devices, Inc. All rights reserved. | ||
# | ||
# Permission is hereby granted, free of charge, to any person obtaining a copy | ||
# of this software and associated documentation files (the "Software"), to deal | ||
# in the Software without restriction, including without limitation the rights | ||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
# copies of the Software, and to permit persons to whom the Software is | ||
# furnished to do so, subject to the following conditions: | ||
# | ||
# The above copyright notice and this permission notice shall be included in | ||
# all copies or substantial portions of the Software. | ||
# | ||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
# THE SOFTWARE. | ||
# | ||
##################################################################################### | ||
|
||
set -e | ||
|
||
WORK_DIR="$(cd -P -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd -P)" | ||
SCRIPT_PATH="$(dirname "$(dirname "$(dirname "$(readlink -f "$0")")")")/test_runner.py" | ||
TESTER_SCRIPT="${TESTER:-$SCRIPT_PATH}" | ||
ATOL="${ATOL:-0.001}" | ||
RTOL="${RTOL:-0.001}" | ||
TARGET="${TARGET:-gpu}" | ||
|
||
if [[ "${DEBUG:-0}" -eq 1 ]]; then | ||
PIPE=/dev/stdout | ||
else | ||
PIPE=/dev/null | ||
fi | ||
|
||
if [[ "${VERBOSE:-0}" -eq 1 ]]; then | ||
set -x | ||
fi | ||
|
||
# Iterate through input recursively, process any tar.gz file | ||
function iterate() { | ||
local dir="$1" | ||
|
||
for file in "$dir"/*; do | ||
if [ -f "$file" ]; then | ||
if [[ $file = *.tar.gz ]]; then | ||
process "$file" | ||
fi | ||
fi | ||
|
||
if [ -d "$file" ]; then | ||
iterate "$file" | ||
fi | ||
done | ||
} | ||
|
||
# Process will download the lfs file, extract model and test data | ||
# Test it with test_runner.py, then cleanup | ||
function process() { | ||
local file="$1" | ||
echo "INFO: process $file started" | ||
setup "$file" | ||
test "$file" fp32 | ||
test "$file" fp16 | ||
cleanup "$file" | ||
echo "INFO: process $file finished" | ||
} | ||
|
||
# Download and extract files | ||
function setup() { | ||
local file="$1" | ||
echo "INFO: setup $file" | ||
local_file="$(basename -- "$file")" | ||
# Change into the archive's folder so git lfs can pull the file | ||
folder="$(cd -P -- "$(dirname -- "$file")" && pwd -P)" | ||
cd "$folder" &> "${PIPE}" && git lfs pull --include="$local_file" --exclude="" &> "${PIPE}"; cd - &> "${PIPE}" | ||
tar xzf "$file" -C "$WORK_DIR/tmp_model" &> "${PIPE}" | ||
} | ||
|
||
# Remove tmp files and prune models | ||
function cleanup() { | ||
local file="$1" | ||
echo "INFO: cleanup $file" | ||
# Change into the archive's folder so git lfs prunes the right repository | ||
folder="$(cd -P -- "$(dirname -- "$file")" && pwd -P)" | ||
cd "$folder" &> "${PIPE}" && git lfs prune &> "${PIPE}"; cd - &> "${PIPE}" | ||
# -f keeps an already-empty tmp_model from aborting the script under `set -e` | ||
rm -rf "$WORK_DIR"/tmp_model/* &> "${PIPE}" | ||
} | ||
|
||
# Run test_runner.py and log if something goes wrong | ||
function test() { | ||
local file="$1" | ||
echo "INFO: test $file ($2)" | ||
local_file="$(basename -- "$file")" | ||
flag="--atol $ATOL --rtol $RTOL --target $TARGET" | ||
if [[ "$2" = "fp16" ]]; then | ||
flag="$flag --fp16" | ||
fi | ||
EXIT_CODE=0 | ||
python3 "$TESTER_SCRIPT" ${flag} "$WORK_DIR"/tmp_model/*/ &> "$WORK_DIR/logs/$2/${local_file//\//_}.log" || EXIT_CODE=$? | ||
if [[ "${EXIT_CODE:-0}" -ne 0 ]]; then | ||
echo "WARNING: ${file} failed ($2)" | ||
fi | ||
} | ||
|
||
mkdir -p "$WORK_DIR/logs/fp32/" "$WORK_DIR/logs/fp16/" "$WORK_DIR/tmp_model" | ||
rm -fr "$WORK_DIR"/tmp_model/* | ||
|
||
for arg in "$@"; do | ||
iterate "$(dirname "$(readlink -e "$arg")")/$(basename "$arg")" | ||
done |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
# Test Generator with Datasets | ||
|
||
Helper module to generate real samples from datasets for specific models. | ||
|
||
## Prerequisites | ||
|
||
```bash | ||
python3 -m venv .venv | ||
. .venv/bin/activate | ||
pip install -r requirements.txt | ||
``` | ||
|
||
To use audio-based datasets, install `libsndfile`: | ||
```bash | ||
apt install libsndfile1 | ||
``` | ||
|
||
## Usage | ||
|
||
```bash | ||
usage: generate.py [-h] | ||
                   [--image {all,none,...}] | ||
                   [--text {all,none,...}] | ||
                   [--audio {all,none,...}] | ||
                   [--output-folder-prefix OUTPUT_FOLDER_PREFIX] | ||
                   [--sample-limit SAMPLE_LIMIT] | ||
                   [--decode-limit DECODE_LIMIT] | ||
|
||
optional arguments: | ||
  -h, --help            show this help message and exit | ||
  --image {all,none,...} | ||
                        Image models to test with imagenet-2012-val dataset samples | ||
  --text {all,none,...} | ||
                        Text models to test with squad-hf dataset samples | ||
  --audio {all,none,...} | ||
                        Audio models to test with librispeech-asr dataset samples | ||
  --output-folder-prefix OUTPUT_FOLDER_PREFIX | ||
                        Output path will be "<this-prefix>/<dataset-name>/<model-name>" | ||
  --sample-limit SAMPLE_LIMIT | ||
                        Max number of samples generated. Use 0 to ignore it. | ||
  --decode-limit DECODE_LIMIT | ||
                        Max number of sum-samples generated for decoder models. Use 0 to ignore it. (Only for decoder models) | ||
``` | ||
|
||
> [!NOTE] | ||
> Some models require permission to access; log in first with `huggingface-cli login`. | ||
To generate everything: | ||
```bash | ||
python generate.py | ||
``` | ||
|
||
To generate a subset of the supported models, pass one of the following to each option: | ||
- `none` to skip that category | ||
- `all` for every model | ||
- `<name> ...`: one or more supported model names | ||
|
||
```bash | ||
python generate.py --image resnet50_v1.5 clip-vit-large-patch14 --text none --audio none | ||
``` | ||
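How `generate.py` resolves these values is not shown here. The following is only a minimal sketch of one way the `all` / `none` / name-list convention could be handled with `argparse`. The `SUPPORTED_IMAGE` list contains just the two names from the example above, and `build_parser` and `resolve` are hypothetical names:

```python
import argparse

# Hypothetical list: only the two names used in the example command.
SUPPORTED_IMAGE = ["resnet50_v1.5", "clip-vit-large-patch14"]


def build_parser():
    """Build a parser with a single --image option (sketch only)."""
    parser = argparse.ArgumentParser(prog="generate.py")
    parser.add_argument(
        "--image",
        nargs="+",  # accepts one or more values after the flag
        choices=["all", "none", *SUPPORTED_IMAGE],
        default=["all"],
    )
    return parser


def resolve(selection, supported):
    """Turn the raw option values into a concrete list of model names."""
    if "none" in selection:
        return []
    if "all" in selection:
        return list(supported)
    # Keep the order of the supported list and drop duplicates.
    return [name for name in supported if name in selection]
```

In this sketch, `resolve(build_parser().parse_args(["--image", "none"]).image, SUPPORTED_IMAGE)` yields an empty list, and omitting the option selects every supported model.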
|
||
## Test models | ||
|
||
`test_models.sh` runs all downloaded models on the `generated` samples. The results are written to `logs`. | ||
|
||
```bash | ||
./test_models.sh generated/ | ||
``` | ||
|
||
> [!NOTE] | ||
> `generated` is the default output folder; if you changed `--output-folder-prefix`, pass that folder instead. | ||
## Adding more models | ||
|
||
To add more models, first choose the proper file: | ||
- [image](./sample_generator/model/image.py) | ||
- [text](./sample_generator/model/text.py) | ||
- [audio](./sample_generator/model/audio.py) | ||
- [hybrid](./sample_generator/model/hybrid.py) | ||
|
||
For example, adding a basic model (e.g. ResNet) looks like this: | ||
|
||
```python | ||
class ResNet50_v1_5(OptimumHFModelDownloadMixin, | ||
                    AutoImageProcessorHFMixin, BaseModel): | ||
    @property | ||
    def model_id(self): | ||
        return "microsoft/resnet-50" | ||
|
||
    @staticmethod | ||
    def name(): | ||
        return "resnet50_v1.5" | ||
``` | ||
|
||
Define the class with the proper `Mixin`s: | ||
- `OptimumHFModelDownloadMixin`: downloads the model from Hugging Face and exports it to ONNX with Optimum | ||
- `AutoImageProcessorHFMixin`: defines the processor from Hugging Face (this depends on the model type) | ||
- `BaseModel`: the default model type; the other choice is `DecoderModel` | ||
|
||
Provide the 2 mandatory fields: | ||
- `model_id`: the Hugging Face model ID (e.g. `microsoft/resnet-50`) | ||
- `name`: a unique name for the model | ||
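To show how the two fields interact with the mixin pattern, here is a self-contained sketch that uses stand-in classes. `BaseModel` and `DownloadMixin` below are hypothetical stubs, not the real classes from `sample_generator`, and `find_model` is an illustrative lookup rather than the project's actual mechanism:

```python
class BaseModel:
    """Hypothetical stub for the real BaseModel."""

    @property
    def model_id(self):
        raise NotImplementedError

    @staticmethod
    def name():
        raise NotImplementedError


class DownloadMixin:
    """Hypothetical stand-in for OptimumHFModelDownloadMixin."""

    def describe_download(self):
        # The mixin relies on the subclass to supply model_id.
        return f"would export {self.model_id} to ONNX"


class ResNet50_v1_5(DownloadMixin, BaseModel):
    @property
    def model_id(self):
        return "microsoft/resnet-50"

    @staticmethod
    def name():
        return "resnet50_v1.5"


def find_model(name, candidates):
    """Instantiate the first class whose name() matches."""
    for cls in candidates:
        if cls.name() == name:
            return cls()
    raise KeyError(name)
```

Because `name()` is a static method, a lookup like `find_model("resnet50_v1.5", [ResNet50_v1_5])` can match classes before creating any instance, while `model_id` is only read once the instance exists.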
|
||
To add a more complex model (e.g. Decoder), check [text](./sample_generator/model/text.py). | ||
|
||
[generate.py](./generate.py) must also be updated to include the new model. | ||
|
||
## Adding more datasets | ||
|
||
The 3 most common use cases are handled: | ||
- `Image`: with [imagenet](./sample_generator/dataset/imagenet.py) | ||
- `Text`: with [squad](./sample_generator/dataset/squad.py) | ||
- `Audio`: with [librispeech](./sample_generator/dataset/librispeech.py) | ||
|
||
To add a new use case, e.g. Video, create a new Python file in the `dataset` folder and define a class that inherits from `Base`. | ||
|
||
[generate.py](./generate.py) must also be updated to include the new dataset. |