| Model | Backbone | Language | Dataset | F-score(%) | FPS | Data shape (NCHW) | Config | Download |
|---|---|---|---|---|---|---|---|---|
| DBNet | MobileNetV3 | en | IC15 | 76.96 | 26.19 | (1,3,736,1280) | yaml | ckpt \| mindir |
| DBNet | ResNet-18 | en | IC15 | 81.73 | 24.04 | (1,3,736,1280) | yaml | ckpt \| mindir |
| DBNet | ResNet-50 | en | IC15 | 85.00 | 21.69 | (1,3,736,1280) | yaml | ckpt \| mindir |
| DBNet | ResNet-50 | ch + en | 12 Datasets | 83.41 | 21.69 | (1,3,736,1280) | yaml | ckpt \| mindir |
| DBNet++ | ResNet-50 | en | IC15 | 86.79 | 8.46 | (1,3,1152,2048) | yaml | ckpt \| mindir |
| DBNet++ | ResNet-50 | ch + en | 12 Datasets | 84.30 | 8.46 | (1,3,1152,2048) | yaml | ckpt \| mindir |
| EAST | ResNet-50 | en | IC15 | 86.86 | 6.72 | (1,3,720,1280) | yaml | ckpt \| mindir |
| EAST | MobileNetV3 | en | IC15 | 75.32 | 26.77 | (1,3,720,1280) | yaml | ckpt \| mindir |
| PSENet | ResNet-152 | en | IC15 | 82.50 | 2.52 | (1,3,1472,2624) | yaml | ckpt \| mindir |
| PSENet | ResNet-50 | en | IC15 | 81.37 | 10.16 | (1,3,736,1312) | yaml | ckpt \| mindir |
| PSENet | MobileNetV3 | en | IC15 | 70.56 | 10.38 | (1,3,736,1312) | yaml | ckpt \| mindir |
| FCENet | ResNet-50 | en | IC15 | 78.94 | 14.59 | (1,3,736,1280) | yaml | ckpt \| mindir |
| Model | Backbone | Dict File | Dataset | Acc(%) | FPS | Data shape (NCHW) | Config | Download |
|---|---|---|---|---|---|---|---|---|
| CRNN | VGG7 | Default | IC15 | 66.01 | 465.64 | (1,3,32,100) | yaml | ckpt \| mindir |
| CRNN | ResNet34_vd | Default | IC15 | 69.67 | 397.29 | (1,3,32,100) | yaml | ckpt \| mindir |
| CRNN | ResNet34_vd | ch_dict.txt | / | / | / | (1,3,32,320) | yaml | ckpt \| mindir |
| SVTR | Tiny | Default | IC15 | 79.92 | 338.04 | (1,3,64,256) | yaml | ckpt \| mindir |
| RARE | ResNet34_vd | Default | IC15 | 69.47 | 273.23 | (1,3,32,100) | yaml | ckpt \| mindir |
| RARE | ResNet34_vd | ch_dict.txt | / | / | / | (1,3,32,320) | yaml | ckpt \| mindir |
| RobustScanner | ResNet-31 | en_dict90.txt | IC15 | 73.71 | 22.30 | (1,3,48,160) | yaml | ckpt \| mindir |
| VisionLAN | ResNet-45 | Default | IC15 | 80.07 | 321.37 | (1,3,64,256) | yaml(LA) | ckpt(LA) \| mindir(LA) |
```mermaid
graph LR;
    subgraph Step 1
        A[ckpt] -- export.py --> B[MindIR];
    end
    subgraph Step 2
        B -- converter_lite --> C[MindSpore Lite MindIR];
    end
    subgraph Step 3
        C -- input --> D[infer.py];
    end
    subgraph Step 4
        D -- outputs --> E[eval_rec.py / eval_det.py];
    end
    F[images] -- input --> D;
```
As shown in the figure above, the inference process is divided into the following steps:
1. Use `tools/export.py` to export the ckpt model to a MindIR model;
2. Download and configure the model conversion tool (converter_lite), and use it to convert the MindIR to a MindSpore Lite MindIR;
3. After preparing the MindSpore Lite MindIR and the input images, use `deploy/py_infer/infer.py` to perform inference;
4. Depending on the model type, use `deploy/eval_utils/eval_det.py` to evaluate the inference results of text detection models, or `deploy/eval_utils/eval_rec.py` for text recognition models.
Note: Step 1 runs on Ascend910, GPU, or CPU; steps 2, 3, and 4 run on Ascend310 or 310P.
Let's take `DBNet ResNet-50 en` from the model support list as an example to introduce the inference method:
- Download the ckpt file from the model support list and use the following command to export it to MindIR, or directly download the exported MindIR file from the model support list:
```shell
# Use the local ckpt file to export the MindIR of the `DBNet ResNet-50 en` model
# For more parameter usage details, please execute `python tools/export.py -h`
python tools/export.py --model_name_or_config dbnet_resnet50 --data_shape 736 1280 --local_ckpt_path /path/to/dbnet.ckpt
```
In the above command:

- `--model_name_or_config` is either a model name in MindOCR or the path to a model YAML config (for example, `--model_name_or_config configs/rec/crnn/crnn_resnet34.yaml`);
- `--data_shape 736 1280` indicates that the size of the model input image is [736, 1280]. Each MindOCR model is exported with a fixed data shape; for details, see the data shape column in the model support list;
- `--local_ckpt_path /path/to/dbnet.ckpt` indicates that the model file to be exported is `/path/to/dbnet.ckpt`.
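The fixed data shape means each input image is scaled (and padded) to that exact height and width before inference. The helper below is a minimal sketch of that idea, not MindOCR's actual preprocessing code:

```python
def fit_to_data_shape(h, w, target_h=736, target_w=1280):
    # Scale the image to fit inside the fixed export shape while keeping
    # its aspect ratio, then compute the bottom/right padding still needed.
    scale = min(target_h / h, target_w / w)
    new_h, new_w = int(h * scale), int(w * scale)
    return new_h, new_w, target_h - new_h, target_w - new_w

# An IC15 image (720x1280, HxW) fits the (736, 1280) shape with 16 rows of padding:
print(fit_to_data_shape(720, 1280))  # (720, 1280, 16, 0)
```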
- Use the converter_lite tool on Ascend310 or 310P to convert the MindIR to MindSpore Lite MindIR:
Run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=MINDIR \
    --optimize=ascend_oriented \
    --modelFile=dbnet_resnet50-c3a4aa24-fbf95c82.mindir \
    --outputFile=dbnet_resnet50
```
In the above command:

- `--fmk=MINDIR` indicates that the format of the input model is MindIR (the `--fmk` parameter also supports ONNX, etc.);
- `--saveType=MINDIR` indicates that the output model format is MindIR;
- `--optimize=ascend_oriented` indicates that the model is optimized for Ascend devices;
- `--modelFile=dbnet_resnet50-c3a4aa24-fbf95c82.mindir` indicates that the path of the model to be converted is `dbnet_resnet50-c3a4aa24-fbf95c82.mindir`;
- `--outputFile=dbnet_resnet50` indicates that the path of the output model is `dbnet_resnet50`; the `.mindir` suffix is appended automatically.

After the above command is executed, the `dbnet_resnet50.mindir` model file will be generated.
Learn more about converter_lite
Learn more about Model Conversion Tutorial
- Perform inference with `deploy/py_infer/infer.py` and the `dbnet_resnet50.mindir` file:
```shell
python infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/dbnet_resnet50.mindir \
    --det_model_name_or_config=en_ms_det_dbnet_resnet50 \
    --res_save_dir=/path/to/dbnet_resnet50_results
```
After the execution is completed, the prediction file `det_results.txt` will be generated in the directory pointed to by `--res_save_dir`.

When doing inference, you can use the `--vis_det_save_dir` parameter to save visualizations of the text detection results.
Learn more about infer.py inference parameters
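Each line of `det_results.txt` pairs an image name with its predicted polygons. Assuming a tab-separated layout with a Python-literal polygon list (a hypothetical example; the real field layout may include extra fields such as confidence scores), a line can be parsed like this:

```python
import ast

# A hypothetical det_results.txt line: image name, then a list of 4-point polygons
line = "img_1.jpg\t[[[10, 10], [100, 10], [100, 40], [10, 40]]]"

name, raw = line.split("\t", 1)
polygons = ast.literal_eval(raw)  # safely parse the Python-literal list
print(name, len(polygons), polygons[0][0])  # img_1.jpg 1 [10, 10]
```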
- Evaluate the results with the following command:
```shell
python deploy/eval_utils/eval_det.py \
    --gt_path=/path/to/ic15/test_det_gt.txt \
    --pred_path=/path/to/dbnet_resnet50_results/det_results.txt
```
The result is: `{'recall': 0.8348579682233991, 'precision': 0.8657014478282576, 'f-score': 0.85}`
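The reported `f-score` is (up to rounding) the harmonic mean of recall and precision, which can be checked directly:

```python
def f_score(recall, precision):
    # Harmonic mean of recall and precision
    return 2 * recall * precision / (recall + precision)

recall, precision = 0.8348579682233991, 0.8657014478282576
print(round(f_score(recall, precision), 2))  # 0.85
```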
Let's take `CRNN ResNet34_vd en` from the model support list as an example to introduce the inference method:
- Download the MindIR file from the model support list;
- Use the converter_lite tool on Ascend310 or 310P to convert the MindIR to MindSpore Lite MindIR by running the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=MINDIR \
    --optimize=ascend_oriented \
    --modelFile=crnn_resnet34-83f37f07-eb10a0c9.mindir \
    --outputFile=crnn_resnet34vd
```
After the above command is executed, the `crnn_resnet34vd.mindir` model file will be generated. For a brief description of the converter_lite parameters, see the text detection example above.
Learn more about converter_lite
Learn more about Model Conversion Tutorial
- Perform inference with `deploy/py_infer/infer.py` and the `crnn_resnet34vd.mindir` file:
```shell
python infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_word_images \
    --rec_model_path=/path/to/mindir/crnn_resnet34vd.mindir \
    --rec_model_name_or_config=../../configs/rec/crnn/crnn_resnet34.yaml \
    --res_save_dir=/path/to/rec_infer_results
```
After the execution is completed, the prediction file `rec_results.txt` will be generated in the directory pointed to by `--res_save_dir`.
Learn more about infer.py inference parameters
- Evaluate the results with the following command:
```shell
python deploy/eval_utils/eval_rec.py \
    --gt_path=/path/to/ic15/rec_gt.txt \
    --pred_path=/path/to/rec_infer_results/rec_results.txt
```
The result is: `{'acc': 0.6966779232025146, 'norm_edit_distance': 0.8627135157585144}`
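Here, `acc` is exact-match accuracy over the test words, and `norm_edit_distance` is an edit-distance-based similarity where 1.0 means a perfect match. Below is a minimal sketch of the latter, assuming normalization by the longer string (the exact formula used by `eval_rec.py` may differ):

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance between two strings
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[-1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def norm_edit_similarity(pred, gt):
    # 1 - normalized edit distance: identical strings score 1.0
    if not pred and not gt:
        return 1.0
    return 1 - levenshtein(pred, gt) / max(len(pred), len(gt))

print(norm_edit_similarity("hello", "hallo"))  # 0.8
```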