FOMO is an object detection model designed for constrained devices. Thanks to its small footprint and low memory requirements, it is well suited to an AIOT box or an ESP32-S3. This repository provides simple tips for building a FOMO model in Edge Impulse, covering data collection, training, and deployment.
- AIOT board, ESP32-S3, or any ESP32-series board.
- OV2640 camera.
- Webcam (optional).
- Edge Impulse account (free).
Register a free account in Edge Impulse, and create a new project.
1. Collecting data from an ESP32 can be tedious. Luckily, you can download and run the script I've created, camera-webserver-for-esp32S3, or use the webcam interface in Edge Impulse.
- This network gives its best results with at least 70 images per class plus at least 10% background (other) images. To put this in perspective, training a model to count 2 fingers requires 70 images of one finger, another 70 images of two fingers, and at least 20-30 images of other finger counts or look-alike objects.
- Images should have equal width and height; otherwise the sides will be cropped off when uploading to Edge Impulse. Here is a snapshot of the web server used for data collection. Each image is 96 x 96 pixels.
2. In the left panel, go to Data acquisition and upload the images to Edge Impulse.
3. Click Yes for object detection.
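Before uploading, the rule of thumb above (70 images per class, about 10% background) can be checked with a few lines of Python. This is only a sketch: the function name `check_dataset`, the label `"other"`, and the choice to measure the background share against the whole dataset are assumptions made for illustration, not part of Edge Impulse.

```python
# Rule-of-thumb thresholds taken from the tips above; adjust for your project.
MIN_IMAGES_PER_CLASS = 70
MIN_BACKGROUND_FRACTION = 0.10  # assumption: share of the whole dataset


def check_dataset(counts, background_label="other"):
    """Return a list of problems; an empty list means the rule of thumb is met."""
    problems = []
    total = sum(counts.values())
    # Every real class (everything except the background label) needs enough images.
    for label, n in counts.items():
        if label != background_label and n < MIN_IMAGES_PER_CLASS:
            problems.append(
                f"class '{label}' has {n} images, needs at least {MIN_IMAGES_PER_CLASS}"
            )
    # The background class should make up a minimum share of the dataset.
    background = counts.get(background_label, 0)
    if total == 0 or background / total < MIN_BACKGROUND_FRACTION:
        problems.append(
            f"background '{background_label}' has {background} of {total} images, "
            f"needs at least {MIN_BACKGROUND_FRACTION:.0%}"
        )
    return problems


# Hypothetical counts for the two-finger example.
finger_counts = {"one": 70, "two": 70, "other": 25}
print(check_dataset(finger_counts))  # [] means nothing to fix
```

An empty list means the dataset meets both thresholds; otherwise each entry names the class that needs more images.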
1. At the top of the page, navigate to Labeling queue and add a label to each image. Keep in mind that images with unequal dimensions will be cropped during this process, which is why I use square images.
An image with unequal dimensions (320 x 240): the dark shading on each side marks the parts that will be cropped off.
An image with equal dimensions (96 x 96).
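To see how much of a non-square image is lost, the crop can be worked out by hand. The sketch below assumes the crop is centered, which matches the shading on both sides of the 320 x 240 example; the function name `center_square_crop_box` is made up for illustration.

```python
def center_square_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


print(center_square_crop_box(320, 240))  # (40, 0, 280, 240)
print(center_square_crop_box(96, 96))    # (0, 0, 96, 96)
```

For a 320 x 240 frame, 40 pixels are dropped from each side, while a 96 x 96 frame is kept whole. The tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, in case you want to crop images yourself before uploading.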
2. After labeling all images, navigate to Impulse design on the left and click on Create impulse. This takes you to a page where you can choose the model's input size and the resize mode.
- Edge Impulse recommends an input size that is a multiple of 8. The larger the input, the slower inference becomes, but a larger input has the advantage of detecting multiple objects when they are present in the frame.
Click on add a processing block and select the only option.
Click on add learning block and select the first option, then save the impulse.
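The input-size guidance above can be sketched in a few lines. The example assumes FOMO's default output grid has one cell per 8 x 8 block of input pixels, so a 96 x 96 input gives a 12 x 12 grid; treat `CELL_SIZE` and both function names as illustrative assumptions and check the Edge Impulse documentation for your model version.

```python
CELL_SIZE = 8  # assumption: one FOMO output cell per 8 x 8 input pixels


def round_down_to_multiple_of_8(size):
    """Largest multiple of 8 that does not exceed `size` (minimum 8)."""
    return max(8, (size // 8) * 8)


def fomo_grid(input_size):
    """Cells per side and total cells for a square input, under the CELL_SIZE assumption."""
    per_side = input_size // CELL_SIZE
    return per_side, per_side * per_side


for requested in (96, 100, 160):
    size = round_down_to_multiple_of_8(requested)
    print(requested, size, fomo_grid(size))
```

Under this assumption, going from 96 to 160 pixels raises the grid from 144 to 400 cells, which is one way to see why a larger input leaves more room for several objects but costs more inference time.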
4. After saving the impulse you will be directed to a new section. Here you can choose whether the images are processed as grayscale or RGB. I've left it as RGB for this project. Click on Save parameters to proceed.
After you save the parameters, the page takes you to the Generate features tab. Click on Generate features and a graph will appear on the right side of the page.
5. This graph plots every image so that similar images sit close to each other. Notice that the red dots represent finger no. 1 and the pink dots represent finger no. 2. If two classes are too close to each other, like the ones I've circled, the object detection model will have trouble distinguishing between them, which greatly reduces accuracy. Images that overlap should therefore be deleted.
- After deleting the overlapping images and adding more, the two classes should be separated like this.
6. In the left panel, select Object detection. These are the settings that can be customized.
- Training cycles: the number of epochs the model trains for. I've found little benefit in setting it above 80. I will be using 25 cycles for this project.
- Data augmentation: effectively enlarges your dataset by training on modified copies of your images. Leave this on, as it is by default.
- Learning rate: determines how fast the model learns the features. This is best left at its default.
- Validation set size: also best left at its default.
- Batch size: the number of samples propagated through the network in each training step. For example, with a batch size of 8 the model trains on images 1-8, then on images 9-16 in the next step, and so forth. Batch size should be a power of 2, e.g. 4, 8, 16, 32, 64, and so on. I've found that 8 and 16 yield the best results on small datasets. A batch size of 8 will be used for this project.
- Model: as of now, only two FOMO models are available. I will be using FOMO 0.35 for this project.
- Start training the model. This process might take up to 20 minutes.
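The relationship between batch size and training cycles can be worked through with a small example. The dataset size (165 images) and the 20% validation split below are hypothetical values chosen for illustration, and the images are taken in order for simplicity, whereas a real training run typically shuffles them.

```python
import math


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (the batch sizes suggested above)."""
    return n > 0 and (n & (n - 1)) == 0


def batch_ranges(num_images, batch_size):
    """Yield 1-based (first, last) image positions for each batch in one cycle."""
    for first in range(1, num_images + 1, batch_size):
        yield first, min(first + batch_size - 1, num_images)


# Hypothetical numbers for illustration only.
total_images = 165
validation_fraction = 0.20  # assumption: check the value shown in your project
batch_size = 8              # value used in this project
training_cycles = 25        # value used in this project

validation_images = round(total_images * validation_fraction)
training_images = total_images - validation_images
steps_per_cycle = math.ceil(training_images / batch_size)

print(is_power_of_two(batch_size))
print(list(batch_ranges(training_images, batch_size))[:2])  # [(1, 8), (9, 16)]
print(steps_per_cycle, steps_per_cycle * training_cycles)
```

With these numbers, 132 images remain for training, each cycle takes 17 steps, and 25 cycles add up to 425 weight updates. Halving the batch size roughly doubles the number of updates per cycle.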
Tips to improve the model's accuracy
- Check whether any classes have overlapping features; if so, go back to step 5.
- Increase the size of the dataset.
- Decrease the batch size.
- Keep the number of training cycles (epochs) at or below 80 for smaller datasets.
1. In the left panel, navigate to Deployment and change the deployment option to Arduino library.
2. Change the target option to ESP32.
3. Click on Build to start downloading the library, and you're done.
I've created two libraries for testing the model in real time: visit FOMO-object-detect-stream-Esp32 for a web server version, or FOMO-object-detect-TFT for output on a TFT screen.
Thanks to WIRELESS SOLUTION ASIA CO.,LTD for providing the AIOT board to support this project. Thanks also to Bodmer / TFT_eSPI for the TFT library. The scripts used for ESP32 FOMO object detection inference were provided by Edge Impulse.