This repository contains the official test code for the project *Upper Limb Segmentation in Egocentric Vision*.
## Requirements

- Python 3.x (3.6 and 3.7 tested)
- numpy
- natsort
- opencv (version 4.5.1 suggested)
- matplotlib
- tensorflow-gpu 1.15
- CUDA 10.0
- cuDNN for CUDA 10.0 (e.g., v7.6.4)

The `os` and `sys` modules are also used, but they ship with the Python standard library and require no installation.
## Installation

We tested our code on Windows 10 using a Miniconda environment.
Clone the repository:

```
git clone https://github.com/Unibas3D/Upper-Limb-Segmentation.git
```
Install all dependencies indicated in the Requirements Section.
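For example, a Miniconda environment could be set up as follows; the environment name `upper-limb` and the exact pip pins are only our suggestion, not an official setup:

```
conda create -n upper-limb python=3.7
conda activate upper-limb
pip install numpy natsort matplotlib
pip install opencv-python==4.5.1.48
pip install tensorflow-gpu==1.15
```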
Ensure that TensorFlow 1.15 with CUDA support is correctly installed:

```python
import tensorflow as tf

print("TensorFlow version: ", tf.__version__)
print("TensorFlow is built with CUDA: ", tf.test.is_built_with_cuda())
if tf.test.is_gpu_available():
    print("GPU device: ", tf.test.gpu_device_name())
else:
    print("No available GPU!")
```
Download the trained models based on the DeepLabv3+ architecture. They can be found here. Create a folder named `deeplab_trained_models` in the repository root and put all the models in it. Your root folder structure should look like this:

```
/Upper-Limb-Segmentation/
    deeplab_trained_models/
        model_07_05_21/
        model_08_05_21/
        model_10_05_21/
        model_13_05_21/
```
## Usage

Run the following command to perform network inference on images from a folder. Some sample images are available in the `test_images` folder. If you want to test your own images, change the folder path accordingly in the code.

```
python inference_images_from_folder.py
```
Predictions are saved in the `results` folder, which is created automatically if it does not exist.
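For reference, here is a minimal sketch of what folder-based inference with an exported DeepLab frozen graph typically looks like. The model path and the tensor names `ImageTensor:0` and `SemanticPredictions:0` follow the official DeepLab export convention but are assumptions here, so they may differ from the actual script:

```python
import os

import cv2
import numpy as np
import tensorflow as tf
from natsort import natsorted

MODEL_PB = "deeplab_trained_models/model_13_05_21/frozen_inference_graph.pb"  # placeholder path
IMAGE_DIR = "test_images"
OUTPUT_DIR = "results"

# Load the frozen DeepLabv3+ graph.
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(MODEL_PB, "rb") as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

os.makedirs(OUTPUT_DIR, exist_ok=True)

with tf.Session(graph=graph) as sess:
    for name in natsorted(os.listdir(IMAGE_DIR)):
        image = cv2.imread(os.path.join(IMAGE_DIR, name))
        if image is None:
            continue  # skip non-image files
        rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        # DeepLab exports expect a batched uint8 RGB image tensor.
        seg = sess.run("SemanticPredictions:0",
                       feed_dict={"ImageTensor:0": np.expand_dims(rgb, 0)})[0]
        # Overlay the predicted limb mask in red (BGR) and save the result.
        overlay = image.copy()
        overlay[seg > 0] = (0, 0, 255)
        cv2.imwrite(os.path.join(OUTPUT_DIR, name), overlay)
```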
The default mask color is red. To set a different color, edit line 11 of the `colormap_dataset.py` file.
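For instance, assuming line 11 defines the color as an RGB tuple (a hypothetical layout; check the actual file), switching to green might look like:

```python
# Hypothetical example of line 11 of colormap_dataset.py, changed from red to green.
MASK_COLOR = (0, 255, 0)
```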
Run the following command to perform network inference using the input stream from a webcam or a video file.
```
python inference_webcam_or_video.py
```
By default, inference is performed on the webcam stream. Change the camera ID (default is 0) if necessary. If you want to test videos, uncomment lines 93-94, comment line 97, and set your video path at line 93.
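For orientation, here is a minimal sketch of what those lines switch between; the exact code in the script may differ, and the video path below is a placeholder:

```python
import cv2

# Webcam stream (device ID 0; change the ID if you have multiple cameras).
cap = cv2.VideoCapture(0)

# Video file instead of the webcam (placeholder path; adjust to your file).
# cap = cv2.VideoCapture("path/to/video.mp4")

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # ... run the segmentation network on `frame` here ...
    cv2.imshow("frame", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```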
## Dataset

We will release our dataset to encourage future research on upper limb segmentation. Please send an email to monica.gruosso@unibas.it or nicola.capece@unibas.it if you need it for academic research and non-commercial purposes.
Before requesting our data, please verify that you understand and agree to comply with the following:
- This data may ONLY be used for non-commercial purposes (this also means that it cannot be used to train models for commercial use).
- You may NOT redistribute the dataset. This includes posting it on a website or sending it to others.
- You may include images from our dataset in academic papers.
- Any publications utilizing this dataset have to reference papers indicated in the Citation Section.
- These restrictions apply not just to the images in their current form but to any images created from them (i.e., "derivative" images).
- Models trained using our data may only be distributed (posted on the internet or given to others) under the condition that the model can only be used for non-commercial uses.
## Citation

If you use the code or the data in your research, please cite the following papers:
- Ours
```
@article{gruosso2022egocentric,
  title={Egocentric Upper Limb Segmentation in Unconstrained Real-Life Scenarios},
  author={Gruosso, Monica and Capece, Nicola and Erra, Ugo},
  journal={Virtual Reality},
  year={2022},
  publisher={Springer},
  doi={10.1007/s10055-022-00725-4}
}
```
```
@inproceedings{gruosso2021solid,
  title={Solid and Effective Upper Limb Segmentation in Egocentric Vision},
  author={Gruosso, Monica and Capece, Nicola and Erra, Ugo},
  booktitle={The 26th International Conference on 3D Web Technology},
  year={2021}
}
```
- DeepLabv3+
```
@inproceedings{deeplabv3plus2018,
  title={Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation},
  author={Liang-Chieh Chen and Yukun Zhu and George Papandreou and Florian Schroff and Hartwig Adam},
  booktitle={ECCV},
  year={2018}
}
```
- TEgO dataset
```
@inproceedings{lee2019hands,
  title={Hands Holding Clues for Object Recognition in Teachable Machines},
  author={Lee, Kyungjun and Kacorri, Hernisa},
  booktitle={Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems},
  year={2019},
  organization={ACM}
}
```
- EDSH dataset
```
@inproceedings{li2013pixel,
  title={Pixel-level hand detection in ego-centric videos},
  author={Li, Cheng and Kitani, Kris M},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={3570--3577},
  year={2013}
}
```