ImageCaptioning improved with attention, plus a PyQt5 application
- Hello guys, hope you are doing awesome these days! 😄
- In my previous ImageCaption repository, I implemented an image captioning algorithm and promised to upload an attention-based version later. And here it is! 😄
- Using a ResNet50 pretrained on ImageNet as the backbone (no fine-tuning) together with an attention mechanism, the model can describe images like a human (most of the time).
- Moreover, Beam Search is used during inference, which gives another big improvement in the model's performance.
- Now, let's enjoy some funny stuff 😎
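For readers who have not met beam search before, here is a minimal sketch of the idea in plain Python. It is not the code from this repository: `beam_search`, `toy_step` and `TOY_MODEL` are hypothetical names, and the toy probability table stands in for the real caption decoder. Instead of keeping only the single best word at each step (greedy decoding), it keeps the `beam_width` best partial captions and picks the best finished one at the end.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def beam_search(
    step_fn: Callable[[Sequence[str]], Dict[str, float]],
    start_token: str = "<SOS>",
    end_token: str = "<EOS>",
    beam_width: int = 3,
    max_len: int = 20,
) -> List[str]:
    """Return the highest-scoring token sequence found by beam search.

    step_fn maps the tokens generated so far to a dict of next-token
    probabilities (a stand-in for the caption decoder).
    """
    # Each hypothesis is (sum of log-probabilities, tokens so far).
    beams: List[Tuple[float, List[str]]] = [(0.0, [start_token])]
    finished: List[Tuple[float, List[str]]] = []

    for _ in range(max_len):
        candidates: List[Tuple[float, List[str]]] = []
        for score, tokens in beams:
            for token, prob in step_fn(tokens).items():
                if prob > 0.0:
                    candidates.append((score + math.log(prob), tokens + [token]))
        if not candidates:
            break
        # Keep only the beam_width best expansions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_width]:
            if tokens[-1] == end_token:
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break

    # Prefer finished captions; fall back to unfinished ones at max_len.
    pool = finished or beams
    return max(pool, key=lambda c: c[0])[1]


# Hypothetical toy "decoder": the greedy first choice ("a") leads to a
# less likely caption overall than the second choice ("the").
TOY_MODEL = {
    ("<SOS>",): {"a": 0.6, "the": 0.4},
    ("<SOS>", "a"): {"dog": 0.5, "cat": 0.5},
    ("<SOS>", "the"): {"dog": 0.9, "cat": 0.1},
}


def toy_step(tokens: Sequence[str]) -> Dict[str, float]:
    # Any prefix not in the table simply ends the caption.
    return TOY_MODEL.get(tuple(tokens), {"<EOS>": 1.0})


if __name__ == "__main__":
    print(beam_search(toy_step, beam_width=1))  # greedy decoding
    print(beam_search(toy_step, beam_width=2))
```

With `beam_width=1` the toy example ends up with "a dog" (probability 0.30), while `beam_width=2` finds "the dog" (probability 0.36). That is the kind of gain beam search can give over greedy decoding.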
- skimage
- spacy
- PyQt5
- Install them with pip (skimage is published on PyPI as scikit-image)
- Download the flickr30k dataset and unpack all the images into the folder flickr30k/flickr30k-images. I have already preprocessed captions.txt, so you don't need to download it.
- flickr (extraction code: hrf3)
- Put the downloaded checkpoint into the checkpoint folder
- checkpoint (extraction code: qny4)
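If you want to double-check the folder layout described above before training, a small helper like the following can do it. This is only a sketch and not part of the repository: `check_layout` and `IMAGE_SUFFIXES` are hypothetical names, and the image extensions are an assumption. It only looks at the two folders named above.

```python
from pathlib import Path
from typing import List

# Assumption: the dataset images use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_layout(root: str = ".") -> List[str]:
    """Return a list of problems found in the expected folder layout."""
    base = Path(root)
    problems: List[str] = []

    images_dir = base / "flickr30k" / "flickr30k-images"
    if not images_dir.is_dir():
        problems.append(f"missing image folder: {images_dir}")
    elif not any(p.suffix.lower() in IMAGE_SUFFIXES for p in images_dir.iterdir()):
        problems.append(f"no images found in: {images_dir}")

    checkpoint_dir = base / "checkpoint"
    if not checkpoint_dir.is_dir():
        problems.append(f"missing checkpoint folder: {checkpoint_dir}")
    elif not any(checkpoint_dir.iterdir()):
        problems.append(f"checkpoint folder is empty: {checkpoint_dir}")

    return problems


if __name__ == "__main__":
    # Run from the repository root; prints one line per problem found.
    for problem in check_layout("."):
        print(problem)
```

An empty result means both folders exist and contain files. It does not verify that the checkpoint itself is valid.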
- train.py, line 20 - line 26: set the dataset path
- train.py, line 31 - line 34, load_model: whether to load my checkpoint or not
- Ok, you can train now
- inferrence.py, line 245: choose the path of the image you want to predict
Then push the initialize button to load the model, and wait until the Finished sign appears on the right.
Email Address
mountchicken@outlook.com