Image Captioning: Show, Attend and Tell in PyTorch

This repository contains a PyTorch implementation of the image captioning model from the paper Show, Attend and Tell (Xu et al., 2015).

Environment

Ubuntu 18.04
CUDA 11.0
cuDNN
NVIDIA GeForce RTX 2080 Ti

Requirements

Java 8
Python 3.8.5
- PyTorch 1.7.0
- Other Python libraries specified in requirements.txt

How to Use

Step 1. Set up a Python virtual environment

$ virtualenv .env
$ source .env/bin/activate
(.env) $ pip install --upgrade pip
(.env) $ pip install -r requirements.txt

Step 2. Prepare data and path

Step 3. Training

Run

(.env) $ python train.py

You can change some hyperparameters by modifying config.py.
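As an illustration of how a hyperparameter file of this kind is often organized, here is a minimal sketch. It is not the contents of this repository's config.py: the class name `Config` and every field name and default value below are hypothetical placeholders, chosen only to show the pattern of keeping defaults in one place and deriving variants from them.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Config:
    # Hypothetical fields and values; check the real config.py for the
    # names and defaults this repository actually uses.
    encoder: str = "resnet101"    # the results table lists VGG and ResNet101 encoders
    batch_size: int = 32
    learning_rate: float = 4e-4
    num_epochs: int = 20
    embed_dim: int = 512


default_config = Config()

# Derive an experiment variant without mutating the shared defaults.
vgg_config = replace(default_config, encoder="vgg", batch_size=16)
print(vgg_config)
```

Keeping the dataclass frozen means a run cannot silently change a value after startup; each experiment gets its own explicit copy.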

Step 4. Inference

Step 5. Prepare Evaluation Codes

Quantitative Results

Encoder	Trained on	BLEU4	CIDEr	METEOR	ROUGE_L
VGG	COCO2014	24.16	51.67	22.0	-
ResNet101	COCO2014	-	76.2	23.9	64.2
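To make the BLEU4 column easier to interpret, the following is a simplified, dependency-free sketch of sentence-level BLEU-4 (clipped n-gram precision for n = 1 to 4, combined by geometric mean, with a brevity penalty). It is not the evaluation code used to produce the table above: the function names `ngrams` and `bleu4` are illustrative, no smoothing is applied, and it returns a value in [0, 1], whereas the table appears to report scores scaled by 100.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, references):
    """Unsmoothed sentence-level BLEU-4 for whitespace-tokenized strings."""
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(cand, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty uses the reference length closest to the candidate length.
    ref_len = min((len(r) for r in refs), key=lambda length: (abs(length - len(cand)), length))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / 4)


reference = "a man riding a skateboard down a street"
exact = bleu4("a man riding a skateboard down a street", [reference])
partial = bleu4("a man riding a skateboard down the road", [reference])
print(exact, partial)
```

A caption identical to a reference scores 1.0, and changing the last two words lowers every n-gram precision, so the partial match scores lower. Corpus-level scores such as those in the table aggregate n-gram counts over all captions before combining them, so they are not simply the average of per-sentence values.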


Qualitative Results

Training data

(1) Generated Caption : A Train traveling down tracks next to a Forest.

(2) Generated Caption : A man riding a Skateboard down a street.

Validation data

(1) Generated Caption : A group of people standing around a truck.

(2) Generated Caption : A dog sitting on a boat in the water.

Test data

(1) Generated Caption : A women is sitting at a table with a plate of food.

(2) Generated Caption : A person in a red jacket is standing on a snow covered slope.
