Transformers

O. Vinyals, A. Toshev, S. Bengio and D. Erhan.
Show and Tell: A Neural Image Caption Generator, 2015. https://arxiv.org/pdf/1411.4555.pdf

Proposal: PROJECT PROPOSAL.pdf

Project goals:

Reproduce the neural image captioning model described in the paper.
Train the model on different datasets.
Evaluate the results with the metrics suggested in the paper, starting with the BLEU score and then looking into the other metrics the authors used.
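
Since BLEU is the first metric on the list, here is a minimal sketch of how a sentence-level BLEU score could be computed for one generated caption against its reference captions. It is an illustration under assumptions, not the evaluation code used in the paper: the function names (ngrams, modified_precision, bleu) are hypothetical, captions are assumed to be pre-tokenized lists of words, no smoothing is applied, and the score is per sentence rather than aggregated over a corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram count is capped
    by the largest count of that n-gram in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU-N with uniform weights and a brevity penalty.
    Returns 0.0 if any n-gram precision is zero (no smoothing)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    c = len(candidate)
    # Reference length closest to the candidate length (ties: shorter one).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty * math.exp(log_mean)


if __name__ == "__main__":
    # Hypothetical captions, made up for illustration only.
    generated = "a dog runs on the grass".split()
    refs = ["a dog is running on the grass".split(),
            "a dog runs across the green grass".split()]
    print("BLEU-1:", bleu(generated, refs, max_n=1))
    print("BLEU-4:", bleu(generated, refs, max_n=4))
```

Calling bleu with max_n=1 gives BLEU-1 and the default max_n=4 gives BLEU-4. For the actual experiments an established implementation is probably the safer choice, because tokenization and smoothing details change the numbers noticeably.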