IMDB-review-classification

This project tackles the IMDB review classification problem, a case study from Deep Learning with Python (see section 6.1.3).

The book has an implementation in Keras. I re-implement it using PyTorch.

.
├── models                       # Custom models
│   └── SimpleModel.py           # Model introduced in Deep Learning with Python
├── resources                    # Downloaded resources
│   ├── aclImdb
│   └── glove.6B
├── tests                        # Unit tests
│   ├── test_utils_dataset.py
│   ├── test_utils_embedding.py
│   └── test_utils_tokenizer.py
├── utils
│   ├── Dataset.py               # IMDBDataset class
│   ├── Embedding.py             # GloVe embedding class
│   ├── plotting.py              # Plotting metrics history during training
│   ├── training.py              # Train and evaluate loops
│   └── Tokenizer.py             # A simple tokenizer
├── .gitignore
├── LICENSE
├── main.ipynb
├── README.md
├── requirements.txt
└── setup.sh

Usage

Prepare the environment specified in requirements.txt.
Run setup.sh to download the required resources (IMDB and GloVe).
Run main.ipynb.

What I've learned

PyTorch development life-cycle
TDD (Test Driven Development) practice
Tokenizer implementation (because PyTorch has no tokenizer as easy to use as Keras' tokenizer)
IMDB dataset preprocessing
GloVe embedding usage
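
To make the tokenizer and GloVe items above more concrete, here is a minimal pure-Python sketch of the general idea. It is not the code in utils/Tokenizer.py or utils/Embedding.py: the names SimpleTokenizer and build_embedding_matrix, the regex used for splitting, and the choice to reserve index 0 for padding are all assumptions made for illustration.

```python
import re
from collections import Counter


class SimpleTokenizer:
    """Hypothetical word-level tokenizer, loosely modelled on Keras' interface."""

    def __init__(self, num_words=None):
        self.num_words = num_words  # keep only the most frequent words (None = all)
        self.word_index = {}        # word -> integer id; 0 is reserved for padding

    @staticmethod
    def _split(text):
        # Lowercase, then keep runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def fit_on_texts(self, texts):
        counts = Counter(word for text in texts for word in self._split(text))
        most_common = counts.most_common(self.num_words)
        self.word_index = {word: i + 1 for i, (word, _) in enumerate(most_common)}

    def texts_to_sequences(self, texts):
        # Words that were not seen during fitting are silently dropped.
        return [
            [self.word_index[w] for w in self._split(text) if w in self.word_index]
            for text in texts
        ]


def build_embedding_matrix(glove_lines, word_index, dim):
    """Build a (vocab_size + 1) x dim matrix (list of lists) from GloVe-format lines.

    Each line is expected to look like "word v1 v2 ... v_dim".
    Rows for words without a GloVe vector stay all-zero.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for line in glove_lines:
        word, *values = line.split()
        if word in word_index and len(values) == dim:
            matrix[word_index[word]] = [float(v) for v in values]
    return matrix
```

In practice glove_lines could be an open file handle on one of the text files under resources/glove.6B, and the resulting matrix could be converted to a tensor and used to initialise an embedding layer.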

TODO

The tokenizer and sequence padding should be separated.
The tokenizer should support using an existing tokenization library.
Implement some baseline models for comparison.
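
As a rough idea of what the first TODO item could look like, padding can live in a standalone function that only sees integer sequences and knows nothing about the tokenizer. The function name pad_sequences and its behaviour (left-pad with 0, keep the last maxlen tokens when truncating) are assumptions for this sketch, not the current code.

```python
def pad_sequences(sequences, maxlen, pad_value=0):
    """Truncate or left-pad every sequence to exactly maxlen items."""
    if maxlen <= 0:
        raise ValueError("maxlen must be positive")
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # keep the last maxlen tokens
        padded.append([pad_value] * (maxlen - len(seq)) + seq)
    return padded
```

With this split, the tokenizer only maps text to integer ids, and the dataset class (or a collate function) decides how long the sequences should be.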
