Speech-To-Transcript

Description

Designed and trained an end-to-end, character-level speech-to-transcript generator using an encoder-decoder network: an encoder built from CNNs and pyramidal bidirectional LSTMs, and an attention-based LSTM decoder, trained with teacher forcing and Gumbel noise.
Decoded the sequences using greedy decoding, random decoding and beam search.
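
To illustrate how the three decoding strategies differ, here is a minimal sketch in plain Python. It is not the project's code: `toy_step`, `VOCAB`, and the probability table are hypothetical stand-ins for the trained decoder, and the beam search shown here applies no length normalization.

```python
import math
import random

VOCAB = ["<eos>", "a", "b"]  # hypothetical toy vocabulary


def toy_step(prefix):
    """Hypothetical stand-in for the decoder: next-character log-probs."""
    table = {
        (): [0.1, 0.5, 0.4],
        ("a",): [0.4, 0.3, 0.3],
        ("b",): [0.9, 0.05, 0.05],
    }
    probs = table.get(tuple(prefix), [0.8, 0.1, 0.1])
    return [math.log(p) for p in probs]


def greedy_decode(step, max_len=10):
    """Pick the single most likely character at every step."""
    prefix, score = [], 0.0
    for _ in range(max_len):
        logp = step(prefix)
        best = max(range(len(logp)), key=logp.__getitem__)
        score += logp[best]
        if VOCAB[best] == "<eos>":
            break
        prefix.append(VOCAB[best])
    return "".join(prefix), score


def random_decode(step, max_len=10, seed=0):
    """Sample each character from the predicted distribution."""
    rng = random.Random(seed)
    prefix, score = [], 0.0
    for _ in range(max_len):
        logp = step(prefix)
        idx = rng.choices(range(len(logp)), weights=[math.exp(lp) for lp in logp])[0]
        score += logp[idx]
        if VOCAB[idx] == "<eos>":
            break
        prefix.append(VOCAB[idx])
    return "".join(prefix), score


def beam_search(step, beam_width=2, max_len=10):
    """Keep the beam_width best unfinished prefixes at every step."""
    beams, finished = [([], 0.0)], []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for i, lp in enumerate(step(prefix)):
                if VOCAB[i] == "<eos>":
                    finished.append((prefix, score + lp))
                else:
                    candidates.append((prefix + [VOCAB[i]], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    finished.extend(beams)  # hypotheses that never emitted <eos>
    best_prefix, best_score = max(finished, key=lambda c: c[1])
    return "".join(best_prefix), best_score


print(greedy_decode(toy_step))
print(beam_search(toy_step, beam_width=2))
```

On this toy table, greedy decoding commits to "a" because it is the likeliest first character, while beam search keeps "b" alive and finds the sequence with the higher overall probability (0.4 × 0.9 = 0.36 versus 0.5 × 0.4 = 0.2).
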
Achieved a Levenshtein distance of 9.43 on the test set.
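
For reference, the Levenshtein (edit) distance between a predicted and a reference transcript can be computed with the standard dynamic program below. This is a minimal sketch, not the project's evaluation script; `mean_levenshtein` is a hypothetical helper and assumes the reported figure is an average over utterances, which is not stated here.

```python
def levenshtein(a, b):
    """Minimum number of character insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def mean_levenshtein(pairs):
    """Hypothetical helper: average distance over (prediction, reference) pairs."""
    return sum(levenshtein(p, r) for p, r in pairs) / len(pairs)


print(levenshtein("kitten", "sitting"))  # 3
```
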

Note: This project is part of my homework. Current CMU students, please refrain from going through the code.