- Designed and trained an end-to-end, character-level speech-to-text model: an encoder-decoder network with a CNN and pyramidal bidirectional LSTM encoder and an attention-based LSTM decoder, trained with teacher forcing and Gumbel noise.
- Decoded the output sequences using greedy decoding, random decoding, and beam search.
- Achieved a Levenshtein distance of 9.43 on the test set.

Note: This project was part of my homework. Current CMU students, please refrain from going through the code.
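To illustrate the decoding and evaluation steps mentioned above, here is a minimal, self-contained sketch. It is not the project code: `toy_step`, its probabilities, and the fixed output length of two are hypothetical stand-ins for the trained decoder, chosen so that greedy decoding and beam search disagree. Greedy decoding is treated here as beam search with a beam width of 1.

```python
import math


def levenshtein(a, b):
    """Edit distance between sequences a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def toy_step(prefix):
    """Hypothetical next-character distribution, standing in for the decoder."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.55, "y": 0.45},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    return table[prefix]


def beam_search(step_fn, beam_width, length):
    """Keep the beam_width highest-scoring prefixes at every step."""
    beams = [((), 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(length):
        candidates = []
        for prefix, score in beams:
            for token, prob in step_fn(prefix).items():
                candidates.append((prefix + (token,), score + math.log(prob)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return "".join(beams[0][0])


greedy = beam_search(toy_step, beam_width=1, length=2)  # greedy decoding
beam = beam_search(toy_step, beam_width=2, length=2)
print(greedy, beam, levenshtein(greedy, beam))
```

In this toy setup, greedy decoding commits to the most likely first character and ends up with a lower-probability sequence overall, while a beam of width 2 keeps the alternative alive long enough to find the better one. The same `levenshtein` function can be applied to a predicted transcript and its reference to get a per-utterance edit distance.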