Speech-To-Transcript

Description

  1. Designed and trained an end-to-end, character-level speech-to-transcript generator built on an encoder-decoder network. The encoder combines CNNs with pyramidal bidirectional LSTMs, and the decoder is an attention-based LSTM trained with teacher forcing and Gumbel noise.

  2. Decoded the sequences using greedy decoding, random decoding and beam search.

  3. Achieved a Levenshtein distance of 9.43 on the test set.
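
For readers unfamiliar with the metric in item 3, here is a minimal sketch of character-level Levenshtein distance: the smallest number of single-character insertions, deletions, and substitutions needed to turn a predicted transcript into the reference. The README does not say how the 9.43 figure is aggregated; averaging over utterances is an assumption here, and the names `levenshtein` and `mean_levenshtein` are hypothetical, not taken from the project code.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Return the character-level edit distance between a and b."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string so the DP row stays short

    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string costs i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,  # match or substitute
                )
            )
        previous = current
    return previous[-1]


def mean_levenshtein(hypotheses: Sequence[str], references: Sequence[str]) -> float:
    """Average edit distance over paired transcripts (assumed aggregation)."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    if not hypotheses:
        raise ValueError("at least one transcript pair is required")
    total = sum(levenshtein(h, r) for h, r in zip(hypotheses, references))
    return total / len(hypotheses)
```

Under this reading, a lower score is better, and a score of 0 would mean every predicted transcript matches its reference exactly.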

Note: This project was completed as part of my homework. Current CMU students, please refrain from going through the code.