This project adapts the 🦣 MAMMOTH toolkit, which is built on top of OpenNMT-py, to sign language translation using the Phoenix2014T dataset. It aims to advance neural machine translation for sign language with the modular, research-friendly tools from Helsinki-NLP.
- Installation follows the MAMMOTH library's own instructions. In addition, install sentencepiece and sacrebleu.
- configs dir: contains the configuration JSON file
- data dir: includes the Phoenix2014T sign language translation dataset
- vocabs dir: includes the tokenizer vocabulary
- Download the Phoenix2014T dataset files:
./download.sh
- Submit the job to Slurm:
sbatch job.sh
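
Before submitting, it can help to confirm that the three directories listed above are present. The following is a minimal sketch only: `missing_dirs` is a hypothetical helper that is not part of this repository, and it assumes `configs`, `data`, and `vocabs` sit directly under the directory you run it from.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print the name of each
# expected directory that is missing under the given root (default: ".").
missing_dirs() {
    root="${1:-.}"
    for d in configs data vocabs; do
        [ -d "$root/$d" ] || echo "$d"
    done
}

# Example: report missing directories before submitting the job.
# The sbatch call is left commented out so the sketch has no side effects.
missing="$(missing_dirs .)"
if [ -n "$missing" ]; then
    echo "Missing directories:" $missing >&2
else
    echo "All expected directories found."
    # sbatch job.sh
fi
```

If `data` is reported as missing, running `./download.sh` first is the likely fix, assuming the script places the dataset files under that directory.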