Official implementation of the paper "Compressing Audio Visual Speech Recognition Models With Parameterized Hypercomplex Layers", presented at SETN 2022.
Paper: https://dl.acm.org/doi/10.1145/3549737.3549785
This codebase is now deprecated. See our latest work: https://github.com/jpanagos/vsr_phm
It should still work if you follow the instructions from the base repository (see Acknowledgments).
Base code from: https://github.com/mpc001/Lipreading_using_Temporal_Convolutional_Networks/tree/47872c9a7a357b70a4adc97e51658c1e43fde8d9
PHM layer implementation (Linear/1d/2d) from: https://github.com/eleGAN23/HyperNets/blob/4d3b5274e384c90f89419971f7e055e921be01ad/layers/ph_layers.py
If you use this code in your work, please cite:
@inproceedings{10.1145/3549737.3549785,
author = {Panagos, Iason Ioannis and Sfikas, Giorgos and Nikou, Christophoros},
title = {Compressing Audio Visual Speech Recognition Models With Parameterized Hypercomplex Layers},
year = {2022},
isbn = {9781450395977},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3549737.3549785},
doi = {10.1145/3549737.3549785},
booktitle = {Proceedings of the 12th Hellenic Conference on Artificial Intelligence},
articleno = {44},
numpages = {7},
keywords = {automatic speech recognition, parameterized hypercomplex multiplication, quaternions},
location = {Corfu, Greece},
series = {SETN '22}
}