Unveiling Interpretability in Self-Supervised Speech Representations for Parkinson’s Diagnosis 🗣️🎙️📝📊

D. Gimeno-Gómez, C. Botelho, A. Pompili, A. Abad, C.-D. Martínez-Hinarejos

📘 Introduction | 🛠️ Preparation | 🚀 Training and Evaluation | 📖 Citation | 📝 License

📘 Introduction

Abstract. Recent works in pathological speech analysis have increasingly relied on powerful self-supervised speech representations, leading to promising results. However, the complex, black-box nature of these embeddings and the limited research on their interpretability significantly restrict their adoption for clinical diagnosis. To address this gap, we propose a novel, interpretable framework specifically designed to support Parkinson’s Disease (PD) diagnosis. Through the design of simple yet effective cross-attention mechanisms for both embedding- and temporal-level analysis, the proposed framework offers interpretability from two distinct but complementary perspectives. Experimental findings across five well-established speech benchmarks for PD detection demonstrate the framework’s capability to identify meaningful speech patterns within self-supervised representations for a wide range of assessment tasks. Fine-grained temporal analyses further underscore its potential to enhance the interpretability of deep-learning pathological speech models, paving the way for the development of more transparent, trustworthy, and clinically applicable computer-assisted diagnosis systems in this domain. Moreover, in terms of classification accuracy, our method achieves results competitive with state-of-the-art approaches, while also demonstrating robustness in cross-lingual scenarios when applied to spontaneous speech production.

📜 arXiv: 2412.02006

🛠️ Preparation

Prepare the conda environment to run the experiments:

conda create -n ssl-parkinson python=3.10
conda activate ssl-parkinson
pip install -r requirements.txt

🚀 Training and Evaluation

Training and evaluating the proposed framework involves a multi-step pipeline: data preprocessing, dataset splitting, feature extraction, and the final training and evaluation. As an example, we provide the scripts for our experiments on the GITA corpus:

bash scripts/run/dataset_preparation/gita.sh $DATASET_DIR $METADATA_PATH
bash scripts/run/feature_extraction/gita.sh
bash scripts/run/experiments/cross_full/gita.sh

Here, $DATASET_DIR is the directory containing all the audio waveform samples, and $METADATA_PATH is the CSV file with the corpus subject metadata.
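The three stages depend on each other, so it can help to run them from a single entry point that stops at the first failure. The following is a minimal sketch and is not part of the repository: the function names `run_stage` and `run_gita_pipeline` and the `DRY_RUN` variable are hypothetical, and only the three script paths come from the commands above. It assumes the commands are run from the repository root.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not shipped with the repository): runs the three
# GITA stages in order and stops at the first one that fails.

# Run a command, or only print it when DRY_RUN=1 (assumed helper variable).
run_stage() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: run_gita_pipeline DATASET_DIR METADATA_PATH
run_gita_pipeline() {
    local dataset_dir="$1"
    local metadata_path="$2"

    # Fail early if either input is missing.
    if [ ! -d "$dataset_dir" ]; then
        echo "error: dataset directory not found: $dataset_dir" >&2
        return 1
    fi
    if [ ! -f "$metadata_path" ]; then
        echo "error: metadata CSV not found: $metadata_path" >&2
        return 1
    fi

    # Script paths are the ones listed in this README.
    run_stage bash scripts/run/dataset_preparation/gita.sh "$dataset_dir" "$metadata_path" || return 1
    run_stage bash scripts/run/feature_extraction/gita.sh || return 1
    run_stage bash scripts/run/experiments/cross_full/gita.sh || return 1
}
```

After sourcing the file, `DRY_RUN=1 run_gita_pipeline "$DATASET_DIR" "$METADATA_PATH"` prints the three commands without executing them, which is a cheap way to check the paths before launching feature extraction and training.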

📖 Citation

The paper is currently under review for the Special Issue on Modelling and Processing Language and Speech in Neurodegenerative Disorders of the IEEE Journal of Selected Topics in Signal Processing (JSTSP). For the moment, if you find our work useful, please cite our preprint as follows:

@article{gimeno2024unveiling,
  author={Gimeno-G{\'o}mez, David and Botelho, Catarina and Pompili, Anna and Abad, Alberto and Mart{\'i}nez-Hinarejos, Carlos-D.},
  title={{Unveiling Interpretability in Self-Supervised Speech Representations for Parkinson's Diagnosis}},
  journal={arXiv preprint arXiv:2412.02006},
  volume={},
  pages={},
  year={2024},
  publisher={},
}

📝 License

This work is released under the MIT License.

Repository: david-gimeno/interpreting-ssl-parkinson-speech
