Towards Interpretability of Speech Pause in Dementia Detection Using Adversarial Learning

Citation

Zhu, Youxiang; Tran, Bang; Liang, Xiaohui; Batsis, John A.; & Roth, Robert M. (2022). Towards Interpretability of Speech Pause in Dementia Detection Using Adversarial Learning. Proceedings for the International Conference on Acoustics, Speech, and Signal Processing, 2022, 6462-6466. PMCID: PMC10102974

Abstract

Speech pause is an effective biomarker in dementia detection. Recent deep learning models have exploited speech pauses to achieve highly accurate dementia detection, but have not exploited the interpretability of speech pauses, i.e., which positions and lengths of speech pauses affect the detection result, and how. In this paper, we study the positions and lengths of dementia-sensitive pauses using adversarial learning approaches. Specifically, we first apply an adversarial attack that perturbs the speech pauses of the testing samples, aiming to reduce the confidence of the detection model. Then, we apply adversarial training to evaluate how perturbing the training samples affects the detection model. We examine interpretability from the perspectives of model accuracy, pause context, and pause length. We find that, from the model's perspective, some pauses are more sensitive to dementia than others, e.g., speech pauses near the verb "is". Increasing the lengths of sensitive pauses or adding sensitive pauses pushes the model's inference toward Alzheimer's Disease (AD), while decreasing the lengths of sensitive pauses or deleting sensitive pauses pushes it toward non-AD.
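
The attack idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the scoring function toy_ad_score, its weights, the step size, and the edit budget are all hypothetical stand-ins chosen for illustration, and a real study would query a trained detection model instead. The sketch greedily lengthens or shortens one pause at a time, keeping the edit that most reduces the model's confidence in its original prediction, and records which pauses were edited.

```python
import math


# Hypothetical toy stand-in for a trained detector: NOT the paper's model.
def toy_ad_score(words, pauses, sensitive_word="is"):
    """Return a pseudo-probability of AD from pause lengths (seconds)."""
    total = 0.0
    for word, pause in zip(words, pauses):
        # Assumed weighting: pauses right after the sensitive word count more.
        weight = 2.0 if word == sensitive_word else 0.5
        total += weight * pause
    return 1.0 / (1.0 + math.exp(-(total - 2.0)))


def greedy_pause_attack(words, pauses, score_fn, step=0.5, budget=3):
    """Greedily edit pause lengths to lower confidence in the original label."""
    pauses = list(pauses)
    original_is_ad = score_fn(words, pauses) >= 0.5
    edits = []
    for _ in range(budget):
        base = score_fn(words, pauses)
        best = None
        for i in range(len(pauses)):
            for delta in (-step, step):
                new_len = max(0.0, pauses[i] + delta)  # a pause cannot be negative
                if new_len == pauses[i]:
                    continue
                trial = pauses[:i] + [new_len] + pauses[i + 1:]
                score = score_fn(words, trial)
                # Confidence in the original label drops when the score
                # moves toward the opposite class.
                gain = (base - score) if original_is_ad else (score - base)
                if best is None or gain > best[0]:
                    best = (gain, i, new_len)
        if best is None or best[0] <= 0:
            break  # no single edit lowers the confidence any further
        _, i, new_len = best
        edits.append((i, words[i], pauses[i], new_len))
        pauses[i] = new_len
    return pauses, edits


words = ["the", "boy", "is", "taking", "cookies"]
pauses = [0.2, 0.1, 1.0, 0.3, 0.0]  # pause after each word, in seconds
new_pauses, edits = greedy_pause_attack(words, pauses, toy_ad_score)
for index, word, old_len, new_len in edits:
    print(f"pause after '{word}' (position {index}): {old_len} -> {new_len}")
```

In this toy setup the first edits land on the pause after "is" only because the stand-in scorer was built to weight that pause most heavily; with a real model, the positions chosen by the attack are what would indicate which pauses the model treats as dementia-sensitive.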

URL

http://dx.doi.org/10.1109/icassp43922.2022.9747006

Reference Type

Journal Article

Year Published

2022

Journal Title

Proceedings for the International Conference on Acoustics, Speech, and Signal Processing

Author(s)

Zhu, Youxiang
Tran, Bang
Liang, Xiaohui
Batsis, John A.
Roth, Robert M.

Article Type

Regular

PMCID

PMC10102974

ORCiD

Batsis - 0000-0002-2823-6651