Automatic estimation of the triangular vowel space area from i-Vectors
Parkinson's Disease (PD) is a neurodegenerative disorder that gradually affects the neurological condition of the patient. In many cases the disease impairs the reliability of the articulatory system and the ability to pronounce vowels normally. One prominent measure of how well the articulatory system functions is the Vowel Space Area (VSA). The typical way to measure it, however, relies on manually annotated sustained vowel recordings or phonetically annotated speech utterances of a speaker, which are then analyzed. It is often desirable to measure the VSA directly from unlabeled natural speech. Therefore, this paper proposes an automatic model-based system that estimates the triangular Vowel Space Area (tVSA) and the underlying corner vowel formant frequencies directly from unlabeled natural speech, without the need for phonetic or vowel transcriptions. i-Vectors are extracted from the signals as the speaker's characteristic representation, from which the speaker's corner vowel formant frequencies are estimated by regression models. Two regression models, namely Deep Neural Networks (DNN) and Support Vector Regression (SVR), are investigated in this work. The proposed configuration employs the SVR model, which predicts the corner vowel formant frequencies of the test speakers with R^2 up to 0.56719 and p up to 0.76485.
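To make the target quantity concrete, the following minimal sketch shows how a triangular vowel space area could be computed once the corner vowel formant frequencies are available. It assumes the common convention of three corner vowels (/a/, /i/, /u/), each described by its first two formants (F1, F2) in Hz, and uses the shoelace formula for the area of a triangle. The function name `triangular_vsa` and the example formant values are hypothetical and are not taken from this work.

```python
from typing import Tuple

# Assumption: each corner vowel is represented as (F1, F2) in Hz.
Formants = Tuple[float, float]


def triangular_vsa(a: Formants, i: Formants, u: Formants) -> float:
    """Return the area (in Hz^2) of the triangle spanned by three
    corner vowels in the F1-F2 plane, using the shoelace formula."""
    (f1_a, f2_a), (f1_i, f2_i), (f1_u, f2_u) = a, i, u
    return 0.5 * abs(
        f1_i * (f2_a - f2_u)
        + f1_a * (f2_u - f2_i)
        + f1_u * (f2_i - f2_a)
    )


# Illustrative formant values only; they are not results from the paper.
example = triangular_vsa(a=(750.0, 1300.0), i=(300.0, 2300.0), u=(350.0, 900.0))
print(example)  # 290000.0
```

In a pipeline of the kind described above, the three (F1, F2) pairs would be the outputs of the regression stage rather than values measured from annotated vowels. The absolute value makes the result independent of the order in which the corner vowels are passed.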