Audiology and Speech Pathology - Research Publications

  • Item
    Plug-and-play microphones for recording speech and voice with smart devices
    Noffs, G ; Cobler-Lichter, M ; Perera, T ; Kolbe, SC ; Butzkueven, H ; Boonstra, FMC ; van der Walt, A ; Vogel, AP (KARGER, 2023-11-16)
    INTRODUCTION Smart devices are widely available and capable of quickly recording and uploading speech segments for health-related analysis. The switch from laboratory recordings with professional-grade microphone setups to remote, smart device-based recordings offers immense potential for the scalability of voice assessment. Yet, a growing body of literature points to wide heterogeneity among acoustic metrics in their robustness to variation in recording devices. The addition of consumer-grade plug-and-play microphones has been proposed as a possible solution. Our aim was to assess whether the addition of consumer-grade plug-and-play microphones increases the acoustic measurement agreement between ultra-portable devices and a reference microphone. METHODS Speech was simultaneously recorded by a reference high-quality microphone commonly used in research, and by two configurations with plug-and-play microphones. Twelve speech-acoustic features were calculated using recordings from each microphone to determine the agreement intervals in measurements between microphones. Agreement intervals were then compared to expected deviations in speech in various neurological conditions. Each microphone's responses to speech and to silence were characterized through acoustic analysis to explore possible reasons for differences in acoustic measurements between microphones. The statistical differentiation of two groups, neurotypical people and people with multiple sclerosis, using metrics from each tested microphone was compared to that of the reference microphone. RESULTS The two consumer-grade plug-and-play microphones favoured high frequencies (mean centre of gravity difference ≥ +175.3Hz) and recorded more noise (mean difference in signal-to-noise ≤ -4.2dB) when compared to the reference microphone. Between consumer-grade microphones, differences in relative noise were closely related to distance between the microphone and the speaker's mouth. Agreement intervals between the reference and consumer-grade microphones remained under disease-expected deviations only for fundamental frequency (f0, agreement interval ≤0.06Hz), f0 instability (f0 CoV, agreement interval ≤0.05%) and for tracking of second formant movement (agreement interval ≤1.4Hz/millisecond). Agreement between microphones was poor for other metrics, particularly for fine timing metrics (mean pause length and pause length variability for various tasks). The statistical difference between the two groups of speakers was smaller with the plug-and-play than with the reference microphone. CONCLUSION Measurements of f0 and F2 slope were robust to variation in recording equipment while other acoustic metrics were not. Thus, the tested plug-and-play microphones should not be used interchangeably with professional-grade microphones for speech analysis. Plug-and-play microphones may assist in equipment standardization within speech studies, including remote or self-recording, possibly with a small loss in accuracy and statistical power as observed in this study.
  • Item
    Disease Delineation for Multiple Sclerosis, Friedreich Ataxia, and Healthy Controls Using Supervised Machine Learning on Speech Acoustics
    Schultz, BG ; Joukhadar, Z ; Nattala, U ; Quiroga, MDM ; Noffs, G ; Rojas, S ; Reece, H ; van der Walt, A ; Vogel, AP (IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2023)
    Neurodegenerative disease often affects speech. Speech acoustics can be used as objective clinical markers of pathology. Previous investigations of pathological speech have primarily compared controls with one specific condition and excluded comorbidities. We broaden the utility of speech markers by examining how multiple acoustic features can delineate diseases. We used supervised machine learning with gradient boosting (CatBoost) to delineate healthy speech from speech of people with multiple sclerosis or Friedreich ataxia. Participants performed a diadochokinetic task where they repeated alternating syllables. We subjected 74 spectral and temporal prosodic features from the speech recordings to machine learning. Results showed that Friedreich ataxia, multiple sclerosis and healthy controls were all identified with high accuracy (over 82%). Twenty-one acoustic features were strong markers of neurodegenerative diseases, falling under the categories of spectral qualia, spectral power, and speech rate. We demonstrated that speech markers can delineate neurodegenerative diseases and distinguish healthy speech from pathological speech with high accuracy. Findings emphasize the importance of examining speech outcomes when assessing indicators of neurodegenerative disease. We propose large-scale initiatives to broaden the scope for differentiating other neurological diseases and affective disorders.
  • Item
    ParkinSong: Outcomes of a 12-Month Controlled Trial of Therapeutic Singing Groups in Parkinson's Disease
    Tamplin, J ; Morris, ME ; Marigliani, C ; Baker, FA ; Noffs, G ; Vogel, AP (IOS Press, 2020-07-28)
    Background: Parkinson’s disease (PD) frequently causes progressive deterioration in speech, voice and cognitive aspects of communication. These affect wellbeing and quality of life and are associated with caregiver strain and burden. Therapeutic singing groups can ameliorate PD-related communication disorders and increase social interaction and wellbeing for caregivers and care recipients. Objective: To analyse the effects of ParkinSong group singing sessions on Parkinson’s communication and wellbeing outcomes for people with PD and caregivers over 12 months. Methods: A 4-armed controlled clinical trial compared ParkinSong with active non-singing control conditions over 12 months. Two dosage levels (weekly versus monthly) were available for each condition. ParkinSong comprised high-effort vocal, respiratory and speech exercises, group singing, and social interaction. PD-specific outcomes included vocal loudness, speech intelligibility, maximum phonation time, respiratory muscle strength, and voice related quality of life (QoL). Wellbeing outcomes were also measured for caregivers and care recipients. Results: We recruited 75 people with PD and 44 caregivers who attended weekly ParkinSong, monthly ParkinSong, weekly control or monthly control groups. We found significant improvements in the primary outcome of vocal loudness (p = 0.032), with weekly singers 5.13 dB louder (p = 0.044) and monthly singers 5.69 dB louder (p = 0.015) than monthly controls at 12 months. ParkinSong participants also showed greater improvements in voice-related QoL and anxiety. Caregivers who attended ParkinSong showed greater reductions in depression and stress scores. Conclusions: This 12-month controlled clinical trial of ParkinSong demonstrated improvements in speech loudness and voice-related QoL for participants with PD, and enhanced wellbeing for both caregivers and care recipients. No adverse effects were reported over 12 months and improvements were sustained.
  • Item
    An Update on the Measurement of Motor Cerebellar Dysfunction in Multiple Sclerosis
    Kenyon, KH ; Boonstra, F ; Noffs, G ; Butzkueven, H ; Vogel, AP ; Kolbe, S ; van der Walt, A (SPRINGER, 2023-08)
    Multiple sclerosis (MS) is a progressive disease that often affects the cerebellum. It is characterised by demyelination, inflammation, and neurodegeneration within the central nervous system. Damage to the cerebellum in MS is associated with increased disability and decreased quality of life. Symptoms include gait and balance problems, motor speech disorder, upper limb dysfunction, and oculomotor difficulties. Monitoring symptoms is crucial for effective management of MS. A combination of clinical, neuroimaging, and task-based measures is generally used to diagnose and monitor MS. This paper reviews the present and new tools used by clinicians and researchers to assess cerebellar impairment in people with MS (pwMS). It also describes recent advances in digital and home-based monitoring for people with MS.
  • Item
    Automatic speech recognition in neurodegenerative disease
    Schultz, BG ; Tarigoppula, VSA ; Noffs, G ; Rojas, S ; van der Walt, A ; Grayden, DB ; Vogel, AP (SPRINGER, 2021-09)
    Automatic speech recognition (ASR) could potentially improve communication by providing transcriptions of speech in real time. ASR is particularly useful for people with progressive disorders that lead to reduced speech intelligibility or difficulties performing motor tasks. ASR services are usually trained on healthy speech and may not be optimized for impaired speech, creating a barrier for accessing augmented assistance devices. We tested the performance of three state-of-the-art ASR platforms on two groups of people with neurodegenerative disease and healthy controls. We further examined individual differences that may explain errors in ASR services within groups, such as age and sex. Speakers were recorded while reading a standard text. Speech was elicited from individuals with multiple sclerosis, Friedreich’s ataxia, and healthy controls. Recordings were manually transcribed and compared to ASR transcriptions using Amazon Web Services, Google Cloud, and IBM Watson. Accuracy was measured as the proportion of words that were correctly classified. ASR accuracy was higher for controls than clinical groups, and higher for multiple sclerosis compared to Friedreich’s ataxia for all ASR services. Amazon Web Services and Google Cloud yielded higher accuracy than IBM Watson. ASR accuracy decreased with increased disease duration. Age and sex did not significantly affect ASR accuracy. ASR faces challenges for people with neuromuscular disorders. Until improvements are made in recognizing less intelligible speech, the true value of ASR for people requiring augmented assistance devices and alternative communication remains unrealized. We suggest potential methods to improve ASR for those with impaired speech.
  • Item
    Speech metrics, general disability, brain imaging and quality of life in multiple sclerosis
    Noffs, G ; Boonstra, FMC ; Perera, T ; Butzkueven, H ; Kolbe, SC ; Maldonado, F ; Cofre Lizama, LE ; Galea, MP ; Stankovich, J ; Evans, A ; van Der Walt, A ; Vogel, AP (WILEY, 2021-01)
    BACKGROUND AND PURPOSE: Objective measurement of speech has shown promising results to monitor disease state in multiple sclerosis. In this study, we characterize the relationship between disease severity and speech metrics through perceptual (listener based) and objective acoustic analysis. We further look at deviations of acoustic metrics in people with no perceivable dysarthria. METHODS: Correlations and regression were calculated between speech measurements and disability scores, brain volume, lesion load and quality of life. Speech measurements were further compared between three subgroups of increasing overall neurological disability: mild (as rated by the Expanded Disability Status Scale ≤2.5), moderate (≥3 and ≤5.5) and severe (≥6). RESULTS: Clinical speech impairment occurred mainly in people with severe disability. An experimental acoustic composite score differentiated mild from moderate (P < 0.001) and moderate from severe subgroups (P = 0.003), and correlated with overall neurological disability (r = 0.6, P < 0.001), quality of life (r = 0.5, P < 0.001), white matter volume (r = 0.3, P = 0.007) and lesion load (r = 0.3, P = 0.008). Acoustic metrics also correlated with disability scores in people with no perceivable dysarthria. CONCLUSIONS: Acoustic analysis offers a valuable insight into the development of speech impairment in multiple sclerosis. These results highlight the potential of automated analysis of speech to assist in monitoring disease progression and treatment response.
  • Item
    Effects of face masks on acoustic analysis and speech perception: Implications for peri-pandemic protocols
    Magee, M ; Lewis, C ; Noffs, G ; Reece, H ; Chan, J ; Zaga, C ; Paynter, C ; Birchall, O ; Azocar, SR ; Ediriweera, A ; Caverlé, M ; Schultz, B ; Vogel, A (Cold Spring Harbor Laboratory, 2020)

    Wearing face masks (alongside physical distancing) provides some protection against infection from COVID-19. Face masks can also change how we communicate and subsequently affect speech signal quality. Here we investigated how three face mask types (N95, surgical and cloth) affect acoustic analysis of speech and perceived intelligibility in healthy subjects. We compared speech produced with and without the different masks on acoustic measures of timing, frequency, perturbation and power spectral density. Speech clarity was also examined using a standardized intelligibility tool by blinded raters. Mask type impacted the power distribution in frequencies above 3kHz for both the N95 and surgical masks. Measures of timing and spectral tilt also differed across mask conditions. Cepstral and harmonics to noise ratios remained flat across mask type. No differences were observed across conditions for word or sentence intelligibility measures. Our data show that face masks change the speech signal, but some specific acoustic features remain largely unaffected (e.g., measures of voice quality) irrespective of mask type. Outcomes have bearing on how future speech studies are run when personal protective equipment is worn.
  • Item
    Effects of face masks on acoustic analysis and speech perception: Implications for peri-pandemic protocols
    Magee, M ; Lewis, C ; Noffs, G ; Reece, H ; Chan, JCS ; Zaga, CJ ; Paynter, C ; Birchall, O ; Rojas Azocar, S ; Ediriweera, A ; Kenyon, K ; Caverlé, MW ; Schultz, BG ; Vogel, AP (Acoustical Society of America (ASA), 2020-12)
    Wearing face masks (alongside physical distancing) provides some protection against infection from COVID-19. Face masks can also change how people communicate and subsequently affect speech signal quality. This study investigated how three common face mask types (N95, surgical, and cloth) affected acoustic analysis of speech and perceived intelligibility in healthy subjects. Acoustic measures of timing, frequency, perturbation, and power spectral density were measured. Speech intelligibility and word and sentence accuracy were also examined using the Assessment of Intelligibility of Dysarthric Speech. Mask type impacted the power distribution in frequencies above 3 kHz for the N95 mask, and above 5 kHz in surgical and cloth masks. Measures of timing and spectral tilt mainly differed with N95 mask use. Cepstral and harmonics to noise ratios remained unchanged across mask type. No differences were observed across conditions for word or sentence intelligibility measures; however, accuracy of word and sentence translations was affected by all masks. Data presented in this study show that face masks change the speech signal, but some specific acoustic features remain largely unaffected (e.g., measures of voice quality) irrespective of mask type. Outcomes have bearing on how future speech studies are run when personal protective equipment is worn.
  • Item
    Novel Functional MRI Task for Studying the Neural Correlates of Upper Limb Tremor
    Boonstra, FMC ; Perera, T ; Noffs, G ; Marotta, C ; Vogel, AP ; Evans, AH ; Butzkueven, H ; Moffat, BA ; van der Walt, A ; Kolbe, SC (FRONTIERS MEDIA SA, 2018-07-02)
    Introduction: Tremor of the upper limbs is a disabling symptom that is present in several neurological disorders and is currently without treatment. Functional MRI (fMRI) is an essential tool to investigate the pathophysiology of tremor and aid the development of treatment options. However, no adequate or standardized protocols for fMRI exist at present. Here we present a novel fMRI task, available online, that could be used to assess the in vivo pathology of tremor. Objective: This study aims to validate the tremor-evoking potential of the fMRI task in a small group of tremor patients outside the scanner and assess the reproducibility of the fMRI task related activation in healthy controls (HCs). Methods: Twelve HCs were scanned at two time points (baseline and after 6 weeks). There were two runs of multi-band fMRI and the tasks included a "brick-breaker" joystick game. The game consisted of three conditions designed to control for most of the activation related to performing the task by contrasting the conditions: WATCH (look at the game without moving joystick), MOVE (rhythmic left/right movement of joystick without game), and PLAY (playing the game). Task fMRI was analyzed using FSL FEAT to determine clusters of activation during the different conditions. Maximum activation within the clusters was used to assess the ability to control for task related activation and reproducibility. Four tremor patients were included to test ecological and construct validity of the joystick task by assessing tremor frequencies captured by the joystick. Results: In HCs the game activated areas corresponding to motor, attention and visual areas. Most areas of activation by our game showed moderate to good reproducibility (intraclass correlation coefficient (ICC) 0.531-0.906) with only inferior parietal lobe activation showing poor reproducibility (ICC 0.446). Furthermore, the joystick captured significantly more tremulous movement in tremor patients compared to HCs (p = 0.01) during PLAY, but not during MOVE. Conclusion: Validation of our novel task confirmed tremor-evoking potential and reproducibility analyses yielded acceptable results to continue further investigations into the pathophysiology of tremor. The use of this technique in studies with tremor patients will no doubt provide significant insights into the treatment options.
  • Item
    Objective speech marker correlates with clinical scores in non-dysarthric MS
    Noffs, G ; Boonstra, F ; Kolbe, S ; Perera, T ; Shanahan, C ; Evans, A ; Butzkueven, H ; Vogel, A ; Van der Walt, A (SAGE PUBLICATIONS LTD, 2017-10-01)
    Background: Reduction of brain volume occurs in clinically active disease and correlates with progressive disability in multiple sclerosis (MS). Although dysarthria is highly prevalent in MS, it only becomes clinically relevant in advanced stages of the disease. The relationship between early sub-clinical markers of dysarthria and overall disease severity is poorly understood. Aim: To examine the relationship between an objective marker of speech performance and validated clinical scores for disease severity in non-dysarthric subjects with relapsing-remitting and secondary progressive MS. Method: An experienced neurologist scored patients according to the Expanded Disability Status Scale (EDSS) and the Scale for the Assessment and Rating of Ataxia (SARA). Acoustic analysis was used to investigate the diadochokinetic speed in “as fast as possible” repetition of the meaningless word /pa/ta/ka/. Brain images were acquired using 3 Tesla magnetic resonance. Images were automatically segmented using FreeSurfer (5.7) to determine volumes for whole brain (excluding ventricles) and cerebellum. Lesions were automatically segmented by the lesion prediction algorithm as implemented in the Lesion Segmentation Tool version 2.0.15 for SPM (Statistical Parametric Mapping software). Statistical correlations were processed in SPSS (v 23.0) controlling for age. After adjustment for multiple comparisons, p < 0.01 was considered statistically significant. Results: We assessed 35 MS patients with normal speech (i.e. SARA speech sub-score 0-1; age=47.7±12 years; disease duration=13.2±8.4). Diadochokinetic rate (mean=5.63±0.83 syllables per second) directly correlated with EDSS (Spearman's rho=0.454, 2-tailed p=0.007; median EDSS=3.5, interquartile range=3.5) and SARA (rho=0.515, p=0.002; SARA median=9, interquartile range 11.975), but not with whole brain volume (p=0.022), lesion load (p=0.032) or cerebellar volume (p=0.037). Conclusion: Changes in acoustic markers can be detected before overt dysarthria in MS and reflect overall disease severity. Larger and longitudinal studies are needed to understand if those markers can help monitor disease progression.