1. Developing automated algorithms for digitized language biomarkers in clinical speech

Speech production is a complex, intentional, planned activity that involves multiple brain regions, so patients with degenerative brain conditions can be expected to show impaired speech relative to healthy speakers. Manual assessment of speech, however, requires substantial time, labor, and cost. My work applies digitized, automated, large-scale analyses of natural speech to identify digital linguistic biomarkers in patients with a range of conditions: neurodegenerative diseases such as Alzheimer’s disease, primary progressive aphasia, and amyotrophic lateral sclerosis spectrum disorder; developmental disorders such as autism spectrum disorder; and psychotic disorders such as schizophrenia. Through collaboration with the Penn Frontotemporal Degeneration Center, I have identified specific language characteristics that distinguish the speech produced by patients with neurodegenerative disease, which advances our understanding of the relation between language characteristics and neurodegeneration.

Publications:

  • Nevler, Naomi, Sharon Ash, Corey T. McMillan, Lauren Elman, Leo McCluskey, David J. Irwin, Sunghye Cho, Mark Y. Liberman, and Murray Grossman. (2020). Automated analysis of natural speech in amyotrophic lateral sclerosis spectrum disorder. Neurology, 95(12), e1629-e1639.

  • Cho, Sunghye, Naomi Nevler, Sharon Ash, Sanjana Shellikeri, David J. Irwin, Lauren Massimo, Katya Rascovsky, Christopher Olm, Murray Grossman, Mark Liberman. (2021). Automated analysis of lexical features in Frontotemporal Degeneration. Cortex, 137, 215-231.

  • Parjane, Natalia, Sunghye Cho, Sharon Ash, Sanjana Shellikeri, Mark Liberman, Leslie M. Shaw, David J. Irwin, Murray Grossman, Naomi Nevler. (2021). Digital speech analysis in progressive supranuclear palsy and corticobasal syndromes. Journal of Alzheimer’s Disease, 82(1), 33-45.

  • Tang, Sunny X., Reno Kriz, Sunghye Cho, Suh Jung Park, Jenna Harowitz, Raquel E. Gur, Mahendra T. Bhati, Daniel H. Wolf, João Sedoc, Mark Y. Liberman. (2021). Natural language processing methods are sensitive to sub-clinical linguistic differences in schizophrenia spectrum disorders. NPJ Schizophrenia, 7, 25.


2. Automatic machine learning classification of patient populations with acoustic and lexical features


Earlier screening, together with behavioral adjustments to the environment that slow the apparent rate of disease progression, is key to helping patients and their families and will eventually decrease the personal and societal costs of disease. In this line of research, I develop automatic classification systems for clinical populations using advanced machine learning algorithms. I extract a large number of lexical and acoustic features from brief 1- to 5-minute natural speech samples using natural language processing and automatic speech detection algorithms, and I show automated diagnostic categorization at >90% accuracy for patients with primary progressive aphasia and dementia, >85% accuracy for patients with schizophrenia, and >75% accuracy for children with autism spectrum disorder.

Publications:

  • Cho, Sunghye, Mark Liberman, Neville Ryant, Meredith Cola, Robert T. Schultz, and Julia Parish-Morris. (2019). Automatic detection of Autism Spectrum Disorder in children using acoustic and text features from brief natural conversations. In Proceedings of INTERSPEECH 2019, 2513-2517.

  • Cho, Sunghye, Naomi Nevler, Sanjana Shellikeri, Sharon Ash, Mark Liberman, and Murray Grossman. (2020). Automatic classification of primary progressive aphasia patients using lexical and acoustic features. In Proceedings of Language Resources and Evaluation Conference (LREC) 2020 workshop on Resources and Processing of Linguistic, Para-linguistic and Extra-linguistic Data from People with Various Forms of Cognitive/Psychiatric/Developmental Impairments (RaPID-3), 60-65.

  • Tang, Sunny X., Reno Kriz, Sunghye Cho, Suh Jung Park, Jenna Harowitz, Raquel E. Gur, Mahendra T. Bhati, Daniel H. Wolf, João Sedoc, Mark Y. Liberman. (2021). Natural language processing methods are sensitive to sub-clinical linguistic differences in schizophrenia spectrum disorders. NPJ Schizophrenia, 7, 25.


3. Characterization of language change and variation in healthy populations

Identifying language change and variation in healthy populations is a foundation for understanding language change and variation in patient populations. Language variation and change show highly systematic patterns within the hierarchical structure of the grammar, and my early research publications focused on language variation and change in healthy populations across a wide age range. In particular, my projects investigate prosodic variation and change in various languages and dialects, such as Seoul Korean, Tokyo Japanese, and American English, and their effects on the grammatical system of native speakers. More recently, my research interests have expanded to include lexical, acoustic, and prosodic changes in healthy aging. The ultimate goal is to develop a normative baseline of language characteristics in healthy speakers across wide age and education ranges.

Publications:

  • Cho, Sunghye, Naomi Nevler, Sanjana Shellikeri, Natalia Parjane, David J. Irwin, Neville Ryant, Sharon Ash, Christopher Cieri, Mark Liberman, and Murray Grossman. (2021). Lexical and acoustic characteristics of young and older healthy adults. Journal of Speech, Language, and Hearing Research, 64(2), 302-314.

  • Cho, Sunghye, Naomi Nevler, Natalia Parjane, Christopher Cieri, Mark Y. Liberman, Murray Grossman, Katheryn A. Q. Cousins. (2021). Automated analysis of digitized letter fluency data. Frontiers in Psychology, 12, 654214.