RESEARCH

Overview

My research in linguistics and speech science examines the mechanics of spoken language and how it interacts with other linguistic components and with social context. I use computational, quantitative, and experimental methods to address questions in linguistics, speech, and voice science. This page summarizes the main areas of my research program.

Dissertation: The Prosodic Substrate of Consonant and Tone Dynamics 

In my dissertation [.pdf], I adopted a dynamic approach to investigating the relationship between consonants and tone. I examined how segmental and tonal properties respond to phrasal prosodic structure in a language characterized by specific intonational patterns. To untangle the interaction between segments and prosody, I analyzed intonation patterns in the contemporary Seoul dialect of Korean.

Korean exhibits the well-known three-way voiceless stop contrast (aspirated, fortis, and lenis). Among young speakers of Seoul Korean, these consonants are categorized as LAX (lenis) and TENSE (aspirated and fortis) stops based on their segmental tonal characteristics (low tone for LAX and high tone for TENSE). When combined with the language's regularly specified Accentual Phrase tonal pattern, which involves alternating low and high tonal sequences, this contrast offers an opportunity to simultaneously examine the expression of both segmental and prosodic tonal attributes.

Beyond providing the phonetic description required for establishing new pronunciation norms among younger generations of speakers, this study also elucidates how the local phonetic structure of a contrast system changes according to phrasal positions. Specifically, it uncovers the tone gestures used for conveying segmental "tenseness" and how these gestures interact with the language's phrasal intonational pattern.

Taken together, these findings reveal a tonal pattern in which local segmental details are expressed jointly with phrasal information. They also show that dynamical modeling can account for both categorical and gradient aspects of tone realization, and how a prosodically asymmetric pattern can emerge from a single underlying system.

(Associated Work: Lee, Goldstein & Byrd, in preparation; talks at LSA 2019 and LSA 2020; Oh & Lee, 2018, 2020, and 2022)

Prosodic Structuring in Spoken Language

In addition to my dissertation research, I have conducted separate empirical studies focusing on various aspects of prosodic structuring in spoken language. These investigations shed light on the intricate interplay between linguistic prosody and articulatory mechanisms.

Phonetic Signatures of Prosodic Boundary and Accentual Prominence

In my work on the phonetic signatures of prosodic boundaries and accentual prominence in both Korean and English (Lee, 2011, 2013; Cho, Lee & Kim, 2011, 2014), I demonstrate that the variability attributed to prosodic structure can be explained by the coordination between prosodic gestures and vocal tract gestures.

Listener Segmentation and Sub-Segmental Sensitivity

In collaboration with Louis Goldstein and Elsi Kaiser (Lee, Goldstein & Kaiser, 2020), I use a mouse-tracking method to investigate how listeners segment internal open juncture sequences (e.g., "I scream" vs. "ice cream") and how sensitive they are to sub-segmental information. Our findings suggest a close alignment between segmentation and lexical access, rooted in bottom-up phonetic information. These results contribute to the development of a spoken language recognition model with position-specific representations at the prelexical level. They also hint that detailed phonetic information may be incorporated in listeners' lexicons.

Articulatory Composition in Different Prosodic Contexts

In collaboration with Louis Goldstein and Shri Narayanan (Lee, Goldstein & Narayanan, 2015), I use real-time magnetic resonance imaging to examine the articulatory composition of the Korean liquid across distinct prosodic contexts. Our results suggest that the positional allophony between the lateral and the flap goes beyond gestural reduction and involves a categorical distinction in gestural composition.

These empirical studies collectively contribute to unraveling the intricate relationship between linguistic prosody and articulatory gestures during speech production and perception.

The Temporal Organization of Prosody in Multimodal Speech and Co-speech Gestures

Prosodic structure, which encompasses word grouping and the expression of prominence, shapes linguistic communication. In my recent NSF-funded project with Jelena Krivokapić, I study the interplay between speech and the accompanying body movements known as co-speech gestures. These gestures are integral to everyday communication, yet their interaction with prosodic structure remains poorly understood. This research asks how prosody is integrated across modalities during speech production.

We investigate how prosodic information manifests in co-speech gestures at phrasal boundaries and under prominence. Using state-of-the-art instruments and analysis tools (Lee, Krivokapić, & Purse, in preparation), we capture multimodal data that include pitch, vocal tract movement, and head, eyebrow, and hand motions. These recordings, gathered from speakers of prosodically diverse languages performing various communication tasks, provide insight into the dynamic interplay between speech and gestures.

The project covers two languages with different prosodic systems: one with neither lexical stress nor pitch accent, and another with prominence based on stress and pitch accent. Through these cross-linguistic experiments, we uncover how languages coordinate speech and co-speech gestures. We also conduct a real-time accommodation study with pairs of speakers in communicative tasks, which offers insight into prosodic accommodation across modalities.

This investigation not only enriches our understanding of articulatory organization and linguistic prosody but also contributes to theoretical models of prosodic structure. Furthermore, this research yields a valuable database of multimodal articulator movement signals, combined with synchronized audio and video recordings. This resource will be accessible to the research and education community, facilitating diverse applications, from sign language to education, engineering, and clinical contexts. Overall, my work advances the comprehension of multimodal communication and lays the groundwork for further exploration in related fields.

Speech Accommodation in Dyadic Interaction

Another significant facet of my research program closely examines cognitive control over interactional spoken language behaviors. In particular, I focus on adaptive accommodation behaviors exhibited within pairs of interacting speakers.

In an NIH-funded study of conversational interaction (Lee, Y., Gordon Danner, Parrell, Lee, S., Goldstein & Byrd, 2018), we use a dual electromagnetic articulography setup to examine how speakers' prosodic, acoustic, and articulatory behaviors adapt in response to their partner's speech over the course of an interaction. The findings provide insight into how the realization of linguistic phrasal structure is coordinated across interacting speakers. They also offer novel evidence that speakers accommodate at the level of cognitively specified motor control of articulatory gestures, in a way that is sensitive to prosodic structure.

Building on this foundation, I currently lead an NSF-funded project in collaboration with Jelena Krivokapić that takes a multimodal approach to accommodation. We collect multimodal data in dyadic settings to examine the accommodation of prosodic gestures, encompassing segmental articulatory and co-speech gestures (including various body movements that convey intentions), during interactive communication. By examining the interplay among these elements, we address the mechanisms by which speakers tailor their linguistic behaviors during dynamic interaction.

In our latest research (Lee, Goldstein, Parrell & Byrd, 2021), we further explore the relationship between individual variability and speech adaptation during conversation. We developed a simple computational model of attunement and compared model simulations with behavioral data on how two interlocutors converge over time. The results underscore the role of individual variability, or flexibility, in fostering speaker adaptability. We also find that structured variability is a key factor in identifying which participants converge during spoken interactions.

This multifaceted investigation significantly contributes to our understanding of the underlying mechanisms of interactional spoken language behavior, shedding light on the complex interplay between cognitive control, adaptation, and individual variability.

Voice Identity from Variability

Human voices, often referred to as our "auditory faces," arise from the interaction of speaker, signal, and listener, which makes them inherently social. Perception and production together shape the dynamic, variable signals found in actual utterances. In this branch of my research program, I focus on voice identity, investigating individual speaker variability and its influence on voice perception and recognition using complex, dynamic speech datasets and corpora. The fundamental question is: What makes your voice uniquely yours?

Exploring Voice Quality with an Interdisciplinary Lens

With Jody Kreiman, Patricia Keating, and Abeer Alwan, I study voice quality from an interdisciplinary perspective, with support from NSF and NIH. Built around a psychoacoustic model (Kreiman, Lee, Garellek, Samlan, & Gerratt, 2022), our work connects individual speaker variability to broader population voice spaces. The framework covers voice quality across speaking styles (Lee & Kreiman, 2022a), emotions, dialects, and languages (Lee & Kreiman, 2022b). Using computational tools (Lee, Keating, & Kreiman, 2019), we analyze a diverse dataset to identify the acoustic attributes that define voice variation within and across speakers. This research enriches cross-linguistic perspectives on voice quality, advances our understanding of voice production and recognition, and lays a foundation for clinical applications.

Unraveling Neuromuscular Controls in Voice Production, Acoustics, and Perception

In close collaboration with Dinesh Chhetri and the otolaryngology team at the UCLA Laryngeal Lab, I study the dynamics of voice production, acoustics, and perception. We use an in vivo canine model and our novel 3D reconstruction method, which enable the manipulation of laryngeal nerve stimulation to affect vocal fold medial surface deformation (Reddy, Schlegel, Lee & Chhetri, 2022), to uncover the mechanisms governing voice production and quality. By analyzing the acoustic and perceptual outcomes of systematically manipulated neuromuscular variations, we have identified perceptually relevant vocal fold vibration asymmetry patterns that further the clinical understanding of voice disorders (Chung, Lee, Reddy, Zhang & Chhetri, 2023). Our work also informs the optimal choice of thyroplasty implant treatment for glottal insufficiency (Reddy, Lee, Zhang & Chhetri, 2022). This ongoing work supports the advancement of clinical approaches for diagnosing and treating voice disorders.