Normalized gestural overlap measures and spatial properties of lingual movements in Korean non-assimilating contexts*

Minjung Son 1 , **
1Department of English Language and Literature, Hannam University, Daejeon, Korea
**Corresponding author:

© Copyright 2019 Korean Society of Speech Sciences. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Aug 16, 2019; Revised: Sep 08, 2019; Accepted: Sep 17, 2019

Published Online: Sep 30, 2019


The current electromagnetic articulography study analyzes several articulatory measures and examines whether, and if so how, they are interconnected, with a focus on cluster types and an additional consideration of speech rates and morphosyntactic contexts. Using articulatory data on non-assimilating contexts from three Seoul-Korean speakers, we examine how speaker-dependent gestural overlap between C1 and C2 in a low vowel context (/a/-to-/a/) and their resulting intergestural coordination are realized. Examining three C1C2 sequences (/k(#)t/, /k(#)p/, and /p(#)t/), we found that three normalized gestural overlap measures (movement onset lag, constriction onset lag, and constriction plateau lag) were correlated with one another for all speakers. When the analysis is limited to sequences with a velar stop in C1 (/k(#)t/ and /k(#)p/), the results can be summarized as follows. First, for two speakers (K1 and K3), i) longer normalized constriction plateau lags (i.e., less gestural overlap) were observed in the pre-/t/ context, compared to the pre-/p/ context (/k(#)t/>/k(#)p/), ii) the tongue dorsum at the constriction offset of C1 in the pre-/t/ contexts was more anterior, and iii) these two variables were correlated. Second, the three speakers consistently showed greater horizontal distance between the vertical tongue dorsum and the vertical tongue tip position in /k(#)t/ sequences when it was measured at the time of constriction onset of C2 (/k(#)t/>/k(#)p/): the tongue tip completed its constriction onset by extending further forward in the pre-/t/ contexts than the uncontrolled tongue tip articulator in the pre-/p/ contexts (/k(#)t/>/k(#)p/). Finally, most speakers demonstrated less variability in the horizontal distance of the lingual-lingual sequences, in which both articulators were active (/k(#)t/=/k(#)p/ for K1; /k(#)t/</k(#)p/ for K2 and K3).
Taken together, the results suggest that biomechanical constraints are, at least partly, active: speakers can control spatio-temporal articulatory coordination in an interactive way such that greater tongue dorsum horizontal advancement is related to less gestural overlap in consecutive lingual-lingual (/k(#)t/) sequences.

Keywords: greater stability; intergestural coordination; inter-speaker variability; normalized gestural overlap

1. Introduction

Gestural overlap has been hypothesized to exist as an essential aspect of speech (Browman & Goldstein, 1986, 1988, 1989, 1990, 1991, 1992). By hypothesis, constriction in the oral cavity can be functionally classified as consonants and vowels: the consonantal tier is defined by higher stiffness as compared to the vocalic tier with less stiffness (see Saltzman & Munhall (1989) for more detailed descriptions of dynamical parameters), and gestural coordination can occur within a tier (e.g., C-to-C, V-to-V) and across tiers (e.g., C-to-V, V-to-C) (Browman & Goldstein, 1992). In particular, consonantal and vocalic gestures overlap in syllable onset (C-V) or coda (V-C): a consonantal gesture is synchronously coordinated with a vocalic gesture in syllable onset, and sequentially in syllable coda (Browman & Goldstein, 1995; Nam, 2007; Nam et al., 2009, inter alia). Synchronous coordination observed in CV sequences is considered a more stable mode of speech: CV is more frequent cross-linguistically than VC, and the former is acquired earlier than the latter at the developmental stages (Nam et al., 2013; Vihman & Greenlee, 1987). To quote Nam (2007), "… the coupling pattern for onsets results in faster stabilization into steady-state intergestural phasing than for codas..." In his computational simulation and reaction-time study with a working hypothesis that a stop consonant consists of close and release gestures, greater stability was observed in C-V coupling. In one production study, an abrupt transition shift was observed from V-C coordination (anti-phase as a less stable mode of speech production) to C-V coordination (in-phase as a more stable mode of speech production) in a repetitive task involving the production of an /ip/ sequence (Kelso et al., 1986). 
Furthermore, the stability of gestural overlap differed as a function of the nature of a compound structure: non-lexicalized C#C sequences exhibited less stability in terms of gestural overlap compared to lexicalized C#C sequences (Cho, 2001).

At the level of phonetic execution, physically measured gestures exhibit context-dependent properties (Browman & Goldstein, ms.). In Mooshammer et al. (1995), vocalic effects on horizontal movement of dorsal gestures during acoustic constriction duration were systematically examined in German with two subjects. Combining tense vowels (/i/, /u/, /a/) between V1 and V2 in /bV1gV2/, egg-shaped forward movement of the tongue body was consistently observed in most combinations of vowels; such movement was stimulated by /i/ in V2. However, this was noticeably inhibited by /i/ in V1. Similar results were obtained with a high front lax vowel /I/ in V1 compared with /ʊ/ and /a/ in the context /bV1Cɐ/. In addition, consonantal effects were observed on the dorsal gesture in non-assimilating contexts, where lingual-lingual sequences demonstrated less gestural overlap, compared to labial-lingual sequences (/k(#)t/</p(#)t/) (Kochetov et al., 2007) (cf., similar degrees of gestural overlap (/k(#)t/=/p(#)t/) in Son (2011)). With varying manners of articulation in C2, Son (2011) showed that dorsal stop /k/ in C1 was more overlapped by a coronal stop /t/ than coronal fricative /s/ in C2 (/k(#)t/>/k(#)s/). Likewise, a target of place assimilation in American English is more overlapped by a trigger, as compared to the reverse order (/d#g/>/g#d/) or non-target coronal fricative (/d#g/>/s#g/) (Byrd, 1996).

Varying degrees of inter-consonantal gestural overlap have also been attributed to other phonological factors. In American English, degrees of gestural overlap and variability differed across prosodic conditions: less gestural overlap and variability were attested in #CC, compared to CC# or C#C (Byrd, 1996). Gestural overlap also depends on the phonological knowledge of native speakers (e.g., that some sequences undergo place assimilation, and that gestural overlap is greater in assimilating contexts than in non-assimilating contexts within an assimilating language such as Korean): overlap of comparable sequences was greater in non-assimilating contexts of an assimilating language than in a non-assimilating language (e.g., /k(#)t/ in Korean > /k(#)t/ in Russian) (Kochetov et al., 2007).

Rate effects on gestural overlap have also been observed with Russian C1C2 sequences: speech rate effects were more robust in high frequency C1C2 sequences compared to low frequency clusters (Pouplier et al., 2017). Increasing speech rate was considered a possible factor triggering gestural reorganization, which in turn resulted in coronal stop deletion for Brazilian Portuguese (e.g., /nd/ → [n] in partindo, ‘leaving’ (Oliveira & Marin, 2005)).

1.1. Various Ways to Estimate Dynamically-Defined Gestural Overlap

In articulatory phonology (Browman & Goldstein, 1986, 1988, 1989, 1990, 1991, 1992), a gesture is a basic phonological unit of an event taking place in the vocal tract. To quote Browman & Goldstein (1989: 202), "… gestures are units of action that can be identified by observing the coordinated movements of the vocal tract articulators." Articulators are combined in a coordinative way to achieve a task-controlled gesture (Saltzman & Kelso, 1987). This gesture is further hypothesized to be specified for a set of task variables, constriction location (CL) and constriction degree (CD) (Browman & Goldstein, 1986). Using a task-dynamic model, track-variable movement trajectories are generated by applying phasing principles and activation time for employed gestures (Saltzman & Kelso, 1987). Intergestural coordination is represented by a gestural score for linguistically meaningful units such as a word whose y-axis has a set of articulatory tiers relevant for a given word and whose x-axis has information relevant to timing (Browman & Goldstein, 1989).

In order to feed dynamic parameter values applied to gestural scores, articulatory data acquired from kinematic studies has been used. Since human speech data is much more variable than machine-generated speech, measurements of gestural overlap vary across kinematic studies. Possible measurements are tangential velocity signals of a tract variable (e.g., lip aperture (LA), tongue tip constriction degree, tongue body constriction degree), vertical movement of an articulator, and horizontal movement of an articulator. Previous kinematic studies have selected articulator movements of interest and provided analysis of overlap measuring either tangential velocity signals, vertical/horizontal movement signals, or both (Kochetov et al., 2007; Kühnert et al., 2006; Pouplier et al., 2017; Son, 2008, inter alia). On the other hand, there has not been, to the best of our knowledge, a single preferred overlap measurement. Raw onset-to-onset lag values have been used by estimating the temporal interval between movement onset of C1 and movement onset of C2 (Son, 2008). In contrast, raw constriction time lag values between the constriction offset of C1 and the constriction onset of C2 have been employed (Kochetov et al., 2007; Son et al., 2007). In Byrd (1996), a variety of measurements were utilized–raw lag (e.g., constriction time lag, the movement onset lag, the maxima lag) as well as C1 overlap as a percentage and C2 overlap as a percentage. Referring to the constriction interval of C1C2, percentages of overlap have been calculated for the interval between the constriction offset of C1 and the movement onset of C2, as well as the interval between the constriction offset of C1 and constriction onset of C2 (Kühnert et al., 2006). 
More recently, normalized measurements have been used to evaluate gestural overlap–the normalized interval of C1C2 as well as C1, normalized onset lag (the movement onset time point of C2 standardized by the interval of C1), normalized plateau lag (the relative target achievement time point of C2 standardized by the interval of C1) (Pouplier et al., 2017). Normalized gestural overlap was also employed in Marin & Pouplier (2014) when referring to the constriction interval between the constriction offset of C1 and the movement onset of C2 relative to the overall constriction duration of C1C2.

In this paper, we revisit articulatory data from three Korean speakers who participated in a cross-linguistic study on Seoul-Korean and Russian (Kochetov et al., 2007). Several different gestural overlap measures in various non-assimilating sequences (/k(#)t/, /k(#)p/, /p(#)t/) are examined to uncover whether similar temporal lags are distributed consistently across different overlap measures. In particular, we examine whether several different measures of gestural overlap are correlated in Korean non-assimilating contexts (/k(#)t/, /k(#)p/, /p(#)t/) as we consider different speech rates (fast vs. comfortable) and morphosyntactic boundaries (within-word vs. across-word boundary). In addition, limiting the scope of analysis to the tongue dorsum gesture in C1 in non-assimilating contexts (/k(#)t/ and /k(#)p/), we describe spatio-temporal coarticulatory characteristics of the tongue body and tongue tip articulators in terms of horizontal advancing movement as a function of place of articulation in C2, and interpret their implications for physiological limitations between two consecutive lingual gestures (/k(#)t/). Along with this, we examine whether greater intergestural stability is observed in physically limited lingual-lingual sequences (/k(#)t/) as distinct from lingual-labial sequences (/k(#)p/).

1.2. Research Questions

Firstly, we examine whether similar temporal lags are distributed consistently across three overlap measures. Previous literature has reported specific measures of interest (Kochetov et al., 2007; Pouplier et al., 2017; Son, 2008, 2011; Son et al., 2007, among others), but has not systematically compared different measures (cf., Byrd, 1996). Son’s (2008) kinematic study used movement onset lag in C1C2, showing more gestural overlap in assimilating contexts (/p(#)k/), compared to non-assimilating contexts (/k(#)p/) with inter-speaker variability. Byrd’s (1996) electropalatography study of heterorganic sequences (e.g., /d#g/, /g#d/, /s#g/, /g#s/, /k#s/, /s#k/) with five speakers of English in Southern/Central California examined various overlap measures, including constriction onset lag in C1C2. Kochetov et al. (2007) used raw constriction plateau lag values between the constriction offset of C1 and the constriction onset of C2. More recently, normalized measures were employed for Russian heterorganic C1C2 sequences (e.g., normalized movement onset lag, normalized plateau lag, etc.) in Pouplier et al. (2017), and to a limited extent for Seoul-Korean non-assimilating heterorganic C1C2 (/k(#)t/, /p(#)t/, /k(#)s/) sequences (e.g., normalized plateau lag values (cf., raw movement onset lag values)) in Son (2011). In this paper, we examine three normalized gestural overlap measures (i.e., normalized movement onset lag, normalized constriction onset lag, and normalized constriction plateau lag) and systematically compare them to uncover whether a similar temporal coordination is attested over different time periods of C1C2 sequences (/k(#)t/, /k(#)p/, /p(#)t/).

Secondly, we further examine temporal organization in a subset of non-assimilating sequences, lingual-lingual sequences (/k(#)t/), as compared to lingual-labial sequences (/k(#)p/). In the analysis of tongue dorsum trajectories during acoustic closure of the German dorsal stop /g/ in the context /bV1gV2/ with two speakers, Mooshammer et al. (1995) consistently observed egg-shaped advancing movement of the tongue body in all possible combinations among /i/, /u/, /a/, except for high front vowel /i/ in V1 (see also Gay, 1977; Mooshammer & Hoole, 1993). In addition, consonantal effects on horizontal displacement of the tongue dorsum (/k/, /g/, /ŋ/) were greater for voiceless velar stop /k/ across the board (/k/>/g/>/ŋ/ for Speaker1; /k/>(/g/=/ŋ/) for Speaker2). Based on Mooshammer et al.’s (1995) findings, we are presently concerned with Korean velar stop /k/ in C1 followed by either a lingual gesture /t/ or a nonlingual gesture /p/ in C2 (/k(#)t/ vs. /k(#)p/) in low central vowel contexts (/a/-to-/a/). By examining the horizontal position of the tongue dorsum at the constriction onset of the vertical tongue dorsum gesture in C1 with respect to gestural overlap, we consider the implications in terms of physiological constraints (Mooshammer et al., 1995).

Lastly, we examine whether lingual-lingual sequences (/k(#)t/) sharing a physically indiscrete organ exhibit more intergestural stability compared to controls (/k(#)p/) with relatively greater articulatory freedom. Greater stability has been observed in C-V sequences and lexicalized compounds (Cho, 2001; Nam, 2007), which was taken to reflect distinct phonological representations of speakers’ grammatical knowledge. In comparing intergestural variability in the horizontal distance from the tongue dorsum to the tongue tip position measured at the constriction onset in C2, we aim to uncover whether less variability is consistently observed in the two consecutive lingual gestures, which are also used as active articulators.

2. Method

2.1. Participants and Stimuli

We revisited Kochetov et al.’s (2007) articulatory data for three non-assimilating sequences (/k(#)t/, /k(#)p/, /p(#)t/) from Seoul Korean (see Son et al. (2007) for a detailed description of the production experiment using an electromagnetic midsagittal articulometer (Perkell et al., 1994)). Previously analyzed in Kochetov et al. (2007), the original data set was collected from three subjects (two male (K1 & K3) and one female (K2)) who produced the stimuli with two speech rates (comfortable vs. fast) and two morphosyntactic conditions (across-word vs. within-word). While Kochetov et al. (2007) used the first five repetitions, pooled across subjects, in order to balance the data against their Russian EMMA data for a systematic cross-linguistic comparison, we used all tokens collected (89 tokens for K1; 72 tokens for K2; 70 tokens for K3) and ran statistical analysis for each speaker. The carrier phrases were not identical across speakers so as to reduce data collection time (nanɨn __lanɨn malɨl tɨlə poassta (‘I have heard of ___’) for K1; neka __lako tɨləssə (‘I have heard it as __’) for K2 and K3). Speakers treated all target words as real words and naturally produced them. The stimuli used for elicitation are listed in (1).

  1. Stimuli (mostly reproduced from Kochetov et al. (2007:1362))

    1. Within-word boundary condition

      1. /maktambe/ ‘mild tobacco’

      2. /akpali/ ‘a tough fellow’

      3. /haptaŋ/ ‘parties merger’

    2. Across-word boundary condition

      1. /mak#taŋkimjənsə/ ‘pulling a curtain’

        /mak#tajaŋhake/ ‘varying curtain(s)’

        /ak#tahesə/ ‘with a desperate effort’

      2. /ak#palamjənsə/ ‘yearning evil’

      3. /hap#taŋkimjənsə/ ‘pulling a drawer’

2.2. Measurements

We used the lp_Findgest function (with a threshold of 0.2) in MVIEW (Tiede, 2005) for gestural demarcation (i.e., the movement onset, peak velocity of the formation duration, constriction onset, constriction maxima, constriction offset, and movement offset). We demarcated gestural landmarks for the vertical tongue dorsum gesture, vertical tongue tip gesture, and lip aperture gesture. Normalized gestural overlap of C1C2 sequences was estimated by dividing the raw lag values of interest between C1 and C2 by the activation duration of C1. Greater values represent less overlap.
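The normalization described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text specifies that raw lags are divided by the activation duration of C1, but the landmark pairs chosen here for each lag, the definition of activation duration as movement onset to movement offset, and all names (`Gesture`, `normalized_lags`) are hypothetical and not taken from the study's own scripts.

```python
from dataclasses import dataclass


@dataclass
class Gesture:
    """Landmark times (in seconds) for one gesture; field names are hypothetical."""
    move_onset: float
    constr_onset: float
    constr_offset: float
    move_offset: float

    @property
    def activation_duration(self) -> float:
        # Assumption: activation spans movement onset to movement offset.
        return self.move_offset - self.move_onset


def normalized_lags(c1: Gesture, c2: Gesture) -> dict:
    """Divide three raw C1-to-C2 lags by the activation duration of C1.

    Assumed landmark pairs (not confirmed by the text):
      movement onset lag       = C2 movement onset     - C1 movement onset
      constriction onset lag   = C2 constriction onset - C1 constriction onset
      constriction plateau lag = C2 constriction onset - C1 constriction offset
    Greater values mean less overlap.
    """
    dur = c1.activation_duration
    if dur <= 0:
        raise ValueError("C1 activation duration must be positive")
    raw = {
        "movement_onset_lag": c2.move_onset - c1.move_onset,
        "constriction_onset_lag": c2.constr_onset - c1.constr_onset,
        "constriction_plateau_lag": c2.constr_onset - c1.constr_offset,
    }
    return {name: lag / dur for name, lag in raw.items()}


# Made-up landmark times for a single C1C2 token.
c1 = Gesture(move_onset=0.00, constr_onset=0.05, constr_offset=0.12, move_offset=0.20)
c2 = Gesture(move_onset=0.08, constr_onset=0.14, constr_offset=0.22, move_offset=0.30)
lags = normalized_lags(c1, c2)
```

Because every lag shares the same denominator, the three measures differ only in which pair of landmarks defines the numerator.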

We also estimated the horizontal position of the tongue dorsum at the time point of the constriction offset of C1 (Figure 1(i)) to determine how advanced the tongue dorsum was at the release of the constriction. Greater values represent greater anteriority. In order to calculate the horizontal distance between the tongue dorsum and the tongue tip (raw values of 1.ii subtracted from 1.iii), we estimated the horizontal positions of the tongue dorsum and the tongue tip at the time point of the constriction onset of C2 (Figures 1(ii) & 1(iii)). Greater values represent greater distance between TDx and TTx.
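A minimal Python sketch of these two spatial measures is given below. It assumes that the horizontal sensor signals are available as sampled time series and that a landmark falling between samples is handled by linear interpolation. The interpolation step, the reading of 1(ii) as TDx and 1(iii) as TTx, and all function names are assumptions made for illustration.

```python
from bisect import bisect_left


def value_at(times, values, t):
    """Linearly interpolate a sampled signal at time t (times must be ascending)."""
    if len(times) != len(values) or not times:
        raise ValueError("times and values must be non-empty and equally long")
    if t < times[0] or t > times[-1]:
        raise ValueError("t lies outside the sampled interval")
    i = bisect_left(times, t)
    if times[i] == t:
        return values[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def td_anteriority(times, td_x, c1_constr_offset):
    """Horizontal tongue dorsum position at the constriction offset of C1."""
    return value_at(times, td_x, c1_constr_offset)


def td_tt_distance(times, td_x, tt_x, c2_constr_onset):
    """TTx minus TDx at the constriction onset of C2 (assumed reading of 1.iii - 1.ii)."""
    return value_at(times, tt_x, c2_constr_onset) - value_at(times, td_x, c2_constr_onset)


# Made-up samples in mm; greater values are taken to be more anterior.
times = [0.0, 0.1, 0.2]
td_x = [-40.0, -38.0, -36.0]
tt_x = [-12.0, -10.0, -9.0]
anteriority = td_anteriority(times, td_x, 0.15)
distance = td_tt_distance(times, td_x, tt_x, 0.1)
```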

Figure 1. Corresponding horizontal position of the tongue dorsum and tongue tip gestures with respect to vertical movement of C1 and C2. Greater values for horizontal movement represent greater anteriority.
2.3. Statistical Analysis

Linear models in R (R Development Core Team, 2014) were constructed for each subject. Normalized gestural overlap values (in z-scores) were fitted with the lm function from the lme4 package (Bates et al., 2015). Sequence types ((/kt/, /kp/, /pt/) or (/kt/, /kp/)), boundary types (across-word vs. within-word), and speech rates (comfortable vs. fast) were used as fixed factors. We used Tukey’s Honestly Significant Difference (HSD) tests for post-hoc analysis. The pairs function was used to generate scatter plots, and the cor.test function to estimate Spearman’s rank correlation coefficient (rho (ρ)), a non-parametric measure of rank correlation. Levene’s test for homogeneity of variance, using medians as the center, was used to assess the stability of gestural coordination.
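For readers who want to see what the rank correlation involves, the following Python sketch computes Spearman’s ρ as the Pearson correlation of average ranks, which handles ties. It is a plain-Python analogue written for illustration, not the R code used in the study; the function names and the sample values are hypothetical, and no p-value is computed.

```python
from math import sqrt


def average_ranks(xs):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with xs[order[i]].
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on the average ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (sx * sy)


# Made-up normalized lags for four tokens (two overlap measures).
onset_lag = [0.8, 1.0, 1.2, 0.9]
plateau_lag = [0.7, 1.1, 1.3, 1.0]
rho = spearman_rho(onset_lag, plateau_lag)
```

Because only ranks enter the computation, any monotonic relation between two overlap measures yields ρ close to 1 even when the relation is not linear.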

3. Results

3.1. Three Normalized Gestural Overlap Measures

In Figure 2, three normalized gestural overlap measures are positively correlated with each other. This indicates that similar degrees of gestural overlap are manifested over three different time periods of C1C2 sequences (/kt/, /kp/, /pt/) (e.g., the movement onset, the constriction onset, and the constriction plateau).

Figure 2. Scatter plot of three normalized overlap measures in C1C2 sequences (/kt/, /kp/, /pt/) for (a) Speaker K1, (b) Speaker K2, and (c) Speaker K3. Also shown is the result of Spearman’s rank correlation coefficient. Greater values represent less gestural overlap.
3.2. Lingual-Lingual and Lingual-Labial Gestural Coordination
3.2.1. Inter-Speaker Variability in Normalized Constriction Plateau Lag between /kt/ and /kp/

The results for /kt/ and /kp/ sequences from three speakers are shown in Table 1(a) for K1, Table 1(b) for K2, and Table 1(c) for K3. For two speakers (K1 and K3), there is an interaction between Sequence type (/kt/ vs. /kp/) and Boundary type (across-word vs. within-word) (t=−2.51, p<0.05 for K1; t=−3.38, p<0.01 for K3) and a main effect of Sequence type (/kt/ vs. /kp/) (t=2.18, p<0.05 for K1; t=6.61, p<0.0001 for K3).

Table 1. Normalized constriction plateau lag for (a) Speaker K1, (b) Speaker K2, and (c) Speaker K3

(a) Speaker K1
                            Estimate   Std Error   t-value   Pr(>|t|)
(intercept)                    0.945       0.028    33.304   p<0.0001
SeqType [kt]                   0.087       0.040     2.177   p<0.05
BdType within                 −0.004       0.039    −0.112   N.S.
SrType fast                    0.018       0.040     0.448   N.S.
Seq [kt]×Bd within            −0.140       0.056    −2.508   p<0.05
Seq [kt]×Sr fast               0.002       0.057     0.038   N.S.
Bd within×Sr fast             −0.055       0.055    −1.000   N.S.
Seq [kt]×Bd wthn×Sr fast       0.130       0.078     1.662   N.S.

(b) Speaker K2
                            Estimate   Std Error   t-value   Pr(>|t|)
(intercept)                    1.096       0.032    34.520   p<0.0001
SeqType [kt]                  −0.043       0.045    −0.964   N.S.
BdType within                 −0.015       0.045    −0.332   N.S.
SrType fast                   −0.011       0.045    −0.248   N.S.
Seq [kt]×Bd within             0.078       0.064     1.236   N.S.
Seq [kt]×Sr fast               0.077       0.064     1.220   N.S.
Bd within×Sr fast              0.021       0.064     0.331   N.S.
Seq [kt]×Bd wthn×Sr fast       0.088       0.090     0.977   N.S.

(c) Speaker K3
                            Estimate   Std Error   t-value   Pr(>|t|)
(intercept)                    1.035       0.037    28.075   p<0.0001
SeqType [kt]                   0.345       0.052     6.607   p<0.0001
BdType within                  0.091       0.052     1.75    N.S.
SrType fast                   −0.008       0.052    −0.149   N.S.
Seq [kt]×Bd within            −0.249       0.074    −3.377   p<0.01
Seq [kt]×Sr fast               0.003       0.074     0.044   N.S.
Bd within×Sr fast             −0.047       0.078    −0.601   N.S.
Seq [kt]×Bd wthn×Sr fast       0.044       0.108     0.413   N.S.

With the results of post-hoc tests using Tukey HSD, we find that less overlap is observed in /kt/ sequences in the across-word boundary condition for K1 (e.g., Figure 3(a)), and less overlap is consistently present for both boundary conditions for K3 (e.g., Figure 3(b)). For Speaker K2, there is neither interaction between factors nor main effects (p>0.05).

Figure 3. Normalized constriction plateau lag in C1C2 sequences measured with Sequence type × Boundary type for (a) Speaker K1 and (b) Speaker K3. Greater values represent less gestural overlap. (The symbols ‘*’ and ‘**’ represent p<0.05 and p<0.01, respectively.)
Inter-Speaker Variability in the Relationship between Gestural Overlap and Tongue Dorsum Anteriority

There is a correlation between the horizontal tongue dorsum anteriority (measured as it is lined up with the constriction offset of the tongue dorsum gesture in C1) and the gestural overlap in C1C2 (estimated with normalized constriction plateau lags). The results from two speakers (K1 and K3) indicate that the tongue dorsum gesture in C1 has progressed further in the pre-/t/ context, compared to the pre-/p/ context, and these two speakers also show less gestural overlap in lingual-lingual sequences (/kt/) (the across-word condition for K1 and both boundary conditions for K3) as shown in Figure 4.

Figure 4. A scatter plot of normalized constriction plateau overlap with respect to horizontal position of the tongue dorsum at the constriction offset of the tongue dorsum gesture in C1 for (a) Speaker K1, (b) Speaker K2, and (c) Speaker K3. Also shown is the result of Spearman’s rank correlation coefficient and regression lines.

That is, the less overlapped C1C2 is (e.g., /k(#)t/>/k(#)p/ in terms of normalized constriction plateau lag), the more advanced the horizontal tongue dorsum position is at the offset of the constriction in the dorsal gesture. This may imply that there is a chain shift: the tongue tip movement in C2 induces more advanced tongue dorsum movement in C1 (when measured at the tongue dorsum constriction offset in C1), which, in turn, is responsible for less gestural overlap with the tongue tip gesture. This is indirectly supported by the observation that K2 did not differ either in terms of normalized constriction plateau lags or the horizontal position of the tongue dorsum at this particular time point as a function of different sequence types (/k(#)t/=/k(#)p/).

3.2.2. Horizontal Distance from the Tongue Dorsum to the Tongue Tip and Inter-Speaker Variability in Lingual-Lingual Coordination Stability

We measured the horizontal distance from the tongue dorsum to the tongue tip, with both positions taken at the constriction onset of C2 (e.g., the tongue tip gesture in the pre-/t/ context and lip aperture in the pre-/p/ context). At this time point, the tongue tip is more spatially separated from the tongue dorsum in the /k(#)t/ sequences for all speakers as shown in Figure 5 (cf., no difference in gestural overlap for K2). This may be due to the fact that the tongue tip gesture of C2 is extended to the alveolar ridge for the constriction, being an active articulator for the /t/ event. In contrast, the tongue tip articulator is passively moving during the constricting gesture of LA, and it is relatively less extended. This may also indicate that regardless of whether speakers demonstrate less gestural overlap and a more advanced tongue dorsum position in /k(#)t/ (see section 3.2.1), greater distance between the tongue dorsum and the tongue tip is consistently observed in /k(#)t/ sequences across the board. Given this, we infer that two consecutive lingual gestures in the /k(#)t/ sequences are executed in the more anterior area.

Figure 5. Horizontal distance from the tongue dorsum to the tongue tip lined up with the constriction onset of C2 for (a) Speaker K1, (b) Speaker K2, and (c) Speaker K3. (The symbols ‘**’ and ‘***’ represent p<0.01 and p<0.0001, respectively.) Also shown is the result of Levene’s test for homogeneity of variance.

Referring to the standard deviation values for each sequence type, we observed more stable intergestural coordination (i.e., smaller standard deviations) in the lingual-lingual gestures (/k(#)t/), compared to the lingual-labial gestures (/k(#)p/) for two speakers (F(1, 46)=9.55, p<0.01 for K2; F(1, 44)=16.74, p<0.0001 for K3). Speaker K1 does not exhibit more stability in the lingual-lingual /k(#)t/ sequences because the standard deviations in this speaker’s lingual-labial /k(#)p/ sequences are not as large as those for Speakers K2 and K3.
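The F statistic reported here comes from Levene’s test with the median as the center (also known as the Brown-Forsythe variant). The Python sketch below shows how such a statistic and its degrees of freedom are obtained: each value is replaced by its absolute deviation from its group median, and a one-way ANOVA F ratio is computed on those deviations. The function name and the sample values are hypothetical, and the p-value, which requires the F distribution, is left out.

```python
from statistics import median


def levene_median(*groups):
    """Median-centered Levene statistic; returns (F, df_between, df_within)."""
    if len(groups) < 2 or any(len(g) < 2 for g in groups):
        raise ValueError("need at least two groups with at least 2 values each")
    # Absolute deviations of each observation from its own group median.
    z = [[abs(y - median(g)) for y in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    group_means = [sum(g) / len(g) for g in z]
    grand_mean = sum(sum(g) for g in z) / n_total
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(z, group_means))
    within = sum((v - m) ** 2 for g, m in zip(z, group_means) for v in g)
    if within == 0:
        raise ValueError("F is undefined when all deviations are identical within groups")
    df_between, df_within = k - 1, n_total - k
    f_stat = (between / df_between) / (within / df_within)
    return f_stat, df_between, df_within


# Made-up horizontal distances (mm) for two sequence types.
kt = [1.0, 2.0, 3.0, 4.0, 5.0]
kp = [1.0, 3.0, 5.0, 7.0, 9.0]
f_stat, df1, df2 = levene_median(kt, kp)
```

With two groups the numerator has one degree of freedom and the denominator has the total number of tokens minus two, which matches the form F(1, 46) and F(1, 44) used in the text.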

4. Discussion

4.1. Inter-Speaker Variability in Gestural Overlap

In the current study, we confirm that gestural overlap (e.g., /k(#)t/ vs. /k(#)p/ vs. /p(#)t/) can be defined by any one of three arbitrarily chosen time points: the movement onset, the constriction onset, and the constriction plateau. The consistency we observe across several normalized gestural overlap measurements suggests that any one of them can be chosen and used reliably to test a given hypothesis.

Limiting the scope of analysis to the dorsal segment in C1 (e.g., /k(#)t/ vs. /k(#)p/) with respect to normalized constriction plateau overlap, speech rate effects are absent across the board, and morphosyntactic boundary effects are observed for two speakers (e.g., K1 and K3). Speaker K1 demonstrates that gestural overlap does not differ between two sequence types in the within-word context (/kt/=/kp/), and shows a longer constriction plateau lag in the across-word context (/k#t/>/k#p/). Speaker K3 consistently exhibits a longer constriction plateau lag in /kt/ as well as in /k(#)t/, compared to their counterparts (e.g., /kp/ and /k(#)p/). That is, less overlap in a lingual-lingual sequence is observed as long as there is any difference in terms of constriction plateau lag values. The results of our current study show inter-speaker variability in line with other articulatory studies: inter-speaker variability has been observed in articulatory studies of speakers with different palate shapes in multiple languages (Bulgarian, German, English (including British English, Scottish English, American English, and Australian English), Norwegian, and Polish (Brunner et al., 2009)) and three monozygotic and two dizygotic twin pairs in German (Weirich, 2010). Note that the results of morphosyntactic boundary effects, however, were not attested in Kochetov et al. (2007), where articulatory data from identical subjects were employed. Some possible reasons we suspect are that i) Kochetov et al. (2007) used data pooled across subjects, ii) in the current study, /maktambe/ (‘mild tobacco’) was used instead of their /aktam/ (‘curse’), and iii) we tested a subset (/k(#)t/ vs. /k(#)p/) instead of their comparisons among three sequence types (/k(#)t/ vs. /k(#)p/ vs. /p(#)t/). As it seems to be beyond the scope of the current study to trace what might have caused the difference in terms of morphosyntactic boundary effects between the two studies, we leave this issue for future study.

With respect to the greater reduction in the fast speech rate observed in Kochetov et al. (2007), it should be mentioned that no statistical analysis (e.g., t-tests, analysis of variance, etc.) was carried out there; only descriptive comparisons between fast and comfortable speech rates were provided for three individual speakers. In the current study, we referred to the results of Tukey HSD tests and concluded that there is no speech rate effect. Note that gestural reduction is prone to occur more frequently at a fast speech rate (Kirchner, 1998; Son, 2015) and this has been attested in place-assimilating contexts (e.g., /p(#)k/) for Korean (Son, 2008; Son et al., 2007). In contrast, we find that speech rate-dependent reduction is absent in the non-assimilating contexts of Korean: this implies that speech rate-dependent reduction is responsible for phonological processes such as place assimilation, which in turn suggests that gestural reduction is one of the factors deriving this phonological process (Jun, 1995).

4.2. Gestural Entrenchment in Two Consecutive Lingual-Lingual Movements

In Kochetov et al. (2007), less gestural overlap was observed in back-to-front sequences (/kt/ and /kp/) compared to front-to-back sequences (/pt/) in terms of constriction plateau lag values ((/kt/=/kp/)>/pt/). As a possible explanation, this was attributed to the perceptual recoverability hypothesis (Chitoran et al., 2002), where the audible release cue of C1 in the front-to-back sequences can be recovered at any point. Kochetov et al. (2007) also mentioned physiological constraints stating that two consecutive lingual-lingual gestures such as /kt/ are physiologically limited due to mutual entrenchment and thus less overlapped. Finding no statistical difference between lingual-lingual /k(#)t/ and lingual-labial /k(#)p/ sequences, they concluded that their data did not support the alternative hypothesis. In contrast, we find evidence in the current study that physiological constraints are active for two speakers (K1 and K3) to account for greater normalized constriction plateau lag values in lingual-lingual sequences (/k(#)t/), compared to lingual-labial ones (/k(#)p/). This incompatibility between the two studies may be due to the contribution of Speaker K2 to the data pooled across subjects in Kochetov et al. (2007) to the extent that it might have made the physiological effects void. Based on this, we suggest that physiological constraints be considered active, being speaker-dependent and sensitive to morphosyntactic boundary conditions in the Korean non-assimilating contexts.

Lastly, physiological constraints are further evaluated with regard to the horizontal distance from the tongue dorsum to the tongue tip. This distance is greater for lingual-lingual /k(#)t/ sequences, where both lingual articulators are activated, than for lingual-labial /k(#)p/ sequences, where the tongue tip moves passively while the upper and lower lip articulators are active in C2. Intergestural stability, assessed with Levene’s test for homogeneity of variance, indicates that speakers K2 and K3 show less variability in two consecutive lingual-lingual /k(#)t/ sequences than in lingual-labial /k(#)p/ sequences. This can be partly, but not fully, attributed to physiological constraints that might have inhibited greater variability in /k(#)t/ sequences. Although speaker K1 does not follow this pattern exactly as the other two speakers do, it is worth noting that this speaker does not show greater variability in lingual-lingual /k(#)t/ sequences, either.
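For readers who wish to see how such a variability comparison works in principle, the following is a minimal sketch of Levene's W statistic in Python. It is an illustration only and not the analysis script used in this study: the function name `levene_w`, the variable names, and the distance values are hypothetical, and the mean-centred variant of the test is assumed here because the variant is not specified in the text.

```python
from statistics import mean


def levene_w(*groups):
    """Return Levene's W statistic (mean-centred variant) for two or more groups."""
    k = len(groups)
    if k < 2:
        raise ValueError("need at least two groups")
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of each observation from its own group mean.
    deviations = []
    for g in groups:
        centre = mean(g)
        deviations.append([abs(x - centre) for x in g])

    dev_means = [mean(z) for z in deviations]
    grand_mean = sum(sum(z) for z in deviations) / n_total

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(deviations, dev_means))
    within = sum((v - m) ** 2
                 for z, m in zip(deviations, dev_means) for v in z)
    if within == 0:
        raise ValueError("deviations are constant within every group; W is undefined")

    return (n_total - k) / (k - 1) * between / within


# Hypothetical TD-to-TT horizontal distances (mm); invented for illustration only.
kt_distance = [24.1, 24.6, 23.9, 24.4, 24.2, 24.0]
kp_distance = [21.0, 23.5, 19.8, 22.9, 20.4, 24.1]

w = levene_w(kt_distance, kp_distance)
print(f"Levene's W = {w:.3f} (df = 1, {len(kt_distance) + len(kp_distance) - 2})")
```

W is evaluated against an F distribution with (k − 1, N − k) degrees of freedom, so a larger W indicates stronger evidence that the groups differ in variability. Obtaining a p-value requires an F distribution function, which the standard library does not provide; a statistics package (for instance, `scipy.stats.levene` with `center='mean'`) would be the usual route.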

Greater intergestural stability has been observed in CV sequences and lexical compounds. Synchronous coordination is observed in the syllable onset; it is considered a more stable mode of speech and settles faster into steady-state phasing relations (Nam, 2007; Nam et al., 2009). When Cho (2001) applied Levene’s tests for homogeneity of variance to assess intergestural stability, less variability was also observed in lexical compounds (e.g., /pek+pal/ ‘white hair’) and within a single morpheme (/pani/ ‘name’) than in non-lexicalized compounds (e.g., /pek+pal/ ‘white foot’) and across a morpheme boundary (e.g., /pan+i/ ‘class + NOM.’). Although Cho’s (2001) stimuli might not be balanced in terms of frequency of occurrence (cf. Lin et al., 2014), his data support tighter intergestural coordination for lexical entries before morphological processes. Based on the data acquired in the current study, we learn that a tighter intergestural coordinative structure for physiologically entrenched tongue articulators (e.g., the tongue dorsum and the tongue tip) can be established even for clusters with morphosyntactic boundaries, since they allow smaller degrees of freedom.


** This work was supported by NIH Grant DC-00403 conferred upon Catherine T. Best (PI) and Haskins Laboratories. I thank the three speakers for voluntarily participating in the EMMA experiments, Sean C. O’Rourke for proofreading this paper, and three anonymous reviewers for their constructive comments and valuable suggestions. All remaining errors are my own.

1 For /kt/ sequences, /aktam/ from Kochetov et al. (2007) is replaced with /maktambe/ to balance out the number of syllables with /kp/ sequences in the within-word boundary condition for all three speakers.

2 In order to balance out the number of syllables among C1C2 sequences in the across-word boundary condition (/k#t/, /k#p/, /p#t/), /ak#tahesə/ from Kochetov et al. (2007) is replaced with /mak#taŋkimjənsə/ for speaker K1 and /mak#tajaŋhake/ for speaker K2 (/ak#tahesə/ was retained for speaker K3, since it was only available for this speaker).



Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1-48. Retrieved from


Browman, C. P., & Goldstein, L. Articulatory phonology. Unpublished manuscript.


Browman, C. P., & Goldstein, L. M. (1986). Towards an articulatory phonology. Phonology Yearbook, 3, 219-252.


Browman, C. P., & Goldstein, L. (1988). Some notes on syllable structure in articulatory phonology. Phonetica, 45(2-4), 140-155.


Browman, C. P., & Goldstein, L. (1989). Articulatory gestures as phonological units. Phonology, 6(2), 201-251.


Browman, C. P., & Goldstein, L. (1990). Gestural specification using dynamically-defined articulatory structures. Journal of Phonetics, 18, 299-320.


Browman, C. P., & Goldstein, L. (1991). Tiers in articulatory phonology, with some implications for casual speech. In J. Kingston, & M. E. Beckman (Eds.), Papers in laboratory phonology I: Between the grammar and the physics of speech (pp. 341-376). Cambridge, UK: Cambridge University Press.


Browman, C. P., & Goldstein, L. (1992). Articulatory phonology: An overview. Phonetica, 49, 155-180.


Browman, C. P., & Goldstein, L. (1995). Gestural syllable position effects in American English. In F. Bell-Berti, & L. J. Raphael (Eds.), Producing speech: Contemporary issues. For Katherine Safford Harris (pp. 19-33). New York, NY: AIP Press.


Brunner, J., Fuchs, S., & Perrier, P. (2009). On the relationship between palate shape and articulatory behavior. Journal of the Acoustical Society of America, 125(6), 3936-3949.


Byrd, D. (1996). Influences on articulatory timing in consonant sequences. Journal of Phonetics, 24(2), 209-244.


Chitoran, I., Goldstein, L., & Byrd, D. (2002). Gestural overlap and recoverability: Articulatory evidence from Georgian. In C. Gussenhoven, & N. Warner (Eds.), Papers in laboratory phonology VII (pp. 419-448). Berlin: Mouton de Gruyter.


Cho, T. (2001). Effects of morpheme boundaries on intergestural timing: Evidence from Korean. Phonetica, 58(3), 129-162.


Gay, T. (1977). Articulatory movements in VCV sequences. Haskins Laboratories Status Report on Speech Research, 49, 121-147.


Jun, J. (1995). Perceptual and articulatory factors in place assimilation: An optimality theoretic approach (Ph.D. dissertation). University of California, Los Angeles.


Kelso, J. A. S., Saltzman, E. L., & Tuller, B. (1986). The dynamical perspective on speech production: Data and theory. Journal of Phonetics, 14, 29-59.


Kirchner, R. M. (1998). An effort-based approach to consonant lenition (Ph.D. dissertation). University of California, Los Angeles.


Kochetov, A., Pouplier, M., & Son, M. (2007, August). Cross-language differences in overlap and assimilation patterns in Korean and Russian. Proceedings of the 16th International Congress of Phonetic Sciences (pp. 1361-1364). Saarbrücken.


Kühnert, B., Hoole, P., & Mooshammer, C. (2006, December). Gestural overlap and C-center in selected French consonant clusters. Proceedings of the 7th International Seminar on Speech Production (pp. 327-334). Ubatuba.


Lin, S., Beddor, P. S., & Coetzee, A. W. (2014). Gestural reduction, lexical frequency, and sound change: A study of post-vocalic /l/. Laboratory Phonology, 5(1), 9-36.


Marin, S., & Pouplier, M. (2014). Articulatory synergies in the temporal organization of liquid clusters in Romanian. Journal of Phonetics, 42, 24-36.


Mooshammer, C., & Hoole, P. (1993). Articulation and coarticulation in velar consonants. Forschungsberichte des Instituts für Phonetik und Sprachliche Kommunikation der Universität München FIPKM, 31, 249-262.


Mooshammer, C., Hoole, P., & Kühnert, B. (1995). On loops. Journal of Phonetics, 23(1-2), 3-21.


Nam, H. (2007). A competitive, coupled oscillator model of moraic structure: Split-gesture dynamics focusing on positional asymmetry. In J. Cole, & J. I. Hualde (Eds.), Papers in laboratory phonology IX (pp. 483-506). Berlin/New York: Mouton de Gruyter.


Nam, H., Goldstein, L. M., Giulivi, S., Levitt, A. G., & Whalen, D. H. (2013). Computational simulation of CV combination preferences in babbling. Journal of Phonetics, 41(2), 63-77.


Nam, H., Goldstein, L., & Saltzman, E. (2009). Self-organization of syllable structure: A coupled oscillator model. In F. Pellegrino, E. Marsico, I. Chitoran, & C. Coupé (Eds.), Approaches to phonological complexity (pp. 299-328). Berlin/New York: Mouton de Gruyter.


Oliveira, L., & Marin, S. (2005). Patterns of velum coordination in Brazilian Portuguese. Phonetics and Phonology in Iberia (PaPi). Barcelona, Spain.


Perkell, J., Cohen, M. H., Svirsky, M. A., Matthies, M. L., Garabieta, I., & Jackson, M. T. (1992). Electromagnetic midsagittal articulometer (EMMA) systems for transducing speech articulatory movements. The Journal of the Acoustical Society of America, 92(6), 3078-3096.


Pouplier, M., Marin, S., Hoole, P., & Kochetov, A. (2017). Speech rate effects in Russian onset clusters are modulated by frequency, but not auditory cue robustness. Journal of Phonetics, 64, 108-126.


R Development Core Team. (2014). R: A language and environment for statistical computing [Computer software]. Vienna: R Foundation for Statistical Computing. Retrieved from


Saltzman, E. L., & Munhall, K. G. (1989). A dynamical approach to gestural patterning in speech production. Ecological Psychology, 1(4), 333-382.


Saltzman, E., & Kelso, J. A. (1987). Skilled actions: A task-dynamic approach. Psychological Review, 94(1), 84-106.


Son, M. (2008). Gestural overlap as a function of assimilation contrast. Korean Journal of Linguistics, 33(4), 665-691.


Son, M. (2011). A language-specific physiological motor constraint in Korean non-assimilating consonant sequences. Phonetics and Speech Sciences, 3(3), 27-33.


Son, M. (2015). Articulatory properties of the allophonic variant [ɾ] in Korean /l/-flapping: Gestural reduction and the role of gestural overlap. Studies in Phonetics, Phonology, and Morphology, 21(3), 427-456.


Son, M., Kochetov, A., & Pouplier, M. (2007). The role of gestural overlap in perceptual place assimilation in Korean. In J. Cole, & J. Hualde (Eds.), Papers in laboratory phonology IX (pp. 507-534). New York: Mouton de Gruyter.


Tiede, M. (2005). MVIEW: Software for visualization and analysis of concurrently recorded movement data. New Haven, CT: Haskins Laboratories.


Vihman, M. M., & Greenlee, M. (1987). Individual differences in phonological development: Ages one and three years. Journal of Speech and Hearing Research, 30(4), 503-521.


Weirich, M. (2010). Articulatory and acoustic inter-speaker variability in the production of German vowels. ZAS Papers in Linguistics, 52, 19-42.