From a phonological point of view, place of articulation can be largely classified by the feature [±anterior]: four target regions (i.e., labial, dental, alveolar, and post-alveolar) have the feature [+anterior] and six (i.e., palatal, velar, uvular, pharyngeal, epiglottal, and glottal) have the feature [-anterior] (Keating, 1991; Clements, 1985; Ladefoged & Maddieson, 1996, inter alia). Among the traditional terms listed above, most places of articulation refer to fixed regions along the upper and back surfaces of the vocal tract, except for epiglottal (Ladefoged & Maddieson, 1996). At the supralaryngeal level of articulation, constriction is formed by pairing a place of articulation with a mobile articulator, which is central to generating speech sounds. First, focusing on the labial place of articulation in particular, the upper lip is the target area, which can be paired with two different movable articulators. Labial place is produced with the upper lip as the articulatory target area and the active lower lip as the movable articulator (e.g., /p/, /b/). Linguo-labial is produced with the upper lip as the articulatory target area and the active tongue blade as the movable articulator (e.g., /t̼/, /d̼/). Second, labio-dental is produced with the upper teeth as the articulatory target area and the active lower lip as the movable articulator (e.g., /ɱ/). Note, however, that the two lips deserve separate attention because both the upper lip and the lower lip are, in principle, physiologically movable regardless of their linguistic status (e.g., an articulatory target region as opposed to an active articulator). Physiologically, lip closing occurs with an elevated (not depressed) mandible and mentalis contraction; earlier in the phonological developmental stage, lower lip raising occurs by being mechanically linked with jaw raising movement.
Later in the phonological developmental stage, children, by six years of age, begin to use upper lip lowering in coordination with jaw and lower lip raising movement to make a lip closing gesture beyond passive smacking (Gick et al., 2013).
Kinematic characteristics of the lips for bilabial stops in various contexts have been relatively well documented in previous literature. Browman & Goldstein (1988) showed that an onset (e.g., a singleton as well as CC(C) clusters) is globally organized into a syllable. Examining X-ray microbeam data from Miller & Fujimura (1982), they analyzed vertical movement of the lower lip and the tongue tip, showing that in English the C-center of C(C(C)) sequences (e.g., 'peak __', with the target words 'pots', 'sots', 'lots', 'spots', 'plots', 'splots') exhibited the most stable temporal relation with an anchor (e.g., the beginning of the acoustic closure for /t/ in the second word). In an electromagnetic articulography study on bilabial /p/ and /b/ in Ewe, Maddieson (2005) examined upper and lower lip movement and found that both the upper and lower lips clearly moved faster in voiceless bilabial /p/ than in its voiced counterpart /b/. In terms of gestural spatial magnitude, voiceless bilabial /p/ showed moderately more compression, indicated by lower vertical position of these two articulators on the ordinate plane.
Under the hypothesis of articulatory phonology (Browman & Goldstein, 1986, 1989, 1990, 1992), "gestures are units of action that can be identified by observing the coordinated movements of the vocal tracts" (1989:202). In the computational model (Saltzman et al., 1987), a gesture is abstract and invariant at the phonological level of representation and refers to coordinated movements of articulators to achieve a linguistically meaningful task. Browman & Goldstein (1986) proposed that gestures are understood in terms of tract variables, which are specified for constriction location (CL) and constriction degree (CD). Specifically, the tract variables are lip protrusion (LP), lip aperture (LA), tongue tip constriction location (TTCL), tongue tip constriction degree (TTCD), tongue body constriction location (TDCL), tongue body constriction degree (TDCD), velic aperture (VEL), and glottal aperture (GLO). The articulators involved in task achievement consist of the upper lip, lower lip, and jaw, coordinated in the tract variables LP and LA; the tongue tip, tongue body, and jaw in the tract variables TTCL and TTCD; the tongue body and jaw in the tract variables TDCL and TDCD; the velum in the tract variable VEL; and the glottis in the tract variable GLO. Task-controlled tract variables are assembled, by hypothesis, into a larger coordinated structure called a gestural score. In the task-dynamics model of speech production (Saltzman, 1986; Saltzman & Kelso, 1987), each gesture is mathematically represented using several parameters (e.g., mass, damping, and stiffness), and temporal periods are also specified for a given utterance (see Saltzman & Kelso (1987) for the parameters of the set of equations and Nam (manuscript) for a review of task-dynamics). Articulatory studies have served the purpose of providing kinematic data for the parameter values of gestures and gestural scores (Browman & Goldstein, 1995, 1988).
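As an illustration of how mass, damping, and stiffness jointly define a gesture, the sketch below integrates a critically damped point-attractor equation, m·x'' + b·x' + k·(x − target) = 0, for a single tract variable. This is a minimal sketch under stated assumptions rather than the implementation of the task-dynamics model: the function name simulate_gesture, the parameter values, the 12 mm starting aperture, and the integration scheme are all hypothetical choices made for illustration.

```python
import math

def simulate_gesture(x0, target, stiffness, mass=1.0, dt=0.001, duration=0.5):
    """Integrate m*x'' + b*x' + k*(x - target) = 0 with critical damping.

    All parameter values are illustrative assumptions, not fitted values.
    """
    # Critical damping: the tract variable approaches the target without oscillating.
    damping = 2.0 * math.sqrt(mass * stiffness)
    x, v = x0, 0.0
    trajectory = [x]
    for _ in range(int(round(duration / dt))):
        a = -(damping * v + stiffness * (x - target)) / mass
        v += a * dt   # semi-implicit Euler: update velocity first,
        x += v * dt   # then position with the new velocity
        trajectory.append(x)
    return trajectory

# Hypothetical lip aperture (LA) closing gesture: from 12 mm toward a 0 mm target.
la = simulate_gesture(x0=12.0, target=0.0, stiffness=400.0)
print(f"LA after 0.5 s: {la[-1]:.4f} mm")
```

Raising the stiffness value makes the simulated closure faster, which is one way the same abstract gesture can yield different kinematics.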
Lip gestures have been well studied in various languages using various methodologies for tracking articulatory movements (see Browman & Goldstein's (1986, 1988, 1990, 1995) x-ray microbeam study on English and Chaga; Kochetov et al.'s (2007) electromagnetic articulography study on Korean and Russian; Löfqvist's (1996) and Löfqvist & Gracco's (1997) simultaneous electromagnetic articulography and aerodynamic study on American English; Maddieson's (2005) electromagnetic articulography study on Ewe; Smith's (1992) x-ray microbeam study on Japanese and Italian; Ladefoged & Maddieson's (1996) videotape study on Vao; Son's (2008, 2013), Son et al.'s (2007), and Son et al.'s (2012) electromagnetic articulography study on Korean; Yanagawa's (2006) electromagnetic articulography study on American English, Cantonese, Taiwanese, German, French, and Japanese). For Korean, kinematic analyses have focused on the lip aperture gesture. In an electromagnetic midsagittal articulometer study, Son (2008) observed spatio-temporal reduction of the LA gesture in assimilating contexts such as /Vp(#)kV/ sequences, with inter-speaker variability; when reduction occurred at all, it was greater in the fast-rate and within-word conditions. Comparing the Korean three-way laryngeal contrast (lenis, fortis, aspirated) in four vocalic contexts (/iCi/, /aCa/, /iCa/, /aCi/), Son et al. (2012) examined kinematic data from a set of nonsense words with bilabial stops. Analyzing the LA gesture, they showed that aspirated stop /ph/ demonstrated a greater lip closing displacement than lenis stop /p/ (/ph/>/p/). In lip opening movements, fortis /p*/ always showed greater spatial displacement, acceleration duration, and overall movement duration than lenis /p/ (/p*/>/p/). Comparing the high and low vowel contexts, they consistently observed greater spatial displacement and peak velocity in the low vowel context in both lip closing and opening movements (/aCa/>/iCi/).
Browman & Goldstein (1995) also used lip aperture in the analysis of x-ray microbeam data from one subject. They examined lip constriction, which was estimated along with its corresponding LA gesture value in the analysis of syllable position effects. With the location of pitch accent systematically varied, either on one of the words in a carrier phrase (e.g., 'MY __ huddles/puddles/tuddles'; 'my __ HUDDLES/PUDDLES/TUDDLES') or on a target word ('POP', 'TOT', 'CAULK'), bilabial stop /p/ was more spatially reduced in coda position, patterning coherently with the coronal and velar stops. In addition, they sometimes used vertical lower lip movements for the sake of graphic compatibility with other vertical movements of the tongue tip or tongue body when lip movement was not the focus of analysis (e.g., 'say leap again' for onset /l/ vs. 'say peel again' for coda /l/).
The individual articulatory movements of the two lips were also the main focus of analysis in a simultaneous electromagnetic articulography and air pressure study. Löfqvist (1996) analyzed the upper lip, lower lip, and jaw for voiced/voiceless bilabial stops (/b/, /p/) elicited within a carrier phrase (e.g., 'say __ again') with reference to acoustic waveforms and air pressure data. In his analysis of kinematic data from four subjects (three native speakers of American English and one native speaker of Swedish), the lip aperture kept changing during the acoustic silence. To quote Löfqvist (1996:563), "The receivers continue to move during the closure due to their placement and to compression of the lip tissues...the lips may be meeting at a high velocity and also that there may be a mechanical interaction between the lips during the closure." In particular, the minimum vertical position of the upper lip coincided with neither the lip aperture minimum nor the maximum vertical position of the lower lip, since the upper lip reached its lowest position at the time of target attainment and then receded upward as it yielded to the ongoing lower lip raising movement until the latter reached its maximum vertical position. This was further related to negative lip aperture. As noted in Gick et al. (2013), overshoot (e.g., negative lip aperture) occurs in any constriction, since speakers can achieve constriction between a moving articulator and its target location without necessarily exercising fine control.
It seems to be sufficient if articulatory studies provide empirical data for determining parameter values for gestures and gestural scores in the task-dynamics model of speech production (Saltzman, 1986; Saltzman & Kelso, 1987). In this regard, the LA gesture in Korean has been relatively well documented for the three-way laryngeal contrast and place assimilation. However, it is yet to be discovered, as was done for American English (Löfqvist, 1996), how the upper and lower lips individually articulate to produce labial consonants in Korean, which will ultimately make our understanding more comprehensive. In this study, we proceed with this line of research by determining what happens at the time of maximum contact of the upper and lower lips.
The production of an utterance is hypothesized to be processed along a succession of different linguistic stages (i.e., lexicon - morphology - syntax - phonology - phonetic execution in the vocal tract) (Mihalicek & Wilson, 2011). In transformational generative grammar, an utterance is decomposed into a syntactic structure (Radford, 1988). Syntactically, the subject occurs in the specifier of an inflectional phrase (IP), which in turn precedes the complement of the IP. Using X-bar schema (Chomsky, 1993), the subject is followed by the object (e.g., [IP [NP [N' [N na]]] [I' [VP [V' [NP [N' [N pap]]] [V məknɨnta]]]]] 'I eat (a bowl of) rice') in a head-final language like Korean. Structurally, the subject and the object are governed by distinct maximal projections, two NPs, and a word boundary occurs between these two NPs.
For Korean, the spatio-temporal reduction of the lip aperture gesture in terms of LA minima was attributed to place assimilation in /...apka.../ sequences, along with more gestural overlap with inter-speaker variability (Son et al., 2007). What they observed from an electromagnetic midsagittal articulometer study was categorically reduced LA in within-word condition, which always occurred within an accentual phrase, although not all /pk/ clusters within an accentual phrase demonstrated reduced LA gestures (see Jun (1993, 2006) for intonational prosodic structure of Seoul Korean). In contrast, Jun (1996) observed partially or fully reduced lip gestures even in across-word boundary condition from an aerodynamic study on place assimilation in Korean. In another electromagnetic articulography study (Son, 2008), the target of place assimilation (e.g., /...Vp(#)kV.../) generally showed less constriction in lenis bilabial stop /p/ in the within-word condition, compared to across-word boundary, but there was no partially or fully reduced LA for most speakers (four out of five speakers indicated less constriction in the within-word condition). Meanwhile, one exception was found in one speaker out of five speakers, who showed more reduction in across-word condition and exhibited partially or fully reduced LA in that context if there was any reduction at all.
Korean also showed more spatio-temporal gestural reduction in coda compared to onset in an electromagnetic midsagittal articulometer study with regard to constriction duration as well as constriction degree (e.g., syllable-initial /k/ > syllable-final /k/ in Son (2011)). This is compatible with the results in Browman and Goldstein's (1995) X-ray microbeam study of American English where more gestural reduction in coda was observed within a single identical prosodic domain (e.g., syllable-initial /p/, /t/, /k/ > syllable-final /p/, /t/, /k/).
In this paper, we examine word boundary effects (e.g., linguistic factor) on gestural reduction confined to intervocalic syllable-initial onset as we focus on Korean bilabial lenis stop /p/ in two morpho-syntactic contexts (within-word boundary vs. across-word boundary). To serve this purpose, we proceed with lenis bilabial stop /p/ flanked by a set of homorganic low vowels (/...a(#)Ca.../), where C occurs consistently in syllable-initial position in the Korean orthography.
A further goal is to learn whether rate-dependent articulatory movement occurs in the upper lip and/or lower lip movements. Son's (2008) electromagnetic articulography study showed that the lip aperture minima indicated less constriction for labial place occurring in assimilating context /...ap(#)ka.../ in fast speech rate and with inter-speaker variability. Meanwhile, an intervocalic lateral /l/, which is phonetically executed as a flap /ɾ/ in Korean, did not demonstrate such speech rate effects on constriction degree when it was evaluated with vertical tongue tip position, not only in the low vowel context (/...ala.../) but also in the high vowel context (/...ili.../) (Son, 2015a, 2015b). In an effort to enhance our understanding of speech rate effects (e.g., paralinguistic factor) on constriction degree, on the one hand, and individual articulators involved in constriction degree, on the other, we pursue an examination of minimum vertical position of the upper lip (i.e., lowering movements) and the maximum vertical position of the lower lip (i.e., raising movements), while factoring in two speech rates (comfortable vs. fast).
Taken together, we examine maximum vertical position values of the lower lip and two values of the upper lip (minimum vertical position values of the upper lip and corresponding vertical position values of the upper lip lined up with the maximum vertical position of the lower lip) in two word boundary conditions (across-word vs. within-word) and two speech rate conditions (comfortable vs. fast).
Eight (three male and five female) native Seoul-Korean speakers voluntarily participated in the electromagnetic articulometer study (i.e., flesh-point tracking system) and were financially rewarded. At the time of data collection, the subjects were in their mid-twenties and early thirties, engaged in their graduate studies, and had spent their first twenty years of life in Seoul or Gyeonggi province in South Korea. Also at the time of data collection, they all resided in Connecticut, U.S.A., and were not isolated from Korean communities. They all identified themselves as native speakers of Seoul Korean, with no history of temporary or permanent speech or hearing impairment.
An electromagnetic midsagittal articulometer (EMMA; Perkell et al., 1992) was used to derive kinematic data relating to articulator movement. The two-dimensional point-tracking system records the positional values of electric transducers (i.e., receiver coils) attached to several articulators: the upper lip, the lower lip, the tongue tip, the tongue body, the tongue dorsum, and the lower incisor. Three transmitters secured on a plastic helmet generate a magnetic field alternating at different frequencies and induce an alternating current in the transducers. This enables us to extract the distances of each of the transducers from the three transmitters, which are in turn expressed as a vector on an ordinate plane (see also Löfqvist (1993) for a more detailed description of how the electromagnetic transduction technique works). Articulatory data were sampled at 200 Hz (i.e., one frame collected every 5 milliseconds) and further smoothed with a 20 Hz low-pass filter using post-processing procedures in MATLAB (MathWorks). Since the purpose of the current study is to examine upper and lower lip movements, we limited the scope of analysis to the kinematic characteristics of the vertical movement of the two lips. Acoustic data were also acquired simultaneously at the time of articulatory data collection.
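The smoothing step can be illustrated with a short sketch. The filter type used in the post-processing procedures is not specified here, so the code below assumes, purely for illustration, a zero-phase Hamming-windowed sinc FIR filter; the function name lowpass_fir and the choice of 41 taps are hypothetical. Only the 200 Hz sampling rate and the 20 Hz cutoff come from the text.

```python
import math

def lowpass_fir(signal, fs=200.0, cutoff=20.0, num_taps=41):
    """Zero-phase low-pass filter: Hamming-windowed sinc FIR (num_taps must be odd).

    The filter design is an assumption; the study only reports a 20 Hz low-pass.
    """
    fc = cutoff / fs                       # normalized cutoff in cycles per sample
    mid = num_taps // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    taps = [t / total for t in taps]       # normalize to unity gain at 0 Hz
    # Reflect the signal at both ends so the output keeps the input length.
    padded = signal[mid:0:-1] + signal + signal[-2:-mid - 2:-1]
    return [sum(t * padded[i + j] for j, t in enumerate(taps))
            for i in range(len(signal))]

# Synthetic one-second example at 200 Hz: a slow 5 Hz movement plus 60 Hz noise.
fs = 200.0
slow = [math.sin(2 * math.pi * 5 * n / fs) for n in range(200)]
noise = [0.5 * math.sin(2 * math.pi * 60 * n / fs) for n in range(200)]
smoothed = lowpass_fir([s + z for s, z in zip(slow, noise)], fs=fs, cutoff=20.0)
```

Because the taps are symmetric and centered on each output frame, the filtered trajectory is not shifted in time, which matters when landmarks are later aligned across the two lips.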
a. Target sequence /pa/
i. Within-word boundary condition.
/apai/ 'father' (North Korean dialect)
ii. Across-word boundary condition.
/pakatʃi/ '(a) gourd dipper'
b. Natural short sentences including the target sequences shown in (a.i) and (a.ii), and their syntactic structures using maximal projections (following Chomsky (1993) and simplified). The symbol '#' represents a word boundary.
i. Within-word boundary condition.
/apai # toƞmunɨn # pukhanmalija/
[IP[NP apai toƞmunɨn] [VP[NP pukhanmal] [V ija]]]
'Father comrade is North Korean vocabulary.'
ii. Across-word boundary condition.
/tʃəna # pakatʃilɨl # phala/
[IP[NP tʃəna] [VP[NP pakatʃilɨl] [V phala]]]
'Jeona sells gourd dippers.'
We presented stimuli containing the target sequence to subjects in blocks defined by the word boundary condition (across-word boundary vs. within-word boundary, created by means of different syntactic structures) and the speech rate condition. The across-word boundary condition was always acquired before the within-word boundary condition, and the comfortable speech rate before the fast speech rate. The eight subjects were instructed to read a given short natural sentence eight times, with the first four repetitions separated from the last four by four repetitions of a different short natural sentence; in total, eight repetitions of a given target word or phrase were acquired for further analysis. A total of 223 tokens from seven speakers were available for further analysis, since the data from one female subject and one token from one male speaker were excluded for various reasons (e.g., stuttering and a data conversion problem). The stimuli within a presentation block were randomly ordered, and the same order was used across blocks and subjects. Both stimulus sequences occurred in a potentially weakening environment, i.e., consistently in intervocalic position.
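The token count can be checked with simple arithmetic. The sketch below assumes that each of the seven analyzed speakers contributed eight repetitions in each of the four cells of the design (two boundary conditions by two speech rates); the variable names are hypothetical.

```python
# Design cells per speaker, as described in the text.
speakers_analyzed = 7      # eight recorded, one female subject excluded
boundary_conditions = 2    # across-word vs. within-word
rate_conditions = 2        # comfortable vs. fast
repetitions = 8            # per sentence
excluded_tokens = 1        # one token from one male speaker

expected_tokens = (speakers_analyzed * boundary_conditions
                   * rate_conditions * repetitions) - excluded_tokens
print(expected_tokens)
```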
Using the function of lp_Snapex in MVIEW (Tiede, 2005), we demarcated minimum vertical upper lip position and maximum vertical lower lip position relevant to the articulation of bilabial stop /p/. In particular, lp_Snapex used an algorithm based on velocity profiles and determined zero velocity. Also determined was corresponding vertical upper lip position as lined up with maximum vertical lower lip position. <Figures 1.a.i, 1.b.ii, and 1.c.iii> illustrate the respective specifics of the gestural demarcation superimposed on identical realtime movement trajectories of the upper and lower lips. Each figure shows one selected window of an across-word boundary /a # pa/ sequence and is captured from the temporal display in MVIEW.
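The three measurements can be sketched as follows. This is not the lp_Snapex algorithm itself, whose internals are not described here; it is a minimal illustration, assuming that a positional extremum is the frame where the frame-to-frame velocity changes sign. The function names and the trajectory values (in mm) are hypothetical.

```python
def extremum_frames(position, kind):
    """Frames where velocity changes sign: + to - for 'max', - to + for 'min'."""
    frames = []
    for i in range(1, len(position) - 1):
        before = position[i] - position[i - 1]   # velocity into frame i
        after = position[i + 1] - position[i]    # velocity out of frame i
        if kind == "max" and before > 0 and after <= 0:
            frames.append(i)
        elif kind == "min" and before < 0 and after >= 0:
            frames.append(i)
    return frames

def lip_landmarks(upper_lip_y, lower_lip_y):
    """Return the three dependent measures for one token window."""
    ul_min_frames = extremum_frames(upper_lip_y, "min")
    ll_max_frames = extremum_frames(lower_lip_y, "max")
    if not ul_min_frames or not ll_max_frames:
        raise ValueError("no zero-velocity landmark found in this window")
    ul_min_frame, ll_max_frame = ul_min_frames[0], ll_max_frames[0]
    return {
        "ul_min": upper_lip_y[ul_min_frame],           # minimum vertical UL position
        "ll_max": lower_lip_y[ll_max_frame],           # maximum vertical LL position
        "ul_at_ll_max": upper_lip_y[ll_max_frame],     # UL position at the LL maximum
        "frames": (ul_min_frame, ll_max_frame),
    }

# Hypothetical vertical positions (mm) over seven frames of an /a # pa/ window.
upper_lip = [2.0, 1.5, 1.0, 0.8, 1.0, 1.2, 1.5]
lower_lip = [-10.0, -8.0, -6.0, -4.0, -3.0, -3.5, -6.0]
landmarks = lip_landmarks(upper_lip, lower_lip)
```

In this made-up token the upper lip minimum (frame 3) precedes the lower lip maximum (frame 4), so the upper lip value aligned with the lower lip maximum differs from the upper lip minimum, which is why the two upper lip measures are kept separate.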
To take individual participant differences into account, we fitted linear mixed effects models in R (R Development Core Team, 2014) using the lmer function from the lme4 package (Bates et al., 2011). As dependent variables, we took the minimum vertical upper lip position <Figure 1.a.i>, the maximum vertical lower lip position <Figure 1.b.ii>, and the corresponding vertical upper lip position lined up with the maximum vertical lower lip position <Figure 1.c.iii>. We used the word boundary (across-word boundary vs. within-word boundary) and speech rate (comfortable vs. fast) conditions as fixed factors, and subjects (7 subjects) as a random factor. To evaluate main effects, we compared null models (each lacking the factor in question) with the full model (Speech rate + Boundary); to evaluate the interaction, we compared the full model (Speech rate + Boundary) with the interaction model (Speech rate × Boundary).
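The model comparisons reported below take the form of likelihood ratio tests, whose statistic is referred to a chi-square distribution. The sketch below shows how a p-value follows from such a statistic, assuming one degree of freedom (each comparison adds a single parameter for a two-level factor or for their interaction); that assumption, the function names, and the log-likelihood values in the usage example are hypothetical and are not taken from the R output.

```python
import math

def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    For df = 1, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))

def likelihood_ratio_test(loglik_smaller, loglik_larger):
    """Compare two nested models that differ by one parameter (assumed df = 1)."""
    stat = 2.0 * (loglik_larger - loglik_smaller)
    return stat, chi2_sf_df1(stat)

# Made-up log-likelihoods chosen so that the statistic is about 6.73.
stat, p_value = likelihood_ratio_test(-100.0, -96.635)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")
```

Under this one-degree-of-freedom assumption, a statistic above roughly 6.63 corresponds to p < 0.01 and a statistic below roughly 3.84 to p > 0.05.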
Results showed no interaction between Speech rate and Boundary (χ2=1.69, p>0.05). Adding Speech rate <Table 1.c.i>, we shifted the variance that was previously seen in the random effects in the null model as shown in <Table 1.b.i> (the residual of 2.45 stands for the random variation that does not stem from the individual subjects component) to the fixed effects component (the residual of 2.37). To conclude, the vertical lower lip maxima varied with different speech rates (χ2= 6.73, p<0.01), lowering it by 0.54 mm (SE. ±0.21) at fast rate (comfortable>fast).
In terms of different word boundaries, vertical lower lip maxima also varied with Boundary (χ2=7.36, p<0.01). Including Boundary <Table 1.c.i>, we shifted the variance that was previously seen in the random effects <Table 1.a.i> (the residual of 2.46 stands for the random variation that does not stem from the individual subjects component) to the fixed effects component (the residual of 2.37). To conclude, speakers raise the lower lip at maximum constriction by 0.57 mm (SE. ±0.21) in the within-word condition (across-word < within-word). In the following, the result of Speech rate is shown in <Figure 2.a> and that of Boundary in <Figure 2.b> using a box plot.
There was no interaction between Speech rate and Boundary (χ2= 0.18, p>0.05). Adding Boundary <Table 2.c.i>, we shifted the variance that was previously seen in the random effects <Table 2.a.i> (the residual of 2.94 stands for the random variation that does not stem from the individual subjects component) to the fixed effects component (the residual of 2.84). To conclude, speakers lower the upper lip at minimum constriction by 0.63 mm (SE. ±0.23) in the across-word condition (across-word < within-word) (χ2=7.75, p<0.01). However, vertical upper lip minima were not influenced by Speech rate (χ2=0.74, p>0.05). In <Figure 3>, the results of Speech rate and Boundary are separately plotted using a box plot.
In evaluating the dependent variable of corresponding vertical upper lip position measured by aligning it with vertical lower lip maxima, neither interaction between Speech rate and Boundary nor significant effects of Speech rate were observed (χ2=0.02; χ2=1.12, all at p>0.05). Including Boundary <Table 3.c.i>, we shifted the variance that was previously seen in the random effects <Table 3.a.i> (the residual of 2.87 stands for the random variation that does not stem from the individual subjects component) to the fixed effects component (the residual of 2.77). To conclude, speakers lower the upper lip by 0.61 mm (SE. ±0.22) in the across-word condition (across-word < within-word) (χ2=7.27, p<0.01). In <Figure 4>, the results of Speech rate and Boundary type are separately plotted using a box plot.
4. Summary and discussion
Vertical lower lip maxima varied with different word boundaries, manifesting more reduction of the lower lip movement in the across-word condition. In contrast, the upper lip moved to a greater extent downwards in the across-word boundary condition in terms of minimum vertical upper lip position and corresponding vertical upper lip position measured at the time point of lower lip maxima; an articulatorily adaptive compensation might have occurred as a concurrent reaction to articulatory reduction of the lower lip. In addition, more reduction in the fast rate condition was exhibited, being confined to maximum vertical lower lip position; this is compatible with previous studies on lenition-related allophonic variations (Kirchner, 1998), complying with relatively more or frequent gestural weakening in fast rate.
A bilabial stop takes the form of a tight seal between the pairing of the movable upper and lower lips, being concurrent with an overshooting (i.e., negative lip aperture) and a spread constriction of the lips (Ladefoged & Maddieson, 1996; Gick et al., 2013, inter alia). In traditional accounts, when articulating bilabial stops, the upper lip is the target place of articulation toward which the lower lip actively changes position (Ladefoged & Maddieson, 1996). Testing word boundary effects as well as speech rate effects in vertical lower lip maxima, the results revealed that the lower lip is a movable articulator, varying with a linguistic factor (across-word boundary < within-word boundary) and a paralinguistic factor (comfortable > fast) that we were interested in. Contrary to traditional accounts, the upper lip, known as the target location of constriction in labial place, also varied with different word boundaries; therefore, the veracity of the upper lip being the target region has not been confirmed. Taken together, our current study of the bilabial voiceless stop /p/ provided empirical evidence to uphold the assertion of articulatory phonology that articulatory characteristics should be described by reference to the lip aperture (LA) gesture which includes the upper lip and lower lip (as well as the jaw) as articulators (Browman & Goldstein, 1986, 1989, 1990, 1992).
With respect to moving articulators, it is intuitive to assume greater spatial displacement of the lower lip during lip closing and opening movement since it has the specified shape of the mandibular prominence. The mandible is of use in articulating vowels and consonants (Wood, 1979; Saltzman & Munhall, 1989; Browman & Goldstein, 1990; Mooshammer et al., 2007). Jaw height is known to differ depending on the constriction degree of vowels (Wood, 1979) and on consonant types with different manners of articulation (e.g., gradually increasing in the order /t/ > /d/, /n/, /l/ in Keating et al. (1994) and Mooshammer et al. (2003); no change among /t/, /d/, /s/, and /ʃ/, but gradually increasing in the order loud speech > comfortable speech, being confined to /n/, and higher in the order /t/, /d/, /s/, /ʃ/ > /l/ with inter-speaker variability in Mooshammer et al. (2007)). However, examining a set of coronal stop consonants, Son et al. (2011) found invariable vertical jaw maxima among coronal /t/, /t*/, /th/, and /n/ in a homorganic low vowel context with nonsense words (/aCa/). Likewise, speech rate effects were absent from constriction maxima in the vertical tongue tip gesture in a high vowel context as well as a low vowel context (fast=comfortable in /ala/→[aɾa] in Son (2015a); fast=comfortable in /ili/→[iɾi] in Son (2015b)). In sum, there existed some segment-specific or speech style-dependent jaw height differences despite the absence of coherent results across studies. To add to our understanding of how the jaw articulator behaves in configuring a segment in Korean, it is of use to resolve the following related questions about jaw articulator movement in future study: i) does the jaw serve consonantal articulation only through the functional movement of an active articulator that is elevated upwards to form constriction (Saltzman & Munhall, 1989; Browman & Goldstein, 1990), and ii) does it do so possibly beyond simple assistance, by manifesting diverse jaw positions with different manners of articulation and/or jaw height differences for a single segment in different linguistic contexts (e.g., syntax, prosodic structure, speech rate/style) (Mooshammer et al., 2007)? Since these questions are beyond the scope of the current study, we leave them for further analysis.
We did observe that the vertical lower lip maxima were lower (more reduction) in the across-word boundary condition, compared to the within-word condition. Comparable results were obtained in some previous studies. Byrd's (1996) electropalatography study on intergestural coordination observed, for one speaker out of five, a trend of less constriction in onset /k/ when a /Vs#kV/ sequence is broken up by an immediately preceding word boundary, compared to /V#skV/ sequences and /Vsk#V/ sequences (e.g., 'Type a scrab again.', 'Type basscap again.', 'Type mask amp again.'), but the opposite pattern was not observed across speakers. We suggest that the greater gestural reduction of the intervocalic onset in the across-word condition may be, in part, attributed to the twofold functions of oral constriction gestures (e.g., consonantal and vocalic gestures in Browman & Goldstein (1992)). In a task-dynamics model of speech production, the timing of a gesture is governed by coupled oscillators (Saltzman & Munhall, 1989). By hypothesis, coordination among gestures can exhibit stable modes (in-phase, 0°) (to which CV sequences belong), which can be represented using coupling graphs (Goldstein et al., 2006). Since the target /p/ in the current study occurs in onset position irrespective of the word boundary condition, this implies that a within-word /apa/ sequence is not distinguished from an across-word /a#pa/ sequence in terms of coupling graphs; the constriction actions and the relative phasing between /p/ and /a/ are both invariant.
However, given that V-to-V coarticulation occurs due to the twofold functional characteristics of oral gestures, consonantal and vocalic (Browman & Goldstein, 1992; Öhman, 1967), we conjecture that two instances of /a/ interrupted by a word boundary may have caused, in part, less stable modes in terms of V-to-V coarticulation by gestural blending (Romero, 1996), which may in turn have induced gestural spatial reduction such as that observed in the current study. A further hint about the greater reduction in the across-word boundary condition may be found in a general assumption of articulatory phonology: the intergestural coordination mode is specified in lexical items and extensively applied to a sequence of lexical items, but the articulatory consequences can vary (Browman & Goldstein, manuscript). We will leave this issue for further study.
In addition, caution should be taken since the result of the current study of the bilabial voiceless stop /p/ revealed that the lip aperture gesture, as a holistic measure, is more appropriate as the subject of articulatory study. In future study, linguistic and paralinguistic factors need to be evaluated in terms of the lip aperture gesture in intervocalic position, which ultimately conforms to the assertion of articulatory phonology, "gestures are the units of action that can be identified by observing the coordinated movements of the vocal tracts" (Browman & Goldstein, 1989:202).
We would like to conclude this study by pointing out a deficiency that needs to be improved upon in future study. Note that one set of target onsets from our data did not exhibit paradigmatic contrast in terms of syntactic or prosodic hierarchical structures; the target onset in the within-word condition is part of a word in sentence-initial or utterance-initial position (e.g., [InflP(IP)[NP(AP) σσσ σσσ ...), while the target onset in the across-word condition is part of a word which is one word away from the sentence-initial or utterance-initial position ([InflP(IP)[NP(AP) σσ [NP(AP) σσσσσ ... or [InflP(IP)[NP(AP) σσ [NP σσσσσ ...). The asymmetric distribution of stimuli stems from the fact that we have segmental contexts and syllabic structure balanced such that a lenis stop target /p/ occurs in intervocalic position (e.g., /a(#)Ca/ as a possible lenition context) and occupies onset position orthographically. Nevertheless, admitting that the stimuli conditioning was not prepared especially for us to systematically evaluate word boundary effects on articulatory reduction, neither did we find apparent evidence that word-internal onset is characterized by a strengthening/lengthening position compared to word-initial onset, one word apart from sentence-/utterance-initial position (cf., domain-initial strengthening in the order Ui>IPi>APi>Wi in Cho & Keating (2001); domain-edge lengthening in the first segment of an accentual phrase and in the final syllable of an intonational phrase in Jun (1993)).