It is challenging for English as a Foreign Language (EFL) learners to acquire English vowels. Many previous studies have demonstrated that non-native English speakers find it difficult to identify English vowels (Cho et al., 2013; Cutler et al., 2005; Flege et al., 1997; Franklin, 2009; Wang & van Heuven, 2006).
First and foremost, numerous earlier studies have established that language transfer contributes heavily to second/foreign language acquisition. Lado (1957) proposed the Contrastive Analysis Hypothesis (CAH) to explain the relationship between a native language (L1) and a second language (L2) in L2 acquisition: the greater the similarity between L1 and L2, the easier it is to learn L2, whereas the greater the dissimilarity, the harder it is. However, CAH has been criticized on the grounds that it is insufficient for predicting the degree and directionality of difficulty for second-language learners (Eckman, 1977); furthermore, it cannot guarantee that similarity between two languages necessarily facilitates learning or that difference automatically inhibits it.
Meanwhile, the Speech Learning Model (SLM), proposed by Flege (1987), holds that the greater the similarity between L1 and L2, the more difficult it is for second-language learners to establish a new category in the L2 system. Additionally, Flege (1995) proposed a revised version of SLM that regards L2 learning experience as one of the most important factors in L2 acquisition. Learning non-native sounds that are absent from the native inventory may seem difficult at first, but they can be learned with relative ease as learners extend their L2 experience/exposure. That is, L2 learners with extended L2 experience can attain native-like performance in L2 acquisition (Ho, 2010).
With regard to acquiring L2 English, English vowels are more uncertain and equivocal for second-language learners than English consonants. This is largely because the articulation of vowels is harder to describe than that of consonants, regardless of what the second language is (Franklin, 2009; Jones, 1960). Vowel articulation involves several interacting parameters (e.g., tongue height, tongue backness, and lip rounding), making it difficult for non-native English speakers and even L1 English speakers to pronounce English vowels correctly (Jones, 1960; Ladefoged & Disner, 2012:129). In English, tenseness, the degree of tongue tension, also affects the articulation of vowels. For this reason, EFL learners whose L1 vowels are not distinguished by tenseness, such as Chinese or Korean learners, experience greater difficulty in identifying English vowels than those whose L1 distinguishes tense and lax vowels, such as Dutch or German learners (the relevant studies are reviewed in section 1.1).
Furthermore, compared to other languages in the world, the relatively high density of the English vowel system causes pronunciation difficulties for EFL learners insofar as they may encounter vowels in English that simply do not exist in their native language (Franklin, 2009; Maddieson, 1997). For instance, in the case of monophthongs, American English has approximately 14 or 15 vowel qualities /i, ɪ, u, ʊ, e, ɛ, æ, ɚ, ɝ, ə, ʌ, ɔ, o, ɑ, (a)/ in its vowel inventory (Kenyon & Knott, 1953; Reetz & Jongman, 2009). In fact, fewer than 10% of the world’s languages contain 15 or more vowel phonemes (Franklin, 2009). Besides English, Dutch and German have 13 and roughly 13 to 17 plain vowels, respectively. By contrast, approximately 60% of the world’s languages contain only six or fewer pure vowel qualities (Maddieson, 1997). For example, Spanish, Japanese, and Mandarin Chinese (i.e., the Beijing dialect) each have five pure vowel qualities in their vowel inventories (Franklin, 2009). Korean, on the other hand, has ten vowel phonemes /i, y, e, ø, ɛ, ɨ, ʌ, a, u, o/ (Yang, 1996), an inventory that is denser than those of Spanish, Japanese, and Mandarin Chinese but less dense than that of English (Franklin, 2009).
Based on the theoretical background outlined above, it can be predicted that patterns of English vowel production will vary among EFL learners 1) whose native languages (L1) differ substantially from English, such as Korean, Chinese, and Japanese, and 2) who have different levels of L2 proficiency, as claimed by SLM. Given these predictions, this paper concentrates on the relationship between English vowel production and L2 proficiency among EFL learners, especially Korean EFL learners.
There have been a variety of cross-linguistic studies on English vowel production by non-native English speakers with diverse native language (L1) backgrounds. For instance, Wang & van Heuven (2006) conducted research on ten English vowels /i, ɪ, u, ʊ, ɛ, ɔ, æ, ʌ, o, e/ produced by Chinese, Dutch, and American speakers. The study classified the English vowels into two subsets, the short/lax vowels (/ɪ, ɛ, ʌ, ʊ/) and the long/tense vowels (/i, e, æ, ɔ, o, u/), and then compared the two subsets in terms of spectral (i.e., formant frequencies) and temporal (i.e., vowel duration) features across the three L1 groups. According to the results, L1 English speakers separated the two subsets more accurately in both spectral and temporal respects than the two non-native (i.e., Chinese and Dutch) groups. Chinese speakers, however, largely failed to differentiate the short/lax vowels from their long/tense counterparts spectrally, while native Dutch speakers demonstrated clear spectral differences between the vowels. This is because Chinese does not treat vowel length (i.e., short/lax vs. long/tense vowels) as a vowel feature at the phonological level, whereas Dutch, like English, distinguishes phonetically lax and tense vowels in its vowel system (Wang & van Heuven, 2006). As for the temporal feature (i.e., vowel duration), however, native Chinese speakers demonstrated a clearer temporal distinction between the short/lax and long/tense vowels than Dutch EFL learners. Dutch speakers, by contrast, scarcely distinguished the tense and lax vowel contrast /u/-/ʊ/. Although both English and Dutch have tense and lax vowel subsets in their vowel inventories, the Dutch vowel system lacks the /u/-/ʊ/ contrast, which prevents Dutch EFL speakers from separating it (Li & Lee, 2017; Wang & van Heuven, 2006).
Studies have also been conducted on the production of English vowels by Japanese learners. Ingram & Park (1997) investigated the perception and production of Australian English vowels /i, ɪ, ɛ, æ, a/ by Korean and Japanese learners of English. A noteworthy finding was that Japanese learners of English clearly perceived the /æ/ vowel, which is absent from their L1 inventory; furthermore, they distinguished /æ/ from its near neighbor /a/. The study concluded that this was, at least in part, because Japanese uses phonological length as a main acoustic cue (Tsujimura, 1996), suggesting that this L1 characteristic affected the perception of non-native vowels. Meanwhile, Korean speakers rarely separated the /ɛ/-/æ/ contrast, since the recent phonological merger of /ɛ/ and /e/ in their L1 vowel system was reflected in their L2 production (Ingram & Park, 1997).
On the whole, these previous studies confirmed that L1 transfer reduces fluency in non-native L2 pronunciation, a difficulty that does not arise in the learners’ native language (Best, 1991). Other empirical studies regarding English vowels produced by L1 English speakers and Korean learners of English are discussed in section 1.2.
English and Korean have markedly different vowel systems. The two languages share the vowels /i, e, ɛ, o, u, ʌ/. However, Korean lacks the four vowels /ɪ, æ, ʊ, ɑ/ that are present in the English vowel inventory (the vowel /ɑ/ in English is not identical to the vowel /a/ in Korean) and instead uses two front rounded vowels /ø, y/ and one high central vowel /ɨ/, which are not found in English (Franklin, 2009; Yang, 1996). In addition, as mentioned above, English has tense and lax vowel pairs in its inventory, such as /i/-/ɪ/, /ɛ/-/æ/, and /u/-/ʊ/, whereas no Korean vowels are distinguished by the degree of tongue tension (Cho et al., 2013; Flege et al., 1997; Ingram & Park, 1997; Li & Lee, 2017; Kim, 2007; Koo, 2000; Tsukada et al., 2005; Yang, 1996). Thus, Korean EFL learners can be expected to have difficulty pronouncing English vowels that do not exist in the Korean inventory. A number of prior studies have compared English vowel production by native speakers of English and Korean EFL speakers.
Flege et al. (1997) compared German, Spanish, Mandarin Chinese, and Korean EFL learners with native English speakers in the production of the front English vowel contrasts /i/-/ɪ/ and /ɛ/-/æ/. The results indicated that Korean speakers, in particular, relied unduly on vowel length to separate the contrasts compared with the other non-native groups. Yang (2008) examined the English tense and lax vowels /i/-/ɪ/ and /u/-/ʊ/ produced by Korean and American males. Both groups separated the contrasts temporally by producing the tense vowels much longer than their lax counterparts. By contrast, a spectral distinction between the pairs was apparent only for the native English speakers, who showed a marked contrast between the pairs in the first formant (F1), which is related to tongue height; it was not apparent for the Korean speakers.
All these findings confirm that Korean EFL learners barely separate English tense and lax contrasts spectrally, owing to the lack of a tense and lax distinction in the Korean vowel system. Because the Korean inventory has fewer vowels than the English inventory, Korean learners of English produce a smaller English vowel space than native English speakers, which constrains their phonetic realization of English vowels, especially phonetically neighboring vowels (Franklin, 2009; Koo, 2000). In other words, Korean learners of English are comparable to Chinese and Japanese learners of English insofar as each group demonstrates negative L1 transfer in the production of L2 English vowels that are not present in the native language (L1) inventory.
However, according to Flege’s SLM (1995), L2 learners with extended L2 experience/exposure can improve their L2 performance. Namely, these instances of negative L1 transfer can be overcome as learners’ L2 experience/exposure grows. Indeed, there are numerous empirical studies to support the claim in SLM, illustrated in section 1.3.
Numerous studies have classified EFL learners’ L2 experience according to several factors and investigated whether these factors affect L2 English production.
Flege et al. (1997) studied the effects of English-language experience on the production and perception of the English front vowels /i, ɪ, ɛ, æ/ by non-native English speakers, including native Spanish, German, Mandarin, and Korean speakers. The EFL learners were divided into relatively experienced and inexperienced groups on the basis of their length of residence in the US (mean=7.3 vs. 0.7 years). In general, the experienced groups produced and perceived English vowels more accurately than the relatively inexperienced ones, regardless of L1 background. Wang (1988) studied the perception and production of the English /i/-/ɪ/ and /ɛ/-/æ/ contrasts by Mandarin Chinese EFL learners. The Chinese learners were classified according to their length of stay in an English-speaking country (mean=1 vs. 5 years). The results suggested that the less experienced group pronounced the vowel /ɪ/ like /i/ and the vowel /æ/ like /ɛ/, while the relatively experienced group produced the vowels /ɪ/ and /æ/ similarly to native speakers.
Tsukada et al. (2005) examined the production and perception of eight English vowels /i, ɪ, e, ɛ, æ, ɑ, ʌ, u/ by native Korean adults and children. Each of the two age groups was compared with age-matched native English speakers. The main finding was that native Korean children were better at identifying and separating English vowels than native Korean adults. The L1 Korean children resembled their age-matched native English peers in both their perception and production of the eight English vowels; however, the L1 Korean adults failed to achieve native-like performance in either perception or production.
Kim (2007) investigated the production of the English front vowels /i, ɪ, e, ɛ, æ/ by Korean L2 English learners. The Korean speakers were assigned to 10 distinct groups based on three main factors: their age of arrival in the US, their length of residence in the US, and their degree of motivation to learn English. Prior to the analysis, the participants filled in a questionnaire recording their age, gender, time of arrival, length of residence in the US, and degree of motivation to learn English. The results showed that most of the Korean learners hardly separated the /i/-/ɪ/ and /ɛ/-/æ/ contrasts. Most importantly, only those who arrived in the US before the age of 11 could pronounce English vowels in much the same way as native English speakers. Even long-time residents of the United States were more likely to struggle to distinguish the English front contrasts if they had not arrived in the US during early childhood.
Turning from production to perception, Escudero et al. (2012) probed the perception of the English front vowel contrast /ɛ/-/æ/ by speakers of two regional varieties of Dutch: North Holland Dutch, spoken in the Netherlands, and Flemish Dutch, spoken in Belgium. The lack of the English /ɛ/-/æ/ contrast was evident irrespective of the regional variety of Dutch. However, the research also found that the two varieties differed in their non-native perception of the English vowels /ɛ/ and /æ/. Specifically, North Holland speakers identified English /ɛ/ more accurately than /æ/, whereas the Flemish group identified both vowels equally well.
There have been empirical studies on the effects of L2 proficiency on L2 English production. Ho (2010) investigated the influence of L2 proficiency levels on the production and perception of the American English front vowels /i, ɪ, e, ɛ, æ/ by EFL learners in Taiwan. Forty EFL participants were assigned to either a higher-level proficiency EFL group (HEFL) or a lower-level proficiency EFL group (LEFL) (20 vs. 20, respectively) according to their scores on the GEPT (General English Proficiency Test), a standardized English proficiency test in Taiwan. The results showed significant L2 proficiency effects on the production and perception of English front vowels by Taiwanese EFL learners. The HEFL group significantly outperformed the LEFL group in the perception of all the front vowels. As for production, the HEFL group produced the vowel /æ/ in a near-native fashion, while the LEFL group performed better with the vowels /i/ and /ɛ/ than with the other front vowels but failed to reach a near-native level on any of the vowels.
However, most studies on the influence of L2 English proficiency on L2 English production, especially by Korean EFL learners, have focused on English consonants rather than on English vowels (Cho, 2017; Kong & Yoon, 2013; Lee, 2018; Park, 2017; Park et al., 2010). For instance, Park et al. (2010) examined whether Korean learners’ production of English n-l sequenced words (e.g., only, fan letter, boneless) and m-l sequenced words (e.g., homeland, home loan, harmless) is influenced by Korean /n/-lateralization (e.g., /non-li/ [nol.li] ‘logic’, /nan-lo/ [nal.lo] ‘stove’) and /l/-nasalization (e.g., /kam-li/ [kam.ni] ‘supervision’, /kɨm-li/ [kɨm.ni] ‘interest rate’) when they acquire the L2 English sound system. The study found that, in general, the high proficiency group outperformed the low proficiency group in the production of both the n-l and m-l sequenced words. Kong & Yoon (2013) examined how Korean learners of English employ multiple acoustic cues (i.e., VOT and F0) in the perception and production of the English alveolar stop voicing contrast. The effects of L2 English proficiency were visible insofar as the high proficiency group had better control over inhibiting and enhancing the relevant acoustic parameters. Cho (2017) probed native Korean speakers’ production of English stops and fricatives using a rated L2 English read speech corpus spoken by Korean learners of English. The results showed a correlation between higher proficiency levels and appropriate aspiration, while the lower levels displayed a high proportion of production errors in stops and fricatives.
Overall, several learner factors, including the length of residence in the US, age (adult vs. children), regional varieties of L1, and L2 proficiency have been investigated in relation to their influence on non-native speakers’ production of English vowels. They have demonstrated that the ways to classify L2 learners correlate, to some extent, with EFL learners’ English production and perception. However, the research on the role of L2 English proficiency in Korean learners of English has focused more on the production of English consonants in comparison with that of English vowels. In this regard, the current thesis examines the correlation between Korean EFL learners’ L2 proficiency and English vowel production.
The purpose of the present study is to determine whether there is a relationship between accurate vowel production and L2 proficiency in L2 English produced by Korean EFL learners. Our working hypothesis is that the higher a speaker’s rated level in the speech corpus, the better that speaker separates the adjacent vowel pairs.
The present study, building on the results of previous studies, investigates the relationship between English vowel production and L2 proficiency in L2 production by Korean EFL learners. To this end, it employs a rated L2 English read speech corpus, named ‘Genie SpeeCor’, spoken by Korean learners of English (Rhee, 2016). According to Mauranen (2004), a speech corpus is both helpful and necessary for teaching non-native speakers and for their learning of L2 production. In addition, corpus size has been considered important for ensuring that a corpus is representative of the way the language is actually used (Campbell et al., 2007; Park, 2017). Various speech corpora of Korean EFL learners exist (Kim et al., 2004; Yoon et al., 2009), but they are not rated according to a detailed rubric for evaluating Korean learners. Genie SpeeCor, by contrast, contains speech from 200 Korean EFL learners, rated with a detailed scoring rubric (see Chapter 2 for details).
Earlier work on Korean EFL learners’ production of English vowels has been heavily biased toward the production of English tense and lax contrasts (Cho et al., 2013; Flege et al., 1997; Ingram & Park, 1997; Li & Lee, 2017; Kim, 2007; Tsukada et al., 2005; Yang, 2008). This study, by comparison, covers a more comprehensive set of English vowel phonemes /i, ɪ, ɛ, æ, ʌ, ɔ, ɑ, ʊ, u/. All nine vowels are paired with adjacent vowels (i.e., /i/-/ɪ/, /u/-/ʊ/, /ɛ/-/æ/, /ʌ/-/ɔ/, /ɔ/-/ɑ/), and each pair is compared in terms of a spectral feature (i.e., formant frequency), which is the primary cue for separating vowel qualities. In addition, the tense and lax contrasts /i/-/ɪ/ and /u/-/ʊ/ are measured in terms of a temporal feature (i.e., vowel duration) as well as the spectral feature, on the grounds that both acoustic cues (i.e., formant frequency and vowel duration) play a key role in distinguishing tense from lax vowels.
This study employs a rated L2 English read speech corpus, named ‘Genie SpeeCor’, spoken by Korean learners of English. A total of 200 native Korean speakers participated in this speech corpus. With the exception of 10 subjects who had lived in English-speaking countries for less than five years, none of the subjects had ever lived abroad. The subjects fell into three age groups: sixty elementary school students (age range: 10–12 years old; 29 males vs. 31 females), eighty middle school students (age range: 13–14 years old; 40 males vs. 40 females), and sixty adults (age range: 19–33 years old; 30 males vs. 30 females). There were 100 English sentences for each group. The participants were asked to read the text materials aloud at a casual rate into a head-set microphone (Shure WH20XLR) in a sound-controlled room. The recordings were sampled at 16 kHz with 16-bit resolution and stored in PCM format (Park, 2017; Rhee, 2016). The fluency of the Korean learners’ L2 English was assessed by five human raters. Three of them were native Korean speakers who were either researchers or graduate students majoring in English language and literature in 2016. The other two raters were native English-speaking education experts. All of the raters had been trained in the evaluation of L2 English proficiency before carrying out the pronunciation and fluency ratings. They rated the recorded audio files for L2 English proficiency using an on-screen scoring tool suggested by ETRI (Electronics and Telecommunications Research Institute) (Rhee, 2016). The evaluated materials were assigned to five levels of L2 English proficiency, from level 1 (novice) to level 5 (mastery). The raters first assessed the data according to specific phonetic features (i.e., analytic evaluation) and then evaluated them overall (i.e., holistic evaluation). Appendices A and B provide the analytic and holistic scoring rubrics used for the Genie speech corpus.
Of the age groups, only the adult group (age range: 19–33 years old; 30 males vs. 30 females) was chosen for this study to eliminate the effects of age. Furthermore, within the adult group, only the data produced by male speakers were selected. Because most tokens are densely clustered at the intermediate level (i.e., 65 tokens at level 1; 471 at level 2; 1,064 at level 3; 176 at level 4; 65 at level 5), the five rated levels are regrouped into three categories: levels 1–2 as the lower level, level 3 as the middle level, and levels 4–5 as the higher level.
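The regrouping described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the function names `proficiency_group` and `regroup` are hypothetical and are not part of the corpus tooling; the token counts are the ones reported in the text.

```python
from collections import Counter

# Token counts per rated level, as reported in the text.
TOKENS_PER_LEVEL = {1: 65, 2: 471, 3: 1064, 4: 176, 5: 65}


def proficiency_group(level: int) -> str:
    """Map a five-point rating onto the three groups used in the analysis."""
    if level in (1, 2):
        return "lower"
    if level == 3:
        return "middle"
    if level in (4, 5):
        return "higher"
    raise ValueError(f"unexpected rated level: {level}")


def regroup(tokens_per_level: dict) -> dict:
    """Sum the token counts of the five levels into the three groups."""
    grouped = Counter()
    for level, count in tokens_per_level.items():
        grouped[proficiency_group(level)] += count
    return dict(grouped)


if __name__ == "__main__":
    print(regroup(TOKENS_PER_LEVEL))
```

Under these counts the three groups hold 536, 1,064, and 241 tokens, so the middle level still dominates, but the two outer groups are no longer limited to 65 tokens each.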
In this analysis of formant frequency, the first two formant frequencies (i.e., F1 and F2) are measured since they play the most important role in determining vowel quality. From the text materials, the nine vowels /i, ɪ, ɛ, æ, ʌ, ɔ, ɑ, u, ʊ/ are selected as follows:
1) CVC within a word
   e.g., /ʊ/ in ‘could’ [kʊd]
2) CVC across word boundaries
   a. C // VC, e.g., /i/ in ‘this evening’ [ðɪs] [iːvnɪŋ]
   b. CV // C, e.g., /u/ in ‘to the’ [tu] [ðə]
3) VC at the beginning of a sentence
   e.g., /ɪ/ in ‘It was~’ [ɪt]
All the selected vowels bear primary stress and are followed by a consonant. All the surrounding consonants are obstruents (i.e., stops /p, t, k, b, d, g/, fricatives /f, v, θ, ð, s, z, ʃ, ʒ, h/, and affricates /ʧ, ʤ/) to lessen co-articulation effects on the vowels. If a vowel is surrounded by sonorants (i.e., glides /w, j/, liquids /l, r/, and nasals /m, n, ŋ/), it is fairly difficult to identify the formant boundary between the vowel and the sonorants. This is largely because sonorants have relatively high resonance (Ladefoged & Disner, 2012:77). Thus, to reduce these measurement difficulties, vowels adjacent to sonorants are excluded from this study.
For the analysis of vowel duration, tokens of the tense and lax vowels /i/-/ɪ/ and /u/-/ʊ/ are chosen only if they 1) are followed by a voiced obstruent /b, d, g, v, ð, z, ʒ, ʤ/ and 2) do not occur in the last syllable of a sentence. This is because vowel duration may vary with the voicing of the following consonant (i.e., voiced vs. voiceless) (Fougeron & Keating, 1997; Klatt, 1975) and because a vowel in a syllable at the very end of a sentence tends to lengthen (Cho et al., 2013; Klatt, 1975).
Acoustic characteristics (F1, F2, and duration) of the vowels are measured with the WaveSurfer 1.8.8 software program.
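The two sets of selection criteria can be expressed as simple predicates. The sketch below is a hedged illustration, not the procedure actually used in the study: the `VowelToken` record and both function names are hypothetical, stress is not modelled, and the formant criterion is taken to allow a missing preceding consonant so that sentence-initial VC tokens qualify.

```python
from dataclasses import dataclass
from typing import Optional

# Consonant classes as listed in the text.
OBSTRUENTS = set("p t k b d g f v θ ð s z ʃ ʒ h ʧ ʤ".split())
VOICED_OBSTRUENTS = set("b d g v ð z ʒ ʤ".split())
TENSE_LAX_VOWELS = {"i", "ɪ", "u", "ʊ"}


@dataclass
class VowelToken:
    """Hypothetical record for one vowel token (not a corpus format)."""
    vowel: str
    preceding: Optional[str]  # None for a sentence-initial VC token
    following: Optional[str]
    sentence_final_syllable: bool = False


def eligible_for_formants(tok: VowelToken) -> bool:
    """Keep vowels whose neighbouring consonants are all obstruents."""
    if tok.following not in OBSTRUENTS:
        return False
    return tok.preceding is None or tok.preceding in OBSTRUENTS


def eligible_for_duration(tok: VowelToken) -> bool:
    """Keep tense/lax vowels before a voiced obstruent, not sentence-final."""
    return (
        tok.vowel in TENSE_LAX_VOWELS
        and tok.following in VOICED_OBSTRUENTS
        and not tok.sentence_final_syllable
    )


if __name__ == "__main__":
    could = VowelToken("ʊ", "k", "d")   # 'could' [kʊd]
    it = VowelToken("ɪ", None, "t")     # 'It was~' [ɪt]
    for tok in (could, it):
        print(tok.vowel, eligible_for_formants(tok), eligible_for_duration(tok))
```

With these definitions, the /ʊ/ of ‘could’ passes both filters, whereas the /ɪ/ of sentence-initial ‘It’ qualifies for the formant analysis only, because the following /t/ is voiceless.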
To measure the acoustic features (F1, F2, and duration), vowel onset is defined as the point showing the onset of periodicity in the waveform and the onset of voicing in the spectrogram, visible as strong vertical striations in F1. Vowel offset is defined as the point showing the offset of periodicity in the waveform and the cessation of formant bands in the spectrogram. The temporal interval from vowel onset to vowel offset is regarded as the vowel duration. The F1 and F2 frequencies of a vowel are measured at the midpoint of this interval.
The F1 and F2 means and standard deviations (in parentheses) of the nine English vowels in the rated levels are presented in Table 1.
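The arithmetic behind the duration and midpoint definitions is straightforward and can be sketched as follows. The function names are hypothetical, the onset and offset times are assumed to be hand-labelled in seconds, and reporting duration in milliseconds is an assumption made for this illustration; the example times are invented.

```python
def vowel_duration_ms(onset_s: float, offset_s: float) -> float:
    """Duration of the interval from vowel onset to vowel offset, in ms."""
    if offset_s <= onset_s:
        raise ValueError("vowel offset must follow vowel onset")
    return (offset_s - onset_s) * 1000.0


def formant_measurement_time_s(onset_s: float, offset_s: float) -> float:
    """Time point at the middle of the vowel, where F1 and F2 are read."""
    if offset_s <= onset_s:
        raise ValueError("vowel offset must follow vowel onset")
    return onset_s + (offset_s - onset_s) / 2.0


if __name__ == "__main__":
    # Invented boundary labels for a single token.
    onset, offset = 1.250, 1.372
    print(round(vowel_duration_ms(onset, offset), 1))
    print(round(formant_measurement_time_s(onset, offset), 3))
```

Measuring at the midpoint keeps the formant reading as far as possible from both consonant transitions, which is consistent with the restriction to obstruent contexts.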
To compare the rated levels’ vowel spaces, the mean F1–F2 plot of the vowels across the rated levels is displayed in Figure 1.
Compared to the lower and intermediate levels (i.e., levels 1–2 and 3), the higher level (i.e., level 4–5) has a relatively larger vowel space. In particular, the disparity between the rated levels is greater in the front vowels /i, æ/ and the low-back vowel /ɑ/ (F2 of the vowel /i/: M=2,063 Hz in level 1–2, M=2,124 Hz in level 3, M=2,334 Hz in level 4–5; F1 of the vowel /æ/: M=568 Hz in level 1–2, M=576 Hz in level 3, M=695 Hz in level 4–5; F1 of the vowel /ɑ/: M=666 Hz in level 1–2, M=748 Hz in level 3, M=756 Hz in level 4–5). Moreover, the results show that the central and back vowel pairs /ʌ/-/ɔ/ and /u/-/ʊ/ overlap substantially compared to the other adjacent vowels. For a more accurate analysis of the spectral distinction between the adjacent pairs in the rated levels, the spectral acoustic cues (F1 and F2) are investigated separately.
Every adjacent pair is composed of two vowels with different tongue heights. Specifically, for the adjacent vowel pairs /ɛ/-/æ/, /ʌ/-/ɔ/, and /ɔ/-/ɑ/, the vowels /ɛ, ɔ/ are produced with a mid tongue height, while the vowels /æ, ʌ, ɑ/ are produced with a low tongue height (Kenyon & Knott, 1953). Regarding the tense and lax contrasts /i/-/ɪ/ and /u/-/ʊ/, the tense vowels /i, u/ are articulated relatively higher than their lax counterparts /ɪ, ʊ/ (Alfonso & Baer, 1982).
Based on these spectral features relevant to the tongue height (F1), this section compares the F1 values for the five adjacent contrasts in the rated levels in Figure 2.
Given that F1 is inversely related to tongue height, the F1 scale is inverted (i.e., low values at the top and high values at the bottom).
Two-way (2×3) analyses of variance (Vowel: the two adjacent vowels in each pair×level: level 1–2, level 3, level 4–5) are performed to determine whether L2 proficiency influences the production of each adjacent pair with acoustically separable vowel qualities, especially in terms of tongue height (F1). In all statistical analyses in this work, differences with p<.05 are considered significant. The results are given in Table 2.
In the high-front tense and lax contrast /i/-/ɪ/, a two-way analysis of variance yielded significant main effects of vowel, F(1,271)=17.03, p<.001, and level, F(2,271)=10.26, p<.001. There was a significant Vowel×level interaction, F(2,271)=12.20, p<.001, indicating that the level effect was greater for the lax vowel /ɪ/ than for the tense vowel /i/ (see Figure 2). Post hoc analysis using the Scheffé criterion revealed that the higher level (the contrast /i/-/ɪ/: M=337 vs. 433 Hz, SD=89 vs. 95 Hz) was significantly different from the middle (M=341 vs. 350 Hz, SD=57 vs. 56 Hz) and the lower (M=341 vs. 338 Hz, SD=55 vs. 47 Hz) levels, which did not differ from each other (see Figure 2). According to paired-sample t-tests on F1 of the contrast /i/-/ɪ/ within each rated level, only the higher level significantly separated the contrast, t(61)=–4.02, p<.001. That is, in producing the front vowels /i/-/ɪ/, only the higher level, unlike the middle and lower levels, distinguished the contrast spectrally in terms of tongue height (F1), producing the lax /ɪ/ with a lower tongue position than its tense counterpart /i/.
With regard to the front adjacent pair /ɛ/-/æ/, the ANOVA revealed significant main effects of vowel, F(1,164)=3.99, p<.05, and level, F(2,164)=6.61, p<.01, but the Vowel×level interaction was not significant, F(2,164)=1.30, p>.05. Post hoc analysis of the main effect of level revealed that the higher level (the pair /ɛ/-/æ/: M=605 vs. 695 Hz, SD=141 vs. 146 Hz) was significantly different from the middle (M=559 vs. 576 Hz, SD=80 vs. 125 Hz) and the lower (M=556 vs. 568 Hz, SD=91 vs. 94 Hz) levels, which did not differ from each other (see Figure 2). This means that, in terms of tongue height (F1), the spectral distinction between /ɛ/ and /æ/ was numerically greater in the higher level than in the middle and lower levels, although the Vowel×level interaction was not significant.
Concerning the central and back vowels /ʌ/-/ɔ/, no effect was significant, indicating that the pair /ʌ/-/ɔ/ was hardly distinguished by tongue height regardless of the fluency ratings.
For the adjacent back vowels /ɔ/-/ɑ/, the main effect of vowel was significant, F(1,64)=12.47, p=.001, but the main effect of level was not significant, F(2,64)=.83, p>.05, and neither was the Vowel×level interaction, F(2,64)=.49, p>.05. This means that, in general, Korean EFL learners were able to separate the back vowels /ɔ/-/ɑ/ spectrally by tongue height (F1).
As for the high-back tense and lax contrast /u/-/ʊ/, the ANOVA revealed that none of the effects were significant, which indicates that Korean EFL learners generally find it difficult to make a spectral distinction between /u/ and /ʊ/ in terms of tongue height.
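To make the pattern in the reported means easier to see, the F1 separation between /ɪ/ and /i/ can be computed per level directly from the values given above. This is a descriptive sketch only and does not replace the statistical tests; the dictionary layout and the function name `f1_separation` are illustrative assumptions.

```python
# Mean F1 (Hz) of /i/ and /ɪ/ per rated level, as reported in the text.
F1_MEANS_HZ = {
    "lower (level 1-2)": {"i": 341, "ɪ": 338},
    "middle (level 3)": {"i": 341, "ɪ": 350},
    "higher (level 4-5)": {"i": 337, "ɪ": 433},
}


def f1_separation(means: dict) -> dict:
    """Mean F1 of lax /ɪ/ minus mean F1 of tense /i/, per level, in Hz.

    A larger positive value corresponds to a lower tongue position
    for /ɪ/ relative to /i/.
    """
    return {level: row["ɪ"] - row["i"] for level, row in means.items()}


if __name__ == "__main__":
    for level, diff in f1_separation(F1_MEANS_HZ).items():
        print(f"{level}: {diff:+d} Hz")
```

The differences come to -3 Hz, +9 Hz, and +96 Hz for the lower, middle, and higher levels, which matches the finding that only the higher level separates the pair by tongue height.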
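The degrees of freedom reported for the F1 analyses can be checked against the 2×3 design with a small calculation. The sketch assumes a full factorial model (both main effects plus the interaction) with no empty cells, in which case the error degrees of freedom equal the number of tokens minus the number of cells. The token totals it returns are therefore implied values, not figures stated in the text, and the function names are hypothetical.

```python
def factorial_dfs(n_tokens: int, levels_a: int, levels_b: int) -> dict:
    """Degrees of freedom for a two-way ANOVA with interaction."""
    return {
        "A": levels_a - 1,
        "B": levels_b - 1,
        "AxB": (levels_a - 1) * (levels_b - 1),
        "error": n_tokens - levels_a * levels_b,
    }


def implied_tokens(error_df: int, levels_a: int, levels_b: int) -> int:
    """Token count implied by a reported error df (assumes no empty cells)."""
    return error_df + levels_a * levels_b


if __name__ == "__main__":
    # Error df as reported for three of the vowel pairs.
    reported_error_df = {"i-ɪ": 271, "ɛ-æ": 164, "ɔ-ɑ": 64}
    for pair, df in reported_error_df.items():
        n = implied_tokens(df, levels_a=2, levels_b=3)
        print(pair, n, factorial_dfs(n, 2, 3))
```

Under these assumptions, the numerator degrees of freedom are 1 for vowel and 2 for both level and the interaction, in line with the reported F ratios, and the implied token totals are 277, 170, and 70 for the three pairs.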
In terms of F2, which is associated with tongue backness, Alfonso & Baer (1982) report that the high-front tense vowel /i/ is produced further forward than its lax counterpart /ɪ/, whereas the high-back tense vowel /u/ is articulated further back than the lax vowel /ʊ/. Meanwhile, the mid-front vowel /ɛ/ is placed further forward than the low-front vowel /æ/, and the mid-back vowel /ɔ/ is placed further back than the low-back vowel /ɑ/ and the low-central vowel /ʌ/ (Jones, 1960).
On the basis of these spectral characteristics related to F2, the mean values of F2 for the five adjacent pairs in the rated levels are compared in Figure 3.
The scale of F2 goes downwards (i.e., low values at the top and high values at the bottom) as in the scale of F1 in Figure 2.
Two-way ANOVAs, the 2×3 (Vowel: two adjacent vowels in each pair×level: level 1–2, level 3, level 4–5) analysis of variances, are conducted to ascertain whether there is the influence of L2 proficiency in producing each adjacent pair with acoustically separable vowel qualities, especially in terms of tongue backness (F2). The results are given in Table 3.
In the high-front tense and lax contrast /i/-/ɪ/, a two-way ANOVA yielded significant main effects of vowel, F(1,271)=19.27, p=.000 and of level, F(2,271)=5.83, p<.01. There was a significant interaction effect of Vowel×level, F(2,271)=14.96, p=.000, indicating that the level effect was greater in the lax vowel /ɪ/ condition than in the tense vowel /i/ condition (see Figure 3). Post hoc analysis (Scheffe) revealed that the higher rated level (the contrast /i/-/ɪ/: M=2,334 vs. 2,004 Hz, SD=133 vs. 77 Hz) was significantly different from the middle (M=2,124 vs. 2,122 Hz, SD=174 vs. 252 Hz) and the lower (M=2,063 vs. 2,056 Hz, SD=302 vs. 150 Hz) levels, which did not different from each other (see Figure 3). The result of paired sample T-test for F2 of the contrast /i/-/ɪ/ in the rated levels showed the significant difference between the F2 values of the contrast /i/-/ɪ/ in the higher rated level, t(61)=12.49, p=.000, not in the middle and lower rated levels. To put it simply, unlike the middle and lower levels, only the higher rated level was spectrally able to distinguish between the contrast /i/-/ɪ/ with the tongue backness (F2) by moving the tongue more backward when producing the lax vowel /ɪ/ than the tense /i/.
With respect to the adjacent front pair /ɛ/-/æ/, the ANOVA revealed that none of the effects was significant, indicating that native Korean speakers were generally unable to differentiate the vowel /ɛ/ from the vowel /æ/ by tongue backness (F2).
Concerning the central and back vowels /ʌ/-/ɔ/, none of the effects was significant. That is, Korean EFL learners hardly separated the pair /ʌ/-/ɔ/ by tongue backness.
For the adjacent back vowels /ɔ/-/ɑ/, the main effect of vowel was significant, F(1,64)=4.50, p<.05, but no other effects were significant. This means that, in general, Korean EFL learners could distinguish the back vowels /ɔ/-/ɑ/ by tongue backness (F2).
As for the high-back tense and lax contrast /u/-/ʊ/, the ANOVA revealed that none of the effects was significant. In other words, none of the proficiency levels distinguished the contrast /u/-/ʊ/ in terms of tongue backness.
The mean durations and standard deviations (in parentheses), in milliseconds, for the tense and lax contrasts /i/-/ɪ/ and /u/-/ʊ/ in the rated levels are presented in Table 4.
English tense and lax contrasts: mean duration in ms (SD)
|Rated level|High-front /i/|High-front /ɪ/|High-back /u/|High-back /ʊ/|
|---|---|---|---|---|
|Level 1–2|122 (40)|106 (49)|174 (50)|123 (30)|
|Level 3|107 (17)|78 (14)|161 (19)|81 (13)|
|Level 4–5|90 (15)|47 (8)|170 (14)|69 (19)|
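The size of the tense-lax gap in Table 4 can be made explicit with a small calculation. The sketch below takes the mean durations from the table and derives, for each rated level, the tense minus lax difference in ms and the tense/lax ratio. The names `durations` and `contrast_summary` are illustrative only, and the derived figures are descriptive, not a substitute for the statistical tests.

```python
# Mean durations in ms from Table 4, stored as (tense, lax) per rated level.
durations = {
    "Level 1-2": {"front": (122, 106), "back": (174, 123)},
    "Level 3": {"front": (107, 78), "back": (161, 81)},
    "Level 4-5": {"front": (90, 47), "back": (170, 69)},
}

def contrast_summary(tense_ms, lax_ms):
    """Return (tense - lax difference in ms, tense/lax ratio rounded to 2 places)."""
    return tense_ms - lax_ms, round(tense_ms / lax_ms, 2)

summary = {
    level: {pair: contrast_summary(*values) for pair, values in pairs.items()}
    for level, pairs in durations.items()
}

for level, pairs in summary.items():
    for pair, (diff, ratio) in pairs.items():
        print(f"{level}, high-{pair}: difference = {diff} ms, ratio = {ratio}")
```

On these means, both the difference and the ratio grow from level 1–2 to level 4–5 for the front and the back contrast alike.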
The mean durations of the tense and lax contrasts across the rated levels are compared in Figure 4.
Regardless of L2 proficiency level, the mean durations of the tense vowels /i, u/ are longer than those of their lax counterparts /ɪ, ʊ/. The high-back tense vowel /u/ has a longer duration than any other tense or lax vowel across all the rated levels. Most importantly, except for the high-back tense vowel /u/ in the intermediate level (i.e., level 3), every tense and lax vowel duration gradually decreases as the rated level becomes more proficient. This may suggest that learners at the lower level (i.e., level 1–2) are more likely to produce the contrasts with unduly long durations. For a more accurate analysis of the temporal distinction between the contrasts across the rated levels, a statistical analysis was conducted.
Two-way 2×3 ANOVAs (Vowel: tense vs. lax vowel × Level: level 1–2, level 3, level 4–5) were performed to examine whether the temporal distinction between the tense and lax contrasts, measured as vowel duration, is related to L2 fluency ratings in Korean EFL learners’ production of English vowels. The results are displayed in Table 5.
In the high-front tense and lax contrast /i/-/ɪ/, a two-way ANOVA yielded significant main effects of vowel, F(1,93)=7.98, p<.01, and of level, F(2,93)=5.99, p<.01, but the Vowel×level interaction was not significant, F(2,93)=.56, p>.05. A post hoc test (Scheffé) revealed that the lower level (the contrast /i/-/ɪ/: M=122 vs. 106 ms, SD=40 vs. 49 ms) was significantly different from the middle (M=107 vs. 78 ms, SD=17 vs. 14 ms) and higher (M=90 vs. 47 ms, SD=15 vs. 8 ms) levels, which did not differ from each other (see Figure 4). In other words, the durational distinction between the contrast /i/-/ɪ/ was greater in the relatively proficient levels (i.e., the middle and higher levels) than in the lower level.
As for the high-back tense and lax contrast /u/-/ʊ/, the ANOVA revealed significant main effects of vowel, F(1,45)=44.96, p<.001, and level, F(2,45)=3.70, p<.05, but the Vowel×level interaction was not significant, F(2,45)=1.80, p>.05. Post hoc analysis of the main effect of level indicated that the lower level (the contrast /u/-/ʊ/: M=174 vs. 123 ms, SD=50 vs. 30 ms) was significantly different from the middle (M=161 vs. 81 ms, SD=19 vs. 13 ms) and the higher (M=170 vs. 69 ms, SD=14 vs. 19 ms) levels, which did not differ from each other (see Figure 4). This means that the temporal distinction between the contrast /u/-/ʊ/ was greater in the middle and higher levels than in the lower level.
The current study investigated whether there is a relationship between accurate vowel production and proficiency level in L2 English produced by Korean EFL learners. The results suggest that a relationship between English vowel production and L2 proficiency was apparent only in the high-front tense and lax contrast /i/-/ɪ/. The more proficient the rated level, the better the learners produced the contrast /i/-/ɪ/ with acoustically separable vowel qualities. The other pairs, however, showed no relationship between vowel production and L2 proficiency level in Korean EFL learners’ production of English vowels.
The results of this thesis revealed that the middle and lower levels showed little spectral distinction between the contrast /i/-/ɪ/, while the higher level significantly separated the contrast spectrally by moving the tongue much lower and further back in producing the lax vowel /ɪ/ than in producing the tense counterpart /i/. In terms of vowel duration, however, Korean EFL learners were generally able to differentiate the tense /i/ from the lax vowel /ɪ/ by producing the tense /i/ much longer than the lax counterpart /ɪ/. In particular, the middle and higher levels were better able than the lower level to separate the contrast by the temporal feature (i.e., vowel duration). Many previous studies have demonstrated that native Korean speakers lack an understanding of the spectral distinction between the tense and lax contrast /i/-/ɪ/ insofar as there is no concept of tense and lax subsets in their L1 inventory (Flege et al., 1997; Hong, 2012; Tsukada et al., 2005). However, the findings of this study suggest that Korean learners of English with high L2 fluency ratings can separate the tense and lax contrast /i/-/ɪ/ by spectral as well as temporal cues. Moreover, numerous studies have shown that Korean EFL learners unduly rely on the temporal characteristic in distinguishing between tense and lax contrasts (Flege et al., 1997; Tsukada et al., 2005; Yun, 2009), but the findings also indicate that Korean learners with relatively low L2 fluency ratings have difficulty separating the contrast /i/-/ɪ/ even by the temporal feature.
With regard to the high-back tense and lax contrast /u/-/ʊ/, the results suggested that Korean learners of English seldom distinguish the contrast /u/-/ʊ/ spectrally. As for vowel duration, however, the temporal distinction between the contrast /u/-/ʊ/ was significant across all the rated levels. The results support previous findings that Korean L1 speakers classify the high-back tense and lax vowels /u/-/ʊ/ mainly by vowel duration rather than vowel quality (Hong, 2012; Tsukada et al., 2005; Yun, 2009). However, the results also revealed that the temporal distinction between the vowels was greater in the higher and middle levels than in the lower level; that is, the length effect was weaker at the lower fluency rating.
The pair of adjacent front vowels /ɛ/-/æ/ was significantly distinguished by tongue height (F1). In particular, the higher level distinguished the vowels more clearly than the middle and lower levels by lowering the tongue further in producing the low vowel /æ/ than in producing the mid vowel /ɛ/. On the other hand, all the rated levels failed to separate the front vowels by tongue backness (F2). Existing research has established that Korean learners of English find it difficult to produce the /ɛ/-/æ/ contrast with acoustically separable vowel qualities (Flege et al., 1997; Hwang & Lee, 2012; Ingram & Park, 1997; Tsukada et al., 2005). Ingram & Park (1997) showed that this was largely because of the recent phonological merger /ɛ/-/e/ in the Korean vowel system. The findings of this study, however, suggest that Korean learners at proficient L2 English levels can separate the vowels /ɛ/-/æ/ spectrally, especially in terms of tongue height (F1).
Concerning the central and back vowels /ʌ/-/ɔ/, the pair was not spectrally separated at any of the rated levels. This was partly because of negative L1 transfer. Specifically, the mid-back vowel /ɔ/ is not included in the Korean L1 inventory, but the central vowel /ʌ/ is shared by the Korean and English vowel systems. Thus, Korean learners of English generally treat the vowels /ʌ, ɔ/ as the single vowel /ʌ/, which is present in the Korean inventory. This finding lends support to the previous studies of Koo (2000), Hong (2012) and Tsukada et al. (2005), which found that Korean learners experience significant confusion when producing central and back vowels.
As for the back vowels /ɔ/-/ɑ/, on the other hand, the results demonstrated that the vowels were distinguished by both F1 and F2 at all the rated levels, meaning that Korean EFL learners are able to distinguish these back vowels by spectral cues.
In conclusion, the relationship between accurate vowel production and L2 proficiency was apparent only in the high-front tense and lax vowels /i/-/ɪ/. The higher the fluency rating, the better the learners separated the contrast through spectral as well as temporal cues. In addition, although there was no Vowel×level interaction in the production of the adjacent front vowels /ɛ/-/æ/, the higher level showed a greater spectral distinction between the vowels in tongue height than the middle and lower levels. However, except for the back vowels /ɔ/-/ɑ/, Korean EFL learners generally experience difficulty in separating the central and back vowels. This may result from native Korean speakers’ smaller vowel spaces in comparison to native English speakers. According to Koo (2000) and Franklin (2009), the Korean vowel system is less dense than the vowel system in English, leading native Korean speakers to articulate English vowels within narrower vowel spaces than native English speakers. Hence, Korean learners of English need to move their tongue more drastically when producing English vowels to avoid confusion between adjacent vowels.
Meanwhile, this study lacks a control group of native English speakers. It is therefore unable to fully ascertain whether the performances that showed significant differences in separating the pairs are native-like. To see whether these significant effects reach a native level, future research needs to compare the results with those of native English speakers. Moreover, apart from English monophthongs, English diphthongs should be examined for a more comprehensive analysis of L2 English vowel production by Korean L1 speakers.