Establishing the Three-Way Voicing Contrast in Madurese Stops

Madurese, a Western Malayo-Polynesian language spoken on the Indonesian island of Madura, has been described as having a three-way voicing contrast (i.e. voiced, voiceless unaspirated and voiceless aspirated) in its stops. However, the VOT difference between voiceless unaspirated and voiceless aspirated stops is small, and the two series are followed by vowels of different heights, which raises the question of whether Madurese phonologically contrasts only voiced and voiceless stops. The goal of this paper is to establish the phonological status of the voicing contrast in Madurese stops, arguing that Madurese is better described as a language with a three-way contrast. For this purpose, we provide phonological evidence from consonant-vowel interactions, vowel harmony processes and morphophonemic processes involving vowel height alternations. This evidence is also used to substantiate the proposal that consonants trigger vowel height alternation, rather than vowels triggering consonant allophony.


INTRODUCTION
Madurese is a Western Malayo-Polynesian language spoken primarily on the island of Madura and in a number of regencies in East Java, Indonesia. Madurese is often described as having eight surface vowels, consisting of four non-high [a, ɛ, ə, ɔ] and four high [ɤ, i, ɨ, u] vowels, and as contrasting voiced /b, d, ɖ, ɟ, ɡ/, voiceless unaspirated /p, t, ʈ, c, k/ and voiceless aspirated /pʰ, tʰ, ʈʰ, cʰ, kʰ/ stops at five places of articulation (Cohn, 1993a; Cohn & Ham, 1998; Cohn & Lockwood, 1994; Stevens, 1968, 1980, 1991). Lisker and Abramson (1964) observed three types of voice onset time (VOT): voicing begins before the release of the stop, voicing begins shortly after the release, and voicing lags considerably behind the release, corresponding respectively to voiced, voiceless unaspirated and voiceless aspirated stops. Since Madurese likewise has three stop categories, its three-way contrast seems typologically to resemble the voicing categories observed in languages such as Thai and Eastern Armenian (Lisker & Abramson, 1964).
The three-way contrast in Madurese is interesting because VOT, which is a common denominator for categorizing stops (Cho & Ladefoged, 1999; Lisker & Abramson, 1964), does not appear to be prominent in distinguishing voiceless aspirated from voiceless unaspirated stops in particular. This is indicated by the fact that the VOT distributions of these two voiceless series overlap considerably, as shown in Figure 1 and Figure 2. In addition, given that the vowels following the two voiceless series differ, we may not need any onset information to distinguish them. This raises the question of whether the so-called three-way phonological contrast in Madurese is in fact a two-way contrast between voiced and voiceless stops.
There are at least three reasons why it is tempting to think that Madurese may have a two-way contrast between voiced and voiceless stops. First, as mentioned earlier, the VOT values of the three voicing categories do not exhibit the typical characteristics of a three-way voicing contrast. This is because the VOT difference between voiceless unaspirated and voiceless aspirated stops is relatively small, even though it has been found to be statistically significant (Cohn & Ham, 1998; Cohn & Lockwood, 1994; Author et al., 2015). More importantly, as shown in Figure 2, although the VOT distribution of Madurese stops appears to be trimodal, two of the modes (voiceless unaspirated and voiceless aspirated stops) are very close together, making the distribution look somewhat bimodal. This is reminiscent of the distribution commonly found in languages with a two-way contrast between voiced and voiceless stops.
The findings on Madurese VOT contrast with those on other languages which show a three-way voicing distinction, such as Thai and Eastern Armenian (Lisker & Abramson, 1964). The difference primarily derives from the fact that the VOT values of the three series of stops in those languages show robust trimodal distributions and, more importantly, these distributions overlap little or not at all. In addition, in such languages we can find minimal triplets showing the three-way voicing distinction without noticeable differences in the vowels following each stop series. It is true that in Thai, for example, tone may have an influence, given that Thai is a tonal language. However, since we are concerned here with segmental differences, suprasegmentals such as tone can be considered irrelevant in this respect.
The second reason why it is tempting to consider Madurese as having a two-way contrast is that voiceless unaspirated stops only occur before non-high vowels while voiceless aspirated stops only occur before high vowels. Put differently, the occurrence of each stop type seems to be conditioned by a different vocalic environment, and on this view the two voiceless series would be considered allophonic. It is also possible that the VOT difference between voiceless unaspirated and voiceless aspirated stops simply reflects variation due to the different vowel types which follow them. In fact, there is some evidence that VOT depends on vowel quality. For example, VOT is longer before tense vowels and shorter before lax vowels in English (Port & Rotunno, 1979). There is also evidence that VOT is longer before high vowels than before low vowels in languages with prevoiced stops such as Hungarian (Gósy, 2001) and Canadian French (Nearey & Rochet, 1994).
The third reason is concerned with the fact that there is no minimal triplet of stops exemplifying the three-way contrast in Madurese. The true distinction is only between voiced and voiceless aspirated stops because this is the only contrast where true minimal sets can be found, for example  [calah] 'defective'. It is evident that they are not minimal pairs since the difference not only resides in the stops but also in the following vowels.
Two questions need to be addressed with regard to the voicing contrast in Madurese: (1) is the contrast better described as a two-way or a three-way phonological contrast, and (2) what are the phonological consequences of favouring one type of contrast over the other? With respect to the status of the voicing contrast, we will argue that, although Madurese lacks the surface phonetic distribution of a 'classic' three-way voicing contrast as in Thai, the preferred phonological analysis is one with a three-way contrast in stops.

DISCUSSION
In the preceding section, we mentioned three reasons why it is tempting to describe Madurese as having a two-way voicing contrast between voiced and voiceless stops. This temptation partly arises from the fact that no minimal triplets showing the three-way contrast are found in Madurese, because voiceless unaspirated and voiceless aspirated stops occur before vowels of different heights. That is, vowels co-vary with the preceding consonants in such a way that voiced and voiceless aspirated stops only co-occur with high vowels while voiceless unaspirated stops only co-occur with non-high vowels.
As shown in the examples below, true minimal triplets cannot be found in Madurese because the vowels following the stop categories also co-vary. This is particularly obvious in the voiceless unaspirated series, in which the following vowels differ from those following voiceless aspirated and voiced stops, both of which co-occur with vowels of the same height. As we only find voiceless aspirated stops before high vowels and voiceless unaspirated stops before non-high vowels, we might argue that the two voiceless stop categories are allophonic. That is, they do not belong to phonologically different voicing categories, since they may be conditioned by the following vowel. On this view, the stops of Madurese would show not a three-way voicing contrast but a two-way distinction between voiced and voiceless stops.
If this were the case, voiceless unaspirated and voiceless aspirated stops in Madurese would resemble voiceless aspirated and unaspirated stops in English, which are allophonic. The difference, however, is that in Madurese voiceless aspirated and unaspirated stops occur in any position in the word as long as they co-occur with the 'right' vowels. In contrast, English voiceless unaspirated and voiceless aspirated stops can be followed by any vowel type, but their occurrence is restricted by position in the word. That is, in English, voiceless aspirated stops only occur in word-initial position and in the onset of non-initial stressed syllables, while voiceless unaspirated stops occur elsewhere.
Other evidence also suggests that voiceless aspirated and unaspirated stops in Madurese may belong to the same phonological voicing category, i.e. voiceless stops. This evidence comes from languages with three-way voicing contrasts such as Thai and Eastern Armenian, as observed by Lisker and Abramson (1964). Unlike in Madurese, true minimal triplets without vowel alternations can be found in those languages. The same holds for Korean, which is also known to have a three-way contrast among its stops. It is important to bear in mind, however, that the three stop categories in Korean are all voiceless in utterance-initial position, and that for some speakers both F0 and voice quality also play a role in distinguishing the three categories (see e.g. Cho, Ladefoged, & Jun, 2002; Han & Weitzman, 1970; Kang & Guion, 2008; Kang, 2014; Kim & Duanmu, 2004; Kong, Beckman, & Edwards, 2012). In terms of the quality of the following vowels, however, the three categories look very similar.
One relevant question to raise at this point is whether there is indeed a phonological three-way contrast in Madurese stops, given that the VOT difference between voiceless unaspirated and voiceless aspirated stops is relatively small and, more importantly, that they occur in different environments. Furthermore, the fact that the fundamental frequency of the two voiceless series, shown in Figure 3, does not differ significantly, particularly for male speakers, may provide further phonetic evidence that they belong to the same phonological category (Author et al., 2015). On the other hand, a related question is why, if voiceless unaspirated and voiceless aspirated stops are allophonic, they appear to exert categorically different effects on following vowels. In the following we consider three scenarios with respect to whether Madurese has a two- or three-way contrast in its stops, and decide which scenario is more parsimonious phonologically and therefore better describes the voicing system of Madurese stops. The scenarios are that Madurese has (1) a two-way contrast between voiced and voiceless stops, (2) a two-way maximum contrast between voiced and voiceless aspirated stops, or (3) a three-way contrast between voiced, voiceless unaspirated and voiceless aspirated stops.
The first two scenarios assume that there are two underlying stop series (i.e. voiceless and voiced stops in the first scenario, and voiced and voiceless aspirated stops in the second) and eight underlying vowels (a, ɛ, ə, ɔ, ɤ, i, ɨ, u). The third scenario, which has been assumed so far, posits three underlying stop series (voiced, voiceless unaspirated and voiceless aspirated) and four underlying vowels (a, ɛ, ə, ɔ). It also assumes that these four underlying vowels surface as high vowels [ɤ, i, ɨ, u] after voiced and voiceless aspirated stops, and as non-high vowels [a, ɛ, ə, ɔ] in word-initial position and after voiceless unaspirated stops and other consonants.
Suppose Madurese has a two-way voicing contrast as in the first scenario. The system would then consist of underlying voiced and voiceless stops and eight underlying vowels. On this account, voiceless stops have two allophones, voiceless unaspirated and voiceless aspirated, which occur in complementary distribution: voiceless unaspirated stops only occur before non-high vowels while voiceless aspirated stops only occur before high vowels. This can be schematised as in (1) below.
(1) C [-voice] → [+asp] / __ (+high vowels), where C = stop consonants
Considering voiceless stops as having two such allophones bears some resemblance to the allophonic voiceless unaspirated and aspirated stops of English discussed earlier. Under this scenario, we would have to assume that the vowels affect the consonants rather than the other way around. Consequently, we would not need to look for a phonological feature shared by voiced and voiceless aspirated stops that triggers vowel raising, because there is no vowel raising in the first place; the two series just happen to be different voicing categories with no effects on vowels. Furthermore, treating Madurese as having a two-way voicing contrast implies that there are eight underlying vowels. In this case, the eight vowels (a, ɛ, ə, ɔ, ɤ, i, ɨ, u) would all be phonemic; the high vowels would not be allophones of four 'underlying' non-high vowels as previously suggested in, for example, Stevens (1968) and Cohn (1991a, 1993b). Thus, the issue of feature spreading and consonant-vowel interaction would no longer be relevant.
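As an illustration only, the allophonic rule in (1) can be sketched in a few lines of Python. The function name `apply_rule_1` and the list-of-segments representation are assumptions made for this sketch, and the inputs are schematic CV sequences rather than attested Madurese words.

```python
# Sketch of rule (1) under the first scenario: eight underlying vowels,
# and a single voiceless stop series that is aspirated before high vowels.
# Names and representation are hypothetical, chosen for illustration.
HIGH_VOWELS = {"ɤ", "i", "ɨ", "u"}
VOICELESS_STOPS = {"p", "t", "ʈ", "c", "k"}


def apply_rule_1(segments):
    """Return the surface form: C[-voice] -> [+asp] / __ high vowel."""
    surface = []
    for index, segment in enumerate(segments):
        following = segments[index + 1] if index + 1 < len(segments) else None
        if segment in VOICELESS_STOPS and following in HIGH_VOWELS:
            surface.append(segment + "ʰ")  # aspirated allophone
        else:
            surface.append(segment)  # all other segments are unchanged
    return surface
```

Note that in this sketch the vowel is the conditioning environment and the consonant is the target, which is exactly the direction of influence that the later harmony and affixation facts call into question.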
As a consequence, if we assume that there is only a two-way phonological contrast in stops and hence eight underlying vowels in Madurese, we can argue that what we have observed with respect to voicing and aspiration and their relationship to vowel height is not really unusual, either areally or typologically. In this case, the voicing contrast in Madurese would be similar to that of related languages such as Javanese (see e.g. Brunelle, 2010; Fagan, 1988; Hayward, 1995) and Sundanese (see e.g. Adisasmito-Smith, 2004; Kulikov, 2010; Robins, 1953), both of which show a two-way contrast, i.e. tense versus lax stops in the former and voiced versus voiceless stops in the latter. The question is whether this assumption is in line with the phonological facts of Madurese (see Cohn 1991b), one of which is that only non-high vowels occur in word-initial position while high vowels never occur in this position.
The second possible scenario is that there is a two-way maximum contrast in Madurese between underlying voiced and voiceless aspirated stops (Brett Baker, personal communication). This account differs from the first scenario, in which the two-way contrast is between underlying voiced and voiceless stops and voiceless stops can be realised as voiceless unaspirated or voiceless aspirated. However, the two accounts share the assumption that Madurese has eight underlying vowels. The difference is that the two-way maximum contrast proposes that voiced stops and voiceless unaspirated stops are allophonic: voiced stops are underlying, and voiceless unaspirated stops are the surface variant that occurs before non-high vowels. This can be represented as in the following rule:
(2) C [+voice] → [-voice] / __ (-high vowels), where C = stop consonants
Similar to the first scenario, this proposal assumes no feature spreading or consonant-vowel interaction whatsoever. However, the problem with the proposal is that it contradicts the phonetic evidence from VOT and F0 shown in Figure 1 and Figure 3 earlier, which indicates that voiced and voiceless unaspirated stops belong to two different voicing categories. Another issue is that we would also need to explain why voiced stops become voiceless before non-high vowels, which is not a trivial phonological matter.
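For comparison, rule (2) can be sketched in the same style. The function name `apply_rule_2` and the devoicing table are assumptions made for illustration, pairing each voiced stop with the voiceless unaspirated stop at the same place of articulation; the inputs are again schematic sequences, not attested words.

```python
# Sketch of rule (2) under the second scenario: underlying voiced stops
# surface as voiceless unaspirated stops before non-high vowels.
# Names and representation are hypothetical, chosen for illustration.
NON_HIGH_VOWELS = {"a", "ɛ", "ə", "ɔ"}
DEVOICED = {"b": "p", "d": "t", "ɖ": "ʈ", "ɟ": "c", "ɡ": "k"}


def apply_rule_2(segments):
    """Return the surface form: C[+voice] -> [-voice] / __ non-high vowel."""
    surface = []
    for index, segment in enumerate(segments):
        following = segments[index + 1] if index + 1 < len(segments) else None
        if segment in DEVOICED and following in NON_HIGH_VOWELS:
            surface.append(DEVOICED[segment])  # voiceless unaspirated variant
        else:
            surface.append(segment)  # all other segments are unchanged
    return surface
```

The sketch makes the cost of this scenario visible: the mapping has to be stipulated stop by stop, and nothing in it explains why non-high vowels should condition devoicing.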
Brett Baker (personal communication) points to a comparable case in Kurtöp (Hyslop, 2009). Tone is not contrastive for obstruents, but it appears that tone is also spreading to obstruents. In this case, obstruents in Kurtöp have been described as comprising three series of stops, namely voiced, voiceless unaspirated and voiceless aspirated. Voiced stops are always followed by lower fundamental frequency, while voiceless unaspirated and voiceless aspirated stops are always followed by higher fundamental frequency. With respect to VOT, voiced and voiceless unaspirated stops show some overlap in their distributions, but both differ from voiceless aspirated stops. On the basis of this phonetic evidence, Hyslop (2009) suggests that there is no three-way contrast in Kurtöp but rather a maximum two-way contrast between voiced and voiceless aspirated stops, with voiced and voiceless unaspirated stops having undergone a merger.
How is the case of Kurtöp similar to that of Madurese? The similarity between the two languages rests to some extent on the fact that there is a co-occurrence restriction between stops and properties of the following vowels. In Kurtöp, voiced stops only co-occur with low F0 while voiceless unaspirated and voiceless aspirated stops only co-occur with high F0. In Madurese, voiced and voiceless aspirated stops only co-occur with high vowels while voiceless unaspirated stops only co-occur with non-high vowels. However, on closer inspection, the co-occurrence restriction in Kurtöp appears to be phonetically motivated and therefore more natural than that in Madurese. It has been established that the fundamental frequency following voiced stops is lower than that following voiceless unaspirated and voiceless aspirated stops (Hombert & Ladefoged, 1976; House & Fairbanks, 1953; Löfqvist, Baer, McGarr, & Story, 1989; Ohde, 1984). Thus, it does not come as a surprise that the same phenomenon is observed in Kurtöp. In addition, unlike in Madurese, any stop in Kurtöp can be followed by any vowel. Finally, in Kurtöp it is voiced and voiceless unaspirated stops that show overlapping VOT distributions, while in Madurese it is voiceless unaspirated and voiceless aspirated stops that overlap considerably in VOT.
The assumption that there is only a two-way contrast in Madurese stops would make sense if the occurrence of voiceless unaspirated and voiceless aspirated stops, as in the first scenario, were environment-dependent. The question is whether the height of the following vowels can be counted as a conditioning environment in this case. Treating the vowels as the environment which predicts consonant allophony, i.e. high vowels predict voiced and voiceless aspirated stops while non-high vowels predict voiceless unaspirated stops, is phonologically problematic. This is because, as we will see, it cannot explain vowel height harmony or the fact that the vowels of certain affixes change their height when they are attached to words in which the edgemost obstruent is a voiced or voiceless aspirated stop.
Cases where consonant allophony fails to explain vowel height harmony in Madurese involve three consonants, namely l, r and ʔ. In word-medial position, these consonants are always transparent, in the sense that the height of the vowels following them depends on the height of the vowels preceding them (Stevens, 1968; Trigo, 1991). That is, if the vowels preceding them are high, the vowels following them are also realised as high. Some examples are shown in (3) below.
On the other hand, if the vowels before l, r and ʔ are non-high, the vowels following them will also be realised as non-high. Some examples are shown in (4) below.
Another aspect which needs to be mentioned here is the behaviour of /s/. In word-initial position, /s/ behaves in the same manner as the other voiceless stops, nasal consonants and liquids. However, it behaves differently in intervocalic position, where the height of the vowels following /s/ depends on whether /s/ occurs morpheme-internally or at a morpheme boundary (Cohn, 1993b; Stevens, 1968).
The vowel height harmony processes shown in (3), (4) and (5) above only make sense if it is the consonants that determine vowel height, i.e. vowel allophony, rather than the other way around, i.e. consonant allophony.
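The consonant-driven view of vowel height, including the transparency of l, r and ʔ, can be sketched as a single left-to-right pass. This is a minimal sketch under stated assumptions: the function name `surface_vowels` is hypothetical, the pairing of each underlying non-high vowel with a high counterpart follows the order in which the two vowel sets are listed in this paper, the morpheme-dependent behaviour of intervocalic /s/ is not modelled, and the inputs are schematic sequences rather than attested Madurese words.

```python
# Sketch of the third scenario: four underlying vowels whose height is
# set by the nearest preceding non-transparent consonant.
# Assumption: vowels are paired in the order a-ɤ, ɛ-i, ə-ɨ, ɔ-u.
RAISED = {"a": "ɤ", "ɛ": "i", "ə": "ɨ", "ɔ": "u"}

VOICED_STOPS = {"b", "d", "ɖ", "ɟ", "ɡ"}
ASPIRATED_STOPS = {"pʰ", "tʰ", "ʈʰ", "cʰ", "kʰ"}
RAISING_TRIGGERS = VOICED_STOPS | ASPIRATED_STOPS
TRANSPARENT = {"l", "r", "ʔ"}


def surface_vowels(segments):
    """Map underlying segments to surface segments with vowel raising."""
    raising = False  # word-initial vowels surface as non-high
    surface = []
    for segment in segments:
        if segment in RAISED:
            surface.append(RAISED[segment] if raising else segment)
            continue
        if segment in RAISING_TRIGGERS:
            raising = True  # voiced and aspirated stops trigger raising
        elif segment not in TRANSPARENT:
            raising = False  # other consonants block raising
        # transparent consonants leave the current height setting intact
        surface.append(segment)
    return surface
```

For example, `surface_vowels(["b", "a", "l", "a"])` raises both vowels, whereas `surface_vowels(["p", "a", "l", "a"])` leaves both non-high, because the height setting is established by the initial stop and passes through the transparent consonant.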
Other evidence that it is consonants that affect vowels comes from vowel height alternation as a result of affixation. This can be seen in the morphophonemic alternation involving a nasal prefix 'N' marking the 'actor voice' form of verbs (Cohn, 1993b, p. 110; Davies, 2010, p. 32; Stevens, 1991, p. 363), a process known as Nasal Substitution. When the prefix 'N' replaces an underlying voiced or voiceless aspirated stop with its homorganic nasal, the following vowel becomes non-high, as exemplified in (6).
Further phonological evidence that it is the consonants that trigger vowel harmony comes from a process called vowel deletion. Vowel deletion, which is optional and appears to be dialect-specific in Madurese, can occur in an open first syllable of a word consisting of at least three syllables. That is, the vowel of the first syllable can optionally be deleted if it is preceded by a consonant and followed by an approximant, a liquid, or a glide (Davies, 2010; Stevens, 1968). As we can see in (7), even after the vowel in the first syllable is deleted, and therefore in the absence of the preceding vowel, the vowel following the transparent consonants /l, r/ does not change. This indicates that the harmony trigger is the consonant preceding the transparent consonant, not the vowel itself.
Another process that provides further evidence for vowel raising is aspiration resulting from a morphophonemic process. This type of aspiration occurs when a root-final stop, which is always realized as voiceless unaspirated in Madurese, is followed by a suffix beginning with a non-high vowel. In this position, the root-final stop is realized as a voiceless aspirated stop and the suffix vowel is realized as a high vowel. Examples of this morphophonemic aspiration are shown in (8) below. The suffix -ɛ is attached to a noun to form an imperative verb, whereas the suffix -an is attached to a verb to form a noun.
The examples in (8) also provide further evidence that it is the consonants that trigger the vowel height alternation, as opposed to vowels triggering consonant allophony. This is because the suffixes that underlyingly begin with non-high vowels surface with high vowels as the root-final stops become aspirated. In this case, it appears that final stops such as those in (8) are in fact underlyingly voiceless aspirated and that aspiration is neutralized in word-final position.

CONCLUSION
Of the three possible scenarios discussed above, the idea that Madurese has a three-way voicing contrast (voiced, voiceless unaspirated and voiceless aspirated), and therefore four underlying vowels (a, ɛ, ə, ɔ), best accounts for the voicing contrast in the language. It is true that there are no minimal triplets in Madurese, owing to the strict CV co-occurrence restriction.
However, proposing that Madurese has only a two-way contrast fails to explain the robust consonant-vowel interaction as well as the feature spreading associated with the prevocalic consonants. Put differently, the two-way contrast proposal seems to simplify the description of the consonants, but it complicates the analysis of the vowels, the vowel harmony process and the morphophonemic alternations. In addition, the proposal cannot account for the phonological patterning of voiced and voiceless aspirated stops, because it reduces the CV co-occurrence restriction to a trivial phonological phenomenon that requires no further analysis. It would also make Madurese phonologically similar to its neighbouring languages that show a two-way contrast in their stops, which is not really the case.