Compounds in Mandarin


This chapter examines the compound formation processes in Mandarin. It has two goals: (a) to introduce notions and facts that will be discussed in the following chapters, and (b) to examine how an analysis of formation processes helps to classify compounds in Mandarin. The chapter is composed of five sections. Section 2 looks at the definition of compounding. Section 3 introduces the diagnostics for determining wordhood in Mandarin. Section 4 first reviews how compounds are classified in the literature and then proposes a novel classification based on the compound formation processes. Section 5 concludes the chapter.

Definition of compounding

Morphology is messy in nature (Hooper, 1979). One piece of evidence for this is the continuing debate over the definition of compounding. It is widely accepted in morphology that providing a universally applicable definition of compounding is not an easy task (Lieber & Štekauer, 2009). To some scholars, a compound is composed of two or more free morphemes, which are words themselves when used separately (Bauer, 2001; Bloomfield, 1933; Fabb, 1998). To others, bound morphemes can be constituents of compounds as well (Gerdts, 1998; Mithun, 1997, 1999). The majority of multisyllabic words in Mandarin would not count as compounds if the first definition were adopted, since in most complex words in Mandarin at least one of the constituents is a bound morpheme (Packard, 2000). Therefore, while some morphologists follow Bloomfield and define a compound in Mandarin as consisting of two or more free morphemes (e.g., Dai, 1992), some propose that a compound in Mandarin contains at least one bound morpheme (Y. R. Chao, 1968), and others propose that a compound can be composed of two or more morphemes that are either bound or free, with bound morphemes forming the majority (X. Dong, 2004; Liao, 2014). This thesis follows Dong (2004) and Liao (2014) and adopts this broad definition of compounds.

Diagnostics for wordhood

This section examines the diagnostics for wordhood. Various criteria, including grammatical, semantic, and prosodic diagnostics, have been put forward in the literature. This section focuses on the two criteria relevant to the present study: insertion and modification.


In Chapter 1, the tense and aspect marker le (‘-ed’) was employed as a diagnostic for wordhood. If a construction does not allow le (‘-ed’) to be inserted between its constituents, it is regarded as a word; otherwise, it is treated as a phrase. This criterion for distinguishing compounds from phrases is based on the Lexical Integrity Hypothesis (Booij, 2005a; Chomsky, 1970; A.-M. Di Sciullo & Williams, 1987; C. T. J. Huang, 1984; Lapointe, 1980; Lieber & Scalise, 2006; Selkirk, 1982; Spencer, 2005). According to this hypothesis, a word possesses the property of “syntactic atomicity” (Di Sciullo & Williams, 1987, p. 52). One aspect of atomicity is that a word does not allow any type of insertion. As shown in (1a), inserting the tense and aspect marker le between the two constituents of the verbal compound pao-zou (run-walk, ‘run away’) results in an ill-formed expression. Instead, le can only be added to the end of the compound, as illustrated by (1b).

The corresponding diagnostic for nominal compounds is the insertion of de. The morpheme de is a phrasal marker, which can be inserted into a phrasal structure (Gu & Shen, 2001; D. Shi, 2002, 2003). Inserting de into a nominal compound results in ungrammaticality. For example, attributive compounds with noun output have the structure [M N]N, in which M stands for the modifier of the head N (e.g., bai-cai (white-vegetable, ‘cabbage’)). A parallel phrase structure can be identified: [M N]NP (e.g., hao-cai (good-vegetable, ‘vegetables with good quality’)). To distinguish the compound bai-cai (white-vegetable, ‘cabbage’) from the phrase hao-cai (good-vegetable, ‘vegetables with good quality’), the marker de can be employed. As illustrated by (2) and (3), inserting de into the compound bai-cai (white-vegetable, ‘cabbage’) results in an ungrammatical structure, whereas inserting it into the nominal phrase hao-cai (good-vegetable, ‘vegetables with good quality’) still produces an acceptable expression with the same meaning.
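The decision logic of the insertion diagnostic can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the analysis: the names InsertionTest and classify are hypothetical, the inserted forms are built mechanically from the hyphenated transliterations, and the acceptability judgments are simply those reported in the prose above for pao-zou, bai-cai, and hao-cai.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InsertionTest:
    """One application of the insertion diagnostic (hypothetical helper)."""
    base: str         # construction under test, e.g. "pao-zou"
    marker: str       # inserted element: "le" (verbal) or "de" (nominal)
    acceptable: bool  # judgment on the form with the marker inserted

    @property
    def inserted_form(self) -> str:
        # Place the marker between the two constituents.
        first, second = self.base.split("-", 1)
        return f"{first}-{self.marker}-{second}"


def classify(test: InsertionTest) -> str:
    """A construction that rejects insertion is a word; otherwise a phrase."""
    return "phrase" if test.acceptable else "word"


# Judgments as described in the text for the three constructions.
tests = [
    InsertionTest("pao-zou", "le", acceptable=False),
    InsertionTest("bai-cai", "de", acceptable=False),
    InsertionTest("hao-cai", "de", acceptable=True),
]

results = {t.inserted_form: classify(t) for t in tests}
for form, status in results.items():
    print(f"{form}: {status}")
```

The sketch treats the judgment as given input. The diagnostic itself rests on speakers' acceptability judgments, which the code does not attempt to model.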



Another prediction of the Lexical Integrity Hypothesis is that a word does not permit the modification of its constituents, whereas a phrase does (Lieber & Scalise, 2006). For example, in the verbal phrase kai-guan (open/turn on-close/turn off, ‘to open/turn on and close/turn off’), adverbs can be inserted to modify kai (‘open/turn on’) and guan (‘close/turn off’) separately. As can be seen in (4), the adverbs qing-qing-de (‘softly’) and zhong-zhong-de (‘heavily’) can be inserted before the first and the second morpheme, respectively. The adverb qing-qing-de (‘softly’) modifies the verb kai (‘open/turn on’) by specifying the manner in which the door opens (i.e., the door opened softly). The adverb zhong-zhong-de (‘heavily’) modifies the verb guan (‘close/turn off’) by specifying the manner in which the door closes (i.e., the door closed heavily). As a compound, however, the noun kai-guan (open/turn on-close/turn off, ‘switch’) does not permit any modification of its constituents. In (5), the adjective hao (‘good’) can only precede the compound and modify the expression as a whole. Neither of its constituents can be independently modified by the adjective.

Types of compound formation processes 

Previous classification of compounds

In the history of morphological study, three mainstream approaches have been proposed for the classification of compounds. They are based on (a) headedness, (b) the syntactic category of compounds, and (c) the semantic relations between constituents.

The first approach classifies compounds with respect to headedness, that is, whether there is a head in the compound. For many scholars, including Bauer (2003), Bloomfield (1933), Booij (2005b), Chao (1968), Fabb (1998), Haspelmath (2002), Scalise and Bisetto (2009), Spencer (1991), and Zwicky (1985), kai-guan (open/turn on-close/turn off, ‘switch’), with the structure [V V]N, is an exocentric compound because the expression lacks a head, whereas bai-cai (white-vegetable, ‘cabbage’), with the structure [A N]N, is an endocentric compound in which the nominal element N functions as the head. There are not many exocentric compounds in Mandarin. For instance, Ceccagno and Basciano (2007) carry out a corpus study of Mandarin compounds and classify them into three types: coordinate, attributive, and subordinate. Their study indicates that the percentages of exocentric compounds in the three types are 15.8%, 20.7%, and 19%, respectively. That is to say, roughly 80% of the compounds in each type are headed. Ceccagno and Basciano’s research further indicates that there are three different patterns of headedness in Mandarin compounds: left-headed, right-headed, and two-headed.

In addition to headedness, compounds can also be categorized according to their syntactic category. For example, compounds in Mandarin are classified into nominal compounds, verbal compounds, and so on (S. Huang, 1998; Packard, 2000). A problem with this criterion is that compounds with different internal structures may fall into the same category (Scalise & Bisetto, 2009). For example, while hei-ban (black-board, ‘blackboard’) and tuan-gou (group-buy, ‘collective buying’) are both nominal compounds, the two have different structures, with the former being [A N]N and the latter being [N V]N.

The third approach is to categorize compounds in accordance with the semantic relations between the constituents. Based on this criterion, compound categories such as attributive, appositive, dvandva, appositional, and so on, are proposed (Bauer, 1983, 2009; A. M. Di Sciullo, 2005; Lieber, 2004; Scalise & Bisetto, 2009; Spencer, 1991). In Mandarin, compounds are usually classified into three types: subordinate, attributive, and coordinate (Bisetto & Scalise, 2005; Ceccagno & Basciano, 2007; Ceccagno & Scalise, 2006). Subordinate compounds are those in which the constituents are in a head-argument relation. Attributive compounds are those in which the constituents are in a modifier-head relation. Coordinate compounds are those in which the constituents are in a relation of synonymy, antonymy, or hyponymy.

Of the three types, coordinate compounds differ from the other two in that (a) there is a “logical coordination between the constituents” (Ceccagno & Basciano, 2007, p. 212), and (b) the constituents usually have the same syntactic category. For example, in the coordinate compound mao-dun (spear-shield, ‘contradiction’), both mao (‘spear’) and dun (‘shield’) are nouns. The compound as a whole thus takes the structure [N N]N. What is problematic with this semantic classification of Mandarin compounds is that the distinction between subordinate and attributive compounds is not clear-cut (Bisetto & Scalise, 2005). For example, the word kou-suan (mouth-calculate, ‘do mental calculation’), which is classified as an attributive compound in the literature (e.g., Ceccagno & Scalise, 2006), also exhibits the property of a subordinate compound. In other words, the modifier kou (‘mouth’) is an argument (instrument) of the head suan (‘calculate’).
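The headed share implied by the exocentric percentages cited above from Ceccagno and Basciano (2007) can be checked with a short calculation. The Python sketch below is illustrative only. It assumes that every compound in a type is either exocentric or headed, so that the headed share is the complement of the exocentric share, and the variable names are hypothetical. Because the per-type counts are not given here, no overall figure across the three types is computed.

```python
# Exocentric percentages per compound type, as cited in the text.
exocentric_pct = {
    "coordinate": 15.8,
    "attributive": 20.7,
    "subordinate": 19.0,
}

# Assumption: headed share = 100% minus exocentric share within each type.
headed_pct = {
    compound_type: round(100.0 - pct, 1)
    for compound_type, pct in exocentric_pct.items()
}

for compound_type, pct in headed_pct.items():
    print(f"{compound_type}: {pct:.1f}% headed")
```

Under this assumption, the headed shares are 84.2% for coordinate, 79.3% for attributive, and 81.0% for subordinate compounds, that is, about four in five compounds in each type.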

A new perspective

In view of the problems identified in the previous section and the fact that most research on Mandarin compounds to date is restricted to disyllabic compounds (e.g., Ceccagno & Basciano, 2007; Ceccagno & Scalise, 2006; Dai, 1992; Packard, 2000; X. Zhang, 2001), a new classification of compounds is needed. Although, as noted in Chapter 1, multisyllabic words in Mandarin are mostly disyllabic, some consist of more than two syllables. Excluding these longer words from research runs the risk of failing to reveal the essential properties of compounds. Therefore, this thesis proposes a new way of classifying compounds, which aims not only to provide a solution to the identified problems but also to cover compounds of different lengths. This new approach categorizes compounds according to their formation processes. The rationale behind this approach is that compounds with the same surface structure (e.g., cai-tang (vegetable-soup, ‘vegetable soup’) and fu-qi (husband-wife, ‘husband and wife’)), as well as compounds whose constituents are semantically related (e.g., kou-suan (mouth-calculate, ‘do mental calculation’) and kuang-ben (crazy-run, ‘run wildly’)), may be produced in different ways. Therefore, once the formation processes of the compounds are determined, the differences between these compounds can be determined as well. Compounds in Mandarin are classified into three types in this thesis: (a) compounds that are formed through reduplication, (b) compounds that are formed through merge, and (c) compounds that are formed through head movement.

1. Introduction
1.1 Research questions
1.2 Goals of the research
1.3 Theoretical orientation
2. Compounds in Mandarin
2.1 Introduction
2.2 Definition of compounding
2.3 Diagnostics for wordhood
2.4 Types of compound formation processes
3. Noun incorporation and pseudo noun incorporation
3.1 Argument Structure
3.2 Incorporation—head movement or phrasal movement?
3.3 Tense and aspect
3.4 Motivation for movement
3.5 Phonological form
3.6 Word order
4. [VN]V compounds in Mandarin
4.1 Deriving [VN]V compounds by head movement
4.2 Prosody of [VN]V compounds
5. A new perspective
5.1 Compounds formed through reduplication
5.2 Compounds formed through merge
5.3 Conclusion

