

Properties of School Chinese: Implications for Learning to Read

Hua Shu, Xi Chen, Richard C. Anderson, Ningning Wu, and Yue Xuan

The properties of the 2,570 Chinese characters explicitly taught in Chinese elementary schools were systematically investigated, including types of characters, visual complexity, spatial structure, phonetic regularity and consistency, semantic transparency, independent and bound components, and phonetic and semantic families. Among the findings are that the visual complexity, phonetic regularity, and semantic transparency of the Chinese characters taught in elementary school increase from the early grades to the later grades: Characters introduced in the 1st or 2nd grade typically contain fewer strokes, but are less likely to be regular or transparent, than characters introduced in the 5th or 6th grade. The inverse relation holds when characters are stratified by frequency. Low-frequency characters tend to be visually complex, phonetically regular, and semantically transparent whereas high-frequency characters tend to be the opposite. Combined with other findings, the analysis suggests that written Chinese has a logic that children can understand and use.

Chinese, the language used in the most populous country in the world, has a writing system profoundly different from alphabetic systems. The basic units of the Chinese writing system are characters, or hanzi, literally, the written symbols of the Chinese people. Instead of the letter-to-phoneme correspondences of languages with alphabetic writing systems, in Chinese the correspondences are character-to-syllable. In alphabetic languages, individual letters are meaningless; a string of letters is needed to represent a morpheme. In Chinese, an individual character represents one, and typically only one, morpheme. Since the early studies 2 decades ago (Hung & Tzeng, 1981), numerous psycholinguistic experiments have investigated how Chinese adults process characters (e.g., Spinks, Liu, Perfetti, & Tan, 2000; Zhou, Marslen-Wilson, Taft, & Shu, 1999). A smaller but growing body of research has investigated how Chinese children acquire characters (e.g., Chan & Siegel, 2001; Ho & Bryant, 1997; Shu & Anderson, 1997; Shu, Anderson, & Wu, 2000).

The purpose of this study was to provide the foundation for a better understanding of children’s acquisition of characters. We sought to describe the extent to which Chinese characters encode information about pronunciation and meaning in an identifiable and orderly way.

Our method was to make a detailed analysis of each of the 2,570 characters listed by the Elementary School Textbooks (1996) to be taught explicitly in elementary school in Beijing and elsewhere. This corpus of characters can be called school Chinese, with the restriction to Grades 1 through 6 understood, although individual teachers may explicitly teach additional characters and children may independently learn additional characters that they encounter in required and suggested reading.

To the extent that a spoken language and associated writing system are connected in a logical way, it is only reasonable to suppose that children who understand the logic will more easily acquire and use the written language (Nagy & Anderson, 1998). On the other hand, when the pronunciation and meaning of new words cannot be reliably predicted from principles relating speech and print, then children must become familiar with words in order to pronounce them and identify their meanings. Goswami and her colleagues found that the size of the familiarity effect is much greater in English and French, languages with less transparent orthographies, than in German, Spanish, and Greek, languages with more transparent orthographies (Goswami, Gombert, & Fraca de Barrera, 1998; Goswami, Porpodas, & Wheelwright, 1997; Wimmer & Goswami, 1994).

It would be difficult, perhaps impossible, to place Chinese on the same scale as the Western alphabetic languages studied by Goswami and her colleagues. Nonetheless, the general principle is applicable: Repeated experience with written words is more (or less) important depending on the regularity, transparency, and consistency of the writing system. As we explain in detail later, we take the optimistic position that there is a logic to the formation of Chinese characters that, when understood by children, can be very useful to them for acquiring, remembering, and using characters. By way of contrast, conventional educational practice in China seems to assume the opposite: that childish attempts to use what is presumably regarded as the uncertain logic of characters amount to guessing and should be discouraged, and that for learning to happen consistently each and every character must be explicitly taught and repeatedly practiced (Wu, Li, & Anderson, 1999).

© 2003 by the Society for Research in Child Development, Inc. All rights reserved. 0009-3920/2003/7401-0003

Hua Shu, Department of Psychology, Beijing Normal University; Xi Chen and Richard C. Anderson, Center for the Study of Reading, University of Illinois; Ningning Wu, Department of Psychology, Beijing Normal University; Yue Xuan, Department of Educational Psychology, Pennsylvania State University.

The authors are pleased to acknowledge the support of the Spencer Foundation for the completion of this research. The authors are also grateful for the help they received from Dacheng Zhang, Xianjun Zheng, and Meiling Hao.

Correspondence concerning this article should be addressed to Hua Shu, Department of Psychology, Beijing Normal University, Beijing, China, 100875.

Child Development, January/February 2003, Volume 74, Number 1, Pages 27–47

Children do not learn a language all at once. They learn it over a period of years. Whatever the overall degree of order in written Chinese, this order may not be fully represented in the characters that children learn in the early grades. This could influence the age at which children become aware of features of the writing system and begin to use the features to assimilate characters. For this reason, we have coded every character in the corpus of school Chinese according to the grade and semester in which it is listed to be taught by the Ministry of Education.

Several previous analyses have sought to determine the degree of phonetic regularity of large samples of Chinese characters (Li & Kang, 1993; Perfetti, Zhang, & Berant, 1992; Yin, 1991; Zhou, 1978, 1980). Other analyses have had the goal of evaluating the extent of semantic transparency of large samples of characters (Kang, 1993; Perfetti et al., 1992; Wang, 1997). We review the findings from these analyses later, after the methods and findings from the present analysis have been described. In the meantime, aside from one analysis of the phonological cues in the characters in an earlier edition of the elementary school textbooks (Shu, Wu, Zheng, & Zhou, 1998), which we also review later, every previous study has examined characters that presupposed adult reading material. This article contains the first comprehensive analysis of the characters in reading material for schoolchildren.

An estimated 80% to 90% of modern characters are semantic–phonetic compounds (also called ideophonetic compounds or, simply, phonetic compounds) that consist of two major components. One is a semantic component (often called a radical) that gives information about the meaning of the character. The other is a phonetic component (often called a phonetic) that gives information about the character’s pronunciation. For example, the character for mother, pronounced /mā/, consists of the radical for female and a phonetic pronounced /mǎ/.1

Written Chinese contains about 200 radicals and 800 phonetics (Hoosain, 1991). Thousands of compound characters are formed from combinations and recombinations of these components. Major components may be further subdivided into about 650 subcomponents (Fu, 1989). The subcomponents are recurrent patterns that do not have meanings or pronunciations themselves, but serve as the building blocks of major components. The stroke is the smallest unit of the Chinese writing system. There are eight types of strokes. The number of individual strokes in a character is usually treated as the index of its visual complexity.

Most radicals and phonetics, in addition to being components of compound characters, are themselves simple characters with independent pronunciations and meanings. A smaller number of radicals and phonetics are bound forms. That is, they never appear alone, but only appear as components of compound characters.

There is wide variability in the usefulness of the information conveyed by the components of semantic–phonetic compound characters. Some contain a radical that provides an obvious and direct clue to meaning and, thus, can be categorized as semantically transparent. For example, the character for pine has the radical for wood, and the character for candle has the radical for fire. Other characters, which might be called semitransparent, contain a radical that provides a weak or indirect clue to meaning. For example, the characters for hunting and for sly (crafty like a fox) both contain the radical for animal. Still other characters are semantically opaque inasmuch as the radical provides no clue to meaning. For example, the character for mistake has the radical for metal.

The usefulness of information provided in the phonetic component of semantic–phonetic compound characters also varies. Each character is pronounced with one, and usually only one, syllable. All of the approximately 1,200 syllables of modern standard Chinese can be uniquely described in terms of three phonological elements: onset, final, and tone. There are a limited number of onsets and finals and just four tones (five counting the neutral tone). The four tones, or voice inflections, are high, rising, low then rising, and falling.
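The decomposition of a syllable into onset, final, and tone can be illustrated with a short sketch. It is not part of the original analysis. It assumes pinyin written with a trailing tone digit, uses the standard list of pinyin initials, and treats y and w as onsets for simplicity; the names `Syllable` and `split_syllable` are hypothetical.

```python
from typing import NamedTuple

# Standard pinyin initials. Two-letter onsets come first so that
# "zh", "ch", and "sh" are matched before "z", "c", and "s".
# Treating "y" and "w" as onsets is a simplifying assumption.
ONSETS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
          "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


class Syllable(NamedTuple):
    onset: str  # empty string when the syllable has no initial consonant
    final: str
    tone: int   # 1 to 4; 5 stands for the neutral tone (an assumption)


def split_syllable(pinyin: str) -> Syllable:
    """Split tone-numbered pinyin such as 'ma1' into onset, final, and tone."""
    body, tone = pinyin[:-1], int(pinyin[-1])
    for onset in ONSETS:
        if body.startswith(onset):
            return Syllable(onset, body[len(onset):], tone)
    return Syllable("", body, tone)


print(split_syllable("ma1"))   # Syllable(onset='m', final='a', tone=1)
print(split_syllable("ao2"))   # Syllable(onset='', final='ao', tone=2)
```

Representing the three elements separately makes it straightforward to compare a character with its phonetic element by element.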

1 Meanings of characters are enclosed in parentheses. Pronunciations are enclosed in slashes. The diacritical marks represent tones, for example, /mā/, /má/, /mǎ/, and /mà/ for the first through fourth tones, respectively.

Perfectly regular characters are pronounced with the same onset, final, and tone as the phonetic in isolation, that is, the same as the phonetic when it is being used as a simple character. For example, one character pronounced /cai/ has the same pronunciation as its phonetic when the phonetic functions as a simple character, /cai/. However, there are many irregular compound characters that have a completely different pronunciation from the phonetic. For instance, one compound is pronounced /zu/ whereas its phonetic is pronounced /qie/. In between is a large group of compound characters that are semiregular in that the phonetic provides partial information about pronunciation. For example, one character pronounced /qing/ has the same syllable as its phonetic, /qing/, but a different tone. Another character, pronounced /xian/, has the same final and tone as its phonetic, /jian/, but a different onset.
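The three-way distinction among regular, semiregular, and irregular characters can be sketched as a comparison of onset, final, and tone. This is only an illustration under stated assumptions: the function name `classify_regularity` is hypothetical, "partial information" is operationalized here as sharing either the onset or the final, and the tone numbers in the usage lines are placeholders because the tones are not shown in the examples above. The coding scheme actually used for the corpus is described later in the article and may differ.

```python
def classify_regularity(character, phonetic):
    """Compare two (onset, final, tone) tuples.

    Returns "regular" when all three elements match, "semiregular" when
    the onset or the final matches (an assumed reading of "partial
    information"), and "irregular" otherwise.
    """
    c_onset, c_final, _ = character
    p_onset, p_final, _ = phonetic
    if character == phonetic:
        return "regular"
    if c_onset == p_onset or c_final == p_final:
        return "semiregular"
    return "irregular"


# Tone numbers below are placeholders chosen only to fit the descriptions.
print(classify_regularity(("c", "ai", 2), ("c", "ai", 2)))    # regular
print(classify_regularity(("z", "u", 3), ("q", "ie", 3)))     # irregular
print(classify_regularity(("q", "ing", 3), ("q", "ing", 1)))  # semiregular
print(classify_regularity(("x", "ian", 4), ("j", "ian", 4)))  # semiregular
```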

Characters with bound phonetics do not have fixed, independent pronunciations. However, the fact that, in many cases, characters sharing the same bound phonetic are related in pronunciation provides potentially useful information. For example, several characters that share one bound phonetic are all pronounced /fu/.

Semantic–phonetic compound characters have a spatial structure. The most common one is the left-right structure with the radical on the left side and the phonetic on the right side. Many major components have fixed positions within characters. Other components may appear in different positions. A few components serve a different function depending on their position within characters, changing from semantic radical to phonetic.

Chinese characters that are not semantic–phonetic compounds include pictographs, ideographs, semantic compounds, and a few other rare types. Pictographs are simple characters that were easily seen as picturing objects in Ancient Chinese. For example, the early character for moon was recognizably a picture of the moon. However, over centuries of use, pictographs have become more stylized so that today few of them clearly represent the objects they denote. The other type of meaningful simple character is the ideograph. Examples of ideographs include the characters for up and down. Modern Chinese also contains some semantic compound characters in which the character meaning is suggested by the combination of the meanings of its components. For example, the character for letter consists of the radical for person and the radical for say. Pictographs, ideographs, and semantic compounds represent only a small percentage of present-day characters (Taylor & Taylor, 1995).

To recapitulate, in this study we sought to describe the extent to which the characters that constitute school Chinese have a logic that could enable an informed reader to extract useful cues to pronunciation and meaning. The goal was to understand reading development, the aspects of Chinese that children may readily understand and the aspects that children may have difficulty understanding and that, therefore, may be slow to develop. A subsidiary goal was to make available to the education and research communities a detailed and comprehensive analysis of each of the characters in the corpus. Character-by-character information should be useful to teachers, curriculum developers, textbook and storybook authors, and educational and psycholinguistic researchers.2

Method

Characters in the corpus of school Chinese were drawn from the vocabulary lists in the 12 volumes of the Elementary School Textbooks (1996) prepared by the Ministry of Education and currently used in Beijing and several other regions. Primary schools in China have six grades. An academic year consists of a fall and a spring semester, and in each semester one volume of the textbook is used. There are therefore 12 volumes of the textbook altogether for the primary school period. Each volume has a vocabulary list attached to it, which collects all the new characters that are supposed to be explicitly taught in class. However, the vocabulary list does not necessarily include all the new characters in the volume, because, especially in the upper grades, there may be new characters that are not required to be taught. Included in the corpus are all the characters that are required to be explicitly taught throughout the primary school period, from Grade 1 to Grade 6.

The corpus contains 2,648 characters. For reasons that are not clear to us, about 70 characters appear more than once in the vocabulary lists. In the statistical analysis of the corpus, a character that appears more than once was counted only once, at its first appearance, which left 2,570 distinct characters in our analysis.
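The rule of counting a repeated character only at its first appearance can be sketched as follows. This is an illustration rather than the procedure actually used; the function name `first_appearances` and the placeholder entries are hypothetical. The figures in the text imply 2,648 − 2,570 = 78 repeated listings.

```python
def first_appearances(entries):
    """Keep each character once, recording the volume where it first appears.

    `entries` is an iterable of (character, volume) pairs given in the
    order in which they occur in the vocabulary lists.
    """
    seen = {}
    for character, volume in entries:
        if character not in seen:  # later repeats are ignored
            seen[character] = volume
    return seen


# Placeholder entries: "A" is listed again in volume 3 but is kept at volume 1.
listing = [("A", 1), ("B", 1), ("A", 3), ("C", 4)]
distinct = first_appearances(listing)
print(len(listing), len(distinct))  # 4 3
```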

Some or all of the following information is given for each character in the corpus, depending on the features of the character:

2 A CD-ROM containing the character-by-character listing in Microsoft Excel (Asian version 7.0) of the corpus of school Chinese can be purchased for $25 from School Chinese, Center for the Study of Reading, 51 Gerty Drive, Champaign, IL 61820, USA. Introductions and column headings are provided in both Chinese and English.

1. The serial number of the character. The characters are listed in order of phonetic regularity, phonetic code, and volume number, which are defined later.

2. The character.

3. The pronunciation of the character in the form of pinyin. The number 1, 2, 3, or 4 marks the tone of the character. The information about pronunciation is from the Modern Chinese Dictionary (1998).

4. The phonetic code. This is the pronunciation of the phonetic component of a semantic–phonetic compound character. The number marks the rank of the phonetic component in Ni (1982). For example, a character has the phonetic code /ao2/ when its phonetic component is pronounced /ao/ and is the second phonetic component pronounced /ao/ in the dictionary. The purpose of the phonetic code is to distinguish homophonic phonetics.

5. Phonetic regularity. The information for phonetic regularity is described in greater detail later.

6. The phonetic component form. The phonetic component can be either an independent character or a bound form. An independent character is a character that appears in the Modern Chinese Dictionary (1998). A bound form does not appear alone but only as a component of a semantic–phonetic compound character.

7. The position of the phonetic component in the character. It can appear: (a) on the right side, (b) on the left side, (c) on the top, (d) at the bottom, (e) inside the radical, (f) surrounding the radical, or (g) other.

8. Whether the position of the phonetic component is fixed. It either always appears in a fixed position or appears in different positions in different characters.

9. The possibility that the phonetic component in a semantic–phonetic compound character can be a radical in another compound character. The phonetic component can be a radical, or it cannot be a radical.

10. Phonetic consistency. The possibilities are: (a) all the characters with this phonetic in the corpus have the same pronunciation, (b) the characters with this phonetic in the corpus have different pronunciations, or (c) there is only one character in the corpus that has this phonetic. The information on phonetic consistency is described in greater detail later.

11. The lesson (within volume) in which the character is taught.

12. The volume in which the character is taught.

13. The frequency rank of the character in the Dictionary of Chinese Character Information (1988).

14. Frequency 1. This is the frequency of the character in adult reading material. The frequency information is from the Dictionary of Chinese Character Information (1988). The unit of frequency is occurrences per million.

15. Frequency 2. This is also the frequency with which a character appears in adult reading materials, taken from the Modern Chinese Frequency Dictionary (1986). This measure of frequency was not used in the analyses reported in this article.

16. The possibility that the character can be a phonetic. Either it can be a phonetic or it cannot be a phonetic (applies only to characters that are not semantic–phonetic compound characters).

17. Structure of the character. This lists the components and subcomponents of each character, in order from left to right if the character is horizontally structured and from top to bottom if the character is vertically structured. A component is defined as the smallest configuration of strokes that recurs in the radicals or phonetics of many characters.

18. The number of components in a character.

19. Structure type. The character has a: (a) simple structure, (b) top-bottom structure, (c) left-right structure, or (d) surrounding and half-surrounding structure.

20. The number of strokes. The information in Items 17 through 20 is taken from the Dictionary of Chinese Character Information (1988).

21. Semantic transparency. The information on semantic transparency is described in greater detail later.

22. The possibility that the radical is an independent character. The radical can be an independent character, or the radical is a bound form that does not appear alone but only as a component of a semantic–phonetic compound character.

23. The position of the radical in the character. It is: (a) on the right side, (b) on the left side, (c) on the top, (d) at the bottom, (e) inside the phonetic component, (f) surrounding the phonetic component, or (g) other.

24. The possibility that the radical has a fixed position. Either it has a fixed position or it can appear in more than one position.
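The three possibilities listed for phonetic consistency (Item 10) can be sketched as a grouping of characters by phonetic code. This is an illustrative reading, not the coding procedure actually used. The function name `phonetic_consistency`, the category labels, and the placeholder records are hypothetical, and whether "the same pronunciation" includes tone is left to whatever pronunciation strings are supplied.

```python
from collections import defaultdict


def phonetic_consistency(records):
    """Label each phonetic code by the pronunciations of its characters.

    `records` is an iterable of (character, pronunciation, phonetic_code).
    Labels: "single" (only one character has this phonetic), "consistent"
    (all characters share one pronunciation), "inconsistent" (otherwise).
    """
    families = defaultdict(list)
    for _character, pronunciation, code in records:
        families[code].append(pronunciation)

    labels = {}
    for code, pronunciations in families.items():
        if len(pronunciations) == 1:
            labels[code] = "single"
        elif len(set(pronunciations)) == 1:
            labels[code] = "consistent"
        else:
            labels[code] = "inconsistent"
    return labels


# Placeholder records; the character names are stand-ins, not corpus entries.
records = [
    ("char1", "fu2", "fu1"), ("char2", "fu2", "fu1"),
    ("char3", "zu3", "qie1"), ("char4", "jie3", "qie1"),
    ("char5", "ao2", "ao2"),
]
print(phonetic_consistency(records))
# {'fu1': 'consistent', 'qie1': 'inconsistent', 'ao2': 'single'}
```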

The reference for phonological information was Ni (1982). Ni’s monograph consists of two sections. The first section lists semantic–phonetic compound characters in alphabetical order of pinyin and provides the phonetic for each character. The second section lists phonetics in alphabetical order of pinyin and provides the semantic–phonetic compound characters each phonetic can form. A character is defined as a semantic–phonetic compound character in the corpus if it can be found in Ni. The pronunciation of compound characters was based on the Modern Chinese Dictionary (1998). This pronunciation was compared with the pronunciation of the phonetic in Ni to decide the relation between the two.

Detailed information is also provided in the corpus about the relation between the meaning of each semantic–phonetic compound character and that of its radical. To evaluate this relation for each semantic–phonetic compound character in the corpus, 30 college students in the Chinese Department at Capital Normal University in Beijing were asked to define: (a) the meaning of the radical, (b) the direct meaning of the character, and (c) the extended meaning of the character. Dictionaries (e.g., T’sou, 1981) were also consulted to determine the meanings of radicals. On the basis of the students’ ratings and information provided in reference books, two researchers classified each character in the corpus into one of nine categories, which are explained in detail later. The interrater reliability was .89.

Analyses were performed on types of characters, visual complexity of characters, the structure of characters, independent and bound components (phonetics and radicals), and phonetic and radical families. Frequency of usage and grade are factors in most analyses and are defined as follows. Factors that figured in one analysis are defined in the section in which the analysis is described.

Frequency of usage is the number of occurrences per million, as determined in a large character-frequency study (Dictionary of Chinese Character Information, 1988), in which books and other written materials were sampled from various fields and from 192 newspapers and magazines. The frequency of 6,374 characters (types) was tabulated from 21,629,372 running characters of text (total tokens). This is Frequency 1 for each character. Frequency 1 was based on adult reading material. It would have been preferable to have used a measure of frequency derived from children’s reading material, but none was available. We glean some reassurance from an analysis by Carroll (1971) that implies a high correlation between frequency counts in adult reading material (Kucera & Francis, 1967) and child reading material (Carroll, Davies, & Richman, 1971). We say ‘‘implies a high correlation’’ because the evidence is indirect. Carroll found that subjective ratings of the frequency of a set of words correlated .96 with actual frequency in adult reading material and .95 with actual frequency in children’s reading material. These correlations could not be this high without there also being a high correlation between the actual frequencies in adult and child material. Carroll’s analysis involved English, but there is no reason to suppose that the correlation between frequencies in adult and child material would be any lower in Chinese.
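The indirect argument can be made concrete with a standard property of correlations, which is offered here as an illustration and is not a computation reported by Carroll. For three variables, the correlation between two of them is bounded below by the product of their correlations with the third minus the square root of the product of the unexplained variances. The function name `correlation_lower_bound` is hypothetical.

```python
import math


def correlation_lower_bound(r_xz: float, r_yz: float) -> float:
    """Smallest r_xy compatible with r_xz and r_yz.

    Follows from requiring the 3 x 3 correlation matrix to be
    positive semidefinite.
    """
    return r_xz * r_yz - math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Ratings correlate .96 with adult frequency and .95 with child frequency.
bound = correlation_lower_bound(0.96, 0.95)
print(round(bound, 2))  # 0.82
```

Under these assumptions, the correlation between adult and child frequencies for that word set could not be lower than about .82.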

Grade is the grade in which a character is first introduced in the national reading and language textbooks, coded 1 through 6 for the six grades of elementary school in China.

Results

Types of Characters

Table 1 presents the number and proportion of characters introduced in each grade. The table shows that the task of learning characters is heaviest for Chinese children in the early years of primary school, especially in the second and third grades. The number of characters introduced decreases in higher grades, because the higher grades focus more on reading comprehension, and children are increasingly expected to learn characters on their own (Wu et al., 1999).

Table 1

Number and Proportion of Characters by Grade

Grade                    1      2      3      4      5      6     Total
Number of characters    436    709    541    358    323    203    2,570
Proportion of total     .17    .28    .21    .14    .13    .08    1.00
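The proportions in Table 1 follow directly from the counts, as the short check below illustrates. It is a sketch for the reader, not part of the original analysis, and the variable names are hypothetical.

```python
# Number of characters introduced in each grade (Table 1).
counts = {1: 436, 2: 709, 3: 541, 4: 358, 5: 323, 6: 203}

total = sum(counts.values())
proportions = {grade: round(n / total, 2) for grade, n in counts.items()}

print(total)        # 2570
print(proportions)  # {1: 0.17, 2: 0.28, 3: 0.21, 4: 0.14, 5: 0.13, 6: 0.08}
```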

The characters in the corpus were coded into two general categories. The first category consisted of semantic–phonetic compound characters. The second category contained characters that are not semantic–phonetic compound characters or, more simply, nonphonetic characters. Nonphonetic characters were further divided into four subcategories: pictographs, ideographs, semantic compounds, and a miscellaneous category that included several rare types.

Table 2 presents the proportions of different categories of characters that are taught in Grade 1 through Grade 6. The majority of the characters, about 72%, are semantic–phonetic compound characters, and 28% of the characters are not semantic–phonetic compound characters. Table 2 reveals that a large proportion of characters introduced in Grade 1 are nonphonetic characters. These include pictographs and semantic compounds, which are more or less directly meaningful characters. Many of the nonphonetic characters are simple characters that serve as building blocks for semantic–phonetic compound characters introduced in later grades. More than 70% of nonphonetic characters become phonetic or radical components of characters children will learn in higher grades. In Grade 2, already 70% of the characters are semantic–phonetic compounds. The proportion increases with grade level and is more than 80% in Grades 4, 5, and 6.

Table 3 displays the frequency distribution of characters by grade. Frequency is represented in four bands: more than 100 per million, between 10 and 100 per million, between 1 and 10 per million, and less than 1 per million. The overall mean frequency of the characters in the corpus is 370 per million. The mean frequency decreases from 1,129 per million in the first grade to 44 per million in the sixth grade. As can be seen, the proportion of low-frequency characters in the textbook increases with grade, and the proportion of high-frequency characters decreases with grade. Table 3 indicates that children encounter more low-frequency characters in higher grades. Putting the results together, we can see that although children are learning fewer characters in higher grades, the characters are of lower frequency, which presumably makes the task more challenging.
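The overall figures reported for the corpus (about 72% semantic–phonetic compounds and a mean frequency of 370 per million) are consistent with the grade-level values once each grade is weighted by the number of characters it introduces. The check below is an illustrative sketch, not part of the original analysis; the name `weighted_mean` is hypothetical, and because the grade-level values are rounded, agreement with the reported totals is approximate.

```python
def weighted_mean(values, weights):
    """Mean of per-grade values, weighted by characters per grade."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


counts = [436, 709, 541, 358, 323, 203]                  # Table 1
compound_share = [0.45, 0.70, 0.76, 0.84, 0.86, 0.81]    # Table 2, first row
mean_frequency = [1129, 387, 212, 100, 76, 44]           # Table 3, last row

overall_share = weighted_mean(compound_share, counts)
overall_frequency = weighted_mean(mean_frequency, counts)

print(round(overall_share, 2))   # 0.72
print(round(overall_frequency))  # 370
```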

Visual Complexity

The basic units of Chinese characters are strokes.A stroke is written in one continuous movement. A

Table 2

Proportions of Different Types of Characters by Grade

Grade

Type 1 2 3 4 5 6 Total

Semantic-phonetic compounds .45 .70 .76 .84 .86 .81 .72

Pictographs .24 .07 .04 .03 .03 .04 .08

Ideographs .02 .00 .01 .00 .00 .00 .01

Semantic compounds .22 .15 .13 .10 .08 .10 .14

Others .08 .08 .06 .03 .03 .05 .06

Table 3

Frequency Distribution of Characters by Grade

                                                 Grade
Frequency band                      1      2      3      4      5      6  Total
Above 100 per million             .79    .59    .40    .24    .21    .10    .45
Between 10 and 100 per million    .19    .34    .42    .53    .43    .46    .38
Between 1 and 10 per million      .02    .06    .14    .17    .27    .33    .13
Below 1 per million               .00    .01    .04    .06    .10    .12    .04
Mean frequency                  1,129    387    212    100     76     44    370


stroke is usually a line or a dot but it can also involve changes of direction as in . A line can be vertical, horizontal, or diagonal; it can be straight or gently curved; and it can be with or without a “hook” (Taylor & Taylor, 1995). The simple character (wood), for example, has four types of lines in it: vertical, horizontal, left slanting, and right slanting. A dot (or shorter line segment) can also point in various directions, such as the ones in (prosperous).

Visual complexity is usually defined in terms of the number of strokes in a character. The number of strokes of each character in the corpus was determined from the same source from which frequency was determined (Dictionary of Chinese Character Information, 1988). All the characters in the corpus were included in the analysis of visual complexity.

Figure 1 displays the distribution of characters in the corpus as a function of number of strokes. The simplest character in the corpus, , has only 1 stroke; the most complicated character, , has 24 strokes. Although the number of strokes varies greatly, 95% of the characters taught in primary school have fewer than 15 strokes. Most characters have 7 to 12 strokes.

It can be seen from Table 4 that the overall trend is for the visual complexity of characters to increase with grade level. The percentage of visually simple characters (from 1 to 6 strokes) in the textbook decreases with grade; the percentage of visually complex characters (from 13 to 24 strokes) increases with grade. After the first grade, the percentage of characters with 7 to 12 strokes remains relatively stable across grades.

A moving-window method was used to graph the relation between frequency and number of strokes. All of the characters in the corpus were listed in descending order of frequency. The window size was 50. The independent variable was F, the average of the logarithms of the adjusted frequencies of the 50 characters in the window, where the adjusted log frequency of a character is ln(Frequency + 0.5). The dependent variable was S, the average number of strokes of the 50 characters in the window. Then, the window moved down one character, the first character in the window was dropped, a new one was added, and F and S were recalculated.
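The moving-window procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' software: the function name `moving_window`, the `(frequency, value)` pair layout, and the default arguments are assumptions introduced here. Only the window size of 50, the descending frequency order, and the 0.5 adjustment inside the logarithm come from the text.

```python
import math


def moving_window(chars, window=50, offset=0.5):
    """Return one (F, S) point per window position.

    chars  -- list of (frequency_per_million, value) pairs; `value` is the
              number of strokes here, but any per-character score works.
    window -- number of characters averaged at each step (50 in the text).
    offset -- constant added to the frequency before taking the log.
    """
    # List the characters in descending order of frequency.
    ordered = sorted(chars, key=lambda c: c[0], reverse=True)
    points = []
    # Slide the window down one character at a time.
    for start in range(len(ordered) - window + 1):
        win = ordered[start:start + window]
        f = sum(math.log(freq + offset) for freq, _ in win) / window
        s = sum(value for _, value in win) / window
        points.append((f, s))
    return points
```

With N characters and a window of 50, the sketch yields N - 49 points, each of which would correspond to one plotted position on the trend line.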

Figure 2 shows that number of strokes decreases as frequency increases. That is, frequently used characters have fewer strokes and rarely used characters have more strokes. The relation between F and S is strong and linear (R² = .856, F = 14,942.0, p < .001). The relation complements the finding that, although the mean frequency of the characters decreases with grade, mean visual complexity

[Figure 1: bar chart. x-axis: Number of strokes (1 to 24); y-axis: Number of characters (0 to 350). Bar labels for 1 through 24 strokes: 1, 17, 48, 95, 133, 201, 261, 312, 317, 277, 223, 235, 149, 98, 83, 44, 35, 12, 14, 9, 2, 2, 1, 1.]

Figure 1. Distribution of characters as a function of strokes.

Table 4

Visual Complexity of Characters by Grade

                                Grade
Stroke count       1      2      3      4      5      6  Total
1–6 strokes      .45    .20    .15    .10    .07    .11    .19
7–12 strokes     .48    .65    .68    .67    .68    .64    .63
13–24 strokes    .07    .15    .18    .23    .25    .26    .18
Mean strokes    7.37   9.22   9.80  10.29  10.57  10.56   9.45


increases. The moving-window method enabled a representation, not only of the trend line, but also of the variability around the trend line.

Spatial Structure of Characters

Radicals can have different positions within semantic–phonetic compound characters. The radical can be on the left side, such as in ; on the right side, such as in ; on the top, such as in ; at the bottom, such as in ; surrounding a phonetic, such as in ; half surrounding a phonetic, such as in ; or partly or entirely surrounded by a phonetic, such as in . Some radicals always appear in the same position within characters whereas other radicals may appear in two or more positions. In the corpus of school Chinese, 1,060 (57%) semantic–phonetic compound characters have a radical whose position is fixed in every character in which it appears; 790 (43%) characters have a radical that can appear in more than one position.

Phonetic components can also appear in different positions in different characters. For example, /gong/ can be in at least four positions: on the left in /gong/, on the right in /hong/, on the top in /gong/, and at the bottom in /kong/. The phonetic /yuan/ is half surrounded by the radical in /yuan/ and completely surrounded by the radical in /yuan/. Some phonetics can only appear in a particular position; /dang/, for example, is always on the right. Among the 1,491 standard semantic–phonetic compound characters, 83% have phonetics that can appear in different positions, and 17% have phonetics that can only appear in a particular position. To make the situation even more complicated, 14% of standard semantic–phonetic compound characters have phonetics that can serve as radicals in other characters. For example, /feng/ is a phonetic in /feng/ but a radical in /piao/.

Table 5 presents the proportion of semantic–phonetic compound characters with phonetics in different positions. To compile the table, we defined five positions: right, left, top, bottom, and surround. The last of these included characters in which the radical surrounds or half surrounds the phonetic or, less often, in which the phonetic surrounds or half surrounds the radical.

The table indicates that 64% of the characters have phonetics on the right, with smaller proportions in the other positions, which may lead a child to the oversimple impression, “The part on the right tells the pronunciation” (Shu, Anderson, et al., 2000, p. 61). As we see from the /gong/ example, a phonetic can be in different positions in different characters. It is not surprising that the positions of radicals and phonetics are complements of one another because, for instance, if the radical is on the top of a character, the phonetic must be on the bottom.

Phonetic Regularity

Phonetic regularity is defined in terms of the contribution of a phonetic to the pronunciation of a semantic–phonetic compound character. Based on

[Figure 2: moving-window plot. x-axis: Ln Adjusted Frequency (-2 to 10); y-axis: Number of Strokes (0 to 14).]

Figure 2. Number of strokes as a function of frequency.

Table 5

Spatial Structure of Semantic-Phonetic Compound Characters With the Phonetic in Various Positions

Structure            Left–right   Top–bottom   Surround
Proportion                  .72          .18        .10

Phonetic position    Left   Right   Top   Bottom   Surround
Proportion            .07     .64   .09      .10        .10


the classification of a Chinese linguist (Ni, 1982), the semantic–phonetic compound characters in the corpus were divided into seven categories, ranging from the onset, final, and tone being the same as the phonetic to all of them being different. The categories are as follows:

1. The character has the same pronunciation as its phonetic, including the tone (e.g., the character /qing/ and its phonetic /qing/).

2. The character has the same syllable as its phonetic but a different tone (e.g., the character /qing/ and its phonetic /qing/).

3. The character has the same final as its phonetic but a different onset. The tone can be the same (e.g., /jing/ and its phonetic /qing/) or different (e.g., /hong/ and its phonetic /gong/).

4. The character has the same onset as its phonetic but a different final. The tone can be the same (e.g., /jie/ vs. /ji/) or different (e.g., /da/ vs. /ding/).

5. The character is pronounced with a totally different syllable from its phonetic. The tone can be the same (e.g., /cai/ vs. /qing/) or different (e.g., /yan/ vs. /kai/).

6. Either the character or the phonetic has more than one pronunciation. For example, can be pronounced either as /pu/ or /piao/; the phonetic of /du/, , can be pronounced either as /du/ or /duo/.

7. The character lost its original phonetic sometime in the past. For example, the traditional character /ji/ had the phonetic /xi/. But the phonetic was lost when the character was simplified to /ji/.

We defined four types of semantic–phonetic compound characters according to phonetic regularity: regular (Categories 1 and 2), semiregular (Categories 3 and 4), irregular (Category 5), and others (Categories 6 and 7). The phonetic in a regular compound provides full information about pronunciation of the syllable (although the tone may vary). The phonetic in a semiregular compound provides partial information. The phonetic in an irregular compound provides no useful information. It is hard to judge information provided by a phonetic in a compound classified “other” because the information is either ambiguous or has been lost. For this reason, this type was not analyzed further.
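Categories 1 through 5 depend only on whether the onset, final, and tone of a character match those of its phonetic, so the decision rule can be written down directly. The sketch below is illustrative only: the function `ni_category`, the `(onset, final, tone)` tuple encoding, and the two boolean flags standing in for Categories 6 and 7 are assumptions introduced here, not part of Ni's (1982) scheme as published.

```python
def ni_category(char, phonetic, ambiguous=False, phonetic_lost=False):
    """Assign a compound character to one of the seven categories.

    char, phonetic -- (onset, final, tone) tuples, e.g. ("h", "ong", 2).
    ambiguous      -- True if either item has more than one pronunciation.
    phonetic_lost  -- True if the original phonetic no longer survives.
    """
    if phonetic_lost:
        return 7
    if ambiguous:
        return 6
    same_onset = char[0] == phonetic[0]
    same_final = char[1] == phonetic[1]
    same_tone = char[2] == phonetic[2]
    if same_onset and same_final:
        # Same syllable: tone decides between Categories 1 and 2.
        return 1 if same_tone else 2
    if same_final:
        return 3  # same final, different onset
    if same_onset:
        return 4  # same onset, different final
    return 5      # totally different syllable


# The four types defined in the text, keyed by category.
TYPE_BY_CATEGORY = {
    1: "regular", 2: "regular",
    3: "semiregular", 4: "semiregular",
    5: "irregular",
    6: "other", 7: "other",
}
```

For instance, with hypothetical tone numbers, `ni_category(("h", "ong", 2), ("g", "ong", 1))` returns 3, which `TYPE_BY_CATEGORY` maps to "semiregular", matching the /hong/ and /gong/ example in Category 3.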

Table 6 presents the proportions of characters in Ni’s (1982) categories and our types of semantic–phonetic compound characters. The largest percentage is regular characters (39%), followed by semiregular (26%), other (20%), and irregular (15%) characters. Zhou (1978) estimated that 82% of the characters in modern Chinese are semantic–phonetic compounds. The percentage in primary school is lower, only 72%, because a lot of characters taught in the early grades are nonphonetic characters. Zhou (1980) estimated that the phonetics in about 40% of the semantic–phonetic compound characters have

Table 6

Proportions of Different Types and Categories of Semantic-Phonetic Compound Characters

Type          Regular   Semiregular   Irregular   Other
Proportion        .39           .26         .15     .20

Category        1     2     3     4     5     6     7
Proportion    .23   .16   .20   .06   .15   .14   .06

Table 7

Proportions of Regular, Semiregular, and Irregular Semantic-Phonetic Compound Characters by Grade

                            Grade
Type               1      2      3      4      5      6
Regular          .32    .38    .41    .42    .41    .43
Semiregular      .25    .24    .28    .24    .27    .30
Irregular        .14    .15    .15    .17    .17    .12
Other            .29    .23    .16    .17    .16    .15


the same syllable as the character. The percentage dropped to 26% when the tone has to be the same. Table 6 shows that among the characters taught in primary school, phonetics in 39% of the semantic–phonetic compound characters provide full information about syllables, which is very close to Zhou’s estimate.

Table 7 indicates that the proportion of regular characters increases until Grade 3 and then remains relatively stable. In addition, there are more regular characters than semiregular characters in each grade. The proportion of semiregular and irregular characters remains relatively stable across grades. Children in the first and second grade are introduced to a lot of characters in the “other” category. As the proportions of regular and semiregular characters increase, and the proportion of “other” characters declines, children have more chance to become aware of the function of a phonetic.

To model a child’s language experience more accurately, we present frequency-weighted proportions of different types of compounds in Table 8. The general pattern is similar to that in Table 7. Table 8 also shows that in most grades, the frequency-weighted proportions of regular and semiregular characters are lower than the unweighted proportions in Table 7. On the other hand, the frequency-weighted proportions of “other” compounds are higher than the unweighted proportions. This is because, as we discuss later with respect to the moving-window figure, the more regular characters tend to be of lower frequency. Averaging frequency-weighted proportions over grades, we again found that compared with the unweighted proportions in Table 7, the proportions of regular and semiregular compounds are lower, the proportion of irregular compounds remains the same, and the proportion of “other” compounds is much higher. The findings indicate that when we take frequency into consideration, the Chinese characters learned in primary school are even less regular.
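The difference between unweighted and frequency-weighted proportions can be made concrete with a short Python sketch. The text does not spell out the weighting formula, so the sketch assumes that each character contributes its frequency per million instead of a count of 1; the function name `proportions` and the `(type_label, frequency)` input layout are likewise assumptions introduced for illustration.

```python
from collections import defaultdict


def proportions(chars, weighted=False):
    """Return the share of each regularity type.

    chars    -- list of (type_label, frequency_per_million) pairs.
    weighted -- if True, each character counts by its frequency
                (assumed weighting); otherwise each counts once.
    """
    totals = defaultdict(float)
    for label, freq in chars:
        totals[label] += freq if weighted else 1.0
    grand_total = sum(totals.values())
    return {label: t / grand_total for label, t in totals.items()}
```

Under this assumption, a few very frequent characters of one type can dominate the weighted proportions even when that type is a minority by count, which is the pattern described for “other” compounds.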

The moving-window method was used to graph the relation between phonetic regularity and frequency. The independent variable was F, the mean log-adjusted frequency of the 50 characters in the window. The dependent variable was R, the average of the phonetic values of the 50 characters in the window. Characters in Categories 1 and 2 were coded 1. Characters in the remaining categories were coded 0. The characters were listed in descending order of frequency, and F and R were recalculated as the window moved down one character at a time.

Figure 3 indicates that, in general, phonetic regularity decreases as frequency of usage increases. That is, characters of lower frequency tend to be more regular. The trend can be considered linear (R² = .420, F = 1,303.1, p < .001), although there was quite a bit of variability throughout the range. Our results, produced by the moving-window method, agree with Perfetti et al. (1992) but provide a more detailed picture of the variation of phonetic regularity in Chinese. The fact that characters of lower frequency tend to be more regular makes it possible for children to understand phonetic regularity among Chinese characters as their grade level

Table 8

Frequency-Weighted Proportions of Regular, Semiregular, and Irregular Semantic-Phonetic Compound Characters by Grade

                            Grade
Type               1      2      3      4      5      6  Total
Regular          .22    .29    .37    .40    .49    .34    .29
Semiregular      .14    .26    .20    .22    .23    .28    .20
Irregular        .07    .21    .20    .18    .17    .09    .15
Other            .57    .24    .22    .21    .11    .28    .37

[Figure 3: moving-window plot. x-axis: Ln Adjusted Frequency (-2 to 10); y-axis: Proportion Phonetically Regular (0.0 to 1.0).]

Figure 3. Phonetic regularity as a function of frequency.


increases and as they become more experienced with print. First graders have only limited opportunities to acquire insight into the function of phonetics because 55% of the characters in the first grade are nonphonetic. Nevertheless, the first year lays the foundation for subsequent learning, because 64% of the nonphonetic characters introduced in the first year become phonetics of semantic–phonetic compound characters learned in later years.

Semantic Transparency

Semantic transparency is defined in terms of the contribution of a radical to the meaning of a semantic–phonetic compound character. The more semantic information a radical provides, the more transparent is the character. We coded the semantic–phonetic compound characters in the corpus into eight semantic categories based on the semantic relation between the radicals and compound characters containing the radicals. The categories are explained as follows:

1. The character has exactly the same meaning as its radical. For example, both the character and its radical mean mouth.

2. The character belongs to the category that its radical represents. For example, (mother) is a (female).

3. The meaning of the character is directly related to the meaning of its radical. For example, the character (cabinet) is related to the radical (wood).

4. The meaning of the character is indirectly related to the meaning of its radical. For example, the character (float) has something to do with its radical (water).

5. The extended meaning of the character is directly related to the meaning of its radical. For example, the original meaning of the character is visit. The extended meaning is invitation, which is directly related to the meaning of the radical (say).

6. The extended meaning of the character is indirectly related to the meaning of its radical. For example, the original meaning of is toe of a cock. It is extended to mean distance. Its radical means foot. Therefore, the extended meaning is indirectly related to the meaning of its radical.

7. The meaning of the character is unrelated to the meaning of its radical. For example, the character (soft) is unrelated to its radical (vehicle).

8. It is difficult to define the radical of a character because of simplification or other reasons (e.g., [crowd]).

We defined four types of semantic–phonetic compound characters according to semantic transparency: transparent (Categories 1, 2, and 3), semitransparent (Categories 4, 5, and 6), opaque (Category 7), and other (Category 8). A radical in a transparent compound provides reliable information about meaning. A child can predict the meaning or important aspects of the meaning of such a compound from its radical. A radical in a semitransparent compound is less helpful but still informative to a certain extent. A radical in an opaque compound provides no useful information. In a compound classified as other, the information contained in a radical has been lost.

The proportions of the four types and eight categories of semantic–phonetic compound characters are displayed in Table 9. Overall, 88% of semantic–phonetic compound characters have radicals that are more or less informative. The information, however, is far from perfect: Only in 1 of 100 semantic–phonetic compound characters can a child obtain the precise meaning of a character from its radical. For most characters, a radical provides partial information.

Table 9

Proportions of Different Levels of Transparency of Semantic-Phonetic Compound Characters

Level of transparency

Type          Transparent   Semitransparent   Opaque   Other
Proportion            .58               .30      .09     .04

Semantic category    One   Two   Three   Four   Five   Six   Seven   Eight
Proportion           .01   .33     .25    .19    .05   .06     .09     .04


Table 10 shows that the proportion of transparent characters increases with grade. The proportions of semitransparent and opaque characters increase a little bit from Grade 1 to Grade 4 but become smaller again in Grades 5 and 6. The proportion of “other” characters decreases from Grade 1 and is very small in higher grades. In general, children learn more transparent characters and fewer opaque and “other” characters as grade level increases.

Table 11 presents frequency-weighted proportions of different types of semantic compounds by grade. The general pattern is similar to that in Table 10. We found that when weighted by frequency, the proportions of transparent characters became lower. The reason is, as we discuss later, that the more transparent characters tend to be of lower frequency. Moreover, the proportions of “other” characters were much higher because these characters had much higher frequencies compared with other types of characters. This reflects the fact that high-frequency compounds are those that were simplified and turned into “other” characters. Taken together, the findings suggest that the characters become less transparent when taking frequency into consideration.

The moving-window method was used to graph the relation between semantic transparency and frequency. The independent variable was mean log-adjusted frequency of the 50 characters in the window, as defined earlier. The dependent variable was the mean of the semantic values of the 50 characters in the window. Transparent characters were coded 1 because their radicals provided reliable and straightforward information. Semitransparent, opaque, and “other” semantic–phonetic compound characters were coded 0. The moving average of semantic transparency was obtained in the same way as described in the previous sections. Figure 4 shows that semantic transparency rises as frequency of usage decreases. Compared with phonetic regularity, semantic transparency is more strongly related to frequency (R² = .726, F = 4,776.1,

Table 10

Proportions of Semantic-Phonetic Compound Characters With Different Levels of Semantic Transparency by Grade

                                Grade
Type                   1      2      3      4      5      6
Transparent          .52    .56    .58    .55    .67    .65
Semitransparent      .29    .30    .31    .34    .25    .27
Opaque               .09    .09    .10    .10    .08    .05
Other                .11    .04    .02    .02    .01    .03

Table 11

Frequency-Weighted Proportions of Semantic-Phonetic Compound Characters With Different Levels of Semantic Transparency by Grade

                                Grade
Type                   1      2      3      4      5      6  Total
Transparent          .32    .44    .51    .42    .46    .53    .40
Semitransparent      .18    .30    .32    .41    .45    .40    .27
Opaque               .10    .16    .12    .16    .08    .03    .12
Other                .41    .09    .05    .01    .01    .04    .21

[Figure 4: moving-window plot. x-axis: Ln Adjusted Frequency (-2 to 10); y-axis: Proportion Semantically Transparent (0.0 to 1.0).]

Figure 4. Semantic transparency as a function of frequency.


p < .001), although here, too, there is considerable variability around the trend line. The figure suggests that children have a better chance of becoming aware of the semantic function of radicals as they encounter a greater number of less frequent characters.

Independent and Bound Components

We defined two types of components in our analysis: independent and bound. An independent phonetic is a character in its own right with its own pronunciation and meaning (e.g., /ma/ [horse]). Although most bound phonetics were independent characters in ancient times, they have neither pronunciation nor meaning in modern Chinese and have to be attached to radicals. For example, the bound phonetic cannot stand on its own. Nevertheless, bound phonetics can still provide useful information. Characters with the same bound phonetic are often related in pronunciation; for example, , , , , and are all pronounced /feng/.

Because the phonetics of characters in Phonetic Categories 6 and 7 are either lost or obscure, we only analyzed the phonetics of characters in Categories 1 to 5. For similar reasons, we only analyzed the radicals of characters in Semantic Categories 1 to 6. Of the 650 phonetics in Phonetic Categories 1 to 5, 90% are independent phonetics and only 10% are bound phonetics. Similarly, 92% of the compound characters have independent phonetics whereas 8% have bound phonetics. Because only a small percentage of characters have bound phonetics, it is probably difficult for a child to become aware of the information provided by a bound phonetic.

The mean frequency of characters with bound phonetics is 196 per million, whereas the mean frequency of independent phonetics is 730 per million. That is, most independent phonetics are high-frequency characters that a child learns in lower grades. Consequently, when a child encounters an unfamiliar compound with an independent phonetic, the phonetic is likely to be familiar and may provide some useful information about pronunciation.

There are also independent and bound radicals. An independent radical has both meaning and pronunciation (e.g., /mu/ [wood]). Unlike a bound phonetic, a bound radical has meaning but not pronunciation. For example, means water. Characters in Semantic Categories 1 to 6 have 128 radicals. Of these radicals, 73% are independent characters and 27% are bound forms. Although there are fewer bound radicals, they occur in characters just as frequently as independent radicals: In the corpus 50% of the compound characters have bound radicals. Consequently, children may be familiar with bound radicals although these bound radicals are not independent characters.

In sum, variability in position and function increases the complexity of both phonetics and radicals. Because of the complexity, it may be difficult for children to identify a phonetic or radical. Once a phonetic or radical has been identified, children are faced with the additional task of making use of the information it contains. It takes time and effort to develop phonetic and radical awareness.

Phonetic and Radical Families

Semantic–phonetic compound characters with the same phonetic can be considered members of a phonetic family. The family also includes the phonetic when it is an independent character. An example of a family in the corpus is /ban/, /ban/, /ban/, /ban/, and /pan/. The simple character /ban/ is the shared phonetic. Characters within a family can be further divided into subgroups according to their pronunciations. Those with the same pronunciation are in the same subgroup, regardless of tone. The family in the example contains two subgroups: One includes /ban/, /ban/, /ban/, and /ban/; and the other consists of /pan/.

Phonetic consistency is the degree of congruence in the pronunciations of the characters within a family. Tzeng, Zhong, Hung, and Lee (1995) defined the consistency of a character as the ratio of the number of the characters in its subgroup to the number of characters in the whole family. According to this definition, the consistency of /ban/ is 4:5, because it has three other members in its subgroup. In contrast, the consistency of /pan/ is 1:5 because it is the only character in the family pronounced this way. Because this definition does not take into account frequency, it may not accurately capture a child’s experience with language (Tzeng, 2001). As a matter of fact, the frequency of /pan/ is slightly higher than the sum of the frequencies of /ban/, /ban/, /ban/, and /ban/. If frequency has a significant impact on character learning, when children encounter an unfamiliar character with /ban/ as the phonetic, they will be more likely to predict the pronunciation of this character as /pan/ rather than /ban/. To overcome this weakness, we calculated the frequency-weighted consistency of a character, which is the ratio of the total frequency of its subgroup to


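the total frequency of the family. The frequency-weighted consistency of /ban/ is $(F_{ban,1} + F_{ban,2} + F_{ban,3} + F_{ban,4})/F_{\text{total}} = .49$, where the four terms in the numerator are the frequencies of the four family members pronounced /ban/, whereas the frequency-weighted consistency of /pan/ is $F_{pan}/F_{\text{total}} = .51$. This definition may be more representative of a child’s language experience.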
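The two definitions of consistency can be compared side by side in a short Python sketch. The function name `consistency` and the `(syllable, frequency)` input layout are assumptions made for illustration, and the frequencies in the usage note are hypothetical values chosen only so that the ratios round to the .49 and .51 reported above; they are not the corpus frequencies.

```python
def consistency(family, target, weighted=False):
    """Consistency of the subgroup pronounced `target` within a family.

    family   -- list of (toneless_syllable, frequency) pairs, one per
                character in the phonetic family.
    target   -- the syllable whose subgroup is being scored.
    weighted -- False: ratio of character counts (Tzeng et al., 1995);
                True: ratio of summed frequencies.
    """
    def weight(freq):
        # Each character counts once unless frequency weighting is on.
        return freq if weighted else 1

    subgroup = sum(weight(f) for syl, f in family if syl == target)
    total = sum(weight(f) for _, f in family)
    return subgroup / total
```

With the hypothetical family `[("ban", 25.0), ("ban", 20.0), ("ban", 5.0), ("ban", 3.0), ("pan", 55.0)]`, the unweighted consistency of /ban/ is 4/5, whereas the weighted value is 53/108, or about .49, so the single high-frequency /pan/ character outweighs the four /ban/ characters.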

Consistency is a dynamic concept. It changes as a child adds new characters to a family. In the foregoing example, a child in Grade 2 knows only two of the characters, /ban/ and /ban/. Because the characters have the same pronunciation, the consistency of each character is 1.00 at this point. A third character, /pan/, is added in Grade 4. In Grade 5, the child learns two more characters, /ban/ and /ban/, and the number of characters in the family increases to five. As a result, the consistency of /ban/ is 1.00 in Grade 2, .45 in Grade 4, and .49 in Grade 5.
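The grade-by-grade change can be traced by recomputing frequency-weighted consistency over only those family members a child has met so far. The sketch below is a hedged illustration: `consistency_by_grade` and the `(syllable, frequency, grade_introduced)` layout are assumed names, and the frequencies in the usage note are hypothetical values chosen to reproduce the reported 1.00, .45, and .49.

```python
def consistency_by_grade(family, target, grades):
    """Frequency-weighted consistency of `target` at each grade.

    family -- list of (syllable, frequency, grade_introduced) triples.
    target -- the toneless syllable whose subgroup is scored.
    grades -- iterable of grade levels to evaluate.
    """
    result = {}
    for grade in grades:
        # Only characters introduced up to this grade are known.
        known = [(syl, f) for syl, f, intro in family if intro <= grade]
        if not known:
            continue  # no family members learned yet
        total = sum(f for _, f in known)
        subgroup = sum(f for syl, f in known if syl == target)
        result[grade] = subgroup / total
    return result
```

For the hypothetical family `[("ban", 25.0, 2), ("ban", 20.0, 2), ("pan", 55.0, 4), ("ban", 5.0, 5), ("ban", 3.0, 5)]`, evaluating `target="ban"` at Grades 2, 4, and 5 gives 1.00, .45, and approximately .49, the same trajectory as in the text.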

Table 12 displays the distribution of sizes of phonetic families in the corpus. Altogether, 563 families are in the corpus. Family size ranges from 2 to 12. If a character is the only one that its phonetic forms in the corpus, it was excluded from our analysis because an “orphan” does not qualify for our definition of family. There are 148 orphan characters. Most families contain 2 to 4 characters and very few families have 7 or more characters. The mean and median family sizes are 3.23 and 3.00 characters, respectively. Characters in Categories 7 and 8 are excluded in all the analyses of phonetic families because the relation between the phonetic and the character is indeterminate.
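Grouping characters into families and setting aside orphans amounts to a simple group-by operation. The Python sketch below shows one way it might be done; the function `family_sizes` and the dictionary input that maps each character to its phonetic are assumptions for illustration. An independent phonetic that counts as a family member would need to appear in the mapping as a key pointing to itself.

```python
from collections import defaultdict
from statistics import mean, median


def family_sizes(char_to_phonetic):
    """Group characters by phonetic; return (family sizes, orphan count).

    char_to_phonetic -- dict mapping a character id to its phonetic id.
    A group with a single member is an orphan and is not a family.
    """
    families = defaultdict(list)
    for char, phonetic in char_to_phonetic.items():
        families[phonetic].append(char)
    sizes = [len(members) for members in families.values() if len(members) >= 2]
    orphans = sum(1 for members in families.values() if len(members) == 1)
    return sizes, orphans


def summarize(sizes):
    """Mean and median family size, as reported alongside Table 12."""
    return mean(sizes), median(sizes)
```

Applied to a full character-to-phonetic mapping, `summarize` would yield the kind of mean and median family sizes reported in the text, and the orphan count would correspond to the characters excluded from the family analyses.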

A “complete family,” such as the one involving /ban/, contains the phonetic as a simple character as well as semantic–phonetic compound characters. Not all families are complete. The phonetics of some semantic–phonetic compound characters are missing because they are not taught in primary school, whereas some families have bound phonetics. In the family /bei/, /pai/, and /pi/, for example, the phonetic /bei/ is missing. About 80% of the families in the corpus are complete. The rest are incomplete families, whose phonetics are either missing or bound.

Table 13 presents the number of families, mean family size, and mean frequency-weighted consistency across different grade levels. Mean frequency-weighted consistency is the mean of consistency values of all the characters in the phonetic families in each grade. As grade level increases, both number of families and mean family size increase. In contrast, mean consistency decreases from Grades 1 to 6. The decrease in mean consistency is mainly due to the increase in family size.

We defined a radical family in a similar way to a phonetic family. A radical family consists of all the characters that share the same radical. A family also includes the radical when it is an independent character. About 64% of the radical families are complete. The remaining 36% have radicals that are either bound or not taught in primary school. It is interesting that the most productive radicals tend to be bound: Eight of the 10 largest families have bound radicals, such as (hand), (water), (person), and so on.

Figure 5 presents the cumulative percentages of phonetic and radical families as a function of family size. The range of the sizes of radical families is much larger than that of phonetic families. Whereas the largest phonetic family has 12 members, the largest radical family, the one with the (hand) radical, has 135 members. The graph makes it easy to see that phonetic families are small whereas radical families vary widely in size.

Table 14 shows the distribution of radical families by grade. The trend is similar to phonetic families.

Table 12

Distribution of Phonetic Families

Size            2     3     4     5     6     7     8     9   10–12
Proportion    .46   .23   .15   .06   .03   .02   .02   .02     .01

Table 13

Distribution of Phonetic Families by Grade

                                   Grade
Statistic                 1      2      3      4      5      6
Number of families       56    210    353    456    518    563
Mean family size       2.21   2.69   2.86   2.96   3.16   3.23
Mean consistency        .73    .64    .63    .62    .62    .61


Both the number of radical families and mean family size increase with grade level. Children add new characters to a family as their vocabulary increases. Radical families are much larger than phonetic families. An average phonetic family has only 3.23 members, whereas an average radical family has 14.99 members. Accordingly, the number of radical families is much smaller. There are only 124 radical families in the corpus, about one fifth of the number of phonetic families. There are fewer radical families than phonetic families in each grade. The analysis reveals that radicals form many more semantic compound characters than phonetics do on average, and there are fewer radical families than phonetic families.

Discussion

A major finding of the present analysis was that the visual complexity, phonetic regularity, and semantic transparency of the Chinese characters taught in elementary school increase from the early grades to the later grades. That is, characters introduced in the first or second grade typically contain fewer strokes but are less likely to be regular or transparent than characters introduced in the fifth or sixth grade. The inverse relation holds when characters are stratified by frequency. Low-frequency characters tend to be visually complex, phonetically regular, and semantically transparent, whereas high-frequency characters tend to be the opposite. The findings regarding phonetic regularity replicate our earlier study of characters (Shu et al., 1998).

Generally speaking, findings from the present analysis of the characters in school Chinese are consistent with the findings from analyses of large samples of characters found in unrestricted adult reading material. Kang (1993), Zhou (1978), and Perfetti et al. (1992) found that 81%, 82%, and 84%, respectively, of characters are semantic–phonetic compounds. As indicated earlier, the percentage in school Chinese is lower, only 72% in total, because a lot of the characters taught in the early grades are nonphonetic characters. However, more than 80% of the characters listed to be taught in Grades 4, 5, and 6 are semantic–phonetic compounds (see Table 2), which is virtually identical to the percentages reported by investigators examining characters from adult reading material.

With respect to phonetic regularity, Zhou (1980) concluded that 26% of semantic–phonetic compound characters are perfectly regular and that the figure rises to 40% when tone-different characters are

0

10

20

30

40

50

60

70

80

90

100

0 10 20 30 40 50 60 70 80 90 100 110 120 130 140

Family Size

Cum

ula

tive P

erc

enta

ges

Phonetic family

Radical family

Figure 5. Cumulative percentages of radical and phonetic families.

Table 14

Distribution of Radical Families by Grade

Grade

1 2 3 4 5 6

Number of families 48 85 103 110 120 124

Mean family size 4.02 8.12 10.80 12.82 14.13 14.99

Properties of School Chinese 41

Page 16: Properties of school chinese,implications for learning to read 4

counted as regular. The comparable percentages inschool Chinese are 23% and 39% (see Table 6), ornearly the same. However, Li and Kang (1993) andYin (1991) reported a considerably higher level ofregularity in characters from adult reading material.According to Li and Kang, 38% of the semantic–phonetic compounds have exactly the same pronun-ciation as their phonetics. An additional 18% havethe same syllable but a different tone. Therefore, 56%of the characters are regular in their analysis, whichis substantially higher than the 39% in schoolChinese. In Li and Kang’s analysis, 24% of thecharacters are semiregular, 20% are irregular. Thepercentages in school Chinese are 26% and 15%,respectively.

One reason for the discrepancy between Li and Kang’s (1993) analysis and our analysis is that, especially in the lower grades, school Chinese contains more high-frequency, and therefore less regular, characters than does general adult reading material. In the sixth-grade textbook, 43% of the compound characters are regular and another 30% are semiregular, which are closer to the figures reported by Li and Kang (1993). Another reason for the discrepancy is the different treatments of characters and phonetics that have more than one pronunciation. We put these characters in an “other” category. When a phonetic has more than one pronunciation, Li and Kang considered only the pronunciation that is closest to the pronunciation of the character. They counted the different pronunciations of characters with more than one pronunciation as separate characters. These moves allowed Li and Kang to minimize the “other” category and consequently increased the percentages of characters classified as regular, semiregular, and irregular. Further analysis of the characters in school Chinese that were placed in the “other” category might have the same consequence. But it is not clear whether children can use the phonetic information in these “other” characters.
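The arithmetic behind this comparison can be laid out in a short sketch. The percentages are those quoted in the text; the 16% tone-different figure for school Chinese is simply 39% minus 23%. The function name `summarize` and the reading of the unclassified remainder as the “other” category are assumptions made for illustration.

```python
# Percentages of semantic-phonetic compounds, as quoted in the text.
LI_KANG = {"regular": 38, "tone_different": 18, "semiregular": 24, "irregular": 20}
# School Chinese: tone_different is derived as 39 - 23 (assumption stated above).
SCHOOL = {"regular": 23, "tone_different": 16, "semiregular": 26, "irregular": 15}


def summarize(percentages):
    """Return the broadly regular share, the classified share, and the remainder."""
    broad = percentages["regular"] + percentages["tone_different"]
    classified = broad + percentages["semiregular"] + percentages["irregular"]
    return {
        "broadly_regular": broad,
        "classified": classified,
        "remainder": 100 - classified,  # presumably the "other" category
    }


for label, data in (("Li and Kang", LI_KANG), ("School Chinese", SCHOOL)):
    print(label, summarize(data))
```

On these figures, Li and Kang’s categories account for all compounds, whereas the school Chinese categories leave a remainder of about one fifth, which is consistent with the suggestion that the treatment of the “other” category contributes to the discrepancy.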

Several studies have examined the semantic transparency of semantic–phonetic compound characters. Kang (1993) reported that less than 1% of the compounds in adult reading material are transparent, 86% are semitransparent, and 13% are opaque. On their face, these results seem to contradict the results reported here, but if Kang’s definitions of transparency were applied, 1% of the compound characters in school Chinese would be classified as transparent, 87% as semitransparent, and 13% as opaque or “other.” Therefore, independent of exactly where the line between transparent and semitransparent is drawn, the underlying proportions of characters with different levels of transparency are similar in adult and primary school texts. According to Wang’s (1997) analysis of 2,500 compound characters, the radical is directly related to character meaning in 79% of characters, indirectly related to character meaning in 4%, and totally unrelated to character meaning in 17%. In this case we cannot attempt a close reconciliation, but the percentages are at least roughly comparable with those in school Chinese.

Kang (1993) also investigated the spatial structure of semantic–phonetic compound characters. His results show that 74% of compounds have a left–right structure, 18% have a top–bottom structure, and the remaining 8% have some sort of surround structure. The proportions in school Chinese are 72%, 18%, and 10%, respectively. Again, the percentages are similar for adult and primary school texts.

Most authorities assume that Chinese is difficult to learn to read because information about pronunciation is represented less systematically than in other orthographies (Hoosain, 1991; I. Taylor, 2001; I. Taylor & M. M. Taylor, 1995). The present analysis gives no reason to challenge this assumption. First, the pronunciation cues within characters are complicated. There are 650 phonetic components in the standard phonetic compound characters in school Chinese. This total does not include the 370 “other” compound characters with obscure or ambiguous pronunciation cues or the 720 nonphonetic characters that contain no cues. Second, the productivity of phonetic components is low. Nearly half (46%) of the independent phonetics appear in only one compound character in the corpus. Only a few (10%) appear in more than four characters. The median size of the phonetic families in the corpus is just three characters. Third, the regularity of pronunciation of compound characters is low. Only 23% of compound characters are perfectly regular, whereas an additional 16% are regular except for tone. Fourth, the pronunciation of characters with the same phonetic is often inconsistent. Depending on the grade, the average consistency in pronunciation of characters within phonetic families ranges from 61% to 73%. Finally, other complexities may interfere with identifying and using pronunciation cues. Among the standard semantic–phonetic compound characters in the corpus, 83% have a phonetic component that can appear in different positions within a character; 14% have phonetics that can serve as radicals in other characters; and 8% have bound phonetics without determinate, independent pronunciations.

This summary leads to the expectation that mastery of the pronunciation cues in characters is slow to develop, especially because pronunciation cues are less available and less reliable in the characters introduced in first and second grades than in later grades. The fact that pronunciation cues are complicated and unreliable leads to the expectation of large effects of familiarity and frequency. Next, we review research on learning to pronounce Chinese characters to determine whether findings fit expectations derived from the analysis of the corpus.

Three recent studies have examined children’s character pronunciation as a function of phonetic regularity and frequency or familiarity. Ho and Bryant (1997) presented familiar semantic–phonetic compound characters to Hong Kong first and second graders. Both frequency and regularity affected pronunciation of the characters, but the effect of frequency was larger. Ho and Bryant also presented a pseudo-character naming task. Pseudo-characters were novel combinations of familiar radicals and familiar phonetics in their legal positions. A pronunciation was considered to be correct if it was the same as the phonetic component alone or the same as a real character that contained the phonetic component. Pseudo-character naming was found to correlate with pronunciation of both compound characters and two-character words. This implies that children who are aware of and attempt to use the information in the phonetic component make better progress in learning to read.

In the second study of this type, Chan and Siegel (2001) asked Hong Kong children in the first through sixth grades to read aloud a list of characters, about 80% of which were semantic–phonetic compounds. Frequency had a powerful effect on performance, especially the performance of poor readers. Young normal readers made more phonetic-related errors and had significantly higher scores on a pseudo-character naming task than young poor readers. These findings imply that better readers try to use the information in the phonetic component.

In a third recent study examining phonetic regularity and familiarity, Shu, Anderson, et al. (2000) had Beijing second, fourth, and sixth graders represent the pronunciation of semantic–phonetic compound characters by writing the pinyin. Both regularity and familiarity (whether the character had been taught in school) influenced performance, but the effect of familiarity was considerably stronger. Familiarity had a greater influence in second grade than in fourth or sixth grades. On the other hand, regularity had a greater influence in fourth and sixth grades than in second grade, which suggests that older students were better able to use the information in the phonetic component. As in the Chan and Siegel (2001) study, the proportion of phonetic-related errors was higher among older and more able students, which provides converging evidence that these students try to use phonetic information.

Research has addressed whether children’s ability to pronounce characters is affected by some of the specific complications of the Chinese writing system. Two studies have examined compound characters that are pronounced with a different tone than their phonetics. Ho and Bryant (1997) found that completely regular compound characters (same onset, final, and tone as the phonetic) were read significantly better than tone-different characters (same onset and final, but different tone), which in turn were read significantly better than irregular characters (different onset, final, and tone). Pronunciation of tone-different characters was conditioned by frequency. High-frequency tone-different characters were read as well as regular characters, whereas low-frequency tone-different characters were read no better than irregular characters. From this Ho and Bryant concluded, “Only from repeated exposure to [tone-different] characters were . . . [children] able to associate the sound of the phonetic component with that of the characters as a whole” (p. 283).

Children’s ability to use information provided by the phonetic component in less than completely regular compound characters was also investigated by Anderson, Li, Ku, Shu, and Wu (in press). In two study-test trials, second and fourth graders were taught and tested on the pronunciation of four types of unfamiliar characters: regular, tone different, onset different (same final and tone, different onset from phonetic), and phonetic unknown (unknown to children of a specific age). Compared with phonetic-unknown characters, children learned and remembered the pronunciation of not only more regular characters, but also significantly more tone-different and onset-different characters. Performance on regular characters was significantly better than performance on tone-different characters, which in turn was significantly better than performance on onset-different characters. The information in onset-different characters was less accessible to children, although the contrast of onset-different characters with phonetic-unknown characters was statistically significant.
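The regularity categories used in these studies can be stated as simple comparisons of onset, final, and tone between a character and its phonetic. The sketch below is one way to encode those definitions; the `Syllable` type, the `classify` function, and the catch-all “other” label for combinations the studies do not name are assumptions introduced for illustration. The three example characters sharing the phonetic 青 (qing1) are illustrative and are not drawn from the stimulus lists of the studies.

```python
from typing import NamedTuple


class Syllable(NamedTuple):
    """A Mandarin syllable split into onset, final, and tone (hypothetical type)."""
    onset: str
    final: str
    tone: int


def classify(character: Syllable, phonetic: Syllable) -> str:
    """Label a compound character by how its pronunciation relates to its phonetic."""
    same_onset = character.onset == phonetic.onset
    same_final = character.final == phonetic.final
    same_tone = character.tone == phonetic.tone
    if same_onset and same_final and same_tone:
        return "regular"
    if same_onset and same_final:
        return "tone-different"
    if same_final and same_tone:
        return "onset-different"
    if not (same_onset or same_final or same_tone):
        return "irregular"
    return "other"  # combinations not named in the studies (assumption)


QING = Syllable("q", "ing", 1)                    # phonetic 青
print(classify(Syllable("q", "ing", 1), QING))    # 清 -> regular
print(classify(Syllable("q", "ing", 3), QING))    # 请 -> tone-different
print(classify(Syllable("j", "ing", 1), QING))    # 精 -> onset-different
```

Ordering the checks from most to least overlap mirrors the ordering of performance reported above: regular, then tone-different, then onset-different characters.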

Another important finding in Anderson et al. (in press) was that students from Beijing were better able to learn the pronunciations of unfamiliar characters than students from Guangzhou. There were differences between Beijing and Guangzhou students on regular, tone-different, and onset-different characters, but not on phonetic-unknown characters. Children from Guangzhou speak Cantonese until they enter immersion Mandarin programs in first grade. From then on, they receive instruction in all subjects in Mandarin. Children from Beijing remain monolingual throughout the primary school period. Although they are called dialects of Chinese, Cantonese and Mandarin are really mutually unintelligible languages. Presumably, children from Guangzhou were less able to use phonetic information to learn Mandarin pronunciations because of phonological interference from Cantonese.

Children are influenced not only by the regularity but also by the consistency of semantic–phonetic compound characters. Tzeng et al. (1995) asked third- and sixth-grade Taiwanese children to read three types of pseudo-characters: characters with only regular neighbors, characters with both regular and irregular neighbors, and characters with only irregular neighbors. A response was considered regular if a pseudo-character was read like its phonetic component. They found that children made more regular responses in the regular-only condition than in the irregular-only condition. The results suggest that children do not simply name the phonetic component. Good readers, especially, also take into consideration the pronunciations of other characters that share the same phonetic component.

The development of consistency awareness was investigated by Shu, Zhou, and Wu (2000) among fourth-, sixth-, and eighth-grade children and college students from Beijing. The task was to judge whether a pair of characters that shared the same phonetic had the same pronunciation. One character in the pair was familiar to children and the other was unfamiliar. There were three types of pairs. In the first type, the characters were regular and had a consistent phonetic; that is, all the characters with this phonetic had the same pronunciation. In the second type, the familiar character was regular but the phonetic was inconsistent. In the third type, the familiar character was neither regular nor consistent. The results showed that as children grew older, they were more likely to make a “no” judgment to the second and third types of pairs, which suggests that they were developing awareness of phonetic consistency. Moreover, awareness of consistency kept developing from fourth grade to college. This is not surprising considering that our analysis shows that the average phonetic family size is only 2.96 in the fourth grade and 3.23 in the sixth grade, and that mean consistency is .62 and .61, respectively. Because of the small size of most families and relatively low mean consistency, it takes a long time for awareness of consistency to develop.

Recent research suggests that children can use phonological analogy to read unfamiliar characters. Ho, Wong, and Chan (1999) taught first- and third-grade Hong Kong children pairs of clue characters. Each character in a pair had the same phonetic component and the same pronunciation. When the children had learned to read the clue characters, they were asked to read the test characters with a pair of clue characters in sight. A test character could have the same phonetic component in the same position, the same phonetic component in a different position, the same radical as the clue pair, or nothing in common. Children in both grades showed greater improvement from pretest to posttest in naming phonologically analogous characters than nonanalogous characters. Moreover, older children read analogous characters equally well when the position of the phonetic component changed. The results show that Chinese children as young as 6 years of age can make phonological analogies in naming unfamiliar Chinese characters.

Conditions in Ho et al. (1999) made it likely that children would notice and use phonological analogies. Shu, Anderson, et al. (2000) studied the use of phonological analogies under what were perhaps more realistic conditions. Correct derivations of the pronunciation of unfamiliar bound phonetic characters with familiar neighbors averaged only 6%, 12%, and 20% among low-, middle-, and high-ability students, respectively. The authors attributed the poor performance to limited opportunities to learn, because bound phonetic characters are only a small percentage of the semantic–phonetic compound characters introduced in elementary school.

If children were able to use phonological analogies, there could be a striking improvement in their ability to predict the pronunciations of unfamiliar compound characters. The basis for this assertion is the finding that frequency-weighted consistency averages 60% or more in the upper grades. This means that the strategy of giving an unfamiliar compound character the most frequent, or dominant, pronunciation of the phonetic family of which it is a member has a 60% chance of success. This is considerably higher than the 39% odds (odds of a regular character discounting variations in tone) from the simple strategy of giving the unfamiliar character the same pronunciation as its phonetic. Moreover, the analogy strategy works with characters with either independent or bound phonetics, whereas the phonetic-naming strategy works only with characters with independent phonetics. One would suppose, however, that the analogy strategy would be difficult to master because of the great number of small phonetic families. In fact, evidence (Shu, Anderson, et al., 2000) suggests that few if any children are able to use the analogy strategy consistently.
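The contrast between the two strategies can be made concrete with a minimal sketch. It assumes that family consistency can be approximated as the unweighted share of family members carrying the dominant pronunciation (the analysis reported here uses a frequency-weighted measure), and it ignores tone. The toy family built on the phonetic 青 and the placeholder bound-phonetic family are illustrative, and all function names are hypothetical.

```python
from collections import Counter

# Toy family sharing the independent phonetic 青 (qing); tones are ignored.
QING_FAMILY = {"清": "qing", "请": "qing", "情": "qing", "晴": "qing",
               "精": "jing", "睛": "jing"}
# Placeholder family with a bound phonetic (no independent pronunciation).
BOUND_FAMILY = {"char_a": "tang", "char_b": "tang", "char_c": "chang"}


def dominant_pronunciation(family):
    """Return (syllable, count) for the most common pronunciation in a family."""
    return Counter(family.values()).most_common(1)[0]


def analogy_accuracy(family):
    """Share of members matching the dominant pronunciation (unweighted)."""
    _, count = dominant_pronunciation(family)
    return count / len(family)


def phonetic_naming_accuracy(family, phonetic_syllable):
    """Share of members pronounced like the phonetic; None for bound phonetics."""
    if phonetic_syllable is None:
        return None  # the strategy is unavailable without an independent phonetic
    hits = sum(1 for s in family.values() if s == phonetic_syllable)
    return hits / len(family)


print(analogy_accuracy(QING_FAMILY), phonetic_naming_accuracy(QING_FAMILY, "qing"))
print(analogy_accuracy(BOUND_FAMILY), phonetic_naming_accuracy(BOUND_FAMILY, None))
```

In the toy family with an independent phonetic the two strategies coincide, whereas only the analogy strategy yields a prediction for the bound-phonetic family, which is the advantage described above.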

Considered next is the extent to which school Chinese contains information about meaning that children are able to use. The present analysis indicates that 58% of the semantic–phonetic compound characters are transparent in the sense that their radicals provide obvious information about meaning. An additional 30% of the characters are semitransparent. Their radicals provide some information about meaning. Thus, altogether, 88% of semantic–phonetic compound characters have informative radicals. Moreover, transparency increases as grade level increases and frequency decreases. That is, as children become older and more experienced readers, they are progressively more likely to encounter semantic–phonetic compound characters that contain strong clues to meaning.

To make use of the information that radicals provide, children first have to identify radicals. There are several reasons this may be a manageable task. Compared with phonetics, the number of radicals is limited. There are 128 radicals in the corpus of school Chinese, of which 73% are simple characters children learn in the early grades. The other 27% are bound forms, but unlike bound phonetics, bound radicals are frequent, forming 50% of the compound characters in the corpus. Radical families are much larger than phonetic families. On average, each radical family has 15 members. Bound radicals tend to have even larger families than independent radicals and, therefore, are highly familiar. Radicals are used to index characters in Chinese dictionaries; thus, every time children consult a dictionary, they are reminded that radicals are components of many characters. Taken together, the limited number of radicals and their high transparency, high frequency, large family size, and the experience children have using them to look up words in the dictionary should make children aware of radicals at an early age.

Several studies have shown that children can use the information in semantic radicals to derive the meanings of compound characters. Shu and Anderson (1997) investigated the use of radicals in reading characters among first-, third-, and fifth-grade Beijing children. In a multiple-choice task, the target characters appeared in pinyin in two-character words. All the words were familiar to the children in oral language. The children were asked to replace the pinyin with one of four characters. The choices shared the same phonetic component but had different radicals. The target characters varied in familiarity as well as morphological transparency. A third of the target characters were morphologically transparent, those with radicals that provide useful information about the meaning of the characters. A third of the target characters were morphologically opaque, with radicals that are not helpful. The rest were simple characters that did not have radicals. The results indicated that children performed better on morphologically transparent characters than on opaque or simple characters. The transparency effect was greater on unfamiliar and recently learned characters than on familiar characters; the effect was also greater among third and fifth graders than first graders. The results suggest that children can use the information in radicals to read characters and that radical awareness increases throughout the primary-school period. In the second experiment, Shu and Anderson confirmed that children’s performance was affected by familiarity of the radicals and demonstrated an influence from the conceptual difficulty of the words. Children were better able to use semantic components to derive meaning of characters when the semantic components were familiar and the words were conceptually easy.

One study suggests that even young children can use the information in familiar bound radicals. In a creative writing task, Chan and Nunes (1998) gave 6- to 9-year-old Hong Kong children six radicals and six phonetics and asked them to generate names for the novel objects in six pictures. The radicals were bound and had a fixed position within characters. Each radical was related in meaning to the object in one picture. For example, the plant radical was associated with a peculiar looking flower. By the age of 6, the children were able to select the appropriate radical to represent a given meaning and put it in the correct position alongside a phonetic. The results suggest that children have sensitivity to both the function and the position of bound radicals from an early age.

Research indicates that children may be able to use semantic analogy to understand the meanings of unfamiliar characters. Ho et al. (1999) had children learn clue characters and then read target characters with clue characters in sight. The task was to select from four pictures the one that represented the semantic category of the target character. There were three types of target characters. An analogous character had the same semantic component as the clue characters. A common-phonetic character had the same phonetic component. A control character had nothing in common. Children made more improvement from pretest to posttest on semantically analogous characters than on control characters or common-phonetic characters. The improvement was smallest in reading common-phonetic characters because children sometimes mistakenly used the phonetic as a semantic cue. It seems that children are able to use semantic analogy to read unfamiliar characters, but they sometimes confuse phonetic components with semantic components.

The foregoing studies indicate that children as young as 6 years old are aware of the position and function of semantic radicals (Chan & Nunes, 1998; see also Shu & Anderson, 1998) and their radical awareness increases throughout the primary-school period (Shu & Anderson, 1997). Children are able to make semantic analogies when encountering unfamiliar characters in a supportive context (Ho et al., 1999). On the other hand, research also demonstrates that children’s radical awareness is limited: There are many radicals young children do not know, and they sometimes confuse radicals with phonetic components.

So, does written Chinese have an orderly structure from which metalinguistically aware children can be expected to extract useful information? Or, is Chinese better described as a language that children must learn through repeated exposure and memorization? Although research on learning to read Chinese is still in its infancy, and only a handful of studies have been done, available evidence points clearly to the conclusion that written Chinese has a logic that young children can understand and use. The average effect size for phonetic regularity in four recent studies of Chinese children’s reading is 1.05 (Anderson et al., 2001; Ho & Bryant, 1997; Ho et al., 1999; Shu, Anderson, et al., 2000). Similarly, the average effect size for semantic transparency in two recent studies is .92 (Ho et al., 1999; Shu & Anderson, 1997). These are large effects, large enough to be of more than theoretical interest.

At the same time, the robust effects of character frequency and familiarity indicate limits on the amount of information children are able to glean from characters. The average effect size of familiarity or frequency in four studies of Chinese children’s reading is 2.11 (Chan & Siegel, 2001; Ho & Bryant, 1997; Shu & Anderson, 1997; Shu, Anderson, et al., 2000). This is twice as large as the effect of regularity or transparency in roughly the same set of studies. Thus, the conclusion is complicated. Yes, it is important for Chinese children to use the logic of the writing system. However, there is no way for them to escape from repeated practice if they are to become skilled readers. Compared with the Western children speaking alphabetic languages studied by Goswami and her colleagues, the task Chinese children face in learning to read is more like the one facing English and French children than the one facing German, Greek, or Spanish children.
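The statement that the familiarity or frequency effect is about twice as large as the regularity and transparency effects follows directly from the three average effect sizes quoted in the text, as this brief sketch shows. The dictionary keys and the `ratio_to_familiarity` helper are names chosen for illustration only.

```python
# Average effect sizes quoted in the text.
EFFECT_SIZES = {
    "phonetic_regularity": 1.05,
    "semantic_transparency": 0.92,
    "familiarity_or_frequency": 2.11,
}


def ratio_to_familiarity(effect_name):
    """Ratio of the familiarity/frequency effect to the named effect."""
    return EFFECT_SIZES["familiarity_or_frequency"] / EFFECT_SIZES[effect_name]


for name in ("phonetic_regularity", "semantic_transparency"):
    print(f"{name}: familiarity effect is {ratio_to_familiarity(name):.2f} times as large")
```

Both ratios come out slightly above 2, which is consistent with the “twice as large” summary.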

References

Anderson, R. C., Li, W., Ku, Y., Shu, H., & Wu, N. (in press). Use of partial information in reading Chinese characters. Journal of Educational Psychology.

Carroll, J. B. (1971). Statistical analysis of the corpus. In J. B. Carroll, P. Davies, & B. Richman (Eds.), Word frequency book (pp. xxi–xl). New York: American Heritage Publishing.

Carroll, J. B., Davies, P., & Richman, B. (Eds.). (1971). Word frequency book. New York: American Heritage Publishing.

Chan, C. K., & Siegel, L. S. (2001). Phonological processing in reading Chinese among normally achieving and poor readers. Journal of Experimental Child Psychology, 80, 23–43.

Chan, L., & Nunes, T. (1998). Children’s understanding of the formal and functional characteristics of written Chinese. Applied Psycholinguistics, 19, 115–131.

Chen, M. J., & Yuen, J. C. (1991). Effects of pinyin and script type on verbal processing: Comparisons of China, Taiwan, and Hong Kong experience. International Journal of Behavioral Development, 14, 429–448.

Dictionary of Chinese character information (1988). Beijing, China: Science Publishers.

Elementary Education Teaching and Research Center, Beijing Education and Science Institute. (1996). Elementary school textbooks. Beijing, China: Beijing Publishers.

Fang, S. P., Horng, R. Y., & Tzeng, O. J. L. (1986). Consistency effect and pseudo-character naming task. In S. K. Kao & R. Hoosain (Eds.), Linguistics, psychology and the Chinese language (pp. 11–21). Hong Kong: University of Hong Kong Center of Asian Studies.

Fu, Y. H. (1989). A basic research on structure and its component of Chinese character. In Y. Chen (Ed.), Information analysis of used character in modern Chinese language (pp. 154–186). Shanghai, China: Shanghai Educational Press.

Goswami, U. (1986). Children’s use of analogy in learning to read: A developmental study. Journal of Experimental Child Psychology, 42, 73–83.

Goswami, U., & Bryant, P. (1992). Rhyme, analogy, and children’s reading. In P. B. Gough, L. C. Ehri, & R. Treiman (Eds.), Reading acquisition (pp. 49–63). Hillsdale, NJ: Erlbaum.

Goswami, U., Gombert, J., & Fraca de Barrera, L. (1998). Children’s orthographic representations and linguistic transparency: Nonsense word reading in English, French, and Spanish. Applied Psycholinguistics, 19, 19–52.

Goswami, U., Porpodas, C., & Wheelwright, S. (1997). Children’s orthographic representations in English and Greek. European Journal of Psychology of Education, 12, 273–292.

Ho, C. S.-H., & Bryant, P. (1997). Learning to read Chinese beyond the logographic phase. Reading Research Quarterly, 32, 276–289.

Ho, C. S.-H., Wong, W.-L., & Chan, W.-S. (1999). The use of orthographic analogies in learning to read Chinese. Journal of Child Psychology, 40, 393–403.

Hoosain, R. (1991). Psycholinguistic implications for linguistic relativity: A case study of Chinese. Hillsdale, NJ: Erlbaum.

Hung, D. L., & Tzeng, O. J. L. (1981). Orthographic variations and visual information processing. Psychological Bulletin, 90, 377–414.

Kang. (1993). [Analysis of semantics of semantic-phonetic compound characters in modern Chinese] (pp. 68–83).

Kucera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.

Li, & Kang. (1993). [Analysis of phonetics of semantic-phonetic compound characters in modern Chinese] (pp. 84–98).

Modern Chinese dictionary. (Rev. ed.) (1998). Beijing, China: Shang Wu Yin Shu Guan.

Modern Chinese frequency dictionary. (1986). Beijing, China: Beijing Language Institute Press.

Nagy, W. E., & Anderson, R. C. (1984). How many words are there in printed school English? Reading Research Quarterly, 19, 304–330.

Nagy, W., & Anderson, R. C. (1998). Metalinguistic awareness and literacy acquisition in different language. In D. Wagner, R. Venezky, & B. Street (Eds.), Literacy: An international handbook (pp. 155–160). Boulder, CO: Westview.

Nagy, W., Anderson, R. C., Schommer, M., Scott, J. A., & Stallman, A. C. (1989). Morphological families and word recognition. Reading Research Quarterly, 24, 262–282.

(1982). [Semantic-phonetic compound characters in modern Chinese].

Perfetti, C. A., Zhang, S., & Berant, I. (1992). Reading in English and Chinese: Evidence for a “universal” phonological principle. In R. Frost & J. Katz (Eds.), Orthography, phonology, morphology, and meaning (pp. 227–248). Amsterdam: Elsevier.

Shu, H., & Anderson, R. C. (1997). Role of radical awareness in the character and word acquisition of Chinese children. Reading Research Quarterly, 32, 78–89.

Shu, H., & Anderson, R. C. (1998). Learning to read Chinese: The development of metalinguistic awareness. In J. Wang, A. W. Inhoff, & H.-C. Chen (Eds.), Reading Chinese script: A cognitive analysis (pp. 1–18). Mahwah, NJ: Erlbaum.

Shu, H., Anderson, R. C., & Wu, N. (2000). Phonetic awareness: Knowledge of orthography-phonology relationships in the character acquisition of Chinese children. Journal of Educational Psychology, 92, 56–62.

Shu, H., Wu, N., Zheng, X., & Zhou, X. (1998). The phonological features and distribution of Chinese characters in primary school textbooks. [in Chinese] Applied Linguistics, 2, 63–68.

Shu, H., Zhou, X., & Wu, N. (2000). Utilizing phonological cues in Chinese characters: A developmental study. Acta Psychologica Sinica, 32, 164–169.

Spinks, J. A., Liu, Y., Perfetti, C. A., & Tan, L. (2000). Reading Chinese characters for meaning: The role of phonological information. Cognition, 76, B1–B11.

Taylor, I. (2001). Phonological awareness in Chinese reading. In W. Li, J. S. Gaffney, & J. L. Packard (Eds.), Chinese children’s reading acquisition: Theoretical and pedagogical issues (pp. 39–58). Boston: Kluwer.

Taylor, I., & Taylor, M. M. (1995). Writing and literacy in Chinese, Korean, and Japanese. Philadelphia: John Benjamins Publishing.

T’sou, B. K. Y. (1981). A sociolinguistic analysis of the logographic writing system of Chinese. Journal of Chinese Linguistics, 9, 1–17.

Tzeng, O. J. L. (2001). Current issues in learning to read Chinese. In W. Li, J. S. Gaffney, & J. L. Packard (Eds.), Chinese children’s reading acquisition: Theoretical and pedagogical issues (pp. 3–16). Boston: Kluwer.

Tzeng, O. J. L., Zhong, H. L., Hung, D. L., & Lee, W. L. (1995). Learning to be a conspirator: A tale of becoming a good Chinese reader. In B. de Gelder & J. Morais (Eds.), Speech and reading: A comparative approach (pp. 227–246). Hove, UK: Erlbaum.

Wang, N. (1997). The structure and meaning of Chinese characters and words. In D. Peng, H. Shu, & H.-C. Chen (Eds.), Cognitive research on Chinese Language (pp. 35–64). Jinan, China: Shandong Educational Publisher.

Wimmer, H., & Goswami, U. (1994). The influence of orthographic consistency on reading development: Word recognition in English and German children. Cognition, 51, 91–103.

Wu, X., Li, W., & Anderson, R. C. (1999). Reading instruction in China. Journal of Curriculum Studies, 31, 571–586.

Yin, W. (1991). On reading Chinese characters: An experimental and neuropsychological study. Unpublished doctoral dissertation, University of London.

Zhou, X., Marslen-Wilson, W., Taft, M., & Shu, H. (1999). Morphology, orthography, and phonology in reading Chinese compound words. Language and Cognitive Processes, 14, 525–565.

Zhou. (1978). [To what extent are the “phonetics” of present-day Chinese characters still phonetic], 146, 172–177.

Zhou. (1980). [Pronunciation of phonetics within compound characters].