xetexmain

112
e X E T E X Companion T E X meets OpenType and Unicode Edited by Michel Goossens (CERN) Work in progress. Version January 11,2010 Please send your comments to [email protected] ©Michel Goossens (editor) and the various contributors (see next page).

Upload: cezar-popescu

Post on 29-Nov-2014

543 views

Category:

Documents


0 download

TRANSCRIPT

Te X E TEX CompanionTEX meets OpenTvpe and UnicodeEdited bv Michel Goossens (CERN)Vork in progress. Version Ianuary 11,2010Please send your comments to [email protected] Goossens (editoi) and the vaiious contiibutois (see next page).Te copviight of the contiibutions extiacted fiom documentation of the vaiious packages (see belowfoi details) iemains with theii iespective authois. Te cuiient maintainei of this document is MichelGoossens.Work history Ianuary 2008 Initial veision (fiom LGC2 supplementaiv mateiial). Spring 2008 Adapted mateiial fiom Ionathan Kews X E TEX manual and Will Robeitsons fontspecmanual. Ianuary 2009 Adapted mateiial fiom Fianois Chaiettes arabxetex manual and Dian Yinszhspacing manual. Iuly 2009 Added mateiial contiibuted bv Vafa Khalighi desciibing his bidi package. August 2009 Added mateiial about xecjk plus intioduced coiiections and claiications suggestedbv Leo Feiies and Kaiel Pka. Ianuary 2010 Added lots of coiiections and a few suggestions foi claiications bv Tavloi Venable.ContentsList of Figures viiList of Tables ixPreface xi1 PostScript fonts and beyond 11.1 Font formats: a brief history . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.1.1 Adobe and its PostScript Type 1 . . . . . . . . . . . . . . . . . . . . . . . . 21.1.2 TrueType fonts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.1.3 Two competing technologies. . . . . . . . . . . . . . . . . . . . . . . . . . 31.1.4 The best of two worlds: OpenType . . . . . . . . . . . . . . . . . . . . . . . 31.2 PostScript Type 1 and TrueType: two dierent approaches . . . . . . . . . . . . . . . . . . . . . 41.2.1 Interoperability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.3 Unicode: the universal character encoding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.4 OpenType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.4.1 OpenType tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.4.2 OpenType features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111.4.3 OpenType support today . . . . . . . . . . . . . . . . . . . . . . . . . . . 121.4.4 Interrogating OpenType fonts . . . . . . . . . . . . . . . . . . . . . . . . . 132 X E TEX: TEX meets OpenType and Unicode 192.1 X E TEX: a historical introduction and some basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212.1.1 A brief history. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222.1.2 X E TEX: basic principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222.2 X E TEX: typesetting with glyphs, characters and fonts . . . . . . . . . . . . . . . . . . . . . . . . . 232.2.1 Accessing font with fontcong . . . . . . . . . . . . . . . . . . . . . . . . . 232.2.2 Specifying character codes. . . . . . . . . . . . . . . . . . . . . . . . . . . 252.2.3 Hyphenation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262.2.4 Font management: the basics . . . . . . . . . . . . . . . . . . . . . . . . . 272.2.5 Font mappings using TECkit . . . . . . . . . . . . . . . . . . . . . . . . . . 28CONTENTS2.2.6 Line breaks and justication . . . . . . . . . . . . . . . . . . . . . . . . . . 292.2.7 Unicode Character/glyph model . . . . . . . . . . . . . . . . . . . . . . . . 302.2.8 Using OpenType via ICU Layout. . . . . . . . . . . . . . . . . . . . . . . . . 302.2.9 X E TEXs hyphenation support . . . . . . . . . . . . . . . . . . . . . . . . . . 322.2.10 Running xetex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332.3 Supplementary commands introduced by X E TEX. . . . . . . . . . . . . . . . . . . . . . . . . . . . 332.3.1 Specifying languages and scripts . . . . . . . . . . . . . . . . . . . . . . . . 342.3.2 Specifying optional features . . . . . . . . . . . . . . . . . . . . . . . . . . 352.3.3 Support for pseudo-features . . . . . . . . . . . . . . . . . . . . . . . . . . 362.3.4 Commands extracting information from OpenType fonts . . . . . . . . . . . . . 362.3.5 Maths fonts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392.3.6 Encodings, linebreaking, etc. . . . . . . . . . . . . . . . . . . . . . . . . . . 412.3.7 Graphics and pdfTEX-related commands . . . . . . . . . . . . . . . . . . . . . 422.4 fontspec . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432.4.1 Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432.4.2 Latin Modern defaults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432.4.3 Maths ddling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432.4.4 A rst overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442.4.5 Font selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442.4.6 Default font families . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452.5 X E TEX and other engines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473 Handling all those scripts 493.1 Writing systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493.1.1 Basic terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503.1.2 History of writing systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 503.1.3 Types of writing systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513.1.4 Language Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543.1.5 Freely available Unicode encoded fonts . . . . . . . . . . . . . . . . . . . . . 543.1.6 Directionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543.1.7 Writing systems on computers . . . . . . . . . . . . . . . . . . . . . . . . . 553.2 Bidirectional typesetting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553.2.1 Using The bidi Package . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553.2.2 Basic Direction Switching . . . . . . . . . . . . . . . . . . . . . . . . . . . 563.2.3 Typesetting Short RTL and LTR texts. . . . . . . . . . . . . . . . . . . . . . . 573.2.4 Multicolumn Typesetting . . . . . . . . . . . . . . . . . . . . . . . . . . . 573.2.5 More peculiarities for RTL typesetting . . . . . . . . . . . . . . . . . . . . . . 583.2.6 Tabular material in RTL mode. . . . . . . . . . . . . . . . . . . . . . . . . . 603.3 Languages using the Arabic alphabet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613.3.1 ArabTEX: Arabic typography with TEX . . . . . . . . . . . . . . . . . . . . . . 623.3.2 ArabX E TEX: Arabic typography with X E TEX. . . . . . . . . . . . . . . . . . . . . 643.3.3 Arabic presentation forms . . . . . . . . . . . . . . . . . . . . . . . . . . . 753.4 Typesetting Chinese . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 793.4.1 The xeCJK Package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 793.4.2 The zhspacing package . . . . . . . . . . . . . . . . . . . . . . . . . . . . 843.5 Examples of the use of Unicode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 883.5.1 Unicode fonts and editors . . . . . . . . . . . . . . . . . . . . . . . . . . . 883.5.2 Examples of Unicode texts . . . . . . . . . . . . . . . . . . . . . . . . . . . 89ivch-fiont.tex,v: 2.02 2010/01/10Contents4 Unicode mathematics 914.1 Unicode for handling math across platforms and applications . . . . . . . . . . . . . . . . . . . 914.2 X E TEX handling mathematics fonts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92Index of Commands and Concepts 95People 100ch-fiont.tex,v: 2.02 2010/01/10vList of Figures1.1 Using OpenTvpes advanced tvpogiaphic featuies in Adobe InDesign . . . . . . . . . . . . . . 131.2 Opentvpe Unicode suppoit in OpenOce . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141.3 Miciosofs Fonts Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.1 Complexities when dealing with vaiious languages . . . . . . . . . . . . . . . . . . . . . . . . . 192.2 Sciipts used in vaiious paits of the woild . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202.3 Asian sciipts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202.4 List of featuies foi the sciipts and languages suppoited bv the Miciosof Aiial andAdobe Minion fonts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403.1 Wiiting svstems used in the woild todav . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513.2 Examples of six Aiabic calligiaphic stvles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62List of Tables2.1 Mathematics svmbol tvpes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393.1 Indic consonantvowel combinations in vaiious Indic abugidas . . . . . . . . . . . . . . . . . 543.2 AiabTEXs input conventions foi Aiabic and Peisian . . . . . . . . . . . . . . . . . . . . . . . . 633.3 All arabxetex input conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75PrefaceTis fiee booklet desciibes X E TEX and its X E LATEX vaiiant. Afei an intioduction to the OpenTvpe andUnicode technologies, it desciibes howX E TEXextends the TEXengine to optimallv use OpenTvpe fontsdiiectlv and allow vou to handle Unicode-encoded souices.Vaiious LATEX packages have been developed iecentlv to take advantage of X E TEXs new function-alities, and those aie desciibed next.Tis compilation of tools has been wiitten in close collaboiation with the authois: Ionathan Kew(X E TEX development), Will Robeitson fontspec and unicode-math), Fianois Chaiette (arabxetex), andDian Yin (zhspacing). Coiiections and feedback has also been ieceived fiom Adam Buchbindei, LeoFeiies, Rik Kabel, and Kaiel Pka.Comments aie welcome and can be addiessed to [email protected] GoossensIanuary 2010C H A P T E R 1PostScript fonts and beyond1.1 Font formats: a brief history. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 PostScript Type 1 and TrueType: two dierent approaches . . . . . . . . . . . . . . . . . . . . . . . . 41.3 Unicode: the universal character encoding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.4 OpenType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5In this chaptei we look at the most basic tvpe of giaphical object in documents: the chaiacteis thatfoim the woids. Chaiactei shapes (glvphs) aie not a diiect pait of the TEX svstem; all TEX wants toknow about them is some metiic infoimation, such as theii width oi height. It is the task of the post-piocessing stage (the backend of pdfTEX oi a device diivei, such as dvips which ieads the .dvi le asoutput bv TEX) to pioduce the actual giaphical iepiesentation of the page. Foi this stage infoimationabout the actual shapes of the chaiacteis is needed and this infoimation is stoied in so-called fonts(collections of chaiacteis) foi which manv dieient stoiage foimats exist. Tus in piinciple anv exist-ing font can be used with TEX piovided that the metiic infoimation TEX needs is available oi can begeneiated and that a pioceduie exists that undeistands the foimat in which the fonts aie stoied andcan inseit it into the output le.Donald Knuth developed a companion piogiam to TEX, MetaFont, foi geneiating fonts to be usedwith TEX (Chaptei 3 of Te LaTeX Graphics Companion looked biiev at MetaFonts diawing capabil-ities). Foi quite some time onlv fonts designed with MetaFont weie available to TEX useis, with theiesult that TEX oi LATEX documents had an easilv identied look and feelmainlv a iesult of the use ofthe Computei Modein fonts. Given that the TEX communitv is veiv small compaied to that of otheitvpesetting svstems veiv few font designeis have pioduced fonts in MetaFont. Teiefoie, access foiTEX engines to the liteiallv thousands of fonts available commeiciallv in othei foimats, in paiticulaiPostSciipt, TiueTvpe, and, moie iecentlv, OpenTvpe, has become a must.Although at the beginning it was quite dicult to integiate PostSciipt fonts into LATEX packages,the ielease of LATEX2 and its new font selection scheme (NFSS, see Chaptei 7 of [5]) made accessingthe laige set of PostSciipt fonts moie stiaightfoiwaid. Nowadavs, documents ioutinelv combine TEXssupeiioi tvpesetting qualitv with all the piofessionallv designed tvpefaces pioduced, mainlv in Post-Sciipt, but also in TiueTvpe and OpenTvpe. Te cuiient chaptei will intioduce vou to solutions toachieve this in a convenient wav.Afei a histoiic oveiview of modein font technologies, including a biief desciiption of theii ie-spective technical capabilities, we take a closei look at the basic issues conceined with tvpesetting andhow TEX and PostSciipt, woiking togethei, addiess this pioblem (how metiic infoimation is handled,the dieient tvpes of TEX and PostSciipt fonts, how thev aie encoded, i.e., how one can access individ-1 POSTSCRIPT FONTS AND BEYONDual chaiacteis of a font,etc.) We then explain how vou can use the basic PostSciipt fonts, as thev aiedened in the PSNFSS svstem (a collection of small packages and accompanving les foi LATEX), whichmakes it easv to use a laige numbei of common PostSciipt fonts out of the box) and howto easilv down-load and install a fewinstances of fieelv available fonts. We extend the discussion to wheie to downloadand install the LATEX suppoit les foi commeiciallv available fonts that vou might have bought. Sincemanv LATEX useis have de facto access to a lot of TiueTvpe fonts that come with theii opeiating svstem,we devote the next section to the use of TiueTvpe fonts with pdatex, in paiticulai how one can usea laige Unicode TiueTvpe font foi tvpesetting in manv dieient sciipts and languages. We aie thenieadv to discuss a few iecent LATEX packages which take advantage of the eniiched possibilities of theOpenTvpe technologv. We end the chaptei with a discussion of Fontname, also know as the Beiivfont naming scheme, which is impoitant to uniquelv identifv and handle all LATEX suppoit les of thelaige numbei of fonts that aie available on cuiient opeiating svstem.1.1 Font formats: a brief historyTe cuiient main font foimats aie PostSciipt Tvpe 1 (Tvpe 1), TiueTvpe (TT), and OpenTvpe (OT), anintegiated supeiset of the ist two. All thiee aie based on font outline technologies, aie multi-platfoim,and have theii technical specications openlv available. Tese foimats can be iun on anv iecent com-putei platfoim and theii chaiactei outlines (glvphs) aie desciibed mathematicallv as functions op-eiating on points, lines and cuives. Te chaiactei iepiesentations aie iesolution independent and canbe scaled to anv size. Tese technologies implement hinting bv associating additional infoimationwith each chaiactei to help the iasteiization engine optimize theii iepiesentation on anv given outputdevice.1.1.1 Adobe and its PostScript Type 1When Adobe launched PostSciipt in 1984, it suppoited two dieient tvpes of fonts foimats: Tvpe 1,'the moie sophisticated one with suppoit foi hinting and data compiession, and Tvpe 3, a moie geneial(almost all PostSciipt giaphics opeiatois aie allowed) but less optimized vaiiant. At ist Adobe did notpublish the specication of its PostSciipt Tvpe 1 foimat (the Tvpe 3 spec was public), which helpedAdobe take a laige pait of the commeicial tvpogiaphv maiket but upset the othei font foundiies.Apple, which also was founded in the eailv nineteen eighties, adopted PostSciipt as page desciip-tion language foi its Apple LaseiWiitei piintei in 1985. Soon also othei high-end image setting ma-chines adopted PostSciipt as theii native language. At about the same time the intioduction of af-foidable desktop publishing sofwaie, such as Pagemaker, Freehand, set o a ievolution in page lavouttechnologv, and PostSciipt backends appeaied foi most giaphics piogiams, thus adding to the poten-tial maiket foi piofessional PostSciipt Tvpe 1 fonts. Because of its ieliabilitv, its wide selection of fontsavailable, its clevei iasteiizing engine and supeiioi hinting mechanism, histoiicallv PostSciipt has beenthe piefeiied font foimat of piofessional designeis, publisheis and piintshops.Concuiientlv Adobe had developed aninteiactive veisionof PostSciipt, called Display PostScript,that ian (somewhat slowlv) on peisonal computeis to allow displaving PostSciipt data on-scieen. Al-though some computei manufactuieis agieed to take out (and pav) sofwaie licences, Apple and Mi-ciosof weie quite unwilling to pav the iovalties iequested bv Adobe and, moieovei, to hand contiol toAdobe ovei a vital pait of theii opeiating svstem.In the ist pait of the 1990s Adobe also developed the PostSciipt Tvpe 1 multiple mastei (MM)foimat as an extension of PostSciipt Tvpe 1. Essentiallv, it allows two (oi moie) design vaiiations to beencoded on a given design axis (such as weight, width, optical size). Afeiwaids, anv in-between state'See http://partners.adobe.com/public/developer/en/font/T1_SPEC.PDF.2xetex-opentvpe.tex,v: 2.01 2009/06/151.1 Font formats: a brief history(instance) mav be geneiated bv the usei as iequiied.'1.1.2 TrueType fontsTe majoi svstem sofwaie vendois (Apple, Miciosof, IBM) had been thinking about scalable fonttechnologv suppoit at the level of theii iespective opeiating svstems since thev iealized that it wouldguaiantee muchbettei scieendisplav, compaiedto pie-geneiatedbitmaps whichonlv look goodat theiidesignsizes, andunacceptablv jaggedat all otheis. Foi instance inthe late 1980s Apple haddevelopedanin-house scalable font technologv, Royal, latei ienamed to TiueTvpe. Te TiueTvpe specication waspublic and alieadv in 1991 native TiueTvpe suppoit appeaied in Apples Mac Svstem 7 and MiciosofsWindows 3.1.TiueTvpe fonts use a dieient outline model fiom PostSciipt, and also the appioach to hintingis dieient. Te font instances contain both scieen and piintei font data in a single component. Tismakes the fonts easv to install. Although TiueTvpe fonts suppoit Unicode and can theoieticallv containovei 65.000 chaiacteis, thev iaielv featuie moie that some 220 chaiacteis. Moieovei, TiueTvpe fontfoimats aie platfoim-dependent.1.1.3 Two competing technologiesAdobe ieacted to the advent of TiueTvpe bv publishing in 1990 the PostSciipt Tvpe 1 font foimatspecication [1]. A fewveais latei, it intioduced the Adobe Type Manager (ATM) sofwaie, which scalesPostSciipt Tvpe 1 fonts foi scieen displav, and suppoits imaging on non-PostSciipt piinteis.Tus bv the end of the 1990s theie weie two widelv-used outline font specications, TiueTvpe,built into the opeiating svstems used bv most desktop computeis, and PostSciipt Tvpe 1, the de factostandaid foi the giaphic aits and the publishing industiv. Moieovei, as time went bv, the piacticaldieiences had begun to blui. On the one hand, suppoit foi TiueTvpe became standaid in PostSciipt3, while on the othei hand, besides native TiueTvpe suppoit, PostSciipt Tvpe 1 iasteiizing technologvwas incoipoiated into Windows 2000, Windows XP, and Mac OS X.1.1.4 The best of two worlds: OpenTypeTe OpenTvpe` font foimat was jointlv developed bv Adobe and Miciosof to combine the best featuiesof the TiueTvpe and PostSciipt Tvpe 1 technologies. It was ist piesented in 1996 and its use andsuppoit has been steadilv incieasing since about 2000.OpenTvpe fonts contain both the scieen and piintei font data in a single component. Te Open-Tvpe foimat can contain eithei TiueTvpe oi PostSciipt font data. It suppoits expanded chaiactei sets(up to 65.000) and special tvpogiaphic featuies. Tese mav include vaiious veisions of guies (tabulai,old-stvle, lining), small caps, ligatuies, oidinals, and othei extias. While OpenTvpe allows tvpe design-eis to build complex fonts, not manv fonts take advantage of these possibilities. Most OpenTvpe fontsavailable todav aie simplv conveited PostSciipt fonts, limited to 220 chaiacteis in a set.OpenTvpe fonts aie platfoim independent and can thus be used on all opeiating svstems.'Te technologv nevei ieallv took o and since 2000 Adobe has abandoned developing multiple mastei fonts since mostapplications cannot handle them and foi a laige majoiitv of useis it ofen makes moie economic sense to buv a fontset as mul-tiple sepaiate fonts. Adobe now concentiates on ieleasing OpenTvpe fonts to ieplace theii multiple mastei equivalents (e.g., theMinion and Mviiad tvpefaces).See e.g., http://developer.apple.com/fonts/, and http://www.microsoft.com/typography.`See Adobes Web pages http://store.adobe.com/type/opentype/main.html,and http://blogs.adobe.com/typblography/TT%20PS%20OpenType.pdf,oi Miciosofss Web page http://www.microsoft.com/typography/OTSPEC/default.htm.xetex-opentvpe.tex,v: 2.01 2009/06/1531 POSTSCRIPT FONTS AND BEYOND1.2 PostScript Type 1 and TrueType: two dierent approachesTiueTvpe and PostSciipt Tvpe 1 fonts use dieient mathematical iepiesentations to desciibe the cuivesdening the font outlines.' OpenTvpe, being a supeiset, can have eithei kind of outlines.TiueTvpe desciibes its cuives bv quadiatic B-splines, while PostSciipt Tvpe 1 uses cubic Bzieicuives. Tis means, in piactice, that the shapes of ieal-woild fonts tend to take moie points in Tiue-Tvpe, even though the kind of mathematics used to desciibe the cuives is simplei. Anv quadiatic splinecanbe conveited to a cubic spline with essentiallv no loss. Acubic spline canbe conveited to a quadiaticwith aibitiaiv piecision, but theie will be a slight loss of accuiacv inmost cases. Tus it is easv to conveitTiueTvpe outlines to PostSciipt Tvpe 1 outlines (the Tvpe 42 PostSciipt font foimat is a PostSciiptwiappei aiound a TiueTvpe font foi use in PostSciipt inteipieteis), haidei to do the ieveise.Te appioach to hinting is dieient in both technologies. PostSciipt Tvpe 1 takes a declarativeappioach and lets a smart PostSciipt inteipietei do the woik. It tells the iasteiizei what featuies oughtto be contiolled, and the iasteiizei inteipiets these using its own intelligence to decide how to do it.Teiefoie, when the PostSciipt inteipietei is upgiaded, the iasteiization can be impioved.On papei, the hinting potential of TiueTvpe` should be supeiioi to that of PostSciipt Tvpe 1 fonts,since TiueTvpe hints can do all that PostSciipt Tvpe 1 can, and moie. Indeed TiueTvpe takes an al-gorithmic oi piogiamming appioach and uses the veiv exible and complete instiuctions set of theTiueTvpe language. Tus TiueTvpe puts all the hinting infoimation into the font to contiol exactlvhow it will appeai when iasteiized. TiueTvpe inteipieteis can be quite dumb and limit themselves tosimplv execute what thev have been instiucted to do. Tus, although a TiueTvpe font developei cannetune what happens when a font is iasteiized undei dieient conditions, it iequiies seiious eoit,expeitise, and high-end tools to actuallv take advantage of this gieatei hinting potential. As a iesult,high-qualitv TiueTvpe fonts, which exploit the tiue potentials of TiueTvpe hinting aie quite iaie. Moie-ovei, when using complex hinting the intioduction of a new iasteiizei might iequiie majoi changes tothe TiueTvpe code in oidei to be able to optimallv displav existing fonts.PostSciipt Tvpe 1 needs two sepaiate les foi its font data: one foi the chaiactei outlines (.pfb),and the othei foi the metiics data (.afm on Linux, .pfm on Windows), containing chaiactei widths,keining paiis, and a desciiption of how to constiuct composites. TiueTvpe fonts have all the data ina single le. Neveitheless this single TiueTvpe font le is ofen twice laigei than the two PostSciiptTvpe 1 les combined due to the piesence in the TiueTvpe fonts of extensive hinting instiuctions.Geneiallv speaking, PostSciipt Tvpe 1 fonts have some advantages simplv fiom being the longei-established standaid, especiallv foi seiious giaphic aits woik. Seivice buieaus aie standaidized on, andhave laige investments in, PostSciipt Tvpe 1 fonts. Most of the fonts which have expeit sets of oldstvle guies, extia ligatuies, tiue small capitals and the like aie in that foimat.1.2.1 InteroperabilityIn piinciple one can mix TiueTvpe and PostSciipt Tvpe 1 fonts with the caveat that the TiueTvpe andPostSciipt Tvpe 1 instances of the fonts mav not have exactlv the same names on the given opeiatingsvstem. Indeed, the fact that fonts exist with identical menu names oi PostSciipt Tvpe 1 font namesconfuses the opeiating svstem oi the application piogiams, with ofen unpiedictable iesults.Also, if using Windows, one mav ndthat metiicallv-similai PostSciipt Tvpe 1 fonts get substitutedfoi the Windows TiueTvpe svstem fonts at output time: Times ^ew Roman becomes Times Roman, andArial becomes Helvetica. Although the basic spacing of the substituted fonts is identical, theii keiningpaiis aie not. Tis can cause text to ieow (i.e., line endings in a paiagiaph mav diei) if one switchesbetween two almost identical fonts if voui tvpesetting piogiam (e.g., TEX) suppoits keining paiis.'See http://www.truetype.demon.co.uk/articles/ttvst1.htm.See Dadid Lemons Basic Type 1 hinting (http://www.pyrus.com/downloads/hinting.pdf).`See the URL http://www.microsoft.com/typography/hinting/tutorial.htm, Vincent Connaies Basic hintingphilosophies and TrueType instructions.4xetex-opentvpe.tex,v: 2.01 2009/06/151.3 Unicode: the universal character encodingTus caie must be taken to ensuie that vou use the coiiect font all thiough the complete pioductionchain.1.3 Unicode: the universal character encodingUnicode is an inteinational standaid' foi iepiesenting chaiacteis using a multi-bvte platfoim-independent encoding foi coveiing all the woild languages (including some aiticial ones, such asmathematical svmbols and the inteinational phonetic alphabet). Unicode deals with chaiacteis iatheithan glvphs. Tat is, it onlv deals with semantic iathei than tvpogiaphic distinctions (with a few ex-ceptions foi compatibilitv with existing standaids). Teiefoie theie is no place foi glvph vaiiants, suchas unusual ligatuies, old stvle numbeis, oi small caps within Unicode itself; the Unicode standaid as-sumes that such distinctions will be made elsewheie. Teiefoie, font foimats, which suppoits suchdistinctions, such as OpenTvpe (see Section 1.4), need to be laveied on top of Unicode. Alan Woodsmaintains a useful website (http://www.alanwood.net/unicode/) which desciibes numeiousiesouices foi Unicode and multilingual suppoit in HTML, fonts, web biowseis and othei applications.Most cuiient opeiating svstems (Linux, Mac OS X and Windows XP) have diiect suppoit foi Uni-code at the basic svstemlevel. Foi instance, apait fiomswitching between dieient language kevboaids,these opeiating svstems oei means of diiectlv accessing anv Unicode chaiactei in anv font (e.g., onMac OS Xvia the Character Palette and on Miciosof Windows XP oi Vista via the Character Map utilitvin System Tools in the Accessories submenu.)1.4 OpenTypeTe OpenTvpe font foimat was developed jointlv bv Miciosof and Adobe as an extension of the Tiue-Tvpe font foimat. OpenTvpe addiesses the following goals: suppoits PostSciipt Tvpe 1 outlines and hints; suppoits TiueTvpe tables and hints; suppoits advanced tvpogiaphic featuies bv wav of new tables foi glvph positioning and substitu-tion; suppoits multiple platfoims; suppoits inteinational chaiactei sets bv using Unicode; oeis bettei piotection foi font data; featuies smallei le sizes to make font distiibution moie ecient.Sometimes OpenTvpe fonts aie iefeiied to as TiueTvpe Open v.2.0 fonts. PostSciipt Tvpe 1 dataincluded in OpenTvpe fonts mav be diiectlv iasteiized oi conveited to the TiueTvpe outline foimatfoi iendeiing, depending on which iasteiizeis have been installed in the host opeiating svstem. Useisdo not need to know which outlines aie actuallv piesent. One can sav that OpenTvpe enteis TiueTvpeand PostSciipt Tvpe 1 in a common wiappei. OpenTvpe tables include the cuiient TiueTvpe tablesplus some additional tables foi advanced tvpogiaphic featuies. Te iepiesentation of PostSciipt Tvpe 1font sofwaie in an OpenTvpe font uses Adobes Compact Font Foimat (CFF) with Tvpe 2 chaistiings,which is a moie compact iepiesentation of the same infoimation in PostSciipt Tvpe 1 (a gain of abouta factoi of two, on aveiage, when no glvphs and featuies aie added).'Te cuiient veision is 5.0 [7] and it has been dened bv the membeis of the Unicode Consoitium, which includes majoicomputei coipoiations, sofwaie pioduceis, database vendois, ieseaich institutions, inteinational agencies, vaiious usei gioups,and inteiested individuals, see http://www.unicode.org.xetex-opentvpe.tex,v: 2.01 2009/06/1551 POSTSCRIPT FONTS AND BEYONDTe OpenTvpe foimat suppoits features equivalent to most of the advanced featuies of existingTiueTvpe and PostSciipt foimats, such as Adobes CID technologv foi Asian fonts, and extended mul-tilingual chaiactei sets. Howevei, multiple mastei fonts aie not pait of the OpenTvpe specication.OpenTvpe fonts mav contain moie than 65,000 glvphs, which allows a single font le to contain manvnonstandaid glvphs, such as old-stvle guies, tiue small capitals, fiactions, swashes, supeiiois, infeii-ois, titling letteis, contextual and stvlistic alteinates, and a full iange of ligatuies. OpenTvpe fonts thusoeis iich linguistic suppoit combined with advanced tvpogiaphic contiol. Featuie-iich Adobe Open-Tvpe fonts aie ofen distinguished bv the woid Pio, being pait of the font name. OpenTvpe fonts canbe installed and used alongside PostSciipt Tvpe 1 and TiueTvpe fonts.OpenTvpe, which is based on Unicode, signicantlv simplies font management and the pub-lishing piocess bv ensuiing that all of the iequiied glvphs foi a document aie contained in one cioss-platfoim font le thioughout the woikow.Te text model of OpenTvpe is that applications stoie text using the undeilving Unicode chaiac-teis, and applv foimatting to get at the specic desiied glvphs. In addition to the Unicode mapping ofdefault glvphs, the font has OpenTvpe lavout tables which tell it which glvphs to use when othei foimsaie desiied instead, such as small caps oi swashes. Tese tables also specifv which glvphs should tuininto ligatuies, oi when a sciipt font needs dieient glvphs foi a lettei when it is at the beginning, middleoi end of a woid, oi is a woid bv itself.Having the tiansfoimations distinct fiomthe undeilving text enables table-diiven automatic glvphsubstitution, which does not need to be one foi one; one glvph can be substituted foi seveial (such asthe ligatuie, which iemembeis that the undeilving text contains the chaiacteis f-f-i in seaiching),oi multiple glvphs can be substituted foi a single one. Glvph substitution can be context sensitive, oiit can be activated bv explicit usei demand. Tis featuie might not appeai essential foi Latin-basedlanguages, such as Spanish and English, but it becomes mandatoiv foi piopei tvpesetting of languagesthat use complex sciipts, such as Aiabic oi the Indic languages, since having letteis take dieientfoims based on theii position in the woid is a basic pait of how Aiabic woiks.OpenTvpe lavout featuies can be used to position oi substitute glvphs. Foi anv chaiactei, theie isa default glvph and positioning behavioi. Te application of lavout featuies to one oi moie chaiacteismav change the positioning, oi substitute a dieient glvph.Teie aie seveial advantages of using a laige OpenTvpe font ovei cuiientlv available expeit setsand alteinates. Fiist, one onlv has to deal with one font le, iathei than being clutteied with a wholeset of supplemental fonts. Second,theie can be keining between glvphs that might otheiwise have beenin sepaiate fonts. Finallv, the usei can tuin on ligatuies, smallcaps, oi old-stvle guies, much like boldoi italic stvling, without switching fonts.Histoiicallv, some of the highest qualitv tvpefaces have includeddieient designs foi dieient piintsizes. Rathei than using its multiple masteis technologv, most of Adobes OpenTvpe fonts now includefoui optical size vaiiations: caption, iegulai, subhead and displav. Called Opticals, these vaiiationshave been optimised foi use at specic point sizes. Although the exact intended sizes vaiv bv familv,the geneial size ianges include: caption (68 point), iegulai (913 point), subhead (1424 point) anddisplav (2572 point).1.4.1 OpenType tablesOpenTvpe font les contain tables that contain eithei TiueTvpe oi PostSciipt outline font data andthe data in these tables aie used bv iendeiing piogiams to iendei the TiueTvpe oi PostSciipt glvphs.Moieovei, some of the data is independent of the paiticulai outline foimat used.'OpenTvpe fonts ist contain a numbei of required tables.'Te stiuctuie of an OpenTvpe font le is desciibed at the URL http://www.microsoft.com/typography/otspec/otff.htm; a shoit desciiption of the contents of the tables is at the URL http://www.microsoft.com/typography/otspec/recom.htm.6xetex-opentvpe.tex,v: 2.01 2009/06/151.4 OpenTypecmap Chaiactei to glvph mappinghead Font headeihhea Hoiizontal headeihmtx Hoiizontal metiicsmaxp Maximum piolename Naming tableOS/2 OS/2 and Windows specic metiicspost PostSciipt infoimationFoi OpenTvpe fonts based on TiueTvpe outlines, the following tables aie used:cvt Contiol Value Tablefpgm Font piogiamglyf Glvph dataloca Index to locationprep CVT PiogiamFoi OpenTvpe fonts based on PostSciipt anothei set of tables containing data specic to PostSciiptfonts aie used instead of the tables listed above:CFF PostSciipt font piogiam (compact font foimat)VORG Veitical OiiginOpenTvpe fonts mav contain bitmaps of glvphs, in addition to outlines. Hand-tuned bitmaps aieespeciallv useful in OpenTvpe fonts foi iepiesenting complex glvphs at veiv small sizes. If a bitmap foia paiticulai size is piovided in a font, it will be used bv the svsteminstead of the outline when iendeiingthe glvph. Foi OpenTvpe fonts containing bitmap glvphs thiee tables aie available:EBDT Embedded bitmap dataEBLC Embedded bitmap location dataEBSC Embedded bitmap scaling dataFinallv, advanced tvpogiaphv, veitical tvpesetting and othei special functions aie suppoited withthe following tables:BASE Baseline dataGDEF Glvph denition dataGPOS Glvph positioning dataGSUB Glvph substitution dataJSTF Iustication dataDSIG Digital signatuiegasp Giid-tting/Scan-conveisionhdmx Hoiizontal device metiicskern KeiningLTSH Lineai thieshold dataPCLT PCL 5 dataVDMX Veitical device metiicsvhea Veitical Metiics headeivmtx Veitical MetiicsFuitheimoie, OpenTvpe fonts use a set of sciipt, language and featuie tags to stiuctuie the infoi-mation in theii tables.Script tags identifv the sciipts iepiesented in an OpenTvpe font. Each sciipt coiiesponds to a con-tiguous chaiactei code iange in Unicode. Sciipt tags aie foui-bvte chaiactei stiings composed of up tofoui letteis in the ASCII chaiacteis iange 0x20-0x7E, padding with blanks (0x20) if iequiied. A listof sciipts and theii tags follows.'dflt Defaultarab Aiabicarmn Aimenianbeng Bengalibopo Bopomofobrai Biaillebyzm Bvzantine Musiccans Canadian Svllabicscher Cheiokeecyrl Cviillicdeva Devanagaiiethi Ethiopicgeor Geoigiangrek Gieekgujr Gujaiatiguru Guimukhijamo Hangul Iamohang Hangulhani CIK Ideogiaphichebr Hebiewkana Hiiagana'See http://www.microsoft.com/typography/otspec/scripttags.htm foi an up-to-date list.xetex-opentvpe.tex,v: 2.01 2009/06/1571 POSTSCRIPT FONTS AND BEYONDknda Kannadakana Katakanakhmr Khmeilao Laolatn Latinmlym Malavalammong Mongolianmymr Mvanmaiogam Oghamorya Oiivarunr Runicsinh Sinhalasyrc Sviiactaml Tamiltelu Teluguthaa Taanathai Taitibt Tibetanyi YiWhen the table with the list of sciipts is seaiched foi a sciipt, and no entiv is found, and theieexists an entiv foi the DFLT sciipt, then this entiv must be used. Fuitheimoie, the default sciipt canonlv contain a single, default, language.Language system tags identifv the language svstems suppoited in an OpenTvpe font. What is meantbv a language svstem in this context is a set of tvpogiaphic conventions foi how text in a given sciiptshould be piesented. Such conventions mav be associated with paiticulai languages, with paiticulaigenies of usage, with dieient publications, and othei such factois. Foi example, paiticulai glvph vaii-ants foi ceitain chaiacteis mav be iequiied foi paiticulai languages, oi foi phonetic tiansciiption oimathematical notation.Note that two oi moie languages mav follow the same conventions oi that moie than one set oftvpogiaphic conventions can applv to a given language. Teiefoie language svstem tags do not coiie-spond in a one-to-one mannei with languages.'Language svstem tags aie foui-bvte chaiactei stiings composed of up to foui chaiacteis in theASCII chaiacteis iange 0x20-0x7E, padding with blanks (0x20) if iequiied. A list of languages andtheii language svstem tags follows.dflt DefaultABA AbazaABK AbkhazianADY AdvgheAFK AfiikaansAFR AfaiAGW AgawALT AltaiAMH AmhaiicAPPH Phonetic tiansciiption(Ameiicanist conventions)ARA AiabicARI AaiiARK AiakaneseASM AssameseATH AthapaskanAVR AvaiAWA AwadhiAYM AvmaiaAZE AzeiiBAD BadagaBAG BaghelkhandiBAL BalkaiBAU BauleBBR BeibeiBCH BenchBCR Bible CieeBEL BelaiussianBEM BembaBEN BengaliBGR BulgaiianBHI BhiliBHO BhojpuiiBIK BikolBIL BilenBKF BlackfootBLI BalochiBLN BalanteBLT BaltiBMB BambaiaBML BamilekeBRE BietonBRH BiahuiBRI Biaj BhashaBRM BuimeseBSH BashkiiBTI BetiCAT CatalanCEB CebuanoCHE ChechenCHG Chaha GuiageCHH ChattisgaihiCHI ChichewaCHK ChukchiCHP ChipewvanCHR CheiokeeCHU ChuvashCMR ComoiianCOP CopticCRE CieeCRR CaiiieiCRT Ciimean TataiCSL Chuich SlavonicCSY CzechDAN DanishDAR Daigwa'See http://www.microsoft.com/typography/otspec/scripttags.htm foi an up-to-date list of language tagsand the coiiespondece to the ISO 639 codes, which identifv individual languages as well as foi ceitain collections of languages.8xetex-opentvpe.tex,v: 2.01 2009/06/151.4 OpenTypeDCR Woods CieeDEU Geiman (Standaid)DGR DogiiDHV DhivehiDJR DjeimaDNG DangmeDNK DinkaDUN DunganDZN DzongkhaEBI EbiiaECR Eastein CieeEDO EdoEFI EkELL GieekENG EnglishERZ EizvaESP SpanishETI EstonianEUQ BasqueEVK EvenkiEVN EvenEWE EweFAN Fiench AntilleanFAR FaisiFIN FinnishFJI FijianFLE FlemishFNE Foiest NenetsFON FonFOS FaioeseFRA Fiench (Standaid)FRI FiisianFRL FiiulianFTA FutaFUL FulaniGAD GaGAE GaelicGAG GagauzGAL GalicianGAR GaishuniGAW GaihwaliGEZ GeezGIL GilvakGMZ GumuzGON GondiGRN GieenlandicGRO GaioGUA GuaianiGUJ GujaiatiHAI HaitianHAL HalamHAR HaiautiHAU HausaHAW HawaiinHBN Hammei-BannaHIL HiligavnonHIN HindiHMA High MaiiHND HindkoHO HoHRI HaiaiiHRV CioatianHUN HungaiianHYE AimenianIBO IgboIJO IjoILO IlokanoIND IndonesianING IngushINU InuktitutIPPH Phonetic tiansciiption (IPAconventions)IRI IiishIRT Iiish TiaditionalISL IcelandicISM Inaii SamiITA ItalianIWR HebiewJAN IapaneseJAV IavaneseJII YiddishJUD IudezmoJUL IulaKAB KabaidianKAC KachchiKAL KalenjinKAN KannadaKAR KaiachavKAT GeoigianKAZ KazakhKEB KebenaKGE Khutsuii GeoigianKHA KhakassKHK Khantv-KazimKHM KhmeiKHS Khantv-ShuiishkaiKHV Khantv-VakhiKHW KhowaiKIK KikuvuKIR KiighizKIS KisiiKKN KokniKLM KalmvkKMB KambaKMN KumaoniKMO KomoKMS KomsoKNR KanuiiKOD KodaguKOK KonkaniKON KikongoKOP Komi-PeimvakKOR KoieanKOZ Komi-ZviianKPL KpelleKRI KiioKRK KaiakalpakKRL KaielianKRM KaiaimKRN KaienKRT KooieteKSH KashmiiiKSI KhasiKSM Kildin SamiKUI KuiKUL KulviKUM KumvkKUR KuidishKUU KuiukhKUY KuvKYK KoivakLAD LadinLAH LahuliLAK LakLAM LambaniLAO LaoLAT LatinLAZ LazLCR L-CieeLDK LadakhiLEZ LezgiLIN LingalaLMA Low Maiixetex-opentvpe.tex,v: 2.01 2009/06/1591 POSTSCRIPT FONTS AND BEYONDLMB LimbuLMW LomweLSB Lowei SoibianLSM Lule SamiLTH LithuanianLUB LubaLUG LugandaLUH LuhvaLUO LuoLVI LatvianMAJ MajangMAK MakuaMAL Malavalam TiaditionalMAN MansiMAR MaiathiMAW MaiwaiiMBN MbunduMCH ManchuMCR Moose CieeMDE MendeMEN MeenMIZ MizoMKD MacedonianMLE MaleMLG MalagasvMLN MalinkeMLR Malavalam RefoimedMLY MalavMND MandinkaMNG MongolianMNI ManipuiiMNK ManinkaMNX Manx GaelicMOK MokshaMOL MoldavianMON MonMOR MoioccanMRI MaoiiMTH MaithiliMTS MalteseMUN MundaiiNAG Naga-AssameseNAN NanaiNAS NaskapiNCR N-CieeNDB NdebeleNDG NdongaNEP NepaliNEW NewaiiNHC Noiwav House CieeNIS NisiNIU NiueanNKL NkoleNLD DutchNOG NogaiNOR NoiwegianNSM Noithein SamiNTA Noithein TaiNTO EspeiantoNYN NvnoiskOCR Oji-CieeOJB OjibwavORI OiivaORO OiomoOSS OssetianPAA Palestinian AiamaicPAL PaliPAN PunjabiPAP PalpaPAS PashtoPGR Polvtonic GieekPIL PilipinoPLG PalaungPLK PolishPRO PiovencalPTG PoitugueseQIN ChinRAJ RajasthaniRBU Russian BuiiatRCR R-CieeRIA RiangRMS Rhaeto-RomanicROM RomanianROY RomanvRSY RusvnRUA RuandaRUS RussianSAD SadiiSAN SanskiitSAT SantaliSAY SavisiSEK SekotaSEL SelkupSGO SangoSHN ShanSIB SibeSID SidamoSIG Silte GuiageSKS Skolt SamiSKY SlovakSLA SlavevSLV SlovenianSML SomaliSMO SamoanSNA SenaSND SindhiSNH SinhaleseSNK SoninkeSOG Sodo GuiageSOT SothoSQI AlbanianSRB SeibianSRK SaiaikiSRR SeieiSSL South SlavevSSM Southein SamiSUR SuiiSVA SvanSVE SwedishSWA Swadava AiamaicSWK SwahiliSWZ SwaziSXT SutuSYR SviiacTAB TabasaianTAJ TajikiTAM TamilTAT TataiTCR TH-CieeTEL TeluguTGN TonganTGR TigieTGY TigiinvaTHA TaiTHT TahitianTIB TibetanTKM TuikmenTMN Temne10xetex-opentvpe.tex,v: 2.01 2009/06/151.4 OpenTypeTNA TswanaTNE Tundia NenetsTNG TongaTOD TodoTRK TuikishTSG TsongaTUA Tuiovo AiamaicTUL TuluTUV TuvinTWI TwiUDM UdmuitUKR UkiainianURD UiduUSB Uppei SoibianUYG UvghuiUZB UzbekVEN VendaVIT VietnameseWAG WagdiWA WaWCR West-CieeWEL WelshWLF WolofXHS XhosaYAK YakutYBA YoiubaYCR Y-CieeYIC Yi ClassicYIM Yi ModeinZHP Chinese PhoneticZHS Chinese SimpliedZHT Chinese TiaditionalZND ZandeZUL Zulu1.4.2 OpenType featuresFeatures piovide infoimation about howto use the glvphs in an OpenTvpe oi TiueTvpe font to iendei asciipt oi language. Foi example, an Aiabic font might have a featuie foi substituting initial glvph foims,and a Kanji font might have a featuie foi positioning glvphs veiticallv. All OpenTvpe Lavout featuiesdene data foi glvph substitution, glvph positioning, oi both.Each OpenTvpe Lavout featuie has a featuie tag that identies its tvpogiaphic function and eects.Bv examining a featuies tag, a text-piocessing client can deteimine what a featuie does and decidewhethei to implement it. All tags aie foui-bvte chaiactei stiings composed of a limited set of ASCIIchaiacteis (iange 0x20-0x7E).A featuie denition does not necessaiilv piovide all the infoimation iequiied to piopeilv imple-ment glvph substitution oi positioning actions. Ofen, a text-piocessing client mav need to supplv ad-ditional data' In all cases, the text-piocessing client is iesponsible foi applving, combining, and aibi-tiating among featuies and iendeiing the iesult.Te list of featuies iegisteied bv Miciosof togethei with a shoit desciiption follows.aalt Access All Alteinatesabvf Above-base Foimsabvm Above-base Maik Position-ingabvs Above-base Substitutionsafrc Alteinative Fiactionsakhn Akhandsblwf Below-base Foimsblwm Below-base Maik Position-ingblws Below-base Substitutionscalt Contextual Alteinatescase Case-Sensitive Foimsccmp Glvph Composition andDecompositionclig Contextual Ligatuiescpsp Capital Spacingcswh Contextual Swashcurs Cuisive Positioningc2sc Small Capitals Fiom Capi-talsc2pc Petite Capitals Fiom Capi-talsdist Distancesdlig Discietionaiv Ligatuiesdnom Denominatoisexpt Expeit Foimsfalt Final Glvph on Line Altei-natesfin2 Teiminal Foims #2fin3 Teiminal Foims #3fina Teiminal Foimsfrac Fiactionsfwid Full Widthshalf Half Foimshaln Halant Foimshalt Alteinate Half Widthshist Histoiical Foimshkna Hoiizontal Kana Alteinateshlig Histoiical Ligatuieshngl Hangulhojo Hojo Kanji Foims (IIS X0212-1990 Kanji Foims)'As an example let us considei the init featuie whose function is to piovide initial glvph foims. Nothing in the featuieslookup tables indicates when oi wheie to applv this featuie duiing text piocessing. Hence, to coiiectlv use this featuie in Aiabictext wheie initial glvph foims appeai at the beginning of woids, text-piocessing clients must be able to identifv the ist glvphposition in each woid befoie making the glvph substitution.Moie details about each featuie aie available at the Miciosof OpenTvpe site http://www.microsoft.com/typography/otspec/featuretags.htm, oi Adobe developeis site http://partners.adobe.com/public/developer/opentype/index_tag3.htmlxetex-opentvpe.tex,v: 2.01 2009/06/15111 POSTSCRIPT FONTS AND BEYONDhwid Half Widthsinit Initial Foimsisol Isolated Foimsital Italicsjalt Iustication Alteinatesjp78 IIS78 Foimsjp83 IIS83 Foimsjp90 IIS90 Foimsjp04 IIS2004 Foimskern Keininglfbd Lef Boundsliga Standaid Ligatuiesljmo Leading Iamo Foimslnum Lining Figuieslocl Localized Foimsmark Maik Positioningmed2 Medial Foims #2medi Medial Foimsmgrk Mathematical Gieekmkmk Maik to Maik Positioningmset Maik Positioning via Sub-stitutionnalt Alteinate AnnotationFoimsnlck NLC Kanji Foimsnukt Nukta Foimsnumr Numeiatoisonum Oldstvle Figuiesopbd Optical Boundsordn Oidinalsornm Oinamentspalt Piopoitional AlteinateWidthspcap Petite Capitalspnum Piopoitional Figuiespref Pie-Base Foimspres Pie-base Substitutionspstf Post-base Foimspsts Post-base Substitutionspwid Piopoitional Widthsqwid Ouaitei Widthsrand Randomizerlig Requiied Ligatuiesrphf Reph Foimsrtbd Right Boundsrtla Right-to-lef alteinatesruby Rubv Notation Foimssalt Stvlistic Alteinatessinf Scientic Infeiioissize Optical sizesmcp Small Capitalssmpl Simplied Foimsss01 Stvlistic Set 1ss02 Stvlistic Set 2ss03 Stvlistic Set 3ss04 Stvlistic Set 4ss05 Stvlistic Set 5ss06 Stvlistic Set 6ss07 Stvlistic Set 7ss08 Stvlistic Set 8ss09 Stvlistic Set 9ss10 Stvlistic Set 10ss11 Stvlistic Set 11ss12 Stvlistic Set 12ss13 Stvlistic Set 13ss14 Stvlistic Set 14ss15 Stvlistic Set 15ss16 Stvlistic Set 16ss17 Stvlistic Set 17ss18 Stvlistic Set 18ss19 Stvlistic Set 19ss20 Stvlistic Set 20subs Subsciiptsups Supeisciiptswsh Swashtitl Titlingtjmo Tiailing Iamo Foimstnam Tiaditional Name Foimstnum Tabulai Figuiestrad Tiaditional Foimstwid Tiid Widthsunic Unicasevalt Alteinate Veitical Metiicsvatu Vattu Vaiiantsvert Veitical Wiitingvhal Alteinate Veitical HalfMetiicsvjmo Vowel Iamo Foimsvkna Veitical Kana Alteinatesvkrn Veitical Keiningvpal Piopoitional AlteinateVeitical Metiicsvrt2 Veitical Alteinates and Ro-tationzero Slashed Zeio1.4.3 OpenType support todayAs an example of how publishing applications can exploit OpenTvpes lavout featuies we can look atOpenTvpe suppoit in Adobes Illustrator, InDesign and Photoshop' piogiams. Tese include automaticsubstitution bv alteinate glvphs in an OpenTvpe Pio font (ligatuies, small capitals, and piopoitionalold-stvle guies, veitical shif of punctuation in an all-caps setting). Moieovei, anv alteinate glvphsin OpenTvpe fonts mav be selected manuallv via the Insert Character palette (see Figuie 1.1 on thefacing page). Tese OpenTvpe Pio fonts oei a full iange of accented chaiacteis to suppoit all centialand eastein Euiopean languages, and manv of them also contain suppoit foi the Cviillic and Gieekalphabets.Featuie suppoit acioss Miciosofs Oce applications exists foi those featuies that aie necessaivfoi language suppoit, such as contextual substitutions foi Aiabicand onlv in the languages whichiequiie them (e.g., Woid 2003 does contextual substitutions foi Aiabic, but not foi English).'See http://www.adobe.com/products/XXX/main.htm, wheie XXX stands foi illustrator, indesign, and pho-toshop, iespectivelv.12xetex-opentvpe.tex,v: 2.01 2009/06/151.4 OpenTypeFiguie 1.1: Using OpenTvpes advanced tvpogiaphic featuies in Adobe InDesign. Lef: selection of au-tomatic substitution of ligatuies and old-stvle guies on a menu. Right: select and inseit anv alteinateglvph Insert Character palette.Openoce on all suppoited platfoims has a somewhat similai appioach to Miciosofs Oce suitein that it allows one to use the chaiacteis piesent in the font but does not ieallv piesent an inteiface tothe advanced tvpogiaphic featuies (see Figuie 1.2 on the next page).Tat leaves us with the availabilitv of the fonts themselves. Aiound the veai 2000 theie weie onlv ahandful of OpenTvpe fonts, and almost all of them weie fiom Adobe. Nowadavs, theie aie thousandsavailable fiomovei two dozen font foundiies. Foi instance, the entiie Adobe Tvpe Libiaiv of ovei 2,200fonts has been tianslated into the OpenTvpe foimat, URW has ieleased ovei 1,000 OpenTvpe fonts,and othei laige foundiies, such as Linotvpe and Agfa Monotvpe, as well as most smallei foundiies, aiealso cieating OpenTvpe fonts. Most of Miciosofs svstem fonts, and Apples Iapanese svstem fonts, aieOpenTvpe. Similailv, OpenTvpe is being embiaced bv majoi tvpe foundiies foi non-alphabetic sciipts,such as Chinese and Iapanese.Howevei, it is not enough foi a font to be in the OpenTvpe foimat to be suie that it has extendedlanguage suppoit oi extia tvpogiaphic featuies. Teiefoie, befoie puichasing, vou should examine thefeatuies piesent in a font.' To inspect a font that vou alieadv have on voui Miciosof Windows svstem,vou can install the Font Properties Extension fiom Miciosof. Tis add-on allows vou to iight-click ona font to displav a much expanded set of piopeities, which includes language suppoit and OpenTvpelavout featuies (see Figuie 1.3 on page 15).1.4.4 Interrogating OpenType fontsEddie Kohleis otnfo piogiam piints infoimation about an OpenTvpe font.> otfinfo --help'Otfinfo' reports information about an OpenType font to standard output.Options specify what information to print.Usage: otfinfo [-sfzpg] [OTFFILES...]Query options:-s, --scripts Report font's supported scripts.-f, --features Report font's GSUB/GPOS features.-z, --optical-size Report font's optical size information.-p, --postscript-name Report font's PostScript name.-a, --family Report font's family name.'In the case of Adobe, wheie cuiientlv not all fonts ieleased in OpenTvpe foimat have signicant added featuies oi extendedlanguage suppoit, vou biowse all fonts in the Adobe Type Library fiom the URL http://store.adobe.com/type/main.html, so that vou can inspect the font vou aie inteiested in. Othei font vendois oei similai possibilities.Pait of his lcdf tools, see www.lcdf.org/type/.xetex-opentvpe.tex,v: 2.01 2009/06/15131 POSTSCRIPT FONTS AND BEYONDFiguie 1.2: OpenTvpe Unicode suppoit in OpenOce. Te top panel shows text in vaiious alphabetsand the bottom panel the chaiacteis available in the Gieek pait of font lavout.-v, --font-version Report font's version information.-i, --info Report font's names and designer/vendor info.-g, --glyphs Report font's glyph names.-t, --tables Report font's OpenType tables.Other options:--script=SCRIPT[.LANG] Set script used for --features [latn].-V, --verbose Print progress information to standard error.-h, --help Print this message and exit.-q, --quiet Do not generate any error messages.--version Print version number and exit.> otfinfo --info texmf-commercial/fonts/opentype/adobe/minionpro-regular.otfFamily: Minion ProSubfamily: RegularFull name: Minion ProPostScript name: MinionPro-RegularVersion: Version 2.012;PS 002.000;Core 1.0.38;makeotf.lib1.6.6565Unique ID: 2.012;ADBE;MinionPro-Regular14xetex-opentvpe.tex,v: 2.01 2009/06/151.4 OpenTypeFiguie 1.3: Miciosofs Fonts Extension utilitv displavs OpenTvpe featuies foi MinionPio-Regulai andthe suppoited Chaiactei sets foi MviiadPio-Bold when vou iight-click on the font (Tis utilitv, ttfext,adds seveial new piopeitv tabs to the standaids piopeities dialog box, such as infoimation ielating tofont oiigination and copviight, the tvpe sizes to which hinting and smoothing aie applied, and the codepages suppoited bv extended chaiactei. It can be downloaded fiomhttp://www.microsoft.com/typography/TrueTypeProperty21.mspx.)Designer: Robert SlimbachVendor URL: http://www.adobe.com/type/Trademark: Minion is either a ...Copyright: 2000, 2002, 2004 ...License URL: http://www.adobe.com/type/legal.html> otfinfo --script texmf-commercial/fonts/opentype/adobe/minionpro-regular.otfcyrl Cyrillic latn.DEU Latin/German (Standard)grek Greek latn.MOL Latin/Moldavianlatn Latin latn.ROM Latin/Romanianlatn.AZE Latin/Azeri latn.SRB Latin/Serbianlatn.CRT Latin/Crimean Tatar latn.TRK Latin/Turkish> otfinfo --tables texmf-commercial/fonts/opentype/adobe/minionpro-regular.otf64 BASE 54 head132417 CFF 36 hhea5228 DSIG 6652 hmtx40074 GPOS 6 maxp13872 GSUB 1533 name96 OS/2 32 post4048 cmap> otfinfo --features texmf-commercial/fonts/opentype/adobe/minionpro-regular.otfxetex-opentvpe.tex,v: 2.01 2009/06/15151 POSTSCRIPT FONTS AND BEYONDaalt Access All Alternates c2sc Small Capitals From Capitalscase Case-Sensitive Forms cpsp Capital Spacingdlig Discretionary Ligatures dnom Denominatorsfina Terminal Forms frac Fractionshist Historical Forms kern Kerningliga Standard Ligatures lnum Lining Figuresnumr Numerators onum Oldstyle Figuresordn Ordinals ornm Ornamentspnum Proportional Figures salt Stylistic Alternatessinf Scientific Inferiors size Optical Sizesmcp Small Capitals ss01 Stylistic Set 1ss02 Stylistic Set 2 sups Superscripttnum Tabular Figures zero Slashed ZeroIust van Rossums ttx utilitv' can decompile the contents of an OpenTvpe font and output it inXML foimat. Tis comes in handv if vou want to studv the contents of a given font (e.g., its tables) oi(slightlv) modifv it.> ttx --helpusage: ttx [options] inputfile1 [... inputfileN]TTX 2.0b1 -- From OpenType To XML And BackIf an input file is a TrueType or OpenType font file, it will bedumped to an TTX file (an XML-based text format).If an input file is a TTX file, it will be compiled to a TrueTypeor OpenType font file.Output files are created so they are unique: an existing file isnever overwritten.General options:-h Help: print this message-d Specify a directory where the output files areto be created.-v Verbose: more messages will be written to stdout about whatis being done.Dump options:-l List table info: instead of dumping to a TTX file, list someminimal info about each table.-t Specify a table to dump. Multiple -t optionsare allowed. When no -t option is specified, all tableswill be dumped.-x Specify a table to exclude from the dump. Multiple-x options are allowed. -t and -x are mutually exclusive.-s Split tables: save the TTX data into separate TTX files pertable and write one small TTX file that contains referencesto the individual table dumps. This file can be used asinput to ttx, as long as the table files are in thesame directory.-i Do NOT disassemble TT instructions: when this option is given,all TrueType programs (glyph programs, the font program and the'Wiitten in Pvthon and pait of the FontTools toolset (sourceforge.net/projects/fonttools).16xetex-opentvpe.tex,v: 2.01 2009/06/151.4 OpenTypepre-program) will be written to the TTX file as hex datainstead of assembly. This saves some time and makes the TTXfile smaller.Compile options:-m Merge with TrueType-input-file: specify a TrueType or OpenTypefont file to be merged with the TTX file. This option is onlyvalid when at most one TTX file is specified.-b Don't recalc glyph bounding boxes: use the values in the TTXfile as-is.Tus, to decompile a font myfont.otf just specifv:> ttx myfont.otfTis will wiite a le myfon.ttx in the diiectoiv wheie the font le iesides. If vou aie onlv inteiestedin two tables (e.g., GSUB and GPOS), specifv them on the command line:> ttx -t GSUB -t GPOS myfont.otfTo conveit an XML le myfont.ttx back into an OpenTvpe oi TiueTvpe le is similailv easv:> ttx myfont.ttxIt vou want to intioduce modications (e.g., given in XML foimat in the le myfontmods.ttx) intoan OpenTvpe le, use the -m option, as follows:> ttx -m myfont.otf myfontmods.ttxA moie explicit example with the font MinionPio follows.> ttx -l /texlive/2007/texmf-commercial/fonts/opentype/adobe/minionpro-regular.otfListing table info for"/texlive/2007/texmf-commercial/fonts/opentype/adobe/minionpro-regular.otf":tag checksum length offset tag checksum length offset---- ---------- ------- ------- ---- ---------- ------- -------BASE 0x086729a7 64 199052 CFF 0x101232c2 132417 6032DSIG 0x446dbd94 5228 199116 GPOS 0xx71552700 40074 158976GSUB 0xx3bf7bcba 13872 145104 OS/2 0x40e57e9f 96 320cmap 0x0cedc8f1 4048 1952 head 0xx2167aded 54 220hhea 0x09140bb5 36 276 hmtx 0xx37425493 6652 138452maxp 0x067f5000 6 312 name 0x3cf7b183 1533 416post 0x0x47ffce 32 6000ttx -d. -t head /texlive/2007/texmf-commercial/fonts/opentype/adobe/minionpro-regular.otfDumping "/texlive/2007/texmf-commercial/fonts/opentype/adobe/minionpro-regular.otf"to "./minionpro-regular.ttx"...Dumping 'head' table...> less ./minionpro-regular.ttx

xetex-opentvpe.tex,v: 2.01 2009/06/15171 POSTSCRIPT FONTS AND BEYOND

Foi ieasons of eciencv TiueTvpe and OpenTvpe font instances can be giouped into collection(.ttc), so that dieient fonts can shaie common tables to desciibe glvphs. Some piogiams aie notable to extiact the vaiious font components fiom such a collection. To help with this pioblem a smallutilitv, ttc2ttf, exists to extiact the font instances fiom a collection.18xetex-opentvpe.tex,v: 2.01 2009/06/15C H A P T E R 2X E TEX: TEX meets OpenTypeand Unicode2.1 X E TEX: a historical introduction and some basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212.2 X E TEX: typesetting with glyphs, characters and fonts . . . . . . . . . . . . . . . . . . . . . . . . . . . 232.3 Supplementary commands introduced by X E TEX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332.4 fontspec . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432.5 X E TEX and other engines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47X E TEX is a tvpesetting svstem based on a meigei of e-TEX with Unicode and modein font technologies.Ionathan Kew is the main developei behind X E TEX. X E TEXs main aim is to deal with the complexities(notice the coloied paits on the chaiacteis in Figuie 2.1) needed to tvpeset texts in the vaiious sciiptsused in the woild (Figuie 2.2 on the next page), in paiticulai in Asia (Figuie 2.3 on the following page).Figuie 2.1: Complexities when dealing with vaiious languages2 X E TEX: TEX MEETS OPENTYPE AND UNICODEFiguie 2.2: Sciipts used in vaiious paits of the woildFiguie 2.3: Asian sciipts20xetex-geneial.tex,v: 2.02 2009/06/152.1 X E TEX: a historical introduction and some basicsWe stait the chaptei with an intioduction, a shoit histoiv and an oveiview of the basic opeiatingpiinciples of X E TEX(Section 2.1). X E TEXs chaiactei/glvph model, its tvpesetting algoiithm and the wavit handles fonts is the subject of Section 2.2.Section 2.3 piesents in detail the supplementaiv commands intioduced bv X E TEX, in paiticulai itsextension to TeXs \font command to take full advantage of the possibilities of the OpenTvpe fonts.A LATEX inteiface to X E TEXs font handling is piesented in Section 2.4.2.1 X E TEX: a historical introduction and some basicsX E TEX' was developed at SIL bv its authoi Ionathan Kew. One of X E TEXs impoitant aims is to allowthe TEX engine to diiectlv use fonts available on the opeiating svstem. Technicallv this is implementedbv augmenting TEXs \font command so that it asks the host opeiating svstem to locate a given font(using its ieal name, as known to the opeiating svstem, not some civptic lename, e.g., la Beiiv) inwhatevei font collection available. Tis means that all fonts known on a svstem and available to theusei inteiface become usable foi tvpesetting in XeTeX and with the same names. Hence it is no longeinecessaiv to iun anv TEX-specic pioceduies (e.g., fontinst, oi applv one of the iecipes desciibed eai-liei in this chaptei). When X E TEX is instiucted to use a font, it locates the actual font le itself (it canhandle all thiee vaiiants OpenTvpe, PostSciipt Tvpe 1, and TiueTvpe), and no longei needs a .tfm le.XeTeXs paiagiaph building ioutine thus obtains metiic infoimation about the chaiactei glvphs diiectlvfiom the font le. In addition, it has to take caie of the complexities of mapping chaiacteis to glvphs,paiticulailv in cuisive and non-Latin sciipts. Teiefoie, XeTeX does not build its paiagiaphs fiom listsof chaiacteis, but fiom woids, each of which consists of a whole iun of consecutive chaiacteis in agiven font. Linguistical and tvpogiaphical tiansfoimations and eects aie delegated to the appiopii-ate lavout engine (X E TEX has inteifaces to ATSUI,` ICU, and SILs Giaphite). Te iesult is an aiiavof glvphs and theii positions that iepiesent woids as laid out using the cuiient font. Fiom this list ofwoids, which aie inteileaved with glue, penalties, etc., a paiagiaph is built. Of couise, when hvphen-ation is iequiied, woids mav have to be taken apait and ieassembled afeiwaids using possible bieakpositions. Neveitheless the basic idea iemains: collect iuns of chaiacteis, hand them down as completeunits to a font iendeiing libiaiv, which is capable of handling the lavout at the level of the individualglvphs.X E TEX woiks with an extended veision of the existing dvipdfmx PDF diivei, wheie the help ofIin-Hwan Cho has to be acknowledged. Akiia Kakutos W32tex (http://www.fsci.fuk.kindai.ac.jp/kakuto/win32-ptex) has contiibuted a lot to make X E TEXavailable on Miciosof Windows.Ross Mooie has woiked on giaphics and coloi diiveis, while Mivata Shigeiu has impioved the handlingof veitical text and CIK suppoit in both X E TEX itself and the diivei, and piovides suppoit foi PSTricksgiaphics.'Tis section is based on an inteiview with X E TEXs authoi Ionathan Kew. Foi the full text of the inteiview see http://tug.org/interviews/interview-files/jonathan-kew.html.SIL (initiallv known as the Summer Institute of Linguistics, see http://www.sil.org foi moie infoimation) was cieatedin 1934. It now has about 5,000 collaboiatois coming fiom ovei 60 countiies. SILs main activitv is the linguistic investigation ofsome 1,800 languages spoken bv moie than a billion people in moie than 70 countiies. In paiticulai, SIL publishes Ethnologue,languages of the world (http://www.ethnologue.com/), a book which desciibes 6912 languages spoken on eaith.`Apple Tvpe Seivices foi Unicode Imaging is the technologv behind all text diawing in Mac OS X, and is thus available onthat platfoim onlv. ATSUI allows ne contiol ovei lavout featuies, piovides advanced multilingual text-piocessing seivices, andsuppoits high-end tvpogiaphv. Foi details see http://developer.apple.com/documentation/Carbon/Conceptual/ATSUI_Concepts/.International Components for Unicode. ICU is a widelv poitable set of C/C++ and Iava libiaiies pioviding Unicode and global-ization suppoit foi sofwaie applications. ICU ensuie that applications give the same iesults on all platfoims and between C/C++and Iava sofwaie, see http://www.icu-project.org/.Giaphite is a pioject to piovide iendeiing capabilities foi complex non-Roman wiiting svstems. Giaphite iuns on vaiiouscomputei platfoims and allows the cieation of smait fonts which suppoit displaving in wiiting svstems with vaiious complexbehaviois. Details aie available at http://scripts.sil.org/RenderingGraphite.xetex-geneial.tex,v: 2.02 2009/06/15212 X E TEX: TEX MEETS OPENTYPE AND UNICODELATEX integiation foi X E TEX is foi a laige pait the woik of Will Robeitson.' Although X E TEX ac-cepts Unicode input and suppoits OpenTvpe fonts, X E TEXs inteifaces with OpenTvpe fonts is iatheilow-level. Foi instance, font featuies, such as using loweicase numbeis instead of uppeicase numbeis,aie activated with haid to iemembei stiings such as +onum. Will Robeitsons fontspec package pio-vides a moie ieadable and easv to use inteiface to such things with kevval-tvpe options, such as "Num-bers=Lowercase" foi the above example. If vou want to use a new OpenTvpe (oi TiueTvpe) font,vou no lonei need to mess aiound with extia les foi font metiics and font denitions. It is sucientto declaie voui new font with the command \setmainfont in the pieamble to select that font as themain document font.Bv default, LATEXs NFSS mechanism onlv deals well with macioscopic font vaiiations, such asweight, shape and size. fontspec extends LATEXs font handling bv pioviding suppoit foi font featuies,which allow the usei at anv point in the document to vaiv a bioad iange of tvpogiaphical details bvusing dieient font instances.2.1.1 A brief history Apiil 2004: X E TEX 0.3 was ielased to the TEX communitv (on Mac OS X onlv) and oeied: integiated Unicode suppoit access to all fonts installed on the computei AAT (Apple Advanced Tvpogiaphv) foi tvpogiaphic featuies Ouicktime foi giaphics suppoit Febiuaiv 2005 : X E TEX 0.9 was ieleased with as featuies: Opentvpe suppoit compatibilitv with moie impoitant LATEX packages Apiil 2006 (BachoTEX): X E TEXfoi Linux was ieleased (ist public announcement of the availabilitvof X E TEX) Iune 2006: Akiia Kakuto announces the availabilitv of X E TEX on MS Windows Febiuaiv 2007: TEXLive 2007 contains X E TEX 0.996 foi all suppoited binaiv platfoims Septembei 2007: X E TEX 0.997 available with MikTeX 2.7 (beta) Fall 2008: TEXLive 2008 contains X E TEX 0.999 foi all suppoited binaiv platfoims Summei 2009: TEXLive 2009 contains X E TEX 0.999.5 foi all suppoited binaiv platfoims2.1.2 X E TEX: basic principles based on e-TEXs tvpesetting engine includes TeX--XeT (commands \beginL, \endL, \beginR and \endR activated with\TeXXeTstate=1) foi bi-diiectional tvpesetting (Aiabic, Hebiew, etc.) Unicode encoding (UTF-8 oi UTF-16) used bv default most LATEXextensions (e.g., graphics, xcolor, geometry, crop, hyperref, pgf) nowautomaticallv detectthe piesence of the X E TEX engine and aie compatible with it diiectlv uses OpenTvpe, TiueTvpe and PostSciipt fonts installed on the svstem without the needto cieate TEX-specic les (.tfm, .vf, .fd, etc.)'See http://tug.org/interviews/interview-files/will-robertson.html foi an inteiviewwith Will Robeit-son.22xetex-geneial.tex,v: 2.02 2009/06/152.2 X E TEX: typesetting with glyphs, characters and fonts piovides access to OpenTvpe featuies (ligatuies, swash, glvph alteinatives, dvnamic attachment ofaccents, etc.) thanks to Unicode piovides access to chaiacteis in extended alphabetic (Latin, Cviillic, Gieek,Aiabic, Devanagaii, etc.) and complex sciipts. allows the concuiient use of multiple sciipts in a single document thus making piocessing multi-lingual texts much simpleiX E TEXs diiect use of Unicode chaiacteis as input and of OpenTvpe Unicode-encoded fonts makespie-piocessois oi complex macios foi handling composite chaiacteis oi complex sciipts mostlv un-necessaiv. As an example let us considei the wav TEX and X E TEX handle some inputTEX input X E TEX input typeset output notes\'{a} \`{e} \^{o} tvpical accents\c{c} \AA composed chaiacteisd\v{z}abe {\dj}ak dabe ak dabe ak moie composed chaiacteis--- \char"2014 specic ligatuie in TEX fonts$\alpha$ \char"1D6FC mathematical svmbol (plane 1){\dn acchaa} TEX needs ad hoc piepiocessoi2.2 X E TEX: typesetting with glyphs, characters and fontsX E TEX delegates the iendeiing of Unicode chaiacteis to the freetype libiaiv' and uses the font congu-iation libiaiv fontcong foi accessing font les (othei than TEX-specic fonts). Te fontcong libiaivlets vou conguie, customize and manage fonts foi all applications which need to access fonts piesenton voui computing svstem.2.2.1 Accessing font with fontcongTe infoimation conceining fonts is stoied in XML foimat` and vou, as usei, should specifv wheie vouiOpenTvpe fonts live in the le $HOME/.fonts.conf, as in the following example of such a le.

/home/goossens/texlive/2007/texmf-update/fonts/opentype/home/goossens/texlive/2007/texmf-commercial/fonts/opentype/home/goossens/texlive/2007/texmf-dist/fonts/opentype

On Miciosof Windows, when iunning MikTeX, the le fonts.conf contains a line to includethe le localfonts.conf. Both these les live in the diiectoivc:\Documents and Settings\All Users\Application Data\MiKTeX\2.7\fontconfig\config'See http://sourceforge.net/projects/freetype/.See http://fontconfig.org/wiki/. You need at least fontcong veision 2.4 foi X E TEX to function coiiectlv.`Tese les use a svntax dened bv a giammai specied as a DTD (/etc/fonts/fonts.dtd). Te svstem-wide congu-iation le lives in /etc/fonts/fonts.conf.xetex-geneial.tex,v: 2.02 2009/06/15232 X E TEX: TEX MEETS OPENTYPE AND UNICODETe le localfonts.conf has the following content.

C:\WINNT\FontsC:\Program Files\MiKTeX 2.7\fonts/type1C:\Program Files\MiKTeX 2.7\fonts/opentypec:\TeXlive2007\texmf-dist\fonts\opentypec:\TeXlive2007\texmf-update\fonts\opentypec:\TeXlive2007\texmf-commercial\fonts\opentype

Note that MiKTeX includes bv default Miciosof Windows (\WINNT\Fonts), as well as its own stan-daid font diiectoiies. We added thiee othei ones fiom the TEXLive tiees (as in the example above).Te fontcong libiaiv comes with thiee piogiams, two foi pioviding infoimation about the fontles declaied (i.e., ndable bv fontcong) on voui svstem (fc-match and fc-list), and one (fc-cache) foi(ie)geneiating a font cache of all fonts (a fc-cache command should be issued each time a new fontis installed oi deleted).> fc-list --helpusage: fc-list [-vV?] [--verbose] [--version] [--help] [pattern] element ...List fonts matching [pattern]-v, --verbose display status information while busy-V, --version display font config version and exit-?, --help display this help and exit> fc-match --helpusage: fc-match [-svV?] [--sort] [--verbose] [--version] [--help] [pattern]List fonts matching [pattern]-s, --sort display sorted list of matches-v, --verbose display entire font pattern-V, --version display font config version and exit-?, --help display this help and exit> fc-cache --helpusage: fc-cache [-frsvV?] [--force|--really-force] [--system-only] [--verbose] [--version] [--help] [dirs]Build font information caches in [dirs](all directories in font configuration by default).-f, --force scan directories with apparently valid caches-r, --really-force erase all existing caches, then rescan-s, --system-only scan system-wide directories only-v, --verbose display status information while busy-V, --version display font config version and exit-?, --help display this help and exit> fc-list 'Minion Pro'Minion Pro,Minion Pro Subh:style=Italic Subhead,ItalicMinion Pro:style=Bold ItalicMinion Pro,Minion Pro SmBd Cond Capt:style=Semibold Cond Caption,RegularMinion Pro,Minion Pro Cond Disp:style=Bold Cond Display,BoldMinion Pro,Minion Pro Disp:style=Display,RegularMinion Pro,Minion Pro SmBd Subh:style=Semibold Italic Subhead,ItalicMinion Pro,Minion Pro SmBd Cond Capt:style=Semibold Cond Italic Caption,ItalicMinion Pro,Minion Pro Capt:style=Bold Caption,BoldMinion Pro,Minion Pro Cond Subh:style=Bold Cond Italic Subhead,Bold ItalicMinion Pro,Minion Pro SmBd:style=Semibold,RegularMinion Pro,Minion Pro Cond Disp:style=Bold Cond Italic Display,Bold ItalicMinion Pro:style=Regular...Many more lines...Minion Pro:style=BoldMinion Pro,Minion Pro Cond:style=Bold Cond,BoldMinion Pro,Minion Pro Cond:style=Bold Cond Italic,Bold ItalicMinion Pro,Minion Pro SmBd:style=Semibold Italic,ItalicMinion Pro,Minion Pro Disp:style=Italic Display,Italic24xetex-geneial.tex,v: 2.02 2009/06/152.2 X E TEX: typesetting with glyphs, characters and fonts2.2.2 Specifying character codesTe ist step towaids Unicode suppoit in TEX is to expand the chaiactei set bevond the oiiginal 256-chaiactei limit. At the lowest level, this means changing inteinal data stiuctuies thioughout, wheieveichaiacteis weie stoied as 8-bit values. As Unicode scalai values mav be up to U+10FFFF, an obviousmodication would be to make chaiacteis 32 bits wide, and tieat Unicode chaiacteis as the basicunits of text.Howevei, in X E TEXa piagmatic decision was made to woik inteinallv with UTF-16 as the encodingfoim of Unicode, making chaiacteis in the engine 16 bits wide, and handling supplementaiv-planechaiacteis using UTF-16 suiiogate paiis. Tis choice was made foi a numbei of ieasons: X E TEXuses opeiating svstemapplications piogiaminteifaces that expect UTF-16 encodedstieams,so woiking with this encoding foim avoids the need foi conveision at this inteiface. Manv of standaidTEXs inteinal tables aie implementedas 256-element aiiavs indexedbv chaiacteicode. InX E TEXthese aiiavs have beenenlaigedto 65,536 elements eachto allowthemto be indexedbv UTF-16 code values.' Tese pei-chaiactei aiiavs aie used to implement chaiactei categoiies, used in paising input textinto tokens, as well as case conveisions and space factoi (a piopeitv used to modifv woid spacingfoi punctuation in Roman tvpogiaphv). In piactice, it seems unlikelv that theie will be a gieat needto customize these chaiactei piopeities foi individual supplementaiv-plane chaiacteis. Tev aieunlikelv to be wanted as escape chaiacteis oi othei special categoiies of TEX input; need not havethe lettei piopeitv that allows them to be pait of TEX contiol sequences; and piobablv do notneed to be included in automatic hvphenation patteins.In view of these factois, X E TEX woiks with UTF-16 code units, and Unicode chaiacteis bevondU+FFFF cannot be given individuallv-customized TEX piopeities. Tev can still be included in docu-ments, howevei, and will iendei coiiectlv (given appiopiiate fonts) as the UTF-16 suiiogate paiis willbe piopeilv passed to the font svstem.X E TEX uses Unicodes 16-bit UTF-16 encoding chaiacteis encoded in 16 bits uses Unicodes UTF-16 encoding exception: a few ancient dieientlv-encoded fonts extension of TEX piimitives \char, \chardef accept numbeis up to 65536 foui-digit notation using the svntax ^^^^abcd\char"5609^^^^6167 = Unicode chaiacteis in the uppei (> 0) planes use of suiiogates (standaid UTF-16) all iight foi tvpesetting'In piinciple, using full 32-bit wide aiiavs would be possible but thev would make foi extiemelv laige aiiavs and have a veivlaige memoiv footpiint. Some kind of spaise aiiav implementation would be necessaiv, but this iequiies signicant additionaldevelopment andtesting, andmight impact peifoimance of kev innei-looppaits of the TEXsvstem. Teiefoie the moie piagmatic16-bit appioach has been adopted.xetex-geneial.tex,v: 2.02 2009/06/15252 X E TEX: TEX MEETS OPENTYPE AND UNICODE does not allow text manipulation in the input stieam on the level of the individual chaiactei incieased size foi inteinal code tables foi \catcode, \lccode, \uccode, \sfcode X E TEX plain initialises its tables with Unicode code points \lowercase{DIN} din \uppercase{Esi eyama kl mae nuvwo a v la}ESI EYAMA KL MAE NUVWO A V LA \catcode`\=\active \def{...}X E TEXs default input encoding is Unicode (UTF-8 oi UTF-16). X E TEX automaticallv detects theencoding used in the input le. If a non-Unicode encoding is used, it has to be specied with a\XeTeXinputencoding command (see page 41). Such histoiical encodings aie handled with theICU conveision ioutines.2.2.3 HyphenationAt the moment X E TEX ieuses TEXs hvphenation patteins bv adding an extia Unicode lavei pio-vided bv language-specic inteimediate les in the xu-hyphen diiectoiv'. An example of such a le(xu-frhyph.tex which handles the Fiench patteins) follows.%%%%%%% xu-frhyph.tex (Wrapper for XeTeX to read frhyph.tex)\begingroup\expandafter\ifx\csname XeTeXrevision\endcsname\relax\else% frhyph.tex uses ^^xx for T1 characters% redefine them to access the required Unicode characters% (only \oe{} actually matters here!)\input xu-t1.tex\fi\input frhyph.tex\endgroupIt is seen that xu-frhyph.tex ist loads the geneiic le xu-t1.tex, which makes the letteis in theT1-encoded hvphenation pattein les active to map them onto theii Unicode equivalents. Pait of thecontents of that les follows.%%%%%%%%%%%%%%%%%%%%%%%% xu-t1.tex %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% make T1 letters \active and map them to Unicode character codes% (for use when loading hyphenation patterns that use ^^xx notation% to represent characters in T1 font encoding, or literal 8-bit% bytes if read using \XeTeXinputencoding "bytes")\catcode`\"=12 % ensure " isn't active or otherwise "weird"\catcode`\^=7 % ensure ^ is the proper catcode for hex notation%\catcode"B0=\active \def^^b0{^^^^0159} % rcaron...\catcode"DF=\active \def^^df{SS} % SS\catcode"F7=\active \def^^f7{^^^^0153} % oe\catcode"D7=\active \def^^d7{^^^^0152} % OE'With TEXLive this diiectoiv is at texmf-dist/tex/generic/xu-hyphen.26xetex-geneial.tex,v: 2.02 2009/06/152.2 X E TEX: typesetting with glyphs, characters and fonts% we don't handle the non-letter codes in the control range% but we'd better handle dotless-i (for Turkish)\catcode"19=\active \def^^19{^^^^0131} % dotlessiFoi languages that do not use the Latin alphabet othei similai iedenitions aie made in the in-teimediate les. On top of that fullv UTF-8encoded les exist foi ancient, monotonic and polvtonicmodein Gieek and foi Coptic.To hvphenate woids coiiectlv hvphenation patteins have also been extended to 16 bits. As de-sciibed pieviouslv, an inteiface between 8-bit pattein les and X E TEXs 16-bit vaiiants exists. Foi puieUnicode pattein les aie simple Unicode data, without need of commands oi active chaiacteis, as thefollowing examples show.% hyphenate before and after independent vowel111111% hyphenate following an independent vowel but never before21|21|2.2.4 Font management: the basicsX E TEX can use all modein font foimats (PostSciipt Tvpe 1, TiueTvpe, OpenTvpe) and gives access toall fonts on voui computei. Moieovei, X E TEX lets vou still use TEX-specic font les, such as tfm. Telattei aie useful foi math fonts oi foi non-Unicode encoded input les.X E TEXextends TEXs \font command (as explained latei). In paiticulai, vou can specifv the actualname of a font, iathei than its somewhat aiticial 8-chaiactei equivalent in the Fontname scheme.'Examples aie \font\rm="Adobe Caslon Pro" at 14pt \rm Bonjour GUT2007 !Bonjour GUT2007 ! \font\it="Trebuchet MS" at 14pt \it Bonjour GUT2007 !Bonjour GUT2007 ! \font\ch="Viva Std" at 14pt \ch Bonjour GUT2007 !Bonjour GUT2007 !A PDF post-piocessoi (bv default xdvipdfmx on Linux) can use the thiee font foimats mentionedabove. xdvipdfmx has access to all fonts usable bv xetex, i.e., those in font diiectoiies declaied to font-cong oi in TEXs texmf font tiees of (this is in analogv to dvips). On the othei hand, xdvipdfmx hasno suppoit foi bitmap fonts and limited xdvipdfmx geneiates PDF bv default. It onlv includes the chai-acteis of a font that aie actuallv iefeienced into the PDF le. xdvipdfmx can geneiate an inteimediateextended DVI foimat (.xdv). Tis inteimediate foimat can be useful when xetex encounteis an eiioiand does not geneiate a PDF le. In that case vou can use the following two-step piocess to investigatethe pioblem (note the use of the veibositv switch -vv).> xelatex -no-pdf mydocument> xdvipdfmx -vv -E mydocument.xdv'Fontname is maintained bv Kail Beiiv. Its documentation is available as an electionic document on CTAN at: info/fontname.xetex-geneial.tex,v: 2.02 2009/06/15272 X E TEX: TEX MEETS OPENTYPE AND UNICODE2.2.5 Font mappings using TECkitTECkit (cuiientlv veision 2.2, see http://scripts.sil.org/TECkit) is a low-level toolkit in-tended to be used bv othei applications that need to peifoim encoding conveisions (e.g., when im-poiting legacv data into a Unicode-based application). Te piimaiv component of the TECkit packageis theiefoie a libiaiv that peifoims conveisions; this is the TECkit engine. Te engine ielies on map-ping tables in a specic binaiv foimat (foi which documentation is available); theie is a compilei thatcieates such tables fiom a human-ieadable mapping desciiption (a simple text le).Widelv-used TEXkevboaiding conventions such as \'{e} oi \pounds aie implementedvia TEX macios (and theiefoie easilv adapted foi Unicode-compliant fonts, bv modifving the maciodenitions). In addition, theie aie a few established conventions that aie implemented as ligatuie iulesassociated with standaid TEX fonts; these include --- (em-dash), ?` (Spanish inveited :),and a few moie. In piinciple, smait font technologies such as AAT and OpenTvpe could implementthese same ligatuies, pioviding the same behavioi as tiaditional TEX fonts. But as these conventionsaie peculiai to the TEX woild, it is not iealistic to expect them to be piovided in mainstieam, geneial-puipose fonts.Although it would usuallv be possible to simulate these ligatuies via macio piogiamming, it isdicult to ensuie that iepiogiamming widelv-used text chaiacteis such as the hvphen, question maik,and quotation maiks will not inteifeie with othei levels of maikup in the souice document. Instead,X E TEX piovides a mechanism known as font mappings, wheiebv a mapping of Unicode chaiacteis isassociated with a paiticulai font, and applied to all stiings of text being measuied oi iendeied in thatfont. Tis is implemented using the TECkit mapping engine.While TECkit was piimaiilv designed to conveit between legacv bvte encodings and Unicode, it canalso be used to peifoim tiansfoimations on a Unicode text stieam, using the same mapping languageand text conveision libiaiv. Te following shows the le tex-text.map (in fact its binaiv equivalenttex-text.tec, which usuallv lives in the texmf tiee in subdiiectoiv texmf/fonts/misc/xetex/fontmapping/), which piovides suppoit foi noimal TEX conventions.; TECkit mapping for TeX input conventions Unicode characters; used with XeTeX to emulate Knuthian ligatures; Copyright 2006 SIL International.; You may freely use, modify and/or distribute this file.LHSName "TeX-text"RHSName "UNICODE"pass(Unicode)U+002D U+002D U+2013 ; -- -> en dashU+002D U+002D U+002D U+2014 ; --- -> em dashU+0027 U+2019 ; ' -> right single quoteU+0027 U+0027 U+201D ; '' -> right double quoteU+0022 > U+201D ; " -> right double quoteU+0060 U+2018 ; ` -> left single quoteU+0060 U+0060 U+201C ; `` -> left double quoteU+0021 U+0060 U+00A1 ; !` -> inverted exclamU+003F U+0060 U+00BF ; ?` -> inverted questionWhen associated with a standaid Unicode-compliant font in X E TEX, this has the eect of imple-28xetex-geneial.tex,v: 2.02 2009/06/152.2 X E TEX: typesetting with glyphs, characters and fontsmenting the legacv TEX conventions foi dashes and quotes, as shown in the next example, withoutiequiiing anv TEX-specic featuies in the smait fonts themselves.Exa.2-2-1 !'Typing ''quotes''---and dashes---the TEX way!!Typing quotesand dashesthe TEX way!\font\TestA="Times New Roman" at 9pt\TestA !`Typing "quotes"(1--2)---and``dashes''---the \TeX\ way!\par\bigskip\font\TestB="Times New Roman:mapping=tex-text" at 9pt\TestB !`Typing "quotes"(1--2)---and``dashes''---the \TeX\ way!\parWhile this mechanism, associating a mapping dened in teims of Unicode chaiactei sequences,was ist devised in oidei to suppoit legacv TEX input conventions, it can also be applied in othei wavs.Te following example shows how to tvpeset a single fiagment of input text in two sciip