Creating Interfaces: Localization Language & other issues character codes Homework: preparation for future topics

Upload: wendy-morton

Post on 26-Dec-2015


Page 1

Creating Interfaces: Localization

Language & other issues

character codes

Homework: preparation for future topics

Page 2

Finish presentations

• Everyone post constructive comments on at least 2 other projects.

• (Note: catch up on other postings.)

Page 3

Many, interconnected issues

• Create a web site for use in several specific 'local' places.

• Create multiple web sites, each for use in a specific place.
  – in an efficient, effective manner so any underlying common content does not need to be duplicated (and commonality diluted).

• Develop tools (networking s/w, standards, etc.) that promote the Web as a "global, interoperable tool of communication"
  – www.w3c.org

Page 4

Localization

• not just language
  – language is not just character code
  – UCS (universal character set) and UNICODE, many, many related standards to address encoding issues.

• dates
  – local date and also way to express 'western' date

• time

• money

• position on and flow across page

• acceptable images, photography, icons

• ?
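A minimal sketch of how the same date and amount can be written for different places. The `CONVENTIONS` table, its three entries, and the function names are hypothetical, chosen only for illustration. A real site would take these rules from a maintained locale database rather than a hand-written table.

```python
from datetime import date

# Hypothetical convention table (illustration only, not a complete or
# authoritative list of local rules).
CONVENTIONS = {
    "en-US": {"date": "%m/%d/%Y", "thousands": ",", "decimal": ".", "money": "${}"},
    "en-GB": {"date": "%d/%m/%Y", "thousands": ",", "decimal": ".", "money": "£{}"},
    "de-DE": {"date": "%d.%m.%Y", "thousands": ".", "decimal": ",", "money": "{} €"},
}


def format_date(day, place):
    """Write a date in the order and with the separators used in `place`."""
    return day.strftime(CONVENTIONS[place]["date"])


def format_money(amount, place):
    """Write an amount with the local separators and currency symbol."""
    rules = CONVENTIONS[place]
    text = f"{amount:,.2f}"  # starts in the en-US style, e.g. 1,234.50
    # Swap separators through a placeholder so the two replacements
    # do not interfere with each other.
    text = (
        text.replace(",", "\0")
        .replace(".", rules["decimal"])
        .replace("\0", rules["thousands"])
    )
    return rules["money"].format(text)


day = date(2015, 12, 26)
for place in CONVENTIONS:
    print(place, format_date(day, place), format_money(1234.5, place))
```

Keeping the conventions in data rather than in page text is one way to avoid duplicating the common content for each place.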

Page 5

Character code

• Note: European languages plus several other 'small' alphabets easily handled.

• We/I (typical monolingual American) can hardly appreciate the challenge:
  – two sets of Chinese characters (hanzi; called kanji in Japanese): simplified (modern, mainland China) and traditional (Taiwan + most of the Chinese diaspora)
  – 'ruby': small annotation symbols placed 'over' ideographs (e.g., pronunciation guides)

Page 6

http://www.cs.tut.fi/~jkorpela/chars.html#code

character repertoire: A set of distinct characters.

character code: A mapping, often presented in tabular form, which defines a one-to-one correspondence between characters in a character repertoire and a set of nonnegative integers.

character encoding: A method (algorithm) for presenting characters in digital form by mapping sequences of code numbers of characters into sequences of octets. In the simplest case, each character is mapped to an integer in the range 0 - 255 according to a character code and these are used as such as octets. Naturally, this only works for character repertoires with at most 256 characters. For larger sets, more complicated encodings are needed. Encodings have names, which can be registered.
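The three terms can be seen side by side in a short sketch. It uses only built-in Python string operations; the helper name `describe` and the sample characters are assumptions made for illustration.

```python
def describe(char):
    """Return the code number of a character and the octets two encodings give."""
    info = {
        "code": ord(char),               # character code: the integer
        "utf-8": char.encode("utf-8"),   # encoding: integer -> octets
    }
    try:
        info["iso-8859-1"] = char.encode("iso-8859-1")
    except UnicodeEncodeError:
        # The character is outside this encoding's 256-character repertoire.
        info["iso-8859-1"] = None
    return info


for ch in "A\u00e9\u4e2d":  # A, e with acute accent, a CJK ideograph
    print(ch, describe(ch))
```

The one-octet encoding works for the first two characters only; the ideograph needs one of the "more complicated encodings", here three octets in UTF-8.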

Page 7

charset

Using the terms just defined, the charset value in an HTML meta tag names the encoding:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
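Why the declared charset matters: the same octets read under the wrong encoding produce garbage. A small sketch with an assumed sample word:

```python
# Octets as a server might send them for a page saved as UTF-8.
octets = "caf\u00e9".encode("utf-8")

# Reader honours charset=utf-8: the text comes back intact.
as_utf8 = octets.decode("utf-8")

# Reader assumes charset=ISO-8859-1: each octet becomes one character,
# so the two-octet sequence for the accented letter turns into two symbols.
as_latin1 = octets.decode("iso-8859-1")

print(as_utf8)
print(as_latin1)
```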

Page 8

Language

• Attribute of html tag

<html lang="en-us">

MAY be used by browsers (spell-check, hyphenation, speech synthesizers), search engines, other tools.

See two-letter codes:

www.w3c.org/WAI/ER/IG/ert/iso639.htm
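A sketch of how a tool might read the lang attribute, using Python's standard html.parser module. The class name `LangFinder` and the function `page_language` are hypothetical; splitting the value into a language part and a region part is a simplification that ignores longer tags.

```python
from html.parser import HTMLParser


class LangFinder(HTMLParser):
    """Remember the lang attribute of the first <html> tag seen."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def page_language(markup):
    """Return (language, region) from the page, or None if not declared."""
    finder = LangFinder()
    finder.feed(markup)
    if finder.lang is None:
        return None
    language, _, region = finder.lang.partition("-")
    return language.lower(), region.upper() or None


print(page_language('<html lang="en-us"><body>Hello</body></html>'))
```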

Page 9

… more

• A glyph is a presentation of a particular shape which a character may have when rendered or displayed.
  – speak of same glyph in italic, bold, etc.

• A repertoire of glyphs comprises a font. In a more technical sense, as the implementation of a font, a font is a numbered set of glyphs. The numbers correspond to code positions of the characters (presented by the glyphs). Thus, a font in that sense is character code dependent. An expression like "Unicode font" refers to such issues and does not imply that the font contains glyphs for all Unicode characters.

Page 10

Examples

• ASCII is a character repertoire, code and encoding. Note: confusion about 7 vs 8 bit ASCII

• ISO Latin 1 alias ISO 8859-1 standard defines a repertoire, code and encoding of which ASCII is a subset. ISO 8859 is a family of many encodings, indicated by the –n. ISO 8859-5 handles Cyrillic.
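Both points can be checked in a few lines. The sample text and the octet value 0xD6 are arbitrary choices for illustration.

```python
# ASCII is a subset of ISO 8859-1: ASCII text yields identical octets.
ascii_text = "Hello"
same_octets = ascii_text.encode("ascii") == ascii_text.encode("iso-8859-1")

# The upper half differs between members of the ISO 8859 family:
# one octet, two different characters.
octet = bytes([0xD6])
western = octet.decode("iso-8859-1")   # a Latin letter
cyrillic = octet.decode("iso-8859-5")  # a Cyrillic letter

print(same_octets, western, cyrillic)
```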

Page 11

Unicode … provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. This is the goal.

The Unicode Standard has been adopted by such industry leaders as Apple, HP, IBM, JustSystem, Microsoft, Oracle, SAP, Sun, Sybase, Unisys and many others. Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646.

It is supported in many operating systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and the availability of tools supporting it, are among the most significant recent global software technology trends.
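The "unique number for every character" can be looked up with Python's standard unicodedata module. The helper name `code_point` and the three sample characters are assumptions for illustration.

```python
import unicodedata


def code_point(char):
    """Return the Unicode number (U+XXXX form) and official name of a character."""
    return f"U+{ord(char):04X}", unicodedata.name(char)


# A Latin letter, a Cyrillic letter, and a CJK ideograph.
for ch in "A\u0416\u4e2d":
    print(ch, *code_point(ch))
```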

Page 12

Note

• Unicode goal is universal coverage…

• Unicode is product of a consortium of 'mostly US companies'.

• Some controversy in its treatment of things
  – Combining certain kanji characters (Han unification)

Page 13

Unicode consortium

• Go to http://www.unicode.org/unicode/standard/WhatIsUnicode.html

• Examine the Translations on the left. See what language characters do not appear on your computer.
  – Select one and
  – Go to Display Problems and see if you can fix it.

Page 14

XML progress

• XML 1.0 to XML 1.1

• Issue: complaint that new standard had features to suit IBM

• The IBM-specific problem that XML 1.1 aims to fix has to do with a special character that designates to IBM mainframe systems the end of a line of text. XML 1.0 chokes on that character, but version 1.1 would recognize it.
  – ZDNet News: http://zdnet.com.com/2100-1104-962392.html
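The slide does not name the character. It is commonly identified as NEL (next line, U+0085), and the sketch below treats that as an assumption. It only shows what "end of a line" means for such a character, using Python's own line splitting; it does not model an XML parser.

```python
# Assumption: the "special character" is NEL, U+0085.
NEL = "\u0085"

record = "first line" + NEL + "second line"

# Software that recognizes NEL as a line end sees two lines...
lines = record.splitlines()

# ...while software that only knows the usual line feed sees one.
naive_lines = record.split("\n")

print(len(lines), len(naive_lines))
```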

Page 15

Techniques

• One web site / screen provides options to go to different pages
  – use symbols/icons that are meaningful to audience
    • tricky. Flags may not be appropriate.
  – use images containing text in the specific language
  – risky choice: hope that computer/platform/browser has character encoding and font to display language
  – poor choice: use English word for other language.

http://www.lionbridge.com/ Example of company/site supporting 'global reach'.

Page 16

quiz

What is the word in that language for
– Spanish
– Chinese (Mandarin? Hainese?)
– Korean
– Japanese
– Hebrew
– Russian
– French
– Finnish
– Arabic (Classical?, ?)
– Hindi (Urdu?, ?)

What is the direction of text? What is the format for dates? Time? Money? Relevant cultural issues?
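For the text-direction question, a rough sketch: Unicode assigns each character a bidirectional class, and the first strongly directional character gives a reasonable guess. The function name `text_direction` is hypothetical, and real layout engines apply the full bidirectional algorithm rather than this shortcut.

```python
import unicodedata


def text_direction(text):
    """Guess direction from the first strongly directional character."""
    for char in text:
        kind = unicodedata.bidirectional(char)
        if kind in ("R", "AL"):  # right-to-left letters (Hebrew, Arabic)
            return "rtl"
        if kind == "L":          # left-to-right letters
            return "ltr"
    return None                  # digits or punctuation only: no strong hint


samples = {
    "English": "Hello",
    "Hebrew": "\u05e9\u05dc\u05d5\u05dd",
    "Arabic": "\u0645\u0631\u062d\u0628\u0627",
    "Russian": "\u041f\u0440\u0438\u0432\u0435\u0442",
}
for language, word in samples.items():
    print(language, text_direction(word))
```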

Page 17

Homework

• Next: Accessibility discussion, exercises

• Prepare
  – download Instant Saxon: standalone translator for xml and xslt.
  – download Nokia Mobile Internet Toolkit. Need to register (no costs).
  – register with studio.tellme.com