Globalisation & Computer systems Week 4 writing systems and their implications for globalisation character representation ASCII extended ASCII code pages Practical: code pages in VB

Upload: griffin-gaines

Post on 28-Dec-2015


Page 1

Globalisation & Computer systems

Week 4
  writing systems and their implications for globalisation
  character representation
  ASCII
  extended ASCII
  code pages
  Practical: code pages in VB

Page 2

Week 6
  Writing systems and their implications for globalisation
  Directionality (Arabic, Hebrew)
  Code space: Chinese
  Context-sensitive characters: Arabic
  Compositionality (Amharic)

Page 3

Representation
  bits and bytes
  characters
  code points
  glyphs
  fonts
  standardization

Page 4

Representation
  What is a bit?
    ‘a binary digit’, i.e. either 0 or 1
  What is a byte?
    ‘the fixed number of bits that can be treated as a unit by the computer hardware’
  A byte can be used to express a character such as “A”
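
As a minimal Python sketch of the point above, the character “A” can be turned into a number and then into the eight bits of a single byte. The helper name `bits_of` is made up for this illustration and is not part of the course material.

```python
def bits_of(ch):
    """Return the 8-bit pattern of a character that fits in one byte (illustrative helper)."""
    code = ord(ch)  # numeric value of the character
    if code > 255:
        raise ValueError("character does not fit in one 8-bit byte")
    return format(code, "08b")  # zero-padded binary string, one digit per bit


pattern = bits_of("A")
print(pattern)       # 01000001
print(len(pattern))  # 8
```

Each digit in the printed string is one bit; the eight of them together form the byte that stands for “A”.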

Page 5

Representation
  ASCII: American Standard Code for Information Interchange
  A standard character encoding system
  The bytes were originally 7 bits
  Given this, how many bit patterns?
  Each pattern maps onto a decimal code point, and that maps onto a character
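
The question on this slide can be checked with a few lines of Python: 7 bits give 2^7 = 128 patterns. The sketch below also follows one pattern through the chain pattern -> decimal code point -> character. The function name `decode_pattern` is an assumption made for the example.

```python
BITS = 7
pattern_count = 2 ** BITS  # every extra bit doubles the number of patterns
print(pattern_count)  # 128


def decode_pattern(pattern):
    """Map a 7-bit pattern to (decimal code point, character)."""
    code_point = int(pattern, 2)  # read the string as a binary number
    if not 0 <= code_point < pattern_count:
        raise ValueError("not a 7-bit pattern")
    return code_point, chr(code_point)


print(decode_pattern("1000001"))  # (65, 'A')
```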

Page 6

Representation
  Glyphs
    the pictures used to represent a given character; many glyphs to one character:
    The character “A” -> AAAAAAAAA

Page 7

Representation
  Glyphs
    the pictures used to represent a given character; many glyphs to one character:
    The character “A” -> AAAAAAAAA
  Fonts
    the collection, or ‘picture gallery’, of glyphs

Page 8

Representation
  ASCII: the problem with 7-bit bytes…
    What about French: la tête
    What about Greek: κεφαλη
  Extend ASCII to 8-bit bytes
    ISO (International Organization for Standardization)
    Now 256 bit-patterns
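
The limitation can be reproduced in Python. The slide does not name specific 8-bit standards, so the sketch assumes ISO 8859-1 (`latin-1`) for French and ISO 8859-7 (`iso8859_7`) for Greek as examples; `try_encode` is a helper invented for the illustration.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if some character has no code in the encoding."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


french = "la tête"
greek = "κεφαλη"

print(try_encode(french, "ascii"))           # None: ê has no 7-bit ASCII code
print(try_encode(greek, "ascii"))            # None
print(try_encode(french, "latin-1"))         # b'la t\xeate'
print(len(try_encode(greek, "iso8859_7")))   # 6, one byte per Greek letter
```

With the eighth bit available, each accented or Greek letter gets a code above 127 and still occupies a single byte.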

Page 9

Representation
  Extended ASCII:
    With 8-bit bytes you get 256 bit-patterns
    For consistency, the first 128 code points remain the same as in 7-bit ASCII
    The next 128 are used for a range of languages
    For each language, you need an interpretation of these 128 code points
    The encoding is handled by a code page
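
A small Python sketch of both halves of this idea, using three Windows code pages chosen here as examples (the slide itself names none at this point). The lower 128 values decode identically everywhere, while one value from the upper half is read differently by each code page. `interpret` is an illustrative helper name.

```python
CODE_PAGES = ["cp1250", "cp1251", "cp1253"]  # assumed examples: Central European, Cyrillic, Greek


def interpret(byte_value, code_page):
    """Decode a single byte value under the given code page."""
    return bytes([byte_value]).decode(code_page, errors="replace")


# Lower half: every code page agrees with plain ASCII.
lower_half_agrees = all(
    interpret(b, cp) == chr(b) for cp in CODE_PAGES for b in range(128)
)
print(lower_half_agrees)  # True

# Upper half: the same value 225 is a different character in each code page.
for cp in CODE_PAGES:
    print(cp, interpret(225, cp))
```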

Page 10

Representation
  Extended ASCII:
    For code point 154:
      CP_EASTEUROPE (code page 1250): š
      CP_RUSSIAN (code page 1251): љ
    What about code point 65 for these two code pages?
    Now represent your names with your own orthographies in mind, using the code pages
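
The exercise can be tried out in Python, whose `cp1250` and `cp1251` codecs correspond to the two code pages named on the slide. The helper `char_at` and the sample name are assumptions made for the illustration.

```python
def char_at(code_point, code_page):
    """Look up the character that a code page assigns to a code point (0-255)."""
    return bytes([code_point]).decode(code_page)


for cp in ("cp1250", "cp1251"):
    print(cp, char_at(154, cp), char_at(65, cp))
# cp1250 š A
# cp1251 љ A

# A name written in its own orthography becomes a list of code points.
name = "Miloš"  # hypothetical example name
print(list(name.encode("cp1250")))  # [77, 105, 108, 111, 154]
```

Code point 65 lies in the shared lower half, so it is “A” under both code pages; only 154 changes meaning.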

Page 11

Representation
  Code pages in VB

' Charset values that can be assigned to a font's Charset property
Public Enum ValidCharsets
    ANSI_CHARSET = 0
    GREEK_CHARSET = 161
    THAI_CHARSET = 222
End Enum

Private Sub Form_Load()
    ' Build a font that interprets character codes using the Greek charset
    Dim X As New StdFont
    X.Charset = GREEK_CHARSET   ' 161
    X.Bold = True
    X.Size = 8
    X.Name = "Times New Roman"

    ' Apply the font to the form and to both controls
    Set frmTest.Font = X
    Set frmTest.Label1.Font = X
    Set frmTest.Text1.Font = X

    ' Show the same three character codes in the label and in the text box
    frmTest.Label1.Caption = Chr(181) & Chr(225) & Chr(226)
    frmTest.Text1.Text = Chr(181) & Chr(225) & Chr(226)
End Sub
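
For readers without a VB environment, the effect of switching the charset can be approximated in Python. This is a sketch under two assumptions that the slide does not state: that charset 0 (ANSI_CHARSET) corresponds to Windows code page 1252 and charset 161 (GREEK_CHARSET) to code page 1253. The names `CHARSET_TO_CODEC` and `render` are invented for the example.

```python
codes = [181, 225, 226]  # the three Chr(...) values used in the VB listing

# Assumed mapping from charset number to a Python codec name.
CHARSET_TO_CODEC = {
    0: "cp1252",    # ANSI_CHARSET (assumption: Western European code page)
    161: "cp1253",  # GREEK_CHARSET (assumption: Greek code page)
}


def render(code_values, charset):
    """Decode the same byte values under the code page tied to a charset."""
    return bytes(code_values).decode(CHARSET_TO_CODEC[charset])


print(render(codes, 0))    # µáâ
print(render(codes, 161))  # µαβ
```

The byte values never change; only the charset, and therefore the interpretation of the upper 128 code points, does.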

Page 12

Representation and UNICODE
  What about Chinese?
    Thousands of characters – 256 bit-patterns clearly not enough

Page 13

Representation and UNICODE
  What about Chinese?
    Thousands of characters – 256 bit-patterns clearly not enough
  Make the bytes bigger…
    Use 16 bits per character, which gives 65536 bit-patterns
    UNICODE
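
The arithmetic and the motivation can be sketched in Python. The character 中 is simply an example picked for the illustration, and `utf-16-be` is used here as one way to view a character as a 16-bit unit; neither is named on the slide.

```python
print(2 ** 8)   # 256
print(2 ** 16)  # 65536

ch = "中"  # example Chinese character chosen for this sketch
code_point = ord(ch)
print(code_point)        # 20013
print(code_point > 255)  # True: it cannot fit in one 8-bit byte

# Viewed as one 16-bit unit, the character occupies two 8-bit bytes.
two_bytes = list(ch.encode("utf-16-be"))
print(two_bytes)  # [78, 45]
```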

Page 14

UNICODE – design principles
  Reference:
    The Unicode Standard, Version 3. 2000.
    Online: http://www.unicode.org/unicode/uni2book/