CHARACTERS AND UNICODE Java Data Types

Upload: edward-todd

Post on 02-Jan-2016

Unicode

Characters, like the letters of the alphabet and other printable symbols, are represented internally in the computer by a numerical code.

The coding system that Java relies on for characters is called Unicode.

Unicode assigns an integer value to each character, including some codes that are not printable.

Unicode is based in part on a previous coding system known as ASCII.

Unicode

Though Unicode handles thousands of characters, this presentation covers only the first 128 codes:

English alphabet

Keyboard symbols

Codes that don’t represent printable characters

Control codes

The first 32 codes are non-printable control codes.

They are referred to as control codes because they originated as instructions for communicating with or controlling a device.

Most of these codes are of no use to the programmer.

But there are some, such as the line feed and carriage return codes, which may be useful for printed output.

Control codes

System.out.print("\n");

The escape sequence "\n" causes the line feed control code (code 10) to be generated, which moves the output to a new line. The carriage return (code 13) has its own escape sequence, "\r". Similarly, "\t" causes a horizontal tab (code 9) to be generated.

Although not literally printable, these characters can affect the appearance of the output.
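As a minimal sketch of the escape sequences above, the following program prints the integer code behind each one and then uses them to lay out two lines of output. The class name EscapeCodes and the helper codeOf are made up for this illustration, and the cast to int is the technique covered later in these slides.

```java
public class EscapeCodes {

    // Hypothetical helper for this sketch: returns the integer code of a character.
    static int codeOf(char c) {
        return (int) c;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('\n')); // Prints 10 (line feed)
        System.out.println(codeOf('\r')); // Prints 13 (carriage return)
        System.out.println(codeOf('\t')); // Prints 9 (horizontal tab)

        // The tab separates the two columns; the line feed ends each line.
        System.out.print("Name\tCode\n");
        System.out.print("LF\t10\n");
    }
}
```

The last two statements show the effect on appearance: nothing visible is printed for the control codes themselves, but the text is arranged in columns on separate lines.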

Control codes

Code  Name  Description
0     NUL   Null
1     SOH   Start of heading
2     STX   Start of text
3     ETX   End of text
4     EOT   End of transmission
5     ENQ   Enquiry
6     ACK   Acknowledge
7     BEL   Bell
8     BS    Backspace
9     HT    Horizontal tab
10    LF    Line feed
11    VT    Vertical tab
12    FF    Form feed
13    CR    Carriage return
14    SO    Shift out
15    SI    Shift in
16    DLE   Data link escape
17    DC1   Device control 1
18    DC2   Device control 2
19    DC3   Device control 3
20    DC4   Device control 4
21    NAK   Negative acknowledge
22    SYN   Synchronous idle
23    ETB   End of transmission block
24    CAN   Cancel
25    EM    End of medium
26    SUB   Substitute
27    ESC   Escape
28    FS    File separator
29    GS    Group separator
30    RS    Record separator
31    US    Unit separator

Unicode characters

The Unicode values from 32 to 126 are printable characters (code 32 is the space). Code 127 is not printable; it stands for deletion.
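To see the printable range for yourself, here is a small sketch that builds a string from a range of codes by casting each integer to a char. The class name PrintableChars and the helper charsInRange are invented for this example.

```java
public class PrintableChars {

    // Hypothetical helper: builds a string of the characters for codes first..last (inclusive).
    static String charsInRange(int first, int last) {
        StringBuilder sb = new StringBuilder();
        for (int code = first; code <= last; code++) {
            sb.append((char) code); // cast each integer code to its character
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Codes 32 to 126 are the printable characters; the first one is the space.
        System.out.println(charsInRange(32, 126));

        // A smaller slice: codes 65 to 67.
        System.out.println(charsInRange(65, 67)); // Prints ABC
    }
}
```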

The letters of the alphabet, small and capital, and the various punctuation marks are arranged as they are for historical reasons.

The sort order for characters or text items is based on their Unicode values.
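The effect of code values on sort order can be sketched as follows. The class name SortOrder and the helper sortChars are assumptions made for this example; Arrays.sort is the standard library method. Because the capital letters (65 to 90) have lower codes than the small letters (97 to 122), capitals sort first.

```java
import java.util.Arrays;

public class SortOrder {

    // Hypothetical helper: returns the characters of the input sorted by their integer codes.
    static String sortChars(String text) {
        char[] chars = text.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println('Z' < 'a');          // Prints true: 90 is less than 97
        System.out.println(sortChars("bA1aB")); // Prints 1ABab

        // Strings are compared character by character using the same codes.
        String[] words = {"banana", "Cherry", "apple"};
        Arrays.sort(words);
        System.out.println(Arrays.toString(words)); // Prints [Cherry, apple, banana]
    }
}
```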

New data type: char

We can now consider the character type, char.

char is effectively an integer type.

It is possible to cast between something typed char and something typed int.

The underlying value is always an integer.

The type (int or char) determines whether the integer value or the associated Unicode character is displayed.

The following program illustrates the idea that an integer can be cast to a character and printed out as such.

Char

public class UnicodeChars {

    public static void main(String[] args) {
        char myChar;
        int i;
        i = 65;
        myChar = (char) i;          // cast the integer 65 to a char
        System.out.println(myChar); // Prints A
    }
}

Char

In this program, the cast converts the integer value 65 to a char, and println displays the character associated with that value, the capital letter A.
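A further sketch of the same idea: one underlying integer, displayed either as a number or as a character depending on its type. The class name IntOrChar and the helper describe are invented for this illustration.

```java
public class IntOrChar {

    // Hypothetical helper: shows a value both as an int and as the char it stands for.
    static String describe(int value) {
        return value + " -> " + (char) value;
    }

    public static void main(String[] args) {
        int value = 97;
        System.out.println(value);        // Prints 97 (typed int)
        System.out.println((char) value); // Prints a (typed char)

        int[] codes = {72, 105, 33};
        for (int code : codes) {
            System.out.println(describe(code));
        }
        // The loop prints:
        // 72 -> H
        // 105 -> i
        // 33 -> !
    }
}
```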

It isn’t always necessary to use a number to assign a value to a char variable. You can directly set the character you want:

myChar = 'A';

Note that we use single quotes instead of double quotes to denote a single character value. Double quotes signify a string, while single quotes signify a character.
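The difference between the two kinds of quotes can be sketched like this. The class name QuotesDemo and the helper firstChar are made up for this example; charAt and length are standard String methods.

```java
public class QuotesDemo {

    // Hypothetical helper: pulls the first character out of a string.
    static char firstChar(String text) {
        return text.charAt(0);
    }

    public static void main(String[] args) {
        char letter = 'A';  // single quotes: exactly one character
        String text = "A";  // double quotes: a string, here of length 1

        // char wrong = "A"; // would not compile: a String is not a char

        System.out.println(letter);                    // Prints A
        System.out.println(text);                      // Prints A
        System.out.println(text.length());             // Prints 1
        System.out.println(firstChar(text) == letter); // Prints true
    }
}
```

Both values print the same way, but only the char can be cast to an int to reveal its code.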

Finding the integer value of a char

It is also possible to find the integer value of a char by casting it to an int, as shown below:

char myChar = 'A';
int i = (int) myChar;
System.out.println(i); // Prints 65
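Putting the two casts together gives a round trip: char to int, a little arithmetic, and back to char. This is only a sketch; the class name CharToInt and the helpers codeOf and next are assumptions made for the example.

```java
public class CharToInt {

    // Hypothetical helper: the integer code of a character.
    static int codeOf(char c) {
        return (int) c;
    }

    // Hypothetical helper: the character whose code is one higher than the given one.
    static char next(char c) {
        return (char) (codeOf(c) + 1);
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A')); // Prints 65
        System.out.println(codeOf('a')); // Prints 97
        System.out.println(next('A'));   // Prints B
        System.out.println(next('y'));   // Prints z
    }
}
```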

Lab

Prepare yourselves! You are now ready for questions 39 to 42 on the assignment sheet.

The first couple of questions consist of determining a character from an integer, and an integer from a character.

The last couple will involve taking an integer/char type, and turning it into the other type.