
Computer Architecture and Assembly Language

Practical session outline

• Introduction

• 80x86 assembly

–Data storage

–The registers

–Flags

–Instructions

• Assignment 0

Introduction:

Administration:

- Background.

- Guy’s office hours: Wednesday 16:00-18:00, room -105/58. email: guyshat@cs….

- The course has 4 practical assignments and 1 theoretical assignment.

Introduction:

Why assembly?

-Assembly is widely used in industry:

- Embedded systems.

- Real time systems.

- Low level and direct access to hardware

-Assembly is also widely used outside industry:

-Cracking software protections: patching, patch-loaders and emulators (executable file compression, encryption, decryption).

-Hacking into computer systems: buffer under/overflows (worms and trojans).

Byte structure :

A byte has 8 bits, numbered from 7 down to 0:

7 6 5 4 3 2 1 0

Bit 7 is the msb (most significant bit); bit 0 is the lsb (least significant bit).

Data storage in memory:

The 80x86 is a little-endian architecture, so NASM stores multi-byte data in little-endian order.

Little endian means that the low-order byte of the number is stored in memory at the lowest address, and the high-order byte at the highest address.

Example:

You want to store 0x1AB3 (a hex number) in memory. This number has two bytes: 1A and B3.

It would be stored this way:

address   byte
0         B3
1         1A
2         (next byte of the memory block)

Note: when stored data is read back from memory, it comes back in its original order (here, 0x1AB3).
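The byte order can be sketched in C. This is only an illustrative model: the function names store_le16 and load_le16 are hypothetical, and the byte order is written out explicitly with shifts so the sketch does not depend on the machine it runs on.

```c
#include <stdint.h>

/* Store a 16-bit value in little-endian order: low byte at the lowest address. */
void store_le16(uint8_t *mem, uint16_t value) {
    mem[0] = (uint8_t)(value & 0xFF);   /* low-order byte  -> lowest address  */
    mem[1] = (uint8_t)(value >> 8);     /* high-order byte -> highest address */
}

/* Read the value back: the two bytes are recombined into their original order. */
uint16_t load_le16(const uint8_t *mem) {
    return (uint16_t)(mem[0] | (mem[1] << 8));
}
```

Storing 0x1AB3 this way puts B3 at offset 0 and 1A at offset 1, and loading it back yields 0x1AB3 again.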

Registers:

The CPU contains a unit called the “register file”.

This unit contains registers of the following types:

1. 8-bit general registers: AL, BL, CL, DL, AH, BH, CH, DH

2. 16-bit general registers: AX, BX, CX, DX, SP, BP, SI, DI

3. 32-bit general registers: EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI

(Accumulator, Base, Counter, Data, Stack pointer, Base pointer, Source index, Destination index)

4. Segment registers: ES, CS, SS, DS, FS, GS

5. Floating-point registers: ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7

6. Instruction pointer: IP

Note: the registers above are the basic ones. More registers exist.

IP - instruction pointer:

Contains the offset (address) of the next instruction to be executed. It is meaningful only at run time and cannot be accessed directly as an instruction operand.

AX,BX,CX,DX - 16-bit general registers:

Each contains two 8-bit registers. Example: AH and AL (for AX).

EAX - 32-bit general purpose register: its lower 16 bits are AX.

Segment registers: we use the flat memory model, a 32-bit 4 GB address space without segments. So for this course you can ignore the segment registers.

ST0-ST7 - floating-point registers: used for calculations on floating-point numbers. You can ignore these registers.

ESP - stack pointer: contains the address of the top of the stack, i.e. the most recently pushed item.

Let’s zoom in:

AX, BX, CX and DX each split into two 8-bit halves:

XH (high byte) | XL (low byte)

Let’s zoom in (2):

• Some instructions use only specific registers.

Examples:

1. For the DIV r/m8 instruction, AX is divided by the given operand; the quotient is stored in AL and the remainder in AH.

2. The LOOP imm,CX instruction uses the CX register as a counter register.

3. The LAHF instruction sets the AH register according to the contents of the low byte of the flags word.

• We use the ESP and EBP registers to work with the stack.
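To make the fixed register roles of DIV r/m8 concrete, here is a minimal C sketch. The function name div8 is hypothetical. It models the documented behaviour only: AX is the dividend, the quotient lands in AL and the remainder in AH. The real instruction raises a divide error when the divisor is zero or the quotient does not fit in 8 bits; the sketch simply asserts on those cases.

```c
#include <assert.h>
#include <stdint.h>

/* Model of DIV r/m8: divides AX by an 8-bit operand.
   Returns the new AX: quotient in AL (low byte), remainder in AH (high byte). */
uint16_t div8(uint16_t ax, uint8_t divisor) {
    assert(divisor != 0);            /* real hardware: divide error */
    assert(ax / divisor <= 0xFF);    /* quotient must fit in AL     */
    uint8_t al = (uint8_t)(ax / divisor);
    uint8_t ah = (uint8_t)(ax % divisor);
    return (uint16_t)((ah << 8) | al);
}
```

For example, dividing AX = 100 by 7 gives quotient 14 and remainder 2, so the modelled AX becomes 0x020E.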

Example for using registers:

instruction        content of the register AX after the instruction execution
                   AH         AL
mov ax, 0          00000000   00000000
mov ah, 0x13       00010011   00000000
mov ax, 0x13       00000000   00010011
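The way AX, AH and AL overlap inside EAX can be modelled in C with masks and shifts. This is a sketch under assumptions: the Reg type and the get_/set_ helpers are hypothetical names used only for illustration, not part of any real API.

```c
#include <stdint.h>

/* Hypothetical model of EAX; AX, AH and AL are views onto its bits. */
typedef struct { uint32_t eax; } Reg;

uint16_t get_ax(const Reg *r) { return (uint16_t)(r->eax & 0xFFFF); }
uint8_t  get_ah(const Reg *r) { return (uint8_t)((r->eax >> 8) & 0xFF); }
uint8_t  get_al(const Reg *r) { return (uint8_t)(r->eax & 0xFF); }

/* mov ax, v : replaces the low 16 bits, keeps the upper 16 bits of EAX. */
void set_ax(Reg *r, uint16_t v) { r->eax = (r->eax & 0xFFFF0000u) | v; }

/* mov ah, v : replaces bits 8..15 only. */
void set_ah(Reg *r, uint8_t v) { r->eax = (r->eax & 0xFFFF00FFu) | ((uint32_t)v << 8); }

/* mov al, v : replaces bits 0..7 only. */
void set_al(Reg *r, uint8_t v) { r->eax = (r->eax & 0xFFFFFF00u) | v; }
```

Calling set_ax(&r, 0), then set_ah(&r, 0x13), then set_ax(&r, 0x13) reproduces the three register states listed above: AX is 0x0000, then 0x1300, then 0x0013.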

Status Flags:

A flag is a single bit of the Flags register.

• The status flags provide information about the result of the last (usually arithmetic) instruction that was executed.

This information can be used by conditional instructions (such as Jcc and CMOVcc) as well as by some other instructions (such as ADC).

There are 6 status flags:

CF - Carry flag: set if an arithmetic operation generates a carry or a borrow out of the most-significant bit of the result; cleared otherwise. This flag indicates an overflow condition for unsigned-integer arithmetic.

PF - Parity flag: set if the least-significant byte of the result contains an even number of ‘1’ bits; cleared otherwise.

Status Flags (2):

AF - Adjust flag: set if an arithmetic operation generates a carry or a borrow out of bit 3 of the result; cleared otherwise. This flag is used in binary-coded decimal (BCD) arithmetic – not needed in our course.

ZF - Zero flag: set if the result is zero; cleared otherwise.

SF - Sign flag: set equal to the most-significant bit of the result, which is the sign bit of a signed integer. (0 indicates a positive value and 1 indicates a negative value).

OF - Overflow flag: set if the integer result is too large a positive number or too small a negative number (excluding the sign-bit) to fit in the destination operand; cleared otherwise. This flag indicates an overflow condition for signed-integer (two's complement) arithmetic.
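The six definitions can be checked against a small C model of an 8-bit ADD. This is an illustrative sketch, not how the processor is implemented; the Flags struct and the add8 function are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool cf, pf, af, zf, sf, of; } Flags;

/* Model of ADD on 8-bit operands: stores the sum in *result, returns the flags. */
Flags add8(uint8_t a, uint8_t b, uint8_t *result) {
    unsigned wide = (unsigned)a + (unsigned)b;   /* keeps the 9th bit */
    uint8_t r = (uint8_t)wide;
    Flags f;
    f.cf = wide > 0xFF;                          /* carry out of bit 7 */
    f.af = ((a & 0x0F) + (b & 0x0F)) > 0x0F;     /* carry out of bit 3 */
    f.zf = (r == 0);
    f.sf = (r & 0x80) != 0;                      /* copy of the sign bit */
    /* signed overflow: both operands share a sign and the result's sign differs */
    f.of = ((~(a ^ b) & (a ^ r)) & 0x80) != 0;
    /* parity: set when the result byte has an even number of 1 bits */
    unsigned ones = 0;
    for (unsigned bit = 0; bit < 8; bit++) ones += (r >> bit) & 1u;
    f.pf = (ones % 2) == 0;
    *result = r;
    return f;
}
```

Adding 0x7F and 0x01 sets OF and SF but not CF (a signed overflow), while adding 0xFF and 0x01 sets CF and ZF but not OF (an unsigned overflow).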

Instructions on Flags Register:

We can’t access the Flags register directly, but there are a few instructions that let us get and set its value:

1. LAHF: sets the AH register according to the contents of the low byte of the flags word:

bit:  7   6   5   4   3   2   1   0
AH:   SF  ZF  0   AF  0   PF  1   CF

2. SAHF: sets the low byte of the flags word according to the contents of the AH register.

3. SALC: sets AL to zero if the carry flag is clear, or to 0xFF if it is set.

4. STC: sets the carry flag.

5. CLC: clears the carry flag.

Note: this is not a complete set of the flag instructions. You can find more in the NASM tutorial.
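The AH layout produced by LAHF is, from bit 7 down to bit 0: SF, ZF, 0, AF, 0, PF, 1, CF. A minimal C sketch of that packing follows; the function names lahf and ah_zero_flag are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of LAHF: packs the flags into the AH layout SF ZF 0 AF 0 PF 1 CF. */
uint8_t lahf(bool sf, bool zf, bool af, bool pf, bool cf) {
    return (uint8_t)((sf << 7) | (zf << 6) | (af << 4) |
                     (pf << 2) | (1 << 1)  | (cf << 0));
}

/* Reading one flag back out of AH: ZF lives in bit 6. */
bool ah_zero_flag(uint8_t ah) {
    return ((ah >> 6) & 1) != 0;
}
```

With every flag clear the result is 0x02, because bit 1 always reads as 1.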

Basic assembly instructions:

Each standard NASM source line contains a combination of the following 4 fields:

label: (pseudo) instruction operands ; comment

All of these fields are optional; the operands field is either required or forbidden by the instruction.

Notes:

1. A backslash (\) is used as the line continuation character: if a line ends with a backslash, the next line is considered to be a part of the backslash-ended line.

2. There are no restrictions on white space within a line.

3. A colon after a label is optional.

Examples:

1. mov ax, 2 ; moves constant 2 to the register ax

2. buffer: resb 64 ; reserves 64 bytes

Instruction arguments

A typical instruction has 2 operands.

The left operand is the target operand, while the right operand is the source operand.

3 kinds of operands exist:

1. Immediate, i.e. a value

2. Register, such as AX,EBP,DL

3. Memory location; a variable or a pointer.

Note that the x86 processor does not allow both operands to be memory locations. For example, the following is illegal:

mov [var1],[var2]

Move instructions:

B.4.156: MOV – move data

mov r/m8,reg8 (copies the content of an 8-bit register (source) to an 8-bit register or 8-bit memory unit (destination))

mov reg32,imm32 (copies a 32-bit immediate (constant) to a 32-bit register)

* for all the possible variants of operands look at NASM manual, B.4.156

-In all forms of the MOV instruction, the two operands are the same size.

Examples:

mov EAX, 0x2334AAFF

mov word [buffer], ax

* Note: NASM doesn’t remember the types of variables you declare. Whereas MASM will remember, on seeing var dw 0, that you declared var as a word-size variable, and will then be able to fill in the ambiguity in the size of the instruction mov var,2, NASM will deliberately remember nothing about the symbol var except where it begins, and so you must explicitly code mov word [var],2.

Move instructions (2):

B.4.181 MOVSX, MOVZX: move data with sign or zero extend

movsx reg16,r/m8 (sign-extends its source (second) operand to the length of its destination (first) operand, and copies the result into the destination operand)

movzx reg32,r/m8 (does the same, but zero-extends rather than sign-extending)

* for all the possible variants of operands look at NASM manual, B.4.181

Examples:

movsx EAX, AX (if AX has the value 10…0b, EAX would have the value 111…1 100…0b)

movzx EAX, BL (if BL has the value 10…0b, EAX would have the value 000…0 100…0b)
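Sign extension and zero extension can be written out explicitly in C. This is a sketch for illustration only; movsx_32_16 and movzx_32_8 are hypothetical names chosen to match the two operand sizes used in the examples.

```c
#include <stdint.h>

/* movsx reg32, r/m16 : copy the source's sign bit into the upper 16 bits. */
uint32_t movsx_32_16(uint16_t src) {
    return (src & 0x8000u) ? (0xFFFF0000u | src) : (uint32_t)src;
}

/* movzx reg32, r/m8 : fill the upper 24 bits with zeros. */
uint32_t movzx_32_8(uint8_t src) {
    return (uint32_t)src;
}
```

A 16-bit source of 0x8000 sign-extends to 0xFFFF8000, while an 8-bit source of 0x80 zero-extends to 0x00000080.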

Basic arithmetical instructions:

B.4.3 ADD: add integers

add r/m16,imm16 (adds its two operands together, and leaves the result in its destination (first) operand)

*for all the possible variants of operands look at NASM manual, B.4.3

Examples: add AX, BX

B.4.2 ADC: add with carry

adc r/m16,imm8 (adds its two operands together, plus the value of the carry flag, and leaves the result in its destination (first) operand)

* for all the possible variants of operands look at NASM manual, B.4.2

Examples: adc AX, BX (AX gets a value of AX+BX+CF)
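A common use of ADC is adding numbers wider than a register: ADD the low words, then ADC the high words so that the carry propagates. The C sketch below models this for a 32-bit sum built from 16-bit steps. The function name add32_via_adc and the register names in the comments are assumptions made for illustration.

```c
#include <stdint.h>

/* Adds two 32-bit numbers using only 16-bit steps: ADD on the low words,
   then ADC on the high words so the carry flag propagates. */
uint32_t add32_via_adc(uint32_t x, uint32_t y) {
    uint16_t x_lo = (uint16_t)x, x_hi = (uint16_t)(x >> 16);
    uint16_t y_lo = (uint16_t)y, y_hi = (uint16_t)(y >> 16);

    uint32_t lo_sum = (uint32_t)x_lo + y_lo;        /* like: add ax, bx   */
    unsigned cf = lo_sum > 0xFFFF;                  /* CF left by the ADD */
    uint32_t hi_sum = (uint32_t)x_hi + y_hi + cf;   /* like: adc dx, cx   */

    return ((hi_sum & 0xFFFF) << 16) | (lo_sum & 0xFFFF);
}
```

For instance, 0x0001FFFF + 0x00000001 overflows the low word, and the carry added by the second step yields 0x00020000.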

Basic arithmetical instructions (2):

B.4.305 SUB: subtract integers

sub reg16,r/m16 (subtracts its second operand from its first, and leaves the result in its destination (first) operand)


Examples: sub AX, BX

B.4.285 SBB: subtract with borrow

sbb r/m16,imm8 (subtracts its second operand, plus the value of the carry flag, from its first, and leaves the result in its destination (first) operand)


Examples: sbb AX, BX (AX gets a value of AX-BX-CF)

Basic arithmetical instructions (3):

B.4.120 INC: increment integer

inc r/m16 (adds 1 to its operand)

*does not affect the carry flag; affects all the other flags according to the result


Examples: inc AX

B.4.58 DEC: decrement integer

dec reg16 (subtracts 1 from its operand)

*does not affect the carry flag; affects all the other flags according to the result


Examples: dec byte [buffer]

Basic logical instructions:

B.4.189 NEG, NOT: two's and one's complement

neg r/m16 (replaces the contents of its operand by the two's complement negation - invert all the bits, and then add one)

not r/m16 (performs one's complement negation - inverts all the bits)

Examples:

neg AL (if AL = (11111110), it becomes (00000010))

not AL (if AL = (11111110), it becomes (00000001))
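The two complements are easy to compare side by side in C. This is a small illustrative sketch with hypothetical function names, working on 8-bit values as in the AL examples.

```c
#include <stdint.h>

/* not r/m8 : one's complement, inverts every bit. */
uint8_t not8(uint8_t v) {
    return (uint8_t)~v;
}

/* neg r/m8 : two's complement, inverts every bit and then adds one. */
uint8_t neg8(uint8_t v) {
    return (uint8_t)(~v + 1);
}
```

Applied to 11111110 (0xFE), not8 gives 00000001 and neg8 gives 00000010, matching the examples. Note that neg8(0x80) is again 0x80, since +128 does not fit in a signed byte.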

Basic logical instructions (2):

B.4.191 OR: bitwise or

or r/m32,imm32 (each bit of the result is 1 if and only if at least one of the corresponding bits of the two inputs was 1; stores the result in the destination (first) operand)


Example: or AL, BL (if AL = (11111100), BL = (00000010) => AL would be (11111110))

B.4.8 AND: bitwise and

and r/m32,imm32 (each bit of the result is 1 if and only if the corresponding bits of the two inputs were both 1; stores the result in the destination (first) operand)


Example: and AL, BL (if AL = (11111100), BL = (00000010) => AL would be (00000000))

Compare instruction:

B.4.24 CMP: compare integers

cmp r/m32,imm8 (performs a ‘mental’ subtraction of its second operand from its first operand, and affects the flags as if the subtraction had taken place, but does not store the result of the subtraction anywhere)


Example: cmp AL, BL

(if AL = (11111100), BL = (00000010) => ZF would be 0)

(if AL = (11111100), BL = (11111100) => ZF would be 1)
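CMP can be modelled in C as a subtraction whose result is thrown away and whose flags are kept. The sketch below covers only three of the flags; the CmpFlags struct and the cmp8 function are hypothetical names for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool zf, cf, sf; } CmpFlags;

/* Model of CMP on 8-bit operands: computes a - b but keeps only the flags. */
CmpFlags cmp8(uint8_t a, uint8_t b) {
    uint8_t diff = (uint8_t)(a - b);   /* never stored in a register */
    CmpFlags f;
    f.zf = (diff == 0);                /* set when the operands are equal   */
    f.cf = (a < b);                    /* borrow: a is below b as unsigned  */
    f.sf = (diff & 0x80) != 0;         /* sign bit of the discarded result  */
    return f;
}
```

ZF is set only when the two operands are equal, which is what a conditional jump after CMP typically tests.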

Labels definition (basic):

• Each instruction of the code has its offset (address from the beginning of the address space).

• If we want to refer to a specific instruction in the code, we should mark it with a label:

my_loop1: add ax, ax
…

- a label can be written with or without a colon

- the instruction that follows it can be on the same line or on the next line

- the code can’t contain two different non-local (as above) labels with the same name

Loop definition:

B.4.142 LOOP, LOOPE, LOOPZ, LOOPNE, LOOPNZ: loop with counter

Example:

        mov ax, 1
        mov cx, 3
my_loop:
        add ax, ax
        loop my_loop, cx

The LOOP instruction:

1. decrements its counter register (in this case it is the CX register)

2. if the counter does not become zero as a result of this operation, it jumps to the given label

Note: counter register can be either CX or ECX - if one is not specified explicitly, the BITS setting dictates which is used.

LOOPE (or its synonym LOOPZ) adds the additional condition that it only jumps if the counter is nonzero and the zero flag is set. Similarly, LOOPNE (and LOOPNZ) jumps only if the counter is nonzero and the zero flag is clear.
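The two LOOP steps map onto a C do/while loop: the body always runs once before the counter is decremented and tested. The sketch below models the example that starts with AX = 1 and doubles it on every pass; run_loop is a hypothetical name and its count parameter stands in for the initial CX value.

```c
#include <stdint.h>

/* Model of:  mov ax, 1 / mov cx, count / my_loop: add ax, ax / loop my_loop */
uint16_t run_loop(uint16_t count) {
    uint16_t ax = 1;
    uint16_t cx = count;
    do {
        ax = (uint16_t)(ax + ax);   /* loop body: add ax, ax                    */
        cx = (uint16_t)(cx - 1);    /* LOOP step 1: decrement the counter       */
    } while (cx != 0);              /* LOOP step 2: jump back if it is not zero */
    return ax;
}
```

With count = 3 the body runs three times and AX ends as 8. With count = 0 the counter wraps around to 0xFFFF on the first decrement, so the body runs 65536 times; this is why the counter should normally be nonzero before entering the loop.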

ASCII code:

The standard ASCII code defines 128 character codes (from 0 to 127), of which the first 32 are control codes (non-printable) and the other 96 are representable characters.

Example: in the ASCII table, the A character is located at the 4th row and the 1st column, so it is represented in hexadecimal as 0x41.

Here you have an interactive Decimal-Hexadecimal-Octal-ASCII converter (at the bottom of the page).

http://www.cplusplus.com/doc/papers/ascii.html
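The row and column of a character in a 16-column ASCII table are simply the two hex digits of its code. A minimal C sketch, assuming the compiler uses ASCII for character constants; ascii_row and ascii_column are hypothetical helper names.

```c
#include <assert.h>

/* Row in a 16-column ASCII table = high hex digit of the character code. */
int ascii_row(char c) {
    assert((unsigned char)c < 128);   /* standard ASCII only */
    return (unsigned char)c >> 4;
}

/* Column = low hex digit of the character code. */
int ascii_column(char c) {
    return (unsigned char)c & 0x0F;
}
```

For 'A' this gives row 4 and column 1, i.e. the code 0x41.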

Assignment 0

• You get a simple program which prints the Nth element of the Fibonacci series.

• Add a function written in assembly to the program:

– function_double: given 2 arguments, m and n, prints the number m * 2^n (m, n > 0).

section .rodata
LC0:
        DB "the result is: %d", 10, 0   ; printf format string

section .data
an_2:   DD 0
an_1:   DD 1
helper: DD 0

section .text
        global function_fib
        extern printf

function_fib:
        push ebp
        mov ebp, esp
        pushad
        mov ecx, dword [ebp+8]          ; ecx = first argument (loop counter)

label_here:
        mov eax, [an_1]
        mov ebx, [an_2]
        add eax, ebx
        mov [helper], eax
        mov ebx, [an_1]
        mov [an_2], ebx
        mov ebx, [helper]
        mov [an_1], ebx
        loop label_here, ecx

        mov eax, [an_2]
        push eax
        push dword LC0
        call printf
        add esp, 8                      ; remove the two printf arguments
        ;;;;;;;;;;;;;;;;;;;;
        popad
        mov eax, [an_2]
        mov esp, ebp
        pop ebp
        ret

; add necessary modifications
; to the various sections

function_double:
        push ebp
        mov ebp, esp
        pushad
        mov ecx, dword [ebp+8]          ; first argument
        mov edx, dword [ebp+12]         ; second argument

        ;; insert your code here

        push eax
        push dword LC0
        call printf
        add esp, 8
        popad
        mov esp, ebp
        pop ebp
        ret
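To see what function_fib computes, its loop can be mirrored in C. This is only a model under assumptions: fib_model is a hypothetical name, it returns the value instead of printing it, and it uses local variables, so it corresponds to the first call only (in the assembly, an_1 and an_2 live in the .data section and keep their values between calls).

```c
#include <assert.h>
#include <stdint.h>

/* C model of the function_fib loop on its first call (an_2 = 0, an_1 = 1). */
uint32_t fib_model(uint32_t n) {
    assert(n > 0);   /* with ecx = 0 the LOOP counter would wrap around */
    uint32_t an_2 = 0, an_1 = 1;
    for (uint32_t ecx = n; ecx != 0; ecx--) {
        uint32_t helper = an_1 + an_2;   /* add eax, ebx / mov [helper], eax */
        an_2 = an_1;                     /* mov [an_2], ebx                  */
        an_1 = helper;                   /* mov [an_1], ebx                  */
    }
    return an_2;                         /* the value pushed for printf      */
}
```

Under these assumptions fib_model(1) and fib_model(2) both give 1, and fib_model(10) gives 55.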

Running NASM

To assemble a file, you issue a command of the form

>nasm -f <format> <filename> [-o <output>] [-l <listing>]

Example:

>nasm -f elf mytry.s -o myelf.o -l mylist.lst

It would create the file myelf.o in ELF format (executable and linkable format) and a listing file named mylist.lst.

We use a main.c file (written in C) to start our program, and sometimes also for input / output from a user. So to compile main.c together with our assembled file we should execute the following command:

>cc main.c myelf.o -o myexe.out

It would create the executable file myexe.out. In order to run it you should run it from the command line:

>./myexe.out
