Assembler lecture 4 S.Šimoňák, DCI FEEI TU of Košice

Addressing

• data access specification
• arrays – specification and manipulation
• impact of addressing on performance

Processor architecture
• CISC (more addressing modes)
• RISC (limited number of addressing modes)
  ◦ instructions work with operands in CPU registers
  ◦ load/store instructions – transfers between registers and memory

Addressing of Pentium processors (x86, CISC)
• Register Addressing Mode – operands in CPU registers (fast)
• Immediate Addressing Mode – the operand is part of the instruction; only one operand can be immediate
• Memory Addressing Mode – several modes, differing in how the effective address (offset) is specified

Memory addressing mode
• motivation – efficient support of HLL constructs and data structures
• available addressing modes (according to address size)
  ◦ 16-bit addresses (as on the i8086)
  ◦ 32-bit addresses (more flexible)

16-bit addresses [1]

32-bit addresses [1]

Comparing 16-bit and 32-bit modes
• 32-bit mode gives higher flexibility in register usage
• the operand size can be taken into account (scale factor)
• which addressing mode will the CPU use?
  ◦ bit D of the CS segment descriptor (D = 1: 32-bit default)
  ◦ the default can be changed explicitly (size override prefix)
    ▪ 66H (operand size override prefix)
    ▪ 67H (address size override prefix)
• using these prefixes makes a mixed mode available (16/32-bit data and addresses)
• in this course, 32-bit data and addresses are generally used

Example: assembler code generated

mov eax, 123      B8 0000007B
mov ax, 123       66 B8 007B      (prefix inserted automatically)
mov al, 123       B0 7B           (different operation code!)
mov EAX, [BX]     67 8B 07
mov AX, [BX]      66 67 8B 07     (both prefixes in use)
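The prefix rule behind this listing can be modelled in a few lines of C. This is only a sketch, not an assembler: the function name `size_prefixes` and its parameters are made up for illustration, and it assumes the prefixes are emitted in the order seen in the listing (66H before 67H).

```c
#include <assert.h>
#include <stddef.h>

enum { OPERAND_SIZE_PREFIX = 0x66, ADDRESS_SIZE_PREFIX = 0x67 };

/* Stores the override prefixes an instruction needs in out[] and returns
 * how many were stored (0, 1 or 2).
 * default_bits: 16 or 32, as selected by bit D of the CS descriptor
 * operand_bits: 8, 16 or 32
 * address_bits: 16 or 32 (pass default_bits if there is no memory operand) */
size_t size_prefixes(int default_bits, int operand_bits, int address_bits,
                     unsigned char out[2])
{
    size_t count = 0;

    assert(default_bits == 16 || default_bits == 32);
    assert(address_bits == 16 || address_bits == 32);

    /* 8-bit operands are selected by a different opcode, not by a prefix. */
    if (operand_bits != 8 && operand_bits != default_bits)
        out[count++] = OPERAND_SIZE_PREFIX;
    if (address_bits != default_bits)
        out[count++] = ADDRESS_SIZE_PREFIX;
    return count;
}
```

With `default_bits = 32`, a 16-bit operand combined with a 16-bit address (as in `mov AX, [BX]`) yields both prefixes, while an 8-bit operand (as in `mov al, 123`) yields none.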

Based addressing

• one of the registers acts as a base when the operand address is calculated
• effective address – sum of the register content and a signed displacement

Base + disp ; signed displacement

• access to structure elements

Example: number of free places in a given course? [1]

Assume EBX contains the SSA (structure start address):

...
mov AX, [EBX + 48]
sub AX, [EBX + 46]
...
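In C terms, the displacement is simply the offset of a field inside a structure. The sketch below is an assumption-laden illustration: the record layout and the field names are hypothetical, and only the two offsets 46 and 48 are taken from the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: only the offsets 46 and 48 are taken from the snippet. */
struct course_record {
    unsigned char leading_fields[46];  /* contents not specified here */
    uint16_t value_at_46;
    uint16_t value_at_48;
};

/* Loads a 16-bit value from base + disp, like: mov AX, [EBX + disp] */
static uint16_t load16(const unsigned char *base, size_t disp)
{
    uint16_t value;
    memcpy(&value, base + disp, sizeof value);
    return value;
}

/* Mirrors: mov AX, [EBX + 48]  /  sub AX, [EBX + 46] */
uint16_t based_difference(const struct course_record *record)
{
    const unsigned char *base = (const unsigned char *)record;

    /* The displacements are just the field offsets within the record. */
    assert(offsetof(struct course_record, value_at_46) == 46);
    assert(offsetof(struct course_record, value_at_48) == 48);

    return (uint16_t)(load16(base, 48) - load16(base, 46));
}
```

A compiler generates the same kind of base-plus-displacement access for `record->value_at_48`; the byte-level version is spelled out here only to make the address arithmetic visible.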

Indexed addressing

• effective address calculation

(Index * scale) + disp ; signed displacement

• access to array elements
  ◦ start of array (disp)
  ◦ array element (index register)
  ◦ element size (scale – 2, 4, 8 – 32-bit mode only)

Example:
mov EAX, [marks_table + ESI*4] ; elements of marks_table and table1 are 4 B each; here ESI is the element index
add EAX, [table1 + ESI]        ; here ESI is an offset in bytes (e.g. 36 for the 10th element)

Based-indexed addressing

• two types
  ◦ without taking the operand size into account (based-indexed with no scale factor)

Base + Index + disp ; signed displacement (8/16-bit in 16-bit mode, 8/32-bit in 32-bit mode)

    ▪ two-dimensional arrays (disp – start of array)
    ▪ arrays of records (disp – offset of the record element)

  ◦ taking the operand size into account (based-indexed with scale factor)
    ▪ efficient way of accessing elements of two-dimensional arrays (element size 2, 4, 8 B)

Base + (Index * scale) + disp
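All the memory addressing modes above are special cases of one formula. A minimal C sketch of the 32-bit calculation follows; the function name `effective_address` is illustrative only, and the result wraps around modulo 2^32 as 32-bit address arithmetic does.

```c
#include <assert.h>
#include <stdint.h>

/* EA = base + (index * scale) + disp, computed modulo 2^32.
 * Based addressing:   index = 0
 * Indexed addressing: base = 0
 * No scale factor:    scale = 1 */
uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale,
                           int32_t disp)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint32_t)disp;
}
```

For example, element 9 of an array of 4-byte elements starting at address 2000H is found at `effective_address(0, 9, 4, 0x2000)`, which is 36 bytes past the start of the array.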

Example: Insertion sort – the program reads a sequence of integers and prints them in sorted order

• how the algorithm works (insert a new element into a sorted array at the correct position)
  ◦ we start with an empty array
  ◦ after the first element is inserted, the array is sorted
  ◦ insert the next element at the correct position
  ◦ repeat the process until all elements are inserted
• pseudocode
  ◦ index i – element being inserted
  ◦ elements to the left of i – sorted
  ◦ elements to insert – to the right of i (including i)
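The steps above can be sketched in C as follows. This is an illustration of the algorithm rather than a line-by-line translation of the assembly program: here `j` marks the free slot (one position to the right of the element being compared), so that an unsigned index never has to go below zero.

```c
#include <assert.h>
#include <stddef.h>

/* Sorts array[0..size-1] in ascending order. */
void insertion_sort(int *array, size_t size)
{
    assert(array != NULL || size == 0);

    for (size_t i = 1; i < size; i++) {
        int temp = array[i];    /* element being inserted */
        size_t j = i;           /* array[0..i-1] is already sorted */

        /* Shift larger elements one position to the right. */
        while (j > 0 && temp < array[j - 1]) {
            array[j] = array[j - 1];
            j--;
        }
        array[j] = temp;        /* correct position found */
    }
}
```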

Main program:

01  %include "asm_io.inc"
02  MAX_SIZE EQU 100
03  segment .data
04  input_prompt db "Enter an input array: "
05               db "(negative terminates input)",0
06  out_msg      db "The array sorted:",0
07
08  segment .bss
09  array        resd MAX_SIZE
10
11  segment .text
12  global _asm_main
13
14  _asm_main:
15      enter 0,0
16      pusha
17      mov EAX, input_prompt
18      call print_string
19      mov EBX, array          ; EBX = address of the next free element
20      mov ECX, MAX_SIZE       ; read at most MAX_SIZE numbers
21  array_loop:
22      call read_int
23      call print_nl
24      cmp EAX,0
25      jl exit_loop            ; a negative number terminates input
26      mov [EBX], EAX
27      add EBX, 4
28      loop array_loop
29  exit_loop:
30      mov EDX, EBX
31      sub EDX, array          ; EDX = number of bytes stored
32      shr EDX, 2              ; EDX = number of elements (bytes / 4)
33      push EDX                ; second parameter: array size
34      push array              ; first parameter: array address
35      call insertion_sort
36      mov EAX, out_msg
37      call print_string
38      call print_nl
39      mov ECX, EDX            ; EDX is preserved by insertion_sort (pushad/popad)
40      mov EBX, array
41  display_loop:
42      mov EAX, [EBX]
43      call print_int
44      call print_nl
45      add EBX, 4
46      loop display_loop
47  done:
48      popa
49      mov EAX, 0
50      leave
51      ret

Procedure insertion_sort:

52  %define SORT_ARRAY EBX
53  insertion_sort:
54      pushad
55      mov EBP, ESP
56
57      mov EBX, [EBP+36]       ; array address (32 B of pushad + 4 B return address)
58      mov ECX, [EBP+40]       ; array size
59      mov ESI, 4              ; i = 1 (as a byte offset)
60  for_loop:
61      ; variable mapping:
62      ; EDX = temp, ESI = i, and EDI = j
63      mov EDX, [SORT_ARRAY+ESI]   ; temp = array[i]
64      mov EDI, ESI            ; j = i-1
65      sub EDI, 4
66  while_loop:
67      cmp EDX, [SORT_ARRAY+EDI]
68      ; temp < array[j]
69      jge exit_while_loop
70      ; array[j+1] = array[j]
71      mov EAX, [SORT_ARRAY+EDI]
72      mov [SORT_ARRAY+EDI+4], EAX
73      sub EDI, 4              ; j = j-1
74      cmp EDI, 0              ; j >= 0
75      jge while_loop
76  exit_while_loop:
77      ; array[j+1] = temp
78      mov [SORT_ARRAY+EDI+4], EDX
79      add ESI, 4              ; i = i+1
80      dec ECX
81      cmp ECX, 1              ; the loop body runs size-1 times (assumes size >= 2)
82      jne for_loop
83  sort_done:
84      popad
85      ret 8                   ; remove the two 4-byte parameters from the stack

• procedure without a return value (pushad/popad)
• access to parameters (pushad pushes 32 B)
• while loop (lines 66–76)
• for loop (lines 60–82)
• based addressing (lines 57, 58)
• based-indexed addressing (lines 63, 67, 71, 72, 78)

Arrays

One dimensional arrays

• one-dimensional array in C (indexes start at 0)

int test_marks[10];

  ◦ HLL declaration (size: 40 B)
    ▪ array name
    ▪ number of elements (10)
    ▪ element size (4 B)
    ▪ element type (int)
    ▪ indexes (0 – 9)

• array in assembly language – space allocation only

test_marks resd 10

  ◦ correct access to elements is the programmer's task (indexes, element size)
  ◦ elements are stored linearly
  ◦ offset from the beginning of the array: offset = index * element size

Multi dimensional arrays

• two-dimensional array in C (5 rows x 3 columns)

int class_marks[5][3];

  ◦ memory representation (linear array of bytes)
    ▪ row-major ordering, e.g. C
    ▪ column-major ordering, e.g. Fortran

• two-dimensional array in assembly language [1]
  ◦ the memory representation is essential
  ◦ allocation (60 B)

class_marks resd 5*3

  ◦ index to offset translation (row-major)

offset = (i * COLS + j) * ELM_SIZE

  ◦ COLS – number of columns, i – row index, j – column index
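The translation formula can be checked against the addresses a C compiler computes for the same array. The sketch below assumes a 4-byte `int` only in the worked numbers of the usage note; the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 5, COLS = 3 };

/* offset = (i * COLS + j) * ELM_SIZE for row-major ordering */
size_t row_major_offset(size_t i, size_t j, size_t cols, size_t elm_size)
{
    assert(j < cols);
    return (i * cols + j) * elm_size;
}

/* Returns 1 if the formula agrees with the address of class_marks[i][j]. */
int offset_matches_c(size_t i, size_t j)
{
    static int class_marks[ROWS][COLS];
    const char *start = (const char *)class_marks;
    const char *element = (const char *)&class_marks[i][j];

    assert(i < ROWS && j < COLS);
    return (size_t)(element - start) ==
           row_major_offset(i, j, COLS, sizeof(int));
}
```

With 4-byte elements, `class_marks[4][2]` (the last element) lies at offset (4 * 3 + 2) * 4 = 56, which is the last 4 bytes of the 60 B allocated.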

Integer arithmetic

• impact of arithmetic and logic instructions on status flags
• multiplication and division
• multi-word arithmetic

Status flags

• 6 flags – monitoring of operation results
• ZF, CF, OF, SF, AF, PF
• once a flag is updated, it remains unchanged until another instruction changes its state
  ◦ not all instructions affect status flags (add, sub – all 6; inc, dec – all except CF; mov, push – none)
• flags can be tested (individually or in combinations) to control program execution

Zero flag
• if the result of the last operation affecting ZF was 0, then ZF = 1; otherwise ZF = 0
• intuitive for sub; for other instructions sometimes a bit less intuitive

Example:
mov AL,0FH
add AL,0F1H     (sets ZF = 1, all 8 bits of AL are 0)

mov AX,0FFFFH
inc AX          (sets ZF = 1)

mov EAX,1
dec EAX         (sets ZF = 1)

• conditional jump instructions: jz (if ZF = 1), jnz (if ZF = 0)

• using the ZF
  ◦ test of equality (often with the cmp instruction)

cmp char,'$'

cmp EAX, EBX

  ◦ counting to a given value
    ▪ M, N ≥ 1, inner loop (ECX/loop – the loop instruction does not affect flags)
    ▪ outer loop (EDX/dec/jnz)

Carry flag
• the result of an arithmetic operation on unsigned numbers exceeded the range of the destination (R/M)

Example:
mov AL,0FH         00001111
add AL,0F1H        11110001
                   --------
                  100000000

• for an 8-bit register a 9th bit would be required (AL has only 8 bits)
• value range of unsigned integers (8 bits: 0 to 255; 16 bits: 0 to 65,535; 32 bits: 0 to 4,294,967,295)
• an operation producing a result out of range sets CF
• a negative result is therefore out of range as well

Example:
mov EAX,12AEH                      mov EAX,0
sub EAX,12AFH                      dec EAX
(4782 - 4783 = -1, CF = 1)         (result is -1, but CF is unchanged; inc, dec do not affect CF)

• conditional jump instructions: jc (if CF = 1), jnc (if CF = 0)
• using CF
  ◦ carry/borrow propagation in multi-word addition/subtraction
    ▪ instructions work on 8, 16 or 32-bit operands; larger operands are processed step by step, taking the carry into account
  ◦ underflow/overflow detection
    ▪ indicates a result out of range (handling the situation is up to the programmer)
  ◦ testing a bit using shifts/rotations
    ▪ the bit (MSb, LSb) is captured in CF – conditional jumps can be used (conditional code execution)
• instructions inc, dec don't affect CF
  ◦ they are typically used for loop counters, and a 32-bit count is enough for most applications
  ◦ the condition detected by CF is also detectable with ZF (setting CF would be redundant)
    ▪ if ECX = FFFFFFFFH and inc is executed

inc ECX

    ▪ we would expect CF = 1, but the condition can also be detected by ZF (ECX = 0)

Overflow flag
• like CF, but for operations on signed numbers
• indicates a result outside the valid range
• ranges for signed numbers (8 bits: -128 to 127; 16 bits: -32,768 to 32,767; 32 bits: -2,147,483,648 to 2,147,483,647)

Example:
mov AL,72H
add AL,0EH      (114 + 14 = 128, OF = 1)

• 128 (80H) is the correct sum for unsigned numbers
• under the signed interpretation it is incorrect: 80H means -128

Signed/unsigned interpretation
• how does the system know how the program interprets a string of bits? (it does not)
• the processor takes both interpretations into account and sets CF and OF accordingly

mov AL,72H
add AL,0EH      (114 + 14 = 128: CF = 0, OF = 1)

• checking the appropriate flag is the programmer's task

• conditional jump instructions: jo (if OF = 1), jno (if OF = 0)
• software interrupt instruction: into (interrupt on overflow, generates INT 4 if OF = 1)
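How the processor can set ZF, CF and OF without knowing the intended interpretation is easy to see in a small C model of an 8-bit add. This is a sketch for illustration, not a description of the hardware; the struct and function names are made up.

```c
#include <assert.h>
#include <stdint.h>

struct add8_flags {
    uint8_t result;  /* new value of the 8-bit destination */
    int zf;          /* result is zero */
    int cf;          /* unsigned result does not fit in 8 bits */
    int of;          /* signed result does not fit in 8 bits */
};

struct add8_flags add8(uint8_t a, uint8_t b)
{
    struct add8_flags f;
    unsigned wide = (unsigned)a + (unsigned)b;  /* up to 9 significant bits */

    f.result = (uint8_t)wide;
    f.zf = (f.result == 0);
    f.cf = (wide > 0xFFu);
    /* Signed overflow: both operands have the same sign bit,
     * but the sign bit of the result differs from it. */
    f.of = (((a ^ f.result) & (b ^ f.result)) & 0x80) != 0;
    return f;
}

/* The examples from the text, written as checks. */
void add8_examples(void)
{
    struct add8_flags f = add8(0x72, 0x0E);   /* 114 + 14 */
    assert(f.result == 0x80 && f.cf == 0 && f.of == 1);

    f = add8(0x0F, 0xF1);                     /* 0FH + 0F1H */
    assert(f.result == 0 && f.zf == 1 && f.cf == 1 && f.of == 0);
}
```

Both flags are computed from the same bit pattern; which of them is meaningful depends only on whether the program treats the operands as unsigned (CF) or signed (OF).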

Sign flag
• the sign of the operation result
• useful only with the signed interpretation
• copy of the most significant bit of the result

Example:
mov EAX,15
add EAX,97      (15 + 97 = 112, SF = 0)

mov EAX,15
sub EAX,97      (15 – 97 = -82, SF = 1)

15 + (-97):
  00001111   (15)
  10011111   (-97, two's complement)
  --------
  10101110   (-82, two's complement)

• conditional jump instructions: js (if SF = 1), jns (if SF = 0)
• usage
  ◦ testing the sign of a result
  ◦ loops whose control variable decreases down to zero (inclusive)

Auxiliary carry flag
• carry out of (or borrow into) the lower 4 bits (nibble) of the operand

mov AL,43              00101011 (43)
add AL,94   (AF = 1)   01011110 (94)
                       --------
                       10001001 (137)

mov AL,43              00101011 (43)
add AL,84   (AF = 0)   01010100 (84)
                       --------
                       01111111 (127)

• related instructions
  ◦ there are no conditional jumps testing AF
  ◦ arithmetic operations with BCD numbers
    ▪ aaa, aas, aam, aad (ASCII adjust for addition, subtraction, ...)
    ▪ daa, das (decimal adjust for addition, ...)
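The two additions above can be checked with a one-line C model of AF for an 8-bit add: only the lower nibbles of the operands matter. The function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* AF after an 8-bit add: carry out of bit 3 into bit 4. */
int aux_carry_after_add8(uint8_t a, uint8_t b)
{
    return ((a & 0x0F) + (b & 0x0F)) > 0x0F;
}

/* The examples from the text, written as checks. */
void aux_carry_examples(void)
{
    assert(aux_carry_after_add8(43, 94) == 1);  /* 0BH + 0EH = 19H, carry */
    assert(aux_carry_after_add8(43, 84) == 0);  /* 0BH + 04H = 0FH, no carry */
}
```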

Parity flag
• parity of the 8-bit result of an operation (only the lower 8 bits affect PF)
• even number of 1 bits: PF = 1; odd number: PF = 0

mov AL,53              00110101 (53)
add AL,89   (PF = 1)   01011001 (89)
                       --------
                       10001110 (142)

• related instructions – jumps: jp (if PF = 1), jnp (if PF = 0)
• usage (e.g. data encoding)

Example: modem transfer using 7-bit ASCII code
• simple detection of transfer errors – a parity bit is added to the 7-bit datum
• assume even parity encoding (the 8th bit is set when needed)
• the receiver counts the ones in the received byte (an odd count indicates an error)
  ◦ A – 41H (code: 01000001, MSb – 0)
  ◦ C – 43H (code: 11000011, MSb – 1, set)
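The even-parity scheme from the modem example can be sketched in C as follows. The function names are illustrative; `parity_flag` models PF on a single byte, and the sender and receiver are both built on it.

```c
#include <assert.h>
#include <stdint.h>

/* PF as defined above: 1 if the byte has an even number of 1 bits. */
int parity_flag(uint8_t low_byte)
{
    int ones = 0;
    for (int bit = 0; bit < 8; bit++)
        ones += (low_byte >> bit) & 1;
    return (ones % 2) == 0;
}

/* Sender: set the 8th bit (MSb) if needed so that the byte has even parity. */
uint8_t add_even_parity(uint8_t ascii7)
{
    assert(ascii7 < 0x80);
    return parity_flag(ascii7) ? ascii7 : (uint8_t)(ascii7 | 0x80);
}

/* Receiver: an odd number of ones means a transfer error was detected. */
int parity_error(uint8_t received)
{
    return !parity_flag(received);
}
```

A single flipped bit always changes the count of ones from even to odd, so it is detected; two flipped bits cancel out and go unnoticed, which is why this is only a simple error check.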

Example: effects of arithmetic operations on flags [1]

Arithmetic instructions

• addition (add, adc, inc)
• subtraction (sub, sbb, dec, neg, cmp)
• multiplication and division (mul, imul, div, idiv)
• related instructions (cbw, cwd, cdq, cwde, movsx, movzx)
• addition and subtraction – already discussed

Instructions for multiplying

• properties of the multiplication operation
  ◦ result size: 2n bits when multiplying two n-bit numbers
  ◦ multiplication of signed numbers differs from that of unsigned numbers (hence 2 multiplication instructions)

• multiplication of unsigned numbers (mul)
  ◦ syntax

mul src (src – 8, 16, 32-bit GPR or memory)

  ◦ semantics (according to the size of src)
    ▪ 8 bits: AX ← src * AL
    ▪ 16 bits: DX:AX ← src * AX
    ▪ 32 bits: EDX:EAX ← src * EAX
  ◦ of the 6 status flags only CF and OF are defined; the rest are left undefined
    ▪ CF and OF are set if the upper half of the result (AH, DX, EDX) is not zero

Example:
mov AL,10                     mov AL,10
mov DL,25                     mov DL,26
mul DL ; CF = OF = 0          mul DL ; CF = OF = 1

• multiplication of signed numbers (imul)
  ◦ syntax (like mul, plus additional formats, e.g. with an immediate operand)
  ◦ CF and OF are set if the upper half of the result is not the sign extension of the lower half

Example: sign extension of the value -66

(-66)10 = (10111110)2             8-bit
(-66)10 = (1111111110111110)2     16-bit

Example:
mov DL,0FFH ; DL = -1
mov AL,42H  ; AL = 66
imul DL     ; AX = -66 (1111111110111110)2, CF = OF = 0
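The 8-bit forms of mul and imul, including the rule for CF and OF, can be modelled in C. This is a sketch with made-up names; it covers only the one-operand 8-bit case, where the 16-bit product goes to AX.

```c
#include <assert.h>
#include <stdint.h>

struct mul8_result {
    uint16_t ax;   /* 16-bit product as stored in AX */
    int cf_of;     /* CF and OF always get the same value */
};

/* mul src (8-bit): AX = AL * src, unsigned. */
struct mul8_result mul8(uint8_t al, uint8_t src)
{
    struct mul8_result r;
    r.ax = (uint16_t)((unsigned)al * (unsigned)src);
    r.cf_of = (r.ax >> 8) != 0;             /* upper half (AH) is not zero */
    return r;
}

/* imul src (8-bit): AX = AL * src, signed. */
struct mul8_result imul8(int8_t al, int8_t src)
{
    struct mul8_result r;
    int product = (int)al * (int)src;       /* always fits in 16 bits */
    r.ax = (uint16_t)product;
    /* AH is the sign extension of AL exactly when the product fits in 8 bits. */
    r.cf_of = (product < -128 || product > 127);
    return r;
}

/* The examples from the text, written as checks. */
void mul8_examples(void)
{
    assert(mul8(10, 25).ax == 250 && mul8(10, 25).cf_of == 0);
    assert(mul8(10, 26).ax == 260 && mul8(10, 26).cf_of == 1);
    assert(imul8(66, -1).ax == 0xFFBE && imul8(66, -1).cf_of == 0);
}
```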

Instructions for dividing

• properties of the division operation
  ◦ two values as a result – quotient and remainder
  ◦ multiplication (result twice as long as the operands) cannot overflow; division can (divide overflow)

• syntax

div src (unsigned, src – 8, 16, 32-bit GPR or memory)
idiv src (signed)

• semantics of the div instruction (according to the size of the divisor src)
  ◦ 8 bits: AL ← quot(AX/src), AH ← rem(AX/src)
  ◦ 16 bits: AX ← quot(DX:AX/src), DX ← rem(DX:AX/src)
  ◦ 32 bits: EAX ← quot(EDX:EAX/src), EDX ← rem(EDX:EAX/src)

• flags – all six status flags are undefined after div/idiv
• semantics of the idiv instruction – same format and behaviour as div
  ◦ complication – if the dividend is negative, sign extension is required

Example: division -251/12 (16-bit)
  ◦ (-251) = FF05H, thus DX is initialized to FFFFH
  ◦ if DX were initialized to 0000H (as for div), DX:AX would represent a positive number!
  ◦ if the dividend is positive, DX should be 0000H

• instructions for sign extension
  ◦ cbw (convert byte to word) – extends AL into AH (8-bit idiv)
  ◦ cwd (convert word to doubleword) – extends AX into DX (16-bit idiv)
  ◦ cdq (convert doubleword to quadword) – extends EAX into EDX (32-bit idiv)

• other related instructions
  ◦ cwde – sign-extends AX into EAX
  ◦ movsx dst,src (move sign-extended src to dst)
    ▪ dst – R, src – R/M; if src is 8-bit, dst is 16- or 32-bit; if src is 16-bit, dst is 32-bit
  ◦ movzx dst,src (move zero-extended src to dst)
    ▪ operands as for movsx

Example: 16-bit signed division

mov AX,-5147
cwd            ; DX = FFFFH
mov CX,300
idiv CX        ; AX = FFEFH (-17) quotient, DX = FFD1H (-47) remainder
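Why the sign extension matters can be tried out with a C model of the 16-bit idiv. The names `cwd_dx` and `idiv16` are illustrative; the model relies on the fact that C integer division truncates toward zero, which gives the same quotient and remainder as in the example above.

```c
#include <assert.h>
#include <stdint.h>

struct idiv16_result {
    int16_t quotient;   /* goes to AX */
    int16_t remainder;  /* goes to DX */
};

/* cwd: the value DX receives when AX is sign-extended into DX:AX. */
uint16_t cwd_dx(uint16_t ax)
{
    return (ax & 0x8000u) ? 0xFFFFu : 0x0000u;
}

/* idiv with a 16-bit divisor: signed division of the 32-bit value DX:AX. */
struct idiv16_result idiv16(uint16_t dx, uint16_t ax, int16_t src)
{
    struct idiv16_result r;
    uint32_t bits = ((uint32_t)dx << 16) | ax;
    /* Interpret the 32 bits as a two's complement number. */
    int64_t dividend = (bits & 0x80000000u) ? (int64_t)bits - 4294967296LL
                                            : (int64_t)bits;
    int64_t quotient;

    assert(src != 0);                       /* the CPU raises a divide error */
    quotient = dividend / src;              /* truncates toward zero */
    assert(quotient >= -32768 && quotient <= 32767);  /* else divide overflow */

    r.quotient = (int16_t)quotient;
    r.remainder = (int16_t)(dividend % src);
    return r;
}
```

Calling `idiv16(cwd_dx(ax), ax, 300)` with `ax` holding -5147 gives -17 and -47; passing `0x0000` for DX instead makes DX:AX the positive number 60389 and the result comes out as 201 with remainder 89.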

Using shifts for multiplying and dividing
• an efficient alternative for these operations; use it whenever possible (multiplying/dividing by a power of 2)

Example: AX * 32 (multiplicand in AX), 2 alternatives (b is better in both speed and space)

a) mov CX,32
   mul CX

b) sal AX,5
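That both alternatives leave the same value in AX can be confirmed exhaustively for 16-bit operands. The C sketch below compares only the low 16 bits of the product (the part kept in AX); the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Alternative a): multiply by 32, keeping the low 16 bits (AX). */
uint16_t times32_mul(uint16_t ax)
{
    return (uint16_t)(ax * 32u);
}

/* Alternative b): shift left by 5, since 32 = 2^5. */
uint16_t times32_shift(uint16_t ax)
{
    return (uint16_t)((unsigned)ax << 5);
}

/* Checks that both alternatives agree for every 16-bit value. */
void check_shift_equals_mul(void)
{
    for (unsigned ax = 0; ax <= 0xFFFFu; ax++)
        assert(times32_mul((uint16_t)ax) == times32_shift((uint16_t)ax));
}
```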

Arithmetic operations over multiword data (multiword arithmetic)

• arithmetic instructions work with 8, 16 and 32-bit data (larger data are a problem)
• basics of multiword arithmetic

Addition and subtraction (64-bit, unsigned)
• relatively simple
• addition – first add the lower (right) 32 bits, then the upper (left) 32 bits together with the carry from the first step

Example: addition of two 64-bit numbers in EBX:EAX and EDX:ECX, result in EBX:EAX. Overflow is indicated by CF.

add64:
    add EAX,ECX    ; subtraction is similar (add→sub, adc→sbb)
    adc EBX,EDX
    ret
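The same two-step addition can be written in C using only 32-bit arithmetic, with the carry made explicit. This is a sketch with made-up type and function names; the `hi`/`lo` pairs play the role of EBX:EAX and EDX:ECX.

```c
#include <assert.h>
#include <stdint.h>

struct u64_halves {
    uint32_t hi;   /* upper 32 bits (EBX or EDX) */
    uint32_t lo;   /* lower 32 bits (EAX or ECX) */
};

/* a = a + b; returns the final carry (CF after the adc step). */
int add64(struct u64_halves *a, struct u64_halves b)
{
    uint32_t lo = a->lo + b.lo;        /* add EAX,ECX */
    uint32_t carry = (lo < b.lo);      /* CF: the 32-bit sum wrapped around */

    uint32_t hi = a->hi + b.hi;        /* adc EBX,EDX, first the operands ... */
    uint32_t carry_hi = (hi < b.hi);
    hi += carry;                       /* ... then the carry from the low half */
    carry_hi |= (hi < carry);

    a->lo = lo;
    a->hi = hi;
    return (int)carry_hi;
}

/* Two checks: a carry into the upper half, and a carry out of 64 bits. */
void add64_examples(void)
{
    struct u64_halves a = { 0x00000001u, 0xFFFFFFFFu };
    struct u64_halves one = { 0u, 1u };

    assert(add64(&a, one) == 0 && a.hi == 2u && a.lo == 0u);

    a.hi = 0xFFFFFFFFu;
    a.lo = 0xFFFFFFFFu;
    assert(add64(&a, one) == 1 && a.hi == 0u && a.lo == 0u);
}
```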

Multiplication and division

• detailed information on multiplication and division can be found in [1]

Study literature:
[1] Dandamudi, S. P.: Introduction to Assembly Language Programming, Springer Science+Business Media, Inc., 2005.
[2] Carter, P. A.: PC Assembly Language, 2006, http://www.drpaulcarter.com/pcasm/
