Assembler lecture 4 S.Šimoňák, DCI FEEI TU of Košice
Addressing
• data access specification
• arrays – specification and manipulation
• impact of addressing on performance
Processor architecture
• CISC (more addressing modes)
• RISC (limited number of addressing modes)
◦ instructions work with operands in CPU registers
◦ load/store instructions – transfers between registers and memory
Addressing of Pentium processors (x86, CISC)
• Register Addressing Mode – operands in CPU registers; the fastest mode
• Immediate Addressing Mode – at most one operand can be immediate; it is encoded as part of the instruction
• Memory Addressing Mode – several modes (ways of specifying the effective address (offset))
Memory addressing mode
• motivation – efficient support of HLL constructs and data structures
• available addressing modes (according to address size)
◦ 16-bit addresses (as in the i8086)
◦ 32-bit addresses (more flexible)
16-bit addresses [1]
32-bit addresses [1]
Comparing 16-bit and 32-bit modes
• 32-bit mode gives greater flexibility in register usage
• the operand size can be taken into account (scale factor)
• which addressing mode will the CPU use?
◦ bit D of the CS segment descriptor (D = 1: 32-bit default)
◦ the default can be overridden explicitly (size override prefix)
▪ 66H (operand size override prefix)
▪ 67H (address size override prefix)
• with these prefixes, mixed mode is available (16/32-bit data and addresses)
• in this course, 32-bit data and addresses are generally used
Example: assembler code generated
mov eax, 123      B8 0000007B
mov ax, 123       66 B8 007B     (prefix inserted automatically)
mov al, 123       B0 7B          (different operation code!)
mov EAX, [BX]     67 8B 07
mov AX, [BX]      66 67 8B 07    (both prefixes in use)
Based addressing
• one of the registers serves as the base when the operand address is calculated
• effective address – sum of the register content and a (signed) offset
Base + disp ; signed displacement
• access to structure elements
Example: number of free places in a given course? [1]
let EBX contain SSA (the structure's start address):
...
mov AX, [EBX + 48]
sub AX, [EBX + 46]
...
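The same Base + disp access can be sketched in C. This is only an illustration: the meaning of the two fields (registered students at displacement 46, enrollment limit at displacement 48) is inferred from the subtraction above, and the names load16, free_places, OFF_REGISTERED and OFF_LIMIT are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical field offsets; the values are the displacements from the example. */
enum { OFF_REGISTERED = 46, OFF_LIMIT = 48 };

/* Read a 16-bit value at base + disp, as in: mov AX, [EBX + disp]. */
static uint16_t load16(const unsigned char *base, int32_t disp)
{
    uint16_t value;
    memcpy(&value, base + disp, sizeof value);  /* no alignment assumptions */
    return value;
}

/* Free places = limit - registered (mov AX,[EBX+48] / sub AX,[EBX+46]). */
uint16_t free_places(const unsigned char *ssa)
{
    return (uint16_t)(load16(ssa, OFF_LIMIT) - load16(ssa, OFF_REGISTERED));
}
```

The pointer ssa plays the role of EBX; only the displacement changes from field to field.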
Indexed addressing
• effective address calculation
(Index * scale) + disp ; signed displacement
• access to array elements
◦ start of array (disp)
◦ array element (index register)
◦ element size (scale – 2, 4, 8 – 32-bit mode only)
Example:
mov EAX, [marks_table + ESI*4] ; elements of marks_table and table1 are 4 B each
add EAX, [table1 + ESI] ; here ESI is an offset in bytes (e.g. 36 for the 10th element)
Based – indexed addressing
• two types
◦ without taking operand size into account (based-indexed with no scale factor)
Base + Index + disp ; signed displacement (8/16 bits in 16-bit mode, 8/32 bits in 32-bit mode)
▪ two-dimensional arrays (disp – start of array)
▪ arrays of records (disp – offset of the record element)
◦ taking operand size into account (based-indexed with scale factor)
▪ efficient way of accessing elements of two-dimensional arrays (element size 2, 4, 8 B)
Base + (Index * scale) + disp
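The address formulas above can be checked with a small C model. It is a sketch under assumptions: the function name effective_address is hypothetical, addresses are treated as plain 32-bit numbers, and segmentation is ignored.

```c
#include <stdint.h>

/* Effective address in the general form Base + (Index * scale) + disp.
   Unsigned arithmetic wraps modulo 2^32, like a 32-bit address calculation. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    return base + index * scale + (uint32_t)disp;
}
```

The simpler modes are special cases: based addressing uses index = 0, indexed addressing uses base = 0, and the no-scale-factor form uses scale = 1. For example, with a table assumed to start at 1000H, the 10th element of 4-byte data (index 9) lies at 1000H + 36.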
Example: Insertion sort – the program reads a sequence of integers and prints them in sorted order
• how the algorithm works (a new element is inserted into a sorted array at its correct position)
◦ we start with an empty array
◦ after the first element is inserted, the array is sorted
◦ insert the next element at its correct position
◦ repeat the process until all elements are inserted
• pseudocode
◦ index i – element being inserted
◦ elements to the left of i – sorted
◦ elements to insert – to the right of i (including i)
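The slide's pseudocode is not reproduced in this transcript. As a stand-in, here is a minimal C sketch of insertion sort that uses the same roles for i, j and temp as the assembly listing below; it is an illustration, not necessarily the pseudocode from the lecture, and unlike the assembly version it also accepts arrays with fewer than two elements.

```c
#include <stddef.h>

/* Sort array[0..n-1] in ascending order by repeated insertion. */
void insertion_sort(int array[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int temp = array[i];          /* element being inserted */
        size_t j = i;                 /* array[0..i-1] is already sorted */
        while (j > 0 && temp < array[j - 1]) {
            array[j] = array[j - 1];  /* shift the larger element one place right */
            j--;
        }
        array[j] = temp;              /* put temp into the gap */
    }
}
```

Here j is kept one position higher than in the assembly code, so that an unsigned index never has to go below zero.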
Main program:
01  %include "asm_io.inc"
02  MAX_SIZE     EQU 100
03  segment .data
04  input_prompt db "Enter an input array: "
05               db "(negative terminates input)",0
06  out_msg      db "The array sorted:",0
07
08  segment .bss
09  array        resd MAX_SIZE
10
11  segment .text
12          global _asm_main
13
14  _asm_main:
15          enter 0,0
16          pusha
17          mov EAX, input_prompt
18          call print_string
19          mov EBX, array          ; EBX = address of the next free slot
20          mov ECX, MAX_SIZE       ; read at most MAX_SIZE values
21  array_loop:
22          call read_int           ; value returned in EAX
23          call print_nl
24          cmp EAX,0
25          jl exit_loop            ; a negative value ends the input
26          mov [EBX], EAX
27          add EBX, 4
28          loop array_loop
29  exit_loop:
30          mov EDX, EBX
31          sub EDX, array
32          shr EDX, 2              ; EDX = number of values read
33          push EDX                ; parameter: array size
34          push array              ; parameter: array address
35          call insertion_sort
36          mov EAX, out_msg
37          call print_string
38          call print_nl
39          mov ECX, EDX
40          mov EBX, array
41  display_loop:
42          mov EAX, [EBX]
43          call print_int
44          call print_nl
45          add EBX, 4
46          loop display_loop
47  done:
48          popa
49          mov EAX, 0
50          leave
51          ret
Procedure insertion_sort:
52  %define SORT_ARRAY EBX
53  insertion_sort:
54          pushad
55          mov EBP, ESP
56
57          mov EBX, [EBP+36]       ; array address
58          mov ECX, [EBP+40]       ; array size
59          mov ESI, 4
60  for_loop:
61          ; variable mapping:
62          ; EDX = temp, ESI = i, and EDI = j
63          mov EDX, [SORT_ARRAY+ESI]
64          mov EDI, ESI            ; j = i-1
65          sub EDI, 4
66  while_loop:
67          cmp EDX, [SORT_ARRAY+EDI]
68          ; temp < array[j]
69          jge exit_while_loop
70          ; array[j+1] = array[j]
71          mov EAX, [SORT_ARRAY+EDI]
72          mov [SORT_ARRAY+EDI+4], EAX
73          sub EDI, 4              ; j = j-1
74          cmp EDI, 0              ; j >= 0
75          jge while_loop
76  exit_while_loop:
77          ; array[j+1] = temp
78          mov [SORT_ARRAY+EDI+4], EDX
79          add ESI, 4              ; i = i+1
80          dec ECX
81          cmp ECX, 1
82          jne for_loop
83  sort_done:
84          popad
85          ret 8
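• procedure without return value (pushad/popad)
• access to parameters (pushad – 32 B)
• while loop (lines 66 – 76)
• for loop (lines 60 – 82)
• based addressing (lines 57, 58)
• based-indexed addressing (lines 63, 67, 71, 72, 78)
• the for loop tests its condition at the bottom (line 82) and ends only when ECX reaches 1, so the procedure assumes at least two elements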
Arrays
One dimensional arrays
• one dimensional array in C (index starts at 0)
int test_marks[10];
◦ HLL declaration (size: 40 B)
▪ array name
▪ number of elements (10)
▪ element size (4 B)
▪ element type (int)
▪ indexes (0 – 9)
• array in assembly language – space allocation
test_marks resd 10
◦ correct access to elements is the programmer's task (indexes, element size)
◦ elements are stored linearly
◦ offset from the beginning of the array (offset = index * element size)
Multi dimensional arrays
• two dimensional array in C (5 rows x 3 columns)
int class_marks[5][3];
◦ memory representation (linear array of bytes)
▪ row-major ordering, e.g. C
▪ column-major ordering, e.g. Fortran
• two dimensional array in assembly language [1]
◦ memory representation is essential
◦ allocation (60 B)
class_marks resd 5*3
◦ index to offset translation (row-major)
offset = (i * COLS + j) * ELM_SIZE
◦ COLS – number of columns, i – row, j – column
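The translation formula can be cross-checked against what a C compiler does for class_marks. The sketch below is illustrative; row_major_offset is a hypothetical helper name.

```c
#include <stddef.h>

/* Byte offset of element [i][j] in a row-major array with `cols` columns. */
size_t row_major_offset(size_t i, size_t j, size_t cols, size_t elm_size)
{
    return (i * cols + j) * elm_size;
}
```

For the 5 x 3 array of 4-byte elements, element [4][2] (the last one) lies at offset (4 * 3 + 2) * 4 = 56, which is the last 4 bytes of the 60 B block. A one-dimensional array is the special case with i = 0, giving offset = index * element size.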
Integer arithmetic
• effect of arithmetic and logic instructions on the status flags
• multiplication and division
• multiword arithmetic
Status flags
• 6 flags – monitor the results of operations
• ZF, CF, OF, SF, AF, PF
• once a flag is updated, it remains unchanged until another instruction changes its state
◦ not all instructions affect the status flags (add, sub – all 6; inc, dec – all except CF; mov, push – none)
• flags can be tested (individually or in combination) to control program execution
Zero flag
• if the result of the last operation (affecting ZF) was 0, ZF = 1; otherwise ZF = 0
• for sub this is intuitive; for other instructions sometimes a bit less so
Example:
mov AL,0FH
add AL,0F1H (sets ZF = 1, all 8 bits of AL are 0)

mov AX,0FFFFH
inc AX (sets ZF = 1)

mov EAX,1
dec EAX (sets ZF = 1)
• conditional jump instructions: jz (if ZF = 1), jnz (if ZF = 0)
• using the ZF
◦ test of equality (often with the cmp instruction)
cmp char,'$'
cmp EAX, EBX
◦ counting to a given value
▪ M, N ≥ 1, inner loop (ECX/loop – does not affect flags)
▪ outer loop (EDX/dec/jnz)
Carry flag
• the result of an arithmetic operation on unsigned numbers exceeded the range of the destination (R/M)
Example:
mov AL,0FH      00001111
add AL,0F1H     11110001
                --------
               100000000
• for an 8-bit register (AL) a 9th bit would be required
• value range of unsigned integers (8 bits: 0 to 255; 16 bits: 0 to 65 535; 32 bits: 0 to 4 294 967 295)
• an operation producing an out-of-range result sets CF
• a negative result is therefore also out of range
Example:
mov EAX,12AEH
sub EAX,12AFH (4782 - 4783 = -1, CF = 1)

mov EAX,0
dec EAX (result is -1 as well, but CF is left unchanged; inc, dec do not affect CF)
• conditional jump instructions: jc (if CF = 1), jnc (if CF = 0)
• using CF
◦ carry/borrow propagation in multiword addition/subtraction
▪ instructions take 8-, 16- or 32-bit operands; larger operands are processed step by step, taking the carry into account
◦ underflow/overflow detection
▪ indicates an out-of-range result (handling the situation is up to the programmer)
◦ testing a bit using shifts/rotations
▪ the bit (MSb, LSb) is captured in CF – conditional jumps can then be used (conditional code execution)
• the instructions inc and dec do not affect CF
◦ they often count loop iterations, and a 32-bit count is enough for most applications
◦ the condition that CF would signal can also be detected with ZF (so setting CF would be redundant)
▪ if ECX = FFFFFFFFH and inc ECX is executed
▪ we might expect CF = 1, but the condition can also be detected with ZF (ECX = 0)
Overflow flag
• like CF, but for operations on signed numbers
• indicates a result outside the valid range
• ranges for signed numbers (8 bits: -128 to 127; 16 bits: -32 768 to 32 767; 32 bits: -2 147 483 648 to 2 147 483 647)
Example:
mov AL,72H
add AL,0EH (114 + 14 = 128, OF = 1)
• 128 (80H) is the correct sum when the numbers are unsigned
• under signed interpretation it is incorrect: 80H means -128
Signed/unsigned interpretation
• how does the processor know how the program interprets a string of bits? (it does not)
• the processor considers both interpretations and sets CF and OF accordingly
mov AL,72H
add AL,0EH (114 + 14 = 128: CF = 0, OF = 1)
• checking the appropriate flag is the programmer's task
• conditional jump instructions: jo (if OF = 1), jno (if OF = 0)
• software interrupt instruction: into (interrupt on overflow, generates INT 4 if OF = 1)
Sign flag
• the sign of the operation result
• useful only under signed interpretation
• copy of the most significant bit of the result
Example:
mov EAX,15
add EAX,97 (15 + 97 = 112, SF = 0)

mov EAX,15
sub EAX,97 (15 – 97 = -82, SF = 1)

15 + (-97):
  00001111 (15)
  10011111 (-97, two's complement)
  --------
  10101110 (-82, two's complement)
• conditional jump instructions: js (if SF = 1), jns (if SF = 0)
• usage
◦ the sign of the result
◦ loops whose control variable decreases down to zero (inclusive)
Auxiliary carry flag
• carry out of (or borrow into) the lower 4 bits (nibble) of the operand
mov AL,43            00101011 (43)
add AL,94 (AF=1)     01011110 (94)
                     --------
                     10001001 (137)

mov AL,43            00101011 (43)
add AL,84 (AF=0)     01010100 (84)
                     --------
                     01111111 (127)
• related instructions
◦ no conditional jumps test AF
◦ arithmetic operations on BCD numbers
▪ aaa, aas, aam, aad (ASCII adjust for addition, subtraction, ...)
▪ daa, das (decimal adjust for addition, ...)
Parity flag
• parity of the result, computed over 8 bits (only the lower 8 bits of the result affect PF)
• even number of ones (PF = 1), odd number (PF = 0)
mov AL,53            00110101 (53)
add AL,89 (PF=1)     01011001 (89)
                     --------
                     10001110 (142)
• related instructions – jumps: jp (if PF = 1), jnp (if PF = 0)
• usage (e.g. data encoding)
Example: modem transfer using 7-bit ASCII code
• simple detection of transfer errors – a parity bit is added to the 7-bit datum
• assume even parity encoding (the 8th bit is set when needed)
• the receiver counts the ones in the received byte (error if their number is odd)
◦ A – 41H (code: 01000001, MSb – 0)
◦ C – 43H (code: 11000011, MSb – 1, set)
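The even-parity scheme from the modem example can be sketched in C as follows. The function names ones, add_even_parity and parity_ok are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* Count the 1 bits in a byte. */
static int ones(uint8_t b)
{
    int n = 0;
    while (b) {
        n += b & 1u;
        b >>= 1;
    }
    return n;
}

/* Sender: set bit 7 when needed so the whole byte has an even number of ones. */
uint8_t add_even_parity(uint8_t ascii7)
{
    uint8_t data = ascii7 & 0x7F;
    return (ones(data) % 2) ? (uint8_t)(data | 0x80) : data;
}

/* Receiver: returns 1 if the byte passes the even-parity check, 0 otherwise. */
int parity_ok(uint8_t received)
{
    return ones(received) % 2 == 0;
}
```

With these definitions 'A' (41H) is sent unchanged, while 'C' (43H) is sent as C3H. A single flipped bit in transit makes the count of ones odd, so parity_ok reports an error.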
Example: effects of arithmetic operations on the flags [1]
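The flag rules from the preceding slides can be collected in one small C model of an 8-bit add. This is a sketch for experimenting with the lecture's examples, not a description of how the hardware is built; the names add8 and add8_flags are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint8_t result;
    int zf, cf, of, sf, af, pf;
} add8_flags;

/* Model of: add AL, src (8-bit), returning the result and the six status flags. */
add8_flags add8(uint8_t a, uint8_t b)
{
    add8_flags f;
    unsigned wide = (unsigned)a + b;             /* 9-bit sum */
    int ones = 0;

    f.result = (uint8_t)wide;
    f.zf = (f.result == 0);
    f.cf = (wide > 0xFF);                        /* carry out of bit 7 */
    /* signed overflow: both operands share a sign that the result does not */
    f.of = ((~(a ^ b) & (a ^ f.result)) & 0x80) != 0;
    f.sf = (f.result >> 7) & 1;                  /* copy of the MSb */
    f.af = ((a & 0x0F) + (b & 0x0F)) > 0x0F;     /* carry out of bit 3 */
    for (uint8_t r = f.result; r; r >>= 1)
        ones += r & 1;
    f.pf = (ones % 2 == 0);                      /* even number of ones */
    return f;
}
```

Calling add8(0x0F, 0xF1) gives ZF = 1 and CF = 1, and add8(0x72, 0x0E) gives CF = 0 with OF = 1, matching the examples above.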
Arithmetic instructions
• addition (add, adc, inc)
• subtraction (sub, sbb, dec, neg, cmp)
• multiplication and division (mul, imul, div, idiv)
• related instructions (cbw, cwd, cdq, cwde, movsx, movzx)
• addition and subtraction – already discussed
Instructions for multiplying
• properties of the multiplication operation
◦ result size (2n bits when multiplying two n-bit numbers)
◦ multiplication of signed numbers differs from that of unsigned numbers (hence two multiplication instructions)
• multiplication of unsigned numbers (mul)
◦ syntax
mul src (src – 8, 16, 32-bit GPR or memory)
◦ semantics (according to the size of src)
▪ 8 bits: AX ← src * AL
▪ 16 bits: DX:AX ← src * AX
▪ 32 bits: EDX:EAX ← src * EAX
◦ the instruction affects all six status flags, but only CF and OF are defined; the rest are undefined
▪ CF and OF are set if the upper half of the result (AH, DX, EDX) is not zero
Example:
mov AL,10
mov DL,25
mul DL ; AX = 250, CF = OF = 0

mov AL,10
mov DL,26
mul DL ; AX = 260, CF = OF = 1
• multiplication of signed numbers (imul)
◦ syntax (like mul, with additional formats supported, e.g. an immediate operand)
◦ CF and OF are set if the upper half of the result is not the sign extension of the lower half
Example: sign extension of the value -66
(-66)10 = (10111110)2 8-bit
(-66)10 = (1111111110111110)2 16-bit
Example:
mov DL,0FFH ; DL = -1
mov AL,42H ; AL = 66
imul DL ; AX = -66 (1111111110111110)2, CF = OF = 0
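The CF/OF rules for the 8-bit forms of mul and imul can be modelled in C as below. This is a sketch only; the names mul_result, mul8 and imul8 are hypothetical, and the other (undefined) flags are not modelled.

```c
#include <stdint.h>

typedef struct {
    uint16_t ax;    /* 16-bit product as left in AX */
    int cf_of;      /* CF and OF always get the same value */
} mul_result;

/* mul src (8-bit): AX <- AL * src, unsigned. */
mul_result mul8(uint8_t al, uint8_t src)
{
    mul_result r;
    r.ax = (uint16_t)((unsigned)al * src);
    r.cf_of = (r.ax >> 8) != 0;                   /* AH is not zero */
    return r;
}

/* imul src (8-bit): AX <- AL * src, signed. */
mul_result imul8(int8_t al, int8_t src)
{
    mul_result r;
    int product = (int)al * (int)src;             /* always fits in 16 bits */
    r.ax = (uint16_t)product;                     /* two's complement bit pattern */
    r.cf_of = (product < -128 || product > 127);  /* AH is not the sign extension of AL */
    return r;
}
```

For instance, mul8(10, 26) yields AX = 260 with CF = OF = 1, while imul8(66, -1) yields AX = FFBEH with CF = OF = 0, as in the examples above.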
Instructions for dividing
• properties of the division operation
◦ two values as a result – quotient and remainder
◦ multiplication (result twice as long as the operands) cannot overflow; division can (divide overflow)
• syntax
div src (unsigned, src – 8, 16, 32-bit GPR or memory)
idiv src (signed)
• semantics of the div instruction (according to the size of the divisor src)
◦ 8 bits: AL ← quot(AX/src), AH ← rem(AX/src)
◦ 16 bits: AX ← quot(DX:AX/src), DX ← rem(DX:AX/src)
◦ 32 bits: EAX ← quot(EDX:EAX/src), EDX ← rem(EDX:EAX/src)
• flags – affected by the instructions, but their values are undefined
• semantics of the idiv instruction – same format and behaviour as div
◦ complication – if the dividend is negative, sign extension is required
Example: division -251/12 (16-bit)
◦ (-251) = FF05H, thus DX is initialized to FFFFH
◦ if DX were initialized to 0000H (as for div), DX:AX would represent a positive number!
◦ if the dividend is positive, DX should be 0000H
• instructions for sign extension
◦ cbw (convert byte to word) – extends AL into AH (8-bit idiv)
◦ cwd (convert word to doubleword) – extends AX into DX (16-bit idiv)
◦ cdq (convert doubleword to quadword) – extends EAX into EDX (32-bit idiv)
• other related instructions
◦ cwde – sign extension of AX into EAX
◦ movsx dst,src (move sign-extended src to dst)
▪ dst – R, src – R/M; if src is 8-bit, dst is 16- or 32-bit; if src is 16-bit, dst is 32-bit
◦ movzx dst,src (move zero-extended src to dst)
▪ operands as for movsx
Example: 16-bit signed division
mov AX,-5147
cwd ; DX = FFFFH
mov CX,300
idiv CX ; AX = FFEFH (-17) quotient, DX = FFD1H (-47) remainder
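Why the sign extension matters can be seen in a small C model of the 16-bit idiv. It is a sketch with hypothetical names (div_result, idiv16_after_cwd, idiv16_with_dx_zero); it assumes a non-zero divisor and a quotient that fits in 16 bits, and it does not model the divide error raised by the CPU otherwise. C division truncates toward zero and the remainder takes the sign of the dividend, which matches idiv.

```c
#include <stdint.h>

typedef struct {
    int16_t quotient;   /* left in AX */
    int16_t remainder;  /* left in DX */
} div_result;

/* cwd + idiv src: AX is sign-extended into DX:AX before dividing. */
div_result idiv16_after_cwd(int16_t ax, int16_t src)
{
    int32_t dividend = ax;             /* sign extension, the job done by cwd */
    div_result r = { (int16_t)(dividend / src), (int16_t)(dividend % src) };
    return r;
}

/* Mistake: DX cleared to 0000H, so the bits of AX are read as a positive number. */
div_result idiv16_with_dx_zero(int16_t ax, int16_t src)
{
    int32_t dividend = (uint16_t)ax;   /* zero extension */
    div_result r = { (int16_t)(dividend / src), (int16_t)(dividend % src) };
    return r;
}
```

For AX = -5147 and a divisor of 300, the first function returns -17 and -47 as in the example, while the second returns 201 and 89, because DX:AX then holds 60389.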
Using shifts for multiplying and dividing
• an efficient alternative when multiplying or dividing by a power of 2; use it whenever possible
Example: AX * 32 (multiplicand in AX), 2 alternatives (b is better in both speed and space)
a) mov CX,32
   mul CX
b) sal AX,5
Arithmetic operations over multiword data (multiword arithmetic)
• arithmetic instructions work with 8-, 16- and 32-bit data (larger data must be processed in several steps)
• basics of multiword arithmetic
Addition and subtraction (64-bit, unsigned)
• relatively simple
• addition – add the lower (right) 32 bits first, then the upper (left) 32 bits together with the carry from the first step
Example: addition of two 64-bit numbers in EBX:EAX and EDX:ECX, result in EBX:EAX. Overflow is indicated by CF.
add64:
        add EAX,ECX ; subtraction – similarly (add→sub, adc→sbb)
        adc EBX,EDX
        ret
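The add/adc pair can be mirrored in C using only 32-bit halves plus an explicit carry. This is an illustrative sketch; the struct u64_pair and the C function add64 (named after the assembly label) are assumptions made for this example.

```c
#include <stdint.h>

typedef struct {
    uint32_t hi, lo;   /* result, as in EBX:EAX */
    int cf;            /* carry out of the upper half */
} u64_pair;

/* 64-bit unsigned addition built from two 32-bit steps. */
u64_pair add64(uint32_t a_hi, uint32_t a_lo, uint32_t b_hi, uint32_t b_lo)
{
    u64_pair r;
    uint32_t carry;
    uint64_t hi;

    r.lo = a_lo + b_lo;                    /* add EAX,ECX (wraps modulo 2^32) */
    carry = (r.lo < a_lo);                 /* CF produced by the low halves */
    hi = (uint64_t)a_hi + b_hi + carry;    /* adc EBX,EDX */
    r.hi = (uint32_t)hi;
    r.cf = (hi >> 32) != 0;                /* overflow of the 64-bit sum */
    return r;
}
```

Adding 1 to 00000000:FFFFFFFFH gives 00000001:00000000H with CF = 0; adding 1 to FFFFFFFF:FFFFFFFFH gives zero in both halves with CF = 1.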
Multiplication and division
• detailed information on multiplication and division can be found in [1]
Study literature:
[1] Dandamudi, S. P.: Introduction to Assembly Language Programming, Springer Science+Business Media, Inc., 2005.
[2] Carter, P. A.: PC Assembly Language, 2006, http://www.drpaulcarter.com/pcasm/