Data Structures

Covers Chapter 5, pages 144 – 160 and Chapter 6, pages 198 – 203

Instruction du Jour

mov ax, [bx + TPoint.x]

Agenda

Simple arrays
Addressing modes
Duplicating data
Uninitialized data
String variables
Local labels
STRUCtures

Simple Arrays

In assembly language, simple arrays can be created in the data segment like this:

simpleArray1 DB ' '
simpleArray2 DD 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

There are better (and shorter) ways to declare arrays that will be discussed later

Addressing Modes

Informal definition: address, v. To specify the location of something, usually data (can be code).

Nine addressing modes:
  Immediate
  Register
  Direct
  Register-indirect
  Base
  Indexed
  Base-indexed
  String
  I/O port

Immediate Addressing

Example:

mov ax, 2

The “2” is called an immediate value.
Immediate is not usually regarded as a type of addressing, but the immediate value is in memory, in the code segment.

Register Addressing

Example:

mov ax, bx

Like immediate, register is usually not regarded as a type of addressing, but the register has a physical location in the CPU.

Direct Addressing

Syntax: [label]

Example:

mov ax, [numBodies]

The label numBodies is assembled into a number: an offset into the data segment.
Assuming numBodies is at offset 244, the instruction assembles into something like:

mov ax, [244]

Register-indirect Addressing

Syntax: [base-or-index-register]

Example:

mov ax, [bx]

The register bx is used as an offset into the data segment.
Usually used for addressing arrays.

Register-indirect Addressing (cont.)

Register used for addressing can be bx, si, or di.
The register should contain some meaningful value first.
8086 string instructions use this type of addressing, although the implementation is hidden.
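As a minimal sketch of register-indirect addressing, the program below walks a byte array through bx. It assumes TASM in Ideal mode and a DOS target; the names values, total, and NextByte are made up for illustration.

```asm
IDEAL
MODEL small
STACK 256

DATASEG
values  DB 3, 5, 7, 9           ; hypothetical four-byte array
total   DB 0

CODESEG
Start:
        mov ax, @data           ; point ds at the data segment
        mov ds, ax

        mov bx, OFFSET values   ; give bx a meaningful value first
        mov cx, 4               ; cx = number of elements
        xor al, al              ; al = running sum
NextByte:
        add al, [bx]            ; register-indirect: the byte at ds:bx
        inc bx                  ; advance to the next element
        loop NextByte
        mov [total], al         ; total = 3 + 5 + 7 + 9 = 24

        mov ax, 04C00h          ; DOS terminate, exit code 0
        int 21h
END Start
```

Because the address lives in a register, the same instruction reaches every element; only bx changes between iterations.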

Base Addressing

Syntax: [base-register + displacement]

Example:

mov ax, [bx + partNumArray]

The contents of bx and the offset of the label are added together to create an offset into the data segment

Base Addressing (cont.)

Usually used for addressing data structures or more than one array of the same size.
Register used for addressing can be bx or bp (which uses the stack segment by default).
Again, the register should contain some meaningful value first.

Indexed Addressing

Syntax: [index-register + displacement]

Example:

mov ax, [si + partNumArray]

The contents of si and the offset of the label are added together to create an offset into the data segment

Indexed Addressing (cont.)

Usually used for addressing data structures or more than one array of the same size.
Register used for addressing can be si or di.
And again, the register should contain some meaningful value.
Strikingly similar (actually almost identical) to base addressing.
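The "more than one array of the same size" case can be sketched as follows: one index register selects the same element in two parallel arrays. This assumes TASM Ideal mode under DOS; partQtyArray and the data values are hypothetical, added only for illustration.

```asm
IDEAL
MODEL small
STACK 256

DATASEG
partNumArray DW 101, 102, 103   ; part numbers
partQtyArray DW 10, 20, 30      ; hypothetical parallel array of quantities

CODESEG
Start:
        mov ax, @data           ; point ds at the data segment
        mov ds, ax

        mov si, 2               ; byte offset of element 1 (each word is 2 bytes)
        mov ax, [si + partNumArray]   ; ax = 102
        mov dx, [si + partQtyArray]   ; dx = 20, same index, different array

        mov ax, 04C00h          ; DOS terminate, exit code 0
        int 21h
END Start
```

Replacing si with bx in both instructions turns this into base addressing with the same result, which is why the two modes look nearly identical.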

Base-indexed Addressing

Syntax: [base-register + index-register + displacement]

Example:

mov ax, [bx + si + fifthField]

The contents of si and bx, and the offset of the label fifthField, are added together to create an offset into the data segment.

Base-indexed Addressing (cont.)

Usually used for addressing very complex data structures (e.g. bx = array address, si = array index, displacement = field in record).
Base register can be bx or bp (which uses the stack segment by default); index register can be si or di.
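A rough sketch of that division of labor, assuming TASM Ideal mode under DOS. The record layout (a word id followed by a word quantity) and the names records, RECORD_SIZE, and QTY_FIELD are assumptions made for the example.

```asm
IDEAL
MODEL small
STACK 256

RECORD_SIZE = 4                 ; hypothetical record: id DW, qty DW
QTY_FIELD   = 2                 ; displacement of qty inside a record

DATASEG
records DW 101, 10              ; record 0
        DW 102, 20              ; record 1
        DW 103, 30              ; record 2

CODESEG
Start:
        mov ax, @data           ; point ds at the data segment
        mov ds, ax

        mov bx, OFFSET records  ; bx = array address
        mov si, RECORD_SIZE     ; si = byte offset of record 1
        mov ax, [bx + si + QTY_FIELD]   ; ax = 20, the qty field of record 1

        mov ax, 04C00h          ; DOS terminate, exit code 0
        int 21h
END Start
```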

String Addressing

Example:

movsb

movsb is shorthand for movs [byte es:di], [byte ds:si]
String instructions use register-indirect addressing, but always use di in conjunction with es.

Nifty Rules

The types of addressing available for an operand depend on the opcode.
Every opcode that allows memory addressing (meaning not immediate or register) allows direct, register-indirect, base, indexed, and base-indexed addressing.
Size override directives are necessary when the assembler is unable to decide how much data to address.
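A short sketch of when a size override is needed, assuming TASM Ideal mode under DOS; the variable names are hypothetical. Neither a bare [bx] nor an immediate value carries a size, so the assembler has to be told.

```asm
IDEAL
MODEL small
STACK 256

DATASEG
oneByte DB 0                    ; hypothetical byte variable
oneWord DW 0                    ; hypothetical word variable

CODESEG
Start:
        mov ax, @data           ; point ds at the data segment
        mov ds, ax

        mov bx, OFFSET oneByte
        mov [byte bx], 0FFh     ; override needed: store exactly one byte
        mov bx, OFFSET oneWord
        mov [word bx], 0FFh     ; override needed: store two bytes (00FFh)
        mov ax, [bx]            ; no override needed: ax fixes the size at a word

        mov ax, 04C00h          ; DOS terminate, exit code 0
        int 21h
END Start
```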

Duplicating Data

What if you wanted to create a byte string of 10 NULL characters?

Methods:

lotsNulls DB 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
lotsNulls DB 10 DUP (0)

The DUP directive duplicates almost any type (or size) of data a specified number of times

DUP, DUP, DUP

Syntax: [label] directive count DUP (expression [, expression]…)

Examples:

twentyZeros DB 20 DUP (0)
thirty32s DW 20 DUP (32)
          DW 5 DUP (32, 32)
bigList DB 14 DUP ("empty, ")
        DB "empty", 0
tonsOfSpace DD 8000 DUP (?)

Uninitialized Data

So far, we’ve worked only with initialized data.
  Pro: initialized data doesn’t have to be initialized in the code
  Con: initialized data takes up space in the executable

The ? expression stands for uninitialized data

? used with DUP can create very large blocks of uninitialized data that would otherwise take up space in the executable.

Examples:

scratchSpace DQ ?
string255 DB 255 DUP (?)

Single rule: all uninitialized data should be placed at the end of the data segment, or be prefaced with the UDATASEG directive
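One way to follow this rule is sketched below, assuming TASM Ideal mode with the simplified segment directives; greeting and scratch are hypothetical names. Since an uninitialized block holds no defined value, the code writes to it before reading it.

```asm
IDEAL
MODEL small
STACK 256

DATASEG
greeting DB 'Ready', 0          ; initialized data

UDATASEG
scratch  DB 4096 DUP (?)        ; uninitialized data, kept apart from the above

CODESEG
Start:
        mov ax, @data           ; point ds at the data segment
        mov ds, ax

        mov [scratch], 0        ; initialize in code before any read
        mov al, [scratch]       ; al = 0

        mov ax, 04C00h          ; DOS terminate, exit code 0
        int 21h
END Start
```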

String Variables

Only the DB directive can set aside space for strings.
Use single or double quotes.
Rules for using single and double quotes:
  ' can exist when enclosed by "
  " can exist when enclosed by '
  '' means one ' when within two '
  "" means one " when within two "

String Variables (cont.)

Examples:

doubleQuote DB "This text isn't in single quotes."
singleQuote DB 'This text is in "single" quotes.'
insaneSingle DB 'This text isn''t in ''double'' quotes.'
insaneDouble DB "This text is in ""double"" quotes."

Local Labels

Labels so far have been global.
Big, gigantic, enormous drawback: it’s hard to come up with new label names for every place in code that needs to be jumped to.
Possible solution: preface every label with the name of its procedure.
Problem: you get big, long, ugly labels that are hard to read.

Local Labels (cont.)

Actual solution: use @@ before the label name.
Local labels are not visible before or beyond global labels.

A good practice is to make every label in a PROC local, and every label between Start and End local
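A minimal sketch of why local labels help, assuming TASM Ideal mode under DOS. The procedures StrLength and ZeroFill and the data names are hypothetical. Both procedures use the label @@Next without clashing, because each PROC name is a global label that bounds the local ones.

```asm
IDEAL
MODEL small
STACK 256

DATASEG
message DB 'abc', 0             ; hypothetical zero-terminated string
buffer  DB 8 DUP (1)            ; hypothetical buffer to clear

CODESEG
Start:
        mov ax, @data           ; point ds at the data segment
        mov ds, ax
        mov bx, OFFSET message
        call StrLength          ; cx = 3
        mov bx, OFFSET buffer
        mov cx, 8
        call ZeroFill
        mov ax, 04C00h          ; DOS terminate, exit code 0
        int 21h

; In: bx = offset of a zero-terminated string. Out: cx = its length.
PROC StrLength
        xor cx, cx
@@Next:
        cmp [byte bx], 0        ; reached the terminator?
        je @@Done
        inc bx
        inc cx
        jmp @@Next
@@Done:
        ret
ENDP StrLength

; In: bx = offset of a buffer, cx = byte count (must not be 0).
PROC ZeroFill
@@Next:                         ; same name as above, different scope
        mov [byte bx], 0
        inc bx
        loop @@Next
        ret
ENDP ZeroFill

END Start
```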

STRUCtures

Structure: a named variable that contains other named variables called fields.

Example:

STRUC Date
  day DB 1
  month DB ?
  year DW 1991
ENDS Date

So What?

Using structures
  groups information that belongs together
  enhances readability
  makes for a slick way to create arrays of records

STRUC Syntax

Begins with STRUC label, ends with ENDS label.
Fields are declared between, as if they were setting aside space for data.
The ? means the data has no default value.
A STRUC directive really creates a template, or blueprint, or new data type.

Declaring Structures

Syntax: [label] struc-name <initializers>

Examples:

birthDay Date <> ; 1-0-1991
today Date <5,10> ; 5-10-1991
dayInDayOut Date <11,12,1912> ; 11-12-1912
anotherDay Date <,8,> ; 1-8-1991

Addressing Fields

To address fields in a structure, use the . (period) operator, like member access in Java or C++.

Example: STRUCAD0

STRUC TGarbage
  trash DB ?
  refuse DB ?
ENDS TGarbage

DATASEG
can TGarbage <24,32>

CODESEG
mov al, [can.trash]

Base Addressing

Idea: set bx to the beginning of the structure and add displacements to address fields.
Implementation: first, set bx to the offset of the structure, then access the members like this:

mov al, [bx + TGarbage.trash]

Using the STRUC name (TGarbage) instead of the structure itself gives the displacement of the field in any structure.

Example: STRUCAD1

Array of Structures

The structure array is declared using the DUP directive.
Idea: set bx to the beginning of the structure array, add displacements to address fields, and add the size of the structure to bx to address the next structure in the array.
The SIZE structure-name operator evaluates to the size of the structure in bytes.
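The idea can be sketched as a loop that sums one field across an array of structures. This assumes TASM Ideal mode under DOS. The TPoint layout (two word fields) is a guess based on the Instruction du Jour, and points, NUM_POINTS, and NextPoint are hypothetical names.

```asm
IDEAL
MODEL small
STACK 256

STRUC TPoint                    ; assumed layout: two word fields
  x DW ?
  y DW ?
ENDS TPoint

NUM_POINTS = 4

DATASEG
points TPoint NUM_POINTS DUP (<3, 4>)   ; every element starts as x = 3, y = 4

CODESEG
Start:
        mov ax, @data           ; point ds at the data segment
        mov ds, ax

        mov bx, OFFSET points   ; bx = beginning of the structure array
        mov cx, NUM_POINTS
        xor dx, dx              ; dx = running sum of the x fields
NextPoint:
        add dx, [bx + TPoint.x] ; field displacement added to the base
        add bx, SIZE TPoint     ; step to the next structure
        loop NextPoint          ; after the loop, dx = 4 * 3 = 12

        mov ax, 04C00h          ; DOS terminate, exit code 0
        int 21h
END Start
```

Stepping by SIZE TPoint rather than a hard-coded 4 keeps the loop correct if fields are later added to the structure.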

Array of Structures (cont.)

Example: STRUCAD2
Example: STRUCAD3

Question of the Day: assume each structure in the array contains an array of words, and you want to perform some operation on each word. Which memory addressing mode would you use?

Instruction du Jour (revisited)

mov ax, [bx + TPoint.x]
