csc 1410 tutorial notes 3 - weeblyartilife.weebly.com/.../8086_assembly_language_tutorial.doc ·...

8086 Assembly Language Tutorial

Provided by www.Freshersworld.com

Tutorial Index:

Section 1. Introduction to Assembler and Assembly Language
    Section 1.1. Generation of Language
    Section 1.2. Assembly Language and Assembler
    Section 1.3. Introduction to 8086 CPU internal architecture
    Section 1.4. Memory segmentation
Section 2. First Step in Assembly program
Section 3. Basic structure of assembly program
    Section 3.1. Basic structure of Assembly program
    Section 3.2. Basic structure of Data Segment
    Section 3.3. Basic structure of Code segment
    Section 3.4. Simple Commands in the Code Segment
        Section 3.4.1. Data Movement
        Section 3.4.2. Arithmetic Operations
            Section 3.4.2.1. ADD and SUB function
            Section 3.4.2.2. MUL and DIV function
            Section 3.4.2.3. Other arithmetic operations
        Section 3.4.3. Simple Control flow
            Section 3.4.3.1. Unconditional Jump
            Section 3.4.3.2. CALL procedure
            Section 3.4.3.3. Conditional Jump
            Section 3.4.3.4. Looping
        Section 3.4.4. Logic Control
            Section 3.4.4.1. Simple Logic Function
            Section 3.4.4.2. Relational Operators
            Section 3.4.4.3. Rotation & shifting
        Section 3.4.5. Simple interrupts (I/O)
    Section 3.5. Basic structure of the Stack Segment
Section 4. Macro Processing
    Section 4.1. Macro Definition
    Section 4.2. Local Directives
    Section 4.3. Nested Macro
    Section 4.4. Special Directives
        Section 4.4.1. Repetition Directives
        Section 4.4.2. Conditional Directives
    Section 4.5. INCLUDE Directive
Section 5. Linking to Subprograms
    Section 5.1. Intersegment Calls
    Section 5.2. Returning Parameters
    Section 5.3. Local Variable Storage
    Section 5.4. Recursive Calling
    Section 5.5. Linking C and Assembly Language Programs
Section 6. Keyboard & Screen handling (I/O)
    Section 6.1. Introduction to I/O handling
    Section 6.2. I/O with DOS interrupt
        Section 6.2.1. String input
        Section 6.2.2. Display the special characters
    Section 6.3. Video display with BIOS interrupt
        Section 6.3.1. Screen Clearing and Coloring
        Section 6.3.2. Setting and Moving the Cursor
Section 7. Interrupt Service Routine (ISR)
    Section 7.1. Introduction to 8086 Interrupt Service Routine
    Section 7.2. Writing the Interrupt Service Routine
    Section 7.3. Chaining and Reentrance Problem
Section 8. Examples
Section 9. Appendix 1: How to write and assemble my assembly program?
Section 10. Appendix 2: Commands and syntax of Assembly language



Section 1. Introduction to Assembler and Assembly Language

Section 1.1. Generation of Language

A computer cannot do anything until you "tell" it what to do and how to do it. The process of giving the computer a sequence of instructions is called programming.

Programming languages can be classified into various levels. They are:

1. Low level language, e.g. Machine Language & Assembly Language. Low level languages are closely related to the internal architecture of the computer system. For each single action of the computer, a corresponding program line must be written.

2. High level language, e.g. C, C++, COBOL, etc.

Machine language, or first-generation language, is a set of instructions that can be directly executed by a computer system. Each instruction is composed of an operation code (OPCODE), which defines the function that the computer must perform, and an operand. Instructions are written at the most basic level of computer operation, as a series of 1s and 0s. However, machine language is difficult to understand and time-consuming for programmers to write.

Section 1.2. Assembly Language and Assembler

To make the OPCODE easier to read and memorize, mnemonics are used in place of the binary codes. Furthermore, decimal numbers or symbolic notations are used for the operands. Assembly language, or second-generation language, is a low-level language that uses mnemonics to represent machine code instructions. Figure 1 shows a simple assembly program for the 8086/8088 machine [1] (which differs from the 68000 machine) with 16-bit data manipulation. This tutorial covers assembly programming for the 8086/8088 machine only.

    CODE_SEG SEGMENT
    MAIN     PROC FAR
             ASSUME CS:CODE_SEG, DS:DATA_SEG
             MOV AX, DATA_SEG
             MOV DS, AX
             MOV CX,0100h
    START:   MOV AH,2h
             MOV DL,2Ah
             MOV ABC,0Ah
             INT 21h
             MOV AH,4Ch
             INT 21h
    MAIN     ENDP
    CODE_SEG ENDS

    DATA_SEG SEGMENT
    ABC      DB 0Bh
    DATA_SEG ENDS

             END MAIN

Figure 1 Simple Assembly Program

An assembler is a language translator that converts assembly language into machine code. It accepts as input a program whose instructions are essentially in one-to-one correspondence with those of machine language, with symbolic names used for operation codes and operands. It produces as output a machine-language program in main storage for execution. The assembler does not need to work out the meaning of the program or what it accomplishes; its only purpose is translation.

[1] 8088 refers to the model of the processor, in the same way that 80386, 80486 and 80586 are processor models. It was designed in the late 1970s.

Section 1.3. Introduction to 8086 CPU internal architecture

In a CPU, and in the 8086 CPU in particular, the components visible to the programmer are the registers. The registers of the 8086 CPU can be grouped into three categories: general purpose registers, segment registers and other registers.

1. General Purpose Registers

Although these registers are called “General Purpose Registers”, each one has its own special purpose. There are eight 16-bit general purpose registers in total on the 8086 CPU. The main four general purpose registers are:

- AX (accumulator) stores the results of arithmetic and logical computations.
- BX (base register) stores the address of a variable, i.e. it works as a pointer.
- CX (counter) acts as a counter in loops.
- DX (data) holds the overflow from certain arithmetic operations.

The other four registers are:

- SI & DI (Source Index & Destination Index) act as pointers into a string.
- BP (base pointer) is used to access parameters and local variables in a procedure.
- SP (stack pointer) points to the top of the stack.

Besides the above eight 16-bit registers, the 8086 CPU also has eight 8-bit registers, obtained by dividing each of the AX, BX, CX and DX registers into two halves, as shown in Figure 2. They are called AH, AL, BH, BL, CH, CL, DH and DL. AH is the higher-order byte of the AX register, and AL is the lower-order byte of AX. The same applies to the BX, CX and DX registers.
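The relationship between a 16-bit register and its two 8-bit halves can be sketched in a few lines of Python. This is only a model of the arithmetic, not 8086 code, and the function names split_word and join_bytes are made up for this illustration.

```python
def split_word(value: int) -> tuple:
    """Split a 16-bit value (e.g. AX) into its high and low bytes (AH, AL)."""
    high = (value >> 8) & 0xFF   # higher-order byte, e.g. AH
    low = value & 0xFF           # lower-order byte, e.g. AL
    return high, low


def join_bytes(high: int, low: int) -> int:
    """Rebuild the 16-bit value from its high and low bytes."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


ah, al = split_word(0x1234)
print(f"AH = {ah:02X}h, AL = {al:02X}h")
```

Writing to AH or AL therefore changes only one half of AX, and the other half keeps its value.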

2. Segment Registers

There are four special segment registers:

- CS (Code Segment register) stores the starting address of the code segment, which holds the program instructions.
- DS (Data Segment register) stores the starting address of the data segment.
- SS (Stack Segment register) stores the starting address of the stack segment.
- ES (Extra Segment register) locates an additional data segment if needed.

3. Other Registers

There are two main types of other registers. One is the instruction pointer (IP) and the other is the status register. The instruction pointer is a 16-bit register that points into the current code segment; it holds the offset of the next instruction to be executed.

The status register holds 16 bits of information, which include the overflow flag (O), sign flag (N), zero flag (Z), carry flag (C) and parity flag (P).

- O is set to 1 if a signed overflow is generated. Cleared otherwise.
- N is set to 1 if the result is negative. Cleared otherwise.
- Z is set to 1 if the result is zero. Cleared otherwise.
- C is set to 1 if a carry (unsigned overflow) is generated. Cleared otherwise.
- P is set to 1 if the low-order byte of the result contains an even number of 1 bits. Cleared otherwise.

    Bit     11    7    6    2    0
    Flag     O    N    Z    P    C

Figure 3 Internal construction of Status register
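To see how the flags react to a result, the following Python sketch models an 8-bit addition and derives the five flags described above. It is a simplified model for illustration only; the function name add8_flags and the dictionary layout are assumptions of this sketch, and the flag letters follow the naming used in this tutorial.

```python
def add8_flags(a: int, b: int) -> dict:
    """Add two 8-bit values and return the result with the O, N, Z, C, P flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF  # only 8 bits fit in the destination
    return {
        "result": result,
        # carry: the unsigned sum did not fit in 8 bits
        "C": int(total > 0xFF),
        # zero: the 8-bit result is zero
        "Z": int(result == 0),
        # sign: bit 7 of the result is set
        "N": int((result & 0x80) != 0),
        # overflow: both operands have the same sign but the result's sign differs
        "O": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        # parity: even number of 1 bits in the result
        "P": int(bin(result).count("1") % 2 == 0),
    }


print(add8_flags(0x7F, 0x01))
print(add8_flags(0xFF, 0x01))
```

The first call shows a signed overflow (7Fh + 1 becomes a negative number), while the second shows an unsigned carry with a zero result.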

    AX:  AH | AL
    BX:  BH | BL
    CX:  CH | CL
    DX:  DH | DL

Figure 2 Register Segmentation


Section 1.4. Memory segmentation

Writing an assembly program involves keeping track of data and addresses in memory. Here, we introduce memory segmentation.

The first 1 MB of memory, from address 0 to FFFFFh [2], is the entire address space of the 8086/8088 microprocessors. All addressable locations lie within this range.

But what is memory segmentation?

Memory can be viewed as a linear array of bytes in which a single index, called the address, selects a particular byte. Memory segmentation divides this large memory space into segments. A segmented address uses two components to specify a memory location: a segment value and an offset value within that segment.

A full segmented address consists of a segment address and an offset address, written as segment:offset. Both are 1-word (16-bit) values. This representation is called the logical address. To calculate the actual address placed on the address bus, which is called the physical address, the CPU multiplies the segment address by 10h and adds the offset address, i.e.,

    Physical address = segment address × 10h + offset
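The calculation can be checked with a small Python sketch. The function name physical_address is made up for this illustration, and the final mask, which models the 20 address lines of the 8086, is an addition of this sketch rather than something stated in the text above.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a segment:offset pair."""
    # Multiplying by 10h is the same as shifting left by 4 bits.
    # The mask keeps the result within the 20-bit range 00000h..FFFFFh.
    return ((segment << 4) + offset) & 0xFFFFF


# Logical address 1750:0001
print(f"{physical_address(0x1750, 0x0001):05X}h")
```

Note that different segment:offset pairs can name the same physical byte, e.g. 1750:0001 and 1000:7501.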

Inside the 8088 machine, a 16-bit number is stored in memory so that the most significant byte is at a higher address than the least significant byte, i.e. the bytes are stored in reverse order. For example, the number ABCDh is stored as the byte CDh followed by the byte ABh, as in Figure 4.
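This byte ordering (commonly called little-endian) can be reproduced in Python, which is used here only to model the memory layout. The helper name store_word is made up for this sketch.

```python
def store_word(value: int) -> bytes:
    """Return the two bytes of a 16-bit value in the order they sit in memory."""
    # byteorder="little" puts the low-order byte at the lower address
    return value.to_bytes(2, byteorder="little")


memory = store_word(0xABCD)
print(memory.hex())  # cdab
```

Reading the two bytes back with int.from_bytes(memory, "little") restores the original value ABCDh.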

Section 2. First Step in Assembly program

Here we copy the simple assembly program from section 1.

    CODE_SEG SEGMENT
    MAIN     PROC FAR
             ASSUME CS:CODE_SEG, DS:DATA_SEG
             MOV AX, DATA_SEG
             MOV DS, AX
             MOV CX,0100h
    START:   MOV AH,2h
             MOV DL,2Ah
             MOV ABC,0Ah
             INT 21h
             MOV AH,4Ch
             INT 21h
    MAIN     ENDP
    CODE_SEG ENDS

    DATA_SEG SEGMENT
    ABC      DB 0Bh
    DATA_SEG ENDS

             END MAIN

Figure 5 Simple Assembly Program

To begin with, the assembly program can be divided into 3 segments:

1. Code segment (CS), which defines the main program or instructions.
2. Data segment (DS), which defines the data you use.
3. Stack segment (SS), which defines the stack.

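[2] The suffix h means that the number is in hexadecimal form.

[Figure 4: memory drawn as two columns of byte addresses, 00000, 00002, 00004, ..., FFFFC, FFFFE and 00001, 00003, 00005, ..., FFFFD, FFFFF, with the bytes CD and AB of the word ABCDh stored in adjacent locations]

Figure 4 Memory Segmentation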


The above segments are not necessarily located one after another in memory. They can be far away from each other, as shown in Figure 6. Furthermore, an assembly program is free to define fewer than 3 segments. For example, a program with only 1 code segment and 1 data segment is accepted, although with a warning message.

Section 3. Basic structure of assembly program

Section 3.1. Basic structure of Assembly program

Processor Directive

By default, the assembler accepts the instructions available on all members of the 80x86 family, including 8086, 8088, 80186, 80286, 80386, 80486, 80586, etc. A processor directive prevents the accidental use of instructions that are not available on the 8086 processor by generating an error message for non-8086 instructions. The processor directive is

    p8086

This processor directive enables all the instructions available on an 8086 processor, and no others.

Segment Definition

All programs consist of one or more segments. While your program is running, the segment registers point at the currently active segments. Segments are defined in the assembly language source file with the following instructions:

    <Segment Name> segment (<align>) (<combine>) (<class>)
        <code>
    <Segment Name> ends

where
    <Segment Name> is the segment name you choose,
    <align> indicates the boundary on which the segment is to begin,
    <combine> indicates whether to combine with other segments in linking,
    <class> is used as an identifier for the linker to combine various segments.

    <align> ::= para | byte | page | word

where
    para sets the segment starting address on the next address evenly divisible by 16 (10h),
    byte sets the segment starting address on the next available byte,
    page sets the segment starting address on the next page boundary (100h),
    word sets the segment starting address on the next available word.

    <combine> ::= STACK | PUBLIC | NONE

where
    STACK defines a stack segment only,
    PUBLIC groups the segments of the same class both physically and logically,
    NONE keeps each segment logically separate from the others.

    <class> ::= ‘<name>’

It is used as an identifier for the linker to combine various code segments.

    Segment register    Address
    CS                  0100:0001
    DS                  1750:0001
    SS                  3230:0001

Figure 6 Location of Program segment in the memory

Segment Registers


When MS-DOS begins execution of your program, it initializes two segment registers. It points CS at the code segment containing your main program and it points SS at your stack segment. From that point forward, you are responsible for maintaining the segment registers yourself.

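To access the data correctly in your program, you should copy the address of the actual data segment to the data segment register. An immediate value cannot be moved directly into a segment register, so the address goes through AX, as shown below:

    MOV AX, DATA_SEG
    MOV DS, AX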

Let’s explain each term in the syntax of the segment.

Segment Names

The segment directive requires a label in the label field. This label is the segment’s name. The assembler will use the segment name to obtain the address of a segment. You must also specify the segment’s name in the label field of the ENDS directive that ends the segment.

Segments normally load into memory in the order that they appear in your source file. If you write the code segment after the data segment, the data segment would load into memory before the code segment.

Align Type

The align parameter is one of the following words: byte, word, para, or page. These keywords instruct the assembler, linker, and DOS to load the segment on a byte, word, paragraph or page boundary. The align parameter is optional. If none of the above keywords appears as a parameter to the segment directive, the default alignment is paragraph (16 bytes).

Aligning a segment on a byte boundary loads the segment into memory starting at the first available byte after the last segment. Aligning on a word boundary will start the segment at the first byte with an even address after the last segment.
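The effect of each align type on the starting address can be modelled with a short Python sketch. The boundary sizes come from the definitions above (byte = 1, word = 2, para = 10h, page = 100h); the function name next_aligned and the sample address are made up for this illustration.

```python
# Boundary size in bytes for each align type
ALIGNMENTS = {"byte": 1, "word": 2, "para": 16, "page": 256}


def next_aligned(address: int, align: str = "para") -> int:
    """Return the first address at or after `address` on the given boundary."""
    size = ALIGNMENTS[align]
    # Round up to the next multiple of `size`
    return (address + size - 1) // size * size


# Suppose the previous segment ended so that 1235h is the first free byte
for align in ("byte", "word", "para", "page"):
    print(f"{align:5s} -> {next_aligned(0x1235, align):05X}h")
```

An address that is already on the requested boundary is left unchanged.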

Combine Type

The combine type controls how segments with the same name are combined. To specify the combine type you use one of the keywords: PUBLIC, STACK or NONE. The PUBLIC and STACK combine types essentially perform the same operation. They combine all the segments with the same name and join them into a single contiguous segment. The main difference between PUBLIC and STACK is that STACK is used for the stack segment while PUBLIC is used for the code segment and data segment.

Class Type

The final operand to the segment directive is usually the class type. The class type specifies the ordering of segments that do not have the same segment name. This operand consists of a symbol enclosed in apostrophes ‘ ’. Generally, you should use the following names: CODE (for the code segment), DATA (for the data segment) and STACK (for the stack segment).

The following are three typical segments:

    CODE_SEG  SEGMENT PARA PUBLIC 'CODE'
              :
              :
    CODE_SEG  ENDS

    DATA_SEG  SEGMENT PARA PUBLIC 'DATA'
              :
              :
    DATA_SEG  ENDS

    STACK_SEG SEGMENT PARA STACK 'STACK'
              DW 1024 dup (?)
    STACK_SEG ENDS

              END

Figure 7 Typical Segment Definitions

Assume Directive

How does the assembler know which segment is the data segment, and which one is the stack segment? The segment directives do not say what type of segment each one is. When you use a segment in your program, you must not only tell the CPU that it is a data segment (by loading the segment register), but also tell the assembler where and when it is a data segment. The assume directive provides this information to the assembler.

The assume directive takes the following form:

    ASSUME CS:<code segment name>, DS:<data segment name>, SS:<stack segment name>

The assume directive tells the assembler that you have loaded the specified segment register(s) with the addresses of the named segments. Note that this directive does not modify any of the segment registers; it simply tells the assembler to assume that the segment registers are pointing at certain segments in the program.

END directive

The END directive terminates an assembly language source file. In addition to telling the assembler that it has reached the end of the source file, the END directive operand <entry point> tells MS-DOS where to transfer control when the program begins execution, as shown by the following syntax:

    END <entry point>

If you assemble modules separately and link several object files together, only one module can contain the main program. Likewise, only one module should specify the starting location of the program. The others leave the operand blank.

Section 3.2. Basic structure of Data Segment

The data segment is the segment that defines the data you will use. The syntax is:

    <variable name> DB | DW <value>

The variable name is the name of your variable. DB or DW declares the size of the variable, and the value is its initial value.

The variable name should begin at column one of the line in the data segment. If you need more than one variable in your program, just place additional lines in the data segment declaring the variables. The assembler will allow you to refer to each variable by name rather than by location.

The first variable you place in the data segment is allocated storage at location DS:0. The next variable gets the storage just beyond the previous variable. DB declares a byte-sized variable (2 hex digits), and DW declares a word-sized variable (4 hex digits).

Here are some examples of data initialization.

    V1  DB 25h                 ; V1 = 25 hexadecimal, byte size
    V21 DB 20d                 ; V21 = 20 decimal
    V22 DW 0010110101001010b   ; V22 is a binary number, word size
    V3  DB 10 dup(?)           ; 10 consecutive uninitialized bytes
    V4  DB "HELLO$"            ; Character string [3]

Figure 8 Variable declaration

In the above example, the variable V1 is located at DS:0, V21 at DS:1, V22 at DS:2, V3 at DS:4, etc. The question mark ‘?’ tells the assembler that the variable should be left uninitialized when it is loaded into memory, i.e. space is just reserved for it. You may give the variable an initial value in the declaration (by replacing the “?”) or assign one during the execution of the program.
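The offsets quoted above follow directly from the sizes of the declarations. The Python sketch below recomputes them; it is a model of the bookkeeping only, and the names layout and figure8, as well as the (name, directive, count) tuple format, are assumptions made for this illustration.

```python
# Size in bytes of one element for each directive
SIZES = {"DB": 1, "DW": 2}


def layout(declarations):
    """Return the offset within the data segment of each declared variable."""
    offsets = {}
    offset = 0  # the first variable is placed at DS:0
    for name, directive, count in declarations:
        offsets[name] = offset
        offset += SIZES[directive] * count  # next variable starts just beyond
    return offsets


# The declarations of Figure 8 as (name, directive, number of elements)
figure8 = [
    ("V1", "DB", 1),
    ("V21", "DB", 1),
    ("V22", "DW", 1),
    ("V3", "DB", 10),
    ("V4", "DB", 6),   # "HELLO$" occupies 6 bytes
]
print(layout(figure8))
```

By the same rule, V4 follows the 10 bytes of V3 and therefore starts at offset 14 (0Eh).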

Number Base

To differentiate between numbers in the various bases, you can use a suffix character. If you terminate a number with a ‘b’ or ‘B’, the assembler assumes that it is a binary number; if it contains any digits other than 0 or 1, the assembler generates an error. A suffix ‘D’ or ‘d’ marks the number as decimal, while a suffix ‘H’ or ‘h’ selects the hexadecimal radix.

All integer constants must begin with a decimal digit, including hexadecimal constants. To represent the number ‘ABCD’, you must write it as ‘0ABCDh’. The assembler requires the leading decimal digit so that it can differentiate between symbols and numeric constants.

If you do not specify the suffix after the number, the assembler will use the current default radix, i.e. decimal radix. Therefore, you can specify the values without using ‘D’ character.
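The suffix rules can be summarized as a small Python sketch that reads a constant the way the paragraphs above describe. It is an illustration of the rules, not the assembler's actual parser, and the function name parse_constant is made up for this sketch.

```python
# Radix selected by each suffix character (case-insensitive)
RADIX_BY_SUFFIX = {"b": 2, "d": 10, "h": 16}


def parse_constant(text: str) -> int:
    """Convert an assembler-style integer constant to its value."""
    token = text.strip().lower()
    # Every constant must begin with a decimal digit, even a hexadecimal one
    if not token or not token[0].isdigit():
        raise ValueError(f"constant must begin with a decimal digit: {text!r}")
    if token[-1] in RADIX_BY_SUFFIX:
        radix = RADIX_BY_SUFFIX[token[-1]]
        digits = token[:-1]
    else:
        radix = 10          # no suffix: default decimal radix
        digits = token
    # int() raises ValueError for digits that are invalid in the radix
    return int(digits, radix)


for constant in ("25h", "20d", "0010110101001010b", "0ABCDh", "100"):
    print(constant, "=", parse_constant(constant))
```

A constant such as ABCDh is rejected because, without the leading 0, it looks like a symbol name.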

Array

Arrays are commonly used in high-level programming. Abstractly, an array is an aggregate data type whose members are all of the same type. A member is selected from the array by an integer index. For example, A[2] selects the third element of array A in the C language.

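[3] Every string must end with the “$” character; otherwise the program will produce unpredictable output when the string is displayed (the DOS string display function prints characters until it reaches the “$”).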



Three factors control the array. The base address of an array is the address of its first element, which always appears in the lowest memory location. The index is the position of the element to be retrieved. The element size is the size of each element in memory.

For a single-dimension array, the address of the specified element can be calculated as:

    Address = Base Address + (Index × Element Size)
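As a quick check of the formula, here is a minimal Python sketch. The function name element_address and the sample base address 0004h are made up for this illustration.

```python
def element_address(base: int, index: int, element_size: int) -> int:
    """Address = Base Address + (Index x Element Size)."""
    return base + index * element_size


# A word array (element size 2) whose base address is 0004h:
# the element with index 2 lies 4 bytes beyond the base.
print(f"{element_address(0x0004, 2, 2):04X}h")
```

Because the index starts at 0, the element with index 0 is at the base address itself.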

In assembly language, an array is defined as

    <arrayname> DB | DW <size> dup (<element>)

The array name is the name of the array variable. The <size> dup (<element>) operand tells the assembler to duplicate the element <size> times. For example, in Figure 8, 10 dup (?) gives the array V3 10 elements of byte size each. With a ? in the “( )”, 10 uninitialized values are defined. If a value, say 1, is placed in the ( ), a 10-byte array with every element set to 1 is defined.

To define an array whose elements all have different values, you can use the following syntax

    <arrayname> DB | DW value1, value2, value3, ……valuen

This form allocates n variables of DB or DW type. It initializes the first item to value1, the second item to value2, etc. For example,

    Integers DB 0, 1, 2, 3, 4

initializes a five-element array with the values 0 to 4 respectively. Consider the following declaration:

    Strange DW 256 dup (0, 1, 2, 3)

The array Strange has 1024 elements. The n dup (XXX) operand tells the assembler to duplicate XXX n times, not to create an array with n elements. If XXX contains k elements, the dup operator creates an array of size k × n. The values in the above example will be 0 1 2 3 0 1 2 3 ….
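The behaviour of dup is the same as list repetition in Python, which makes the element count easy to verify. The helper name dup below simply mirrors the operator for this illustration; it is not part of any library.

```python
def dup(count: int, *values):
    """Model of `count dup (values)`: repeat the whole value list count times."""
    return list(values) * count


strange = dup(256, 0, 1, 2, 3)
print(len(strange))   # 1024
print(strange[:8])    # [0, 1, 2, 3, 0, 1, 2, 3]
```

With a single value in the parentheses, as in 10 dup (1), the result has exactly 10 elements.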

To access an element of the array, the address formula is used.

For the base address, you can use the name of the array. The element size is the number of bytes in each array element: 1 for an array of bytes, 2 for an array of words.

The assembly code for accessing the array is

    [<arrayname>+<index>*<element size>]

The square brackets tell the assembler to treat the whole expression inside them as an address. For example, to refer to the element SUMMER[2] of a word-sized array in assembly, you can write:

    [SUMMER+2*2]

String

To declare a string, you can use the following syntax

    <stringname> DB '<string>$'

<stringname> is the name of the string, and the string itself is enclosed in ' ' or " ". If you want to place a ' inside a string, you must write a pair of ' next to each other, e.g.

    'I''m fine, thanks'

or use the other character as the string delimiter:

    "I'm fine, thanks"

A string is stored as an array holding the ASCII code of each character. Since each ASCII code occupies one byte, only DB is used to define a string.
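To see the byte array behind a string declaration, the Python sketch below lists the ASCII codes that a DB string would occupy. The function name db_string is made up for this illustration.

```python
def db_string(text: str) -> list:
    """Return the ASCII code of each character, one byte per character."""
    return list(text.encode("ascii"))


codes = db_string("HELLO$")
print(" ".join(f"{code:02X}h" for code in codes))
```

The last byte, 24h, is the “$” character that ends the string.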

Section 3.3. Basic structure of Code segment

In the last section, we discussed that the code segment contains the main program code. Since spacing is not important to the assembler, we divide each program line into 4 columns:

    Label    Mnemonic    Operand    Comment (can be inserted anywhere)

We copy the example from section 1 and label the parts as follows:

    Label     Mnemonic  Operand
    CODE_SEG  SEGMENT
    MAIN      PROC      FAR
              ASSUME    CS:CODE_SEG, DS:DATA_SEG
              MOV       AX, DATA_SEG
              MOV       DS, AX
              MOV       CX,0100h
    START:    MOV       AH,2h
              MOV       DL,2Ah
              MOV       ABC,0Ah
              INT       21h
              MOV       AH,4Ch
              INT       21h
    MAIN      ENDP
    CODE_SEG  ENDS


Figure 9 Simple Assembly Program (Only Code segment part)

Label

The label field is an optional field containing a symbolic label for the current statement. Labels are used in assembly language to mark lines that other instructions can jump to (like the target of a GOTO statement). In general, you should begin your labels in column one with the syntax:

    <label name> :

A symbol is associated with some particular value. This value can be an offset within a segment, a constant, a string, etc. A symbolic name consists of a sequence of letters, digits, and special characters, with the following restrictions:

1. The first character must be an alphabetic letter or a special character such as ‘$’; it cannot be a numeric digit. A name consisting only of ‘$’ or ‘?’ is not allowed (as these two characters have special meanings).
2. It cannot be a reserved word.
3. The maximum length is 31 characters.
4. It is not case-sensitive, unlike C. Upper and lower case letters are treated as equivalent.

Mnemonic

A mnemonic is an instruction name. The mnemonic field contains an assembler instruction. Instructions are divided into three classes: machine instructions, assembler directives and pseudo-opcodes. Machine instructions are assembler mnemonics that correspond to actual CPU instructions. Assembler directives are special instructions that provide information to the assembler but do not generate any code. All the instructions introduced in section 3.1 are assembler directives; they are only messages to the assembler, nothing else. A pseudo-opcode is also a message to the assembler, but it does emit object code bytes. For example, DB and DW in the data segment in section 3.2 are pseudo-opcodes. They emit the bytes or words of data specified by their operands, but they are not true instructions in assembly language.

Operand

The operand field contains the operands, or parameters, for the instruction specified in the mnemonic field. Operands never appear on lines by themselves. The type and number of the operands depend entirely on the specific instruction.

Comment

The comment field allows you to annotate each line of source code in your program. A comment begins with a semicolon “;”, and the assembler ignores everything on the source line that follows it. You can also place a comment on a line by itself.

Rules for writing assembly language

Each assembly language statement appears on its own line in the source file. Multiple statements on a single line are not allowed. However, blank lines are accepted.

Procedure Definition

In the code segment you must define at least one procedure, as follows:

    <procedure name> PROC {FAR | NEAR}
        :
        :    ;***your (main/ sub) program here***
    <procedure name> ENDP

The procedure name, like the function name of a subroutine in a high-level language, must be unique. PROC marks the beginning of the procedure, while ENDP tells the assembler where the procedure ends. FAR means that the procedure can be called from other segments, while NEAR means that it can be called only from within the same segment. As you can see, the definition of a procedure looks like the definition of a segment. One difference is that the procedure name must be a unique identifier within your program. Your code calls the procedure using this name. This topic is discussed further in the CALL section.

Section 3.4. Simple Commands in the Code Segment

Section 3.4.1. Data Movement

To begin with, we introduce the MOV command. MOV copies (moves) the content of one register or memory location to another. Generally speaking, assembly language instructions manipulate data stored in memory and registers. Addressing modes are the defined ways of telling an instruction which data and registers it needs. The following paragraphs show how the addressing modes are used with the MOV command.

Consider the 8086 MOV instruction

    MOV <destination>, <source>

The MOV instruction copies the data from the source operand to the destination operand. Its basic restriction is that both operands must be the same size.

We can use the different type of the memory location or the registers in the source and destination. However, the best choice of the source or destination is using the registers. Instructions using the registers will be shorter and faster than accessing the memory location.

Furthermore, as we will show in the following sections, the general purpose registers will be one of the instruction operands (or in other words, parameters).

There are two further restrictions on the MOV instruction:

1. It is invalid to specify CS as the destination operand, and
2. It is invalid for both operands to be segment registers.

Furthermore, there are five main types of addressing methods:

Immediate data reference & Register data reference (Direct data reference)

Immediate data reference consists of an immediate-mode source and a destination location. The data value is moved directly into the specified register. For example, “MOV AX, 004Ch” moves the hexadecimal number 004C into the AX (accumulator) register.

Register data reference moves data from one register or memory location to another register. For example, “MOV AX, BX” copies the content of the BX register to the AX register. A memory location can be referenced directly in the same way. The instruction

    MOV AL, DS:[8088h]

loads the byte at memory location 8088h (in the data segment) into the AL register. Likewise, the instruction

    MOV DS:[1234h], DL

stores the value of the DL register at memory location 1234h. To access location 1234h in the code segment you would use

    MOV AX, CS:[1234h]

Of course, you can use the name of a memory location as the source. For example,

    MOV AX, [CON]

means that the content of the variable CON is copied to AX. As above, [CON] is first resolved to an address.

Register Indirect Addressing Modes

There are four register indirect addressing modes on the 8086 CPU.

    MOV <destination>, [BX | BP | SI | DI]

These four addressing modes reference the byte at the offset found in the BX, BP, SI or DI register respectively. BX, SI and DI use the data segment (DS) by default, while BP uses the stack segment (SS).

As above, these instructions can refer to a different segment, e.g.,

    MOV AL, CS:[BX]

uses the offset in BX relative to the code segment.



Indexed Addressing Modes

The syntax of the indexed addressing modes is:

    MOV <destination>, DISP [BX | BP | SI | DI]

If BX contains 1000h, the instruction

    MOV CL, 20h [BX]

will load CL from memory location DS:1020h. Likewise, if BP contains 2020h,

    MOV DH, 1000h [BP]

will load DH from location SS:3020h.

The offsets generated by these addressing modes are the sum of the constant and the specified register. The addressing modes involving BX, SI and DI all use the data segment, while the BP addressing mode refers to the stack segment.

Based Indexed Addressing Modes

The based indexed addressing modes are simply combinations of the register indirect addressing modes. They form the offset by adding together a base register (BX or BP) and an index register (SI or DI). The syntax is:

    MOV <destination>, [BX | BP] [SI | DI]

Suppose that BX contains 1000h and SI contains 880h. The instruction

    MOV AL, [BX] [SI]

would load AL from location DS:1880h.

Likewise, if BP contains 1598h and DI contains 1004h,

    MOV AX, [BP+DI]

will load the 16-bit value in AX from locations SS:259Ch and SS:259Dh.

Based Indexed Plus Displacement Addressing Mode

These addressing modes are a slight modification of the based indexed addressing modes, with the addition of an 8-bit or 16-bit constant. For example,

    MOV AL, DISP [BX] [SI]
    MOV AL, DISP [BX+DI]
    MOV AL, [BP+SI+DISP]

To remember all of the above memory addressing modes, the following syntax can help:

    MOV <destination>, <DISP | [BX | BP] | [SI | DI]>

There are three terms in the source field: DISP, [BX | BP] and [SI | DI]. You can choose one term, two terms or all three. For example, choosing DISP from column one, nothing from column two and [DI] from column three gives

    MOV AL, DISP [DI]

Moreover, the generic move instruction takes three different assembly language forms:

    MOV <register>, <memory>
    MOV <memory>, <register>
    MOV <register>, <register>

Note that in each of these forms at least one of the operands is a general purpose register.

Finally, if the effective address calculation produces a value greater than FFFFh, the CPU ignores the overflow and the result wraps around back to zero. For example, if BX contains 10h, the instruction

    MOV AL, 0FFFFh [BX]

will load the AL register from location DS:000Fh, not DS:1000Fh.
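The offset arithmetic in the addressing modes above can be checked with a small model. The following Python sketch is not 8086 code; the function name effective_address and the register dictionary are hypothetical, introduced only to reproduce the worked examples (base plus index plus displacement, wrapped to 16 bits, with BP as base selecting SS).

```python
# Illustrative model of 8086 effective-address calculation (not 8086 code).

def effective_address(regs, base=None, index=None, disp=0):
    """Return (default segment, 16-bit offset) for DISP [base] [index]."""
    offset = disp
    if base is not None:
        offset += regs[base]       # BX or BP
    if index is not None:
        offset += regs[index]      # SI or DI
    # BP as base register selects the stack segment; everything else uses DS.
    segment = "SS" if base == "BP" else "DS"
    # Any carry out of bit 15 is discarded, so the offset wraps around.
    return segment, offset & 0xFFFF

regs = {"BX": 0x1000, "BP": 0x1598, "SI": 0x0880, "DI": 0x1004}
for args in (
    dict(base="BX", disp=0x20),      # MOV CL, 20h [BX]
    dict(base="BX", index="SI"),     # MOV AL, [BX] [SI]
    dict(base="BP", index="DI"),     # MOV AX, [BP+DI]
):
    seg, off = effective_address(regs, **args)
    print(f"{seg}:{off:04X}")

# Wraparound case: BX = 10h with displacement FFFFh.
seg, off = effective_address({"BX": 0x10}, base="BX", disp=0xFFFF)
print(f"{seg}:{off:04X}")
```

The mask with 0xFFFF is the whole wraparound rule: offsets are 16-bit quantities, so nothing above bit 15 survives.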

XCHG instruction

The XCHG (exchange) instruction swaps two values. The general form is

    XCHG <operand1>, <operand2>

There are two specific forms of this instruction on the 8086 machine:

    XCHG <register>, <memory>
    XCHG <register>, <register>

Since the 8086 often provides shorter and faster versions of instructions that use the AX register, you should try to arrange your computations so that the CPU can use the AX register as much as possible. Both operands must be the same size, and the XCHG instruction does not modify any status flags in the status register.

Section 3.4.2. Arithmetic Operations

Although we are accustomed to decimal arithmetic, a microcomputer performs only binary arithmetic. The operations below therefore work on binary values, which we write in hexadecimal.

The 8086 provides many arithmetic operations: addition, subtraction, multiplication, division, negation, etc. We will introduce all of these in the following paragraphs.

Section 3.4.2.1. ADD and SUB function

To begin with, we define ADD and SUB:

    ADD {destination}, {source} ; destination = destination + source
    SUB {destination}, {source} ; destination = destination - source

which take the forms:

    ADD {<register> | <memory>}, {<register> | <memory> | <immediate>}
    SUB {<register> | <memory>}, {<register> | <memory> | <immediate>}

Note that the destination and source cannot both be memory operands. If you want to add two memory values together, you must first load one of them into a register and then perform the addition.

The ADD instruction adds the contents of the source operand to the destination operand. For example, if AL = 60h and

    ADD AL, 20h

is executed, AL will become 80h and the overflow flag will be set. Why? [4]

In order to deal with the carry bit and borrow bit, we define ADC and SBB:

    ADC {destination}, {source} ; destination = destination + source + Carry
    SBB {destination}, {source} ; destination = destination - source - Carry

which take the forms:

    ADC {<register> | <memory>}, {<register> | <memory> | <immediate>}
    SBB {<register> | <memory>}, {<register> | <memory> | <immediate>}

ADC adds three elements: the destination, the source operand and the carry bit (0 or 1). If the carry bit is clear before execution, ADC behaves exactly like the ADD instruction.

The following example demonstrates the carry bit calculation. Suppose we want to do the calculation:

        0123 BC62
    +)  0012 553A
        ---------
        0136 119C

A single 16-bit word cannot hold a whole 32-bit source or destination number. To deal with this, we break each number into two words, defining W11 = 0123h, W12 = BC62h, W21 = 0012h and W22 = 553Ah. The output is stored in W31 and W32. Here is the sample program segment.

    MOV AX, W12 ; Add rightmost
    ADD AX, W22
    MOV W32, AX
    MOV AX, W11 ; Add leftmost
    ADC AX, W21 ; With carry
    MOV W31, AX
    ; ***** data definition *****
    W11 DW 0123h
    W12 DW 0BC62h ; a hex constant must begin with a digit
    W21 DW 0012h
    W22 DW 553Ah
    W31 DW ? ; not initialized
    W32 DW ?

Figure 10 Multi-word Addition

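During the ADD operation, the carry out of the low words is not added into the result, but the carry flag (CF) is set to 1. When the ADC operation is performed, it adds three elements: the contents of AX, W21 and CF. ADC adds whatever value CF currently holds, whether or not it came from the related calculation, so no instruction that changes CF may be placed between the ADD and the ADC. The MOV instructions in between are safe because MOV does not affect the flags.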
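The multi-word addition of Figure 10 can be traced with a short Python model. This is only a sketch of the arithmetic, not 8086 code; the helper add16 is a hypothetical name that returns the 16-bit result together with the carry out, in the way ADD and ADC leave the carry in CF.

```python
# Model of a 16-bit ADD/ADC: returns (result, carry_out).
def add16(a, b, carry_in=0):
    total = a + b + carry_in
    return total & 0xFFFF, total >> 16

W11, W12 = 0x0123, 0xBC62   # high and low word of the first number
W21, W22 = 0x0012, 0x553A   # high and low word of the second number

W32, cf = add16(W12, W22)        # ADD AX, W22 : low words, carry goes to CF
W31, _ = add16(W11, W21, cf)     # ADC AX, W21 : high words plus CF

print(f"CF after ADD = {cf}")
print(f"{W31:04X} {W32:04X}")    # prints "0136 119C"
```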

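The SUB instruction is similar to the ADD instruction. Note that subtraction is not commutative, so the order of the operands matters.

[4] Treated as unsigned numbers, 60h + 20h = 80h fits in AL, so the carry flag is 0. Treated as signed numbers, +96 + 32 = +128 does not fit in 8 bits, so the overflow flag is set and 80h reads as -128. AH is not changed by ADD AL, 20h.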


The ADD and SUB instructions affect the status flags. They set the overflow flag if signed overflow or underflow occurs, the sign flag if the result is negative, the zero flag if the result is zero, and the carry flag if an unsigned overflow (a carry or a borrow) occurs.
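As a rough check of these flag rules, the following Python sketch models an 8-bit ADD. The function add8_flags is a hypothetical helper, not part of any assembler; it applies the four rules above to the earlier example AL = 60h, ADD AL, 20h.

```python
# Model of the flags produced by an 8-bit ADD (illustrative only).
def add8_flags(a, b):
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),        # unsigned overflow (carry out of bit 7)
        "ZF": int(result == 0),         # result is zero
        "SF": result >> 7,              # copy of the high order bit
        # Signed overflow: both operands differ in sign from the result.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags

result, flags = add8_flags(0x60, 0x20)
print(f"AL = {result:02X}h", flags)

result, flags = add8_flags(0xFF, 0x01)
print(f"AL = {result:02X}h", flags)
```

The first call shows a signed overflow without a carry; the second shows a carry without a signed overflow (-1 + 1 = 0).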

Section 3.4.2.2. MUL and DIV function

For multiplication, the instructions are defined as follows.

    MUL {source}  ; byte: AX = AL * source    word: DX:AX = AX * source
    IMUL {source} ; byte: AX = AL * source    word: DX:AX = AX * source

which take the forms:

    MUL {<register> | <memory>}
    IMUL {<register> | <memory>}

MUL multiplies unsigned 8-bit or 16-bit data, while IMUL multiplies signed (2’s complement) 8-bit or 16-bit data. Note that when multiplying two n-bit values, the result may require as many as 2n bits. Therefore, the basic operations fall into two types: byte times byte and word times word. In the byte times byte operation, the multiplicand is in the AL register and the result is put in AX, so the previous contents of AH are lost. In the word times word operation, the multiplicand is in the AX register. After multiplication, the most-significant word is in DX and the least-significant word is in AX, so the original content of DX is lost.

For example, suppose the following program segment is executed:

    MOV AL, B1
    MUL B2      ; unsigned multiplication
    MOV AL, B1  ; reload AL, which MUL has overwritten
    IMUL B2     ; signed multiplication
    : :
    ;***** data definition *****
    B1 DB 80h
    B2 DB 40h

Figure 11 Unsigned and Signed Multiplication

MUL B2 treats 80h as +128, whereas IMUL B2 treats 80h as -128 (2's complement). After MUL executes, since 128 × 64 = 8192, which is 2000h, AX = 2000h. After IMUL executes, since -128 × 64 = -8192, which is E000h, AX = E000h.
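The two products in Figure 11 can be reproduced with a small Python model of the byte forms of MUL and IMUL. The helper names to_signed, mul8 and imul8 are hypothetical and exist only for this sketch.

```python
# Model of the byte forms of MUL and IMUL (illustrative only).
def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a 2's complement number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def mul8(al, source):
    """Unsigned: AX = AL * source."""
    return (al * source) & 0xFFFF

def imul8(al, source):
    """Signed: AX = AL * source, with both bytes read as 2's complement."""
    return (to_signed(al, 8) * to_signed(source, 8)) & 0xFFFF

B1, B2 = 0x80, 0x40
print(f"MUL : AX = {mul8(B1, B2):04X}h")    # 128 * 64
print(f"IMUL: AX = {imul8(B1, B2):04X}h")   # -128 * 64
```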

Division shares the same properties as multiplication. The syntax is:

    DIV {source}  ; byte: AL = AX / source, remainder in AH    word: AX = DX:AX / source, remainder in DX
    IDIV {source} ; byte: AL = AX / source, remainder in AH    word: AX = DX:AX / source, remainder in DX

which take the forms:

    DIV {<register> | <memory>}
    IDIV {<register> | <memory>}

DIV handles unsigned data while IDIV handles signed division. The basic operations fall into two types: byte into word and word into double word. In the byte into word operation, the dividend is in the AX register. The remainder is put in AH and the quotient in AL. In the word into double word operation, the most significant word of the dividend is in the DX register and the least significant word is in the AX register. The remainder is put in DX and the quotient in AX.

You cannot simply divide an 8-bit value by another 8-bit value. If the divisor is an 8-bit value, the dividend must be a 16-bit value. You can extend the 8-bit value to 16 bits, load it into a 16-bit register (e.g., AX), and then perform the division.

Furthermore, if the dividend and divisor are both non-negative (their sign bits are clear), DIV and IDIV generate the same results. If either value has its sign bit set, the results differ: DIV reads the bit pattern as a large unsigned number, while IDIV reads it as a negative number and produces a signed quotient.
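The difference between DIV and IDIV can be seen in a Python model of the byte-into-word forms. This is a sketch under stated assumptions: the helper names are hypothetical, the signed quotient is truncated toward zero with the remainder taking the sign of the dividend, and a quotient that does not fit in AL is reported as an error (the exact boundary handling on real hardware is not modelled here).

```python
# Model of the byte forms of DIV and IDIV (illustrative only).
def to_signed(value, bits):
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def div8(ax, source):
    """Unsigned: returns (AL quotient, AH remainder)."""
    quotient, remainder = divmod(ax, source)
    if quotient > 0xFF:
        raise OverflowError("quotient does not fit in AL")
    return quotient, remainder

def idiv8(ax, source):
    """Signed: returns (AL quotient, AH remainder) as 8-bit patterns."""
    n, d = to_signed(ax, 16), to_signed(source, 8)
    quotient = abs(n) // abs(d)
    if (n < 0) != (d < 0):
        quotient = -quotient            # truncate toward zero
    remainder = n - quotient * d        # same sign as the dividend
    if not -128 <= quotient <= 127:
        raise OverflowError("quotient does not fit in AL")
    return quotient & 0xFF, remainder & 0xFF

# Both operands non-negative: same answer from DIV and IDIV.
print(div8(100, 7), idiv8(100, 7))

# Divisor FEh is 254 for DIV but -2 for IDIV.
print(div8(0x00F9, 0xFE), idiv8(0x00F9, 0xFE))
```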

We summarize the MUL and DIV operations in the following table.

    Instruction   Multiplier   Multiplicand in   Product in
    MUL CL        CL (byte)    AL                AX
    MUL BX        BX (word)    AX                DX:AX

    Instruction   Divisor      Dividend in   Quotient in   Remainder in
    DIV CL        CL (byte)    AX            AL            AH
    DIV BX        BX (word)    DX:AX         AX            DX

Table 1 Summary of MUL and DIV operation

Section 3.4.2.3. Other arithmetic operations

The NEG instruction reverses the sign of a binary value. In effect, NEG inverts the bits and adds 1 to the number (the 2’s complement operation). The syntax is

    NEG {<memory> | <register>}

The INC instruction increases the memory or register content by 1. The DEC instruction decreases the memory or register content by 1. They are used in looping, which will be discussed in the next section. The syntax is

    INC {<memory> | <register>}
    DEC {<memory> | <register>}

The INC instruction is very important because adding one to a register is a very common operation. Because INC does not affect the carry flag, you can use it between the steps of a multi-word ADC or SBB sequence without disturbing the carry.

Section 3.4.3. Simple Control flow

The instructions discussed thus far are executed in a straight line, with one instruction sequentially following another. However, most programs contain loops, in which a series of steps is repeated until a specific requirement is reached, and decision points that determine which action to take. Examples include loops and subroutine calls.

Certain instructions in assembly language serve these purposes. There are four classes of transfer operations in the assembly language: unconditional jump, conditional jump, looping and call.

Section 3.4.3.1. Unconditional Jump

A commonly used instruction for transferring control is the unconditional jump instruction. The operation transfers control under all circumstances. The syntax is

    JMP {<address> | <label> | <register>}

You normally specify the target address by using a label. A statement label, as explained before, is usually an identifier followed by a colon, usually on the same line as an executable machine instruction. The assembler determines the offset of the statement after the label and automatically computes the distance from the jump instruction to the statement label. Therefore, you do not have to worry about computing displacements manually. For example, the following short jump transfers control to the label “START” whenever the JMP line is executed.

    START: MOV AX, 20h
    JMP START

Figure 12 Sample JUMP example

Note that you can use any general purpose register as the target. For example,

    JMP AX

is roughly equivalent to

    MOV IP, AX

(MOV IP, AX is not a real instruction, because IP cannot be named as an operand; it only describes the effect.)

Some forms of memory addressing, unfortunately, do not intrinsically specify a size. For example,

    JMP [BX]

does not tell us the size of the variable (is it a near or a far jump?). To resolve the ambiguity, you need to use a type coercion operator.

Coercion Operator

There are times when you would like to treat a byte variable as a word, or treat a word as a double word (addressing). Temporarily changing the type of a label for one particular occurrence is called coercion, as shown below:

    <type> PTR <expression>

Type is any of byte, word, dword, near, far, or other types. Expression is any general expression that is the address of some object. The coercion operator returns an expression with the same value as expression, but with the type specified by type. For example,

    JMP word ptr [BX]

states that the memory operand addressed by BX is a word, so this is a near jump to the 16-bit offset stored there. As another example, the following performs a far jump to the segment:offset address stored in the double word at CS:ABC:

    JMP dword ptr CS:[ABC]

Section 3.4.3.2. CALL procedure

The CALL and RET instructions enable us to call a procedure and return from it. Before we enter into detailed discussion, we define the call and return instructions as

    CALL <label>
    RET <immediate>

and recall the procedure syntax as

    <procedure name> PROC {FAR | NEAR}
    : : ;***your (main/ sub) program here***
    <procedure name> ENDP

In the procedure definition, FAR allows the procedure to be called from outside its code segment (inter-segment call), while NEAR only allows it to be called from inside its code segment (intra-segment call). The CALL instructions take the same forms as the JMP instructions, except that there is no short intra-segment call.

The FAR CALL instruction does the following:
1. Pushes the CS register onto the stack.
2. Pushes the 16-bit offset of the next instruction following the call onto the stack.
3. Copies the 32-bit effective address into the CS:IP register pair.
4. Execution continues at the first instruction of the subroutine.

The NEAR CALL instruction does the following:
1. Pushes the 16-bit offset of the next instruction following the call onto the stack.
2. Copies the 16-bit effective address into the IP register.
3. Execution continues at the first instruction of the subroutine.

The RET instruction returns control to the caller of a subroutine. It does so by popping the return address off the stack and transferring control to the instruction at the return address. A near return pops a 16-bit return address off the stack into the IP register. A far return pops a 16-bit offset into the IP register and then a 16-bit segment value into the CS register.

The other form of the RET instruction places an immediate value after the RET. It is identical to the plain RET instruction, except that the CPU adds the immediate value to the stack pointer immediately after popping the return address from the stack. This mechanism removes parameters that were pushed onto the stack before the call. Let’s take an example.

    CODE_SEG SEGMENT
    BEGIN PROC FAR
    : :
    CALL A
    CALL B
    : :
    BEGIN ENDP
    A PROC NEAR
    : :
    RET
    A ENDP
    CODE_SEG ENDS
    CODE_SEG1 SEGMENT
    B PROC FAR
    : :
    RET
    B ENDP
    CODE_SEG1 ENDS

Figure 13 Simple procedure call

From the example, we notice that if subroutine A is called from within the same code segment, the procedure can be defined with PROC NEAR. However, a procedure that is called from outside its code segment, such as B, must be defined with PROC FAR. When the procedure finishes, RET tells the CPU to return to the instruction following the CALL in the original program. Without it, execution would continue past the end of the procedure with unpredictable results. Note that the above program is only an outline and is not valid as written for executing the FAR call. For more details, please refer to the subroutine section.
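The push and pop sequences described above for CALL and RET can be followed with a small Python model. This is a sketch, not an emulator: the class Cpu, its method names and all addresses are hypothetical, and the stack is modelled as a dictionary of 16-bit words indexed by offset within the stack segment.

```python
# Model of how CALL and RET use the stack (illustrative only).
class Cpu:
    def __init__(self, cs, ip, sp):
        self.cs, self.ip, self.sp = cs, ip, sp
        self.stack = {}                      # SS offset -> 16-bit word

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF     # the stack grows downward
        self.stack[self.sp] = word & 0xFFFF

    def pop(self):
        word = self.stack.pop(self.sp)
        self.sp = (self.sp + 2) & 0xFFFF
        return word

    def call_near(self, target, next_ip):
        self.push(next_ip)                   # return offset only
        self.ip = target

    def call_far(self, segment, offset, next_ip):
        self.push(self.cs)                   # 1. push CS
        self.push(next_ip)                   # 2. push return offset
        self.cs, self.ip = segment, offset   # 3. load CS:IP

    def ret_near(self, n=0):
        self.ip = self.pop()
        self.sp = (self.sp + n) & 0xFFFF     # RET n discards n parameter bytes

    def ret_far(self, n=0):
        self.ip = self.pop()                 # offset comes off first
        self.cs = self.pop()                 # then the segment
        self.sp = (self.sp + n) & 0xFFFF

cpu = Cpu(cs=0x1000, ip=0x0010, sp=0x0100)
cpu.call_far(0x2000, 0x0000, next_ip=0x0015)
print(f"in B : CS:IP = {cpu.cs:04X}:{cpu.ip:04X}, SP = {cpu.sp:04X}")
cpu.ret_far()
print(f"back : CS:IP = {cpu.cs:04X}:{cpu.ip:04X}, SP = {cpu.sp:04X}")
```

A far call uses four bytes of stack and a near call two, which is why a procedure declared FAR must be left with a far return.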

Section 3.4.3.3. Conditional Jump

Although the JMP, CALL and RET instructions provide transfer of control, they do not allow you to make decisions before the jump. The conditional jump instructions handle this task. They are the basic tool for creating loops and other conditionally executable statements such as the if...then statement.

The conditional jumps test one or more bits in the status register to see if they match some particular pattern. If the pattern matches, control transfers to the target location. If the match fails, the CPU ignores the conditional jump and execution continues with the next instruction. Some instructions, for example, test the conditions of the sign, carry, overflow and zero flags.

Most of the time, you will execute a conditional jump after a comparison. To compare the contents of two data fields before executing the conditional jump instruction, use the compare instruction:

    CMP {destination}, {source} ; destination - source (set flags)

which takes the form:

    CMP {<register> | <memory>}, {<register> | <memory> | <immediate>}

Note that the destination and source cannot both be memory operands. The CMP instruction updates the status flags according to the result of the subtraction, but does not store the result.

You can test the result of the comparison by checking the appropriate flags in the status register. On an 8086 machine, the conditional jump instructions are all two bytes long and take the form:

    JXX {<address> | <label>}

where XX is the mnemonic representing the branch condition. The first byte is a one-byte opcode, followed by a one-byte displacement. Although this leads to a very compact instruction, a single-byte displacement only allows a range of -128 bytes to +127 bytes.

Conditional branching instructions test the corresponding status flags. If the condition is TRUE, the program flows to the label specified. The JXX instruction itself does not change the status register. The JXX instructions are summarized in the following table.

    Definition    Description                              Condition [5]

    Jump Based on Unsigned Data
    JE / JZ       Jump equal or jump zero                  Z=1
    JNE / JNZ     Jump not equal or jump not zero          Z=0
    JA / JNBE     Jump above or jump not below/ equal      C=0 & Z=0
    JAE / JNB     Jump above/ equal or jump not below      C=0
    JB / JNAE     Jump below or jump not above/ equal      C=1
    JBE / JNA     Jump below/ equal or jump not above      C=1 or Z=1

    Jump Based on Signed Data
    JE / JZ       Jump equal or jump zero                  Z=1
    JNE / JNZ     Jump not equal or jump not zero          Z=0
    JG / JNLE     Jump greater or jump not less/ equal     Z=0 & N=O
    JGE / JNL     Jump greater/ equal or jump not less     N=O
    JL / JNGE     Jump less or jump not greater/ equal     N<>O
    JLE / JNG     Jump less/ equal or jump not greater     Z=1 or N<>O

    Arithmetic Jump
    JS            Jump sign                                N=1
    JNS           Jump no sign                             N=0
    JC            Jump carry                               C=1
    JNC           Jump no carry                            C=0
    JO            Jump overflow                            O=1
    JNO           Jump not overflow                        O=0
    JP / JPE      Jump parity even                         P=1
    JNP / JPO     Jump parity odd                          P=0

The signed jumps compare the sign flag (N) with the overflow flag (O), because a signed overflow during the comparison inverts the sign of the result.

The conditional jump instructions give you the ability to split program flow into one of two paths depending upon some logical condition. Suppose you want to increment the AX register if BX is equal to CX. You can accomplish this with the following code:

    CMP BX, CX
    JNE NEXTSTAT
    INC AX
    NEXTSTAT: : :

Figure 14 Example of using conditional jump

[5] Refer to P.4 for register notation.

Use the opposite branch to skip over the instructions you want to execute when the condition is true: here JNE skips the INC when BX and CX differ. Always select the opposite branch using the opposite branch rule given earlier; otherwise you may need more than one conditional jump to perform a single test.
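To see why the unsigned and signed jumps test different flags, the following Python sketch models CMP on 16-bit operands and evaluates a few jump conditions. The names cmp_flags, CONDITIONS and taken are hypothetical; the flag letters follow the table (Z, C, N, O).

```python
# Model of CMP followed by a conditional jump (illustrative only).
def cmp_flags(dest, src, bits=16):
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (dest - src) & mask
    return {
        "Z": int(result == 0),
        "C": int((dest & mask) < (src & mask)),     # borrow needed
        "N": int(bool(result & sign)),
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the destination's sign.
        "O": int(bool((dest ^ src) & (dest ^ result) & sign)),
    }

CONDITIONS = {
    "JE": lambda f: f["Z"] == 1,
    "JA": lambda f: f["C"] == 0 and f["Z"] == 0,
    "JB": lambda f: f["C"] == 1,
    "JG": lambda f: f["Z"] == 0 and f["N"] == f["O"],
    "JL": lambda f: f["N"] != f["O"],
}

def taken(jump, dest, src):
    """True if the jump would be taken after CMP dest, src."""
    return CONDITIONS[jump](cmp_flags(dest, src))

# FFFFh is 65535 unsigned but -1 signed.
print(taken("JA", 0xFFFF, 0x0001), taken("JL", 0xFFFF, 0x0001))
# 8000h - 1 overflows as a signed subtraction, yet JL still decides correctly.
print(cmp_flags(0x8000, 0x0001), taken("JL", 0x8000, 0x0001))
```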

Section 3.4.3.4. Looping

Used on its own to jump backwards, the JMP instruction causes an endless loop. However, a routine is more likely to loop a specified number of times, or until it reaches a particular condition. The LOOP instruction, which serves this purpose, requires an initial value in the CX register. It decrements the CX register and branches to the target location if the CX register does not contain 0. The syntax is:

    LOOP <label>

Although the LOOP instruction’s name suggests that you would normally create loops with it, keep in mind that all it really does is decrement CX and branch to the target address if CX does not contain zero after the decrement. For example, the following code executes the program segment 10 times.

    MOV CX, 10
    A: INC AX
    : :
    LOOP A

Figure 15 Example of looping

The loop instruction does not affect any flags.
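The decrement-then-test behaviour of LOOP can be modelled in a few lines of Python. The function run_loop is a hypothetical helper that counts how many times the loop body runs for a given starting value of CX.

```python
# Model of a loop closed by the LOOP instruction (illustrative only).
def run_loop(cx):
    iterations = 0
    while True:
        iterations += 1            # the loop body executes
        cx = (cx - 1) & 0xFFFF     # LOOP: decrement CX (16-bit wraparound)
        if cx == 0:                # branch back only while CX is not zero
            break
    return iterations

print(run_loop(10))
print(run_loop(0))
```

Because the test comes after the decrement, starting with CX = 0 does not skip the loop: CX wraps to FFFFh and the body runs 65536 times.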

Section 3.4.4. Logic Control

Section 3.4.4.1. Simple Logic Function

Logic functions are important in circuit design. The instructions for Boolean logic are AND, OR, XOR, and NOT. The syntax is:

    AND {destination}, {source} ; destination = destination AND source
    OR {destination}, {source}  ; destination = destination OR source
    XOR {destination}, {source} ; destination = destination XOR source [6]
    NOT {destination}           ; destination = NOT destination

which take the forms:

    AND {register | memory}, {register | memory | immediate}
    OR {register | memory}, {register | memory | immediate}
    XOR {register | memory}, {register | memory | immediate}
    NOT {register | memory}

Note that the destination and source cannot both be memory locations. Of course, the destination and source must be the same size.

Except for NOT, which affects no flags, these instructions clear the carry and overflow flags and copy the high order bit of the result into the sign flag.

All of the above logic functions are applied bit by bit. For example, if AL = 1100 0101 (C5h) and BH = 0101 1100 (5Ch), after the operation

    AND AL, BH

AL would be set to 0100 0100 (44h).

Section 3.4.4.2. Relational Operators

The syntax of the relational operators is:

    <expression> <relational operator> <expression>

where <relational operator> ::= EQ | NE | LT | LE | GT | GE

The expression returns a value (FFh) when the condition is true; otherwise it returns 0 (false condition). For example,

    0 EQ 0

returns FFh. The use of the relational operators will be discussed in the conditional directive section.

[6] XOR is Exclusive OR, which sets the result bit to zero if both bits are equal (0 0 or 1 1) and to one otherwise.

Section 3.4.4.3. Rotation & shifting

There are several shift and rotate instructions. Their main use is integer multiplication and division by powers of two, which is faster with shifting than with the multiplication and division instructions. The destination operand must be either a register or memory. The source operand (the count) is either the immediate value 1 or the CL register. The syntax is

    Commands                                  Explanation
    SHL {<register> | <memory>}, {1 | CL}     Shift left
    SHR {<register> | <memory>}, {1 | CL}     Shift right
    SAL {<register> | <memory>}, {1 | CL}     Shift arithmetic left
    SAR {<register> | <memory>}, {1 | CL}     Shift arithmetic right
    ROL {<register> | <memory>}, {1 | CL}     Rotate left
    ROR {<register> | <memory>}, {1 | CL}     Rotate right
    RCL {<register> | <memory>}, {1 | CL}     Rotate left through carry bit
    RCR {<register> | <memory>}, {1 | CL}     Rotate right through carry bit

SHR shifts all the bits right by one bit, supplies 0 for the vacated bit and puts the bit shifted out into the carry bit C. For example, if AH = 10101101 and

    SHR AH, 1

is executed, AH will become 01010110 and the carry bit C = 1 (the bit shifted out).

SAL performs the same operation as SHL. An arithmetic right shift, however, keeps the most significant bit in its original position. The SAR instruction shifts all the bits in the operand to the right by one bit and replicates the high order bit. Its main purpose is to perform a signed division by some power of two.

ROL is similar to SHL. However, ROL and ROR put the bit shifted out into the vacated position at the other end of the value, as well as into the carry bit. For example, if AH = 10101101 and

    ROR AH, 1

is executed, AH will become 11010110 and the carry bit C = 1.

RCR is similar to ROR. However, RCR puts the bit shifted out into the carry bit and moves the previous value of the carry bit into the vacated position. Notice that if you rotate through the carry bit (n+1) times, where n is the number of bits in the operand, you will get your original value back. For example, if AH = 10101101 and carry bit C = 0, and

    RCR AH, 1

is executed, AH will become 01010110 and the carry bit C = 1.
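The one-bit right shifts and rotates described above can be compared side by side with a Python model on 8-bit values. The function names mirror the mnemonics but are hypothetical helpers; each takes the value and the incoming carry bit and returns the new value and the new carry bit.

```python
# Model of one-bit right shifts and rotates on a byte (illustrative only).
def shr(value, carry):
    return value >> 1, value & 1                        # 0 enters at the top

def sar(value, carry):
    return (value >> 1) | (value & 0x80), value & 1     # sign bit replicated

def ror(value, carry):
    return (value >> 1) | ((value & 1) << 7), value & 1 # bit 0 re-enters at the top

def rcr(value, carry):
    return (value >> 1) | (carry << 7), value & 1       # old carry enters at the top

AH = 0b10101101
for op in (shr, sar, ror, rcr):
    result, carry = op(AH, 0)
    print(f"{op.__name__.upper()}: AH = {result:08b}, C = {carry}")

# Rotating an 8-bit value through the carry 9 times restores value and carry.
value, carry = AH, 0
for _ in range(9):
    value, carry = rcr(value, carry)
print(f"after 9 x RCR: AH = {value:08b}, C = {carry}")
```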

[Figure: bit-movement diagrams for SHL/SAL, SHR, SAR, ROL, ROR, RCL and RCR, showing the carry bit C and the 0 shifted in]

Figure 16 Shifting & Rotation graphical representation

For 8088 and 8086 CPUs, the number of bits to be shifted or rotated is either 1 or the value in CL. If the number of bits shifted is larger than 1, the count must first be loaded into CL. For example, if AH = 10101101 and a 3-bit right shift is needed, the following must be used. [7]

    MOV CL, 3
    SHR AH, CL

Figure 17 Shift Right Example with constant larger than 1

[7] For example, the command “SHL AH, 3” is incorrect; an immediate count is allowed only when it is 1.

Since shifting an integer value to the left by one position is equivalent to multiplying that value by two, use the shift left instruction for multiplication by powers of two.

For example, consider the calculation 8 × 10 = 80 (decimal), where 8 = 00001000 in binary.

Notice that 8 × 10 = 8 × (8 + 2) = 8 × (2^3 + 2^1) = 8 × 2^3 + 8 × 2^1.

Therefore, we can use the following procedure:

    MOV AH, 8h  ; AH = 8
    MOV AL, AH  ; AL = 8
    MOV CL, 3   ; shift count for AH
    SHL AH, CL  ; AH = 8 x 2^3 = 64
    SHL AL, 1   ; AL = 8 x 2^1 = 16
    ADD AH, AL  ; AH = 64 + 16 = 80

Figure 18 Multiplication and Shifting Example

AH will become 01010000 = 80 (decimal). Try more examples by yourself!
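The same shift-and-add decomposition can be tried on other values with a short Python model. The function times10 is a hypothetical helper that mirrors Figure 18 on 8-bit values, including the truncation that occurs when the product no longer fits in a byte.

```python
# Model of multiplying an 8-bit value by 10 with shifts (illustrative only).
def times10(x):
    high = (x << 3) & 0xFF      # SHL AH, CL with CL = 3 : x * 8
    low = (x << 1) & 0xFF       # SHL AL, 1              : x * 2
    return (high + low) & 0xFF  # ADD AH, AL             : x * 8 + x * 2

print(f"{times10(8):08b} = {times10(8)}")
print(times10(25), times10(26))
```

The second line shows the limit of the 8-bit version: 25 × 10 = 250 still fits in a byte, while 26 × 10 = 260 does not and is truncated.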

Section 3.4.5. Simple interrupts (I/O)

An interrupt is an operation that interrupts execution of a program so that the system can take special action. The interrupt instruction INT is a very special form of a call instruction. Whereas the CALL instruction calls subroutines within your program, the INT instruction calls the system routines and other special subroutines. The major difference between interrupt service routines and standard procedures is that you can have any number of different procedures in an assembly language program, while the system supports a maximum of 256 different interrupt service routines. A program calls a subroutine by specifying the address of that subroutine; it calls an interrupt service routine by specifying the interrupt number for that particular interrupt service routine. The interrupt form is

    INT <special code>

where special code, a number from 0 to 255, selects the type of interrupt. It allows you to call one of the 256 different interrupt routines. This form of the INT instruction is two bytes long. The first byte is the INT opcode, and the second byte is the immediate data containing the interrupt number.

Although you can use the INT instruction to call procedures, the primary purpose of the interrupt instruction is to make a system call. A system call is a subroutine call to a procedure provided by the system, such as DOS, the mouse driver, or some other piece of software resident in the machine. Since you always refer to a specific number when making a system call, your program does not need to know the actual address of the subroutine in memory. The INT instruction provides dynamic linking to your program. The CPU determines the actual address of an interrupt service routine at run time by looking up the address in an interrupt vector table. This allows the addresses of the system routines to change without any change to the interrupt numbers that programs use. As long as the system call uses the same interrupt number, the CPU will automatically call the interrupt service routine at its new address.
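The table lookup can be sketched in Python. The layout used here is not stated in this tutorial and is added as background about the 8086: the interrupt vector table starts at physical address 0, and each entry is four bytes, a 16-bit offset followed by a 16-bit segment, both stored low byte first. The function names and the handler address 1234h:5678h are hypothetical.

```python
# Model of an interrupt vector table lookup (illustrative only).
def vector_slot(int_no):
    """Address of the 4-byte table entry for an interrupt number."""
    if not 0 <= int_no <= 0xFF:
        raise ValueError("interrupt number must be in the range 0..255")
    return int_no * 4

def handler_address(memory, int_no):
    """Read (segment, offset) of the service routine from the table."""
    slot = vector_slot(int_no)
    offset = memory[slot] | (memory[slot + 1] << 8)
    segment = memory[slot + 2] | (memory[slot + 3] << 8)
    return segment, offset

# Hypothetical table: the INT 21h routine lives at 1234h:5678h.
memory = bytearray(1024)                          # 256 entries x 4 bytes
memory[0x84:0x88] = bytes([0x78, 0x56, 0x34, 0x12])

segment, offset = handler_address(memory, 0x21)
print(f"entry at {vector_slot(0x21):04X}h -> {segment:04X}:{offset:04X}")
```

Moving a service routine only requires rewriting its four-byte entry; programs that execute INT 21h are unaffected.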

The only problem is that MS-DOS alone supports over 100 different calls, and BIOS and other system utilities provide even more. Together these can exceed the 256 interrupt numbers available. The common solution is to employ a single interrupt number for a given class of services and pass a service number in one of the registers (typically AH). For example, MS-DOS uses only a single interrupt number, 21h. To choose a particular DOS function, you load a DOS service code into the AH register before executing the INT 21h instruction. For example, to terminate a program and return control to MS-DOS, you would normally load AH with 4Ch and call the DOS interrupt, as shown below:

    MOV AH, 4Ch
    INT 21h

Figure 19 Quit interrupt

Keyboard and screen handling is another good example. Interrupt 21h provides several services for keyboard and screen handling, each with a different service number. To choose a particular operation, you load the service number into the AH register before executing INT 21h. The following table lists some typical service numbers.

    Service No.   Explanation
    01h           Keyboard input with echo
    02h           Display output
    07h           Keyboard input without echo (no check for Ctrl-C)
    08h           Keyboard input without echo (check for Ctrl-C)
    09h           Display string
    4Ch           Terminate program

Table 2 DOS interrupt service code

For example, to read a character from the keyboard buffer and display it on the screen, you would use the following code:

    MOV AH, 01h
    INT 21h

Figure 20 Input Service interrupt

The AL register will hold the ASCII code of the character read, which is also shown on the screen. Service number 08h acts like service number 01h, except that no output is shown on the screen. Service number 07h acts like service number 08h, except that it does not check for the Ctrl-C key.

Another example is to print a character or string on the screen. To print a character, use output service 02h. The operation displays the character whose ASCII code is in the DL register. To display a string, use output service 09h. The operation displays a string defined in the data segment, with its effective address in the DX register. For example, to print the string “HELLO”, the following procedure must be followed:

    MOV AH, 09h
    LEA DX, NAME
    INT 21h
    : :
    DATA_SEG SEGMENT
    NAME DB "HELLO$"
    DATA_SEG ENDS

Figure 21 Output Service interrupt

The LEA (load effective address) instruction is used to prepare pointer values. Its general form is:

    LEA <destination>, <source>

which takes the form

    LEA <register>, <memory>

It loads the specified general purpose register with the effective address of the specified memory location. The effective address is the final memory address obtained after all addressing mode computations. For example,

    LEA AX, 3 [BX]

loads the value of BX plus 3 into the AX register.

In the string case, LEA loads the address of the first character of the string into the register (DX) so that the string operation can find it. Service 09h then displays characters until it finds the “$” sign, which is not printed.

Section 3.5. Basic structure of the Stack Segment

Every DOS executable in .EXE format must have a stack of sufficient size to operate normally. The stack is a piece of memory reserved for an EXE program while it stays in memory. When you try to assemble a program, you may see the message:

    “NO STACK SEGMENT”

The program will still work if you do not declare a stack. However, from now on, you should define your own stack.

A piece of memory must be reserved for the operations of the stack. The definition of the stack inside the stack segment is

    DW <size of stack> dup (<initial value>)

For example,

    STACK_SEG SEGMENT STACK
    DW 1024 DUP (0)
    STACK_SEG ENDS

Figure 22 Stack Definition

defines a stack of 1024 words, each initialized to 0. You should declare as much stack space as you need [8]; otherwise, you will cause an error during execution. Furthermore, the keyword STACK must be written after SEGMENT.

When a program is loaded into memory, a stack is created. The stack segment register (SS) holds the segment address of the stack. The stack pointer (SP) points to the top of the stack. The base pointer (BP) is used to access elements inside the stack without popping the top elements. Both SP and BP are used relative to SS.

There are two basic instructions for stack manipulation, PUSH and POP. The syntax is:

    PUSH {<register> | <memory>}
    POP {<register> | <memory>}

PUSH decreases the stack pointer (SP) by 2 before placing the data on the stack. POP copies the top element to the register or memory operand and then increases the stack pointer by 2. [9]

Notice three things about the manipulation of the stack. First, it is always in the stack segment. Second, the stack grows down in memory, i.e. as you push values onto the stack the CPU stores them into successively lower (smaller) memory locations. Finally, SP (the stack pointer) always contains the address of the value on the top of the stack.

All pushes and pops are 16-bit operations. There is no way to push a single 8-bit value onto the stack. To push an 8-bit value, you should first load it into a 16-bit register, with the high order byte set to 0.

[8] Reserve space for interrupt execution, procedure calls and your own use.
[9] This implies that PUSH and POP only handle word-size (16-bit) data.

[Figure: stack diagram showing SS, SP, BP and the push direction]

Figure 23 Stack operation

[Figure: stack diagram showing SS, SP, the push direction, the top of stack and the next element to be placed]

Figure 24 Stack push implementation

Section 4. Macro Processing

Section 4.1. Macro Definition

For each symbolic instruction, the assembler generates one machine language instruction. However, if you repeat the same instruction(s) by means of a calling statement, the calling procedure adds overhead. A macro statement can help to solve this problem. To begin with, macros simplify and reduce the amount of coding. Furthermore, they make the program easier to read. Finally, they reduce errors in the program.

A macro is like a procedure that inserts a block of statements at various points in the assembly program during assembly. Unlike the assembly instructions you write, the conditional assembly directives (discussed later) and macro language constructs execute during assembly. The conditional directives and macro statements do not exist when your assembly program is running. The purpose of these statements is to control which statements the assembler assembles into your final executable file. The macro directives let you emit repetitive sequences of instructions into an assembly language file, much like high level language procedures and loops, but without any run-time overhead.

The assembler provides facilities for defining macros. Each macro is given a name, and the instructions that the macro is to generate are written inside the macro definition.

A macro definition appears before any defined segment, unless the macro is a data definition, in which case it belongs in the data segment. The syntax is:

        <Macro name> MACRO (arg1, arg2, ...)
                :
                :               ;*** your macro here ***
        ENDM

The name of the macro is written before the MACRO directive. It should be a valid and unique symbol in the source file. You use this identifier to expand the macro (much as you use a name to call a subroutine). The arguments are the values you specify when you expand the macro (they correspond to the formal and actual parameters of a subroutine call). In the examples that follow, the arguments are written without parentheses. The MACRO directive tells the assembler that the following instructions, up to ENDM, are part of the macro definition. ENDM marks the end of the macro definition.

During expansion, the assembler expands every occurrence of the macro in the code segment. It replaces the macro name with the macro body at that point, and replaces the formal arguments inside the macro body with the supplied arguments. No run-time overhead is incurred: the assembler performs a simple substitution followed by normal assembly.

Note that the assembler does not immediately assemble the instructions between the MACRO and ENDM directives when it encounters the macro definition. Instead, the assembler stores the text of the macro in a special table (called the symbol table) and inserts these instructions into your program when it expands the macro. For example,

CLS     MACRO
        MOV AH, 06h
        MOV AL, 00h
        MOV BH, 07h
        MOV CX, 0000h
        MOV DH, 18h
        MOV DL, 50h
        INT 10h
        ENDM

Figure 25 Macro Definition of clearing Screen

How do you call the macro? In the code segment, write the macro name. In the above example, to call the CLS macro, just write CLS in the code segment. The assembler then inserts the statements between the MACRO and ENDM directives into your code at the point of the macro invocation.

If arguments are needed, the formal parameters are defined in the macro definition and the actual parameters (arguments) are supplied in the macro call. The assembler substitutes the actual parameters for the formal parameters wherever they appear as operands in the macro definition. The assembler does a straight textual substitution only. For example, to print a different message in different cases, you can write:

PRTMSG  MACRO MSG               ;/*** printing message ***/
        MOV AH, 09h
        LEA DX, MSG
        INT 21h
        ENDM

CODE_SEG SEGMENT PARA
MAIN    PROC FAR
        :
        :
        PRTMSG MSG1
        PRTMSG MSG2
        :
        :
MAIN    ENDP
CODE_SEG ENDS

DATA_SEG SEGMENT PARA
MSG1    DB "Hello$"
MSG2    DB "Everybody$"
DATA_SEG ENDS

Figure 26 Macro Definition with argument

After assembly, the program becomes:

CODE_SEG SEGMENT PARA
MAIN    PROC FAR
        :
        :
        MOV AH, 09h
        LEA DX, MSG1
        INT 21h
        MOV AH, 09h
        LEA DX, MSG2
        INT 21h
        :
        :


Figure 27 Assembly Program after assembling

In some instances, macros can save a considerable amount of typing. For example, if you want to clear the screen many times without a macro, you must write the same block of statements again and again for a single purpose. Writing a macro makes the program simpler and easier to write.

Since the assembler does a textual substitution for macro parameters when expanding the macro, there are times when a macro expansion might not produce the results you expect. For example, look at the following statement:

        MSG A * 5

Calling the MSG macro with text like this can lead to a problem: the assembler passes only the text object A to the macro. If we call the macro as:

        MSG <A*5>

the assembler passes the text object A*5 to the macro as a single argument. To pass the value of the expression A*5 instead, write:

        MSG %A*5

The assembler evaluates the expression "A*5" and, before the expansion, converts the resulting numeric value to a text value consisting of the digits that represent the value.

Macro and Procedure

Although a macro and a procedure produce the same result, they do it in different ways. A procedure definition generates code when the assembler encounters the PROC directive. A call to this procedure requires the CPU to:

1. Encounter the CALL instruction
2. Push the return address onto the stack
3. Jump to the procedure
4. Execute the code in the procedure
5. Pop the return address off the stack
6. Return to the calling code.

The macro, on the other hand, does not emit any code when the assembler processes the statements between the MACRO and ENDM directives. However, upon encountering the macro name in the mnemonic field, the assembler will:

1. Assemble every statement between the MACRO and ENDM directives
2. Emit the code to the output executable file (*.exe).

At run time, the CPU executes these instructions without the procedure overhead.

A macro expansion usually executes faster than the same code implemented as a procedure. Furthermore, to call a macro, you simply specify the macro name as though it were an instruction or directive. To call a procedure, you need to use the CALL instruction.
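The trade-off can be seen by writing the same small task both ways. The following is only a sketch: the names NEWLINE_M and NEWLINE_P are hypothetical, and the task (printing a carriage return and line feed with DOS service 02h) was chosen only because that service appears elsewhere in this tutorial.

```asm
; Macro version: the six instructions are copied in at every use.
NEWLINE_M MACRO
        MOV AH, 02h         ; DOS service 02h: display the character in DL
        MOV DL, 0Dh         ; carriage return
        INT 21h
        MOV AH, 02h         ; reload the service number to be safe
        MOV DL, 0Ah         ; line feed
        INT 21h
        ENDM

; Procedure version: the six instructions exist only once in the program.
NEWLINE_P PROC NEAR
        MOV AH, 02h
        MOV DL, 0Dh
        INT 21h
        MOV AH, 02h
        MOV DL, 0Ah
        INT 21h
        RET                 ; pop the return address and go back to the caller
NEWLINE_P ENDP

; In the code segment:
        NEWLINE_M           ; expands to six instructions here
        NEWLINE_M           ; and to six more here
        CALL NEWLINE_P      ; one CALL instruction, plus CALL/RET time when it runs
        CALL NEWLINE_P      ; one more CALL instruction
```

The macro costs program size each time it is used, while the procedure costs a little time on every call. Both versions change AX and DL.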

Section 4.2. Local Directives

Some macros require instruction labels. However, using a label inside a macro causes a problem. Since the assembler copies the macro text directly, the label is defined again each time the macro is expanded, or when the main procedure uses the same label. When this happens, the assembler generates a multiple definition error. To overcome this problem, the LOCAL directive can be used to define a local label within a macro. The syntax is:

        LOCAL (label_1, label_2, ...)

For example,

CLS     MACRO
        LOCAL A
        MOV AH, 06h
        MOV AL, 00h
A:      MOV BH, 07h
        :
        :
        LOOP A
        ENDM

Figure 28 Macro Definition with local definition

The LOCAL directive must appear immediately after the MACRO directive. During each expansion, the assembler assigns a different, unique label in the program.
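To see why the unique labels matter, consider a macro that is expanded twice. This is a sketch only: DELAY and AGAIN are hypothetical names, and the generated label names shown in the comments are the form MASM normally uses, given here as an illustration.

```asm
; Sketch only: DELAY and AGAIN are hypothetical names.
DELAY   MACRO COUNT
        LOCAL AGAIN         ; must come right after the MACRO line
        MOV CX, COUNT       ; CX is overwritten by this macro
AGAIN:  LOOP AGAIN          ; decrement CX and repeat until it reaches 0
        ENDM

; In the code segment:
        DELAY 1000          ; AGAIN becomes a unique label, e.g. ??0000
        DELAY 2000          ; AGAIN becomes a different label, e.g. ??0001
```

Without the LOCAL line, the second expansion would define AGAIN a second time and the assembler would report a multiple definition error.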

Section 4.3. Nested Macro

A macro definition may contain a reference to another defined macro. Consider a simple macro named DOSINT that loads a service number into the AH register and calls DOS interrupt 21h:

DOSINT  MACRO SERVICE
        MOV AH, SERVICE
        INT 21h
        ENDM

Figure 29 Example of Dos interrupt macro

Suppose that you have another macro named DISPLY that uses service 02h in the AH register to display a character:

DISPLY  MACRO CHAR
        MOV DL, CHAR
        MOV AH, 02h
        INT 21h
        ENDM

Figure 30 Example of Display macro

Now, you can change the above macro into the following nested macro:

DISPLY  MACRO CHAR
        MOV DL, CHAR
        DOSINT 02h
        ENDM

Figure 31 Example of Display macro

and the expansion will be the same as before.



Section 4.4. Special Directives (see footnote 10)

Section 4.4.1. Repetition Directives

Another macro format is the repeat macro. A repeat macro is nothing more than a loop that repeats the statements within the loop a specified number of times. There are three repetition directives: REPT, IRP and IRPC. Each causes the assembler to repeat a block of statements terminated by ENDM. These directives do not have to be contained in a MACRO definition, but if they are, one ENDM is required to terminate the repetition and a second ENDM to terminate the MACRO definition.

REPT directive

The syntax of the REPT directive is:

        REPT <expression>
          <statements>
        ENDM

Expression must be a numeric expression that evaluates to an unsigned constant. The repeat directive duplicates all the statements between REPT and ENDM the number of times indicated by the expression. For example,

MAKEWORD MACRO N
        REPT N
        DW 1024
        ENDM
        ENDM

Figure 32 Example of REPT directives

When the macro is invoked as MAKEWORD 3, the loop repeats 3 times, each time emitting "DW 1024" in the expansion. Note that the REPT loop executes at assembly time, not at run time. REPT is not a mechanism for creating loops within the program; it only replicates sections of code within your program.
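Because REPT runs at assembly time, it can be combined with an assembly-time counter to build a table. The following sketch assumes it is placed in the data segment; VALUE and SQUARES are hypothetical names chosen for this example.

```asm
; Sketch only: VALUE and SQUARES are hypothetical names.
VALUE   = 0                 ; assembly-time counter, redefinable with "="
SQUARES LABEL WORD          ; names the start of the table
        REPT 4
        DW VALUE * VALUE    ; emits one word per repetition
VALUE   = VALUE + 1         ; advance the counter during assembly
        ENDM
; The block above assembles as if you had written:
;       DW 0
;       DW 1
;       DW 4
;       DW 9
```

No loop exists in the finished program; only the four data words remain.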

IRP directive

Another form of the repeat macro is the IRP macro. The IRP (Indefinite Repeat) operation repeats a block of instructions up to the ENDM. The syntax is:

        IRP parameter, <argument1, argument2, ...>
          <statements>
        ENDM

The angle brackets "< >" are required around the list of arguments. The IRP directive replicates the instructions between IRP and ENDM once for each item in the argument list. On each iteration, the parameter takes the value of the next item in the list. For example,

        IRP N, <1,2,3,4>
        DB N
        ENDM

Figure 33 Indefinite Repeat Example

The loop emits 4 DB instructions, generating DB 1, DB 2, DB 3 and DB 4. The arguments can be any number of legal symbols, string, numeric, or arithmetic constants. Remember, the IRP loop, like the REPT loop, executes at assembly time, not at run time.
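A common use of IRP is to generate one instruction per register. The sketch below wraps two IRP loops in macros; SAVEREGS and RESTREGS are hypothetical names, not part of the assembler.

```asm
; Sketch only: SAVEREGS and RESTREGS are hypothetical macro names.
SAVEREGS MACRO
        IRP REG, <AX, BX, CX, DX>
        PUSH REG            ; expands to PUSH AX, PUSH BX, PUSH CX, PUSH DX
        ENDM                ; ends the IRP block
        ENDM                ; ends the macro

RESTREGS MACRO
        IRP REG, <DX, CX, BX, AX>
        POP REG             ; pops in the reverse order of the pushes
        ENDM
        ENDM
```

Writing SAVEREGS at the start of a routine and RESTREGS at its end then produces four PUSH and four POP instructions without typing each one.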

IRPC directive

The third form of the loop macro is the IRPC macro. It differs from the IRP macro in that it repeats the loop once for each character of a string rather than once for each operand in a list. Here is the general syntax:

        IRPC parameter, <string>
          <statements>
        ENDM

10 Ignore this part forward for CSC Minor Students



The statements in the loop repeat once for each character in the string operand. The angle brackets "< >" must appear around the string. For example,

        IRPC N, <1234>
        DB N
        ENDM

Figure 34 Indefinite Repeat Example

The assembler will generate a block of the code for each character in the string argument. After the expansion, the assembler will generate DB 1, DB 2, DB 3 and DB 4. The arguments can be any number of legal symbols, string, numeric, or arithmetic constants.

Section 4.4.2. Conditional Directives

It is important to realize that these directives evaluate their expressions at ASSEMBLY TIME, not at run time. A conditional assembly directive is not the same as a C conditional statement. For example, the IF directive in assembly is similar to the #if and #ifdef preprocessor directives in C, not to the C if statement.

The conditional assembly directives are important because they let you generate different object code for different operating environments and different situations, especially in macro expansion.

One way to make a program execute different sections of code depending on the processor is to use conditional assembly. With conditional assembly, you choose whether the assembler assembles a block of code or not. Since both versions of the code appear in the same source file, the program is much easier to maintain, because you do not have to correct the same bug in two separate programs. You may still need to correct a bug in two separate code sequences, but since they sit in one file it is less likely that you will forget to make the change in both places.

The conditional assembly directives are especially useful in macros. They can help you produce efficient code when a macro would normally produce sub-optimal code. In effect, they act as a programming language within a programming language.

IF Directive

The assembler supports a number of conditional directives. The IF directive is most useful within a macro definition. Every IF directive must have a matching ENDIF to terminate the tested condition. One optional ELSE may provide an alternative. The syntax is:

        IF <condition>
          <sequence of statements>
        ELSE                    ;optional
          <sequence of statements>
        ENDIF

Omission of ENDIF causes the error message "Undetermined conditional". The assembler evaluates the condition. If it is non-zero (true), the assembler assembles the statements between the IF and ELSE directives (or ENDIF, if ELSE is not present). If the expression evaluates to zero (false) and an ELSE section is present, the assembler assembles the statements between the ELSE and ENDIF directives. If the ELSE section is not present and the expression evaluates to false, the assembler does not assemble any of the code between the IF and ENDIF directives.

The important thing to remember is that the condition has to be an expression (it can use relational operators such as EQ, LT, etc.) that the assembler can evaluate at assembly time, i.e., it must evaluate to a constant. For example, to assemble the first set of code when A = 0, and the second set of code otherwise, you could use the following statements:

        IF A EQ 0
        MOV AX, A               ;<first set>
        ELSE
        MOV BX, A               ;<second set>
        ENDIF

Figure 35 Example of using IF directives

IFE Directive

The IFE directive is used exactly like the IF directive, except that it assembles the code after the IFE directive only if the expression evaluates to zero (false) rather than non-zero (true), i.e. it is the reverse of the IF directive.


IFDEF & IFNDEF Directive

These two directives require a single symbol as the operand. IFDEF assembles the associated code if the symbol is defined; IFNDEF assembles the associated code if the symbol is not defined. Use ELSE and ENDIF to terminate the conditional assembly sequences, as in:

        IFDEF <symbol>
          <sequence of statements>
        ENDIF

These directives are especially popular for including or excluding code that handles certain special cases in an assembly language program. For example, to make a group of statements optional, write the first line of the block as:

        IFDEF SETTING
        :
        :

To activate the code, simply define the symbol SETTING somewhere at the beginning of the program (before the first IFDEF referencing SETTING). To eliminate the code, simply delete the definition of SETTING. You may define SETTING using a simple statement like:

        SETTING = 0

Note that the value you assign to SETTING is unimportant. Only the fact that you have defined this symbol is important.
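A typical use is trace code that should be present only while testing. In the sketch below, DEBUG is a hypothetical symbol chosen for this example, and the traced action (printing one character with DOS service 02h) is only an illustration.

```asm
; Sketch only: DEBUG is a hypothetical symbol used as an assembly-time switch.
DEBUG   = 1                 ; delete this line to leave the trace code out

        IFDEF DEBUG
        MOV AH, 02h         ; DOS service 02h: display the character in DL
        MOV DL, '*'         ; marker showing that this point was reached
        INT 21h
        ENDIF
```

When DEBUG is not defined, the three instructions are never assembled, so they take no space and no time in the finished program.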

IFB, IFNB

The IFB and IFNB directives, useful mainly in macros, check whether an operand is blank (IFB) or not blank (IFNB). They take the form:

        IFB <argument>
          <sequence of statements>
        ELSE                    ;optional
          <sequence of statements>
        ENDIF

IFNB works in the opposite manner to IFB, i.e. it assembles the statements that IFB would skip, and vice versa. Note that the angle brackets "< >" around the operand are required.

For example, all INT 21h requests require a service number in the AH register, whereas some requests also require a value in the DX register. The following macro, DOSINT, uses IFNB to test for a nonblank argument for DX.

DOSINT  MACRO SERVICE, ADDRESS
        MOV AH, SERVICE
        IFNB <ADDRESS>
        LEA DX, ADDRESS
        ENDIF
        INT 21h
        ENDM

Figure 36 Example of IF directives

If DOSINT 01h is called, the assembler generates only two program lines:

        MOV AH, 01h
        INT 21h

Figure 37 Example of IF directives after expansion of missing parameter

If DOSINT 09h, MSG is called, the assembler generates the following program lines:

        MOV AH, 09h
        LEA DX, MSG
        INT 21h

Figure 38 Example of IF directives after the expansion


IFIDN, IFDIF, IFIDNI, and IFDIFI



The IFIDN, IFDIF, IFIDNI and IFDIFI directives take two operands and assemble the associated code if the operands are identical (IFIDN), different (IFDIF), identical ignoring case (IFIDNI), or different ignoring case (IFDIFI). The syntax is

        IFXXX <operand1>, <operand2>
          <sequence of statements>
        ENDIF

where IFXXX is one of the four directives above. Note that the angle brackets "< >" around each operand are required.
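These directives compare the text of the operands, which makes them useful for special-casing a macro argument. The following sketch uses a hypothetical macro name, SETREG, that picks a different instruction when the value argument is literally 0.

```asm
; Sketch only: SETREG is a hypothetical macro name.
SETREG  MACRO REG, VAL
        IFIDN <VAL>, <0>    ; is the argument text exactly "0"?
        XOR REG, REG        ; clears the register (note: changes the flags)
        ELSE
        MOV REG, VAL
        ENDIF
        ENDM

; In the code segment:
        SETREG AX, 0        ; expands to XOR AX, AX
        SETREG BX, 25       ; expands to MOV BX, 25
```

Because the comparison is textual, an argument written as 00 or 0h would not match <0> and would take the MOV branch.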

EXITM directive

The EXITM directive immediately terminates the expansion of a macro, just as reaching ENDM does. Why is EXITM needed? The answer is conditional assembly: EXITM can be placed inside a conditional block so that the expansion stops early only under a certain condition, such as

COUNTER MACRO COUNT
        IF COUNT EQ 0
        :
        :
        EXITM
        ENDIF
        ADD AX, COUNT
        ENDM

Figure 39 Example of EXITM

Section 4.5. INCLUDE Directive

The INCLUDE directive, when encountered in the source file, switches program input from the current file to the file specified in its operand. This allows you to build text files containing macros, source code and other assembler items. The syntax of the INCLUDE directive is:

        INCLUDE <filename>

Filename must be a valid DOS filename. The assembler merges the specified file into the assembly at the point of the INCLUDE directive. Note that you can nest INCLUDE statements inside the files you include.

Using the INCLUDE directive by itself does not provide separate compilation. You can, however, use it to break up a large source file into separate modules and join these modules together when you assemble your file. The following example includes the PRINTF.ASM file during the assembly of the program:

        INCLUDE PRINTF.ASM

Now the program can benefit from the modularity gained by this approach. The INCLUDE directive inserts the source file at the point of the include during assembly, exactly as though you had typed that code in the program.
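A common arrangement is to keep macro definitions in a file of their own and include it at the top of each program. In the sketch below, MACROS.INC is a hypothetical file name; the PRTMSG macro is the one defined earlier in this tutorial.

```asm
; Sketch only: MACROS.INC is a hypothetical file name.
; ----- contents of MACROS.INC -----
PRTMSG  MACRO MSG
        MOV AH, 09h
        LEA DX, MSG
        INT 21h
        ENDM

; ----- start of the main source file -----
        INCLUDE MACROS.INC  ; the text of MACROS.INC is read in at this point
```

After the INCLUDE line, the main source file can use PRTMSG as if the macro had been typed there.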

Section 5. Linking to Subprograms

Section 5.1. Intersegment Calls

As described in section 3.4.3.2, a subprogram can be called by the main program with a far call. However, this does not work until the proper attributes are declared. The calling program has to know that the subprogram exists outside the main program segment, and the subprogram must tell the assembler and linker that another module needs to know its address. The EXTRN and PUBLIC directives serve these purposes.

To use EXTRN and PUBLIC, you must create at least two modules. One module contains a set of variables and procedures used by the second. The second module uses them without knowing how they are implemented. The syntax of EXTRN and PUBLIC is:

        EXTRN <subprogram name> : type
        PUBLIC <subprogram name>

where type can be BYTE, WORD, FAR, NEAR, etc. BYTE and WORD identify data items that this module references but another module defines. NEAR and FAR identify a procedure or instruction label that this module references but another module defines. The EXTRN directive tells the assembler that the named item is defined outside this module. The PUBLIC directive tells the assembler and linker that the address of the specified symbol, defined in the current assembly, is to be available to other modules. Likewise, every external name used within a module must appear in a PUBLIC statement in some other module.

Although you can divide your program into procedures in many ways, you will often need to pass data from the main program to a procedure or return data from it. How you pass parameters depends on their size and number. There are several ways to pass parameters to a procedure.

Passing Parameters in Registers

If you are passing a small number of bytes to a procedure, the registers are an excellent place to pass value parameters. If you are passing a single parameter, you may use the AL register. If you are passing several parameters, use several registers. In general, you should avoid using the BP register for this purpose.

The following example illustrates this. It consists of a main program, MAIN, and a subprogram, ADDING. The main program defines the segments for the stack, data and code. The purpose of the program is to add A and B together by calling the subprogram. An EXTRN in the main program declares the entry point to the subprogram as ADDING.

The subprogram contains a PUBLIC statement (after the ASSUME statement) that makes ADDING known to the linker as the entry point for execution. This subprogram simply adds the contents of AX and BX. Since the subprogram does not define any data, it does not need a data segment. Also, the subprogram does not define a stack segment, because it uses the same stack addresses as the main program. Consequently, the stack defined in the main program is available to the subprogram. The linker requires the definition of at least one stack for an EXE program, and the definition of the stack in the main program serves that purpose.

; Main Program
        EXTRN ADDING:FAR

CODE_SEG SEGMENT PARA
MAIN    PROC FAR
        ASSUME CS:CODE_SEG,DS:DATA_SEG,SS:STACK_SEG
        MOV AX, DATA_SEG
        MOV DS, AX
        MOV AX, A
        MOV BX, B
        CALL ADDING
        MOV AH, 4Ch
        INT 21h
MAIN    ENDP
CODE_SEG ENDS

DATA_SEG SEGMENT PARA
A       DW 5
B       DW 7
DATA_SEG ENDS

STACK_SEG SEGMENT PARA STACK
        :                       ; stack definition (missing in the source)
STACK_SEG ENDS
        END MAIN

; Subprogram
CODE_SEG SEGMENT PARA
ADDING  PROC FAR
        ASSUME CS: CODE_SEG
        PUBLIC ADDING
        ADD AX, BX
        RET
ADDING  ENDP
CODE_SEG ENDS
        END ADDING

Figure 40 Use of EXTRN and PUBLIC in main and sub-program

What can we do if common data is used in both the main program and the subprogram? A common requirement is to process data in one assembly module that is defined in another assembly module. We modify the above example as follows; note that the change is that the contents of A and B are now moved to AX and BX inside the subprogram.

; Main program
        EXTRN ADDING1:FAR
        PUBLIC A, B

CODE_SEG SEGMENT PARA PUBLIC
MAIN1   PROC FAR
        ASSUME CS:CODE_SEG,DS:DATA_SEG,SS:STACK_SEG
        MOV AX, DATA_SEG
        MOV DS, AX
        CALL ADDING1
        MOV AH, 4Ch
        INT 21h
MAIN1   ENDP
CODE_SEG ENDS

DATA_SEG SEGMENT PARA PUBLIC
A       DW 5
B       DW 7
DATA_SEG ENDS

STACK_SEG SEGMENT PARA STACK
        :                       ; stack definition (missing in the source)
STACK_SEG ENDS
        END MAIN1

; Subprogram:
        EXTRN A: WORD, B: WORD
CODE_SEG SEGMENT PARA PUBLIC
ADDING1 PROC FAR
        ASSUME CS: CODE_SEG
        PUBLIC ADDING1
        MOV AX, A
        MOV BX, B
        ADD AX, BX
        RET
ADDING1 ENDP
CODE_SEG ENDS
        END ADDING1

Figure 41 Example of using common data

Note that there are two main changes in the above example. First, the main program MAIN1 declares the data items A and B as PUBLIC. The data segment is also defined with the PUBLIC attribute. For the code segments, the PUBLIC attribute causes the linker to combine the two logical code segments into one physical code segment.

Next, the subprogram ADDING1 declares A and B as EXTRN, both of WORD size. This declaration tells the assembler that each item is one word long. The assembler can now generate the correct operation code for the MOV instructions, but the linker has to complete the operand addresses.

The main program and the subprogram may define any other data items, but only those declared as PUBLIC and EXTRN are known in common.

The ADDING1 subprogram can refer to the main program's data because it does not change the address in the DS register, which still points to the main program's data segment. However, programs are not always this simple, and subprograms often have to define their own data as well as refer to data in the calling program.

The next example shows a variation in which data is defined in both data segments.

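; Main program
        EXTRN ADDING2:FAR
        PUBLIC A

CODE_SEG SEGMENT PARA
MAIN2   PROC FAR
        ASSUME CS:CODE_SEG,DS:DATA_SEG,SS:STACK_SEG
        MOV AX, DATA_SEG
        MOV DS, AX
        CALL ADDING2
        MOV AH, 4Ch
        INT 21h
MAIN2   ENDP
CODE_SEG ENDS

DATA_SEG SEGMENT PARA
A       DW 5
DATA_SEG ENDS

STACK_SEG SEGMENT PARA STACK
        :                       ; stack definition (missing in the source)
STACK_SEG ENDS
        END MAIN2

; Subprogram
        EXTRN A: WORD
CODE_SEG SEGMENT PARA PUBLIC
ADDING2 PROC FAR
        ASSUME CS: CODE_SEG
        PUBLIC ADDING2
        MOV AX, A               ; fetch A while DS still points to the main data segment
        PUSH DS
        ASSUME DS: DATA_SEG
        MOV BX, DATA_SEG        ; BX is used here so that A, already in AX, is not overwritten
        MOV DS, BX
        MOV BX, B
        ADD AX, BX
        POP DS
        RET
ADDING2 ENDP
CODE_SEG ENDS

DATA_SEG SEGMENT PARA
B       DW 7
DATA_SEG ENDS

        END ADDING2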

Figure 42 Example of using data in both programs

The main difference between this example and the previous one is where the data is defined. A is defined in the main program, while B is defined in the subprogram. The subprogram has to fetch A first, while the DS register still contains the address of the main program's data segment. The subprogram then pushes DS onto the stack and loads DS with the address of its own data segment. The subprogram can now access its own data, and it restores DS before returning.

Furthermore, you can use the PUBLIC attribute on both data segments. In this case the linker combines them, and the subprogram does not need to push and pop DS because both programs use the same data segment and the same DS address.

Passing Parameters on the Stack

Another way of making data known to a called subprogram is to pass parameters, that is, the calling program passes the data physically via the stack. This method can handle procedures that take a large number of parameters, and it is demonstrated by the following example.

; Main program
        EXTRN ADDING3:FAR

CODE_SEG SEGMENT PARA PUBLIC
MAIN3   PROC FAR
        ASSUME CS:CODE_SEG,DS:DATA_SEG,SS:STACK_SEG
        MOV AX, DATA_SEG
        MOV DS, AX
        PUSH A
        PUSH B
        CALL ADDING3
        MOV AH, 4Ch
        INT 21h
MAIN3   ENDP
CODE_SEG ENDS

DATA_SEG SEGMENT PARA
A       DW 5
B       DW 7
DATA_SEG ENDS

STACK_SEG SEGMENT PARA STACK
        :                       ; stack definition (missing in the source)
STACK_SEG ENDS
        END MAIN3

; Subprogram
CODE_SEG SEGMENT PARA PUBLIC
ADDING3 PROC FAR
        ASSUME CS: CODE_SEG
        PUBLIC ADDING3
        PUSH BP
        MOV BP, SP
        MOV AX, [BP+8]
        MOV BX, [BP+6]
        ADD AX, BX
        POP BP
        RET 4
ADDING3 ENDP
CODE_SEG ENDS
        END ADDING3

Figure 43 Example of passing parameters to the subprogram

Consider the stack after the MOV BP, SP instruction has executed in the subprogram ADDING3 (see Figure 44):

1. A PUSH in the main program loaded data A (0005) onto the stack first.
2. A PUSH in the main program loaded data B (0007) onto the stack.
3. The CALL in the main program pushed the contents of CS onto the stack, e.g., 1234.
4. The CALL in the main program pushed the contents of IP onto the stack, e.g., 0035.

The called program needs BP (containing, say, 0000) to access the parameters on the stack. Its first action is to save the contents of BP for the calling program by pushing it onto the stack. The program then copies the contents of SP into BP, because BP is usable as an index register. Since BP now holds the same value as SP, which points to the top of the stack, the saved BP is at (BP), the return IP at (BP+2), the return CS at (BP+4), data B at (BP+6) and data A at (BP+8). The routine transfers A and B from the stack to AX and BX and performs the addition.

Before returning to the calling program, the routine pops BP (restoring the caller's base pointer), which increments SP by 2. The last instruction, RET, is a far return to the calling program that performs the following:

1. Pops the word now at the top of the stack into IP and increases SP by 2.
2. Pops the word now at the top of the stack into CS and increases SP by 2.
3. Because of the two passed parameters (A and B) on the stack, the RET instruction is coded as RET 4. The 4, known as a pop value, is the number of bytes in the passed parameters. The RET operation adds the pop value to SP, so that SP again has the value it had before the parameters were pushed.


Figure 44 The contents of the stack


In effect, because the parameters on the stack are no longer required, the operation discards the values A and B and returns correctly to the calling program. Note that the POP and RET operations only increment SP; they do not erase the contents of the stack.

When saving other registers on the stack, always make sure that you save and set up BP before pushing the other registers. If you push the other registers before setting up BP, the offsets of the parameters relative to BP will change. That means that the two instructions

        PUSH BP
        MOV BP, SP

should be the first two instructions in any subroutine that receives parameters on the stack.
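The following sketch shows this ordering in a near procedure that also saves SI and DI. WORKER is a hypothetical procedure name, and the choice of SI and DI as working registers is an assumption made only for this example.

```asm
; Sketch only: WORKER is a hypothetical near procedure taking two word parameters.
WORKER  PROC NEAR
        PUSH BP             ; save the caller's BP
        MOV BP, SP          ; [BP+2] = return address, [BP+4] and [BP+6] = parameters
        PUSH SI             ; registers saved after this point do not
        PUSH DI             ; change the parameter offsets
        MOV SI, [BP+6]      ; first parameter pushed by the caller
        MOV DI, [BP+4]      ; second (last) parameter pushed by the caller
        ; ... body of the procedure ...
        POP DI              ; restore in reverse order
        POP SI
        POP BP
        RET 4               ; return and discard the two parameters
WORKER  ENDP
```

Had SI and DI been pushed before MOV BP, SP, each parameter would sit 4 bytes further from BP, and every offset in the body would have to be adjusted.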

Section 5.2. Returning Parameters

You can return results in the same places where you pass parameters. As with parameters, the registers are the best place to return function results: the procedure simply leaves the result in a register.

Another good place to return results is the stack. The idea is to push a dummy value onto the stack to create space for the function result. The procedure, before leaving, stores its result in this location. When the function returns to the caller, it pops everything off the stack except this result. For example, suppose three parameters are passed to a procedure that adds them and returns the sum. In this case, we push one extra dummy value (addressed as [BP+10] in the following example) to hold the sum.

Calling sequence:
        PUSH AX                 ; for dummy variable
        PUSH A                  ; push three variables
        PUSH B
        PUSH C
        CALL SUMMING
        POP AX                  ; Get the return result

Procedure:
SUMMING PROC NEAR
        PUSH BP
        MOV BP, SP
        MOV AX, [BP+4]
        ADD AX, [BP+6]
        ADD AX, [BP+8]
        MOV [BP+10], AX
        POP BP
        RET 6
SUMMING ENDP

Figure 45 Example of returning results

Section 5.3. Local Variable Storage

Sometimes a procedure requires temporary storage that it no longer needs when it returns. You can easily allocate such local variable storage on the stack. The 8088 CPU supports local variable storage with the same mechanism it uses for parameters: the BP and SP registers are used to access and allocate the variables. For example,

TESTING PROC NEAR
        PUSH BP
        MOV BP, SP
        SUB SP, 6
        MOV AX, [BP+4]
        MOV BX, [BP-2]
        :
        :
        ADD SP, 6
        POP BP
        RET 6
TESTING ENDP

Figure 46 Example of using local variable

The SUB SP, 6 instruction makes room for three words on the stack. You can allocate three local variables in these three words and reference them by indexing off the BP register with negative offsets ([BP-2], [BP-4], [BP-6]). After that instruction, you can use the memory between BP and SP as temporary storage for the local variables.

The example uses the matching instruction:

        ADD SP, 6

at the end of the procedure to deallocate the local storage. The value you add to the stack pointer must exactly match the value you subtracted when allocating this storage. If these two values do not match, the stack pointer upon exit from the routine will not match the stack pointer upon entry, which has the same effect as pushing or popping too many items inside the procedure.

Unlike parameters, you can allocate local variables in any order. As long as you are consistent with your location assignments, you can allocate them in any way you choose.

Section 5.4. Recursive Calling

Recursion occurs when a procedure calls itself. There is one thing to keep in mind when using recursion: recursive routines can consume a considerable amount of stack space. Therefore, when writing recursive subroutines, always allocate sufficient memory in your stack segment. The best example is the quick sort, which is almost a literal translation of the C-language form. Please refer to Sample Program 3.

Section 5.5. Linking C and Assembly Language Programs

Assembler code can call C functions and reference external C variables. A C program can also call public assembler functions and reference public assembler variables. The following are a few simple rules for sharing functions and variables between C and assembly.

1. As in assembly language, when a C program links to an assembly program, EXTRN and PUBLIC must be declared in the main program and the subprogram respectively. In the C program, the syntax of the external declaration is:

        extern <program function>

   For example, if a function called _ADDING (an assembly language procedure) is to be called by a C program with two integer parameters a and b, the following must be written before the main program in C:

        extern int _ADDING (int, int);

   The extern statement must appear at the beginning of the main program.

2. If you are programming in C, all external labels should start with an underscore ‘_’ character in assembly. The C compiler automatically prefixes an underscore to all function and external variable names when they are used in C code, so you only need to attend to underscores in your assembler code. Make sure that all assembler references to C functions and variables begin with an underscore, and that every assembler function or variable that is made public and referenced by C also begins with an underscore. Furthermore, you must name the code segment _TEXT, and all calls from C to assembly will be near procedure calls.

3. The assembler is normally insensitive to case when handling symbolic names, making no distinction between uppercase and lowercase letters. Since C is case sensitive, it’s desirable to have assembler be case sensitive, at least for those symbols that are shared between assembler and C.

4. It is important that the assembler EXTRN statements that declare external C variables specify the right size for those variables. The correspondence between C and assembler types is as follows:

        Byte size in assembly: unsigned char, char
        Word size in assembly: unsigned short, short, unsigned int, int

5. C passes parameters on the stack. Before calling a function, C pushes the parameters to that function onto the stack, starting with the rightmost parameter and ending with the leftmost parameter. The C function call

        Testing (a, b, 5);

   compiles to

        MOV AX, 5
        PUSH AX
        PUSH B
        PUSH A
        CALL _Testing

Figure 47 Conversion of C function to assembly function

   You can see that the rightmost parameter, 5, is pushed first, then B, and finally A.

6. As far as C is concerned, C-callable assembler functions can do anything as long as they preserve the registers that C relies on (BP, SP, CS, DS and SS). SI and DI are special cases, since C uses them as register variables. If register variables are enabled in the C module calling your assembler function, you must preserve SI and DI. It is good practice to push them on entry and pop them on exit.

7. A C-callable assembler function returns its value in registers, just like a C function. In general, an 8-bit or 16-bit value is returned in the AX register, while a 32-bit value is returned in DX:AX, with the high word in DX and the low word in AX.
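The DX:AX convention of rule 7 is simply a 32-bit value cut into two 16-bit halves. The following C sketch shows the split and the reassembly; the type dxax_t and both function names are invented for this illustration.

```c
#include <stdint.h>

/* Hypothetical pair of 16-bit registers holding one 32-bit result. */
typedef struct {
    uint16_t dx;   /* high word */
    uint16_t ax;   /* low word  */
} dxax_t;

/* What an assembler function would leave in DX and AX to return v. */
dxax_t split_dxax(uint32_t v)
{
    dxax_t r;
    r.dx = (uint16_t)(v >> 16);
    r.ax = (uint16_t)(v & 0xFFFFu);
    return r;
}

/* What the C caller reads back from the register pair. */
uint32_t join_dxax(uint16_t dx, uint16_t ax)
{
    return ((uint32_t)dx << 16) | ax;
}
```

For example, returning 12345678h means placing 1234h in DX and 5678h in AX.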

The following example shows the typical way to access two passed parameters.

C program:

#include <stdio.h>
#include <stdlib.h>
extern int ADDING (int, int);
int main ( )
{
        int a, b;
        :
        :
        ADDING (a, b);
        :
        :
}

Assembly Program:

        PUBLIC _ADDING
_TEXT SEGMENT PARA PUBLIC 'CODE'
        ASSUME CS: _TEXT
_ADDING PROC NEAR
        PUSH BP
        MOV BP, SP
        MOV AX, [BP+4]          ; first parameter (a)
        MOV BX, [BP+6]          ; second parameter (b)
        PUSH SI
        PUSH DS
        :
        :
        POP DS
        POP SI
        POP BP
        MOV AX, 0               ; return value in AX
        RET                     ; the C caller removes the parameters
_ADDING ENDP
_TEXT ENDS
        END

Figure 48 Example of Linking from C to Assembly Program
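The offsets [BP+4] and [BP+6] used in Figure 48 follow directly from the push order of rule 5. The sketch below works them out for a NEAR procedure that starts with PUSH BP / MOV BP, SP and takes word-sized parameters; the function names are hypothetical.

```c
#include <stddef.h>

/* Index (0 = leftmost) of the parameter that C pushes in step k (0 = first push)
   when a call has n parameters: the rightmost parameter is pushed first. */
size_t pushed_in_step(size_t n, size_t k)
{
    return n - 1 - k;
}

/* BP-relative offset of parameter `index` (0 = leftmost) in a NEAR procedure:
   [BP+0] holds the old BP, [BP+2] the return address, parameters start at [BP+4]. */
int param_offset_near(size_t index)
{
    return 4 + 2 * (int)index;
}
```

For Testing (a, b, 5) the constant 5 (index 2) is pushed in step 0 and a (index 0) in step 2, so inside the callee a is found at [BP+4], b at [BP+6] and 5 at [BP+8].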

One case in which you may wish to call a C function from assembler is when you need to perform complex calculations. This is especially true when mixed integer and floating-point calculations are involved. Calling a C function works like calling an assembly procedure, but note the following:

1. The function must be defined in the C program and declared as external in the assembly program. For example, if the assembly procedure calls a C function named adding, the declaration is

        EXTRN _adding: PROC

   where PROC means that the symbol is a procedure.

2. Push all necessary parameters before calling the C function.
3. Remove the parameters from the stack after the C function returns (for example, by popping them).
4. The return value will be in the AX register.

Section 6. Keyboard & Screen handling (I/O)

Section 6.1. Introduction to I/O handling

Up to this point, most programs have defined their data items in the data area or within an instruction operand as immediate data. However, most programs require input from a keyboard and display their answers on a screen. The following paragraphs cover the basic requirements for displaying information on a screen and for accepting input from a keyboard.

There are several methods for handling keyboard input and screen output. The main methods are:

Direct access: the whole screen is mapped onto a region of main memory, so screen display can be done simply by writing appropriate values into that memory directly. The interface between the keyboard and the system is called a port, which has a specific address (or just a port number). Any input from the keyboard can be read from that port. Where the screen is mapped and which port the keyboard uses may differ from machine to machine.

BIOS interrupt: uses the INT instruction to transfer control directly to the BIOS.

DOS interrupt: uses the INT instruction to transfer control directly to DOS. Because DOS runs on any PC, a DOS interrupt can be used with any I/O device that DOS supports.

To call a BIOS or DOS interrupt, an interrupt instruction will be used in the assembly program. Given an interrupt value, the instruction INT will transfer the control to one of the 256 different interrupt handlers. The interrupt vector table holds the address of these interrupt handlers. We will introduce different applications of BIOS and DOS interrupt in the keyboard and screen processing.

Section 6.2. I/O with DOS interrupt

Section 6.2.1. String input

Section 3 introduced a simple interrupt that reads input one character at a time. The DOS service that accepts a whole string from the keyboard is more powerful. DOS interrupt 21h service 0Ah reads a line of text (a string) from the keyboard and stores it in an input buffer. For example,

CODE_SEG SEGMENT
        ASSUME CS: CODE_SEG, DS: DATA_SEG
        :
        :
        MOV AH, 0Ah
        LEA DX, NAMESTRING
        INT 21h
        :
        :
CODE_SEG ENDS
DATA_SEG SEGMENT
NAMESTRING DB 10, ?, 10 dup (?)
DATA_SEG ENDS

Figure 49 Keyboard input segment

To begin with, the system needs to know the maximum length of the input data, so that it can warn a user who types too many characters. Secondly, the system must report how many characters the user actually typed. Therefore, the label NAMESTRING needs three parameters to define an input string. The first parameter gives the maximum length of the input string, the second receives the actual number of characters entered, and the third reserves the space for the input string itself.

To request the input string in the code segment, place the DOS service number 0Ah in AH, load the address of the parameter list into DX, and issue INT 21h. The interrupt waits for the user to enter characters and checks that they do not exceed the maximum. The operation echoes the entered characters to the screen. Pressing the ENTER key (ASCII code 0Dh) marks the end of the entry (see footnote 11).

All characters, including the ENTER key, are stored as ASCII codes of one byte each. Therefore, the string must be defined with DB.

11 The ENTER key counts toward the maximum length of the string. However, it is not counted in the actual length of the string.
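The three-part buffer can be pictured as a byte array: byte 0 is the maximum length, byte 1 the actual count, and the characters start at byte 2. The C sketch below copies the typed text out of such a buffer into a C string. It only imitates the layout described above; buffer_to_string is a hypothetical helper, not a DOS or library function.

```c
#include <stddef.h>
#include <string.h>

/* Copies the typed characters out of a service-0Ah style buffer.
   buf[0] = maximum length, buf[1] = actual count (ENTER not counted),
   buf[2..] = the characters, followed by 0Dh.
   Returns the number of characters copied into out. */
size_t buffer_to_string(const unsigned char *buf, char *out, size_t out_size)
{
    size_t count = buf[1];

    if (out_size == 0)
        return 0;
    if (count > buf[0])              /* defensive: count can never exceed the maximum */
        count = buf[0];
    if (count > out_size - 1)        /* leave room for the terminating '\0' */
        count = out_size - 1;
    memcpy(out, buf + 2, count);
    out[count] = '\0';
    return count;
}
```

A buffer declared like NAMESTRING (maximum 10) occupies 12 bytes in total and, because the ENTER key takes one of the 10 positions, can hold at most 9 typed characters.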

Section 6.2.2. Display the special characters

As in section 3, you use DOS interrupt service 02h to display a character (or 09h to display a string terminated by "$"). When special characters must be printed on the screen after the string, define the data as follows.

DATA_SEG SEGMENT
NAMESTRING DB "Name", 0Dh, 0Ah, "$"
DATA_SEG ENDS

Figure 50 Data definition for printing the special character

where 0Dh is the carriage return (the code produced by the ENTER key) and 0Ah is the line-feed control character.

Section 6.3. Video display with BIOS interrupt

The PC BIOS uses several interrupt numbers to accomplish various operations. The main interrupt for the video display service is 10h. Like the DOS interrupt, it requires additional parameters in certain registers.

The INT 10h instruction does several video display related functions. You can use it to initialize the video display, set the cursor size and position, read the cursor position, etc. You can select the particular function to execute by passing a service number in the AH register.

Section 6.3.1. Screen Clearing and Coloring

Prompts and commands stay on the screen until they are overwritten or scrolled off. In assembly, the screen can be cleared with BIOS interrupt INT 10h service 06h. The following registers must be set before the call.

Register  Purpose                      Initial Value
AH        Interrupt Service Code       06h
AL        Number of lines scrolled up  00 for full screen, other constant for number of lines
BH        Specify the color            See below
CH        Starting row                 Any value (Suggestion: 00)
CL        Starting column              Any value (Suggestion: 00)
DH        Ending row                   Any value (Suggestion: 18h, the last row of a 25-row screen)
DL        Ending column                Any value (Suggestion: 4Fh, the last column of an 80-column screen)

Table 3 Register purpose in screen handling

For example, the following code clears the whole screen:

        MOV AH, 06h
        MOV AL, 00h
        MOV BH, 07h
        MOV CX, 0000h
        MOV DH, 18h
        MOV DL, 4Fh
        INT 10h

Figure 51 Screen clearing example

Value         Background color   Foreground color
0h / 0000b    Black              Black
1h / 0001b    Blue               Blue
2h / 0010b    Green              Green
3h / 0011b    Cyan               Cyan
4h / 0100b    Red                Red
5h / 0101b    Magenta            Magenta
6h / 0110b    Brown              Brown
7h / 0111b    White              White
8h / 1000b    Blink black        Gray
9h / 1001b    Blink blue         Light blue
Ah / 1010b    Blink green        Light green
Bh / 1011b    Blink cyan         Light cyan
Ch / 1100b    Blink red          Light red
Dh / 1101b    Blink magenta      Light magenta
Eh / 1110b    Blink brown        Yellow
Fh / 1111b    Blink white        Bright white

Table 4 Color and the corresponding value

BH specifies the color of the resulting screen. The most significant bit selects blinking, the next 3 bits give the background color, and the least significant 4 bits give the foreground color (see Figure 52). Passing other values in the row, column and line-count registers scrolls only part of the screen, which produces a window effect. For example,

        MOV AH, 06h
        MOV AL, 05h
        MOV BH, 61h
        MOV CX, 0A1Ch
        MOV DH, 0Eh
        MOV DL, 34h
        INT 10h

Figure 53 Windows effect example

creates a window near the center of the screen (rows 0Ah to 0Eh, columns 1Ch to 34h) with its own colors: attribute 61h selects a brown background and a blue foreground.
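The value loaded into BH can be assembled from its three fields with shifts and masks. The C sketch below follows the bit layout described above (bit 7 blinking, bits 6 to 4 background, bits 3 to 0 foreground); make_attribute is a name chosen for this illustration.

```c
#include <stdint.h>

/* Builds the attribute byte for BH:
   bit 7 = blink, bits 6..4 = background color, bits 3..0 = foreground color. */
uint8_t make_attribute(unsigned blink, unsigned background, unsigned foreground)
{
    return (uint8_t)(((blink & 1u) << 7) |
                     ((background & 7u) << 4) |
                     (foreground & 0x0Fu));
}
```

For instance, make_attribute(0, 6, 1) gives 61h, the value used in Figure 53, and make_attribute(0, 0, 7) gives 07h, the white-on-black attribute of the screen clearing example.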

Section 6.3.2. Setting and Moving the Cursor

Setting the cursor

Setting the cursor is a common requirement in text mode, since its position determines where the next character is displayed. INT 10h service 02h sets the cursor: BH holds the page number (00), and DH and DL hold the row (y-coordinate) and column (x-coordinate) respectively. For example,

        MOV AH, 02h
        MOV BH, 00h
        MOV DH, 05h
        MOV DL, 0Ch
        INT 10h

Figure 54 Moving Cursor example

places the cursor at row 5 and column 12.

Get the cursor position

Similarly, to get the cursor position, use INT 10h service 03h with the page number in BH set to 00. For example,

        MOV AH, 03h
        MOV BH, 00h
        INT 10h

Figure 55 Reading Cursor example

After the above execution, DH will contain the row number and DL will contain the column number of the cursor position.

Read the character from the cursor position

In most applications the cursor shows the user where to type a character. How can we get the character at the cursor position? BIOS interrupt 10h with service number 08h will help. For example,

        MOV AH, 08h
        MOV BH, 00h
        INT 10h

Figure 56 Reading character from Cursor example

The character is returned in the AL register as its ASCII code.

Figure 52 Internal structure for color definition (the attribute byte passed in BH, Section 6.3.1)

        Bit:   7              6 5 4                   3 2 1 0
               blinking bit   background color bits   foreground color bits

Section 7. Interrupt Service Routine (ISR)

The last section discussed interrupts. An interrupt is an interruption of program control, typically triggered by an external event. Such interrupts generally have nothing at all to do with the instructions currently executing; instead, some event, such as pressing a key on the keyboard, informs the CPU that a device needs attention. The CPU interrupts the currently executing program, services the device, and returns control to the program.

An interrupt service routine is a procedure written specifically to handle an interrupt. Although different events cause interrupts, the structure of an interrupt service routine, or ISR, is approximately the same for all of them. The following sections describe the interrupt structure and how to write a basic interrupt service routine in 8086 assembly language.

Section 7.1. Introduction to 8086 Interrupt Service Routine

The 8086 allows up to 256 vectored interrupts. This means that you can have up to 256 different sources for an interrupt, and the CPU will directly call the service routine for that interrupt without any software processing. The CPU provides a 256-entry interrupt vector table beginning at address 0:0 in memory. This is a 1K table containing 256 4-byte entries. Each entry in this table contains a segmented address that points at the interrupt service routine in memory. Generally, we refer to interrupts by their interrupt number, so the address for INT 0 is at memory location 0:0, the address for INT 1 is at 0:4, and so on; in general, the entry for INT n is at offset 4n.
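Locating a table entry is a one-line calculation. The following C sketch computes the offset of an entry inside segment 0 and, using the usual 8086 rule that a physical address is segment * 16 + offset, the corresponding physical address. Both function names are made up for this example.

```c
#include <stdint.h>

/* Offset (within segment 0) of the vector table entry for interrupt n. */
uint16_t vector_offset(uint8_t n)
{
    return (uint16_t)(n * 4u);
}

/* 20-bit physical address formed from a segment:offset pair. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

The entry for INT 23h therefore starts at 0:008Ch, and the last entry, for INT 0FFh, at 0:03FCh, which ends exactly at the 1K boundary.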

When an interrupt occurs, the CPU does the following:

1. The CPU pushes the flags register onto the stack.
2. The CPU pushes a far return address (CS:IP) onto the stack, segment value first.
3. The CPU determines the interrupt number and fetches the four-byte interrupt vector from the corresponding address.
4. The CPU transfers control to the routine specified by the interrupt vector table entry.

After these steps are complete, the interrupt service routine takes control. When the interrupt service routine wants to return control, it must execute an IRET (interrupt return) instruction. IRET, like a far RET instruction, pops the far return address off the stack, and then it also pops the flags. Note that executing a far return is insufficient, since that would leave the flags on the stack.

Furthermore, upon entry into the interrupt service routine, the CPU will disable further hardware interrupts by clearing the interrupt flag.
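The entry sequence and IRET mirror each other, which a toy model makes visible. The C sketch below keeps a tiny stack of 16-bit words and the three registers involved; cpu_t, enter_interrupt and iret are hypothetical names, and everything else the real CPU does (fetching the vector from memory, clearing the trap flag) is left out. IF_MASK is bit 9 of the flags register, the interrupt-enable flag.

```c
#include <stdint.h>

#define IF_MASK 0x0200u          /* interrupt-enable flag, bit 9 of FLAGS */

typedef struct {
    uint16_t flags, cs, ip;
    uint16_t stack[16];          /* toy stack of 16-bit words */
    int depth;                   /* number of words currently pushed */
} cpu_t;

static void push(cpu_t *c, uint16_t w) { c->stack[c->depth++] = w; }
static uint16_t pop(cpu_t *c) { return c->stack[--c->depth]; }

/* Steps 1 to 4: save FLAGS, CS and IP, disable interrupts, enter the handler. */
void enter_interrupt(cpu_t *c, uint16_t isr_cs, uint16_t isr_ip)
{
    push(c, c->flags);
    push(c, c->cs);
    push(c, c->ip);
    c->flags &= (uint16_t)~IF_MASK;
    c->cs = isr_cs;
    c->ip = isr_ip;
}

/* IRET: pop IP, CS and FLAGS, in that order. */
void iret(cpu_t *c)
{
    c->ip = pop(c);
    c->cs = pop(c);
    c->flags = pop(c);
}
```

After enter_interrupt three words are on the stack; a far RET would remove only two of them, which is why the ISR must end with IRET.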

Section 7.2. Writing the Interrupt Service Routine

Interrupt service routines (ISRs) are written like almost any other assembly language procedure, except that they return with an IRET instruction rather than RET. Although the distance of the ISR procedure, i.e. NEAR or FAR, is usually of no significance, you should make all ISRs FAR procedures. This will make programming easier if you decide to call an ISR directly rather than using the normal interrupt handling mechanism.

ISRs have a very special restriction: they must preserve the state of the CPU. In particular, an ISR must preserve all registers it modifies. For example, consider the following ISR:

SimpleISR PROC FAR
        MOV AX, 0
        IRET
SimpleISR ENDP

Figure 57 Simple ISR

Suppose you were executing the following code segment:

        MOV AX, 5
        ADD AX, 2

If the interrupt occurred between the two instructions, the interrupt service routine would set the AX register to zero, so AX would hold zero rather than five when the ADD executes, and the result would be 2 rather than 7. Worse yet, interrupts are generally asynchronous, meaning that they can occur at any time. Thus we cannot know what value AX will hold at any given moment. Bugs in an ISR are very difficult to find, because such bugs often affect the execution of unrelated code. The solution to this problem, of course, is to make sure you preserve all registers you use in the ISR. One way to do so is to push every register the routine modifies on entry and pop it again before the IRET.

Writing the ISR is only the first step in implementing an interrupt handler. You must also initialize the interrupt vector table entry with the address of your ISR. There are two common ways to accomplish this: store the address directly in the interrupt vector table, or call DOS and let DOS do the job for you.

Storing the address yourself is an easy task. All you need to do is load a segment register with zero and store the four-byte address at the appropriate offset within that segment. The following code sequence initializes the entry for interrupt 255 with the address of the SimpleISR routine in the above example.

        MOV AX, 0
        MOV ES, AX
        CLI
        MOV WORD PTR ES:[0FFh*4], OFFSET SimpleISR
        MOV WORD PTR ES:[0FFh*4+2], CS
        STI

Figure 58 Example of storing the address by yourself

The CLI instruction tells the CPU to ignore maskable interrupts from this point on, so that no interrupt can arrive while the entry is only half written, and the STI instruction allows interrupts again. Neither instruction takes an operand; the syntax is

        CLI
        STI

Perhaps a better way to initialize an interrupt vector is to use the DOS Set Interrupt Vector call. Calling the DOS interrupt (INT 21h) with AH equal to 25h provides this function. The call expects an interrupt number in the AL register and the address of the interrupt service routine in DS:DX, where DS holds the segment and DX holds the offset. The call to DOS that installs SimpleISR in this way, here for interrupt 23h, is

        MOV AX, 2523h           ; AH = 25h, AL = 23h
        MOV DX, CS              ; assume SimpleISR in code segment
        MOV DS, DX
        LEA DX, SimpleISR
        INT 21h
        MOV AX, DATA_SEG        ; restore data segment in DS
        MOV DS, AX

Figure 59 Example of storing the address by dos interrupt

Although this code sequence is a little more complex than putting the data directly into the interrupt vector table, it is safer. Many programs monitor changes made to the interrupt vector table through DOS. If you call DOS to change an interrupt vector table entry, those programs will become aware of your changes.

Generally, it is a very bad idea to patch an interrupt vector and not restore the original entry when your program terminates. Well-behaved programs always save the previous value of an interrupt vector table entry and restore this value before termination. The following code sequences demonstrate how to do this. At the beginning of your program, you should save the old interrupt address. The DOS interrupt with service number 35h in AH does this: it returns the address of the handler for the interrupt number given in AL in ES:BX, as segment:offset. For example,

        MOV AX, 3523h
        INT 21h
        MOV [OLDOFFSET], BX
        MOV [OLDSEGMENT], ES

Figure 60 Load the address of the interrupt

The above example shows how to save the interrupt address (segment:offset) of INT 23h (the Ctrl-C interrupt). With AH = 35h and AL = 23h, the segment address is returned in ES and the offset address in BX. We save both in the two variables.

Before exiting the program, you should restore the interrupt vector entry, for example:

        MOV DS, CS: [OLDSEGMENT]
        MOV DX, CS: [OLDOFFSET]
        MOV AX, 2523h
        INT 21h

Figure 61 Example of restoring the address by DOS interrupt

An ISR installed this way responds only to the interrupt whose vector it replaces. For example, INT 23h is raised when the Ctrl-C key is pressed. Once your ISR is installed for INT 23h, it is called whenever Ctrl-C is pressed, exactly when the original handler would have been called. The following table lists other interrupts that can be redirected to your ISR.

Interrupt No.   Interrupt raised when...
00h             Divide overflow error
09h             Keyboard input
08h / 1Ch       Every 1/18.2 sec (0.055 sec, 5/91 sec)
23h             Ctrl-C key is pressed
70h             Every 940ms

Table 5 Interrupt service code

Section 7.3. Chaining and Reentrance Problem

Chaining ISR

Interrupt service routines come in two basic varieties: those that need exclusive access to an interrupt vector and those that must share an interrupt vector with several other ISRs. The timer, real-time clock, and keyboard ISRs generally fall into the latter category. It is not at all uncommon to find several ISRs in memory sharing each of these interrupts.

Sharing an interrupt vector is rather easy. All an ISR needs to do to share an interrupt vector is to save the old interrupt vector when installing the ISR and then call the original ISR before or after it does its own processing. If you have saved the address of the original ISR in the word variables OLDOFFSET and OLDSEGMENT, you can jump directly to the original ISR rather than calling it. In such cases, you should put the variables directly in the code segment, offset first, as shown below:

OLDOFFSET  DW ?
OLDSEGMENT DW ?
NEWISR PROC FAR
        :
        :
        :
        JMP DWORD PTR CS: [OLDOFFSET]
NEWISR ENDP

Figure 62 Example of chaining the ISRs

This code passes control along to the original ISR; the far jump reads the offset and the segment from the two consecutive words. The variables must be defined in the code segment if you use this technique, because CS is the only segment register whose value is known when the ISR runs.
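The save, replace and pass-along pattern is the same one used with function pointers in C. The sketch below imitates a single vector table entry with a pointer variable; it is an analogy only, and every name in it (vector, old_handler, install, restore and the two handlers) is invented for the illustration.

```c
typedef void (*isr_t)(void);

isr_t vector;            /* stands in for one interrupt vector table entry */
isr_t old_handler;       /* plays the role of the saved segment:offset pair */
int original_calls = 0;
int new_calls = 0;

void original_isr(void)
{
    original_calls++;
}

void new_isr(void)
{
    new_calls++;                 /* our own processing */
    if (old_handler != 0)
        old_handler();           /* then pass control along the chain */
}

/* Save the old entry first, then point the vector at the new handler. */
void install(isr_t handler)
{
    old_handler = vector;
    vector = handler;
}

/* Put the saved entry back, as a well-behaved program does before it ends. */
void restore(void)
{
    vector = old_handler;
}
```

After install(new_isr), one trip through the vector runs both handlers; after restore() only the original handler runs again.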

Reentrancy problem

A minor problem arises when developing ISRs: what happens if you enable interrupts while in an ISR and a second interrupt from the same device comes along? This would interrupt the ISR and then reenter the ISR from the beginning. Many applications do not behave properly under these conditions. The easiest way to prevent such an occurrence is to turn off interrupts while executing code in a critical section, i.e. by using the CLI instruction.

Another problem concerns the DOS interrupt. The internal data structures of DOS are designed in a way that makes DOS non-reentrant, because DOS is a single-user, single-program system. It is not possible to invoke a DOS interrupt while another DOS call is still active. Therefore, you can use INT 21h in an ISR only if you know that INT 21h is not being executed at that moment. In general, you should not call INT 21h in an ISR.
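The effect of enabling interrupts inside an ISR can be imitated in C with a flag that stands for the interrupt flag. This is a deliberately simplified model with hypothetical names: the "device" interrupts again in the middle of every ISR, and an interrupt that arrives while the flag is clear is merely counted as pending.

```c
int if_flag = 1;         /* models the interrupt flag: 1 = interrupts enabled */
int nesting = 0;         /* copies of the ISR that are active right now */
int max_nesting = 0;     /* deepest nesting observed so far */
int pending = 0;         /* interrupts held back because the flag was clear */

void isr(int sti_inside);

/* The device raises an interrupt: serviced only while interrupts are enabled. */
void raise_interrupt(int sti_inside)
{
    if (if_flag)
        isr(sti_inside);
    else
        pending++;
}

void isr(int sti_inside)
{
    if_flag = 0;                 /* the CPU clears the flag on entry */
    nesting++;
    if (nesting > max_nesting)
        max_nesting = nesting;
    if (sti_inside)
        if_flag = 1;             /* the ISR executes STI */
    raise_interrupt(0);          /* the same device interrupts again */
    nesting--;
    if_flag = 1;                 /* IRET restores the saved flags */
}
```

With interrupts left disabled the second interrupt simply waits (nesting never exceeds 1); with STI inside the ISR the routine is reentered and the nesting depth reaches 2.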

Section 8. Examples

Example 1

Question: Write a program to satisfy the following requirements:

1. Read until 10 small alphabetic letters have been entered.
2. Convert the 10 small letters to capital letters and output these letters.

Algorithm:

1. Display the input message and set the counter SI.
2. Read a character.
3. Check whether it is a small alphabetic letter.
4. If it is, change that letter to a capital letter by adding -20h to its ASCII code. Save the ASCII code in the array MSG [SI] (see footnote 12).
5. Check whether the loop has ended (10 letters). If the loop is not finished, or the input is incorrect, repeat from step 2.
6. Display the capital letter list.

Program:

;*** BEGIN OF THE PROGRAM ***
;*** DEFINE THE VARIABLES ***
CODE_SEG SEGMENT
BEGIN PROC FAR
        ASSUME CS:CODE_SEG,DS:DATA_SEG,SS:STACK_SEG
START:
        MOV AX,DATA_SEG
        MOV DS,AX
        MOV SI,0                ; counter: letters stored so far
;*** DISPLAY THE MESSAGE 1 ***
        MOV AH,09h
        LEA DX,MESSAGE1
        INT 21h
;*** READ THE CHARACTERS ***
READ_CH:
        MOV AH,01h
        INT 21h
;*** JUMP IF THE CHARACTERS < 'a' ***
        CMP AL,61h
        JL READ_CH
;*** CORRECT IF THE CHARACTERS <= 'z' ***
        CMP AL,7Bh
        JL CORRECT
        JMP READ_CH
;*** CHANGE THE SMALL LETTER TO CAPITAL LETTER ***
CORRECT:
        ADD AL,-20h
;*** CHECK THE CHARACTERS READ ***
        MOV [MSG+SI],AL
        ADD SI,1
        CMP SI,10
        JZ WRITE_MSG
        JMP READ_CH
;*** ECHO THE STRING READ ***
WRITE_MSG:
        MOV AH,09h
        LEA DX,MESSAGE2
        INT 21h
        LEA DX,MSG
        INT 21h
;*** END OF PROGRAM ***
END_SEG:
        MOV AH,4Ch
        INT 21h
BEGIN ENDP
CODE_SEG ENDS
;*** DATA SEGMENT ***
DATA_SEG SEGMENT
MESSAGE1 DB "Enter the 10 characters :$"
MESSAGE2 DB " ;Corresponding CORRECT characters in capital letter :$"
MSG DB 10 DUP(?)
END_MSG DB '$'
DATA_SEG ENDS
;*** STACK SEGMENT ***
STACK_SEG SEGMENT STACK
        DW 10 DUP (?)
STACK_SEG ENDS
        END BEGIN

Figure 63 Sample Program 1
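For comparison, the same algorithm can be written in a few lines of C. This is a sketch of the logic of Sample Program 1 rather than a translation of it: the input comes from a string instead of the keyboard, and the names to_capital and collect_capitals are chosen freely.

```c
#include <stddef.h>

/* Returns the capital letter for a small letter 'a'..'z' (61h..7Ah),
   or 0 if c is not a small letter. */
char to_capital(char c)
{
    if (c < 0x61 || c > 0x7A)
        return 0;
    return (char)(c - 0x20);     /* same effect as ADD AL, -20h */
}

/* Stores the capitals of the first `want` small letters found in input.
   out must have room for want + 1 characters. Returns how many were stored. */
size_t collect_capitals(const char *input, char *out, size_t want)
{
    size_t n = 0;

    while (*input != '\0' && n < want) {
        char up = to_capital(*input++);
        if (up != 0)             /* anything that is not a small letter is skipped */
            out[n++] = up;
    }
    out[n] = '\0';
    return n;
}
```

Characters that are not small letters are ignored, just as the assembly version jumps back to READ_CH for them.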

12 In an assembly program, an array is implemented with [MSG+SI] (like MSG [SI] in C). It acts as a list.

Example 2

Question: Write an assembly program to do the following:

1. Read from the keyboard a positive decimal integer of at most 4 digits (0-9999). Convert the characters to an integer.
2. Using the stack, display the integer as a string.

Algorithm:

1. Read the characters. Check whether the user has entered the return key (ASCII code 0Dh) or a digit.
2. If the input is a digit, build the number by multiplication, e.g. 1234 = 1*1000 + 2*100 + 3*10 + 4. The digits arrive most significant first, so multiply the running sum by 10 before adding each new digit.
3. Check whether 4 digits have been read. If not, repeat step 1.
4. By a similar argument to step 2, get the least significant digit from the remainder of a division by 10.
5. Convert it to a character and push it onto the stack.
6. Repeat step 5 until all digits are pushed onto the stack.
7. Pop the elements from the stack and output them to the screen.

Program:

;*** BEGIN OF THE PROGRAM ***
;*** PART(I): CHANGE CHARACTERS TO DECIMAL INTEGERS ***
;*** DEFINE THE VARIABLES ***
CODE_SEG SEGMENT
BEGIN PROC FAR
        ASSUME CS:CODE_SEG,DS:DATA_SEG,SS:STACK_SEG
START:
        MOV AX,DATA_SEG
        MOV DS,AX
;*** DISPLAY THE MESSAGE 1 ***
        MOV AH,09h
        LEA DX,MESSAGE1
        INT 21h
;*** READ THE CHARACTERS ***
READ_CH:
        MOV AH,01h
        INT 21h
;*** CHECK THE "RETURN" INPUT ***
        CMP AL,0Dh
        JZ PARTII
;*** JUMP IF THE CHARACTERS < '0' ***
        CMP AL,30h
        JL READ_CH
;*** JUMP IF THE CHARACTERS > '9' ***
        CMP AL,39h
        JG READ_CH
;*** CHANGE THE CHARACTERS TO DIGITS ***
        ADD AL,-30h
        MOV ADDER,AL
        MOV AX,SUM
        MOV SI,10
        MUL SI
        ADD AL,ADDER
        ADC AH,0                ; carry from AL into AH, so sums above 255 stay correct
        MOV SUM,AX
        ADD COUNT,1
;*** CHECK THE 4 INPUT ***
        CMP COUNT,4
        JZ PARTII
        JMP READ_CH
;*** END OF PART (I) ***
;*** PART(II):CHANGE DECIMAL INTEGER TO CHARACTERS ***
;*** DISPLAY THE OUTPUT SIGNAL ***
PARTII:
        MOV AH,02h
        MOV DX,000Dh
        INT 21h
        MOV DX,000Ah
        INT 21h
        MOV AH,09h
        LEA DX,MESSAGE2
        INT 21h
        MOV COUNT,0
;*** CHANGE DECIMAL TO CHAR. ***
CONVERT:
        MOV AX,SUM
        MOV DX,0
        MOV SI,10
        DIV SI
        PUSH DX
        MOV SUM,AX
        ADD COUNT,1
        CMP AX,0
        JG CONVERT
;*** ECHO THE STRING ***
WRITE_MSG:
        POP DX
        ADD DX,30h
        MOV AH,02h
        INT 21h
        ADD COUNT,-1
        CMP COUNT,0
        JG WRITE_MSG
;*** END OF PROGRAM ***
END_SEG:
        MOV AH,4Ch
        INT 21h
BEGIN ENDP
CODE_SEG ENDS
;*** STACK SEGMENT ***
STACK_SEG SEGMENT STACK
        DW 40 DUP(?)
STACK_SEG ENDS
;*** DATA SEGMENT ***
DATA_SEG SEGMENT
MESSAGE1 DB "Enter digits (max 4):$"
MESSAGE2 DB "The output is:$"
SUM DW 0
COUNT DB 0
ADDER DB 0
DATA_SEG ENDS
        END BEGIN

Figure 64 Sample Program 2
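The two halves of Sample Program 2 are easier to follow in C. The sketch below is an illustration under the same assumptions as the program (at most 4 digits, non-digits ignored, ENTER ends the input), with a small array standing in for the stack; the function names are hypothetical.

```c
#include <stddef.h>

/* Part I: sum = sum * 10 + digit, most significant digit first.
   Stops at ENTER (0Dh), at the end of the string, or after 4 digits. */
unsigned parse_digits(const char *s)
{
    unsigned sum = 0;
    int count = 0;

    for (; *s != '\0' && *s != '\r' && count < 4; s++) {
        if (*s < '0' || *s > '9')
            continue;                    /* ignore anything that is not a digit */
        sum = sum * 10 + (unsigned)(*s - '0');
        count++;
    }
    return sum;
}

/* Part II: push the remainders of repeated division by 10, then pop them,
   which yields the digits in the right order. out needs room for 21 chars. */
size_t to_decimal_string(unsigned value, char *out)
{
    char stack[20];
    size_t depth = 0, n = 0;

    do {
        stack[depth++] = (char)('0' + value % 10);   /* least significant digit */
        value /= 10;
    } while (value > 0);
    while (depth > 0)
        out[n++] = stack[--depth];                   /* pop: most significant first */
    out[n] = '\0';
    return n;
}
```

The division produces the digits in reverse order, which is exactly why the program pushes them and pops them again before printing.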

Example 3

Question: Write an assembly program to do the quick sort.

Algorithm: Please refer to any reference on recursive calling (C programming textbook, P. 245)

Program for C Version: (Copy from C programming textbook, P. 245)

#include<stdio.h>
#define SWAP(x,y) {int z=(x);(x)=(y);(y)=(z);}
void rquick(int a[], int lo, int hi);
main()
{
        int a[100], n;
        :
        :
        rquick(a,0,n-1);
        :
        :
}
void rquick(int a[], int lo, int hi)
{
        int low, high, pivot;
        low=lo;
        high=hi;
        if (low<high)
        {
                pivot=a[high];
                do {
                        while (low<high && a[low]<=pivot)
                                low++;
                        while (low<high && a[high]>=pivot)
                                high--;
                        if (low<high)
                                SWAP(a[low],a[high]);
                } while (low<high);
                SWAP(a[low],a[hi]);
                rquick(a,lo,low-1);
                rquick(a,low+1,hi);
        }
}

Figure 65 Quick Sort program for C Version


Program for Assembly Version:

Points to note:

1. The main program is included in the program.
2. Only two elements, lo & hi, are pushed onto the stack. It is not necessary to push the address of the array, since the array is defined within the data segment.
3. Only the numbers 0 to 9 can be printed on the screen. For numbers greater than 9 the program still works, but the output shown will not be correct. If you use numbers larger than 9, inspect the contents of the data segment after sorting.

;*** BEGIN OF THE PROGRAM ***
CODE_SEG SEGMENT PARA 'CODE'
BEGIN PROC NEAR
        ASSUME CS:CODE_SEG,DS:DATA_SEG,SS:STACK_SEG
START:
        MOV AX,DATA_SEG
        MOV DS,AX
;*** PUSH THE PARAMETERS ***
        MOV AX, 0
        PUSH AX
        MOV AX, 4
        PUSH AX
;*** PASS THE CONTROL ***
        CALL SORT
;*** PRINT THE OUTPUT ***
        MOV SI,0
PRINTING:
        MOV AH,02h
        LEA BX,A
        ADD BX,SI
        ADD BX,SI
        MOV DX,[BX]
        INC SI
        ADD DX,30h
        INT 21h
        CMP SI,5
        JGE END_SEG
        MOV DX,2Ch
        INT 21h
        JMP PRINTING
;*** END OF PROGRAM ***
END_SEG:
        MOV AH,4Ch
        INT 21h
BEGIN ENDP
;*** PROCEDURE SORT ***
SORT PROC NEAR
        PUSH BP
        MOV BP, SP
        SUB SP, 6
;*** INDEX: LOW = [BP-2], HIGH = [BP-4], PIVOT = [BP-6] (local var); LO = [BP+6], HI = [BP+4] ***
;*** LOW=LO ***
        MOV AX, [BP+6]
        MOV [BP-2], AX
;*** HIGH=HI ***
        MOV BX, [BP+4]
        MOV [BP-4], BX
;*** IF (LOW<HIGH) ***
        CMP AX, BX
        JGE ELOOP1
;*** COMPUTE ADDRESS OF A[LOW], PUT IN BX ***
        LEA BX, A
        ADD BX, [BP-2]
        ADD BX, [BP-2]
;*** COMPUTE ADDRESS OF A[HIGH], PUT IN SI ***
        LEA SI, A
        ADD SI, [BP-4]
        ADD SI, [BP-4]
;*** PIVOT = A[HIGH] ***
        MOV AX, [SI]
        MOV [BP-6], AX
;*** WHILE (LOW<HIGH..... ***
WHILE1:
        MOV AX, [BP-4]
        CMP [BP-2], AX
        JGE WHILE2
;*** && A[LOW]<=PIVOT ***
        MOV AX, [BP-6]
        CMP [BX], AX
        JG WHILE2
        JMP NEXTP
ELOOP1:
        JMP ELOOP
;*** LOW++ ***
NEXTP:
        MOV AX, [BP-2]
        ADD AX, 1
        MOV [BP-2],AX
        ADD BX, 2
        JMP WHILE1
;*** WHILE (HIGH>LOW..... ***
WHILE2:
        MOV AX, [BP-4]
        CMP [BP-2], AX
        JGE NLOOP
;*** && A[HIGH]>=PIVOT ***
        MOV AX, [BP-6]
        CMP [SI], AX
        JL NLOOP
;*** HIGH-- ***
        MOV AX, [BP-4]
        SUB AX, 1
        MOV [BP-4], AX
        SUB SI, 2
        JMP WHILE2
;*** IF LOW<HIGH ***
NLOOP:
        MOV AX, [BP-4]
        CMP [BP-2], AX
        JGE CHECKING
;*** SWAP(A[LOW],A[HIGH]) ***
        PUSH AX
        MOV AX, [BX]
        XCHG AX, [SI]
        MOV [BX], AX
        POP AX
;*** WHILE (LOW<HIGH) ***
CHECKING:
        MOV AX, [BP-4]
        CMP [BP-2], AX
        JL WHILE1
;*** COMPUTE ADDRESS OF A[HI], PUT IN CX ***
        LEA CX, A
        ADD CX, [BP+4]
        ADD CX, [BP+4]
;*** SWAP(A[LOW],A[HI]) ***
        MOV AX, [BX]
        PUSH SI
        MOV SI,CX
        XCHG AX, [SI]
        POP SI
        MOV [BX], AX
;*** call Recursive 1 ***
        MOV AX, [BP-2]
        DEC AX
        PUSH WORD PTR [BP+6]
        PUSH AX
        CALL SORT
;*** call Recursive 2 ***
        MOV AX, [BP-2]
        INC AX
        PUSH AX
        PUSH WORD PTR [BP+4]
        CALL SORT
;*** EXIT and RET ***
ELOOP:
        ADD SP, 6
        POP BP
        RET 4
SORT ENDP
CODE_SEG ENDS
;*** STACK SEGMENT ***
STACK_SEG SEGMENT STACK 'STACK'
        DW 1024 DUP(?)
STACK_SEG ENDS
;*** DATA SEGMENT ***
DATA_SEG SEGMENT PARA 'DATA'
A DW 5,2,3,4,1
DATA_SEG ENDS
        END BEGIN

Figure 66 Quick Sort program for Assembly Version

Example 4

Question: Write a program which displays another message when an interrupt is raised.

Algorithm:

1. Store the old ISR address, and define the new one.
2. Display the endless message.
3. Define the new ISR. Execute the ISR by pressing the Ctrl-C key.
4. Accept "Y" to terminate the endless message.
5. Restore the old ISR.

Program:

;********** MACRO DEFINITION **********
PRINT MACRO MSG
        MOV AH, 09h
        LEA DX, MSG
        INT 21h
        ENDM
;********** BEGIN OF THE PROGRAM **********
;********** DEFINE THE VARIABLES **********
CODE_SEG SEGMENT PARA 'CODE'
        ASSUME CS:CODE_SEG,DS:DATA_SEG,SS:STACK_SEG
OLD_SEGMENT DW ?                ; kept in the code segment, outside the
OLD_OFFSET  DW ?                ; instruction stream so they are never executed
MAIN PROC FAR
        MOV AX,DATA_SEG
        MOV DS,AX
;********** STORE THE ISR **********
        MOV AH, 35h
        MOV AL, 23h
        INT 21h
        MOV [OLD_SEGMENT], ES
        MOV [OLD_OFFSET], BX
;********** SET UP THE ISR **********
        PUSH DS
        MOV AX, CS
        MOV DS, AX
        LEA DX, CTL_C
        MOV AL, 23h
        MOV AH, 25h
        INT 21h
        POP DS
;********** PRINT MESSAGE **********
ABC:
        PRINT MSG1
        JMP ABC
;********** RESTORE THE ISR **********
RESTORE:
        MOV AH, 25h
        MOV AL, 23h
        MOV DX, CS:[OLD_OFFSET]
        MOV DS, CS:[OLD_SEGMENT]
        INT 21h
;********** END OF PROGRAM **********
END_SEG:
        MOV AH,4Ch
        INT 21h
MAIN ENDP
;********** ISR **********
CTL_C PROC FAR
        PUSH AX                 ; preserve the registers the ISR modifies
        PUSH DX
        PRINT MSG2
        MOV AH, 01h
        INT 21h
        CMP AL, 59h
        JE RESTORE              ; "Y": restore the old ISR, then terminate
        POP DX
        POP AX
        IRET
CTL_C ENDP
CODE_SEG ENDS
;********** STACK SEGMENT **********
STACK_SEG SEGMENT STACK 'STACK'
        DB 1024 DUP(?)
STACK_SEG ENDS
;********** DATA SEGMENT **********
DATA_SEG SEGMENT PARA 'DATA'
MSG1 DB 0Dh,0Ah,"HELLO TO EVERYBODY!$"
MSG2 DB "QUIT NOW (Y/N)?$"
DATA_SEG ENDS
        END MAIN

Figure 67 ISR Example

You can try to run the above four programs.

Section 9. Appendix 1: How to write and assemble my assembly program?

Which editor can I use to write my program?

Any editor, e.g. the DOS editor, the C editor, etc.

In the PC Lab or at the DOS prompt of your own computer (see footnote 13):

Without Debug (no error in your program):

1. Enter the DOS prompt (if in Win95).
2. Go to the A drive, which contains your assembly program, or to your own directory on your own computer.
3. Type "TASM <filename.ASM>" (see footnote 14).
4. Type "TLINK <filename.OBJ>".
5. Execute your program by typing "<filename.EXE>".

With Debug:

1. Enter the DOS prompt (if in Win95).
2. Go to the A drive, which contains your assembly program, or to your own directory on your own computer. Add one more path: "C:\TASM\BIN".
3. Type "TASM /zi /la <filename.ASM>". Check how many errors you made.
4. Type "TLINK /v <filename.OBJ>".
5. Type "TD <filename.EXE>" to enter the assembly debugger. The debugger is similar to the C debugger at the DOS prompt. You can use it to debug your program by inspecting variables, registers or memory as you like.

With linking of C program:

Replace the two commands (TASM and TLINK) by:

1. Type "TASM /zi /la /mx /o <filename.ASM>".
2. Type "TCC /M -I<include-file directory> -L<libraries directory> <filename.c> <filename.OBJ>".

Section 10. Appendix 2: Commands and syntax of Assembly language

Segment Definition:

<Segment Name> segment (<align>) (<combine>) (<class>)
        <code>
<Segment Name> ends

With:

ASSUME CS:<code segment name>, DS:<data segment name>, SS:<stack segment name>

Program Ending:

END <entry point>

Footnotes to Appendix 1:
13. Use the 16-bit assembler. TASM32 will not work.
14. You can omit the extension (shown in bold in the original).

Data Segment:

Data Definition:

<variable name> DB | DW <value>
<arrayname> DB | DW <size> dup (<element>)
<stringname> DB '<string>$'
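
As an illustration of the three data definition forms above, here is a minimal sketch of a complete program in the same TASM style used throughout this tutorial. All segment and variable names (COUNT, TOTAL, BUFFER, TABLE, GREET) are made up for the example.

```asm
; Sketch only: all names below are illustrative assumptions.
STACK_SEG SEGMENT STACK 'STACK'
        DW 128 DUP(?)
STACK_SEG ENDS

DATA_SEG SEGMENT PARA 'DATA'
COUNT   DB 5                    ; one byte with initial value 5
TOTAL   DW 1234h                ; one word (two bytes)
BUFFER  DB 16 DUP(0)            ; array of 16 bytes, each set to 0
TABLE   DW 8 DUP(?)             ; array of 8 uninitialised words
GREET   DB 'HELLO$'             ; '$' ends the string for function 09h
DATA_SEG ENDS

CODE_SEG SEGMENT PARA 'CODE'
        ASSUME CS:CODE_SEG, DS:DATA_SEG, SS:STACK_SEG
MAIN PROC FAR
        MOV AX, DATA_SEG        ; make DS point at the data segment
        MOV DS, AX
        MOV AH, 09h             ; DOS function 09h: print string at DS:DX
        LEA DX, GREET
        INT 21h
        MOV AH, 4Ch             ; DOS function 4Ch: terminate program
        INT 21h
MAIN ENDP
CODE_SEG ENDS
        END MAIN
```

Only GREET is actually used at run time; the other definitions are there to show how each form reserves storage.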

Code Segment:

Label Definition:

<label name> :

Procedure Definition:

<procedure name> PROC {FAR | NEAR}
    :
    :       ;***your (main/ sub) program here***

<procedure name> ENDP

Data Movement:

MOV <destination>, <register | immediate>
MOV <destination>, <DISP | [BX | BP] | [SI | DI] >
XCHG <operand1>, <operand2>
LEA <destination>, <source>

Arithmetic Operation:

ADD {destination} , {source}
SUB {destination} , {source}
ADC {destination} , {source}
SBB {destination} , {source}
MUL {source}
IMUL {source}
DIV {source}
IDIV {<register> | <memory>}
NEG {<memory> | <register>}
INC {<memory> | <register>}
DEC {<memory> | <register>}

Program Control Flow:

JMP {<address> | <label> | <register>}
<type> PTR <expression>
CALL <label>
RET <immediate>
CMP {destination}, {source} ; destination - source (set flags)
JXX {<address> | <label>}
LOOP <label>
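
The following sketch shows how CMP, a conditional jump and LOOP typically work together. It counts how many bytes of a small array are greater than 10. The names VALUES, NUM_VALUES and OVER10 are assumptions made for this example only.

```asm
; Sketch only: VALUES, NUM_VALUES and OVER10 are illustrative names.
STACK_SEG SEGMENT STACK 'STACK'
        DW 64 DUP(?)
STACK_SEG ENDS

DATA_SEG SEGMENT PARA 'DATA'
VALUES      DB 3, 15, 8, 22, 10
NUM_VALUES  EQU 5               ; number of elements in VALUES
OVER10      DB 0                ; result: how many elements exceed 10
DATA_SEG ENDS

CODE_SEG SEGMENT PARA 'CODE'
        ASSUME CS:CODE_SEG, DS:DATA_SEG, SS:STACK_SEG
MAIN PROC FAR
        MOV AX, DATA_SEG
        MOV DS, AX
        LEA SI, VALUES          ; SI walks through the array
        MOV CX, NUM_VALUES      ; LOOP uses CX as its counter
NEXT:   CMP BYTE PTR [SI], 10   ; computes element - 10 and sets flags
        JBE SKIP                ; unsigned "below or equal": not counted
        INC OVER10
SKIP:   INC SI                  ; advance to the next byte
        LOOP NEXT               ; CX = CX - 1, jump to NEXT while CX <> 0
        MOV AH, 4Ch             ; terminate; OVER10 now holds 2 (15 and 22)
        INT 21h
MAIN ENDP
CODE_SEG ENDS
        END MAIN
```

JBE treats the values as unsigned; for signed data the corresponding jump would be JLE.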

Logic Operation:

AND {destination}, {source}
OR {destination}, {source}
XOR {destination}, {source}
NOT {destination}
<expression> <relational operator> <expression>

Rotation and Shifting:

SHL {<register> | <memory>}, {1 | CL}
SHR {<register> | <memory>}, {1 | CL}
SAL {<register> | <memory>}, {1 | CL}
SAR {<register> | <memory>}, {1 | CL}
ROL {<register> | <memory>}, {1 | CL}
ROR {<register> | <memory>}, {1 | CL}
RCL {<register> | <memory>}, {1 | CL}
RCR {<register> | <memory>}, {1 | CL}
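
A short sketch of the two count forms (1 or CL) listed above. The register values are chosen only for illustration, and the program produces no output; it is meant to be stepped through in the debugger.

```asm
; Sketch only: values are arbitrary, inspect the registers in TD.
STACK_SEG SEGMENT STACK 'STACK'
        DW 64 DUP(?)
STACK_SEG ENDS

CODE_SEG SEGMENT PARA 'CODE'
        ASSUME CS:CODE_SEG, SS:STACK_SEG
MAIN PROC FAR
        MOV AX, 5
        MOV CL, 3               ; counts other than 1 must be in CL on the 8086
        SHL AX, CL              ; AX = 5 * 8 = 40
        MOV BX, -8
        SAR BX, 1               ; arithmetic shift keeps the sign: BX = -4
        MOV DL, 81h             ; 1000 0001b
        ROL DL, 1               ; bit 7 wraps to bit 0: DL = 03h, CF = 1
        MOV AH, 4Ch             ; terminate program
        INT 21h
MAIN ENDP
CODE_SEG ENDS
        END MAIN
```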

Interrupts:

INT <special code>
IRET
CLI
STI

Stack Segment:

Stack Definition:

DW <size of stack> dup (<initial value>)

Stack Manipulation:

PUSH {<register> | <memory>}
POP {<register> | <memory>}

Macro Definition:

<Macro name> MACRO (arg1,arg2,....)
    :
    :       ;***your macro here***
ENDM

LOCAL (label_1,label_2,....)    ; if used, must immediately follow the MACRO line
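
The sketch below shows why LOCAL matters: a macro that contains a label is expanded at every point of use, so without LOCAL the second expansion would redefine the same label. The macro name ABSVAL and its parameter OPND are hypothetical.

```asm
; Sketch only: ABSVAL and OPND are illustrative names.
ABSVAL MACRO OPND               ; replace OPND with its absolute value
        LOCAL DONE              ; give each expansion its own DONE label
        CMP OPND, 0
        JGE DONE                ; already non-negative
        NEG OPND
DONE:
ENDM

STACK_SEG SEGMENT STACK 'STACK'
        DW 64 DUP(?)
STACK_SEG ENDS

CODE_SEG SEGMENT PARA 'CODE'
        ASSUME CS:CODE_SEG, SS:STACK_SEG
MAIN PROC FAR
        MOV AX, -7
        ABSVAL AX               ; AX = 7
        MOV BX, 3
        ABSVAL BX               ; BX stays 3; second expansion, no label clash
        MOV AH, 4Ch             ; terminate program
        INT 21h
MAIN ENDP
CODE_SEG ENDS
        END MAIN
```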

Directives:

REPT <expression>
    <statements>
ENDM

IRP <parameter>, <<arguments>>      ; the argument list itself is written inside < >
    <statements>
ENDM

IRPC <parameter>, <string argument>
    <statements>
ENDM

IFXX <condition>
    <sequence of statements>
<ELSE>                              ;optional
    <sequence of statements>
ENDIF
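
As a rough illustration of the repeat directives, the sketch below uses REPT to emit the same instruction four times and IRP to emit one instruction per listed register. The parameter name RG is an assumption for this example; the assembler expands both blocks at assembly time, so no loop exists in the generated code.

```asm
; Sketch only: RG is an illustrative parameter name.
STACK_SEG SEGMENT STACK 'STACK'
        DW 64 DUP(?)
STACK_SEG ENDS

CODE_SEG SEGMENT PARA 'CODE'
        ASSUME CS:CODE_SEG, SS:STACK_SEG
MAIN PROC FAR
        MOV AX, 1
        REPT 4                  ; expands to four copies of SHL AX, 1
        SHL AX, 1
        ENDM                    ; AX = 16 at run time
        IRP RG, <AX, BX, CX>    ; expands to PUSH AX / PUSH BX / PUSH CX
        PUSH RG
        ENDM
        IRP RG, <CX, BX, AX>    ; pop in reverse order to restore the registers
        POP RG
        ENDM
        MOV AH, 4Ch             ; terminate program
        INT 21h
MAIN ENDP
CODE_SEG ENDS
        END MAIN
```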

Calling Sequences:

EXTRN <subprogram name> : type
PUBLIC <subprogram name>

***** END of Assembly Programming *****

