CHAPTER 2
ASSEMBLER
Design of assembler
Role of Assembler
Source Program -> Assembler -> Object Code
General Design Procedure
In general, the following procedure is used in the design of an assembler:
1. Define the problem statement
2. Define data structures
3. Define the format of data structures
4. Define the algorithm
5. Check for modularity
6. Repeat steps 1 to 5 for all modules.
Forward Reference Problem
In an assembly language program we can use symbols, which are names associated with data or instructions.
A symbol may be referenced before it is defined. This is called a forward reference.
One approach to solving this problem is to make two passes over the source program: the first pass defines the symbols, and the second pass looks up their addresses to generate the code.
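The two-pass idea can be sketched in a few lines of Python. This is only an illustration on a made-up toy format: the `(label, op, operand)` tuples, the function names `pass1` and `pass2`, and the assumption that every statement occupies exactly one location are not taken from the slides.

```python
# Toy illustration (hypothetical format): each statement is (label, op, operand)
# and is assumed to occupy exactly one location.
def pass1(program):
    """Define symbols: record the location of every label."""
    symtab = {}
    for lc, (label, _op, _operand) in enumerate(program):
        if label is not None:
            symtab[label] = lc
    return symtab


def pass2(program, symtab):
    """Replace each symbolic operand by the address found in pass 1."""
    return [(op, symtab[operand] if operand in symtab else operand)
            for _label, op, operand in program]


program = [
    (None, "JMP", "DONE"),   # DONE is used here ...
    (None, "ADD", 5),
    ("DONE", "HALT", None),  # ... but only defined here (forward reference)
]
symtab = pass1(program)
resolved = pass2(program, symtab)
```

Because the symbol table is complete before the second pass starts, the reference to `DONE` in the first statement can be resolved even though its definition comes later.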
Design of two pass assembler
IBM 360/370 System
Elements of Assembly language
• Mnemonic operation codes: These are symbolic names given to the machine instructions. They eliminate the need to memorize the numeric op-codes.
• Pseudo-op: These are instructions to the assembler, acted on during the assembly of the program.
• Machine-op: These are actual machine instructions.
The general format of assembly language statement is:
[label] <op-code> operand (s) ;
Symbols: These are the names associated with data or instructions. These names can be used as operands in the program.
Literal: It is an operand with syntax like ='<value>'. The assembler creates a data area for literals containing the constant values. E.g. =F'10'
Location Counter: Used to hold the address of the instruction currently being assembled.
Meaning of some pseudo-ops
• START: Indicates the start of the source program.
• END: Indicates the end of the source program.
• EQU: Associates a symbol with an address specification.
  <symbol> EQU <address spec>
• USING: Tells the assembler which register is used as the base register and what its contents are.
• DROP: Makes the base register unavailable.
• LTORG: Tells the assembler to place the literals encountered so far at this point, rather than at the end of the program.
• DC: Define constant
• DS: Define storage
Addressing scheme
A 1,250(0,15)
offset = 250, index register = 0, base register = 15
This is an add instruction, which adds a number held in storage to the contents of register 1. The location of the number is calculated as:
location = offset + contents of index register +contents of base register
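As a rough sketch, the calculation can be written as a small Python function. The function name, the `registers` list and the value placed in register 15 are assumptions made for illustration. The sketch also relies on the System/360 convention that naming register 0 as an index or base register means "no register", so it contributes zero.

```python
def effective_address(offset, index_reg, base_reg, registers):
    """location = offset + contents of index register + contents of base register."""
    # On the System/360, naming register 0 as index or base means
    # "no register", so it contributes zero to the sum.
    index = registers[index_reg] if index_reg != 0 else 0
    base = registers[base_reg] if base_reg != 0 else 0
    return offset + index + base


# Hypothetical register contents, chosen only for illustration.
registers = [0] * 16
registers[15] = 1000
location = effective_address(250, 0, 15, registers)  # A 1,250(0,15)
```

With register 15 holding 1000, the operand of `A 1,250(0,15)` is fetched from location 1250.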
5 Instruction Formats
• RR (register-register)
• RX (register-indexed)
• RS (register-storage)
• SI (storage-immediate)
• SS (storage-storage)
Two passes
Pass 1: Defines symbols and literals
• Find the length of machine instructions
• Maintain the location counter
• Remember the values of symbols until pass 2
• Process some pseudo-ops
• Remember literals
Pass 2: Generates the object program
• Look up the values of symbols
• Generate instructions
• Generate data
• Process pseudo-ops
Pass 1 requires the following databases:
• Source program
• Location counter (LC), which stores the location of each instruction
• Machine Operation Table (MOT): This table indicates the symbolic mnemonic for each instruction and its length.
• Pseudo Operation Table (POT): This table indicates the symbolic mnemonic and the action taken for each pseudo-op in pass 1.
• Symbol Table (ST), which stores each label along with its value
• Literal Table (LT), which stores each literal and its corresponding address
• A copy of the input, which will be used by pass 2
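The pass 1 databases can be modelled with ordinary Python containers. The following is a minimal sketch under strong simplifying assumptions: statements are already split into `(label, op, operand)` tuples, the MOT holds only instruction lengths, every constant and literal is a 4-byte fullword, alignment is ignored, and literals are placed after the last statement. The names `pass1`, `symtab`, `littab` and `copy` are illustrative.

```python
# Hypothetical, heavily simplified tables; real MOT/POT entries hold more fields.
MOT = {"L": 4, "A": 4, "ST": 4, "AR": 2}   # mnemonic -> instruction length in bytes
POT = {"START", "END", "DC", "DS"}          # pseudo-ops handled in this sketch


def pass1(lines):
    """Build the symbol table (ST) and literal table (LT) for a list of
    (label, op, operand) statements. Every constant is assumed to be a
    4-byte fullword and alignment is ignored."""
    lc = 0
    symtab, literals, copy = {}, [], []
    for label, op, operand in lines:
        if op == "START":
            lc = int(operand or 0)
        if label:
            symtab[label] = lc               # remember the symbol's value for pass 2
        if isinstance(operand, str) and "=" in operand:
            lit = operand[operand.index("="):]
            if lit not in literals:
                literals.append(lit)         # remember the literal
        copy.append((lc, label, op, operand))  # copy of the input for pass 2
        if op in MOT:
            lc += MOT[op]                    # length comes from the MOT
        elif op in ("DC", "DS"):
            lc += 4
        elif op == "END":
            break
        elif op not in POT:
            raise ValueError(f"unknown op-code: {op}")
    # Literals are placed after the last statement, one fullword each.
    littab = {lit: lc + 4 * i for i, lit in enumerate(literals)}
    return symtab, littab, copy


source = [
    ("PROG", "START", "0"),
    (None, "L", "1,FIVE"),
    (None, "A", "1,=F'10'"),
    (None, "ST", "1,TEMP"),
    ("FIVE", "DC", "F'5'"),
    ("TEMP", "DS", "1F"),
    (None, "END", None),
]
symtab, littab, copy = pass1(source)
```

No code is generated here. Pass 1 only advances the location counter and fills the tables that pass 2 will consult.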
Format of databases
Machine-op Table (MOT)
Pseudo-op Table (POT)
Symbol Table (ST)
Literal Table (LT)
Flowchart
Example
Pass 2 requires the following databases:
• Copy of the source program
• Location counter (LC)
• Machine Operation Table (MOT)
• Pseudo Operation Table (POT)
• Symbol Table (ST) generated by pass 1
• Base Table (BT), which indicates which register is used as the base register and what its contents are
• A work space (INST), which holds each instruction as its various parts
• A work space (PRINTLINE), which produces a printed listing
• A work space (PUNCH CARD), which is used to output the assembled instructions in the format needed by the loader
• An output deck of assembled instructions in the format needed by the loader
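One way to picture how pass 2 might use the Base Table is the sketch below. It assumes the table is a dictionary from register number to promised contents, filled by USING and emptied by DROP. It also assumes a 12-bit displacement field (0 to 4095), as in the System/360 instruction formats. The function name and the tie-breaking rule (smallest displacement wins) are illustrative choices.

```python
def base_displacement(address, base_table):
    """Pick a base register for `address` from the base table.

    base_table maps an available base register number to the contents
    promised by a USING statement. The displacement field is assumed to be
    12 bits (0..4095), as in the System/360 instruction formats.
    """
    best = None
    for reg, contents in base_table.items():
        disp = address - contents
        if 0 <= disp <= 4095 and (best is None or disp < best[1]):
            best = (reg, disp)
    if best is None:
        raise ValueError(f"address {address} is not addressable")
    return best


base_table = {}
base_table[15] = 0          # effect of: USING *,15 at location 0 (hypothetical)
reg, disp = base_displacement(12, base_table)
del base_table[15]          # effect of: DROP 15
```

A symbol whose value is 12 would then be assembled as displacement 12 from base register 15.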
Base Table
Flowchart
Design of one pass assembler
IBM PC
Intel 8088
One pass processing
• Analysis Phase
  • Isolate label, mnemonic opcode and operand field
  • If a label is present, enter (symb, LC contents) in the Symbol Table
  • Perform LC processing
• Synthesis Phase
  • Obtain machine opcode
  • Obtain address from symbol table
Addressing
• A segment-based addressing scheme is used.
• Code segment (CS)
• Data segment (DS)
• Stack segment (SS)
• Extra segment (ES)
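For orientation, the 8088 forms a 20-bit physical address by shifting the 16-bit segment register value left by four bits and adding the 16-bit offset. The sketch below shows only that arithmetic. The function name and the sample values are illustrative.

```python
def physical_address(segment, offset):
    """20-bit physical address formed from 16-bit segment and offset values."""
    return ((segment << 4) + offset) & 0xFFFFF


addr = physical_address(0x1234, 0x0010)
```

The assembler itself works with offsets within a segment. The segment register contents are supplied at run time.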
Assembler directives
1. EQU : It associates a symbol with an address specification.
<symbol> EQU <address spec>
2. ORG : It sets the location counter to the specified address.
ORG <address spec>
3. ASSUME : This directive tells the assembler which segment register contains the segment base.
ASSUME <register> :<segment name>
4. SEGMENT : It indicates the start of a segment.
5. ENDS : It indicates the end of a segment.
Databases required
1. Source program
2. Mnemonic Operation Table (MOT). This table indicates the symbolic mnemonic for each instruction.
3. Symbol Table (ST), which stores each label along with its relevant information.
4. Segment Register Table (SRTAB), which stores information about segment names and segment registers.
5. Forward Reference Table (FRT), which stores information about forward references.
6. Cross Reference Table (CRT), which lists all references to a symbol in ascending order of statement number.
Mnemonic Table (MOT)
Mnemonic op-code (6)    Machine op-code (2)    Alignment/format information (1)    Routine id (4)
binary
JNE    75H    00H    R2
……… …………. ………… ……………
Symbol Table
Segment Register Table Array (SRTAB)
Forward Reference Table (FRT)
Pointer (2)    SRTAB # (1)    Instruction address (2)    Usage code (1)    Source statement # (2)
Cross reference table (CRT)
Pointer to next entry (2) Source statement # (2)
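A cross reference table can be approximated in Python with a dictionary of lists, using a Python list in place of the linked entries shown above. The names `crt`, `note_reference` and the sample symbols are assumptions made for this sketch.

```python
from collections import defaultdict

crt = defaultdict(list)   # symbol -> statement numbers that reference it


def note_reference(symbol, stmt_no):
    """Record that statement `stmt_no` refers to `symbol`."""
    crt[symbol].append(stmt_no)


# Hypothetical references, deliberately given out of order.
for stmt_no, symbol in [(3, "LOOP"), (1, "COUNT"), (7, "LOOP"), (5, "COUNT")]:
    note_reference(symbol, stmt_no)

# Cross reference listing: symbols alphabetically, statements ascending.
listing = {sym: sorted(refs) for sym, refs in sorted(crt.items())}
```

In a real one-pass assembler the statements arrive in order, so each list is already ascending and the sort is only a safeguard.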
The stepwise processing is as follows:
1. Initialize the parameters: LC=0, size=0, srtab_no=1, SYMTAB_segment_entry=0. Clear ERRTAB and SRTAB_ARRAY.
2. Read a statement from the source program.
3. Examine the op-code field to check whether it is a pseudo-op or a machine-op. If it is a machine-op, search the MOT for a match for the op-code and call the appropriate routine.
4. Every type of statement requires different processing. The statements are processed in the following way.
• If it is an EQU pseudo-op then
  • Evaluate the expression in the operand field
  • Make an entry for the label in SYMTAB and set offset = value of operand
  • Enter stmt_no in the CRT list of the label in the operand field
  • Process forward references to the label
  • size=0
• If it is an ASSUME statement then
  • Create a new SRTAB and make an entry for the segment register and the SYMTAB_segment_entry for the segment name in the operand field
  • srtab_no=srtab_no+1
  • size=0
• If it is a SEGMENT statement then
  • Make an entry for the label in SYMTAB with segment_name=true
  • size=0
  • LC=0
  • SYMTAB_segment_entry=entry no. in SYMTAB
• If it is an ENDS statement then SYMTAB_segment_entry=0
• If it is a DC statement then
  • Align LC according to the specification in the operand field
  • Assemble the constant, if any
  • size=size of memory required
• If it is an imperative statement then
  • If the operand is a symbol then make an entry in CRT
  • If the operand symbol is already defined then check its alignment and addressability and generate the address specification for the symbol using its SYMTAB entry; else make an appropriate entry for the symbol in SYMTAB
  • Assemble the instruction in machine_code_buffer
  • size=size of instruction
5. If size!=0
  • If a label is present then make an appropriate entry for the symbol in SYMTAB with the current LC
  • Move the contents of machine_code_buffer to address code_area_address
  • code_area_address=code_area_address+size
  • Process forward references for the symbol
  • Enter errors in ERRTAB
  • List statements with errors contained in ERRTAB
  • Clear ERRTAB
6. If it is an END statement then
  • Report undefined symbols from SYMTAB
  • Produce the cross reference listing
  • Write code_area into the output file.
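The core of this processing, recording a forward reference and patching it when the label is finally defined, can be sketched as follows. This is a toy model, not the 8088 encoding: every instruction is assumed to be two bytes (an opcode byte followed by a one-byte operand address), the opcode values are arbitrary, and segments, EQU, the CRT and error handling are left out. The name `assemble_one_pass` and the dictionary used for the FRT are illustrative.

```python
def assemble_one_pass(statements):
    """One-pass assembly of (label, opcode, operand) statements.

    Hypothetical encoding: every instruction is two bytes, an opcode byte
    followed by a one-byte operand address."""
    code = bytearray()
    symtab = {}        # symbol -> address, once defined
    frt = {}           # symbol -> list of code offsets waiting for its address
    for label, opcode, operand in statements:
        lc = len(code)
        if label is not None:
            symtab[label] = lc
            # Process forward references to the newly defined label.
            for patch_at in frt.pop(label, []):
                code[patch_at] = lc
        code.append(opcode)
        if operand is None:
            code.append(0)                 # no operand: unused address byte
        elif operand in symtab:
            code.append(symtab[operand])   # symbol already defined
        else:
            frt.setdefault(operand, []).append(len(code))
            code.append(0)                 # placeholder, patched later
    undefined = sorted(frt)                # what END would report as undefined
    return bytes(code), symtab, undefined


program = [
    (None, 0x10, "SKIP"),    # forward reference to SKIP
    ("LOOP", 0x20, None),
    ("SKIP", 0x30, "LOOP"),  # backward reference, already defined
]
code, symtab, undefined = assemble_one_pass(program)
```

Any symbol still present in the FRT when the END statement is reached was never defined, which corresponds to the "report undefined symbols" step.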