1
Starting a Program
The 4 stages that take a C++ program (or a program in any high-level programming language) and execute it in internal memory are:
• Compiler - C++ -> Assembly code
• Assembler - Asm -> Machine code (object)
• Linker - Object -> Executable
• Loader - Executable -> Execution in Memory
2
Translation Hierarchy
C program
-> Compiler -> Assembly language program
-> Assembler -> Object: Machine language module
-> Linker (also takes Object: Library routine (machine language)) -> Executable: Machine language program
-> Loader -> Memory
3
The Compiler
• The compiler transforms the C++ program into an assembly language program, a symbolic form of the machine language.
• High-level language programs can be written in far fewer lines than assembly language programs, so programmer productivity is high.
• In 1975 many operating systems, compilers and assemblers were written in assembly language because compilers were inefficient and memories were small.
• The increase in memory capacity has reduced the concern about program size, and optimizing compilers now produce assembly code about as good as that written by programmers.
4
The Assembler
• Assembly language is the interface between high-level Programming Languages (PLs) and machine code.
• The assembler can accept instructions that aren't implemented in hardware. These are called pseudoinstructions. Their use simplifies translation and programming.
• The pseudoinstruction move $t0,$t1 is converted by the assembler into the true machine instruction add $t0,$zero,$t1.
• The assembler converts a branch to a faraway location into a branch and a jump.
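The pseudoinstruction expansion described above can be sketched in C++. This is a minimal illustration, not how any real assembler is implemented: the function name expandPseudo is hypothetical, and it handles only the register-copy pseudoinstruction (spelled move in standard MIPS assemblers), leaving every other line unchanged.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: rewrite "move dst,src" as "add dst,$zero,src".
// Any other line is returned unchanged.
std::string expandPseudo(const std::string& line) {
    std::istringstream in(line);
    std::string op, operands;
    in >> op;                      // mnemonic, e.g. "move"
    std::getline(in, operands);    // rest of the line, e.g. " $t0,$t1"
    if (op != "move") return line;

    // Drop whitespace so "$t0, $t1" and "$t0,$t1" are treated alike.
    std::string packed;
    for (char c : operands)
        if (c != ' ' && c != '\t') packed += c;

    const auto comma = packed.find(',');
    if (comma == std::string::npos) return line;   // malformed: leave as-is

    const std::string dst = packed.substr(0, comma);
    const std::string src = packed.substr(comma + 1);
    return "add " + dst + ",$zero," + src;
}
```

Calling expandPseudo("move $t0,$t1") yields the real instruction from the slide, while a line such as "jal printf" passes through untouched.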
5
The Object File
• The assembler turns the assembly code into an object file, which contains machine code, data, and the information needed to place instructions in memory.
• The assembler must map the labels in the assembly code to addresses in the machine code. This information is kept in the symbol table.
• After all defined labels have been converted to addresses, the symbol table contains the remaining labels that aren't defined in the file, such as external data or procedures.
• Each C++ source file is translated into one assembly code file, which is then translated into one object file.
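The label-to-address mapping can be sketched as two passes over the assembly lines. This is a simplified model under stated assumptions: every instruction occupies 4 bytes, each line defines at most one label and references at most one, and the types AsmLine and Symbols and the function buildSymbolTable are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// One line of assembly, reduced to the labels it defines and uses.
struct AsmLine {
    std::string label;      // label defined on this line, empty if none
    std::string reference;  // label used by this line, empty if none
};

struct Symbols {
    std::map<std::string, std::uint32_t> defined;  // label -> offset in the text segment
    std::set<std::string> undefined;               // labels left for the linker
};

Symbols buildSymbolTable(const std::vector<AsmLine>& lines) {
    Symbols s;
    // Pass 1: record the offset of every label (assumes 4 bytes per instruction).
    for (std::size_t i = 0; i < lines.size(); ++i)
        if (!lines[i].label.empty())
            s.defined[lines[i].label] = static_cast<std::uint32_t>(4 * i);
    // Pass 2: any referenced label that was never defined stays in the
    // symbol table as an external reference.
    for (const auto& l : lines)
        if (!l.reference.empty() && s.defined.count(l.reference) == 0)
            s.undefined.insert(l.reference);
    return s;
}
```

Two passes are used so that a forward reference (a jump to a label defined later in the file) is still recognized as defined.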
6
Object File Structure
The object file for Unix systems contains six parts:
• Object file header - size and position of the other parts of the file.
• Text segment - the machine code.
• Data segment - static data that comes with the program.
• Relocation information - identifies instructions and data that depend on absolute addresses when the program is loaded into memory.
• Symbol table - labels to external references.
• Debugging information - links machine instructions to C++ statements.
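The six parts listed above can be pictured as one C++ structure. This is only a sketch for orientation: the field names and types are assumptions made here, and real Unix object formats lay these parts out quite differently on disk.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct RelocationEntry {
    std::uint32_t offset;      // position of the instruction in the text segment
    std::string instruction;   // instruction type, e.g. "lw" or "jal"
    std::string dependsOn;     // label whose final address is needed
};

struct SymbolEntry {
    std::string label;
    bool defined;              // false means an external reference
    std::uint32_t offset;      // meaningful only when defined is true
};

struct ObjectFile {
    struct Header {
        std::string name;
        std::uint32_t textSize;
        std::uint32_t dataSize;
    } header;
    std::vector<std::uint32_t> text;           // machine code
    std::vector<std::uint8_t> data;            // static data
    std::vector<RelocationEntry> relocations;  // relocation information
    std::vector<SymbolEntry> symbols;          // symbol table
    std::vector<std::string> debugInfo;        // links back to source statements
};

// Labels that the linker still has to resolve for this object file.
std::vector<std::string> externalLabels(const ObjectFile& obj) {
    std::vector<std::string> out;
    for (const auto& s : obj.symbols)
        if (!s.defined) out.push_back(s.label);
    return out;
}
```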
7
The Linker
• If the program is translated as a single unit, a change to one line requires compiling and assembling the whole program. This is wasteful, as most of the code wasn't touched by the programmer. Even code the programmer didn't write, such as the standard libraries, would be recompiled.
• An alternative is to compile and assemble each procedure independently. A change to a procedure then requires recompiling only that procedure.
• The link editor, or linker, takes all the independent object files and links them together.
• The output of the linker is the executable file, or executable.
8
Pictorial Description
[Figure: the linker combines two object files and the C library into one executable file.]
Object file: instructions (main: jal ??? · · · jal ???) and relocation records (call sub, call printf)
Object file: sub: · · ·
C library: printf: · · ·
Executable file: main: jal printf · · · jal sub; printf: · · ·; sub: · · ·
9
Linking Steps
There are 3 steps for the linker:
• Place code and data symbolically in memory.
• Determine the addresses of data and instruction labels.
• Patch both the internal and external references.
The linker uses the relocation information and symbol table in each object module to find all undefined labels. These labels are found in branch and jump instructions and in data addresses. It finds the old addresses and replaces them with the new addresses. It is faster to "patch" the code than to recompile.
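The three steps can be sketched in C++ for the simplest case, where each module defines one procedure and only its jal targets need patching. The names Module, placeModules and patchCalls are hypothetical, the sizes used in the usage note are illustrative, and the text base passed in is whatever address the caller chooses.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct Module {
    std::string name;                // label of the procedure the module defines
    std::uint32_t textSize;          // size of its machine code in bytes
    std::vector<std::string> calls;  // labels used by its jal instructions
};

// Steps 1 and 2: place the modules one after another starting at textBase
// and record the address each label ends up at.
std::map<std::string, std::uint32_t> placeModules(const std::vector<Module>& mods,
                                                  std::uint32_t textBase) {
    std::map<std::string, std::uint32_t> addr;
    std::uint32_t next = textBase;
    for (const auto& m : mods) {
        addr[m.name] = next;
        next += m.textSize;
    }
    return addr;
}

// Step 3: look up the final address for every call in a module.
// A label that no module defines is reported as an error.
std::vector<std::uint32_t> patchCalls(const Module& m,
                                      const std::map<std::string, std::uint32_t>& addr) {
    std::vector<std::uint32_t> targets;
    for (const auto& label : m.calls) {
        const auto it = addr.find(label);
        if (it == addr.end())
            throw std::runtime_error("undefined reference to " + label);
        targets.push_back(it->second);
    }
    return targets;
}
```

For example, placing a module A of 0x100 bytes followed by a module B at a text base of 0x400000 puts B at 0x400100, so a call from A to B is patched to that address.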
10
Memory Locations
• If all the external references are resolved, the linker determines the memory location of all procedures and data.
• Since the files were assembled separately, the assembler can't know where a module's code and data will reside in memory relative to other modules.
• When the linker places a module in memory, all absolute references (memory addresses that are not relative to a register) must be relocated to their true addresses.
11
MIPS Memory Allocation
• The stack starts at the top of memory and grows down towards the data segment.
• The program code starts at 0x00400000.
• The static data starts at 0x10000000. Dynamic data (data allocated by new) starts right after it.
• The $gp is positioned to make it easy to access the static data.
Memory layout (addresses in hex, top of memory first):
  $sp -> 7fff ffff   Stack (grows down)
                     Dynamic data (grows up)
  $gp -> 1000 8000
         1000 0000   Static data
  pc  -> 0040 0000   Text
         0           Reserved
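Why placing $gp at 1000 8000 hex makes static data easy to reach can be shown with a small calculation. Load and store instructions carry a signed 16-bit offset, so with $gp in the middle of a 64 KB window the addresses from 1000 0000 hex upward are reachable with a single instruction. The sketch below assumes that $gp value; the function names are hypothetical.

```cpp
#include <cstdint>

constexpr std::uint32_t kGp = 0x10008000;   // $gp value from the memory layout

// 16-bit offset field that reaches `address` from $gp.
// Assumes the address lies within -0x8000..0x7fff bytes of $gp.
std::uint16_t gpOffset(std::uint32_t address) {
    return static_cast<std::uint16_t>(address - kGp);
}

// Address the hardware computes: $gp plus the sign-extended offset.
std::uint32_t effectiveAddress(std::uint16_t offset) {
    // Offsets with the top bit set are negative after sign extension.
    if (offset >= 0x8000)
        return kGp + offset - 0x10000u;
    return kGp + offset;
}
```

The first static data word at 0x10000000 therefore gets the offset 0x8000, which sign-extends to -0x8000 and brings the sum back to 0x10000000.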
12
Object File 1
Object file header
  Name          Procedure A
  Text size     0x100
  Data size     0x20
Text segment
  Address       Instruction
  0             lw $a0, 0($gp)
  4             jal 0
  …             …
Data segment
  0             (X)
  …             …
Relocation information
  Address       Instruction type   Dependency
  0             lw                 X
  4             jal                B
Symbol table
  Label         Address
  X             -
  B             -
13
Object File 2
Object file header
  Name          Procedure B
  Text size     0x200
  Data size     0x30
Text segment
  Address       Instruction
  0             sw $a1, 0($gp)
  4             jal 0
  …             …
Data segment
  0             (Y)
  …             …
Relocation information
  Address       Instruction type   Dependency
  0             sw                 Y
  4             jal                A
Symbol table
  Label         Address
  Y             -
  A             -
14
Executable File
Executable file header
  Text size     0x300
  Data size     0x50
Text segment
  Address       Instruction
  0x00400000    lw $a0, 0x8000($gp)
  0x00400004    jal 0x400100
  …             …
  0x00400100    sw $a1, 0x8020($gp)
  0x00400104    jal 0x400000
  …             …
Data segment
  Address
  0x10000000    (X)
  …             …
  0x10000020    (Y)
  …             …
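How a patched jal ends up in the text segment can be sketched with the standard MIPS J-type encoding: a 6-bit opcode (3 for jal) followed by a 26-bit field holding the target address divided by 4. The function names below are hypothetical; the encoding itself is standard MIPS and is not stated on the slides.

```cpp
#include <cstdint>

// Build the 32-bit machine word for "jal target".
std::uint32_t encodeJal(std::uint32_t target) {
    const std::uint32_t opcode = 0x3;                         // jal
    const std::uint32_t field = (target >> 2) & 0x03FFFFFF;   // word address, 26 bits
    return (opcode << 26) | field;
}

// Recover the jump target from the machine word. The upper 4 address bits
// come from the address of the instruction following the jal (pc + 4).
std::uint32_t decodeJalTarget(std::uint32_t instruction, std::uint32_t pc) {
    return ((pc + 4) & 0xF0000000) | ((instruction & 0x03FFFFFF) << 2);
}
```

With this encoding, the jal at 0x00400004 that calls the procedure at 0x400100 is stored as the word 0x0C100040.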
15
The Executable File
• Contains a header, the text segment and the data segment.
• The separate modules now reside together in the text and data segments. All the addresses left unresolved before the link stage are now resolved.
• This file can now be run on the computer.
• In the debug stage of development the executable will contain debug information. After development is finished the file is stripped of debug information.
16
The Loader
The loader performs the following steps (Unix):
• Reads the executable to find out the size of the text and data.
• Creates an address space large enough.
• Copies the instructions and data into memory.
• Copies the parameters to the main program onto the stack.
• Initializes the machine's registers and sets the stack pointer.
• Jumps to a start-up procedure that copies the parameters into the argument registers and calls the main procedure (main() in C++).
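The loader steps above can be modeled in a few lines of C++. This is a toy simulation, not an operating system loader: memory is a map that grows as needed, the segment base addresses are assumed to be the MIPS ones (text at 0x00400000, static data at 0x10000000), the initial stack pointer is an assumed word-aligned address near the top of the stack region, and the start-up procedure is assumed to sit at the start of the text segment. All type and function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

constexpr std::uint32_t kTextBase = 0x00400000;   // assumed start of program code
constexpr std::uint32_t kDataBase = 0x10000000;   // assumed start of static data
constexpr std::uint32_t kInitialSp = 0x7ffffffc;  // assumed word-aligned stack top

struct Executable {
    std::vector<std::uint32_t> text;  // machine instructions
    std::vector<std::uint8_t> data;   // static data
};

struct Process {
    std::map<std::uint32_t, std::uint32_t> textMem;  // address -> instruction
    std::map<std::uint32_t, std::uint8_t> dataMem;   // address -> data byte
    std::vector<std::string> stackArgs;              // parameters for main
    std::uint32_t pc = 0;
    std::uint32_t sp = 0;
};

Process load(const Executable& exe, const std::vector<std::string>& args) {
    Process p;
    // Copy the instructions (4 bytes each) and the static data into memory.
    for (std::size_t i = 0; i < exe.text.size(); ++i)
        p.textMem[kTextBase + static_cast<std::uint32_t>(4 * i)] = exe.text[i];
    for (std::size_t i = 0; i < exe.data.size(); ++i)
        p.dataMem[kDataBase + static_cast<std::uint32_t>(i)] = exe.data[i];
    p.stackArgs = args;   // parameters go onto the stack
    p.sp = kInitialSp;    // set the stack pointer
    p.pc = kTextBase;     // jump to the start-up procedure (assumed location)
    return p;
}
```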