
Upload: nataly-veal

Post on 15-Dec-2015


Starting a Program

The 4 stages that take a program written in C++ (or any other high-level programming language) and execute it in internal memory are:

• Compiler - C++ -> Assembly code

• Assembler - Asm -> Machine code (object)

• Linker - Object -> Executable

• Loader - Executable -> Execution in Memory


Translation Hierarchy

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module -> Linker -> Executable: machine language program -> Loader -> Memory. The linker also takes a second input, Object: library routine (machine language).]


The Compiler

• The compiler transforms the C++ program into an assembly language program, a symbolic form of the machine language.

• A program in a high-level language takes far fewer lines than the same program in assembly language, so programmer productivity is high.

• In 1975 many operating systems, compilers and assemblers were written in assembly language because compilers were inefficient and memories were small.

• The increase in memory capacity has reduced the concern about program size, and optimizing compilers now produce assembly code as good as that written by programmers.


The Assembler

• Assembly language is the interface between high-level programming languages (PLs) and machine code.

• The assembler can accept instructions that aren't implemented in hardware. These are called pseudoinstructions. Their use simplifies translation and programming.

• The pseudoinstruction move $t0,$t1 is converted by the assembler into the true machine instruction add $t0,$zero,$t1.

• The assembler converts a branch to a faraway location into a branch and a jump: the branch condition is inverted so that the branch skips over an unconditional jump to the target.
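The move expansion above can be sketched in a few lines of C++. This is only an illustration: the function name expandPseudo and the string-based representation of an instruction are assumptions made for the example, and only the single pseudoinstruction named on the slide is handled. A real assembler works on parsed operands and knows many more pseudoinstructions.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: expand one line of assembly into real instructions.
// Only "move rd,rs" is recognized; every other line passes through unchanged.
std::vector<std::string> expandPseudo(const std::string& line) {
    std::istringstream in(line);
    std::string op;
    in >> op;                      // mnemonic, e.g. "move"
    std::string operands;
    std::getline(in, operands);    // rest of the line, e.g. " $t0, $t1"

    // Drop whitespace so "$t0, $t1" and "$t0,$t1" are treated alike.
    operands.erase(std::remove_if(operands.begin(), operands.end(),
                                  [](unsigned char c) { return std::isspace(c) != 0; }),
                   operands.end());

    const std::size_t comma = operands.find(',');
    if (op == "move" && comma != std::string::npos) {
        const std::string rd = operands.substr(0, comma);
        const std::string rs = operands.substr(comma + 1);
        // move rd,rs  ==>  add rd,$zero,rs
        return {"add " + rd + ",$zero," + rs};
    }
    return {line};
}
```

Calling expandPseudo("move $t0,$t1") yields the single instruction add $t0,$zero,$t1, while a real instruction such as lw $a0,0($gp) is returned as-is.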


The Object File

• The assembler turns the assembly code into an object file, which contains machine code, data, and the information needed to place instructions in memory.

• The assembler must map the labels in assembly code to addresses in machine code. This information is kept in the symbol table.

• After all labels have been converted to addresses, the symbol table contains the remaining labels that aren't defined in the file, such as external data or procedures.

• Each C++ source file is translated into one assembly code file, which is then translated into one object file.
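The label handling described above can be sketched as two passes over the assembly code. The types AsmLine and SymbolInfo and the function buildSymbolTable are hypothetical names invented for this sketch, and every instruction is assumed to occupy 4 bytes. It is a toy model of the idea, not how any particular assembler is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of one assembly line: the label it defines (if any) and the
// label its instruction refers to (if any), e.g. the target of a jal.
struct AsmLine {
    std::string definedLabel;     // "" if the line defines no label
    std::string referencedLabel;  // "" if the instruction uses no label
};

struct SymbolInfo {
    std::map<std::string, std::uint32_t> defined;  // label -> offset in the module
    std::set<std::string> undefined;               // labels left for the linker
};

// Pass 1 records the address of every label defined in the file.
// Pass 2 collects the referenced labels that no local definition satisfies.
SymbolInfo buildSymbolTable(const std::vector<AsmLine>& lines) {
    SymbolInfo info;
    std::uint32_t address = 0;
    for (const AsmLine& line : lines) {
        if (!line.definedLabel.empty()) {
            info.defined[line.definedLabel] = address;
        }
        address += 4;  // assumption: fixed 4-byte instructions
    }
    for (const AsmLine& line : lines) {
        if (!line.referencedLabel.empty() &&
            info.defined.count(line.referencedLabel) == 0) {
            info.undefined.insert(line.referencedLabel);
        }
    }
    return info;
}
```

For a module that defines A and loop but refers to X and B, the undefined set ends up holding X and B, which are the labels the linker has to resolve.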


Object File Structure

The object file for Unix systems contains six parts:

• Object file header - size and position of the other parts of the file.

• Text segment - the machine code.

• Data segment - static data that comes with the program.

• Relocation information - identifies instructions and data that depend on absolute addresses when the program is loaded into memory.

• Symbol table - the remaining labels that are not defined, such as external references.

• Debugging information - links machine instructions to C++ statements.
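The six parts can be pictured as one C++ structure. The struct and field names below are assumptions made for illustration and do not correspond to a real object file format. The sample values are the ones given for Object File 1 later in the deck.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct RelocationEntry {
    std::uint32_t address;        // offset of the instruction in the text segment
    std::string instructionType;  // e.g. "lw" or "jal"
    std::string dependency;       // label the instruction depends on
};

// Hypothetical in-memory picture of the six parts of an object file.
struct ObjectFile {
    // 1. Object file header
    std::string name;
    std::uint32_t textSize = 0;
    std::uint32_t dataSize = 0;
    // 2. Text segment (kept as assembly text here for readability)
    std::vector<std::string> text;
    // 3. Data segment
    std::vector<std::string> data;
    // 4. Relocation information
    std::vector<RelocationEntry> relocations;
    // 5. Symbol table: labels whose addresses are not yet known
    std::vector<std::string> undefinedSymbols;
    // 6. Debugging information (left empty in this sketch)
    std::vector<std::string> debugInfo;
};

// Object File 1 from the deck: procedure A, which loads X and calls B.
ObjectFile makeObjectFile1() {
    ObjectFile f;
    f.name = "Procedure A";
    f.textSize = 0x100;
    f.dataSize = 0x20;
    f.text.push_back("lw $a0, 0($gp)");
    f.text.push_back("jal 0");
    f.data.push_back("X");
    f.relocations.push_back(RelocationEntry{0, "lw", "X"});
    f.relocations.push_back(RelocationEntry{4, "jal", "B"});
    f.undefinedSymbols.push_back("X");
    f.undefinedSymbols.push_back("B");
    return f;
}
```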


The Linker

• If the whole program is translated as a single unit, a change to one line requires compiling and assembling the whole program. This is wasteful, as most of the code wasn't touched by the programmer. Even code the programmer didn't write, such as the standard libraries, would be recompiled.

• An alternative is to compile and assemble each procedure independently. A change to a procedure then requires compiling only that procedure.

• The link editor, or linker, takes all the independent object files and links them together.

• The output of the linker is the executable file, or executable.


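Pictorial Description

[Figure: the linker combines two object files and the C library into one executable file.]

Inputs to the linker:

• Object file - instructions "main: jal ??? ... jal ???" with relocation records "call, sub" and "call, printf"

• Object file - "sub: ..."

• C library - "printf: ..."

Output of the linker:

• Executable file - "main: jal printf ... jal sub", followed by "printf: ..." and "sub: ..."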


Linking Steps

There are 3 steps for the linker:

• Place code and data symbolically in memory.

• Determine the addresses of data and instruction labels.

• Patch both the internal and external references.

The linker uses the relocation information and symbol table in each object module to resolve all undefined labels. These labels are found in branch and jump instructions and in data addresses. It finds the old addresses and replaces them with the new addresses. It is faster to "patch" the code than to recompile it.
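The three steps can be sketched in C++ as follows. The names Module, placeModules and patchCalls are hypothetical, and the model is deliberately reduced: each module defines one procedure label at its start, and only the text segment and jal targets are handled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Simplified object module: one procedure plus the labels it calls.
struct Module {
    std::string name;                // label of the procedure the module defines
    std::uint32_t textSize;          // size of its machine code in bytes
    std::vector<std::string> calls;  // labels referenced by its jal instructions
};

// Steps 1 and 2: lay the modules out one after another, starting at textBase,
// and record the address each label ends up at.
std::map<std::string, std::uint32_t> placeModules(
        const std::vector<Module>& modules, std::uint32_t textBase) {
    std::map<std::string, std::uint32_t> addressOf;
    std::uint32_t next = textBase;
    for (const Module& m : modules) {
        addressOf[m.name] = next;
        next += m.textSize;
    }
    return addressOf;
}

// Step 3: replace every referenced label with its final address.
// std::map::at throws std::out_of_range if a label is still undefined.
std::vector<std::uint32_t> patchCalls(
        const Module& m, const std::map<std::string, std::uint32_t>& addressOf) {
    std::vector<std::uint32_t> targets;
    for (const std::string& label : m.calls) {
        targets.push_back(addressOf.at(label));
    }
    return targets;
}
```

With module A (text size 0x100, calling B) placed before module B (text size 0x200, calling A) at a text base of 0x00400000, A's call is patched to 0x00400100 and B's call to 0x00400000. These are the jal targets shown in the executable file example later in the deck.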


Memory Locations

• If all the external references are resolved, the linker determines the memory location of all procedures and data.

• Since the files were assembled separately, the assembler can't know where a module's code and data will reside in memory relative to other modules.

• When the linker places a module in memory, all absolute references (memory addresses that are not relative to a register) must be relocated to their true addresses.


MIPS Memory Allocation

• The stack starts at the top of memory (0x7fffffff) and grows down towards the data segment.

• The program code (text) starts at 0x00400000.

• The static data starts at 0x10000000. Dynamic data (data allocated by new) starts right after it and grows up towards the stack.

• $gp is set to 0x10008000 so that the static data is easy to access with a signed 16-bit offset from it.

Memory map (addresses in hex, highest first):

7fff ffff    Stack (grows down)         <- $sp
             Dynamic data (grows up)
1000 8000                               <- $gp
1000 0000    Static data
0040 0000    Text                       <- pc
0            Reserved
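The memory map can be written down as a handful of constants. In the sketch below the addresses come from the figure, while the Region enumeration, the function regionOf and its staticSize parameter are assumptions added for illustration. The boundary between static and dynamic data depends on the program, so the size of the static data has to be passed in.

```cpp
#include <cassert>
#include <cstdint>

// Addresses taken from the memory map figure.
constexpr std::uint32_t kTextStart     = 0x00400000;
constexpr std::uint32_t kStaticStart   = 0x10000000;
constexpr std::uint32_t kGlobalPointer = 0x10008000;
constexpr std::uint32_t kStackTop      = 0x7fffffff;

// $gp sits 32 KB above the start of the static data.
static_assert(kGlobalPointer - kStaticStart == 0x8000, "$gp is 0x8000 above the static data");

enum class Region { Reserved, Text, StaticData, DynamicOrStack, OutOfRange };

// Classify an address. Dynamic data and the stack share one region because
// they grow towards each other and the boundary moves at run time.
Region regionOf(std::uint32_t address, std::uint32_t staticSize) {
    assert(staticSize <= kStackTop - kStaticStart);
    if (address < kTextStart) return Region::Reserved;
    if (address < kStaticStart) return Region::Text;
    if (address < kStaticStart + staticSize) return Region::StaticData;
    if (address <= kStackTop) return Region::DynamicOrStack;
    return Region::OutOfRange;
}
```

For example, with 0x50 bytes of static data, address 0x00400104 falls in the text region and 0x10000020 in the static data.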


Object File 1

Object file header
    Name         Procedure A
    Text size    0x100
    Data size    0x20

Text segment
    Address    Instruction
    0          lw $a0, 0($gp)
    4          jal 0
    …          …

Data segment
    Address    Contents
    0          (X)
    …          …

Relocation information
    Address    Instruction type    Dependency
    0          lw                  X
    4          jal                 B

Symbol table
    Label    Address
    X        -
    B        -


Object File 2

Object file header
    Name         Procedure B
    Text size    0x200
    Data size    0x30

Text segment
    Address    Instruction
    0          sw $a1, 0($gp)
    4          jal 0
    …          …

Data segment
    Address    Contents
    0          (Y)
    …          …

Relocation information
    Address    Instruction type    Dependency
    0          sw                  Y
    4          jal                 A

Symbol table
    Label    Address
    Y        -
    A        -


Executable File

Executable file header
    Text size    0x300
    Data size    0x50

Text segment
    Address       Instruction
    0x00400000    lw $a0, 0x8000($gp)
    0x00400004    jal 0x400100
    …             …
    0x00400100    sw $a1, 0x8020($gp)
    0x00400104    jal 0x400000
    …             …

Data segment
    Address       Contents
    0x10000000    (X)
    …             …
    0x10000020    (Y)
    …             …
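The offsets 0x8000 and 0x8020 in the lw and sw instructions follow from the data addresses and from $gp, which the memory map places at 0x10008000. The offset field is a signed 16-bit value, so X at 0x10000000 lies 0x8000 bytes below $gp, and Y at 0x10000020 lies 0x7fe0 bytes below it. A small sketch of that arithmetic, with the hypothetical helper name gpOffset:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kGlobalPointer = 0x10008000;  // $gp, from the memory map

// 16-bit offset field that makes offset($gp) refer to `address`.
// The field is a signed two's-complement value; the raw bit pattern is
// returned, as it would appear in the instruction.
constexpr std::uint16_t gpOffset(std::uint32_t address) {
    return static_cast<std::uint16_t>(address - kGlobalPointer);
}

// X at 0x10000000: 0x10000000 - 0x10008000 = -0x8000, bit pattern 0x8000.
static_assert(gpOffset(0x10000000) == 0x8000, "lw $a0, 0x8000($gp)");
// Y at 0x10000020: 0x10000020 - 0x10008000 = -0x7fe0, bit pattern 0x8020.
static_assert(gpOffset(0x10000020) == 0x8020, "sw $a1, 0x8020($gp)");
```

The same layout explains the header sizes: the text size 0x300 is 0x100 + 0x200 and the data size 0x50 is 0x20 + 0x30, the sums of the two object files.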


The Executable File

• Contains a header, the text segment and the data segment.

• The separate modules now reside together in the text and data segments. All the addresses that were unresolved before the link stage are now resolved.

• This file can now be run on the computer.

• During the debug stage of development the executable contains debug information. After development is finished the file is stripped of debug information.


The Loader

The loader performs the following steps (Unix):

• Reads the executable file to find out the size of the text and data segments.

• Creates an address space large enough for the text and data.

• Copies the instructions and data into memory.

• Copies the parameters of the main program onto the stack.

• Initializes the machine's registers and sets the stack pointer.

• Jumps to a start-up procedure that copies the parameters into the argument registers and calls the main procedure (main() in C++).