DIXIE: Binary Translation and Optimization for Multiple ISAs

Computer Architecture Department, Universitat Politècnica de Catalunya, Barcelona
www.ac.upc.es/dixie

Upload: dee

Post on 14-Jan-2016


TRANSCRIPT

Page 1: DIXIE Binary Translation and Optimization for Multiple ISAs

DIXIE

Binary Translation and Optimization for Multiple ISAs

Computer Architecture Department

Universitat Politècnica de Catalunya-Barcelona

www.ac.upc.es/dixie

Page 2: DIXIE Binary Translation and Optimization for Multiple ISAs

UPC people involved

Roger Espasa
Agustín Fernández
Manel Fernández
Victor Moya
Juan Lopez
Silvia Cernuda
Antonio Parada
Albert Ribé
Álex Ramírez

Page 3: DIXIE Binary Translation and Optimization for Multiple ISAs

Dixie

Static binary translator
  Accepts multiple ISAs (Alpha, x86, PPC, Mips, Convex)
  Translates to a common IR (Dixie ISA)

Static binary instrumentation
  Works on the common IR but reflects the source ISA

Static binary optimizer
  Optimizes the common IR
  Generates native code from the common IR
  Multiple targets also supported (Alpha, Mips)

Dixie Virtual Machine
  Can run binaries specified in the common IR
  Also runs binaries with a mixture of common and native code

Page 4: DIXIE Binary Translation and Optimization for Multiple ISAs

Dixie overview

[Diagram: Dixie overview. Components: DIXIEC, JANGO, SPEEDY, DVM (Dixie Virtual Machine). Other labels: native ISAs (Alpha, Convex, x86, Mips, PowerPC), Dixie binary, user specification, user simulator, target ISAs (Alpha, Mips, ...), target binaries.]

Page 5: DIXIE Binary Translation and Optimization for Multiple ISAs

Outline

Motivation

DIXIE Architecture

Debugging Tools

Performance

Summary

Page 7: DIXIE Binary Translation and Optimization for Multiple ISAs

Binary Translation

For embedded processors
  The embedded market moves rapidly and changes processors frequently
  Software (development, porting) is a major cost issue
  Binary translation is cheaper than retargeting gcc

Goals
  Retargeting must be FAST and EASY
  Support different ISAs
  Provide good debugging tools
    To ease writing ISA descriptions
    To verify correctness of translations

Techniques
  Static translation (as much as possible)
  Some dynamic translation (only if necessary)

Page 8: DIXIE Binary Translation and Optimization for Multiple ISAs

Binary Optimization

Inevitably, binary translation introduces overheads
  Use static and dynamic optimization to
    Adapt better to the new chip
    Offset the overheads of static binary translation

Goals
  Eliminate overheads due to
    The manual translation process
    The intermediate ISA's lack of expressiveness
  Incremental development of the optimizer

Techniques
  Static optimization (as much as possible)
  Dynamic optimization (only if necessary)
  Optimized blocks still run within the Virtual Machine

Page 9: DIXIE Binary Translation and Optimization for Multiple ISAs

Instrumentation

Instrumentation of program binaries
  For computer architecture research
  Due to lack of access to ‘exotic’ machines
  Historical origin of Dixie…

Many classes of tools, but...
  Different tools for different machines
  Porting tools is difficult
  Few tools allow research on vector machines or new ISAs
  Lack of wrong-path information

Dixie goals
  Cross-platform instrumentation
  Research on multiple & discontinued ISAs
  Full architecture coverage
  Wrong-path information

Page 11: DIXIE Binary Translation and Optimization for Multiple ISAs

Dixie overview

[Diagram: Dixie overview, repeated from page 4]

Page 12: DIXIE Binary Translation and Optimization for Multiple ISAs

Dixie compiler

[Diagram: Dixie overview, repeated from page 4]

Page 13: DIXIE Binary Translation and Optimization for Multiple ISAs

Jango

[Diagram: Dixie overview, repeated from page 4]

Page 14: DIXIE Binary Translation and Optimization for Multiple ISAs

Breakpoints: trace

Source code:
  mov a0,a1
  ld.w @8(a1),a2
  sub.w #8,a2

After DIXIEC:
  MOV.lo.32 r11,r10
  LOAD.lo.32 r500,r11,#8
  LOAD.lo.32 r12,r500,#0
  SUB.c2.32 r12,r12,#8

After JANGO (each TRACE is placed next to the load whose operands it matches; the exact position is inferred):
  MOV.lo.32 r11,r10
  TRACE vpc,r11,#8
  LOAD.lo.32 r500,r11,#8
  TRACE vpc,r500,#0
  LOAD.lo.32 r12,r500,#0
  SUB.c2.32 r12,r12,#8
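The instrumentation step on this slide can be sketched in a few lines of Python. This is only an illustration, not Jango's implementation: the `Instr` type and the `insert_load_traces` and `fmt` helpers are hypothetical names, and the rule used here (one TRACE before every LOAD, carrying the load's base register and displacement) is an assumption read off the operands in the slide's example.

```python
from typing import List, NamedTuple


class Instr(NamedTuple):
    opcode: str          # e.g. "LOAD.lo.32"
    operands: List[str]  # e.g. ["r500", "r11", "#8"]


def insert_load_traces(block: List[Instr]) -> List[Instr]:
    """Return a copy of `block` with a TRACE placed before every LOAD.

    Assumption: the TRACE carries the virtual PC plus the base register
    and displacement of the load it observes, as in the slide's example.
    """
    out: List[Instr] = []
    for ins in block:
        if ins.opcode.startswith("LOAD"):
            _dest, base, disp = ins.operands
            out.append(Instr("TRACE", ["vpc", base, disp]))
        out.append(ins)
    return out


def fmt(ins: Instr) -> str:
    """Render an instruction in the slide's textual form."""
    return f"{ins.opcode} {','.join(ins.operands)}"


# The Dixie code shown on the slide as DIXIEC output.
dixie_block = [
    Instr("MOV.lo.32", ["r11", "r10"]),
    Instr("LOAD.lo.32", ["r500", "r11", "#8"]),
    Instr("LOAD.lo.32", ["r12", "r500", "#0"]),
    Instr("SUB.c2.32", ["r12", "r12", "#8"]),
]

instrumented = insert_load_traces(dixie_block)
for ins in instrumented:
    print(fmt(ins))
```

Because the pass works on the common IR, the same rule would apply unchanged to code translated from any source ISA, which is the point the deck makes about cross-platform instrumentation.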

Page 15: DIXIE Binary Translation and Optimization for Multiple ISAs

Speedy & DVM

[Diagram: Dixie overview, repeated from page 4]

Page 16: DIXIE Binary Translation and Optimization for Multiple ISAs

Speedy & DVM

Dixie binary is optimized by Speedy
  Optimizations at basic block (BB) level
  Translates Dixie BBs into native code
  Generates .speedy sections

Dixie binary is runnable on top of the DVM
  Emulates the behavior of each Dixie instruction
    Interpreting each Dixie instruction
    Jumping into sequences of “Speedy” BBs
  Interacts with the user simulator
    Through trace instructions inserted by Jango
  Maps target system calls into host system calls
    Through DixOS
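The mix of interpretation and jumps into optimized blocks can be pictured with a toy dispatch loop. This is a minimal sketch under stated assumptions, not the DVM: the three opcodes (`MOVI`, `ADD`, `HALT`) are invented for the example and are not Dixie ISA instructions, and a Python callable stands in for a native basic block from a `.speedy` section.

```python
from typing import Callable, Dict, List, Tuple

Regs = Dict[str, int]
Instr = Tuple[str, ...]               # e.g. ("ADD", dst, src, immediate)
NativeBlock = Callable[[Regs], int]   # runs a whole BB, returns the next pc


def interpret(ins: Instr, regs: Regs, pc: int) -> int:
    """Emulate one toy instruction and return the next pc (-1 stops)."""
    op = ins[0]
    if op == "MOVI":                  # MOVI dst, imm
        regs[ins[1]] = int(ins[2])
    elif op == "ADD":                 # ADD dst, src, imm
        regs[ins[1]] = regs[ins[2]] + int(ins[3])
    elif op == "HALT":
        return -1
    else:
        raise ValueError(f"unknown opcode {op}")
    return pc + 1


def run(program: List[Instr], native: Dict[int, NativeBlock]) -> Regs:
    """Interpret `program`, jumping into a native block when one exists."""
    regs: Regs = {}
    pc = 0
    while pc >= 0:
        block = native.get(pc)
        if block is not None:
            pc = block(regs)          # optimized path: run the whole block
        else:
            pc = interpret(program[pc], regs, pc)
    return regs


program = [
    ("MOVI", "r1", "5"),
    ("ADD", "r1", "r1", "1"),
    ("ADD", "r1", "r1", "2"),
    ("HALT",),
]


def native_bb_at_1(regs: Regs) -> int:
    """Stand-in for an optimized block covering both ADDs."""
    regs["r1"] = regs["r1"] + 3
    return 3                          # continue at the HALT


pure = run(program, {})
mixed = run(program, {1: native_bb_at_1})
print(pure, mixed)
```

Both runs must leave the registers in the same state; only the path taken differs. Keeping the optimized blocks behind the same dispatch loop is what lets a binary mix interpreted and native code.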

Page 17: DIXIE Binary Translation and Optimization for Multiple ISAs

DVM Portability

DVM runs on all major hardware combinations:

           Little Endian                 Big Endian
32 bits    x86 / Linux                   Power2 / AIX, Sparc / SunOS
64 bits    Alpha / OSF1, IA64 / Linux    MIPS / IRIX

Page 18: DIXIE Binary Translation and Optimization for Multiple ISAs

Speedy Architecture

Front End: Understands Dixie ISA
  Optimizes Dixie code (NOP, VPC, CSE)
  Lowers representation
    Load virtual registers into physical registers
    Local register allocation
    Load large constants into registers

Back End: Translates Dixie ISA into target ISA
  Instruction translation
    Opcode selection
    Big/little endian memory access
    Alignment issues
  Peephole optimizer
    Recognize instruction sequences
    Remove redundant loads
    Remove redundant branches

Page 20: DIXIE Binary Translation and Optimization for Multiple ISAs

Debugging

Porting to a new ISA is not easy
  Many “cut-and-paste” bugs
  A trivial bug may take weeks to find without appropriate tools

We would like developers to
  “Test-as-you-go” every instruction description
  Test each instruction almost in isolation
  Quickly compare DVM and native results

andiu. ra, rs, ui

MOV.lo.32 r(TMP0),ui
SHL.lo.32 r(TMP0),r(TMP0),32
AND.lo.32 (ra),r(rs),r(TMP0)
CMPLT.c2.32 r(ICR(POSCRI(0))),r(ra),0
CMPGT.c2.32 r(ICR(POSCRI(1))),r(ra),0
CMPEQ.lo.32 r(ICR(POSCRI(2))),r(ra),0
AND.lo.32 r(TMP0),r(XER),0x80000000
CMPNE.lo.32 r(ICR(POSCRI(3))),r(TMP0),0
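The "test-as-you-go" idea of checking one instruction description in isolation against a trusted result can be sketched as a small differential test. Everything here is hypothetical: the micro-op format, the toy "AND with shifted immediate" instruction, and the `reference` function, which stands in for the native result that the real tools would obtain by running the instruction on hardware. The buggy description is injected on purpose to show what a cut-and-paste error looks like to the checker.

```python
import random
from typing import Callable, List, Optional, Tuple, Union

MASK32 = 0xFFFFFFFF
Operand = Union[str, int]
MicroOp = Tuple[str, str, Operand, Operand]   # (opcode, dst, a, b)


def run_description(desc: List[MicroOp], rs: int, ui: int) -> int:
    """Evaluate a micro-op description and return the value left in 'ra'."""
    regs = {"rs": rs, "ui": ui}

    def val(x: Operand) -> int:
        return regs[x] if isinstance(x, str) else x

    for op, dst, a, b in desc:
        if op == "SHL":
            regs[dst] = (val(a) << val(b)) & MASK32
        elif op == "AND":
            regs[dst] = val(a) & val(b)
        else:
            raise ValueError(f"unknown micro-op {op}")
    return regs["ra"]


def reference(rs: int, ui: int) -> int:
    """Stand-in for the native result of the toy instruction."""
    return rs & ((ui << 16) & MASK32)


def first_mismatch(desc: List[MicroOp],
                   ref: Callable[[int, int], int],
                   trials: int = 1000,
                   seed: int = 0) -> Optional[Tuple[int, int]]:
    """Return the first random input where desc and ref disagree, if any."""
    rng = random.Random(seed)
    for _ in range(trials):
        rs, ui = rng.getrandbits(32), rng.getrandbits(16)
        if run_description(desc, rs, ui) != ref(rs, ui):
            return rs, ui
    return None


good = [("SHL", "tmp", "ui", 16), ("AND", "ra", "rs", "tmp")]
buggy = [("SHL", "tmp", "ui", 8), ("AND", "ra", "rs", "tmp")]   # wrong shift

print("good :", first_mismatch(good, reference))
print("buggy:", first_mismatch(buggy, reference))
```

A failing input pair points directly at the single description under test, which is much cheaper than discovering the same bug from a misbehaving full benchmark.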

Page 22: DIXIE Binary Translation and Optimization for Multiple ISAs

Performance

Benchmark suite
  SPECint95

Environment
  DEC Alpha AXP-21264 running at 625 MHz
  OSF/1 v4.0

Two versions of the Dixie binaries
  DVM: “pure” Dixie binaries
  Speedy: Dixie binaries optimized using Speedy

Page 23: DIXIE Binary Translation and Optimization for Multiple ISAs

DVM slowdown

[Bar chart: DVM slowdown, Alpha on Alpha. Vertical axis from 0 to 150. Benchmarks: go, m88ksim, gcc, compress, li, ijpeg, perl, vortex. Series: DVM, Speedy.]

Page 25: DIXIE Binary Translation and Optimization for Multiple ISAs

Summary

Binary translation & optimization
  Are becoming important tools in the embedded market
  Promise lower development costs
    When changing architectures
  Are also of interest to major computer manufacturers
    IA-64 emulation
    Transmeta
    FX!32 (now obsolete)

DIXIE
  Robust tool that meets most translation demands
  Multi-ISA, multi-platform