BIL 220 – Introduction to Systems Programming (bil220/w01-overview-bits.pdf)


Acknowledgement: The course slides are adapted from the slides prepared by R.E. Bryant, D.R. O’Hallaron, G. Kesden and Markus Püschel of Carnegie Mellon Univ.

BIL 220 – Introduction to Systems Programming
Spring 2012

Instructors: Aykut & Erkut Erdem

WE’VE GOT A NEW COURSE! TOTALLY REDESIGNED INSIDE AND OUT


Overview

- Course theme
- Five realities
- How the course fits into our curriculum
- Logistics


Course Theme: Abstraction Is Good But Don’t Forget Reality

- Most CENG/CS courses emphasize abstraction
  - Abstract data types
  - Asymptotic analysis
- These abstractions have limits
  - Especially in the presence of bugs
  - Need to understand details of underlying implementations
- Useful outcomes
  - Become more effective programmers
    - Able to find and eliminate bugs efficiently
    - Able to understand and tune for program performance
  - Prepare for “systems” classes
    - Computer Organization, Operating Systems, Computer Networks, Computer Architecture, Microprocessors


Great Reality #1: Ints are not Integers, Floats are not Reals

- Example 1: Is x^2 ≥ 0?
  - Floats: Yes!
  - Ints:
    - 40000 * 40000 ➙ 1600000000
    - 50000 * 50000 ➙ -1794967296
- Example 2: Is (x + y) + z = x + (y + z)?
  - Unsigned & signed ints: Yes!
  - Floats:
    - (1e20 + -1e20) + 3.14 ➙ 3.14
    - 1e20 + (-1e20 + 3.14) ➙ 0.00

Source: xkcd.com/571
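The two examples on this slide can be reproduced with a short C sketch. The function names below are illustrative and not from the slides. Because signed overflow is undefined behavior in C, the square is computed in unsigned arithmetic and then converted back; that conversion is implementation-defined before C23, and the sketch assumes a two's-complement compiler such as GCC or Clang. The floating-point part assumes IEEE 754 doubles.

```c
#include <stdint.h>

/* Square a 32-bit value with wraparound modulo 2^32, then reinterpret
 * the result as signed. The product is formed in 64 bits and truncated,
 * so no signed overflow ever occurs in this function. */
int32_t wrapping_square(int32_t x)
{
    uint32_t ux = (uint32_t)x;
    uint32_t p = (uint32_t)((uint64_t)ux * ux);
    return (int32_t)p; /* assumption: two's-complement conversion */
}

/* The same three numbers added with the two possible groupings. */
double sum_left(double x, double y, double z)
{
    return (x + y) + z;
}

double sum_right(double x, double y, double z)
{
    return x + (y + z);
}
```

Calling wrapping_square(50000) gives the negative value shown on the slide because 2500000000 does not fit in 31 bits. In sum_right, 3.14 is smaller than the spacing between doubles near 1e20, so it is absorbed before the cancellation happens.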


Code Security Example

    /* Kernel memory region holding user-accessible data */
    #define KSIZE 1024
    char kbuf[KSIZE];

    /* Declaration of library function memcpy */
    void *memcpy(void *dest, void *src, size_t n);

    /* Copy at most maxlen bytes from kernel region to user buffer */
    int copy_from_kernel(void *user_dest, int maxlen) {
        /* Byte count len is minimum of buffer size and maxlen */
        int len = KSIZE < maxlen ? KSIZE : maxlen;
        memcpy(user_dest, kbuf, len);
        return len;
    }

- Similar to code found in FreeBSD’s implementation of getpeername
- There are legions of smart people trying to find vulnerabilities in programs


Typical Usage

    #define MSIZE 528

    void getstuff() {
        char mybuf[MSIZE];
        copy_from_kernel(mybuf, MSIZE);
        printf("%s\n", mybuf);
    }


Malicious Usage

    #define MSIZE 528

    void getstuff() {
        char mybuf[MSIZE];
        copy_from_kernel(mybuf, -MSIZE);
        . . .
    }
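Why the negative argument is dangerous can be sketched as follows. The helper names slide_len, as_memcpy_count, and checked_len are hypothetical and only isolate the arithmetic; the guarded variant is one possible fix, not necessarily the one applied in FreeBSD.

```c
#include <stddef.h>

#define KSIZE 1024

/* Length computed the same way as in copy_from_kernel on the slide:
 * a negative maxlen is smaller than KSIZE, so it passes through. */
int slide_len(int maxlen)
{
    return KSIZE < maxlen ? KSIZE : maxlen;
}

/* What memcpy receives: its third parameter has type size_t, so a
 * negative int is converted to a very large unsigned count. */
size_t as_memcpy_count(int len)
{
    return (size_t)len;
}

/* Hypothetical guarded variant: reject negative requests, clamp the rest.
 * Returns -1 when the request is invalid. */
int checked_len(int maxlen)
{
    if (maxlen < 0)
        return -1;
    return KSIZE < maxlen ? KSIZE : maxlen;
}
```

With maxlen equal to -528, slide_len returns -528 and memcpy would be asked to copy far more than KSIZE bytes, which is how kernel memory beyond kbuf could reach the user buffer.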


Computer Arithmetic

- Does not generate random values
  - Arithmetic operations have important mathematical properties
- Cannot assume all “usual” mathematical properties
  - Due to finiteness of representations
  - Integer operations satisfy “ring” properties
    - Commutativity, associativity, distributivity
  - Floating point operations satisfy “ordering” properties
    - Monotonicity, values of signs
- Observation
  - Need to understand which abstractions apply in which contexts
  - Important issues for compiler writers and serious application programmers
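The "ring" properties of integer arithmetic can be checked directly for unsigned 32-bit values, where overflow wraps modulo 2^32. This is a small sketch with illustrative function names; the explicit casts keep every intermediate result at 32 bits regardless of the platform's int width.

```c
#include <stdbool.h>
#include <stdint.h>

/* (x + y) + z equals x + (y + z) even when the sums wrap around. */
bool add_is_associative(uint32_t x, uint32_t y, uint32_t z)
{
    uint32_t left = (uint32_t)((uint32_t)(x + y) + z);
    uint32_t right = (uint32_t)(x + (uint32_t)(y + z));
    return left == right;
}

/* x * (y + z) equals x * y + x * z modulo 2^32. The products are
 * formed in 64 bits and truncated to avoid any promotion surprises. */
bool mul_distributes(uint32_t x, uint32_t y, uint32_t z)
{
    uint32_t left = (uint32_t)((uint64_t)x * (uint32_t)(y + z));
    uint32_t right = (uint32_t)((uint64_t)x * y + (uint64_t)x * z);
    return left == right;
}
```

Both functions return true for every input, including ones that overflow, which is the sense in which the results are not random even though they differ from ordinary integer arithmetic.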


Great Reality #2: You’ve Got to Know Assembly

- Chances are, you’ll never write programs in assembly
  - Compilers are much better & more patient than you are
- But: Understanding assembly is key to machine-level execution model
  - Behavior of programs in presence of bugs
    - High-level language models break down
  - Tuning program performance
    - Understand optimizations done / not done by the compiler
    - Understanding sources of program inefficiency
  - Implementing system software
    - Compiler has machine code as target
    - Operating systems must manage process state
  - Creating / fighting malware
    - x86 assembly is the language of choice!


Assembly Code Example

- Time Stamp Counter
  - Special 64-bit register in Intel-compatible machines
  - Incremented every clock cycle
  - Read with rdtsc instruction
- Application
  - Measure time (in clock cycles) required by procedure

    double t;
    start_counter();
    P();
    t = get_counter();
    printf("P required %f clock cycles\n", t);


Code to Read Counter

- Write small amount of assembly code using GCC’s asm facility
- Inserts assembly code into machine code generated by compiler

    static unsigned cyc_hi = 0;
    static unsigned cyc_lo = 0;

    /* Set *hi and *lo to the high and low order bits of the cycle counter. */
    void access_counter(unsigned *hi, unsigned *lo)
    {
        asm("rdtsc; movl %%edx,%0; movl %%eax,%1"
            : "=r" (*hi), "=r" (*lo)
            :
            : "%edx", "%eax");
    }
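The slides do not show how start_counter() and get_counter() turn the two 32-bit halves into an elapsed cycle count. One plausible way is sketched below; join_counter and elapsed_cycles are hypothetical names, and the sketch takes the halves as plain arguments so that it does not depend on the rdtsc instruction being available.

```c
#include <stdint.h>

/* Join the high and low 32-bit halves into one 64-bit cycle count. */
uint64_t join_counter(unsigned hi, unsigned lo)
{
    return ((uint64_t)hi << 32) | (uint64_t)(uint32_t)lo;
}

/* Cycles elapsed between two readings of the counter. Unsigned
 * subtraction handles a carry from the low half into the high half. */
double elapsed_cycles(unsigned start_hi, unsigned start_lo,
                      unsigned end_hi, unsigned end_lo)
{
    uint64_t start = join_counter(start_hi, start_lo);
    uint64_t end = join_counter(end_hi, end_lo);
    return (double)(end - start);
}
```

Under this assumption, start_counter() would store one reading in cyc_hi and cyc_lo, and get_counter() would take a second reading and return the difference as a double.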


Great Reality #3: Memory Matters
Random Access Memory Is an Unphysical Abstraction

- Memory is not unbounded
  - It must be allocated and managed
  - Many applications are memory dominated
- Memory referencing bugs especially pernicious
  - Effects are distant in both time and space
- Memory performance is not uniform
  - Cache and virtual memory effects can greatly affect program performance
  - Adapting program to characteristics of memory system can lead to major speed improvements


Memory Referencing Bug Example

    double fun(int i)
    {
        volatile double d[1] = {3.14};
        volatile long int a[2];
        a[i] = 1073741824; /* Possibly out of bounds */
        return d[0];
    }

    fun(0) ➙ 3.14
    fun(1) ➙ 3.14
    fun(2) ➙ 3.1399998664856
    fun(3) ➙ 2.00000061035156
    fun(4) ➙ 3.14, then segmentation fault

- Result is architecture specific
- The volatile keyword is intended to prevent the compiler from applying any optimizations that assume values of variables cannot change "on their own."


Memory Referencing Bug Example

Explanation:

    Stack contents    Location accessed by fun(i)
    Saved State       4
    d7 ... d4         3
    d3 ... d0         2
    a[1]              1
    a[0]              0
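The values returned by fun(2) and fun(3) follow from the layout above: the store overwrites one 4-byte half of the double 3.14. The sketch below reproduces that effect portably by editing the bit pattern instead of writing out of bounds. The name overwrite_half is illustrative, and the sketch assumes IEEE 754 doubles that share byte order with 64-bit integers.

```c
#include <stdint.h>
#include <string.h>

/* Return the double obtained by overwriting one 32-bit half of d's
 * bit pattern with value. half == 0 replaces the low-order bytes
 * (d3 ... d0); any other half replaces the high-order bytes (d7 ... d4). */
double overwrite_half(double d, int half, uint32_t value)
{
    uint64_t bits;

    memcpy(&bits, &d, sizeof bits); /* read the bit pattern safely */
    if (half == 0)
        bits = (bits & 0xFFFFFFFF00000000ULL) | value;
    else
        bits = (bits & 0x00000000FFFFFFFFULL) | ((uint64_t)value << 32);
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Replacing the low half of 3.14 with 1073741824 (0x40000000) changes only low-order fraction bits, which gives 3.1399998664856. Replacing the high half changes the sign, exponent, and leading fraction bits, which gives 2.00000061035156.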


Memory Referencing Errors

- C and C++ do not provide any memory protection
  - Out of bounds array references
  - Invalid pointer values
  - Abuses of malloc/free
- Can lead to nasty bugs
  - Whether or not bug has any effect depends on system and compiler
  - Action at a distance
    - Corrupted object logically unrelated to one being accessed
    - Effect of bug may be first observed long after it is generated
- How can I deal with this?
  - Program in Java, Ruby or ML
  - Understand what possible interactions may occur
  - Use or develop tools to detect referencing errors (e.g. Valgrind)
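Because C performs no bounds checking on its own, one defensive habit is to route risky stores through a helper that validates the index first. The sketch below is only an illustration of that idea; checked_store is a hypothetical name, not part of any library mentioned in the slides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Store value at a[i] only if i is a valid index for an array of
 * n elements. Returns false and leaves the array untouched otherwise. */
bool checked_store(long *a, size_t n, long i, long value)
{
    if (i < 0 || (size_t)i >= n)
        return false;
    a[i] = value;
    return true;
}
```

A check like this catches the out-of-bounds index at the point where it happens, instead of letting a corrupted neighbor show up much later in unrelated code.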


Memory System Performance Example

- Hierarchical memory organization
- Performance depends on access patterns
  - Including how you step through a multi-dimensional array

    void copyji(int src[2048][2048], int dst[2048][2048])
    {
        int i, j;
        for (j = 0; j < 2048; j++)
            for (i = 0; i < 2048; i++)
                dst[i][j] = src[i][j];
    }

    void copyij(int src[2048][2048], int dst[2048][2048])
    {
        int i, j;
        for (i = 0; i < 2048; i++)
            for (j = 0; j < 2048; j++)
                dst[i][j] = src[i][j];
    }

copyji is 21 times slower than copyij (Pentium 4)
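A rough way to observe the difference on your own machine is to time both loop orders with the standard clock() function. This is a sketch, not the measurement method used for the slide: the helper names are illustrative, clock() has coarse resolution, and the ratio you see will depend on the processor and the compiler flags.

```c
#include <time.h>

#define N 2048

static int src[N][N];
static int dst[N][N];

/* Row by row: consecutive accesses touch adjacent memory. */
void copyij(int s[N][N], int d[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            d[i][j] = s[i][j];
}

/* Column by column: consecutive accesses are N ints apart. */
void copyji(int s[N][N], int d[N][N])
{
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            d[i][j] = s[i][j];
}

/* Fill src with a recognizable pattern and clear dst. */
void reset_arrays(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            src[i][j] = i * N + j;
            dst[i][j] = 0;
        }
}

/* CPU seconds used by one call of copy on the static arrays. */
double time_copy(void (*copy)(int [N][N], int [N][N]))
{
    clock_t start = clock();
    copy(src, dst);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Returns 1 if dst is an exact copy of src, 0 otherwise. */
int copies_match(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (dst[i][j] != src[i][j])
                return 0;
    return 1;
}
```

A caller would run reset_arrays(), then print time_copy(copyij) and time_copy(copyji). Both functions produce the same result; only the order in which memory is visited differs.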


The Memory Mountain

[Figure: read throughput (MB/s, 0 to 7000) plotted against stride (x8 bytes, s1 to s32) and size (bytes, 2K to 64M). Regions are labeled L1, L2, L3, and Mem, and the positions of copyij and copyji are marked. Machine: Intel Core i7, 2.67 GHz, 32 KB L1 d-cache, 256 KB L2 cache, 8 MB L3 cache.]


Great Reality #4: There’s more to performance than asymptotic complexity

- Constant factors matter too!
- And even exact op count does not predict performance
  - Easily see 10:1 performance range depending on how code is written
  - Must optimize at multiple levels: algorithm, data representations, procedures, and loops
- Must understand system to optimize performance
  - How programs are compiled and executed
  - How to measure program performance and identify bottlenecks
  - How to improve performance without destroying code modularity and generality


Example Matrix Multiplication

[Figure: Matrix-Matrix Multiplication (MMM) on 2 x Core 2 Duo 3 GHz (double precision), performance in Gflop/s. Two curves: triple loop and best code (K. Goto). The gap between them is 160x.]

- Standard desktop computer, vendor compiler, using optimization flags
- Both implementations have exactly the same operations count (2n^3)
- What is going on?


MMM Plot: Analysis

[Figure: Matrix-Matrix Multiplication (MMM) on 2 x Core 2 Duo 3 GHz, performance in Gflop/s, annotated with the following speedups.]
- Memory hierarchy and other optimizations: 20x
- Vector instructions: 4x
- Multiple threads: 4x

- Reason for 20x: blocking or tiling, loop unrolling, array scalarization, instruction scheduling, search to find best choice
- Effect: fewer register spills, L1/L2 cache misses, and TLB misses
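Blocking (tiling) and array scalarization, two of the techniques named above, can be illustrated with a small sketch. This is not the tuned code measured in the plot: the function names and the block-size parameter bs are assumptions for illustration, and a real implementation would also unroll loops, use vector instructions, and search for the best block size.

```c
#include <stddef.h>

/* c += a * b for n x n row-major matrices, plain triple loop. */
void mmm_triple(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

/* Same computation, visiting the matrices in bs x bs tiles so that
 * each tile is reused while it is still in cache. bs must be > 0. */
void mmm_blocked(size_t n, size_t bs,
                 const double *a, const double *b, double *c)
{
    for (size_t ii = 0; ii < n; ii += bs)
        for (size_t jj = 0; jj < n; jj += bs)
            for (size_t kk = 0; kk < n; kk += bs)
                for (size_t i = ii; i < ii + bs && i < n; i++)
                    for (size_t j = jj; j < jj + bs && j < n; j++) {
                        /* Scalarization: accumulate in a local variable
                         * that the compiler can keep in a register. */
                        double sum = c[i * n + j];
                        for (size_t k = kk; k < kk + bs && k < n; k++)
                            sum += a[i * n + k] * b[k * n + j];
                        c[i * n + j] = sum;
                    }
}
```

Both functions perform the same 2n^3 floating-point operations; the blocked version only changes the order in which memory is touched, which is where the reduction in cache and TLB misses comes from.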


Great Reality #5: Computers do more than execute programs

- They need to get data in and out
  - I/O system critical to program reliability and performance
- They communicate with each other over networks
  - Many system-level issues arise in presence of network
    - Concurrent operations by autonomous processes
    - Coping with unreliable media
    - Cross platform compatibility
    - Complex performance issues


Role within Hacettepe CENG Curriculum

[Diagram: BIL 220 Introduction to Computer Systems shown in relation to the courses below. Edge labels in the diagram: Data Reps., Memory Model; Processes, Memory Management; Network Protocols; Execution Model, Memory System; System Implementation.]

- BIL 131 Bilgisayar Programlama I (Computer Programming I)
- BIL 212 Computer Organization
- BIL 323 Operating Systems I
- BIL 324 Operating Systems II
- BIL 341 Software Lab
- BIL 341 Software Lab 2
- BIL 354 Database Systems
- BIL 402 Microprocessors
- BIL 406 Microprocessing Lab
- BIL 410 Advanced Computer Architectures
- BIL 426 Computer Networks
- BIL 428 Computer Networks Lab


Course Perspective

- Most Systems Courses are Builder-Centric
  - Computer Architecture
    - Design pipelined processor in Verilog
  - Operating Systems
    - Implement large portions of operating system
  - Compilers
    - Write compiler for simple language
  - Networking
    - Implement and simulate network protocols


Course Perspective (Cont.)

- Our Course is Programmer-Centric
  - Purpose is to show how, by knowing more about the underlying system, one can be more effective as a programmer
  - Enable you to
    - Write programs that are more reliable and efficient
  - Not just a course for dedicated hackers
    - We bring out the hidden hacker in everyone
  - Cover material in this course that you won’t see elsewhere


Teaching staff

Instructors
- Aykut Erdem
  [email protected]
  Office: 111
  Tel: 297 7500 / 146
  Office Hours: Tuesday, 13:00-15:00
- Erkut Erdem
  [email protected]
  Office: 114
  Tel: 297 7500 / 149
  Office Hours: Friday, 13:00-15:00

Teaching Assistants
- Ali Caglayan
  [email protected]
  Tel: 297 7500 / 165
  Office Hours: To be announced
- Oguzhan Guclu
  [email protected]
  Tel: 297 7500 / 149
  Office Hours: To be announced


Textbook

- Randal E. Bryant and David R. O’Hallaron,
  - “Computer Systems: A Programmer’s Perspective, Second Edition” (CS:APP2e), Prentice Hall, 2011
  - http://csapp.cs.cmu.edu
- This book really matters for the course!
  - How to solve labs
  - Practice problems typical of exam problems


Course Components

- Lectures
  - Higher level concepts
- Recitations
  - Applied concepts, important tools and skills for labs, clarification of lectures, exam coverage
- Programming Assignments (4)
  - The heart of the course
  - 2-3 weeks each
  - Provide in-depth understanding of an aspect of systems
  - Programming and measurement
- Exams (2 midterms + final)
  - Test your understanding of concepts & mathematical principles


Getting Help

- Course web page:
  - http://web.cs.hacettepe.edu.tr/~bil220
  - Complete schedule of lectures, exams, and assignments
  - Lecture notes and assignments
- Communication
  - Announcements and course related discussions
  - Through Piazza: https://www.piazza.com/hacettepe.edu.tr/spring2012/bil220


Policies: Grading

- Midterm 1 (15%): 29 March 2012
- Midterm 2 (15%): 26 April 2012
- Final (40%): To be announced
- 4 Programming Assignments (25%)
- Class participation (5%)


Course Overview (1)

- Programs and Data
  - Bit operations, arithmetic, assembly language programs
  - Representation of C control and data structures
  - Includes aspects of architecture and compilers
- Performance
  - Co-optimization (control and data), measuring time on a computer
  - Includes aspects of architecture, compilers, and OS
- The Memory Hierarchy
  - Memory technology, memory hierarchy, caches, disks, locality
  - Includes aspects of architecture and OS


Course Overview (2)

- Linking
  - Ideas from static and dynamic linking
  - Includes aspects of architecture and compilers
- Exceptional Control Flow
  - Hardware exceptions, processes, process control, Unix signals, non-local jumps
  - Includes aspects of compilers, OS, and architecture
- Virtual Memory
  - Virtual memory, address translation, dynamic storage allocation
  - Includes aspects of architecture and OS
- System Level I/O
  - Basic concepts of Unix I/O such as files and descriptors
  - Includes aspects of compilers and OS


Assignments

- PA1 (datalab): Manipulating bits
- PA2 (bomblab): Defusing a binary bomb
- PA3 (bufferlab): Hacking a buffer bomb
- PA4 (shelllab): Writing your own Unix shell

You will develop all your programming assignments on a machine running Linux OS!


Assignment Rationale

- Each assignment has a well-defined goal such as solving a puzzle or winning a contest
- Doing the lab should result in new skills and concepts
- We try to use competition in a fun and healthy way
  - Set a reasonable threshold for full credit
  - Post intermediate results (anonymized) on Web page for glory!


Policies: Assignments

- Work groups
  - You must work alone on all assignments unless otherwise stated
- Submission
  - Assignments due at 23:59 on Wednesday evening
  - Electronic submissions (no exceptions!)
- Lateness penalties
  - Get penalized 20% per day
  - No late submission later than 3 days after due date


Cheating

- What is cheating?
  - Sharing code: by copying, retyping, looking at, or supplying a file
  - Coaching: helping your friend to write a lab, line by line
  - Copying code from previous course or from elsewhere on WWW
    - Only allowed to use code we supply, or from CS:APP website
- What is NOT cheating?
  - Explaining how to use systems or tools
  - Helping others with high-level design issues
- Penalty for cheating:
  - Removal from course with failing grade
- Detection of cheating:
  - We do check
  - Our tools for doing this are much better than most cheaters think!


Always keep in mind

- This is a MUST course.
- In order to get the most out of the course, try to stay ahead.
  - By the weekend, make sure you have reviewed the material covered in the lectures of the preceding week.
- If you have any questions or just want to chat about the course, do not hesitate to go to office hours.
- Nothing is easy, but don’t give up!


We aim high!

- Our goal is to make BIL 220 one of the most highly respected and most enjoyable courses in the department!

Photo courtesy of Flickr user richardlamprecht


WELCOME to BIL 220!


Hello World!

Section 1.2: Programs Are Translated by Other Programs into Different Forms

Figure 1.3 The compilation system.
[Figure: hello.c (source program, text) ➙ preprocessor (cpp) ➙ hello.i (modified source program, text) ➙ compiler (cc1) ➙ hello.s (assembly program, text) ➙ assembler (as) ➙ hello.o (relocatable object program, binary), together with printf.o ➙ linker (ld) ➙ hello (executable object program, binary).]

Here, the gcc compiler driver reads the source file hello.c and translates it into an executable object file hello. The translation is performed in the sequence of four phases shown in Figure 1.3. The programs that perform the four phases (preprocessor, compiler, assembler, and linker) are known collectively as the compilation system.

- Preprocessing phase. The preprocessor (cpp) modifies the original C program according to directives that begin with the # character. For example, the #include <stdio.h> command in line 1 of hello.c tells the preprocessor to read the contents of the system header file stdio.h and insert it directly into the program text. The result is another C program, typically with the .i suffix.

- Compilation phase. The compiler (cc1) translates the text file hello.i into the text file hello.s, which contains an assembly-language program. Each statement in an assembly-language program exactly describes one low-level machine-language instruction in a standard text form. Assembly language is useful because it provides a common output language for different compilers for different high-level languages. For example, C compilers and Fortran compilers both generate output files in the same assembly language.

- Assembly phase. Next, the assembler (as) translates hello.s into machine-language instructions, packages them in a form known as a relocatable object program, and stores the result in the object file hello.o. The hello.o file is a binary file whose bytes encode machine language instructions rather than characters. If we were to view hello.o with a text editor, it would appear to be gibberish.

- Linking phase. Notice that our hello program calls the printf function, which is part of the standard C library provided by every C compiler. The printf function resides in a separate precompiled object file called printf.o, which must somehow be merged with our hello.o program. The linker (ld) handles this merging. The result is the hello file, which is an executable object file (or simply executable) that is ready to be loaded into memory and executed by the system.

    unix> gcc -o hello hello.c

    #include <stdio.h>

    int main()
    {
        printf("hello, world\n");
    }


Understanding How Compilation Systems Work Really Matters!

- Optimizing program performance
  - Understanding the basics of machine code and how the compiler translates different C statements into machine code is a must for better C programming
- Understanding link-time errors
  - Some of the most puzzling programming errors are related to linking
- Avoiding security holes
  - Understanding the consequences of the way data and control information are stored on the program stack is crucial for secure programming



Hardware Organization of a System

Chapter 1: A Tour of Computer Systems

Figure 1.4 Hardware organization of a typical system. CPU: Central Processing Unit, ALU: Arithmetic/Logic Unit, PC: Program counter, USB: Universal Serial Bus.
[Figure components: CPU (register file, PC, ALU, bus interface); system bus; I/O bridge; memory bus; main memory; I/O bus; USB controller (mouse, keyboard); graphics adapter (display); disk controller (disk, with the hello executable stored on disk); expansion slots for other devices such as network adapters.]

systems, but all systems have a similar look and feel. Don’t worry about the complexity of this figure just now. We will get to its various details in stages throughout the course of the book.

Buses

Running throughout the system is a collection of electrical conduits called buses that carry bytes of information back and forth between the components. Buses are typically designed to transfer fixed-sized chunks of bytes known as words. The number of bytes in a word (the word size) is a fundamental system parameter that varies across systems. Most machines today have word sizes of either 4 bytes (32 bits) or 8 bytes (64 bits). For the sake of our discussion here, we will assume a word size of 4 bytes, and we will assume that buses transfer only one word at a time.

I/O Devices

Input/output (I/O) devices are the system’s connection to the external world. Our example system has four I/O devices: a keyboard and mouse for user input, a display for user output, and a disk drive (or simply disk) for long-term storage of data and programs. Initially, the executable hello program resides on the disk.

Each I/O device is connected to the I/O bus by either a controller or an adapter. The distinction between the two is mainly one of packaging. Controllers are chip sets in the device itself or on the system’s main printed circuit board (often called the motherboard). An adapter is a card that plugs into a slot on the motherboard. Regardless, the purpose of each is to transfer information back and forth between the I/O bus and an I/O device.

Running the hello Program

[Figure 1.5: Reading the hello command from the keyboard. The user types "hello"; the characters travel from the keyboard through the USB controller and I/O bridge to the CPU's register file, and are then stored in main memory.]

Running the hello Program

[Figure 1.6: Loading the executable from disk into main memory. The hello code and the string "hello, world\n" are copied from the hello executable stored on disk into main memory.]

Running the hello Program

[Figure 1.7: Writing the output string from memory to the display. The string "hello, world\n" is copied from main memory to the register file, and from there through the I/O bridge and graphics adapter to the display.]

1.5 Caches Matter

An important lesson from this simple example is that a system spends a lot of time moving information from one place to another. The machine instructions in the hello program are originally stored on disk. When the program is loaded, they are copied to main memory. As the processor runs the program, instructions are copied from main memory into the processor. Similarly, the data string “hello, world\n”, originally on disk, is copied to main memory, and then copied from main memory to the display device. From a programmer’s perspective, much of this copying is overhead that slows down the “real work” of the program. Thus, a major goal for system designers is to make these copy operations run as fast as possible.

Because of physical laws, larger storage devices are slower than smaller storage devices. And faster devices are more expensive to build than their slower counterparts. For example, the disk drive on a typical system might be 1000 times larger than the main memory, but it might take the processor 10,000,000 times longer to read a word from disk than from memory.

Similarly, a typical register file stores only a few hundred bytes of information, as opposed to billions of bytes in the main memory. However, the processor can read data from the register file almost 100 times faster than from memory. Even more troublesome, as semiconductor technology progresses over the years, this processor-memory gap continues to increase. It is easier and cheaper to make processors run faster than it is to make main memory run faster.

To deal with the processor-memory gap, system designers include smaller faster storage devices called cache memories (or simply caches) that serve as temporary staging areas for information that the processor is likely to need in the near future. Figure 1.8 shows the cache memories in a typical system. An L1

Writing the output string from memory to the display.

Caches Matter

¢ A system spends a lot of time moving information from one place to another.

¢ Larger storage devices are slower than smaller storage devices.
§ e.g., the processor can read data from the register file almost 100 times faster than from main memory.

¢ Faster devices are more expensive to build than their slower counterparts.

¢ Cache memories serve as temporary staging areas

[Figure 1.8: Cache memories. On the CPU chip, the register file, ALU, and cache memories connect through the bus interface to the system bus, the I/O bridge, the memory bus, and main memory.]

cache on the processor chip holds tens of thousands of bytes and can be accessed nearly as fast as the register file. A larger L2 cache with hundreds of thousands to millions of bytes is connected to the processor by a special bus. It might take 5 times longer for the processor to access the L2 cache than the L1 cache, but this is still 5 to 10 times faster than accessing the main memory. The L1 and L2 caches are implemented with a hardware technology known as static random access memory (SRAM). Newer and more powerful systems even have three levels of cache: L1, L2, and L3. The idea behind caching is that a system can get the effect of both a very large memory and a very fast one by exploiting locality, the tendency for programs to access data and code in localized regions. By setting up caches to hold data that is likely to be accessed often, we can perform most memory operations using the fast caches.

One of the most important lessons in this book is that application programmers who are aware of cache memories can exploit them to improve the performance of their programs by an order of magnitude. You will learn more about these important devices and how to exploit them in Chapter 6.
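As a rough illustration of locality, the sketch below sums the same two-dimensional array in two orders. The function names and the 256 x 256 size are assumptions made for this example and do not come from the slides. Both functions compute the same result; the row-major version walks through adjacent addresses, which is the access pattern caches tend to reward, while the column-major version jumps a whole row ahead on every access. How much slower the second version runs depends on the machine.

```c
#include <stddef.h>

#define ROWS 256   /* illustrative size, not from the slides */
#define COLS 256

/* Row-major traversal: consecutive accesses touch adjacent addresses,
   so every byte of a fetched cache line gets used (good spatial locality). */
long sum_row_major(int a[ROWS][COLS])
{
    long sum = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            sum += a[i][j];
    return sum;
}

/* Column-major traversal: consecutive accesses are COLS * sizeof(int)
   bytes apart, so spatial locality is poor. */
long sum_col_major(int a[ROWS][COLS])
{
    long sum = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            sum += a[i][j];
    return sum;
}
```

Timing the two functions on a larger array (for example with clock() from <time.h>) is one way to observe the effect on a particular system.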

1.6 Storage Devices Form a Hierarchy

This notion of inserting a smaller, faster storage device (e.g., cache memory) between the processor and a larger slower device (e.g., main memory) turns out to be a general idea. In fact, the storage devices in every computer system are organized as a memory hierarchy similar to Figure 1.9. As we move from the top of the hierarchy to the bottom, the devices become slower, larger, and less costly per byte. The register file occupies the top level in the hierarchy, which is known as level 0, or L0. We show three levels of caching L1 to L3, occupying memory hierarchy levels 1 to 3. Main memory occupies level 4, and so on.

The main idea of a memory hierarchy is that storage at one level serves as a cache for storage at the next lower level. Thus, the register file is a cache for the L1 cache. Caches L1 and L2 are caches for L2 and L3, respectively. The L3 cache is a cache for the main memory, which is a cache for the disk. On some networked systems with distributed file systems, the local disk serves as a cache for data stored on the disks of other systems.

The Memory Hierarchy

¢ Storage devices form a hierarchy
¢ Storage at one level serves as a cache for storage at the next lower level.

Figure 1.9 An example of a memory hierarchy. Toward the top, devices are smaller, faster, and costlier (per byte); toward the bottom, they are larger, slower, and cheaper (per byte).

L0: Regs. CPU registers hold words retrieved from cache memory.
L1: L1 cache (SRAM). L1 cache holds cache lines retrieved from L2 cache.
L2: L2 cache (SRAM). L2 cache holds cache lines retrieved from L3 cache.
L3: L3 cache (SRAM). L3 cache holds cache lines retrieved from memory.
L4: Main memory (DRAM). Main memory holds disk blocks retrieved from local disks.
L5: Local secondary storage (local disks). Local disks hold files retrieved from disks on remote network server.
L6: Remote secondary storage (distributed file systems, Web servers).

Just as programmers can exploit knowledge of the different caches to improve performance, programmers can exploit their understanding of the entire memory hierarchy. Chapter 6 will have much more to say about this.

1.7 The Operating System Manages the Hardware

Back to our hello example. When the shell loaded and ran the hello program, and when the hello program printed its message, neither program accessed the keyboard, display, disk, or main memory directly. Rather, they relied on the services provided by the operating system. We can think of the operating system as a layer of software interposed between the application program and the hardware, as shown in Figure 1.10. All attempts by an application program to manipulate the hardware must go through the operating system.

The operating system has two primary purposes: (1) to protect the hardware from misuse by runaway applications, and (2) to provide applications with simple and uniform mechanisms for manipulating complicated and often wildly different low-level hardware devices. The operating system achieves both goals via the

[Figure 1.10: Layered view of a computer system. Software: application programs on top of the operating system. Hardware: processor, main memory, I/O devices.]

The Operating System Manages the Hardware

¢ The operating system as a layer of software interposed between the application program and the hardware

¢ All attempts by an application program to manipulate the hardware must go through the operating system.

¢ The operating system has two primary purposes:
1. To protect the hardware from misuse by runaway applications, and
2. To provide applications with simple and uniform mechanisms for manipulating complicated and often wildly different low-level hardware devices.

Abstractions provided by an OS

¢ The operating system achieves both goals via the fundamental abstractions:
§ Processes: The operating system’s abstraction for running programs.
§ Virtual Memory: An abstraction that provides each process with the illusion that it has exclusive use of the main memory.
§ Files: Abstractions for I/O devices

[Figure 1.11: Abstractions provided by an operating system. Files cover I/O devices; virtual memory covers main memory and I/O devices; processes cover the processor, main memory, and I/O devices.]

fundamental abstractions shown in Figure 1.11: processes, virtual memory, and files. As this figure suggests, files are abstractions for I/O devices, virtual memory is an abstraction for both the main memory and disk I/O devices, and processes are abstractions for the processor, main memory, and I/O devices. We will discuss each in turn.

Aside: Unix and Posix

The 1960s was an era of huge, complex operating systems, such as IBM’s OS/360 and Honeywell’s Multics systems. While OS/360 was one of the most successful software projects in history, Multics dragged on for years and never achieved wide-scale use. Bell Laboratories was an original partner in the Multics project, but dropped out in 1969 because of concern over the complexity of the project and the lack of progress. In reaction to their unpleasant Multics experience, a group of Bell Labs researchers (Ken Thompson, Dennis Ritchie, Doug McIlroy, and Joe Ossanna) began work in 1969 on a simpler operating system for a DEC PDP-7 computer, written entirely in machine language. Many of the ideas in the new system, such as the hierarchical file system and the notion of a shell as a user-level process, were borrowed from Multics but implemented in a smaller, simpler package. In 1970, Brian Kernighan dubbed the new system “Unix” as a pun on the complexity of “Multics.” The kernel was rewritten in C in 1973, and Unix was announced to the outside world in 1974 [89].

Because Bell Labs made the source code available to schools with generous terms, Unix developed a large following at universities. The most influential work was done at the University of California at Berkeley in the late 1970s and early 1980s, with Berkeley researchers adding virtual memory and the Internet protocols in a series of releases called Unix 4.xBSD (Berkeley Software Distribution). Concurrently, Bell Labs was releasing their own versions, which became known as System V Unix. Versions from other vendors, such as the Sun Microsystems Solaris system, were derived from these original BSD and System V versions.

Trouble arose in the mid 1980s as Unix vendors tried to differentiate themselves by adding new and often incompatible features. To combat this trend, IEEE (Institute for Electrical and Electronics Engineers) sponsored an effort to standardize Unix, later dubbed “Posix” by Richard Stallman. The result was a family of standards, known as the Posix standards, that cover such issues as the C language interface for Unix system calls, shell programs and utilities, threads, and network programming. As more systems comply more fully with the Posix standards, the differences between Unix versions are gradually disappearing.

Binary Representations

[Figure: a voltage waveform read as the bits 0, 1, 0. Voltage axis marks: 0.0V, 0.5V, 2.8V, 3.3V.]

Encoding Byte Values

¢ Byte = 8 bits
§ Binary: 00000000₂ to 11111111₂
§ Decimal: 0₁₀ to 255₁₀
§ Hexadecimal: 00₁₆ to FF₁₆
  § Base 16 number representation
  § Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’
  § Write FA1D37B₁₆ in C as
    – 0xFA1D37B
    – 0xfa1d37b

Hex  Decimal  Binary
0    0        0000
1    1        0001
2    2        0010
3    3        0011
4    4        0100
5    5        0101
6    6        0110
7    7        0111
8    8        1000
9    9        1001
A    10       1010
B    11       1011
C    12       1100
D    13       1101
E    14       1110
F    15       1111
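The three notations in the table can be produced for any byte with a few lines of C. The helper names below (byte_to_binary, print_byte) are made up for this sketch. C's printf has %u and %X for decimal and hexadecimal but no standard binary conversion in older C standards, so the binary string is built by hand.

```c
#include <stdio.h>

/* Write the 8 bits of `byte` into out[0..7], most significant bit first,
   followed by a terminating '\0'. `out` must hold at least 9 chars. */
void byte_to_binary(unsigned char byte, char out[9])
{
    for (int i = 0; i < 8; i++)
        out[i] = ((byte >> (7 - i)) & 1) ? '1' : '0';
    out[8] = '\0';
}

/* Print one byte in binary, decimal, and hexadecimal. */
void print_byte(unsigned char byte)
{
    char bits[9];
    byte_to_binary(byte, bits);
    printf("binary %s  decimal %3u  hex 0x%02X\n",
           bits, (unsigned)byte, (unsigned)byte);
}
```

Calling print_byte(0xFA) shows how the two hex digits F and A each map to one 4-bit group of the binary form.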

Byte-Oriented Memory Organization

¢ Programs Refer to Virtual Addresses
§ Conceptually very large array of bytes
§ Actually implemented with hierarchy of different memory types
§ System provides address space private to particular “process”
  § Program being executed
  § Program can clobber its own data, but not that of others

¢ Compiler + Run-Time System Control Allocation
§ Where different program objects should be stored
§ All allocation within single virtual address space

Machine Words

¢ Machine Has “Word Size”
§ Nominal size of integer-valued data
  § Including addresses
§ Most current machines use 32-bit (4-byte) words
  § Limits addresses to 4GB
  § Becoming too small for memory-intensive applications
§ High-end systems use 64-bit (8-byte) words
  § Potential address space ≈ 1.8 × 10^19 bytes
  § x86-64 machines support 48-bit addresses: 256 Terabytes
§ Machines support multiple data formats
  § Fractions or multiples of word size
  § Always integral number of bytes

Word-Oriented Memory Organization

¢ Addresses Specify Byte Locations
§ Address of first byte in word
§ Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

Byte addresses   32-bit word   64-bit word
0000 to 0003     Addr = 0000   Addr = 0000
0004 to 0007     Addr = 0004
0008 to 0011     Addr = 0008   Addr = 0008
0012 to 0015     Addr = 0012

Data Representations

Sizes in bytes:

C Data Type    Typical 32-bit   Intel IA32   x86-64
char           1                1            1
short          2                2            2
int            4                4            4
long           4                4            8
long long      8                8            8
float          4                4            4
double         8                8            8
long double    8                10/12        10/16
pointer        4                4            8
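The sizes in the table can be checked on whatever machine is at hand with sizeof. The sketch below is only illustrative: the function names are invented here, and the numbers it prints depend on the compiler and platform, so they may not match any column of the table exactly.

```c
#include <limits.h>
#include <stdio.h>

/* Print the size in bytes of each C type from the table,
   as seen by the compiler used to build this file. */
void print_type_sizes(void)
{
    printf("char        %zu\n", sizeof(char));
    printf("short       %zu\n", sizeof(short));
    printf("int         %zu\n", sizeof(int));
    printf("long        %zu\n", sizeof(long));
    printf("long long   %zu\n", sizeof(long long));
    printf("float       %zu\n", sizeof(float));
    printf("double      %zu\n", sizeof(double));
    printf("long double %zu\n", sizeof(long double));
    printf("pointer     %zu\n", sizeof(void *));
}

/* Width of a pointer in bits; a common stand-in for the
   machine's word size. */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}
```

The long and pointer rows are the ones most likely to differ between a 32-bit and a 64-bit build.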

Byte Ordering

¢ How should bytes within a multi-byte word be ordered in memory?

¢ Conventions
§ Big Endian: Sun, PPC Mac, Internet
  § Least significant byte has highest address
§ Little Endian: x86
  § Least significant byte has lowest address

Byte Ordering Example

¢ Big Endian
§ Least significant byte has highest address

¢ Little Endian
§ Least significant byte has lowest address

¢ Example
§ Variable x has 4-byte representation 0x01234567
§ Address given by &x is 0x100

Address         0x100  0x101  0x102  0x103
Big Endian      01     23     45     67
Little Endian   67     45     23     01
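The two layouts above can be reproduced in C without depending on the host's own byte order. The sketch below uses hypothetical helper names: two functions write a 32-bit value into a byte array in each order using shifts, and a third inspects the lowest-addressed byte of an integer to tell which convention the running machine uses.

```c
#include <stdint.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one, by looking
   at which byte of a multi-byte integer sits at the lowest address. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    return *(unsigned char *)&probe == 1;
}

/* Most significant byte goes to the lowest address (out[0]). */
void store_big_endian(uint32_t x, unsigned char out[4])
{
    out[0] = (unsigned char)(x >> 24);
    out[1] = (unsigned char)(x >> 16);
    out[2] = (unsigned char)(x >> 8);
    out[3] = (unsigned char)x;
}

/* Least significant byte goes to the lowest address (out[0]). */
void store_little_endian(uint32_t x, unsigned char out[4])
{
    out[0] = (unsigned char)x;
    out[1] = (unsigned char)(x >> 8);
    out[2] = (unsigned char)(x >> 16);
    out[3] = (unsigned char)(x >> 24);
}
```

For x = 0x01234567, store_big_endian fills the array with 01 23 45 67 and store_little_endian with 67 45 23 01, matching the two rows of the example.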

The origin of “endian”

“Gulliver finds out that there is a law, proclaimed by the grandfather of the present ruler, requiring all citizens of Lilliput to break their eggs only at the little ends. Of course, all those citizens who broke their eggs at the big ends were angered by the proclamation. Civil war broke out between the Little-Endians and the Big-Endians, resulting in the Big-Endians taking refuge on a nearby island, the kingdom of Blefuscu.”

– Danny Cohen, On Holy Wars and A Plea For Peace, 1980

Illustration by Chris Beatrice

Reading Byte-Reversed Listings

¢ Disassembly
§ Text representation of binary machine code
§ Generated by program that reads the machine code

¢ Example Fragment

Address    Instruction Code        Assembly Rendition
8048365:   5b                      pop %ebx
8048366:   81 c3 ab 12 00 00       add $0x12ab,%ebx
804836c:   83 bb 28 00 00 00 00    cmpl $0x0,0x28(%ebx)

¢ Deciphering Numbers
§ Value: 0x12ab
§ Pad to 32 bits: 0x000012ab
§ Split into bytes: 00 00 12 ab
§ Reverse: ab 12 00 00
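The deciphering steps can also be run in the other direction: given the four bytes as they appear in the listing, rebuild the value. The function below is a small sketch (the name read_le32 is an assumption for this example) that treats the first byte as least significant, which is how the x86 listing above stores the immediate operand.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored in little-endian order:
   bytes[0] is the least significant byte, bytes[3] the most significant. */
uint32_t read_le32(const unsigned char bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Feeding it the bytes ab 12 00 00 from the add instruction gives back 0x12ab.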

Examining Data Representations

¢ Code to Print Byte Representation of Data
§ Casting pointer to unsigned char * creates byte array

Printf directives:
%p: Print pointer
%x: Print Hexadecimal

#include <stdio.h>

typedef unsigned char *pointer;

/* Print the address and hex value of each of the len bytes at start. */
void show_bytes(pointer start, int len) {
    int i;
    for (i = 0; i < len; i++)
        printf("%p\t0x%.2x\n", (void *)(start + i), start[i]);  /* %p expects void * */
    printf("\n");
}

show_bytes Execution Example

int a = 15213;
printf("int a = 15213;\n");
show_bytes((pointer) &a, sizeof(int));

Result (Linux):

int a = 15213;
0x11ffffcb8  0x6d
0x11ffffcb9  0x3b
0x11ffffcba  0x00
0x11ffffcbb  0x00

Representing Integers

Decimal: 15213
Binary:  0011 1011 0110 1101
Hex:     3    B    6    D

Bytes listed in order of increasing address:

int A = 15213;
  IA32, x86-64:  6D 3B 00 00
  Sun:           00 00 3B 6D

int B = -15213;  (two’s complement representation, covered later)
  IA32, x86-64:  93 C4 FF FF
  Sun:           FF FF C4 93

long int C = 15213;
  IA32:          6D 3B 00 00
  x86-64:        6D 3B 00 00 00 00 00 00
  Sun:           00 00 3B 6D
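The bytes shown for B can be checked by hand. Two's complement is only covered later in the course, so the following is just a preview sketch with an invented function name: negating a 32-bit pattern amounts to inverting every bit and adding 1.

```c
#include <stdint.h>

/* Two's-complement negation of a 32-bit pattern:
   invert every bit, then add 1 (wrapping modulo 2^32). */
uint32_t negate_bits(uint32_t x)
{
    return ~x + 1u;
}
```

Starting from 15213 = 0x00003B6D, inverting gives 0xFFFFC492 and adding 1 gives 0xFFFFC493. Written least significant byte first, that is 93 C4 FF FF, the IA32 and x86-64 row for B.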

Representing Pointers

int B = -15213;
int *P = &B;

Bytes of P listed in order of increasing address:

Sun:     EF FF FB 2C
IA32:    D4 F8 FF BF
x86-64:  0C 89 EC FF FF 7F 00 00

Different compilers & machines assign different locations to objects

Representing Strings

char S[6] = "18243";

¢ Strings in C
§ Represented by array of characters
§ Each character encoded in ASCII format
  § Standard 7-bit encoding of character set
  § Character “0” has code 0x30
    – Digit i has code 0x30+i
§ String should be null-terminated
  § Final character = 0

¢ Compatibility
§ Byte ordering not an issue

Linux/Alpha:  31 38 32 34 33 00
Sun:          31 38 32 34 33 00
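The rule "digit i has code 0x30+i" is enough to build the byte sequence above. The helpers below are a sketch with made-up names, and they assume an ASCII execution character set, as the slide does.

```c
#include <stddef.h>

/* ASCII code of decimal digit `digit` (0..9), following the rule 0x30 + i. */
unsigned char digit_code(int digit)
{
    return (unsigned char)(0x30 + digit);
}

/* Build a null-terminated string from n decimal digits.
   `out` must have room for n + 1 bytes. */
void digits_to_string(const int *digits, size_t n, char *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (char)digit_code(digits[i]);
    out[n] = '\0';   /* final character = 0 */
}
```

For the digits 1, 8, 2, 4, 3 this produces the six bytes 31 38 32 34 33 00. Since each character occupies a single byte, there is nothing for byte ordering to rearrange.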

Boolean Algebra

¢ Developed by George Boole in 19th Century
§ Algebraic representation of logic
  § Encode “True” as 1 and “False” as 0

And
§ A&B = 1 when both A=1 and B=1

Or
§ A|B = 1 when either A=1 or B=1 (or both)

Not
§ ~A = 1 when A=0

Exclusive-Or (Xor)
§ A^B = 1 when either A=1 or B=1, but not both

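Application of Boolean Algebra

¢ Applied to Digital Systems by Claude Shannon
§ 1937 MIT Master’s Thesis
§ Reason about networks of relay switches
  § Encode closed switch as 1, open switch as 0

[Figure: a relay network with two parallel paths. One path has switches A and ~B in series (connection when A&~B); the other has ~A and B in series (connection when ~A&B). Connection when A&~B | ~A&B = A^B.]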
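The identity in the figure, A&~B | ~A&B = A^B, can be checked exhaustively in C. The function name below is invented for this sketch; it builds exclusive-or from AND, OR, and NOT only, applied bitwise to a byte.

```c
/* Exclusive-or built only from AND, OR and NOT, following the two
   parallel switch paths: (A and not B) or (not A and B). */
unsigned char xor_from_and_or_not(unsigned char a, unsigned char b)
{
    return (unsigned char)((a & ~b) | (~a & b));
}
```

Comparing the result with a ^ b for every pair of byte values covers all cases, since each bit position behaves like one independent pair of switches.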

General Boolean Algebras

¢ Operate on Bit Vectors
§ Operations applied bitwise

  01101001     01101001     01101001
& 01010101   | 01010101   ^ 01010101   ~ 01010101
  --------     --------     --------     --------
  01000001     01111101     00111100     10101010

¢ All of the Properties of Boolean Algebra Apply

Representing & Manipulating Sets

¢ Representation
§ Width w bit vector represents subsets of {0, …, w–1}
§ a_j = 1 if j ∈ A

  01101001    { 0, 3, 5, 6 }
  76543210

  01010101    { 0, 2, 4, 6 }
  76543210

¢ Operations
§ &   Intersection           01000001   { 0, 6 }
§ |   Union                  01111101   { 0, 2, 3, 4, 5, 6 }
§ ^   Symmetric difference   00111100   { 2, 3, 4, 5 }
§ ~   Complement             10101010   { 1, 3, 5, 7 }
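The set operations above map directly onto C's bitwise operators. The sketch below uses an 8-bit vector for subsets of {0, …, 7}; the type name bitset8 and the function names are assumptions chosen for this example.

```c
#include <stdint.h>

typedef uint8_t bitset8;   /* bit j set <=> j is a member, j in 0..7 */

/* Return s with element j added. */
bitset8 set_add(bitset8 s, unsigned j) { return (bitset8)(s | (1u << j)); }

/* Return 1 if j is a member of s, else 0. */
int set_contains(bitset8 s, unsigned j) { return (int)((s >> j) & 1u); }

bitset8 set_intersection(bitset8 a, bitset8 b) { return (bitset8)(a & b); }
bitset8 set_union(bitset8 a, bitset8 b)        { return (bitset8)(a | b); }
bitset8 set_symmetric_difference(bitset8 a, bitset8 b) { return (bitset8)(a ^ b); }

/* Complement relative to the universe {0, ..., 7}. */
bitset8 set_complement(bitset8 a) { return (bitset8)~a; }
```

Adding 0, 3, 5, and 6 to an empty set yields 01101001 (0x69), the first set on the slide. The casts back to bitset8 matter for the complement, because ~ operates on the promoted int and would otherwise leave bits set above position 7.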

Bit-Level Operations in C

¢ Operations &, |, ~, ^ Available in C
§ Apply to any “integral” data type
  § long, int, short, char, unsigned
§ View arguments as bit vectors
§ Arguments applied bit-wise

¢ Examples (char data type)
§ ~0x41 ➙ 0xBE
  § ~01000001₂ ➙ 10111110₂
§ ~0x00 ➙ 0xFF
  § ~00000000₂ ➙ 11111111₂
§ 0x69 & 0x55 ➙ 0x41
  § 01101001₂ & 01010101₂ ➙ 01000001₂
§ 0x69 | 0x55 ➙ 0x7D
  § 01101001₂ | 01010101₂ ➙ 01111101₂

Contrast: Logic Operations in C

¢ Contrast to Logical Operators
§ &&, ||, !
  § View 0 as “False”
  § Anything nonzero as “True”
  § Always return 0 or 1
  § Early termination

¢ Examples (char data type)
§ !0x41 ➙ 0x00
§ !0x00 ➙ 0x01
§ !!0x41 ➙ 0x01
§ 0x69 && 0x55 ➙ 0x01
§ 0x69 || 0x55 ➙ 0x01
§ p && *p (avoids null pointer access)
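The difference between the bitwise and logical operators is easiest to see side by side. The small functions below are illustrative only and their names are invented for this sketch; the last one shows the p && *p idiom, where early termination keeps *p from being evaluated when p is null.

```c
#include <stddef.h>

/* Bitwise AND: combines the operands bit by bit. */
int bitwise_and(int a, int b) { return a & b; }

/* Logical AND: treats each operand as true/false, returns 0 or 1. */
int logical_and(int a, int b) { return a && b; }

/* Returns 1 if p is non-null and points at a nonzero value, else 0.
   && stops as soon as the result is known, so *p is never read
   when p is NULL. */
int points_to_nonzero(const int *p)
{
    return p && *p;
}
```

With the operands 0x02 and 0x01 the two operators disagree: the bitwise form gives 0 because no bit position is set in both, while the logical form gives 1 because both operands are nonzero.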

Shift Operations

¢ Left Shift: x << y
§ Shift bit-vector x left y positions
  – Throw away extra bits on left
  § Fill with 0’s on right

¢ Right Shift: x >> y
§ Shift bit-vector x right y positions
  § Throw away extra bits on right
§ Logical shift
  § Fill with 0’s on left
§ Arithmetic shift
  § Replicate most significant bit on left

¢ Undefined Behavior
§ Shift amount < 0 or ≥ word size

Argument x     01100010     10100010
<< 3           00010000     00010000
Log. >> 2      00011000     00101000
Arith. >> 2    00011000     11101000
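The shift table can be reproduced on 8-bit values. In the sketch below (function names are assumptions for this example), the logical right shift uses an unsigned type, for which C guarantees zero fill. The arithmetic right shift is written out by hand, because the C standard leaves >> on negative signed values implementation-defined. Shift amounts are assumed to be in the range 0 to 7.

```c
#include <stdint.h>

/* Left shift on 8 bits: bits pushed past bit 7 are discarded. */
uint8_t shift_left(uint8_t x, unsigned k)
{
    return (uint8_t)(x << k);
}

/* Logical right shift: vacated bits on the left are filled with 0s. */
uint8_t shift_right_logical(uint8_t x, unsigned k)
{
    return (uint8_t)(x >> k);
}

/* Arithmetic right shift: vacated bits on the left are filled with
   copies of the original most significant bit. */
uint8_t shift_right_arithmetic(uint8_t x, unsigned k)
{
    uint8_t shifted = (uint8_t)(x >> k);
    if (x & 0x80)                                /* sign bit was set */
        shifted |= (uint8_t)(0xFF << (8 - k));   /* set the top k bits */
    return shifted;
}
```

For x = 10100010 (0xA2) and a shift of 2, the logical version gives 00101000 and the arithmetic version gives 11101000, as in the right-hand column of the table.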

The Strange Birth and Long Life of Unix

¢ http://spectrum.ieee.org/computing/software/the-strange-birth-and-long-life-of-unix

Photo: Alcatel-Lucent