lecture 13 - review
1
Lecture 13 - Review
2
Review
3
Lecture 1 - Address Map - Global vs Local
4
Pointer
int var (var is an integer variable and typically occupies 4 bytes)
int *var (var is a pointer to an integer; *var is the integer it points to)
5
Example
6
Example 1 *
int var = 108;
int *varpointer;
varpointer = &var;     // varpointer now holds the address of var
*varpointer = 123;     // var is now 123
Pointer to an integer
7
Example 2
The pointer points to location 0x0012FF7C
The value stored there is 0x61 = 'a'
8
Array and Pointer *
char a[11] = "1234567890";   // 10 characters plus the terminating 0x00
a[0] is '1'; a[1] is '2'
char *ptr;
ptr = a;                     // same as ptr = &a[0]
*ptr       -> '1' (a[0])
*(ptr + 1) -> '2' (a[1])
Same result, different expression:
array[i]       is the same as  *(array + i)
&array[i]      is the same as  array + i
array[i + j]   is the same as  *(array + i + j)
&array[i + j]  is the same as  array + i + j
9
Multi-Dimensional Arrays – pointer of pointer
array2[7][10]: array2[i][j] is the same as *(*(array2 + i) + j), so **array2 is array2[0][0]
10
Naughty Pointers
A value is modified through a pointer that points to the wrong place.
This can destroy the program and is not recommended.
However, if you can master pointers, you can write very elegant programs.
11
Summary
Integer: 4 bytes, such as int a = 3;
0x0065FDF1: 03
0x0065FDF2: 00
0x0065FDF3: 00
0x0065FDF4: 00
The lowest byte is stored first, so you have to reverse the byte order to read the value 0x00000003
12
Summary
Short: two bytes
short a = 3;
0x0065FDF3: 03
0x0065FDF4: 00
short b = 4;
0x0065FDF0: 04
0x0065FDF1: 00
Not used
As a short uses only 2 bytes, the remaining two bytes in memory, 0x0065FDF2 (0xCC) and 0x0065FDF1 (0xCC), are not used
13
Lecture 2
14
Example – program name abc
#include <stdio.h>
int first;
int second;
void callee ( int first )
{
    int second;
    second = 1;
    first = 2;
    printf("callee: first = %d second = %d\n", first, second);
}
int main (int argc, char *argv[])
{
    first = 1;
    second = 2;
    callee(first);
    printf("caller: first = %d second = %d\n", first, second);
    return 0;
}
DOS>abc 12 34
Here, argc = 3, argv[0] = abc, argv[1] = 12, argv[2] = 34
Same variable name "second", but a different memory location
15
Example (passed by pointer) *
#include <stdio.h>
int first;
int second;
void callee ( int * first )    // not a variable, but an address
{
    int second;
    second = 1;
    *first = 2;                // changes the caller's variable
    printf("callee: first = %d second = %d\n", *first, second);
}
int main (int argc, char *argv[])
{
    first = 1;
    second = 2;
    callee(&first);            // passed by address
    printf("caller: first = %d second = %d\n", first, second);
    return 0;
}
Content by address
16
Diagram - stack push (create)
17
Diagram - stack pop (return)
18
The CPU also Has Memory
The CPU also maintains its own banks of memory called registers.
They temporarily hold the data.
As a result, the program is faster.
Memory hierarchy: register, cache memory, main memory, disk
19
Lecture 3 - attention
20
Bit Operations
AND &
OR |
ONE'S COMPLEMENT ~
EXCLUSIVE OR ^
SHIFT (right) >>
SHIFT (left) <<
21
Operation - examples
AND 1 & 1 = 1; 1 & 0 = 0
OR 1 | 1 = 1; 1 | 0 = 1; 0 | 0 = 0
~ (on a single bit) ~1 = 0; ~0 = 1
^ 0 ^ 0 = 0; 1 ^ 1 = 0; 1 ^ 0 = 1; 0 ^ 1 = 1
>> binary 010 >> 1 = binary 001
<< binary 001 << 1 = binary 010
22
One's complement
~ 1111 0010 (0xf2)
--------------
  0000 1101 (0x0d)
unsigned char c = 0xf2;
unsigned char e = ~c; // e is 0x0d
23
EXCLUSIVE OR
  1111 0010 (0xf2)
^ 1111 1110 (0xfe)
--------------
  0000 1100 (0x0c)
unsigned char c = 0xf2;
unsigned char d = 0xfe;
unsigned char e = c ^ d; // e is 0x0c
24
SHIFT >> (right) by one bit
1111 0010 (0xf2) >> 1 (shift right by one bit)
---------------------
0111 1001 (0x79)
unsigned char c = 0xf2; // unsigned, so a 0 is shifted in from the left
unsigned char e = c >> 1; // e is 0x79
25
SHIFT << (left) by one bit
1111 0010 (0xf2) << 1 (shift left by one bit)
---------------------
1110 0100 (0xe4)
unsigned char c = 0xf2;
unsigned char e = c << 1; // e is 0xe4
26
SHIFT << by two bits
1111 0010 (0xf2) << 2 (shift left by two bits)
---------------------
1100 1000 (0xc8)
unsigned char c = 0xf2;
unsigned char e = c << 2; // e is 0xc8
27
Lecture 4
28
Expression
1 sign bit, 8-bit exponent, and 23-bit mantissa (total 32 bits)
(-1)^Sign * 2^(Exponent - 127) * (1 + Mantissa * 2^-23)
Positive: sign bit is 0. Negative: sign bit is 1.
The exponent is stored as an unsigned value with 127 subtracted. That is, if the stored value is 128, it means 128 – 127 = 1; if the stored value is 255, it means 255 – 127 = 128; and if the stored value is zero, it means 0 – 127 = -127. (IEEE 754 reserves the stored values 0 and 255 for special cases.)
29
Example
30
Example
2.5 (floating point)
0100 0000 0010 0000 0000 0000 0000 0000
Sign: positive (sign bit is 0)
Exponent: 1000 0000 = 128 (128 – 127 = 1)
Mantissa: 1.010 0000 0000 0000 0000 0000 (binary) = 1.25
Result: 1 x 1.25 x 2^1 = 2.5
31
String
A string is an array of characters terminated by a null character (0x00)
char a[4] = "Hi?";
a[0] = 'H';
a[1] = 'i';
a[2] = '?';
a[3] = 0x00
Incorrect declaration: char a[3] = "Hi?"; as there is no room for the 0x00
32
An example
struct {
char a, b, c, cc;
int i;
double d;
} mystruct;
Name is mystruct
33
Lecture 5
34
Static Allocation
The word static (fixed) refers to things that happen at compile time and link time, when the program is constructed.
For example, you can define
char a[9] = "12345678"; // assign 9 bytes for array a
The compiler will assign 9 bytes during compilation.
The linker will assign the correct address for array a.
You cannot change it, even if you find you need 10 bytes while running this program.
35
An example
#include <stdbool.h>
int my_var[128];                         // a statically allocated variable
static bool my_var_initialized = false;  // static declaration
int my_fn(int x)
{
    if (my_var_initialized) return 0;    // already initialized, nothing to do
    my_var_initialized = true;
    for (int i = 0; i < 128; i++)
        my_var[i] = 0;
    return 0;
}
Initially, it is false
36
Dynamic allocation
Limitations of Static Allocation
If two procedures use a local variable named i, there will be a conflict if both i's are globally visible. If i is only declared once, then i will be shared by the two procedures. One might call the other, even indirectly, and cause i to be overwritten unexpectedly. It would be better if each procedure could have its own copy of i.
37
Grab memory
To grab memory, we have to use malloc(size). For example:
ptr = malloc(4) returns a pointer to a block of 4 bytes
ptr = malloc(4 * sizeof(int)) returns a pointer to 16 bytes = 4 x 4 (integer), assuming a 4-byte int
ptr = malloc(4 * sizeof(long)) returns a pointer to 16 bytes = 4 x 4 (long), assuming a 4-byte long
free(ptr) returns the memory that ptr points to
38
Fragmentation – holes
Although there is free memory, it is split into holes, so a large request may still fail
39
Example of First fit
40
Example of Best fit
41
Example of Worst fit
42
Lecture 6
43
Block sizes – the size to hold the data for users’ usage
The standard method for determining the size of a block, given a pointer to the block, is to store its size in the word before the pointer.
Here, the memory block that can be used is 16 bytes; the block size is 20 bytes, including 4 bytes for the size
Only 16 bytes
44
Determine the size
Note that it uses [-1] to point to the location before the pointer (the location that contains the block size).
As the size is a multiple of 4, it clears the lowest two bits (3 = 0000 0000 0000 0011 in binary, ~3 = 1111 1111 1111 1100).
Free means the block can be used by the user (low-order bit is 1).
size = ((int *) ptr)[-1]; // read integer before the memory block
correct_size = size & ~3; // clear the lower 2 bits
free = size & 1; // get low-order bit
45
Splitting a Free Block
The heap is normally initialized to look like one giant free block. (40 bytes)
When allocations occur, it would be wasteful to return a large block of free memory when a small one would do just as well. (If I need 8 bytes, there is no point in returning 40 bytes)
Therefore, the memory allocator will typically split a block if the block size is larger than the requested size. (12 allocated and 28 free)
46
Common bug in scanf
Note that you should supply the address rather than the variable
Use &i instead of i
A double is read with %lf, not %g
It is important in your exam.
int i;
double d;
scanf("%d %lf", i, d); // wrong!!! passes the values, not the addresses
// here is the correct call:
scanf("%d %lf", &i, &d);
47
Overwriting Memory
Here, i is incremented from 0 to array_size, not array_size – 1, so the last pass writes beyond the block.
The solution is to use i < array_size, not i <= array_size.
#define array_size 100
int **a = (int **) malloc(sizeof(int *) * array_size);
for (int i = 0; i <= array_size; i++) // bug: should be i < array_size
    a[i] = NULL;
48
Memory bug
Here, the memory allocated is 100 bytes, not 400 bytes, although a[] is used as an array of 100 integers
The solution is:
int *a = (int *) malloc(array_size * sizeof(int));
#define array_size 100
int *a = (int *) malloc(array_size); // bug: 100 bytes, not 100 integers
a[99] = 0; // this overwrites memory beyond the block
49
String must be terminated by 0x00
A string must be terminated by 0x00, so the block needs one extra byte.
The solution is:
char *new_s = (char *) malloc(len + 1);
char *heapify_string(char *s)
{
    int len = strlen(s);
    char *new_s = (char *) malloc(len); // bug: no room for the terminating 0x00
    strcpy(new_s, s);
    return new_s;
}
By 0x00
50
Memory leaks
The memory that is no longer used is not returned to the memory pool.
The result is that the system will run out of memory.
The failure to deallocate (free) a block of memory when it is no longer needed is often called a memory leak.
The memory block is not returned to the pool
51
Lecture 7
52
Procedure
A slow but correct program
Modify the program to make it faster
53
What to Measure (Wall clock)
An alternative is to measure real time or "wall clock time". This is the time an ordinary clock on the wall or a wrist watch shows.
The difference between CPU time and wall time can give some indication of the time spent waiting for I/O.
Wall time, CPU time, I/O time
54
Principles - Performance
The 80/20 Rule – It means 80% of the CPU time is spent in 20% of the program.
In this case, you can have better performance by looking at this 20%.
Amdahl's Law – for parallel processing, the performance is limited by sequential part of the program.
55
Example of 80/20: 10% on one means 2% as a whole
A program consists of 5 modules
Before: 20 ms, 20 ms, 20 ms, 20 ms, 20 ms (total 100 ms)
After: 20 ms, 18 ms, 20 ms, 20 ms, 20 ms (total 98 ms)
56
Example of 80/20: 10% on one means about 5% as a whole
A program consists of 5 modules
Before: 10 ms, 50 ms, 10 ms, 10 ms, 10 ms (total 90 ms)
After: 10 ms, 45 ms, 10 ms, 10 ms, 10 ms (total 85 ms)
Conclusion: focus on the module with more CPU time
57
Lecture 8
58
Coding for Speed (mainly from this web site: http://www.abarnett.demon.co.uk/tutorial.html)
Array Indices, Aliases, Registers, Integers, Loop Jamming, Dynamic Loop Unrolling, Faster for() loops, Switch, Pointers, Early loop breaking, Misc, Using array indices
There are many ways to speed up the operation.
59
Aliases (1)
void func1( int *data )
{
    int i;
    for(i=0; i<10; i++) {
        somefunc2( *data, i);    // *data is read from memory on every pass
    }
}
Not very good
60
Aliases – better change to this
void func1( int *data )
{
    int i;
    int localdata;
    localdata = *data;           // read once into a local variable
    for(i=0; i<10; i++) {
        somefunc2( localdata, i);
    }
}
Better way
61
Loop Jamming
Never use two loops where one will suffice:
for(i=0; i<100; i++) {
    stuff();
}
for(i=0; i<100; i++) {
    morestuff();
}
Better combine them
62
Early loop breaking
This loop searches a list of 10000 numbers to see if there is a -99 in it.
found = FALSE;
for(i=0; i<10000; i++) {
    if( list[i] == -99 ) {
        found = TRUE;
    }
}
if( found ) printf("Yes, there is a -99. Hooray!\n");
This works well but searches the whole list.
63
Early loop breaking
A better way is to abort the search when it is found.
found = FALSE;
for(i=0; i<10000; i++) {
    if( list[i] == -99 ) {
        found = TRUE;
        break;
    }
}
if( found ) printf("Yes, there is a -99. Hooray!\n");
64
Lecture 9
65
Memory and CPU
Program here
Cache and register here
66
Memory hierarchies
Within CPU
67
Common memory technologies
Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM)
Magnetic disks, magnetic tapes, optical disks
68
Speed
69
Size and Cost
70
Principle of Locality
References to a single address occur close together in time, like using the same variable i again and again (this is called temporal locality).
References to addresses that are near to each other occur close together in time, like using i and then j later when they are declared together as int i, j; (this is called spatial locality).
71
Principle of locality
The principle of locality of reference is not an assurance, but rather a conjecture. (means GUESS)
Empirically, however, there is little doubt that programs behave according to this principle.
Think about it: if you need to use the same variable i later, it is better to keep it in the cache and not release it to the memory.
72
Graph showing CPU, DRAM & SRAM
73
Four-level hierarchy
74
Lecture 10
75
Example
/* Assumes n is a power of two */
void merge_sort (int * data, int n)
{
    int half = n >> 1;
    if (n == 1) return;
    merge_sort(data, half);           // sort the first half
    merge_sort(data + half, half);    // sort the second half
    merge(data, data + half, half);   // merge the two sorted halves
}
// no need to memorise
76
Graph of Merge Sort
The access times in nanoseconds (ns) for the L1 cache (T1), L2 cache (T2), L3 cache (T3), and main memory (Tm).
77
Looking at the Caches
We can deduce many things about the cache design of a particular computer by carefully examining its memory performance.
We can design a benchmark program whose locality we control.
int data[MAXSIZE];
for (r = 0; r < repeat; r++) {      // separate counter for the outer loop
    for (i = 0; i < N; i++) {
        dummy = data[i];
    }
}
78
Control the spatial locality
Here, stride controls the amount of spatial locality
int data[MAXSIZE];
for (r = 0; r < repeat; r++) {
    for (i = 0; i < N; i += stride) {
        dummy = data[i];
    }
}
79
Graph showing the effect
80
Example of a matrix
int data[M][N];
for (i = 0 ; i < N; i++) {
for (j = 0; j < M; j++) {
sum += data[j][i];
}
}
81
Changing the order of the iterations is not always better. Below is an example.
int original[M][N];
int transposed[N][M];
for (i = 0; i < M; i++) {
    for (j = 0; j < N; j++) {
        transposed[j][i] = original[i][j];
    }
}
82
Insufficient Temporal Locality
int original[M][N];
int transposed[N][M];
for (k = 0; k < M / m; k++) {
    for (l = 0; l < N / n; l++) {
        for (i = k*m; i < (k+1)*m; i++) {
            for (j = l*n; j < (l+1)*n; j++) {
                transposed[j][i] = original[i][j];
            }
        }
    }
}
83
Lecture 11
84
Example of a matrix
int data[M][N];
for (i = 0 ; i < N; i++) {
for (j = 0; j < M; j++) {
sum += data[j][i];
}
}
This is an M x N matrix
85
Row-major and Column-major
Row major – sequence of accessing data
Column major
86
Accessing a column-major
87
Accessing row data
It will be faster, as it accesses [0,0], [0,1], [0,2] and so on. After [0,0] is read, the elements up to [1,3] are loaded into the cache line, as the data is already stored in memory in this sequence
Row major is faster than column major
88
Segment address translation
Disk, Memory
89
Paging
The allocation of memory into chunks of varying size causes external fragmentation.
To solve this problem we can change the nature of the address translation so that, instead of mapping virtual to physical addresses in big chunks of varying size, it maps them in small chunks of constant size, called pages.
90
Paging
91
Impact of VM on Performance
int data[M][N];
for (i = 0; i < N; i++) {
    for (j = 0; j < M; j++) {
        sum += data[j][i];
    }
}
// column major – more page faults
92
Impact of VM on Performance
int data[M][N];
for (j = 0; j < M; j++) {
    for (i = 0; i < N; i++) {
        sum += data[j][i];
    }
}
// row major – fewer page faults
93
Example of Context Switching
(Figure: the CPU switching between processes over time)
94
Process state
Here, there are three states for one process. Running means it is using the CPU, ready means it is ready to use the CPU, while suspended means it is waiting for I/O.
95
Non-Preemptive process
A process must finish before the CPU can switch to another, for example when you have three processes, P1, P2 and P3
96
Preemptive process
The CPU can switch to another process without finishing the current one
97
Lecture 12
Network Programming
98
Review
Client Server Programming Model
Networks
Global IP Internet
Socket Interface
Web Servers
99
Client Server Programming Model
1. When a client needs service, it initiates a transaction.
2. The server receives the request, interprets it and manipulates its resource.
3. The server sends a response to the client and waits for the next request.
4. The client receives the response and manipulates it.
Client process, server process, resource
100
Hardware and Software Organisations
Client
TCP/IP
Network Adaptor
Client
TCP/IP
Network Adaptor
101
Internet Domain Names
Unnamed root
mil edu gov com
cityu cuhk
DCO
/* DNS entry structure */
struct hostent {
    char *h_name;        /* official domain name of host */
    char **h_aliases;    /* null-terminated array of domain names */
    int h_addrtype;      /* host address type (AF_INET) */
    int h_length;        /* length of an address in bytes */
    char **h_addr_list;  /* null-terminated array of in_addr structs */
};
102
Internet Connection
Internet clients and servers communicate by sending and receiving streams of bytes over connections.
A connection is point-to-point.
A connection is full duplex in the sense that data can flow in both directions.
A socket is an end point of a connection.
103
Socket Connection
Client Server
Each socket has a corresponding socket address that consists of an IP address and a 16-bit integer port. It is denoted by address:port (such as 121.2.3.4:12345)
104
Socket Interface
Client: socket, connect, Rio_writen, Rio_readlineb, close
Server: socket, bind, listen, accept, Rio_readlineb, Rio_writen, Rio_readlineb, close
105
Socket Function
/* listen function */
#include <sys/socket.h>
int listen (int sockfd, int backlog); /* returns 0 if OK, -1 on error */
/* accept function */
#include <sys/socket.h>
int accept (int listenfd, struct sockaddr *addr, int *addrlen); /* returns a non-negative connected descriptor if OK, -1 on error */
106
Role of the Listening and Connected Descriptors
Client Server
107
Last Page