Programming Multi-Core Processors based
Embedded Systems
A Hands-On Experience on Cavium Octeon based Platforms
Lab Exercises
Dr. Abdul Waheed, Copyright © 2009
Lab # 1: Parallel Programming and
Performance measurement using MPAC
Lab Goals
Objective
  Performance measurement using MPAC benchmarks
  Learning parallel programming using MPAC
MPAC fork and join infrastructure
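The slide names the fork and join pattern but does not show MPAC's own API, so the following is only a minimal sketch of the pattern using plain POSIX threads. The names `worker_arg_t`, `sum_worker`, and `fork_join` are hypothetical and are not MPAC functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread argument; not part of the MPAC API. */
typedef struct {
    int id;        /* worker index, 0..nthreads-1 */
    long result;   /* value produced by the worker */
} worker_arg_t;

/* Example worker: computes the sum 1 + 2 + ... + (id + 1). */
void *sum_worker(void *p)
{
    worker_arg_t *arg = p;
    arg->result = 0;
    for (int i = 1; i <= arg->id + 1; i++)
        arg->result += i;
    return NULL;
}

/* Fork nthreads workers, join them all, and return the sum of their
 * results. Returns -1 if allocation or thread creation fails. */
long fork_join(int nthreads, void *(*fn)(void *))
{
    assert(nthreads > 0);
    pthread_t *tid = malloc((size_t)nthreads * sizeof *tid);
    worker_arg_t *args = malloc((size_t)nthreads * sizeof *args);
    long total = 0;
    int started = 0;

    if (tid && args) {
        for (; started < nthreads; started++) {          /* fork phase */
            args[started].id = started;
            if (pthread_create(&tid[started], NULL, fn, &args[started]) != 0)
                break;
        }
        for (int i = 0; i < started; i++) {              /* join phase */
            pthread_join(tid[i], NULL);
            total += args[i].result;
        }
    }
    free(tid);
    free(args);
    return started == nthreads ? total : -1;
}
```

A caller would invoke `fork_join(4, sum_worker)` to run four workers and collect their combined result after the join.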
Lab # 2: Parallel Sort using MPAC
Lab Goals
Parallel Sorting
  This lab implements two parallel sorting algorithms
    Quick Sort
    Bucket Sort
Objective
  Partitioning of the data array
  Worker threads sorting the partitioned array
  Merging the partitioned arrays
  Performance measurements
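Parallel Quick Sort
Input array:
  31 23 14 26 8 36 4 21 4 7 1 43 32 12 21 7
(1) Partition the array among the threads:
  31 23 14 26 | 8 36 4 21 | 4 7 1 43 | 32 12 21 7
(2) Thread function sorts each partition:
  14 23 26 31 | 4 8 21 36 | 1 4 7 43 | 7 12 21 32
(3) Sorted partitions placed back in the array:
  14 23 26 31 4 8 21 36 1 4 7 43 7 12 21 32
(4) Merge into the final sorted array:
  1 4 4 7 7 8 12 14 21 21 23 26 31 32 36 43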
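The partition, sort, and merge steps of the parallel quick sort can be sketched with POSIX threads as follows. This is an illustration under stated assumptions, not the lab's MPAC code: `part_t`, `sort_part`, and `parallel_quick_sort` are hypothetical names, the per-partition sort uses the C library's `qsort()` (whose underlying algorithm is implementation-defined), and the merge is a simple k-way merge.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int *base; size_t len; } part_t;   /* one partition */

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Thread function: sort one partition in place. */
static void *sort_part(void *p)
{
    part_t *part = p;
    qsort(part->base, part->len, sizeof(int), cmp_int);
    return NULL;
}

/* Sort data[0..n) with one thread per partition. Returns 0 on success. */
int parallel_quick_sort(int *data, size_t n, int nthreads)
{
    assert(data != NULL && nthreads > 0);
    pthread_t *tid = malloc((size_t)nthreads * sizeof *tid);
    part_t *part = malloc((size_t)nthreads * sizeof *part);
    size_t *pos = calloc((size_t)nthreads, sizeof *pos);
    int *out = malloc((n ? n : 1) * sizeof *out);
    int started = 0, rc = -1;

    if (tid && part && pos && out) {
        /* Partition into nearly equal contiguous chunks. */
        size_t chunk = n / (size_t)nthreads, extra = n % (size_t)nthreads;
        size_t off = 0;
        for (int t = 0; t < nthreads; t++) {
            part[t].base = data + off;
            part[t].len = chunk + ((size_t)t < extra ? 1 : 0);
            off += part[t].len;
        }
        /* Each worker thread sorts its own partition. */
        for (; started < nthreads; started++)
            if (pthread_create(&tid[started], NULL, sort_part,
                               &part[started]) != 0)
                break;
        for (int t = 0; t < started; t++)
            pthread_join(tid[t], NULL);

        if (started == nthreads) {
            /* k-way merge: repeatedly take the smallest partition head. */
            for (size_t i = 0; i < n; i++) {
                int best = -1;
                for (int t = 0; t < nthreads; t++)
                    if (pos[t] < part[t].len && (best < 0 ||
                        part[t].base[pos[t]] < part[best].base[pos[best]]))
                        best = t;
                out[i] = part[best].base[pos[best]++];
            }
            memcpy(data, out, n * sizeof *data);
            rc = 0;
        }
    }
    free(tid); free(part); free(pos); free(out);
    return rc;
}
```

Calling `parallel_quick_sort(a, 16, 4)` on the 16-element array from the slide splits it into four partitions of four elements each, matching the figure.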
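Parallel Bucket Sort
Input array:
  31 23 14 26 8 36 4 21 4 7 1 43 32 12 21 7
(1) Define the bucket ranges:
  1 - 11 | 12 - 22 | 23 - 33 | 34 - 44
(2) Distribute the elements into buckets:
  8 4 4 7 1 7 | 14 21 12 21 | 31 23 26 32 | 36 43
(3) Thread function sorts each bucket:
  1 4 4 7 7 8 | 12 14 21 21 | 23 26 31 32 | 36 43
(4) Concatenate the buckets into the final sorted array:
  1 4 4 7 7 8 12 14 21 21 23 26 31 32 36 43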
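The bucket sort steps can be sketched in the same way. The code below is an assumption-laden illustration rather than the lab's implementation: `bucket_t`, `sort_bucket`, and `parallel_bucket_sort` are hypothetical names, the caller must supply the value range `[lo, hi]`, and buckets are laid out back to back in one temporary array so that no separate merge is needed once every bucket is sorted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int *vals; size_t len; } bucket_t;

static int by_value(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Thread function: sort one bucket in place. */
static void *sort_bucket(void *p)
{
    bucket_t *b = p;
    qsort(b->vals, b->len, sizeof(int), by_value);
    return NULL;
}

/* Sort data[0..n), whose values all lie in [lo, hi], using nbuckets
 * equal-width value ranges and one thread per bucket. Returns 0 on success. */
int parallel_bucket_sort(int *data, size_t n, int lo, int hi, int nbuckets)
{
    assert(data != NULL && nbuckets > 0 && lo <= hi);
    long width = ((long)hi - lo + nbuckets) / nbuckets;  /* ceil(range / nbuckets) */
    pthread_t *tid = malloc((size_t)nbuckets * sizeof *tid);
    bucket_t *b = calloc((size_t)nbuckets, sizeof *b);
    int *tmp = malloc((n ? n : 1) * sizeof *tmp);
    int started = 0, rc = -1;

    if (tid && b && tmp) {
        size_t off = 0;
        /* Pass 1: count how many elements fall into each bucket. */
        for (size_t i = 0; i < n; i++) {
            assert(data[i] >= lo && data[i] <= hi);
            b[((long)data[i] - lo) / width].len++;
        }
        /* Lay the buckets out back to back inside tmp. */
        for (int t = 0; t < nbuckets; t++) {
            b[t].vals = tmp + off;
            off += b[t].len;
            b[t].len = 0;                 /* reused as the fill cursor */
        }
        /* Pass 2: scatter each element into its bucket. */
        for (size_t i = 0; i < n; i++) {
            bucket_t *k = &b[((long)data[i] - lo) / width];
            k->vals[k->len++] = data[i];
        }
        /* One worker thread sorts each bucket. */
        for (; started < nbuckets; started++)
            if (pthread_create(&tid[started], NULL, sort_bucket,
                               &b[started]) != 0)
                break;
        for (int t = 0; t < started; t++)
            pthread_join(tid[t], NULL);
        if (started == nbuckets) {
            /* Buckets are contiguous and ordered, so tmp is now sorted. */
            memcpy(data, tmp, n * sizeof *data);
            rc = 0;
        }
    }
    free(tid); free(b); free(tmp);
    return rc;
}
```

With `lo = 1`, `hi = 44`, and four buckets, the computed width is 11, which reproduces the ranges 1 - 11, 12 - 22, 23 - 33, and 34 - 44 shown on the slide.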
Performance Measurement
[Figure: Quick Sort, elapsed time (seconds, axis 0 to 200) versus number of threads (1 to 8)]
[Figure: Bucket Sort, elapsed time (seconds, axis 0 to 35) versus number of threads (1 to 8)]
Observations
  Elapsed time decreases as the number of threads increases, indicating improved performance
  Bucket Sort is more efficient than Quick Sort
Lab # 3-5: Packet Sniffing Labs
An overview
Lab Goals
Objective
  Learning parallel programming using threads
  Utilizing many-core systems efficiently
  Performance measurement
Packet capture / filter / analyze: a case study
  We will use a series of labs to achieve our objectives.
Prerequisites
Sniffing
  Capturing network packets arriving at or departing from a network interface
Mechanism
  We use raw sockets as follows:
    rawSock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
  The resulting socket receives every packet going out or coming in on an Ethernet interface
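A minimal sketch of this mechanism on Linux is given below. Only the `socket()` call comes from the slide; the helper names `open_sniffer`, `sniff_frame`, and `frame_ethertype` are hypothetical. Opening a packet socket normally requires root or the `CAP_NET_RAW` capability, so `open_sniffer` can return -1 for an unprivileged user.

```c
#include <arpa/inet.h>        /* htons, ntohs */
#include <assert.h>
#include <linux/if_ether.h>   /* ETH_P_ALL, ETH_HLEN, struct ethhdr */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>           /* close, for releasing the socket */

/* Open a raw packet socket that sees every Ethernet frame.
 * Returns the descriptor, or -1 on error (for example, no privilege). */
int open_sniffer(void)
{
    return socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
}

/* Read one frame into buf; returns its length, or -1 on error. */
ssize_t sniff_frame(int sock, unsigned char *buf, size_t cap)
{
    assert(buf != NULL);
    return recvfrom(sock, buf, cap, 0, NULL, NULL);
}

/* Extract the EtherType (host byte order) from a captured frame.
 * Returns 0 if the frame is shorter than an Ethernet header. */
uint16_t frame_ethertype(const unsigned char *frame, size_t len)
{
    struct ethhdr eh;
    if (len < ETH_HLEN)
        return 0;
    memcpy(&eh, frame, sizeof eh);
    return ntohs(eh.h_proto);
}
```

A capture loop would call `sniff_frame` repeatedly on the descriptor returned by `open_sniffer` and close the descriptor with `close()` when done.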
Prerequisites
Testing
  You can use the loopback device as a network interface
  Use Netperf or MPAC for traffic generation on the network interface
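To restrict a packet socket to a single interface such as the loopback device, it can be bound with a `sockaddr_ll` address. The sketch below assumes Linux and assumes the loopback interface is named `lo`; `make_bind_addr` and `bind_sniffer` are hypothetical helper names.

```c
#include <arpa/inet.h>         /* htons */
#include <assert.h>
#include <linux/if_ether.h>    /* ETH_P_ALL */
#include <linux/if_packet.h>   /* struct sockaddr_ll */
#include <net/if.h>            /* if_nametoindex */
#include <string.h>
#include <sys/socket.h>

/* Fill addr so that bind() limits a packet socket to the named interface.
 * Returns 0 on success, -1 if the interface does not exist. */
int make_bind_addr(const char *ifname, struct sockaddr_ll *addr)
{
    assert(ifname != NULL && addr != NULL);
    unsigned int idx = if_nametoindex(ifname);
    if (idx == 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sll_family = AF_PACKET;
    addr->sll_protocol = htons(ETH_P_ALL);
    addr->sll_ifindex = (int)idx;
    return 0;
}

/* Bind an already opened packet socket to one interface, e.g. "lo". */
int bind_sniffer(int sock, const char *ifname)
{
    struct sockaddr_ll addr;
    if (make_bind_addr(ifname, &addr) != 0)
        return -1;
    return bind(sock, (struct sockaddr *)&addr, sizeof addr);
}
```

After `bind_sniffer(sock, "lo")` succeeds, the socket delivers only loopback traffic, so locally generated test traffic can be captured without a second machine.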
Packet Capturing on Many Core
[Figure: cores 0 to 15, with the Sender, Receiver, and Packet Sniffer placed on dedicated cores through CPU affinity; figure labels: Core, CPU Affinity, Data, Dedicated Cores]
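Dedicating a core to a thread can be done on Linux with the scheduler affinity calls. The following is a sketch under that assumption; `pin_to_core` and `first_allowed_core` are hypothetical names, and `_GNU_SOURCE` must be defined before any header is included for the `CPU_*` macros to be visible.

```c
#define _GNU_SOURCE            /* for CPU_SET and sched_setaffinity */
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to a single core. Returns 0 on success. */
int pin_to_core(int core)
{
    cpu_set_t set;
    assert(core >= 0 && core < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return sched_setaffinity(0, sizeof set, &set);   /* 0 = calling thread */
}

/* Return the lowest-numbered core the caller may run on, or -1 on error. */
int first_allowed_core(void)
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    for (int c = 0; c < CPU_SETSIZE; c++)
        if (CPU_ISSET(c, &set))
            return c;
    return -1;
}
```

Each of the sender, receiver, and sniffer threads would call `pin_to_core` with its own core number at start-up so that they do not compete for the same core.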
Sniffing Labs Framework
Sniffing
  One thread, called the dispatcher, sniffs the packets from the interface and puts them in one of the workers’ queues
Filtering / Analysis
  Any kind of processing on a packet is the responsibility of the workers
  Each worker has its own queue
  The dispatcher assigns packets to worker queues
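The dispatcher and worker queues can be sketched with a mutex-protected ring buffer per worker. This is an illustration, not the lab framework's code: `pkt_queue_t`, `queue_init`, `queue_push`, `queue_pop`, `dispatch`, and the capacity `QCAP` are all hypothetical, and the round-robin assignment follows the policy that Lab 4 names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QCAP 1024                 /* hypothetical per-worker queue capacity */

typedef struct {
    const void *slot[QCAP];       /* pointers to captured packets */
    size_t head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
} pkt_queue_t;

void queue_init(pkt_queue_t *q)
{
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Dispatcher side: returns 0, or -1 if the queue is full (packet dropped). */
int queue_push(pkt_queue_t *q, const void *pkt)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < QCAP) {
        q->slot[q->tail] = pkt;
        q->tail = (q->tail + 1) % QCAP;
        q->count++;
        pthread_cond_signal(&q->nonempty);
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Worker side: blocks until a packet is available, then returns it. */
const void *queue_pop(pkt_queue_t *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    const void *pkt = q->slot[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return pkt;
}

/* Dispatcher: hand pkt to the next worker queue in round-robin order.
 * *next is the index of the queue to use and is advanced on every call. */
int dispatch(pkt_queue_t *queues, size_t nworkers, size_t *next,
             const void *pkt)
{
    assert(nworkers > 0 && *next < nworkers);
    size_t w = *next;
    *next = (w + 1) % nworkers;
    return queue_push(&queues[w], pkt);
}
```

Giving every worker its own queue means workers never contend with each other; each queue's lock is shared only between the dispatcher and one worker.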
Lab 3 – Packet Sniffing
Sniff a frame
  This lab captures Ethernet packets that are destined to or departing from a specific interface
Objective
  Can a dispatcher sniff at the line rate?
  Hands-on experience of plain sniffing
  Observing the base-case performance of the dispatcher-worker model
Lab 4 – Packet Filtering
Objective
  Use different packet header information to sniff specific types of packets
Mechanism
  The dispatcher sniffs frames and puts them in worker queues in round-robin fashion
  The user specifies the source IP, destination IP, source port, and destination port for filtering TCP packets
Lab 4 – Packet Filtering (continued)
Mechanism
  Each worker processes the packets residing in its queue
Observations
  Observe the throughput with an increasing number of threads
  Compare the throughput with the Lab 3 throughput
  Use core affinity and observe the throughput
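The header-based filtering a worker performs in this lab can be sketched as a match on the source IP, destination IP, source port, and destination port of a captured frame. The code assumes untagged Ethernet frames carrying IPv4 and ignores VLAN tags and IP fragmentation; `tcp_filter_t` and `tcp_filter_match` are hypothetical names, and treating a zero field as "match any" is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter: fields in host byte order, 0 means "match any". */
typedef struct {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
} tcp_filter_t;

/* Read big-endian (network order) integers from a byte buffer. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Return 1 if the Ethernet frame carries an IPv4/TCP segment matching f,
 * otherwise 0. */
int tcp_filter_match(const unsigned char *frame, size_t len,
                     const tcp_filter_t *f)
{
    const size_t eth = 14;                        /* Ethernet header length */
    assert(frame != NULL && f != NULL);
    if (len < eth + 20 || rd16(frame + 12) != 0x0800)   /* not IPv4 */
        return 0;
    const unsigned char *ip = frame + eth;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;      /* IPv4 header length */
    if ((ip[0] >> 4) != 4 || ihl < 20 || ip[9] != 6)    /* protocol 6 = TCP */
        return 0;
    if (len < eth + ihl + 20)                     /* need a full TCP header */
        return 0;
    const unsigned char *tcp = ip + ihl;
    return (!f->src_ip   || f->src_ip   == rd32(ip + 12)) &&
           (!f->dst_ip   || f->dst_ip   == rd32(ip + 16)) &&
           (!f->src_port || f->src_port == rd16(tcp)) &&
           (!f->dst_port || f->dst_port == rd16(tcp + 2));
}
```

A worker would call `tcp_filter_match` on every frame it pops from its queue and count or forward only the frames for which it returns 1.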
Lab 5 – Deep Packet Inspection
Objective
  A user-provided string is searched for in the TCP-based application payload
Mechanisms
  Same as Lab 4, except each worker now finds a string in the application payload
  The string to find is provided by the user
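The payload search can be sketched as two steps: locate the TCP payload inside the frame, then scan it for the user's string. The code below assumes untagged Ethernet frames carrying IPv4/TCP and uses a naive byte-by-byte search; `tcp_payload` and `payload_contains` are hypothetical names. It derives the payload length from the captured frame length, so any Ethernet padding on very short frames is included in the search.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Locate the TCP payload inside an Ethernet/IPv4/TCP frame.
 * Returns a pointer to it and stores its length in *plen,
 * or returns NULL if the frame is not a well-formed TCP segment. */
const unsigned char *tcp_payload(const unsigned char *frame, size_t len,
                                 size_t *plen)
{
    const size_t eth = 14;                        /* Ethernet header length */
    assert(frame != NULL && plen != NULL);
    if (len < eth + 20 || frame[12] != 0x08 || frame[13] != 0x00)
        return NULL;                              /* not IPv4 */
    const unsigned char *ip = frame + eth;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;      /* IPv4 header length */
    if ((ip[0] >> 4) != 4 || ihl < 20 || ip[9] != 6 || len < eth + ihl + 20)
        return NULL;                              /* not a full TCP header */
    size_t doff = (size_t)(ip[ihl + 12] >> 4) * 4; /* TCP header length */
    if (doff < 20 || len < eth + ihl + doff)
        return NULL;
    *plen = len - eth - ihl - doff;
    return ip + ihl + doff;
}

/* Return 1 if needle occurs in the frame's TCP payload, otherwise 0. */
int payload_contains(const unsigned char *frame, size_t len,
                     const char *needle)
{
    size_t plen = 0, nlen = strlen(needle);
    const unsigned char *p = tcp_payload(frame, len, &plen);
    if (p == NULL || nlen == 0 || nlen > plen)
        return 0;
    for (size_t i = 0; i + nlen <= plen; i++)
        if (memcmp(p + i, needle, nlen) == 0)
            return 1;
    return 0;
}
```

Because every payload byte may be examined, this step costs more per packet than the header filter of Lab 4, which is the kind of difference the throughput comparison between the labs is meant to expose.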
Lab 5 – Deep Packet Inspection (continued)
Observations
  Observe the throughput with an increasing number of threads
  Compare the throughput with the Lab 3 and Lab 4 throughput
  Use core affinity and observe the performance