cs 1699 special topic in cloud computing unix programming...

Dr. Taieb Znati

Computer Science Department

University of Pittsburgh

CS 1699 – Special Topic in Cloud Computing

Unix Programming Client-Server and Socket API

PROCESS MODEL Unix Programming

Process Structure

A process in Unix comprises:

An address space, usually protected and virtual, mapped into memory

The code for the running program

The data for the running program

An execution stack and stack pointer (SP)

The program counter (PC)

A set of processor registers – general purpose and status

A set of system resources: files, network connections, privileges, …

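Processes – Address Space

Figure: the virtual address space runs from 0x00000000 to 0xFFFFFFFF and holds, in order, the program code (text), the static data (bss), the heap (dynamically allocated) and the execution stack (dynamically allocated). The PC points into the program code; the SP points into the execution stack.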

Processes Representation

To users and to other processes, a process is identified by its unique Process ID (PID)

Traditionally PID <= 30,000 in Unix; the actual limit is system-dependent

In the OS, processes are represented by entries in a Process Table (PT)

PID is index to (or pointer to) a PT entry

PT entry = Process Control Block (PCB)

PCB is a large data structure that contains or points to all information about the process

In Linux it is defined in task_struct

Over 70 fields

Process Control Block (PCB)

PCB typically contains:

Execution state: PC, SP & processor registers, stored when process is made inactive

Memory management information

Privileges and owner information

Scheduling priority

Resource information

Accounting information

Process CPU State

The CPU state is defined by the registers' contents:

Process Status Word (PSW): execution mode, last operation outcome, interrupt level

Instruction Register (IR): current instruction being executed

Program counter (PC)

Stack pointer (SP)

General purpose registers

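Figure: the process control block (PCB), kept in kernel space, records state, memory, files, accounting, priority and user information, plus a storage area for the CPU registers. The process memory in user space consists of text, data, heap and stack. The CPU holds the PSW, IR, PC, SP and general purpose registers.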

Process Environment

External entities:

Terminal

Open files

Communication channels

Local connections

Remote connections to other machines

Process Snapshot

Figure: a snapshot of a process consists of its (virtual) memory (code, static data, dynamic data, free space, stack) and its CPU state (PSW, program counter, stack pointer).

DESCRIPTORS Unix Programming

Unix Descriptors

Each process has its own descriptor table

A descriptor table is a data structure which allows a process to access the rest of the system's objects, when it is allowed

Each descriptor is a handle allowing a process to reference the corresponding object: files, devices, … and sockets.

Standard Input, Output and Error Descriptors

Per-process descriptor table: by default, descriptors 0, 1 and 2 point to stdin, stdout and stderr

Figure: descriptor 0 is connected to the keyboard; descriptors 1 and 2 are connected to the console.

Descriptor Operators

Created by open(), dup(), dup2(), pipe(), socket()

Removed by close()

FILE SYSTEM AND I/O Unix Programming

Unix Files

A Unix file is a sequence of m bytes:

B0, B1, ..., Bk, ..., Bm-1

Named using a rooted hierarchy

All I/O devices are represented as files:

/dev/sda2 (/usr disk partition)

/dev/tty2 (terminal)

Even the kernel is represented as a file:

/dev/kmem (kernel memory image)

/proc (kernel data structures)

Unix I/O

The elegant mapping of files to devices allows the kernel to export a simple interface called Unix I/O.

Key Unix idea: All input and output is handled in a consistent and uniform way.

Low level Unix I/O API (system calls):

Opening and closing files: open() and close()

Changing the current file position (seek): lseek()

Reading and writing a file: read() and write()

PROCESS CREATION Unix Programming

The Fork System Call

The fork() system call creates a "clone" of the calling process.

Identical in every respect except:

the parent process is returned a non-zero value (namely, the process id of the child, or -1 if fork fails)

the child process is returned zero.

The process id returned to the parent can be used by the parent in a wait or kill system call.

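#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    int status;
    pid_t pid = fork();
    if (pid == 0) {                       /* child */
        int fd = open("temp", O_WRONLY|O_CREAT|O_TRUNC, S_IRWXU);
        dup2(fd, STDOUT_FILENO);          /* stdout now refers to "temp" */
        close(fd);                        /* the extra descriptor is no longer needed */
        execl("/bin/ls", "ls", (char *)NULL);
        perror("execl");                  /* reached only if execl fails */
        _exit(1);
    } else if (pid > 0) {                 /* parent */
        wait(&status);
    } else {
        perror("fork");
    }
    return 0;
}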

Create a subprocess from program "ls"; redirect standard output of "ls" into file named "temp"

Unix shell notation: $ ls > temp

Example: ls > temp

Creating processes in UNIX

To see how processes can be used in applications and how they are implemented, we study how processes are created and manipulated in UNIX.

An important source of information on UNIX is the "man" command.

UNIX supports multiprogramming, so there will be many processes in existence at any given time.

Processes are created in UNIX with the fork() system call.

When a process P creates a process Q, Q is called the child of P and P is called the parent of Q.

Process Hierarchies

A parent creates a child process; a child process can create its own processes

Forms a hierarchy; UNIX calls this a process group

Signals can be sent to all processes of a group

Windows has no concept of process hierarchy: all processes are created equal

Initialization

At the root of the family tree of processes in a UNIX system is the special process init:

When the computer is switched on, the first thing it does is to activate a program resident on the system board in a ROM (read-only memory) chip.

The OS is not available at this stage, so the computer must "pull itself up by its own bootstraps".

This procedure is thus often referred to as bootstrapping, also known as cold boot.

The init process is created as part of bootstrapping

Among other things, init spawns a child to listen to each terminal, so that a user may log on.

UNIX Process Control

UNIX provides a number of system calls for process control including:

fork - used to create a new process

exec - to change the program a process is executing

exit - used by a process to terminate itself normally

abort - used by a process to terminate itself abnormally

kill - used by one process to kill or signal another

wait - to wait for termination of a child process

sleep - suspend execution for a specified time interval

getpid - get process id

getppid - get parent process id

Example using fork

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t pid;
    printf("Just one process so far\n");
    pid = fork();
    if (pid == 0)          /* code for child */
        printf("I'm the child\n");
    else if (pid > 0)      /* code for parent */
        printf("My child pid is =%d\n", (int)pid);
    else                   /* error handling */
        printf("An error has occurred\n");
    return 0;
}

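Process Virtual Address Space

Figure: the virtual address space runs from 0x00000000 to 0xFFFFFFFF. User space holds the program code, static data, heap and execution stack; kernel space holds the kernel code and data. The PC points into the program code; the SP points into the execution stack.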

System Call fork() – Snapshot

Figure: after fork(), the parent and the child each have their own virtual memory (code, static data, dynamic data, free space, stack) and their own CPU state (PSW, program counter, stack pointer). The two snapshots differ in the value returned by fork(): the child ID in the parent, 0 in the child.

Sample Question

#include <stdio.h>
#include <unistd.h>

int main(void) {
    int x = 0;
    fork();
    x++;
    printf("The value of x is %d\n", x);
    return 0;
}

What would be the value of x?

Spawning Applications

fork() is typically used in conjunction with exec (or variants)

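pid_t pid;
int status;

if ( ( pid = fork() ) == 0 ) {
    /* child code: replace executable image */
    char *args[] = { "tetris", "-easy", NULL };   /* execv takes an argument array */
    execv( "/usr/games/tetris", args );
    _exit( 127 );                                 /* reached only if execv fails */
} else {
    /* parent code: wait for child to terminate */
    wait( &status );
}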

A simple shell

while (TRUE) {                             /* Repeat forever */
    type_prompt( );                        /* Display prompt */
    read_command(command, parameters);     /* Input from terminal */
    if (fork() != 0) {                     /* Fork off child process */
        /* Parent code */
        waitpid(-1, &status, 0);           /* Wait for child to exit */
    } else {
        /* Child code */
        execve(command, parameters, 0);    /* Load and execute command */
    }
}

exec System Call

exec is a family of routines: execl, execle, execlp, execv, execve, execvp, or exect

execle( program_name, arg0, arg1, ..., NULL, environment )

text and data segments of current process replaced with those of program_name

stack reinitialized with parameters

open file table of current process remains intact

the last argument can pass environment settings

as in example, program_name is actually path name of executable file containing program

execve( program_name, argv, envp ) takes the same information as two NULL-terminated arrays

Note: unlike a subroutine call, there is no return after a successful call. That is, the program calling exec is gone forever!

The Address Space of a Unix Process

text

data

bss

dynamic

stack

Creating a Process: Before fork()

Figure: a single process, the parent, is about to call fork().

Creating a Process: After fork()

Figure: in the parent process fork() returns p; in the child process (process id = p) fork() returns 0.

exec(): Loading a New Image

Figure: before, the process is about to call exec(prog, args). After, its address space holds prog's text, prog's data, prog's bss and a stack containing args.

Copyright © 2002 Thomas W. Doeppner. All rights reserved.

Parent-Child Synchronization

exit( status ) - executed by a child process when it wants to terminate. Makes status (an integer) available to parent.

wait( &status ) - suspends execution of process until some child process terminates

status indicates reason for termination

return value is process-id of terminated child

waitpid( pid, &status, options )

pid can specify a specific child

options can be to wait or to check and proceed

Process Termination

Besides being able to terminate itself with exit, a process can be killed by another process using kill:

kill( pid, sig ) - sends signal sig to process with process-id pid. One signal is SIGKILL (terminate the target process immediately).

When a process terminates, all the resources it owns are reclaimed by the system:

"process control block" reclaimed

its memory is deallocated

all open files closed and Open File Table reclaimed.

Note: a process can kill another process only if:

it belongs to the same user, or

it is a super user

How shell executes a command

When user types a command, the shell forks a clone of itself

Child process makes an exec call, which causes it to stop executing the shell and start executing the user command

Parent process, still running the shell, waits for the child to terminate

Figure: the parent shell calls fork and then wait; the child calls exec, runs the required job, and calls exit.

CLIENT-SERVER MODEL SOCKET API

Unix Programming

A Client-Server Transaction

Figure: 1. Client process sends request; 2. Server process handles request, using its resource; 3. Server sends response; 4. Client handles response.

Every network application is based on the client-server model:

A server process and one or more client processes

Server manages some resource.

Server provides service by manipulating resource for clients.

Clients and servers are processes running on hosts, the same or different hosts.

Network Applications

Access to Network via Program Interface

Sockets make network I/O look like files

Call system functions to control and communicate

Network code handles issues of routing, reliability, ordering, etc.

Figure: on the client computer and on the server computer, the application (client or server) uses a socket provided by the OS and its network APIs, on top of the network interface; the two computers are connected through the Internet.

The Internet Hourglass Architecture

Figure (Hourglass Model): applications (FTP, HTTP, TFTP, NV) run over TCP or UDP; TCP and UDP run over IP; IP runs over many networks (NET1, NET2, …, NETn), each with its own data link and physical layers.

Internet Protocol Encapsulation

Figure: the application layer produces the data packet. The transport layer adds a TCP header (20 bytes), the network layer adds an IP header (20 bytes), and the link layer adds a frame header (22 bytes) and a frame trailer (4 bytes). The figure labels the frame, as handed to the physical layer, with a size of 64 to 1500 bytes.

Internet Protocol (IP)

Datagram (packet) protocol

Best-effort service: loss, reordering, duplication, delay

Transport Protocols

Best-effort not sufficient!

Add services on top of IP

User Datagram Protocol (UDP): data checksum, best-effort

Transmission Control Protocol (TCP): data checksum, reliable byte-stream delivery, flow and congestion control

Clients

How does a client specify a server?

The IP address in the server socket address identifies the host (more precisely, an adapter on the host)

The (well-known) port in the server socket address identifies the service, and thus implicitly identifies the server process that performs that service.

Examples of well-known ports:

Port 7: Echo server

Port 23: Telnet server

Port 25: Mail server

Port 80: Web server

IP Address

32-bit identifier

Dotted-quad: 192.118.56.25

www.mkp.com -> 167.208.101.28

Identifies a host interface (not a host)

Figure: two interface addresses, 192.18.22.13 and 209.134.16.123.

Ports

Identifying the ultimate destination

IP addresses identify hosts

Host has many applications

Ports (16-bit identifier)

Host 192.18.22.13:

Port          80     25       23
Application   WWW    E-mail   Telnet

Internet Connections (TCP/IP)

Clients and servers communicate by sending streams of bytes over connections.

Connections are point-to-point, full-duplex (2-way communication), and reliable.

Figure: a client at host address 128.2.194.242 is connected to a server (port 80) at host address 208.216.181.15. The client socket address is 128.2.194.242:3479 and the server socket address is 208.216.181.15:80; the connection is identified by the socket pair (128.2.194.242:3479, 208.216.181.15:80).

Note: 3479 is an ephemeral port allocated by the kernel

Note: 80 is a well-known port associated with Web servers

Using Ports to Identify Services

Figure: the server host 128.2.194.242 runs a Web server (port 80) and an echo server (port 7). A client's service request for 128.2.194.242:80 is passed by the kernel to the Web server; a service request for 128.2.194.242:7 is passed to the echo server.

Servers

Servers are long-running processes (daemons).

Created at boot-time (typically) by the init process (process 1)

Run continuously until the machine is turned off.

Each server waits for requests to arrive on a well-known port associated with a particular service.

Port 7: echo server

Port 23: telnet server

Port 25: mail server

Port 80: HTTP server

See /etc/services for a list of service to port bindings.

Sockets Interface

Created in the early 80’s as part of the original Berkeley distribution of Unix that contained an early version of the Internet protocols.

Provides a user-level interface to the network.

Underlying basis for all Internet applications.

Based on client/server programming model.

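Overview of the Sockets Interface

Figure: the client calls socket() and then connect() (together: open_clientfd). The server calls socket(), bind() and listen() (together: open_listenfd) and then accept(). The client's connect() sends the connection request that the server's accept() is waiting for. During the client/server session the client calls write() and then read(), while the server calls read() and then write(). When the client calls close(), the server's read() sees EOF; the server then calls close() and awaits a connection request from the next client.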

What is a Socket?

To the kernel – an endpoint of communication.

To an application – a file descriptor that lets the application read/write from/to the network.

All Unix I/O devices, including networks, are modelled as files.

Clients and servers communicate with each other by reading from and writing to socket descriptors.

The main distinction between regular file I/O and socket I/O is how the application “opens” the socket descriptors.

TCP/IP Sockets

mySock = socket(family, type, protocol);

TCP/IP-specific sockets:

        Family    Type          Protocol
TCP     PF_INET   SOCK_STREAM   IPPROTO_TCP
UDP     PF_INET   SOCK_DGRAM    IPPROTO_UDP

Socket reference:

File (socket) descriptor in UNIX

Socket handle in WinSock

struct sockaddr
{
    unsigned short sa_family;   /* Address family (e.g., AF_INET) */
    char sa_data[14];           /* Protocol-specific address information */
};

struct sockaddr_in
{
    unsigned short sin_family;  /* Internet protocol (AF_INET) */
    unsigned short sin_port;    /* Port (16-bits) */
    struct in_addr sin_addr;    /* Internet address (32-bits) */
    char sin_zero[8];           /* Not used */
};

struct in_addr
{
    uint32_t s_addr;            /* Internet address (32-bits) */
};

Generic (sockaddr):          Family (2 bytes) | Blob (14 bytes)

IP specific (sockaddr_in):   Family (2 bytes) | Port (2 bytes) | Internet address (4 bytes) | Not used (8 bytes)

TCP Client/Server Interaction

Client

1. Create a TCP socket

2. Establish connection

3. Communicate

4. Close the connection

Server

1. Create a TCP socket

2. Assign a port to socket

3. Set socket to listen

4. Repeatedly:

a. Accept new connection

b. Communicate

c. Close the connection

Server starts by getting ready to receive client connections…


/* Create socket for incoming connections */
if ((servSock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0)
    DieWithError("socket() failed");


echoServAddr.sin_family = AF_INET;                 /* Internet address family */
echoServAddr.sin_addr.s_addr = htonl(INADDR_ANY);  /* Any incoming interface */
echoServAddr.sin_port = htons(echoServPort);       /* Local port */

if (bind(servSock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr)) < 0)
    DieWithError("bind() failed");


/* Mark the socket so it will listen for incoming connections */
if (listen(servSock, MAXPENDING) < 0)
    DieWithError("listen() failed");


for (;;) /* Run forever */
{
    clntLen = sizeof(echoClntAddr);

    if ((clntSock = accept(servSock, (struct sockaddr *) &echoClntAddr, &clntLen)) < 0)
        DieWithError("accept() failed");




Server is now blocked waiting for connection from a client

Later, a client decides to talk to the server…


/* Create a reliable, stream socket using TCP */
if ((sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0)
    DieWithError("socket() failed");


echoServAddr.sin_family = AF_INET;                   /* Internet address family */
echoServAddr.sin_addr.s_addr = inet_addr(servIP);    /* Server IP address */
echoServAddr.sin_port = htons(echoServPort);         /* Server port */

if (connect(sock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr)) < 0)
    DieWithError("connect() failed");


if ((clntSock = accept(servSock, (struct sockaddr *) &echoClntAddr, &clntLen)) < 0)
    DieWithError("accept() failed");


echoStringLen = strlen(echoString); /* Determine input length */

/* Send the string to the server */
if (send(sock, echoString, echoStringLen, 0) != echoStringLen)
    DieWithError("send() sent a different number of bytes than expected");


/* Receive message from client */
if ((recvMsgSize = recv(clntSocket, echoBuffer, RCVBUFSIZE, 0)) < 0)
    DieWithError("recv() failed");


close(sock);         /* client side */

close(clntSocket);   /* server side */

Closing a Connection

close() used to delimit communication

Analogous to EOF

Echo Client

send(string)
while (not received entire string)
    recv(buffer)
    print(buffer)
close(socket)

Echo Server

recv(buffer)
while (client has not closed connection)
    send(buffer)
    recv(buffer)
close(client socket)

ADVANCED MATERIAL

Socket Addresses

Look up destination host name

struct hostent *he = gethostbyname(argv[1]);
struct sockaddr_in their_addr;
their_addr.sin_addr = *((struct in_addr *)he->h_addr);

Address Access/Conversion Functions

All binary values are network byte ordered

struct hostent* gethostbyname (const char* hostname);

Translate English host name to IP address (uses DNS)

struct hostent* gethostbyaddr (const char* addr, size_t len, int family);

Translate IP address to English host name (not secure)

char* inet_ntoa (struct in_addr inaddr);

Translate IP address to ASCII dotted-decimal notation (e.g., “128.32.36.37”)

Structure: hostent

The hostent data structure (from /usr/include/netdb.h)

canonical domain name and aliases

list of addresses associated with machine

also address type and length information

struct hostent {
    char*  h_name;        /* official name of host */
    char** h_aliases;     /* NULL-terminated alias list */
    int    h_addrtype;    /* address type (AF_INET) */
    int    h_length;      /* length of addresses (4B) */
    char** h_addr_list;   /* NULL-terminated address list */
#define h_addr h_addr_list[0]   /* backward-compatibility */
};

Choose port

their_addr.sin_port = htons(atoi(argv[2]));

Select a destination port

Convert byte order

Byte Ordering

Big Endian vs. Little Endian

Little Endian (Intel, DEC): least significant byte of word is stored in the lowest memory address

Big Endian (Sun, SGI, HP): most significant byte of word is stored in the lowest memory address

Network Byte Order = Big Endian

Allows both sides to communicate

Must be used for some data (e.g., IP addresses)

Good form for all binary data

Byte Ordering Functions

16- and 32-bit conversion functions (for platform independence)

Examples:

int m, n;
short int s, t;

m = ntohl(n);   /* net-to-host long (32-bit) translation */
s = ntohs(t);   /* net-to-host short (16-bit) translation */
n = htonl(m);   /* host-to-net long (32-bit) translation */
t = htons(s);   /* host-to-net short (16-bit) translation */

Functions: sendto

ssize_t sendto (int sockfd, const void* buf, size_t nbytes, int flags, const struct sockaddr* destaddr, socklen_t addrlen);

Send a datagram to another UDP socket. Returns number of bytes written or -1. Also sets errno on failure.

sockfd: socket file descriptor (returned from socket)

buf: data buffer

nbytes: number of bytes to try to write

flags: see man page for details; typically use 0

destaddr: IP address and port number of destination socket

addrlen: length of address structure = sizeof (struct sockaddr_in)

Functions: recvfrom

ssize_t recvfrom (int sockfd, void* buf, size_t nbytes, int flags, struct sockaddr* srcaddr, socklen_t* addrlen);

Read a datagram from a UDP socket. Returns number of bytes read (0 is valid) or -1. Also sets errno on failure.

sockfd: socket file descriptor (returned from socket)

buf: data buffer

nbytes: number of bytes to try to read

flags: see man page for details; typically use 0

srcaddr: IP address and port number of sending socket (returned from call)

addrlen: length of address structure = pointer to socklen_t set to sizeof (struct sockaddr_in)

TCP and UDP Ports

Allocated and assigned by the Internet Assigned Numbers Authority

see RFC 1700 or ftp://ftp.isi.edu/in-notes/iana/assignments/port-numbers

1-512: standard services (see /etc/services); super-user only

513-1023: registered and controlled, also used for identity verification; super-user only

1024-49151: registered services/ephemeral ports

49152-65535: private/ephemeral ports

Conclusion

Process Management in Unix

Socket API

Socket Operations

TCP Client Server Model
