ipv4 and ipv6 interoperability

UNIT IV - ADVANCED SOCKETS

IPv4 and IPv6 Interoperability:

Introduction

IPv4 applications and IPv6 applications can communicate with each other. There are four combinations of clients and servers using either IPv4 or IPv6.

IPv4 Client, IPv6 Server

We have an IPv4 client and an IPv6 client on the left. The server on the right is written using IPv6 and it is running on a dual-stack host. The server has created an IPv6 listening TCP socket that is bound to the IPv6 wildcard address and TCP port 9999.

We assume the clients and server are on the same Ethernet. They could also be connected by routers, as long as all the routers support IPv4 and IPv6, but that adds nothing to this discussion.

We can summarize the steps that allow an IPv4 TCP client to communicate with an IPv6 server as follows:

1. The IPv6 server starts, creates an IPv6 listening socket, and we assume it binds the wildcard address to the socket.

2. The IPv4 client calls gethostbyname and finds an A record for the server. The server host will have both an A record and a AAAA record since it supports both protocols, but the IPv4 client asks for only an A record.

3. The client calls connect and the client's host sends an IPv4 SYN to the server.

4. The server host receives the IPv4 SYN directed to the IPv6 listening socket, sets a flag indicating that this connection is using IPv4-mapped IPv6 addresses, and responds with an IPv4 SYN/ACK. When the connection is established, the address returned to the server by accept is the IPv4-mapped IPv6 address.

5. When the server host sends to the IPv4-mapped IPv6 address, its IP stack generates IPv4 datagrams to the IPv4 address. Therefore, all communication between this client and server takes place using IPv4 datagrams.

6. Unless the server explicitly checks whether this IPv6 address is an IPv4-mapped IPv6 address (using the IN6_IS_ADDR_V4MAPPED macro described in Section 12.4), the server never knows that it is communicating with an IPv4 client. The dual-protocol stack handles this detail. Similarly, the IPv4 client has no idea that it is communicating with an IPv6 server.
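Step 6 can be illustrated with a small helper. This is only a sketch: the function name peer_ipv4_from_mapped is hypothetical and not part of the text. It applies IN6_IS_ADDR_V4MAPPED to an address such as the one accept returns, and copies out the embedded IPv4 address, which sits in the low-order 32 bits of the IPv4-mapped IPv6 address.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper (not from the text): if *a6 is an IPv4-mapped IPv6
 * address (::ffff:a.b.c.d), copy the embedded IPv4 address into *a4 and
 * return 1.  Otherwise leave *a4 untouched and return 0.
 */
int
peer_ipv4_from_mapped(const struct in6_addr *a6, struct in_addr *a4)
{
    assert(a6 != NULL && a4 != NULL);

    if (!IN6_IS_ADDR_V4MAPPED(a6))
        return 0;               /* a native IPv6 peer */

    /* The IPv4 address occupies bytes 12..15, already in network order. */
    memcpy(&a4->s_addr, &a6->s6_addr[12], sizeof(a4->s_addr));
    return 1;
}
```

A server would typically call this on the sin6_addr member of the sockaddr_in6 filled in by accept, for example to log the peer in dotted-decimal form.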

Processing of received IPv4 or IPv6 datagrams, depending on type of receiving socket.

IPv6 Client, IPv4 Server

First consider an IPv6 TCP client running on a dual-stack host.

1. An IPv4 server starts on an IPv4-only host and creates an IPv4 listening socket.

2. The IPv6 client starts and calls getaddrinfo asking for only IPv6 addresses (it requests the AF_INET6 address family and sets the AI_V4MAPPED flag in its hints structure). Since the IPv4-only server host has only A records, we see from Figure 11.8 that an IPv4-mapped IPv6 address is returned to the client.

3. The IPv6 client calls connect with the IPv4-mapped IPv6 address in the IPv6 socket address structure. The kernel detects the mapped address and automatically sends an IPv4 SYN to the server.

4. The server responds with an IPv4 SYN/ACK, and the connection is established using IPv4 datagrams.
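To make the address used in steps 2 and 3 concrete, the following sketch builds by hand the kind of IPv6 socket address structure that the client passes to connect. It is an illustration under stated assumptions: the helper name fill_mapped_sockaddr is hypothetical, and a real client would normally obtain this structure from getaddrinfo rather than construct it itself.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the text): fill *sa6 with the IPv4-mapped
 * IPv6 form (::ffff:a.b.c.d) of a dotted-decimal IPv4 address and a port.
 * Returns 1 on success, 0 if the string is not a valid IPv4 address.
 */
int
fill_mapped_sockaddr(const char *dotted, uint16_t port,
                     struct sockaddr_in6 *sa6)
{
    struct in_addr a4;

    assert(dotted != NULL && sa6 != NULL);
    if (inet_pton(AF_INET, dotted, &a4) != 1)
        return 0;

    memset(sa6, 0, sizeof(*sa6));           /* bytes 0..9 of the address stay 0 */
    sa6->sin6_family = AF_INET6;
    sa6->sin6_port = htons(port);
    sa6->sin6_addr.s6_addr[10] = 0xff;      /* the ffff marker */
    sa6->sin6_addr.s6_addr[11] = 0xff;
    memcpy(&sa6->sin6_addr.s6_addr[12], &a4.s_addr, sizeof(a4.s_addr));
    return 1;
}
```

On a dual-stack host, calling connect on an AF_INET6 socket with a structure like this is what causes the kernel to send an IPv4 SYN.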

Summary of Interoperability

IPv6 Address-Testing Macros

#include <netinet/in.h>

int IN6_IS_ADDR_UNSPECIFIED(const struct in6_addr *aptr);

int IN6_IS_ADDR_LOOPBACK(const struct in6_addr *aptr);

int IN6_IS_ADDR_MULTICAST(const struct in6_addr *aptr);

int IN6_IS_ADDR_LINKLOCAL(const struct in6_addr *aptr);

int IN6_IS_ADDR_SITELOCAL(const struct in6_addr *aptr);

int IN6_IS_ADDR_V4MAPPED(const struct in6_addr *aptr);

int IN6_IS_ADDR_V4COMPAT(const struct in6_addr *aptr);

int IN6_IS_ADDR_MC_NODELOCAL(const struct in6_addr *aptr);

int IN6_IS_ADDR_MC_LINKLOCAL(const struct in6_addr *aptr);

int IN6_IS_ADDR_MC_SITELOCAL(const struct in6_addr *aptr);

int IN6_IS_ADDR_MC_ORGLOCAL(const struct in6_addr *aptr);

int IN6_IS_ADDR_MC_GLOBAL(const struct in6_addr *aptr);

All return: nonzero if IPv6 address is of specified type, zero otherwise

Basic Thread Functions: Creation and Termination

When a program is started by exec, a single thread is created, called the initial thread or main thread. Additional threads are created by pthread_create.

#include <pthread.h>

int pthread_create(pthread_t *tid, const pthread_attr_t *attr, void *(*func) (void *), void *arg);

Returns: 0 if OK, positive Exxx value on error

Each thread within a process is identified by a thread ID, whose datatype is pthread_t (often an unsigned int). On successful creation of a new thread, its ID is returned through the pointer tid.

pthread_join Function


int pthread_join (pthread_t tid, void ** status);

Returns: 0 if OK, positive Exxx value on error

pthread_self Function


pthread_t pthread_self (void);

Returns: thread ID of calling thread

pthread_detach Function


int pthread_detach (pthread_t tid);

Returns: 0 if OK, positive Exxx value on error

pthread_exit Function


void pthread_exit (void *status);

Does not return to caller

If the thread is not detached, its thread ID and exit status are retained for a later pthread_join by some other thread in the calling process.

The pointer status must not point to an object that is local to the calling thread since that object disappears when the thread terminates.

There are two other ways for a thread to terminate:

The function that started the thread (the third argument to pthread_create) can return. Since this function must be declared as returning a void pointer, that return value is the exit status of the thread.

If the main function of the process returns or if any thread calls exit, the process terminates, including any threads.
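The creation and termination rules above can be sketched with a minimal example. The names square_worker and square_in_thread are hypothetical. The thread function returns its exit status in heap memory, not in a local variable, so the object is still valid when pthread_join fetches it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/*
 * Thread start function: squares the int it is handed and returns the
 * result in malloc'ed memory.  Returning from this function has the same
 * effect as calling pthread_exit with the same pointer.
 */
static void *
square_worker(void *arg)
{
    const int *in = arg;
    int *out = malloc(sizeof(*out));

    if (out != NULL)
        *out = (*in) * (*in);
    return out;                 /* becomes the thread's exit status */
}

/*
 * Hypothetical helper: compute n*n in a new thread and collect the exit
 * status with pthread_join.  Returns -1 if the thread cannot be created
 * or joined, or if the worker could not allocate its result.
 */
int
square_in_thread(int n)
{
    pthread_t tid;
    void *status = NULL;
    int result = -1;

    assert(n > -46341 && n < 46341);    /* keep n*n within a 32-bit int */

    /* &n stays valid because this function waits in pthread_join. */
    if (pthread_create(&tid, NULL, square_worker, &n) != 0)
        return -1;
    if (pthread_join(tid, &status) != 0)
        return -1;

    if (status != NULL) {
        result = *(int *) status;
        free(status);           /* the joiner owns the status object */
    }
    return result;
}
```

Note that the pthread functions return the error value directly instead of setting errno, which is why the sketch compares the return values against 0.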

TCP Echo Server Using Threads

Create thread Thread function

TCP echo server using threads

threads/tcpserv01.c

#include "unpthread.h"

static void *doit(void *);      /* each thread executes this function */

int
main(int argc, char **argv)
{
    int listenfd, connfd;
    pthread_t tid;
    socklen_t addrlen, len;
    struct sockaddr *cliaddr;

    if (argc == 2)
        listenfd = Tcp_listen(NULL, argv[1], &addrlen);
    else if (argc == 3)
        listenfd = Tcp_listen(argv[1], argv[2], &addrlen);
    else
        err_quit("usage: tcpserv01 [ <host> ] <service or port>");

    cliaddr = Malloc(addrlen);
    for (; ; ) {
        len = addrlen;
        connfd = Accept(listenfd, cliaddr, &len);
        Pthread_create(&tid, NULL, &doit, (void *) connfd);
    }
}

static void *
doit(void *arg)
{
    Pthread_detach(pthread_self());
    str_echo((int) arg);        /* same function as before */
    Close((int) arg);           /* done with connected socket */
    return (NULL);
}

Passing Arguments to New Threads

int
main(int argc, char **argv)
{
    int listenfd, connfd;
    ...

    for ( ; ; ) {
        len = addrlen;
        connfd = Accept(listenfd, cliaddr, &len);

        Pthread_create(&tid, NULL, &doit, &connfd);
    }
}

static void *
doit(void *arg)
{
    int connfd;

    connfd = * ((int *) arg);
    pthread_detach(pthread_self());
    str_echo(connfd);           /* same function as before */
    Close(connfd);              /* done with connected socket */
    return (NULL);
}

From an ANSI C perspective this is acceptable: We are guaranteed that we can cast the integer pointer to be a void * and then cast this pointer back to an integer pointer. The problem is what this pointer points to.

There is one integer variable, connfd in the main thread, and each call to accept overwrites this variable with a new value (the connected descriptor). The following scenario can occur:

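accept returns, its return value is stored in connfd (say the new descriptor is 5), and the main thread calls pthread_create. The pointer to connfd (not its contents) is the final argument to pthread_create.

A thread is created and the doit function is scheduled to start executing.

Another connection is ready and the main thread runs again (before the newly created thread). accept returns, its return value is stored in connfd (say the new descriptor is now 6), and the main thread calls pthread_create.

Even though two threads are created, both will operate on the final value stored in connfd, which we assume is 6. The problem is that multiple threads are accessing a shared variable (the integer value in connfd) with no synchronization.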

TCP echo server using threads with more portable argument passing.

threads/tcpserv02.c


#include "unpthread.h"

static void *doit(void *);      /* each thread executes this function */

int
main(int argc, char **argv)
{
    int listenfd, *iptr;
    pthread_t tid;
    socklen_t addrlen, len;
    struct sockaddr *cliaddr;

    if (argc == 2)
        listenfd = Tcp_listen(NULL, argv[1], &addrlen);
    else if (argc == 3)
        listenfd = Tcp_listen(argv[1], argv[2], &addrlen);
    else
        err_quit("usage: tcpserv01 [ <host> ] <service or port>");

    cliaddr = Malloc(addrlen);
    for ( ; ; ) {
        len = addrlen;
        iptr = Malloc(sizeof(int));     /* one descriptor slot per thread */
        *iptr = Accept(listenfd, cliaddr, &len);
        Pthread_create(&tid, NULL, &doit, iptr);
    }
}

static void *
doit(void *arg)
{
    int connfd;

    connfd = *((int *) arg);
    free(arg);                  /* the thread owns the slot and releases it */

    Pthread_detach(pthread_self());
    str_echo(connfd);           /* same function as before */
    Close(connfd);              /* done with connected socket */
    return (NULL);
}

Mutexes: Mutual Exclusion

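When a thread terminates, the main loop decrements both nconn and nlefttoread. We could have placed these two decrements in the function do_get_read, letting each thread decrement these two counters immediately before the thread terminates. But this would be a subtle, yet significant, concurrent programming error, because two threads could then modify the same counter at the same time. Consider the following sequence, assuming nconn-- compiles to three machine instructions (load, decrement, store):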

1. Thread A is running and it loads the value of nconn (3) into a register.

2. The system switches threads from A to B. A's registers are saved, and B's registers are restored.

3. Thread B executes the three instructions corresponding to the C expression nconn--, storing the new value of 2.

4. Sometime later, the system switches threads from B to A. A's registers are restored and A continues where it left off, at the second machine instruction in the three-instruction sequence. The value of the register is decremented from 3 to 2, and the value of 2 is stored in nconn.

The end result is that nconn is 2 when it should be 1. This is wrong.

Two threads that increment a global variable incorrectly.

threads/example01.c


#include "unpthread.h"

#define NLOOP 5000

int counter;                    /* incremented by threads */

void *doit(void *);

int
main(int argc, char **argv)
{
    pthread_t tidA, tidB;

    Pthread_create(&tidA, NULL, &doit, NULL);
    Pthread_create(&tidB, NULL, &doit, NULL);

    /* wait for both threads to terminate */
    Pthread_join(tidA, NULL);
    Pthread_join(tidB, NULL);

    exit(0);
}

void *
doit(void *vptr)
{
    int i, val;

    /*
     * Each thread fetches, prints, and increments the counter NLOOP times.
     * The value of the counter should increase monotonically.
     */

    for (i = 0; i < NLOOP; i++) {
        val = counter;          /* unprotected read of the shared variable */
        printf("%d: %d\n", pthread_self(), val + 1);
        counter = val + 1;      /* unprotected write: updates can be lost */
    }

    return (NULL);
}

Corrected version of Figure 26.17 using a mutex to protect the shared variable.

threads/example02.c


#include "unpthread.h"

#define NLOOP 5000

int counter;                    /* incremented by threads */
pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;

void *doit(void *);

int
main(int argc, char **argv)
{
    pthread_t tidA, tidB;

    Pthread_create(&tidA, NULL, &doit, NULL);
    Pthread_create(&tidB, NULL, &doit, NULL);

    /* wait for both threads to terminate */
    Pthread_join(tidA, NULL);
    Pthread_join(tidB, NULL);

    exit(0);
}

void *
doit(void *vptr)
{
    int i, val;

    /*
     * Each thread fetches, prints, and increments the counter NLOOP times.
     * The value of the counter should increase monotonically.
     */

    for (i = 0; i < NLOOP; i++) {
        Pthread_mutex_lock(&counter_mutex);

        val = counter;          /* fetch and store now happen as one unit */
        printf("%d: %d\n", pthread_self(), val + 1);
        counter = val + 1;

        Pthread_mutex_unlock(&counter_mutex);
    }

    return (NULL);
}

Condition Variables

A mutex is fine to prevent simultaneous access to a shared variable, but we need something else to let us go to sleep waiting for some condition to occur.

int ndone;                      /* number of terminated threads */
pthread_mutex_t ndone_mutex = PTHREAD_MUTEX_INITIALIZER;

An example is the easiest way to explain these functions. Returning to our Web client example, the counter ndone is now associated with both a condition variable and a mutex.

int ndone;
pthread_mutex_t ndone_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t ndone_cond = PTHREAD_COND_INITIALIZER;

A thread notifies the main loop that it is terminating by incrementing the counter while its mutex lock is held and by signaling the condition variable.

Pthread_mutex_lock(&ndone_mutex);
ndone++;
Pthread_cond_signal(&ndone_cond);
Pthread_mutex_unlock(&ndone_mutex);

The main loop then blocks in a call to pthread_cond_wait, waiting to be signaled by a terminating thread.

while (nlefttoread > 0) {
    while (nconn < maxnconn && nlefttoconn > 0) {
        /* find file to read */
        ...
    }

    /* Wait for thread to terminate */
    Pthread_mutex_lock(&ndone_mutex);
    while (ndone == 0)
        Pthread_cond_wait(&ndone_cond, &ndone_mutex);

    for (i = 0; i < nfiles; i++) {
        if (file[i].f_flags & F_DONE) {
            Pthread_join(file[i].f_tid, (void **) &fptr);

            /* update file[i] for terminated thread */
            ...
        }
    }
    Pthread_mutex_unlock(&ndone_mutex);
}


int pthread_cond_wait(pthread_cond_t *cptr, pthread_mutex_t *mptr);

int pthread_cond_signal(pthread_cond_t *cptr);

Both return: 0 if OK, positive Exxx value on error
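The signal and wait pattern described above can be reduced to a small self-contained sketch. It is not the Web client code: the names worker, wait_for_workers, and MAXWORKERS are hypothetical, and the workers do nothing except report their termination. The waiting side tests the condition in a loop, because pthread_cond_wait can return before the condition it is waiting for is true.

```c
#include <assert.h>
#include <pthread.h>

#define MAXWORKERS 16           /* hypothetical limit for this sketch */

static int ndone;               /* number of terminated workers */
static pthread_mutex_t ndone_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ndone_cond = PTHREAD_COND_INITIALIZER;

/* Each worker announces its termination while holding the mutex. */
static void *
worker(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&ndone_mutex);
    ndone++;
    pthread_cond_signal(&ndone_cond);
    pthread_mutex_unlock(&ndone_mutex);
    return NULL;
}

/*
 * Start nworkers threads, sleep on the condition variable until all the
 * threads that were started have reported in, then join them.
 * Returns the value of ndone observed by the waiter.
 */
int
wait_for_workers(int nworkers)
{
    pthread_t tid[MAXWORKERS];
    int i, started = 0, seen;

    assert(nworkers >= 0 && nworkers <= MAXWORKERS);

    pthread_mutex_lock(&ndone_mutex);
    ndone = 0;
    pthread_mutex_unlock(&ndone_mutex);

    for (i = 0; i < nworkers; i++)
        if (pthread_create(&tid[started], NULL, worker, NULL) == 0)
            started++;

    pthread_mutex_lock(&ndone_mutex);
    while (ndone < started)     /* re-test the condition after each wakeup */
        pthread_cond_wait(&ndone_cond, &ndone_mutex);
    seen = ndone;
    pthread_mutex_unlock(&ndone_mutex);

    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return seen;
}
```

pthread_cond_wait releases the mutex while the caller sleeps and reacquires it before returning, which is what lets the workers get in to increment ndone.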

Normally, pthread_cond_signal awakens one thread that is waiting on the condition variable. There are instances when a thread knows that multiple threads should be awakened, in which case, pthread_cond_broadcast will wake up all threads that are blocked on the condition variable.


int pthread_cond_broadcast (pthread_cond_t * cptr);

int pthread_cond_timedwait (pthread_cond_t * cptr, pthread_mutex_t *mptr, const struct timespec *abstime);

Both return: 0 if OK, positive Exxx value on error

Raw Socket Creation

The steps involved in creating a raw socket are as follows:

1. The socket function creates a raw socket when the second argument is SOCK_RAW. The third argument (the protocol) is normally nonzero. For example, to create an IPv4 raw socket we would write

    int sockfd;

    sockfd = socket(AF_INET, SOCK_RAW, protocol);

where protocol is one of the constants, IPPROTO_xxx, defined by including the <netinet/in.h> header, such as IPPROTO_ICMP.

Only the superuser can create a raw socket. This prevents normal users from writing their own IP datagrams to the network.

2. The IP_HDRINCL socket option can be set as follows:

    const int on = 1;

    if (setsockopt(sockfd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0)
        error

We will describe the effect of this socket option in the next section.

3. bind can be called on the raw socket, but this is rare. This function sets only the local address: There is no concept of a port number with a raw socket. With regard to output, calling bind sets the source IP address that will be used for datagrams sent on the raw socket (but only if the IP_HDRINCL socket option is not set). If bind is not called, the kernel sets the source IP address to the primary IP address of the outgoing interface.

4. connect can be called on the raw socket, but this is rare. This function sets only the foreign address: Again, there is no concept of a port number with a raw socket. With regard to output, calling connect lets us call write or send instead of sendto, since the destination IP address is already specified.
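The creation steps above can be combined into one small function. This is a sketch only: the name open_raw_icmp is hypothetical, and the optional bind and connect steps are left out. Because only the superuser can create a raw socket, an unprivileged caller should expect the socket call to fail.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an IPv4 raw socket for ICMP and, if
 * hdrincl is 1, turn on the IP_HDRINCL option.
 * Returns the descriptor, or -1 with errno set on failure.
 */
int
open_raw_icmp(int hdrincl)
{
    const int on = 1;
    int fd, saved;

    assert(hdrincl == 0 || hdrincl == 1);

    fd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);
    if (fd < 0)
        return -1;              /* usually a permission error when not superuser */

    if (hdrincl &&
        setsockopt(fd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) {
        saved = errno;          /* close may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```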

Raw Socket Output

Output on a raw socket is governed by the following rules:

Normal output is performed by calling sendto or sendmsg and specifying the destination IP address. write, writev, or send can also be called if the socket has been connected.

If the IP_HDRINCL option is not set, the starting address of the data for the kernel to send specifies the first byte following the IP header because the kernel will build the IP header and prepend it to the data from the process. The kernel sets the protocol field of the IPv4 header that it builds to the third argument from the call to socket.

If the IP_HDRINCL option is set, the starting address of the data for the kernel to send specifies the first byte of the IP header. The amount of data to write must include the size of the caller's IP header. The process builds the entire IP header, except: (i) the IPv4 identification field can be set to 0, which tells the kernel to set this value; (ii) the kernel always calculates and stores the IPv4 header checksum; and (iii) IP options may or may not be included.

The kernel fragments raw packets that exceed the outgoing interface MTU.

Raw sockets are documented to provide an identical interface to the one a protocol would have if it was resident in the kernel [McKusick et al. 1996]. Unfortunately, this means that certain pieces of the API are dependent on the OS kernel, specifically with regard to the byte ordering of the fields in the IP header. On many Berkeley-derived kernels, all fields are in network byte order except ip_len and ip_off, which are in host byte order (pp. 233 and 1057 of TCPv2). On Linux and OpenBSD, however, all the fields must be in network byte order.

The IP_HDRINCL socket option was introduced with 4.3BSD Reno. Before this, the only way for an application to specify its own IP header in packets sent on a raw IP socket was to apply a kernel patch that was introduced in 1988 by Van Jacobson to support traceroute. This patch required the application to create a raw IP socket specifying a protocol of IPPROTO_RAW, which has a value of 255 (and is a reserved value and must never appear as the protocol field in an IP header).

The functions that perform input and output on raw sockets are some of the simplest in the kernel. For example, in TCPv2, each function requires about 40 lines of C code (pp. 1054–1057), compared to TCP input at about 2,000 lines and TCP output at about 700 lines.

Raw Socket Input

The first question that we must answer regarding raw socket input is: Which received IP datagrams does the kernel pass to raw sockets? The following rules apply:

Received UDP packets and received TCP packets are never passed to a raw socket. If a process wants to read IP datagrams containing UDP or TCP packets, the packets must be read at the datalink layer, as described in Chapter 29.

Most ICMP packets are passed to a raw socket after the kernel has finished processing the ICMP message. Berkeley-derived implementations pass all received ICMP packets to a raw socket other than echo request, timestamp request, and address mask request (pp. 302–303 of TCPv2). These three ICMP messages are processed entirely by the kernel.

All IGMP packets are passed to a raw socket after the kernel has finished processing the IGMP message.

All IP datagrams with a protocol field that the kernel does not understand are passed to a raw socket. The only kernel processing done on these packets is the minimal verification of some IP header fields: the IP version, IPv4 header checksum, header length, and destination IP address (pp. 213–220 of TCPv2).

If the datagram arrives in fragments, nothing is passed to a raw socket until all fragments have arrived and have been reassembled.

ICMPv6 Type Filtering

To reduce the number of packets passed from the kernel to the application across a raw ICMPv6 socket, an application-specified filter is provided. A filter is declared with a datatype of struct icmp6_filter, which is defined by including <netinet/icmp6.h>. The current filter for a raw ICMPv6 socket is set and fetched using setsockopt and getsockopt with a level of IPPROTO_ICMPV6 and an optname of ICMP6_FILTER.

Six macros operate on the icmp6_filter structure.

#include <netinet/icmp6.h>

void ICMP6_FILTER_SETPASSALL (struct icmp6_filter *filt);

void ICMP6_FILTER_SETBLOCKALL (struct icmp6_filter *filt);

void ICMP6_FILTER_SETPASS (int msgtype, struct icmp6_filter *filt);

void ICMP6_FILTER_SETBLOCK (int msgtype, struct icmp6_filter *filt);

int ICMP6_FILTER_WILLPASS (int msgtype, const struct icmp6_filter *filt);

int ICMP6_FILTER_WILLBLOCK (int msgtype, const struct icmp6_filter *filt);

Both return: 1 if filter will pass (block) message type, 0 otherwise
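A typical use of these macros, sketched here under the assumption that the application is interested only in echo replies, is to block every message type and then pass the one type of interest. The function names filter_only_echo_reply and install_filter are hypothetical.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/icmp6.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: set *filt to pass only ICMPv6 echo replies. */
void
filter_only_echo_reply(struct icmp6_filter *filt)
{
    assert(filt != NULL);
    ICMP6_FILTER_SETBLOCKALL(filt);                 /* start by blocking everything */
    ICMP6_FILTER_SETPASS(ICMP6_ECHO_REPLY, filt);   /* then allow one type */
}

/*
 * Hypothetical helper: install the filter on a raw ICMPv6 socket.
 * Returns the setsockopt result (0 if OK, -1 on error).
 */
int
install_filter(int sockfd, const struct icmp6_filter *filt)
{
    return setsockopt(sockfd, IPPROTO_ICMPV6, ICMP6_FILTER,
                      filt, sizeof(*filt));
}
```

The filter itself is plain data, so it can be prepared and inspected with ICMP6_FILTER_WILLPASS and ICMP6_FILTER_WILLBLOCK before any socket exists.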

ping Program

The operation of ping is extremely simple: An ICMP echo request is sent to some IP address and that node responds with an ICMP echo reply. These two ICMP messages are supported under both IPv4 and IPv6. Figure 28.1 shows the format of the ICMP messages.

Figure 28.1. Format of ICMPv4 and ICMPv6 echo request and echo reply messages.
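To make the message layout concrete, the following sketch builds an ICMPv4 echo request in a byte buffer and fills in the Internet checksum. It is a simplified stand-in, not the send_v4 function of the ping program: the names in_cksum_sketch and build_echo_request are hypothetical, the header is written byte by byte instead of through struct icmp, and the data area is filled with a fixed pattern where the real program stores a timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ICMP_TYPE_ECHO_REQUEST 8    /* ICMPv4 echo request type */
#define ICMP_HDRLEN 8               /* type, code, checksum, ID, sequence */

/* Internet checksum: one's complement of the one's complement sum of
 * 16-bit words, computed in native byte order so the result can be
 * stored into the packet as is. */
uint16_t
in_cksum_sketch(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;
    uint16_t w;

    assert(len <= 65535);       /* keeps the 32-bit accumulator from overflowing */
    while (len > 1) {
        memcpy(&w, p, 2);
        sum += w;
        p += 2;
        len -= 2;
    }
    if (len == 1) {             /* odd trailing byte, padded with a zero byte */
        w = 0;
        memcpy(&w, p, 1);
        sum += w;
    }
    sum = (sum >> 16) + (sum & 0xffff);     /* fold carries into low 16 bits */
    sum += (sum >> 16);
    return (uint16_t) ~sum;
}

/* Build an echo request with datalen bytes of data; returns its length. */
size_t
build_echo_request(void *buf, size_t buflen, uint16_t id, uint16_t seq,
                   size_t datalen)
{
    uint8_t *p = buf;
    size_t len = ICMP_HDRLEN + datalen;
    uint16_t cksum;

    assert(buf != NULL && buflen >= len);
    p[0] = ICMP_TYPE_ECHO_REQUEST;
    p[1] = 0;                               /* code */
    p[2] = p[3] = 0;                        /* checksum is 0 while summing */
    p[4] = (uint8_t) (id >> 8);             /* identifier, network order */
    p[5] = (uint8_t) (id & 0xff);
    p[6] = (uint8_t) (seq >> 8);            /* sequence number, network order */
    p[7] = (uint8_t) (seq & 0xff);
    memset(p + ICMP_HDRLEN, 0xa5, datalen); /* fixed fill pattern */

    cksum = in_cksum_sketch(p, len);
    memcpy(p + 2, &cksum, sizeof(cksum));
    return len;
}
```

A receiver can validate such a message by running the same checksum over the whole message, checksum field included: the result is 0 for an undamaged message.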

Sample output from our ping program.

freebsd % ping www.google.com
PING www.google.com (216.239.57.99): 56 data bytes
64 bytes from 216.239.57.99: seq=0, ttl=53, rtt=5.611 ms
64 bytes from 216.239.57.99: seq=1, ttl=53, rtt=5.562 ms
64 bytes from 216.239.57.99: seq=2, ttl=53, rtt=5.589 ms
64 bytes from 216.239.57.99: seq=3, ttl=53, rtt=5.910 ms

freebsd % ping www.kame.net
PING orange.kame.net (2001:200:0:4819:203:47ff:fea5:3085): 56 data bytes
64 bytes from 2001:200:0:4819:203:47ff:fea5:3085: seq=0, hlim=52, rtt=422.066 ms
64 bytes from 2001:200:0:4819:203:47ff:fea5:3085: seq=1, hlim=52, rtt=417.398 ms
64 bytes from 2001:200:0:4819:203:47ff:fea5:3085: seq=2, hlim=52, rtt=416.528 ms
64 bytes from 2001:200:0:4819:203:47ff:fea5:3085: seq=3, hlim=52, rtt=429.192 ms

Overview of the functions in our ping program


main function.

ping/main.c

#include "ping.h"

struct proto proto_v4 =
    { proc_v4, send_v4, NULL, NULL, NULL, 0, IPPROTO_ICMP };

#ifdef IPV6
struct proto proto_v6 =
    { proc_v6, send_v6, init_v6, NULL, NULL, 0, IPPROTO_ICMPV6 };
#endif

int datalen = 56;               /* data that goes with ICMP echo request */

int
main(int argc, char **argv)
{
    int c;
    struct addrinfo *ai;
    char *h;

    opterr = 0;                 /* don't want getopt() writing to stderr */
    while ( (c = getopt(argc, argv, "v")) != -1) {
        switch (c) {
        case 'v':
            verbose++;
            break;

        case '?':
            err_quit("unrecognized option: %c", c);
        }
    }

    if (optind != argc - 1)
        err_quit("usage: ping [ -v ] <hostname>");
    host = argv[optind];

    pid = getpid() & 0xffff;    /* ICMP ID field is 16 bits */
    Signal(SIGALRM, sig_alrm);

    ai = Host_serv(host, NULL, 0, 0);

    h = Sock_ntop_host(ai->ai_addr, ai->ai_addrlen);
    printf("PING %s (%s): %d data bytes\n",
           ai->ai_canonname ? ai->ai_canonname : h, h, datalen);

    /* initialize according to protocol */
    if (ai->ai_family == AF_INET) {
        pr = &proto_v4;
#ifdef IPV6
    } else if (ai->ai_family == AF_INET6) {
        pr = &proto_v6;
        if (IN6_IS_ADDR_V4MAPPED(&(((struct sockaddr_in6 *)
                                    ai->ai_addr)->sin6_addr)))
            err_quit("cannot ping IPv4-mapped IPv6 address");
#endif
    } else
        err_quit("unknown address family %d", ai->ai_family);

    pr->sasend = ai->ai_addr;
    pr->sarecv = Calloc(1, ai->ai_addrlen);
    pr->salen = ai->ai_addrlen;

    readloop();

    exit(0);
}

readloop function.

ping/readloop.c

#include "ping.h"

void
readloop(void)
{
    int size;
    char recvbuf[BUFSIZE];
    char controlbuf[BUFSIZE];
    struct msghdr msg;
    struct iovec iov;
    ssize_t n;
    struct timeval tval;

    sockfd = Socket(pr->sasend->sa_family, SOCK_RAW, pr->icmpproto);
    setuid(getuid());           /* don't need special permissions any more */
    if (pr->finit)
        (*pr->finit) ();

    size = 60 * 1024;           /* OK if setsockopt fails */
    setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size));

    sig_alrm(SIGALRM);          /* send first packet */

    iov.iov_base = recvbuf;
    iov.iov_len = sizeof(recvbuf);
    msg.msg_name = pr->sarecv;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = controlbuf;
    for ( ; ; ) {
        msg.msg_namelen = pr->salen;
        msg.msg_controllen = sizeof(controlbuf);
        n = recvmsg(sockfd, &msg, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by SIGALRM; read again */
            else
                err_sys("recvmsg error");
        }
        Gettimeofday(&tval, NULL);
        (*pr->fproc) (recvbuf, n, &msg, &tval);
    }
}

Get pointer to ICMP header

Headers, pointers, and lengths in processing ICMPv4 reply

traceroute Program

traceroute lets us determine the path that IP datagrams follow from our host to some other destination. Its operation is simple and Chapter 8 of TCPv1 covers it in detail with numerous examples of its usage. traceroute uses the IPv4 TTL field or the IPv6 hop limit field and two ICMP messages. It starts by sending a UDP datagram to the destination with a TTL (or hop limit) of 1. This datagram causes the first-hop router to return an ICMP "time exceeded in transit" error. The TTL is then increased by one and another UDP datagram is sent, which locates the next router in the path. When the UDP datagram reaches the final destination, the goal is to have that host return an ICMP "port unreachable" error. This is done by sending the UDP datagram to a random port that is (hopefully) not in use on that host.
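The key socket operation behind this description is changing the TTL of outgoing datagrams between probes. The sketch below shows that step for an IPv4 UDP socket only. The function name set_probe_ttl is hypothetical, and the read-back with getsockopt is added here purely so that the effect can be checked.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: set the TTL used for datagrams sent on an IPv4
 * socket, then read the option back.
 * Returns the TTL now in effect, or -1 on error.
 */
int
set_probe_ttl(int fd, int ttl)
{
    int cur = 0;
    socklen_t len = sizeof(cur);

    assert(ttl >= 1 && ttl <= 255);     /* the IPv4 TTL field is 8 bits */

    if (setsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) < 0)
        return -1;
    if (getsockopt(fd, IPPROTO_IP, IP_TTL, &cur, &len) < 0)
        return -1;
    return cur;
}
```

A traceroute-style loop would call this with 1, 2, 3, and so on before each group of probes. For IPv6 the corresponding option is IPV6_UNICAST_HOPS at level IPPROTO_IPV6.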

trace.h header.

traceroute/trace.h

#include "unp.h"
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/ip_icmp.h>
#include <netinet/udp.h>

#define BUFSIZE 1500

struct rec {                    /* of outgoing UDP data */
    u_short rec_seq;            /* sequence number */
    u_short rec_ttl;            /* TTL packet left with */
    struct timeval rec_tv;      /* time packet left */
};

/* globals */
char recvbuf[BUFSIZE];
char sendbuf[BUFSIZE];

int datalen;                    /* # bytes of data following ICMP header */
char *host;
u_short sport, dport;
int nsent;                      /* add 1 for each sendto() */
pid_t pid;                      /* our PID */
int probe, nprobes;

int sendfd, recvfd;             /* send on UDP sock, read on raw ICMP sock */
int ttl, max_ttl;
int verbose;

/* function prototypes */
const char *icmpcode_v4(int);
const char *icmpcode_v6(int);
int recv_v4(int, struct timeval *);
int recv_v6(int, struct timeval *);
void sig_alrm(int);
void traceloop(void);
void tv_sub(struct timeval *, struct timeval *);

struct proto {
    const char *(*icmpcode) (int);
    int (*recv) (int, struct timeval *);
    struct sockaddr *sasend;    /* sockaddr{} for send, from getaddrinfo */
    struct sockaddr *sarecv;    /* sockaddr{} for receiving */
    struct sockaddr *salast;    /* last sockaddr{} for receiving */
    struct sockaddr *sabind;    /* sockaddr{} for binding source port */
    socklen_t salen;            /* length of sockaddr{}s */
    int icmpproto;              /* IPPROTO_xxx value for ICMP */
    int ttllevel;               /* setsockopt() level to set TTL */
    int ttloptname;             /* setsockopt() name to set TTL */
} *pr;

#ifdef IPV6

#include <netinet/ip6.h>
#include <netinet/icmp6.h>

#endif

main function for traceroute program.

traceroute/main.c

#include "trace.h"

struct proto proto_v4 = { icmpcode_v4, recv_v4, NULL, NULL, NULL, NULL, 0,
    IPPROTO_ICMP, IPPROTO_IP, IP_TTL
};

#ifdef IPV6
struct proto proto_v6 = { icmpcode_v6, recv_v6, NULL, NULL, NULL, NULL, 0,
    IPPROTO_ICMPV6, IPPROTO_IPV6, IPV6_UNICAST_HOPS
};
#endif

int datalen = sizeof(struct rec);   /* defaults */
int max_ttl = 30;
int nprobes = 3;
u_short dport = 32768 + 666;

int
main(int argc, char **argv)
{
    int c;
    struct addrinfo *ai;
    char *h;

    opterr = 0;                 /* don't want getopt() writing to stderr */
    while ( (c = getopt(argc, argv, "m:v")) != -1) {
        switch (c) {
        case 'm':
            if ( (max_ttl = atoi(optarg)) <= 1)
                err_quit("invalid -m value");
            break;

        case 'v':
            verbose++;
            break;

        case '?':
            err_quit("unrecognized option: %c", c);
        }
    }

    if (optind != argc - 1)
        err_quit("usage: traceroute [ -m <maxttl> -v ] <hostname>");
    host = argv[optind];

    pid = getpid();
    Signal(SIGALRM, sig_alrm);

    ai = Host_serv(host, NULL, 0, 0);

    h = Sock_ntop_host(ai->ai_addr, ai->ai_addrlen);
    printf("traceroute to %s (%s) : %d hops max, %d data bytes\n",
           ai->ai_canonname ? ai->ai_canonname : h, h, max_ttl, datalen);

    /* initialize according to protocol */
    if (ai->ai_family == AF_INET) {
        pr = &proto_v4;
#ifdef IPV6
    } else if (ai->ai_family == AF_INET6) {
        pr = &proto_v6;
        if (IN6_IS_ADDR_V4MAPPED
            (&(((struct sockaddr_in6 *) ai->ai_addr)->sin6_addr)))
            err_quit("cannot traceroute IPv4-mapped IPv6 address");
#endif
    } else
        err_quit("unknown address family %d", ai->ai_family);

    pr->sasend = ai->ai_addr;   /* contains destination address */
    pr->sarecv = Calloc(1, ai->ai_addrlen);
    pr->salast = Calloc(1, ai->ai_addrlen);
    pr->sabind = Calloc(1, ai->ai_addrlen);
    pr->salen = ai->ai_addrlen;

    traceloop();

    exit(0);
}
