Stream Socket Programming

Idioms and pitfalls

Stream Socket Characteristics

• Transmissions across a stream socket are considered to be a continuous stream of bytes.

• Any other structure must be created by the applications participating in the communication.

• "Message" boundaries are not guaranteed to be preserved.

• Most applications DO want to communicate in terms of a series of separate messages.

Application Protocols

• Rules the processes involved in your application use to communicate.

• Includes:
    – Allowed message types (formats)
    – Rules about when each message type can be sent.

• Programming language structures don't always map exactly onto your message formats

Banking Example

• Report total deposit amount (in pennies), total withdrawal amount (in pennies), number of deposits, number of withdrawals.

• Design decision #1: encoding scheme
    – Character strings of digits
    – Binary integer values

Character Encoding

• Advantages
    – No limit on size of values that can be encoded
    – No byte ordering issues

• Disadvantages
    – Inefficient
    – Easy to get buffer size wrong or waste space
    – Must be careful about delimiters
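As a rough illustration of the character-string option, the sketch below encodes the four banking values as digit strings. The helper name encode_bank_text, the space separator, and the newline terminator are assumptions made for this example, not part of any defined protocol. It checks the snprintf result so that a buffer that is too small is reported instead of silently truncating the message.

```c
#include <stdio.h>

/* Hypothetical helper: encode the four bank totals as decimal digit
 * strings separated by spaces and terminated by '\n'.
 * (The separator and terminator are assumptions for this sketch.)
 * Returns the message length, or -1 if buf is too small. */
int encode_bank_text(char *buf, size_t buflen,
                     long depositAmt, long withdrawAmt,
                     int depositCnt, int withdrawCnt)
{
    int n = snprintf(buf, buflen, "%ld %ld %d %d\n",
                     depositAmt, withdrawAmt, depositCnt, withdrawCnt);
    if (n < 0 || (size_t)n >= buflen)
        return -1;              /* formatting failed or output truncated */
    return n;                   /* bytes to send, not counting the NUL */
}
```

The receiver has to scan for the delimiters to find where each value ends, which is the "must be careful about delimiters" cost listed above.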

Binary numeric encoding

• Advantages
    – Uses fewer bits to represent a given value
    – Fields in message are a fixed number of bytes

• Disadvantages
    – Byte ordering is significant – must use hton/ntoh
    – Building structs to represent messages isn't always straightforward (alignment issues)
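To make the byte-ordering point concrete, here is a small sketch that writes a 32-bit value into a byte buffer in network order and reads it back. The helper names put_u32_network and get_u32_network are made up for this example; htonl and ntohl are the standard conversion functions.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Copy a 32-bit value into out[0..3] in network (big-endian) byte order. */
void put_u32_network(unsigned char out[4], uint32_t host_value)
{
    uint32_t net = htonl(host_value);   /* host order -> network order */
    memcpy(out, &net, sizeof net);
}

/* Read the value back: network order -> host order. */
uint32_t get_u32_network(const unsigned char in[4])
{
    uint32_t net;
    memcpy(&net, in, sizeof net);
    return ntohl(net);
}
```

Whatever the host's native byte order, the most significant byte ends up first in the buffer, so both ends of the connection agree on the layout.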

Alignment rules

• Compilers lay out structs to maximize alignment.

• Fields are allocated in the order they are declared

• A data value is aligned if its address is a multiple of its size (e.g. 32 bit ints – 4 bytes – at addresses divisible by 4)

• A struct is aligned if its address is a multiple of the size of the largest data type it contains.

• Unnamed padding bytes are added to keep struct members aligned.

Example

    struct mixedData {
        char Data1;
        short Data2;
        int Data3;
        char Data4;
    };

Total data bytes = 1 + 2 + 4 + 1 = 8

Example

    struct mixedData /* after compilation */
    {
        char Data1;
        char Padding0[1]; /* So following 'short' is on a 2 byte boundary */
        short Data2;
        int Data3;
        char Data4;
        char Padding1[3];
    };

Total bytes = 1 + 1 + 2 + 4 + 1 + 3 = 12

Size is a multiple of sizeof(int)

To avoid padding:

    struct mixedData2 {
        int Data3;
        short Data2;
        char Data1;
        char Data4;
    };

Data items are declared in order of decreasing size. Assuming the actual space needed is a multiple of the size of the largest data item, no padding is needed.

sizeof(struct mixedData2) = 8
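Rather than working out padding by hand, you can ask the compiler where it put each member. The sketch below uses offsetof and sizeof on the two structs from the examples; the function name print_layout is invented for this illustration. The numbers it prints depend on the compiler and platform, so treat the 12 and 8 above as what a typical ABI with 2-byte shorts and 4-byte ints produces, not as a guarantee.

```c
#include <stddef.h>
#include <stdio.h>

struct mixedData  { char Data1; short Data2; int Data3; char Data4; };
struct mixedData2 { int Data3; short Data2; char Data1; char Data4; };

/* Print the offset the compiler chose for each member, plus total size.
 * Gaps between consecutive offsets are the unnamed padding bytes. */
void print_layout(void)
{
    printf("mixedData:  Data1@%zu Data2@%zu Data3@%zu Data4@%zu size=%zu\n",
           offsetof(struct mixedData, Data1),
           offsetof(struct mixedData, Data2),
           offsetof(struct mixedData, Data3),
           offsetof(struct mixedData, Data4),
           sizeof(struct mixedData));
    printf("mixedData2: Data3@%zu Data2@%zu Data1@%zu Data4@%zu size=%zu\n",
           offsetof(struct mixedData2, Data3),
           offsetof(struct mixedData2, Data2),
           offsetof(struct mixedData2, Data1),
           offsetof(struct mixedData2, Data4),
           sizeof(struct mixedData2));
}
```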

Bank Example

    struct bankMsg {
        int depositAmt;
        short depositCnt;
        int withdrawAmt;
        short withdrawCnt;
    };

• Total data size is 4 + 2 + 4 + 2 bytes = 12 bytes

• On RHEL 5, using gcc, sizeof(struct bankMsg) = 16. Why?

No padding needed:

    struct bankMsg {
        int depositAmt;
        int withdrawAmt;
        short depositCnt;
        short withdrawCnt;
    };

You can also add the padding to your definition so it is explicit. (Recall sockaddr_in)

Parsing received messages

• If the fields are fixed size, we can just send and receive the associated struct:

    struct bankMsg msg;
    /* Convert each field to network byte order before sending */
    msg.depositAmt = htonl(2324234);
    msg.withdrawAmt = htonl(2232344);
    msg.depositCnt = htons(50);
    msg.withdrawCnt = htons(42);

    send(s, &msg, sizeof(msg), 0);
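One pitfall on the sending side: on a stream socket, send() may accept fewer bytes than you asked it to send, just as recv() may return fewer than you asked for. A minimal sketch of a loop that keeps sending until the whole message is written follows; the name send_all is hypothetical, and handling of interrupted calls (EINTR) is left out for brevity.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: call send() repeatedly until all len bytes
 * have been handed to the socket.
 * Returns 0 on success, -1 on error (errno is set by send()). */
int send_all(int s, const void *data, size_t len)
{
    const char *p = data;
    size_t sent = 0;

    while (sent < len) {
        ssize_t rv = send(s, p + sent, len - sent, 0);
        if (rv < 0)
            return -1;
        sent += (size_t)rv;     /* advance past the bytes just sent */
    }
    return 0;
}
```

With this helper, the send above could be written as send_all(s, &msg, sizeof(msg)).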

Parsing received messages

• In the receiving process:

    struct bankMsg msg;
    char *buffer = (char *) &msg;   /* char * so pointer arithmetic is valid C */
    size_t rbytes;
    ssize_t rv;
    ...
    for (rbytes = 0; rbytes < sizeof(msg); rbytes += rv) {
        rv = recv(s, buffer + rbytes, sizeof(msg) - rbytes, 0);
        if (rv <= 0) {
            /* handle error (rv < 0) or connection closed by peer (rv == 0) */
            break;
        }
    }
    /* Fix byte order! */
    msg.depositAmt = ntohl(msg.depositAmt);
    ...

Portability/Compatibility

• The C/C++ standards don't specify alignment rules – left up to compiler implementations

• Say you don't pay attention to padding when you define message format structs in your code. What can go wrong?
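One way to sidestep the question entirely is to stop sending the struct's memory image and copy each field into a byte buffer at a fixed offset instead. The sketch below does that for the bank message. The names pack_bank_msg, unpack_bank_msg and BANK_MSG_WIRE_SIZE, the field order on the wire, and the use of fixed-width int32_t/int16_t in place of int and short are all assumptions made for this example.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define BANK_MSG_WIRE_SIZE 12   /* 4 + 4 + 2 + 2: no padding on the wire */

struct bankMsg {
    int32_t depositAmt;
    int32_t withdrawAmt;
    int16_t depositCnt;
    int16_t withdrawCnt;
};

/* Copy each field to its fixed wire offset in network byte order. */
void pack_bank_msg(unsigned char *wire, const struct bankMsg *m)
{
    uint32_t da = htonl((uint32_t)m->depositAmt);
    uint32_t wa = htonl((uint32_t)m->withdrawAmt);
    uint16_t dc = htons((uint16_t)m->depositCnt);
    uint16_t wc = htons((uint16_t)m->withdrawCnt);

    memcpy(wire + 0,  &da, 4);
    memcpy(wire + 4,  &wa, 4);
    memcpy(wire + 8,  &dc, 2);
    memcpy(wire + 10, &wc, 2);
}

/* Reverse of pack_bank_msg: wire bytes -> host-order struct fields. */
void unpack_bank_msg(struct bankMsg *m, const unsigned char *wire)
{
    uint32_t da, wa;
    uint16_t dc, wc;

    memcpy(&da, wire + 0,  4);
    memcpy(&wa, wire + 4,  4);
    memcpy(&dc, wire + 8,  2);
    memcpy(&wc, wire + 10, 2);

    m->depositAmt  = (int32_t)ntohl(da);
    m->withdrawAmt = (int32_t)ntohl(wa);
    m->depositCnt  = (int16_t)ntohs(dc);
    m->withdrawCnt = (int16_t)ntohs(wc);
}
```

The wire format is then always 12 bytes, no matter how either side's compiler pads struct bankMsg in memory.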

Delimited char strings

• Sending multiple messages consisting of variable-length delimited character strings is tricky.

• Doing a receive may give you bytes from more than one message.

• Depending on how you have structured your parsing routines, it may be complicated to track which bytes of a receive buffer you have parsed.

Delimited char strings

• One solution: for each delimited string you expect, receive one byte at a time until you receive the delimiter.

• Leaves subsequent strings waiting in the received stream for subsequent calls to recv().

• Downside: slower than multi-byte receives, but if you don't know how many characters to expect, you're kind of stuck.
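The one-byte-at-a-time approach can be sketched as follows. The function name recv_delimited and its return conventions are assumptions for this example; the caller chooses the delimiter.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: read one byte at a time until delim arrives.
 * Stores the string, without the delimiter, in buf and NUL-terminates it.
 * Returns the string length, or -1 on error, closed connection,
 * or if the string does not fit in buf. */
ssize_t recv_delimited(int s, char *buf, size_t buflen, char delim)
{
    size_t n = 0;

    if (buflen == 0)
        return -1;

    for (;;) {
        char c;
        ssize_t rv = recv(s, &c, 1, 0);
        if (rv <= 0)
            return -1;          /* error, or peer closed before delimiter */
        if (c == delim)
            break;
        if (n + 1 >= buflen)
            return -1;          /* no room for this byte plus the NUL */
        buf[n++] = c;
    }
    buf[n] = '\0';
    return (ssize_t)n;
}
```

Because it never reads past the delimiter, any bytes belonging to the next string stay in the socket for the next call.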

Working with core dumps

• When a program crashes, it can create a file containing the image of the address space of the process at the time of the crash – a "core dump"

• Debuggers can examine these files and show you precisely where the error occurred. Good for tracking down segmentation faults.

Requirements

• You must compile your program with -g to preserve symbol table information for the debugger

• You need to be allowed to create core files in your account.

• Use the ulimit -c command to check. If it prints 0, no core files will be created.

• Change with "ulimit -c unlimited"

• Caution: unless you pass -S, ulimit changes the hard limit as well as the soft limit, and an ordinary user cannot raise a hard limit once it has been lowered. In practice you can only do this once per login session; you can't switch back and forth.
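The value that ulimit -c reports is the process's RLIMIT_CORE resource limit, which a program can also read with getrlimit(). The sketch below checks the soft limit; the function name core_dumps_enabled is invented for this example.

```c
#include <sys/resource.h>

/* Returns 1 if the soft core-file size limit is nonzero,
 * 0 if core files are disabled (limit is 0),
 * -1 if the limit could not be read. */
int core_dumps_enabled(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return rl.rlim_cur != 0;
}
```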

Using core dumps

• If a core dump is created, you will see "(core dumped)" in the error message

• Linux creates a file called core.n, where n is a unique number (typically the process ID of the crashed program)

• To examine the core file, use gdb (ddd also works):

    gdb executable-name corefile-name

What you see

• gdb reports where the error occurred, e.g.

    #0 0x08048370 in a (p=0x0) at test_core.c:11
    11 int y = *p;

• a is the function name

• p is the variable that caused the problem (a null pointer, 0x0, being dereferenced)

• test_core.c is the name of the source file, 11 is the line number

• you can use gdb to examine variable values, etc.

Caveats

• Core files are big, and because of the Linux naming convention, you will create a separate one every time a program crashes.

• Pay attention to creation dates, and make sure you're examining the latest core dump.

• Periodically delete core files. Make core.* one of the things the "clean" target in your makefile cleans up

• Don't set ulimit to "unlimited" unless you need to examine a core dump.

A reasonable tutorial

• http://www.network-theory.co.uk/articles/gccdebug.html
