Just Enough C For Open Source Projects Andy Lester OSCON 2008

Post on 17-Oct-2014

Page 1: Just Enough C For Open Source Projects

Just Enough C For Open Source Projects

Andy Lester

OSCON 2008

Page 2: Just Enough C For Open Source Projects

Why this session?

Page 3: Just Enough C For Open Source Projects

It's all for Michael Schwern

Schwern is a brilliant programmer.

An invaluable asset to the Perl community.

He still doesn't know C, even though Perl is written in C.

Page 4: Just Enough C For Open Source Projects

My assumptions

• You're like Schwern.

• Already a programmer, but raised on Ruby, Python, Java or PHP.

• You don't know what it was like in the bad old days.

• You want to work on open source projects.

Page 5: Just Enough C For Open Source Projects

Goals

• As much territory as possible

• As fast as possible

• Danger zones

• Impedance mismatches

Page 6: Just Enough C For Open Source Projects

Big differences

• Nothing is DWIM.

• All memory must be explicitly handled.

• Nothing automatically does anything.

• No strings extending.

• No magic type conversions.

Page 7: Just Enough C For Open Source Projects

Jumping in

Page 8: Just Enough C For Open Source Projects

Hello, world!

    #include <stdio.h>

    int main( int argc, char *argv[] ) {
        puts( "Hello, World!" );

        return 0;
    }

Page 9: Just Enough C For Open Source Projects

Build & run

    uniqua:~/ctut/c : gcc -Wall -o hello hello.c

    uniqua:~/ctut/c : ls -l
    total 20
    -rwxr-xr-x 1 andy andy 6592 Jul 6 01:02 hello*
    -rw-r--r-- 1 andy andy 103 Jul 6 01:00 hello.c

    uniqua:~/ctut/c : ./hello
    Hello, World!

Page 10: Just Enough C For Open Source Projects

Build & run

• gcc compiles hello.c into an object file (hello.o if you stop there with -c).

• gcc then calls the linker to make an executable program out of the object file: a.out by default, or hello if you say -o hello.

• main() has to be called main(). Otherwise linking won’t work.

Page 11: Just Enough C For Open Source Projects

Literals

• Strings are double-quoted.

• Characters are single-quoted.

Page 12: Just Enough C For Open Source Projects

Variables

• Variables have no sigils.

• Integers are int, and can be unsigned.

• Long integers are long, can be unsigned.

• Floating point numbers are float or double.

• Characters are char.

• There is no string type.

Page 13: Just Enough C For Open Source Projects

Variables

• Local variables are never initialized for you.

• You create a variable, it's going to contain whatever happens to be in that spot in memory. It is almost never what you want.

Page 14: Just Enough C For Open Source Projects

Casting variables

To change an int to a long, say:

    int n;
    long l;
    l = (long)n;

Upcasting to a bigger size is implicit.

    l = n;

Downcasting is dangerous

    n = (int)l; /* Could lose significant bits */

Page 15: Just Enough C For Open Source Projects

Converting values

Convert strings to numbers with atoi and atof

    int i = atoi( "1234" );
    float f = atof( "1234.5678" );

Convert numbers to strings with sprintf

    sprintf( str, "%d", 1234 );
    sprintf( str, "%8.4f", 1234.5678 );
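atoi gives you no way to tell "0" from garbage, and sprintf trusts that str is big enough. Here is a minimal sketch of more defensive versions, assuming strtol and snprintf are available (C99). The helper names parse_int and format_int are made up for this example.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 int.
   Returns 0 on success, -1 if text is not a clean int. */
int parse_int( const char *text, int *out ) {
    char *end;
    long value;

    errno = 0;
    value = strtol( text, &end, 10 );
    if ( end == text || *end != '\0' ) {
        return -1; /* No digits at all, or trailing junk */
    }
    if ( errno == ERANGE || value < INT_MIN || value > INT_MAX ) {
        return -1; /* Does not fit in an int */
    }
    *out = (int)value;
    return 0;
}

/* Hypothetical helper: format n into buf without overrunning it.
   Returns 0 on success, -1 if the text did not fit. */
int format_int( char *buf, size_t bufsize, int n ) {
    int written = snprintf( buf, bufsize, "%d", n );
    return ( written >= 0 && (size_t)written < bufsize ) ? 0 : -1;
}
```

A caller would check the return value before trusting the result, for example `if ( parse_int( argv[1], &count ) != 0 ) { /* complain */ }`.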

Page 16: Just Enough C For Open Source Projects

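Numeric max/min

ints can only be so big. Unsigned ints are guaranteed to wrap. Signed ints usually wrap as shown here, but signed overflow is officially undefined behavior, so don't count on it.

    #include <limits.h>
    #include <stdio.h>

    int n = INT_MAX;
    printf( "n = %d\n", n );
    n++;
    printf( "n = %d\n", n );

    unsigned int u = UINT_MAX;
    printf( "u = %u\n", u );
    u++;
    printf( "u = %u\n", u );

Typical output with 32-bit ints:

    n = 2147483647
    n = -2147483648
    u = 4294967295
    u = 0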
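Since the wraparound on the signed side is not something to rely on, one option is to check before you increment. A minimal sketch, with a made-up helper name checked_increment:

```c
#include <limits.h>

/* Hypothetical helper: add 1 to *n unless that would overflow.
   Returns 0 on success, -1 if *n is already INT_MAX (and leaves it alone). */
int checked_increment( int *n ) {
    if ( *n == INT_MAX ) {
        return -1;
    }
    (*n)++;
    return 0;
}
```

The comparison happens before the arithmetic, so the overflow never actually takes place.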

Page 17: Just Enough C For Open Source Projects

Integers

All integer size maxima and minima are platform-dependent.

Those 32-bit ints on the previous slide? Luxury!

In my day, we had 2-bit ints, and we were glad to get those!

When in doubt, use <limits.h>.

Page 18: Just Enough C For Open Source Projects

Arrays

Array sizes are declared up front.

Arrays cannot change in size.

    int scores[10]; /* Numbered 0-9 */

Page 19: Just Enough C For Open Source Projects

Functions

Take 0 or more arguments.

Return 0 or 1 values

    /* Declaration */
    int square( int n );

    /* Definition */
    int square( int n ) {
        return n*n;
    }

Page 20: Just Enough C For Open Source Projects

Functions

Functions can return void, meaning nothing.

    void greet_person( const char * name ) {
        printf( "Hello, %s\n", name );
    }

Functions can take no arguments with void

    void launch_nukes( void ) {
        /* Implementation details elided. */
        /* Return value not necessary. */
    }

Page 21: Just Enough C For Open Source Projects

Questions?

Page 22: Just Enough C For Open Source Projects

Pointers

Page 23: Just Enough C For Open Source Projects

Pointers

• Address in memory, associated with a type

• Dangerous

• But you can't live without 'em

Page 24: Just Enough C For Open Source Projects

Pointers

Take address of something with &

Dereference the pointer with *

    char answer;
    char * cptr = &answer;
    *cptr = 'x';

Page 25: Just Enough C For Open Source Projects

Pointers

Pass-by-reference with pointers

    void set_defaults( int *x, int *y, char *c ) {
        *x = 42;
        *y = 0;
        *c = ' ';
    }

    int this;
    int that;
    char other;

    set_defaults( &this, &that, &other );

Page 26: Just Enough C For Open Source Projects

Pointer math

You can move pointers forward & back:

    int totals[3] = { 2112, 5150, 90125 };
    int *i = totals; /* or = &totals[0] */

    i = i + 2; /* now points at totals[2] */
    *i = 14; /* Sets totals[2] to 14 */
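Pointer math is how a lot of C code walks an array. A small sketch, assuming a made-up function sum_ints that adds up an array by stepping a pointer from the first element to one past the last:

```c
#include <stddef.h>

/* Hypothetical helper: total the first count ints starting at values. */
long sum_ints( const int *values, size_t count ) {
    const int *end = values + count; /* One past the last element */
    const int *p;
    long total = 0;

    for ( p = values; p != end; p++ ) {
        total += *p; /* p++ moves ahead by sizeof(int) bytes, not 1 byte */
    }
    return total;
}
```

Pointing one past the end is legal as long as you never dereference it.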

Page 27: Just Enough C For Open Source Projects

Strings & structs

Page 28: Just Enough C For Open Source Projects

Strings

Stream of chars, ended with a nul ('\0').

    char buffer[100];
    char *p = buffer;
    p[0] = 'H';
    p[1] = 'e';
    p[2] = 'l';
    ...
    p[11] = 'd';
    p[12] = '!';
    p[13] = '\0';
    puts( p );

    /* Prints "Hello, world!" */

Page 29: Just Enough C For Open Source Projects

Strings

Or you can use standard functions.

    #include <string.h>
    char buffer[100];
    char *p = buffer;

    strcpy( p, "Hello, world!" );

buffer[] contains this:

    |H|e|l|l|o|,| |w|o|r|l|d|!|\0| ... +86 bytes trash

Page 30: Just Enough C For Open Source Projects

Strings

There is no bounds checking.

    char buffer[10];

    strcpy( buffer, "Hello, world!" );
    /* Writes 14 chars into a 10-char buffer */

This is where buffer overflow security advisories come from.
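One common way to stay inside the buffer is to pass its size along and let snprintf truncate. This is a sketch, not the only approach, and copy_string is a made-up name:

```c
#include <stdio.h>

/* Hypothetical helper: copy src into dest, never writing more than
   destsize bytes and always nul-terminating.
   Returns 0 if everything fit, -1 if the copy was truncated. */
int copy_string( char *dest, size_t destsize, const char *src ) {
    int needed;

    if ( destsize == 0 ) {
        return -1; /* No room even for the nul */
    }
    needed = snprintf( dest, destsize, "%s", src );
    return ( needed >= 0 && (size_t)needed < destsize ) ? 0 : -1;
}
```

With the 10-char buffer from the slide, `copy_string( buffer, sizeof(buffer), "Hello, world!" )` keeps the first nine characters plus the nul and reports the truncation instead of trampling memory.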

Page 31: Just Enough C For Open Source Projects

Strings

Declaring a string at compile-time will automagically give you the buffer you need.

    char greeting[] = "Hello, world!";

    printf( "greeting \"%s\", size %d\n", greeting, (int)sizeof(greeting) );

    >> greeting "Hello, world!", size 14

Page 32: Just Enough C For Open Source Projects

Strings

strcat() appends one string to the end of another

    char greeting[100];

    strcpy( greeting, "Hello, " );
    strcat( greeting, "world!" );

Page 33: Just Enough C For Open Source Projects

Strings

strcmp() compares strings and returns a negative number, 0, or a positive number if the first string is less than, equal to, or greater than the second.

    if ( strcmp( str, "monkey" ) == 0 ) {
        handle_monkey();
    }

It is NOT a boolean, so don't pretend it is.

    #define STREQ(x,y) (strcmp((x),(y))==0)
    /* Wrapper macro that is a boolean */
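The exact nonzero value strcmp hands back is up to the library, so only its sign means anything. A small sketch that collapses the result to -1, 0 or 1; the name compare_sign is made up:

```c
#include <string.h>

/* Hypothetical helper: reduce strcmp's result to -1, 0 or 1. */
int compare_sign( const char *a, const char *b ) {
    int result = strcmp( a, b );

    /* Each comparison yields 0 or 1, so the difference is -1, 0 or 1 */
    return ( result > 0 ) - ( result < 0 );
}
```

Note that `if ( strcmp( a, b ) )` is true when the strings differ, which is the opposite of what it looks like.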

Page 34: Just Enough C For Open Source Projects

strlen()

Gives the length of a string

    const char name[] = "Bob";
    size_t len = strlen( name );

    /* len is now 3, although sizeof(name) == 4 */

Page 35: Just Enough C For Open Source Projects

Structs

Structs aggregate vars together

    #define LEN_USERNAME 8

    struct User {
        char username[ LEN_USERNAME + 1 ];
        unsigned int age;
        unsigned int salary;
        int example_originality_rating;
    };

    struct User staff[100];

Page 36: Just Enough C For Open Source Projects

Unions

Unions let storage be of one type or another.

    struct User {
        char type;
        union {
            float hourly_rate;
            long yearly_salary;
        } pay;
    };

    struct User Bob;
    Bob.type = 'H';
    Bob.pay.hourly_rate = 9.50;

    struct User Ted;
    Ted.type = 'S';
    Ted.pay.yearly_salary = 70000L;
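The union itself has no idea which member is live, which is why the struct carries a type field. Here is a sketch of reading it back safely. The function yearly_pay and the 2080 hours-per-year figure (40 hours times 52 weeks) are assumptions for the example.

```c
struct User {
    char type; /* 'H' = hourly, 'S' = salaried */
    union {
        float hourly_rate;
        long yearly_salary;
    } pay;
};

/* Hypothetical helper: yearly pay for either kind of user.
   Assumes 2080 paid hours per year for hourly staff.
   Returns -1.0 for an unknown type. */
double yearly_pay( const struct User *u ) {
    switch ( u->type ) {
    case 'H':
        return u->pay.hourly_rate * 2080.0;
    case 'S':
        return (double)u->pay.yearly_salary;
    default:
        return -1.0; /* Never guess which union member is valid */
    }
}
```

Reading the member you did not write gives you garbage, so always switch on the tag first.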

Page 37: Just Enough C For Open Source Projects

Questions?

Page 38: Just Enough C For Open Source Projects

File I/O

Page 39: Just Enough C For Open Source Projects

File I/O

In Perl, it's easy.

    open( my $fh, '<', $filename ) or die( "Can't open $filename: $!\n" );
    my $str = <$fh>;
    close( $fh );

Doesn't matter how big that first line of the file is, because $str grows as necessary.

Page 40: Just Enough C For Open Source Projects

In C? No such luck.

Page 41: Just Enough C For Open Source Projects

File I/O

    #include <stdio.h>
    #include <stdlib.h>

    FILE *fp;
    fp = fopen( "/path/to/file", "r" );
    if ( fp == NULL ) {
        puts( "Unable to open file" );
        /* Print an error based on errno */
        exit(1);
    }

    char buffer[100];
    char *p;
    p = fgets( buffer, sizeof(buffer), fp );
    if ( p == NULL ) {
        if ( ferror( fp ) ) {
            /* put out an error based on errno */
        }
        else {
            puts( "I'm at EOF" );
        }
    }
    fclose( fp );
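Reading a whole file is the same fgets call in a loop. A line longer than the buffer just arrives in pieces. A minimal sketch with a made-up function count_lines, which counts newline-terminated lines on an already-open FILE:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count newline-terminated lines in fp.
   Returns the count, or -1 on a read error.
   A final line with no trailing newline is not counted. */
long count_lines( FILE *fp ) {
    char buffer[100];
    long lines = 0;

    while ( fgets( buffer, sizeof(buffer), fp ) != NULL ) {
        /* A long line comes back in several chunks; only the
           last chunk contains the newline. */
        if ( strchr( buffer, '\n' ) != NULL ) {
            lines++;
        }
    }
    if ( ferror( fp ) ) {
        return -1;
    }
    return lines;
}
```

The caller still owns the fopen, the NULL check and the fclose.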

Page 42: Just Enough C For Open Source Projects

This is why I fell in love with Perl*

Page 43: Just Enough C For Open Source Projects

Questions?

Praise for modern languages?

Page 44: Just Enough C For Open Source Projects

Macros and the preprocessor

Page 45: Just Enough C For Open Source Projects

Macros

Macros get handled by the preprocessor.

    #define MAX_USERS 100
    int scores[MAX_USERS];
    for ( int i = 0; i < MAX_USERS; i++ ) {
        ...
    }

This expands before compilation.

    int scores[100];
    for ( int i = 0; i < 100; i++ ) {
        ...
    }

Page 46: Just Enough C For Open Source Projects

Macros

Macros can take arguments that are replaced by the preprocessor.

    #define MAX_USERS 100
    #define BYTES_NEEDED(n,type) n * sizeof(type)
    int *scores = malloc( BYTES_NEEDED( MAX_USERS, int ) );

becomes

    int *scores = malloc( 100 * sizeof(int) );

Page 47: Just Enough C For Open Source Projects

Macro safety

Always wrap your macro arguments in parens, because the preprocessor pastes in text with no regard for order of operations.

    #define BYTES_NEEDED(n,type) n * sizeof(type)
    const int bytes = BYTES_NEEDED(nusers+1,int);

becomes

    const int bytes = nusers+1 * sizeof(int);
    /* Evals as nusers + (1*sizeof(int)) */

Page 48: Just Enough C For Open Source Projects

Macro safety

Instead, define it as:

    #define BYTES_NEEDED(n,type) ((n)*sizeof(type))

so it expands as

    const int bytes = ((nusers+1) * sizeof(int));
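You can watch the two definitions disagree by putting them side by side. In this sketch the macro name BYTES_NEEDED_UNSAFE and the two wrapper functions are made up so both versions can live in one file:

```c
#include <stddef.h>

/* The unparenthesized version, renamed so both can coexist */
#define BYTES_NEEDED_UNSAFE(n,type) n * sizeof(type)
/* The parenthesized version */
#define BYTES_NEEDED(n,type) ((n) * sizeof(type))

/* Expands to nusers+1 * sizeof(int), i.e. nusers + sizeof(int) */
size_t unsafe_bytes( int nusers ) {
    return BYTES_NEEDED_UNSAFE( nusers+1, int );
}

/* Expands to ((nusers+1) * sizeof(int)) */
size_t safe_bytes( int nusers ) {
    return BYTES_NEEDED( nusers+1, int );
}
```

With nusers at 9, the unsafe version asks for 9 plus one int's worth of bytes, while the safe one asks for room for 10 ints. That gap is exactly the size of the buffer overrun you were about to write.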

Page 49: Just Enough C For Open Source Projects

Macros

gcc -E will preprocess to stdout, so you can see exactly what the effects of your macros are.

Most compilers these days will inline simple functions, so don’t use macros instead of functions in the name of “efficiency.”

Page 50: Just Enough C For Open Source Projects

Conditional compilation

Macros let you support multiple platforms in the same block of code.

    #ifdef WIN32
    /* Compile some Win32-specific code. */
    #endif

Page 51: Just Enough C For Open Source Projects

Conditional compilation

Macros let you compile in debug code or leave it out.

    int launch_missiles( void ) {
    #ifdef DEBUG
        log( "Entering launch_missiles" );
    #endif

        /* Implementation details elided */

    #ifdef DEBUG
        log( "Leaving launch_missiles" );
    #endif
    }

Page 52: Just Enough C For Open Source Projects

Conditional compilation

You can also use the value of those macros.

    int launch_missiles( void ) {
    #if DEBUG_LEVEL > 2
        log( "Entering launch_missiles" );
    #endif

        /* Implementation details elided */

    #if DEBUG_LEVEL > 2
        log( "Leaving launch_missiles" );
    #endif
    }

Page 53: Just Enough C For Open Source Projects

Open source projects use conditional compilation a LOT.

Page 54: Just Enough C For Open Source Projects

Questions?

Page 55: Just Enough C For Open Source Projects

Memory allocation

Page 56: Just Enough C For Open Source Projects

Memory allocation

• Perl, Ruby, PHP, Python, most any dynamic language hides all this from you.

• Soon you’ll see why.

• It's a pain to deal with.

• It's dangerous.

• It's necessary.

Page 57: Just Enough C For Open Source Projects

malloc() & free()

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    const char name[] = "Bob in Marketing";

    int main( int argc, char *argv[] ) {
        char *message = malloc( 100 );

        if ( message == NULL ) {
            puts( "Failed to allocate memory" );
            exit(1);
        }
        strcpy( message, "Hello, " );
        strcat( message, name );
        strcat( message, "!" );
        puts( message );
        free( message );

        return 0;
    }

Page 58: Just Enough C For Open Source Projects

sizeof()

Get the size of a type with sizeof

    int *scores = malloc( 100 * sizeof( int ) );

sizeof is an operator.

sizeof happens at compile time. Sorry, no run-time dynamic type sizing.

Page 59: Just Enough C For Open Source Projects

memset()

Sets a range of memory to a given value.

Definition:

    void *memset( void *p, int ch, size_t nbytes );

Use:

    memset( scores, 0, 100 * sizeof( int ) );

Page 60: Just Enough C For Open Source Projects

memcpy()

Copies range of memory to another place.

Definition:

    void *memcpy( void *targ, const void *source, size_t nbytes );

Use:

    memcpy( scores, original_scores, 100 * sizeof(int) );

If the ranges of memory overlap, you have to use memmove.
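Overlap comes up whenever you shuffle data around inside a single array. A sketch, with a made-up function remove_first that drops the first element by sliding the rest down one slot:

```c
#include <string.h>

/* Hypothetical helper: shift values[1..count-1] down to values[0..count-2].
   Source and destination overlap, so this must be memmove, not memcpy. */
void remove_first( int *values, size_t count ) {
    if ( count > 1 ) {
        memmove( values, values + 1, ( count - 1 ) * sizeof(int) );
    }
}
```

The last slot keeps its old value afterward, so the caller is expected to track that the array is now one element shorter.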

Page 61: Just Enough C For Open Source Projects

realloc()

realloc resizes the buffer you previously malloced, or gives you a new one of the new size.

    int bufsize = users_allocated * sizeof(int);
    int *scores = malloc( bufsize );
    ...
    nusers++;
    if ( nusers > users_allocated ) {
        users_allocated += (users_allocated/2);
        bufsize = users_allocated * sizeof(int);
        scores = realloc( scores, bufsize );
    }

You may not get the same block of memory back, so other pointers that pointed into the buffer are now invalidated.
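One more trap: if realloc fails it returns NULL and leaves the old buffer alone, so assigning the result straight back to scores would lose your only pointer to it. A sketch of the usual workaround, using a temporary; grow_scores is a made-up name:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize *scores to hold new_count ints.
   Returns 0 on success. On failure returns -1 and *scores is
   untouched and still valid. */
int grow_scores( int **scores, size_t new_count ) {
    int *bigger;

    if ( new_count == 0 || new_count > SIZE_MAX / sizeof(int) ) {
        return -1; /* Refuse empty requests and size overflow */
    }
    bigger = realloc( *scores, new_count * sizeof(int) );
    if ( bigger == NULL ) {
        return -1; /* Old buffer is still there; nothing leaked */
    }
    *scores = bigger; /* May or may not be the same address as before */
    return 0;
}
```

Call it as `grow_scores( &scores, users_allocated )`, and recompute any other pointers into the buffer afterward.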

Page 62: Just Enough C For Open Source Projects

Memory catastrophes

Page 63: Just Enough C For Open Source Projects

These are why we have program crashes and security advisories.

Page 64: Just Enough C For Open Source Projects

Memory catastrophes

Returning a pointer to a local variable.

We do this all the time in Perl, for example.

    sub name_ref {
        my $name = 'Bob';
        return \$name;
    }

    my $ref = name_ref();
    print ${$ref};

Perl has reference counting to keep track of when areas of memory are no longer used and can be freed.

Page 65: Just Enough C For Open Source Projects

Memory catastrophes

When you exit a function in C, you lose the rights to what's on the stack.

    char *name( void ) {
        char temp[4];
        int n;
        strcpy( temp, "Bob" );
        return temp;
    }

    char *who = name();
    do_something();
    puts( who );

Let's see how this works.

Page 66: Just Enough C For Open Source Projects

Memory catastrophes

Before calling name()

Top of stack

n (4 bytes)

temp[0]

temp[1]

temp[2]

temp[3]

Return address

Page 67: Just Enough C For Open Source Projects

Memory catastrophes

Just called name()

Top of stack

n (4 bytes)

temp[0]

temp[1]

temp[2]

temp[3]

Return address

Page 68: Just Enough C For Open Source Projects

Memory catastrophes

Returning from name()

Top of stack

n (4 bytes)

temp[0] <<<

temp[1]

temp[2]

temp[3]

Return address

Page 69: Just Enough C For Open Source Projects

Memory catastrophes

Returned from name()

who

Top of stack

Page 70: Just Enough C For Open Source Projects

Memory catastrophes

Called do_something()

???

???

??? who

???

???

???

Return address

Page 71: Just Enough C For Open Source Projects

Memory catastrophes

Printing who

???

???

??? who

???

???

???

Top of stack
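One way out, sketched here as an assumption rather than the only fix, is to put the string on the heap so it outlives the function. The price is that the caller now has to free it.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd copy of the name, or NULL if allocation fails.
   The caller owns the result and must free() it. */
char *name( void ) {
    const char temp[] = "Bob";
    char *copy = malloc( sizeof(temp) ); /* 4 bytes: 'B','o','b','\0' */

    if ( copy != NULL ) {
        memcpy( copy, temp, sizeof(temp) );
    }
    return copy;
}
```

Usage would look like `char *who = name(); if ( who != NULL ) { puts( who ); free( who ); }`. Nothing do_something() puts on the stack can clobber it now.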

Page 72: Just Enough C For Open Source Projects

Memory catastrophes

If you dereference NULL, you crash.

    char *p = NULL;
    *p = 'x';

Page 73: Just Enough C For Open Source Projects

Memory catastrophes

If you dereference a random value:

    char *p;
    *p = 'x';

Page 74: Just Enough C For Open Source Projects

Memory catastrophes

If you free something you didn't malloc:

    char name[] = "Bob";
    char *p = name;
    free(p);

Page 75: Just Enough C For Open Source Projects

Memory catastrophes

If you free memory, and use it again, you crash, or corrupt memory, or open yourself to a security hole.

    char *p = malloc( 100 );
    ...
    free(p);
    *p = 'x';

Page 76: Just Enough C For Open Source Projects

Memory catastrophes

If you use more memory than you allocated, you crash, or corrupt memory, or open yourself to a security hole.

    char *p = malloc( 10 );
    strcpy( p, "Hello, world!" );

Page 77: Just Enough C For Open Source Projects

Memory catastrophes

If you use more memory than you allocated, you crash, or corrupt memory, or open yourself to a security hole.

    char *p = malloc( 10 );
    strcpy( p, "Hello, world!" );

or

    char *p = malloc(10);
    p[10] = 'x'; /* Off-by-one, a fencepost error */

Page 78: Just Enough C For Open Source Projects

Memory catastrophes

If you allocate memory and don't free it, you have a memory leak.

    void do_something( void ) {
        char *p = malloc(100);
        /* do some stuff */
        return;
    }

p is never freed, and we'll never have that pointer to that buffer again to free it in the future.

Page 79: Just Enough C For Open Source Projects

Questions?

Groans of fear?

Page 80: Just Enough C For Open Source Projects

Advanced pointers

Page 81: Just Enough C For Open Source Projects

Advanced pointers

Install cdecl. It is invaluable. It also is rarely packaged, so you'll have to build from source.

    uniqua:~ : cdecl
    Type `help' or `?' for help
    cdecl> explain char *p
    declare p as pointer to char
    cdecl> explain int x[34]
    declare x as array 34 of int

Page 82: Just Enough C For Open Source Projects

Advanced pointers

void * throws away your type checking.

    char name[100];
    char *p = name;
    int *n = p;

    void.c:5: warning: initialization from incompatible pointer type

But this causes no warnings:

    char name[100];
    char *p = name;
    void *v = p;
    int *n = v;

Page 83: Just Enough C For Open Source Projects

Double pointers

You can have a pointer to a pointer, so you can modify your pointers.

    char *p = "Hello, world.";
    repoint_pointer( &p );

    void repoint_pointer( char **handle ) {
        /* Repoint if the current string is "Hello, world." */
        if ( strcmp( *handle, "Hello, world." ) == 0 ) {
            *handle = "Goodbye, world.";
        }
    }
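Here is the same idea as a compilable sketch. The one liberty taken is declaring the strings as const char *, since both point at string literals that must not be written to.

```c
#include <string.h>

/* Takes the address of the caller's pointer, so it can change
   where that pointer points. */
void repoint_pointer( const char **handle ) {
    if ( strcmp( *handle, "Hello, world." ) == 0 ) {
        *handle = "Goodbye, world."; /* Changes the caller's pointer */
    }
}
```

The call site passes the address of the pointer: `const char *p = "Hello, world."; repoint_pointer( &p );`. Passing plain p would hand over a copy of the pointer, and the caller would never see the change.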

Page 84: Just Enough C For Open Source Projects

Function pointers

You can point to functions.

    cdecl> explain void (*fn_ptr)(void)
    declare fn_ptr as pointer to function (void) returning void

Here's how it's used

    void (*fn_ptr)(void) = &some_action;
    if ( x == 1 ) {
        fn_ptr = &some_action;
    }
    fn_ptr();

Page 85: Just Enough C For Open Source Projects

Function pointers

You can point to any kind of function.

    cdecl> explain char * (*fn_ptr)(int, char **)
    declare fn_ptr as pointer to function (int, pointer to pointer to char) returning pointer to char

This is how you can do dispatch tables.

Use typedef to create names for these types.

There is no shame in using cdecl.
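A dispatch table is just an array that pairs names with function pointers, and a typedef keeps the declarations readable. This is a minimal sketch; binary_op, op_table, find_op and the two operations are all made-up names.

```c
#include <stddef.h>
#include <string.h>

/* typedef so we never have to write int (*)(int, int) again */
typedef int (*binary_op)( int, int );

static int add_op( int a, int b ) { return a + b; }
static int mul_op( int a, int b ) { return a * b; }

struct op_entry {
    const char *name;
    binary_op fn;
};

/* The dispatch table: look up by name, get a function back */
static const struct op_entry op_table[] = {
    { "add", add_op },
    { "mul", mul_op },
};

/* Returns the function registered under name, or NULL if none. */
binary_op find_op( const char *name ) {
    size_t i;

    for ( i = 0; i < sizeof(op_table) / sizeof(op_table[0]); i++ ) {
        if ( strcmp( op_table[i].name, name ) == 0 ) {
            return op_table[i].fn;
        }
    }
    return NULL;
}
```

Usage: `binary_op op = find_op( "mul" ); if ( op != NULL ) { result = op( 6, 7 ); }`. Adding a new operation means adding one row to the table, not another branch to an if/else chain.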

Page 86: Just Enough C For Open Source Projects

Questions?

Page 87: Just Enough C For Open Source Projects

const

Page 88: Just Enough C For Open Source Projects

const

The const qualifier lets the compiler know you don't want to modify a value.

    const int bufsize = NUSERS * sizeof(int);

Trying to modify bufsize is an error.

    bufsize++; /* error */

Page 89: Just Enough C For Open Source Projects

const

Literal strings should be thought of as const.

    const char username[] = "Bingo";
    username[0] = 'R'; /* Error */

If your compiler has a switch to make this automatic, use it.

Page 90: Just Enough C For Open Source Projects

const

const your pointers in function arguments to tell the compiler that you promise not to touch the contents.

    size_t strlen( const char *str );
    int strcmp( const char *a, const char *b );

It would be tragic if strlen() modified what was passed into it, no?

Page 91: Just Enough C For Open Source Projects

const

const also lets the compiler or other tools know that your function does not initialize the contents of what's passed in.

    size_t strlen( const char *str );
    int mystery_func( char *str );

    char *p;
    n = strlen(p); /* "uninitialized value" error */
    n = mystery_func(p); /* not an error */

The compiler has to assume that mystery_func is going to fill str.

Page 92: Just Enough C For Open Source Projects

Questions?

Page 93: Just Enough C For Open Source Projects

Multiple source files

Page 94: Just Enough C For Open Source Projects

Header files

    /* In handy.h */
    int square( int n );

    /* In handy.c */
    int square( int n ) {
        return n*n;
    }

    /* In hello.c */
    #include "handy.h" /* for square() */
    #include <stdio.h> /* for printf() */

    int main( int argc, char *argv[] ) {
        printf( "Hello, world, 12^2 = %d\n", square( 12 ) );

        return 0;
    }

Page 95: Just Enough C For Open Source Projects

Standard header files

    #include <stdio.h>  /* Standard I/O functions */

    #include <stdlib.h> /* Catch-all useful: malloc(), rand(), etc */

    #include <time.h>   /* Time-handling structs & functions */

    #include <string.h> /* String handling */

    #include <math.h>   /* Math functions of all kinds */

Page 96: Just Enough C For Open Source Projects

Package header files

Look in /usr/include or /usr/local/include

    #include <db.h>            /* Berkeley DB */

    #include <sqlite3.h>       /* SQLite3 */

    #include <ldap.h>          /* LDAP */

    #include <apache2/httpd.h> /* Apache 2's main header file */

Page 97: Just Enough C For Open Source Projects

ack

ack is grep for big codebases.

It searches recursively from current directory by default.

It lets you specify filetypes.

http://petdance.com/ack/

Page 98: Just Enough C For Open Source Projects

Tags

Tags files are prebuilt indexes of the symbols in your project.

    ctags \
        --links=no --totals \
        -R --exclude=blib --exclude=.svn \
        --exclude=res_lea.c \
        --languages=c,perl --langmap=c:+.h,c:+.pmc,c:+.ops \
        .
    1371 files, 649213 lines (19150 kB) scanned in 2.2 seconds (8665 kB/s)
    33566 tags added to tag file
    33566 tags sorted in 0.00 seconds

Page 99: Just Enough C For Open Source Projects

Tags

Each line of the tags file looks like this:

    find_builtin
    src/builtin.c
    /^find_builtin(ARGIN(const char *func))$/;"
    f
    file:

But run together with tabs.

Tells the editor how to find the symbol.

Page 100: Just Enough C For Open Source Projects

Tags

Jump to a tag from the command-line:

    $ vim -t find_builtin

Jump to a tag from inside vim:

    :tag find_builtin

Jump to the symbol under the cursor, or jump back to previous position

    Ctrl-], Ctrl-t

Page 101: Just Enough C For Open Source Projects

Tags

These have been vim tag files.

Emacs supports tags as well.

Exuberant ctags generates tags in both forms.

http://ctags.sf.net/ (I think)

Page 102: Just Enough C For Open Source Projects

Questions?

Page 103: Just Enough C For Open Source Projects

Slides will be at http://petdance.com

Page 104: Just Enough C For Open Source Projects

Topics omitted

• File I/O

• Macro side effects

• warnings vs. errors

• debugging

• profiling

• Working on large projects

• lint/splint

• valgrind