Introduction to C
(Reek, Chs. 1-2)
1 CS 3090: Safety Critical Programming in C
C: History
Developed in the 1970s – in conjunction with
development of UNIX operating system
When writing an OS kernel, efficiency is crucial
This requires low-level access to the underlying
hardware:
e.g. programmer can leverage knowledge of how data is laid out
in memory, to enable faster data access
UNIX originally written in low-level assembly language –
but there were problems:
No structured programming (e.g. encapsulating routines as
“functions”, “methods”, etc.) – code hard to maintain
Code worked only for particular hardware – not portable
C: Characteristics
C takes a middle path between low-level assembly
language…
Direct access to memory layout through pointer
manipulation
Concise syntax, small set of keywords
… and a high-level programming language like Java:
Block structure
Some encapsulation of code, via functions
Type checking (pretty weak)
C: Dangers
C is not object oriented!
Can’t “hide” data as “private” or “protected” fields
You can follow standards to write C code that looks object-oriented, but you have to be disciplined – will the other people working on your code also be disciplined?
C has portability issues
Low-level “tricks” may make your C code run well on one platform – but the tricks might not work elsewhere
The compiler and runtime system will rarely stop your C program from doing stupid/bad things
Compile-time type checking is weak
No run-time checks for array bounds errors, etc. like in Java
Separate compilation
A C program consists of source code in one or more files
Each source file is run through the preprocessor and compiler, resulting in a file containing object code
Object files are tied together by the linker to form a single executable program
file1.c (source code) -> Preprocessor/Compiler -> file1.o (object code)
file2.c (source code) -> Preprocessor/Compiler -> file2.o (object code)
file1.o + file2.o + Libraries -> Linker -> a.out (executable code)
Separate compilation
Advantage: Quicker compilation
When modifying a program, a programmer typically edits
only a few source code files at a time.
With separate compilation, only the files that have been
edited since the last compilation need to be recompiled
when re-building the program.
For very large programs, this can save a lot of time.
How to compile (UNIX)
To compile and link a C program that is contained entirely in one source file:
cc program.c
The executable program is called a.out by default.
If you don’t like this name, choose another using the -o option:
cc program.c -o exciting_executable
To compile and link several C source files:
cc main.c extra.c more.c
This will produce object (.o) files that you can use in a later compilation:
cc main.o extra.o more.c
Here, only more.c will be compiled; the main.o and extra.o files will be used for linking.
To produce object files, without linking, use -c:
cc -c main.c extra.c more.c
The preprocessor
The preprocessor takes your source code and – following certain directives that you give it – tweaks it in various ways before compilation.
A directive is given as a line of source code starting with the # symbol
The preprocessor works in a very crude, “word-processor” way, simply cutting and pasting –
it doesn’t really know anything about C!
Your source code -> Preprocessor -> Enhanced and obfuscated source code -> Compiler -> Object code
A first program: Text rearranger
Input
First line: pairs of nonnegative integers, separated by
whitespace, then terminated by a negative integer
x1 y1 x2 y2 … xn yn -1
Each subsequent line: a string of characters
Output
For each string S, output substrings of S:
First, the substring starting at location x1 and ending at y1;
Next, the substring starting at location x2 and ending at y2;
…
Finally, the substring starting at location xn and ending at yn.
Sample input/output
Initial input: 0 2 5 7 10 12 -1
Next input line: deep C diving
Output: deeC ding
Next input line: excitement!
Output: exceme!
… continue ad nauseam…
Terminate with ctrl-D (signals end of keyboard input)
Use of comments
/*
** This program reads input lines from the standard input and prints
** each input line, followed by just some portions of the lines, to
** the standard output.
**
** The first input is a list of column numbers, which ends with a
** negative number. The column numbers are paired and specify
** ranges of columns from the input line that are to be printed.
** For example, 0 3 10 12 -1 indicates that only columns 0 through 3
** and columns 10 through 12 will be printed.
*/
In C89, the dialect used here, only /* … */ is available for comments – there is no // as in Java or C++ (C99 and later do accept //)
Comments on comments
Can’t nest comments within comments: /* is matched with the very next */ that comes along
Don’t use /* … */ to comment out code – it won’t work if the commented-out code contains comments
/* Comment out the following code
int f(int x) {
return x+42; /* return the result */
}
*/
Anyway, commenting out code is confusing, and dangerous (easy to forget about) – avoid it
In the example above, only the text from the first /* through the */ on the return line is commented out; the closing } and the final */ are not!
Preprocessor directives
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
The #include directives “paste” the contents of the files stdio.h, stdlib.h and string.h into your source code, at the very place where the directives appear.
These files contain information about some library functions used in the program:
stdio stands for “standard I/O”, stdlib stands for “standard library”, and string.h includes useful string manipulation functions.
Want to see the files? Look in /usr/include
Preprocessor directives
#define MAX_COLS 20
#define MAX_INPUT 1000
The #define directives perform “global replacements”:
every instance of MAX_COLS in the code that follows is replaced with 20, and every
instance of MAX_INPUT is replaced with 1000 (text inside string literals and comments is left alone).
Function prototypes
int read_column_numbers( int columns[], int max );
void rearrange( char *output, char const *input,
int n_columns, int const columns[] );
These look like function definitions – they have the
name and all the type information – but each ends
abruptly with a semicolon. Where’s the body of the
function – what does it actually do?
(Note that each function does have a real definition,
later in the program.)
Function prototypes
Q: Why are these needed, if the functions are defined later in the program anyway?
A: C programs are typically arranged in “top-down” order, so functions are used (called) before they’re defined. (Note that the function main() includes a call to read_column_numbers().)
When the compiler sees a call to read_column_numbers(), it must check whether the call is valid (the right number and types of parameters, and the right return type).
But it hasn’t seen the definition of read_column_numbers() yet!
The prototype gives the compiler advance information about the function that’s being called. Of course, the prototype and the later function definition must
match in terms of type information.
The main() function
main() is always the first function called in a program execution.
int
main( void )
{ …
void indicates that the function takes no arguments
int indicates that the function returns an integer value
Q: Integer value? Isn’t the program just printing out some stuff and then exiting? What’s there to return?
A: Through returning particular values, the program can indicate whether it terminated “nicely” or badly; the operating system can react accordingly.
The printf() function
printf( "Original input : %s\n", input );
printf() is a library function declared in <stdio.h>
Syntax: printf( FormatString, Expr, Expr...)
FormatString: String of text to print
Exprs: Values to print
FormatString has placeholders to show where to put the
values (note: the number of placeholders should match the number of Exprs)
Placeholders: %s (print as string), %c (print as char),
%d (print as integer),
%f (print as floating-point)
\n indicates a newline character
Make sure you pick the right placeholder for each value’s type!
Output to a terminal is usually line-buffered: a text line is typically printed only when \n is encountered
Don’t forget \n when printing “final results”
return vs. exit
Let’s look at the return statement in main():
return EXIT_SUCCESS;
EXIT_SUCCESS is a constant defined in <stdlib.h>; returning this value signifies successful termination.
Contrast this with the call to exit() in the function read_column_numbers():
puts( "Last column number is not paired." );
exit( EXIT_FAILURE );
EXIT_FAILURE is another constant, signifying that something bad happened requiring termination.
exit() differs from return in that execution terminates immediately – control is not passed back to the calling function main().
Pointers, arrays, strings
In this program, the notions of string, array, and
pointer seem to be somewhat interchangeable:
In main(), an array of characters is declared, for
purposes of holding the input string:
char input[MAX_INPUT];
Yet when it’s passed in as an argument to the
rearrange() function, input has morphed into a pointer
to a character (char *):
void
rearrange( char *output, char const *input,…
Pointers, arrays, strings
In C, the three concepts are indeed closely related:
A pointer is simply a memory address. The type char *
“pointer to character” signifies that the data at the
pointer’s address is to be interpreted as a character.
An array is closely tied to a pointer: in most expressions, the array’s name is
converted to a pointer to its first element.
The data items of the array are stored sequentially in memory, starting at that address.
How do you get to the other array elements? By incrementing
the pointer value.
A string is simply an array of characters ending with a NUL character – unlike Java,
which has a predefined String class.
String layout and access
Memory layout (figure): input (char *) points to the first of eight consecutive chars: p, o, i, n, t, e, r, NUL
What is input?
It’s a string!
It’s a pointer to char!
It’s an array of char!
How do we get to the “n”?
Follow the input pointer, then hop 3 to the right:
*(input + 3) - or - input[3]
NUL is a special value indicating end-of-string