
Unix Primer

Maxime Augier

03.04.2004

1 Before we start...

You can get information about any external (i.e. shell-independent) command by consulting the UNIX manual pages:

$> man command

For more information about the manual system itself, do:

$> man man

If you are looking for a keyword blah in a man page, after launching the man program, type

/blah<enter>

- To find the next occurrence of the word you are looking for, use n.
- To quit the man page, simply hit q.

2 Filesystem exploration

2.1 Filesystem structure & namespace

The filesystem is an abstraction used by UNIX to access information storage devices (hard drives, floppies, CD-ROMs, network storage). The filesystem is organised in directories that form a tree. Each directory is a node of this tree. Files, and other objects like Unix sockets, FIFOs, and devices, are all leaves of this tree.

Each node in the tree can be uniquely identified by the sequence of parent nodes to traverse before reaching the object. This is called a path. A path is the sequence of the node (directory) names, separated by slashes. (For example: an/example/of/path)

The root of the filesystem tree is a special directory called / (slash). Thus, paths starting with a / are considered absolute (they start at the root of the filesystem). On the other hand, paths not starting with a / are understood as relative to the current directory.

Every user gets a personal directory. Traditionally it is located in /home/username. There is also a shortcut for home directories: a tilde (~) designates the home directory of the current user; a tilde followed by a username (~alice) designates the home directory of that user.

Programs are stored in /bin, /sbin, /usr/bin, /usr/sbin, /usr/local/bin and /usr/local/sbin.

All the system-wide configuration files are located in /etc.

Temporary files can be created in /tmp. Files you create there do not count against your quota, which makes it a convenient place to work. However, the contents of /tmp will eventually be deleted, typically upon reboot.

Hidden files or directories start with a dot (.).

2.2 Navigating

When you work in your shell, you always have a current directory.

2.2.1 cd — moving around

To change your current directory, do:

$> cd /my/new/directory

To go back to your home directory, do:

$> cd

To go up one level in the directory tree, do:

$> cd ..

2.2.2 pwd — knowing where you are

At any moment, you can use the pwd command (Print Working Directory) to display your current directory:

$> pwd
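The following sketch shows how cd and pwd interact with absolute and relative paths. The scratch directory created with mktemp and the names projects and notes are made up for the illustration; they are not part of the original text.

```shell
# Scratch directory so nothing in your real home directory is touched.
# (projects/notes are hypothetical names.)
base=$(mktemp -d)
mkdir -p "$base/projects/notes"

cd "$base/projects"      # absolute path: starts with /
cd notes                 # relative path: resolved from the current directory
here=$(pwd)
echo "$here"             # ends with /projects/notes

cd ..                    # go up one level
up=$(pwd)
echo "$up"               # ends with /projects

cd /                     # leave the scratch directory before removing it
rm -r "$base"
```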

2.2.3 ls — examining directories

Use the command ls to list the contents of a directory. The syntax is

$> ls directory

If directory is omitted, ls assumes it is "." (the current directory).

Most common options are:

-a : to include hidden files

-l : to get a long listing (including file sizes, dates and permissions)

-d : to list the directory itself (as it appears in its parent) instead of its contents (useful when you do "ls -ld *" for instance).


2.3 Organising directories

2.3.1 mkdir - creating new directories

$> mkdir directory

2.3.2 rmdir - deleting empty directories

$> rmdir directory

Note: the directory must be empty. To delete a non-empty directory, use rm with the recursive option (-r or -R).

2.4 Managing files

Preliminary warning: Unlike other systems, Unix is very permissive regarding file names. In fact, you can use almost any printable or non-printable character, including line-feed, backspace, tab and so on. However, many characters have a special meaning to the shell, leading to unpredictable results. A strangely named file can become undeletable from the shell, mangle the directory listing display or even be interpreted as an argument to a command, changing the behavior of the command.

Conclusion: When choosing a name for a file, try to use only alphanumerics and the following characters: - . and do NOT start a file name with a dash (-).

2.4.1 touch — creating empty file

touch won't erase anything even if the file already exists, so it's a safe way of creating an empty file.

$> touch filename

2.4.2 cp — copying files

Two possible usages:

$> cp source destination

to copy a source file into a destination file, and

$> cp source1 source2 source3 destination/

to copy several files into a destination directory, keeping the original filenames.

2.4.3 rm — removing files

To delete file(s):

$> rm file1 file2 file3 ...


To delete a file whose name starts with - (for example -toto):

$> rm -- -toto

To remove a directory together with its entire contents:

$> rm -r directory/

2.4.4 mv — moving and renaming files

To move a file:

$> mv file /destination/directory

To rename a file:

$> mv oldname newname

Note: with the option "-i", you are asked for confirmation every time a file is deleted or overwritten. It is a good idea to make this the default, e.g. by defining appropriate aliases. For example, with bash:

$> alias rm="rm -i"

If you do that, you can temporarily negate the -i flag with a -f (force) flag.

2.4.5 chown, chgrp, chmod — managing permissions

You can use these tools to manage access control for your files. You can change the owner of one or more files with chown:

$> chown owner file1 file2 file3...

You can use chgrp to change the file(s) group:

$> chgrp group file1 file2 file3...

You can use chmod to change the access mode of a file(s):

$> chmod mode file1 file2 file3...

mode being of the form <who><action><right>, where:

who can be one of u(ser), g(roup), o(thers) or a(ll)

action can be one of + (allow), - (deny), = (allow only this)

right can be one of r(ead), w(rite) or x(ecute) (and some others, too; look at the man page)

For a file, the rights meanings are straightforward:

r: you can read the file

w: you can write to the file (and therefore overwrite or empty it; removing it from its directory is governed by the directory's rights)

x: you can execute the file

For a directory, the meanings are more complex:

r: you can list the directory contents

w: you can add new files in the directory, and delete existing ones (technically, that means you can modify all the files you can read, and at least delete all the files you can't).

x: you can traverse the directory (meaning you can access the files in it)

Some common modes you can use:

go=   to deny access to everybody but yourself
a+x   to make a script executable for everybody
go=x  to make a secret directory (other people can access the contents only if they know the exact file names)
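As a rough illustration of the symbolic modes above, the sketch below changes the rights on a throwaway file and prints the permission column of ls -l after each step. The file name hello.sh and the scratch directory are hypothetical, and joining two clauses with a comma (u=rw,go=) is standard chmod syntax that the text above does not cover.

```shell
dir=$(mktemp -d)
script="$dir/hello.sh"                       # hypothetical script name
printf '#!/bin/sh\necho hello\n' > "$script"

chmod u=rw,go= "$script"                     # owner may read and write; group and others get nothing
private=$(ls -l "$script" | cut -c1-10)      # keep only the permission column
echo "$private"                              # -rw-------

chmod a+x "$script"                          # make it executable for everybody
runnable=$(ls -l "$script" | cut -c1-10)
echo "$runnable"                             # -rwx--x--x

rm -r "$dir"
```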

2.5 Archiving files

Some widely used archiving formats (.zip, .rar) include both file collating and compression. Under UNIX, those two tasks are handled by separate utilities.

2.5.1 tar — packing many files in one

To pack files into a "tarball" (archive), do:

$> tar -cf new_archive.tar file1 file2 file3 ...

To unpack a tarball, do:

$> tar -xf archive.tar

Warning: don't use absolute filenames (those starting with a /) when creating a tarball, otherwise tar may attempt to put them at the same absolute location when extracting, which can lead to problems if you take files from one system to another.

2.5.2 gzip - compressing files

To compress a file, do:

$> gzip filename

This will replace the file with a compressed version named filename.gz.

To decompress it, do:

$> gunzip filename.gz

The recommended approach is to first use tar to collate many files, then use gzip to compress the resulting tarball.
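A minimal sketch of that two-step approach, run in a scratch directory. The directory name project and the file contents are invented for the example.

```shell
work=$(mktemp -d)
cd "$work"
mkdir project                         # hypothetical directory to archive
echo "first"  > project/a.txt
echo "second" > project/b.txt

tar -cf project.tar project           # relative path, as the warning above recommends
gzip project.tar                      # replaces project.tar with project.tar.gz

rm -r project                         # pretend we moved to another machine
gunzip project.tar.gz                 # back to project.tar
tar -xf project.tar                   # recreates project/ with both files
restored=$(cat project/a.txt project/b.txt)
echo "$restored"

cd /
rm -r "$work"
```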

2.6 Working with text files

2.6.1 more & less — reading files quickly

To read a text file, use


$> more filename

You can advance in the text by hitting space or enter. You can go back by hitting "b". You can search for a keyword, or even a whole regular expression (see below), by hitting / (slash), typing in the keyword or regexp, and hitting enter. After a search, you can hit "n" to go to the next match.

2.6.2 cat — quickly creating text files

When you want to quickly create short text files, you don't have to open a heavy editor. Instead, type

$> cat > filename

then type the contents of the file. When you're finished, go to a new line and hit Ctrl-D.

2.7 Finding Files

2.7.1 find

find looks recursively for files matching some criteria in a given directory. The syntax is

$> find directory (criteria...)

To look in the current directory for regular files with names ending in .java, do

$> find . -type f -name "*.java"

To get a list of all the files and directories under the current directory (that is, without applying any criterion), do

$> find .

Warning: You have to quote the wildcards, otherwise the shell will expand them, which is not what you want (see below about shell wildcards).
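To see why the quoting matters, the sketch below runs the same search with and without quotes in a scratch tree. All file and directory names are made up for the illustration.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p src/util
touch Main.java src/App.java src/util/Helper.java src/notes.txt

# Quoted: find receives the pattern itself and applies it at every level.
quoted=$(find . -type f -name "*.java" | LC_ALL=C sort)
echo "$quoted"

# Unquoted: the shell first expands *.java to Main.java (the only match in
# the current directory), so find only looks for files named Main.java.
unquoted=$(find . -type f -name *.java | LC_ALL=C sort)
echo "$unquoted"

cd /
rm -r "$work"
```

With several .java files in the current directory, the unquoted form would expand to several words and find would typically stop with an error instead.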

3 Examining files

3.0.2 Identifying file types: file

Some common desktop operating systems use file name extensions (.txt, .c, .mp3, etc.) to denote the file type. With UNIX, you don't have to use file extensions to denote a particular file type.

When you don't know what a particular file contains, use the file command. It guesses the file type by looking directly at the contents of the file, regardless of its actual name.

$> file filename


3.0.3 Counting the number of lines in a text file: wc

wc with the -l switch counts the number of lines in a file. For example, you can use usnoop | grep Syn | wc -l if you want to know how many lines containing Syn appear in your usnoop trace.

$> cat textfile | wc -l

3.0.4 Searching text files for strings: grep

Grep looks for lines in a text stream matching a particular pattern, and prints only the matching (or non-matching) lines. The syntax is

$> grep pattern file1 file2 file3...

The pattern can be a simple keyword. For a fixed-string pattern, passing the -F option to grep will make it faster. For more complex patterns, you have to use regular expressions (or RegExpes).

4 Regular Expressions

RegExpes are the UNIX way of describing text patterns. They are used in many text processing languages such as sed, awk and perl. RegExpes are just simple text, but some characters acquire a special meaning:

Character           Meaning
. (Dot)             Stands for any character
? (Question mark)   The previous character is optional
* (Star)            The previous character repeats zero or more times
+ (Plus)            The previous character repeats one or more times
^ (Caret)           Stands for the beginning of the line
$ (Dollar)          Stands for the end of the line

You can also express a choice among many characters by listing them between brackets: [123] means either 1, 2 or 3.

You can negate the choice by prepending it with a caret: [^abc] means any character but a, b or c.

You can aggregate contiguous characters with a dash: [a-z] means any lowercase alphabetic character, [0-9] means any digit, and so on.

Examples:

^a*$      Matches the empty line, a, aa, aaa, aaaa, ...
abc*      Matches anything containing ab, abc, abcc, abccc, abcccc, ...
^ab[cd]   Matches anything starting with abc or abd
^ab.*cd$  Matches anything starting with ab and ending with cd
^ab.c$    Matches abac, abbc, abcc, abdc, abec, ab c, ab1c, ab$c, ...

For more information, you can refer to the grep manpage (man grep), or better, to the perl RegExp manpage (perldoc perlre).
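The patterns above can be tried out with grep on a small word list, as in the sketch below. The sample words are invented, and the -E option is an addition to the text: POSIX grep only treats ? and + as special in its extended mode.

```shell
# Hypothetical sample words, one per line.
words=$(printf '%s\n' a aa abc abd abxcd ab1c color colour xyz)

only_a=$(printf '%s\n' "$words" | grep -E '^a*$')         # lines made only of a's
abcd=$(printf '%s\n' "$words" | grep -E '^ab[cd]')        # start with abc or abd
wrapped=$(printf '%s\n' "$words" | grep -E '^ab.*cd$')    # start with ab, end with cd
one_any=$(printf '%s\n' "$words" | grep -E '^ab.c$')      # ab, any one character, then c
optional=$(printf '%s\n' "$words" | grep -E '^colou?r$')  # the u is optional

echo "$only_a"
echo "$abcd"
echo "$wrapped"
echo "$one_any"
echo "$optional"
```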


5 usnoop : selecting which packets to be captured

You can use a pcap filter with usnoop or tcpdump to choose which packets you want to capture. Here are some common filters:

5.0.5 arp, tcp, icmp and udp keywords

Use one of these keywords to catch only a particular protocol. For example, if you want to catch only ARP requests and replies, use:

$> usnoop arp

5.0.6 host, net and port keywords

host hostname    to catch only traffic going to or coming from this host.
net network      to catch only traffic going to or coming from this network.
port portnumber  to catch only traffic going to or coming from this specific port number.

$> usnoop host www.epfl.ch

$> usnoop net 128.178

$> usnoop port 80

5.0.7 src and dst keywords

You can specify a transfer direction for host, net and port. If you want to match packets coming from a source port, use src port; if you want to match packets going to a particular destination host, use dst host.

$> usnoop dst port 80

$> usnoop src host 128.178.164.22

5.0.8 matching flags in tcp packets

If you want to check for these common TCP flags, use:

SYN : 'tcp[13] & 2 != 0'
FIN : 'tcp[13] & 1 != 0'
ACK : 'tcp[13] & 16 != 0'
RST : 'tcp[13] & 4 != 0'

$> usnoop 'tcp[13] & 2 != 0'
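The masks in these filters are bit values within the flags byte, and the arithmetic can be checked in the shell without capturing anything. The sketch below assumes a segment with both SYN and ACK set; the variable names are hypothetical. The parentheses are needed in shell arithmetic, where != binds more tightly than &.

```shell
# Flag bit values, the same masks as in the filters above.
FIN=1; SYN=2; RST=4; ACK=16

flags=$(( SYN | ACK ))                # a SYN+ACK segment carries 2 + 16
echo "flags byte: $flags"             # 18

syn_set=$(( (flags & SYN) != 0 ))     # 1 means the bit is set, 0 means it is not
ack_set=$(( (flags & ACK) != 0 ))
rst_set=$(( (flags & RST) != 0 ))
fin_set=$(( (flags & FIN) != 0 ))
echo "SYN=$syn_set ACK=$ack_set RST=$rst_set FIN=$fin_set"
```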


5.0.9 using and, or, ( ) and not between keywords

Just combine the preceding keywords with and, or, ( ) and not to build powerful filters.

Catch all DNS traffic:

$> usnoop udp and port 53

Catch TCP traffic with the SYN and ACK bits set, coming from in3sun1:

$> usnoop 'tcp[13] & 2 != 0 and tcp[13] & 16 != 0 and src host in3sun1'

Catch all TCP and UDP traffic from www.epfl.ch, but discard the traffic on TCP port 22:

$> usnoop '((tcp and not port 22) or udp) and host www.epfl.ch'

6 Using the shell

6.1 Wildcards

Whenever you want to use a program expecting a list of files as arguments, you can use wildcards to have the shell build an appropriate list for you. Shell wildcards work somewhat like regexpes, but are much simpler.

Warning: regexpes and shell wildcards are not compatible. A valid wildcard is often not a valid regexp, and conversely.

The two most common wildcards are the star (*), which stands for zero, one or more characters of any sort, and the question mark (?), which stands for exactly one character of any sort. You can also specify a choice of possible characters by listing them between square brackets ([123] stands for either 1, 2, or 3).

Whenever you use an argument containing wildcards, the shell will automatically replace it by a list of file names in the current directory matching the pattern. For instance, for deleting all the java source files in a directory, do

$> rm *.java

Note that the shell only looks in the current directory. If you want a list of files found through an entire directory tree, you have to use the find command in conjunction with xargs or a backtick operator (see below).
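A small sketch of how the shell expands each kind of wildcard before the command runs. The file names are invented for the example, and echo is used simply to display the list the shell builds.

```shell
work=$(mktemp -d)
cd "$work"
touch A.java B.java notes.txt file1.txt file2.txt file10.txt

star=$(echo *.java)             # every name ending in .java
single=$(echo file?.txt)        # exactly one character between "file" and ".txt"
choice=$(echo file[12].txt)     # that character must be 1 or 2
echo "$star"
echo "$single"
echo "$choice"

rm *.java                       # the shell hands A.java and B.java to rm
remaining=$(ls | wc -l | tr -d ' ')
echo "$remaining"               # the four .txt files are left

cd /
rm -r "$work"
```

Note that file10.txt is not matched by file?.txt, because ? stands for exactly one character.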

6.2 I/O Redirection

In Unix, every running program can read data from a special stream called "Standard Input" (or STDIN), and write data to two other special streams: Standard Output and Standard Error (STDOUT and STDERR, respectively).

By default, when you run a command, STDIN is associated with your keyboard, while STDOUT and STDERR are associated with your display. That means the program reads what you type and shows its results and error messages on your screen.

You can redirect STDOUT to a file, so that the command writes its results to a file rather than to your screen. To do so, use the notation >file.


$> ls -d * >catalog

will create a list of the files in the current directory and write it in a file named ”catalog”.

Also, you can redirect STDIN so that the program takes its input from a file rather than from the keyboard, with the notation <file.
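The two redirections can be combined in a short experiment like the one below. The file names are hypothetical, and the 2>file notation for STDERR is standard sh syntax that goes slightly beyond what the text describes.

```shell
work=$(mktemp -d)
cd "$work"
touch b.txt a.txt

ls -d *.txt >catalog                      # STDOUT goes to the file instead of the screen
count=$(wc -l <catalog | tr -d ' ')       # STDIN comes from the file instead of the keyboard
first=$(sort <catalog | head -n 1)
echo "$count $first"

# STDERR is a separate stream: 2>file captures only the error messages.
ls no-such-file 2>errors || echo "ls failed, its message is in the errors file"
if [ -s errors ]; then err_saved=yes; else err_saved=no; fi
echo "$err_saved"

cd /
rm -r "$work"
```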

6.3 Pipes and filters

Sometimes, you want to chain the effects of two commands.

You can use the pipe symbol "|" to connect the STDOUT of a command to the STDIN of a second one.

UNIX has a lot of small "filter" programs that you can use to process the results from a command.

6.3.1 more

When used without filenames, more will act as a pager for its STDIN. You can then use it to add paging to the output of any command:

$> usnoop -i /tmp/file | more

6.3.2 tee

The tee command reads data from its STDIN and writes it both to STDOUT and to a file. You can use it to view the results and save them in a file at the same time. For example,

$> usnoop -i /tmp/file | tee results | more

will both page the output and save it in the results file.
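The same idea can be tried without a packet trace. In the sketch below, printf stands in for a chatty command such as usnoop, and the sample lines are invented; tee saves everything while the rest of the pipeline keeps filtering.

```shell
work=$(mktemp -d)
cd "$work"

# printf plays the role of the command producing output (hypothetical data).
shown=$(printf 'Syn\nAck\nSyn\nFin\n' | tee results | grep -c Syn)
echo "$shown"                             # number of lines containing Syn

saved=$(wc -l <results | tr -d ' ')
echo "$saved"                             # tee kept every line in the file

cd /
rm -r "$work"
```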

6.3.3 sort

The sort command can sort lines in alphabetic or numeric order (use option -n for a numeric sort). You can reverse the sort with the -r option. To list all the files in a directory by descending number of words, you can do:

$> wc -w * | sort -rn

To create a sorted version of a file, you can do

$> sort <original >sorted
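The difference between the alphabetic and numeric orders is easiest to see on numbers of different lengths. The three sample values below are made up; paste is only used to print each result on one line.

```shell
# Hypothetical sample where text order and numeric order differ.
nums=$(printf '10\n9\n100\n')

text_order=$(printf '%s\n' "$nums" | sort     | paste -s -d ' ' -)
numeric=$(printf '%s\n' "$nums"    | sort -n  | paste -s -d ' ' -)
reverse=$(printf '%s\n' "$nums"    | sort -rn | paste -s -d ' ' -)

echo "$text_order"      # 10 100 9   (compared character by character)
echo "$numeric"         # 9 10 100
echo "$reverse"         # 100 10 9
```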

6.3.4 sed, awk, perl

sed (Stream EDitor) is a powerful program that can be used to perform various transformations on a text stream. It is line-oriented, and also relies on regular expressions for its pattern matching. It is too complex to be explained here, but plenty of information is available in the man page.


awk is another text-processing program, oriented towards lines and fields. It can be used to process text-file databases. You can also look up more information in the man page.

Those two programs are practical for small processing tasks, but their performance and flexibility are limited. Much more complex tasks can be achieved with the Perl language, which is becoming widely available in Unix distributions. Perl programming is beyond the scope of this document.

6.4 Turning a stream into arguments

6.4.1 The backtick operator ``

The backticks take the output of a command and expand it into an argument list. For instance, to recursively grep all java files in a directory for a specific keyword, you can first use:

$> find . -type f -name "*.java"

to get a list of all the java files in the directory. Then, to turn it into an argument list for grep, do

$> grep keyword `find . -type f -name "*.java"`

6.4.2 xargs

Xargs reads items (separated by blanks or newlines) from its STDIN and turns them into arguments for a command. It works pretty much like the backtick, except it is more flexible. For the example above, an equivalent would be

$> find . -type f -name "*.java" | xargs grep keyword
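The sketch below runs both forms on a scratch tree to show that they select the same files. The file names and contents are hypothetical, and the -l option (print only the names of matching files) is used here just to keep the output short.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p src
printf 'class A { int keyword; }\n' > A.java
printf 'class B { }\n'              > src/B.java
printf 'keyword in a text file\n'   > notes.txt   # not a .java file, never searched

# Backticks: the output of find becomes grep's argument list.
with_backticks=$(grep -l keyword `find . -type f -name "*.java"`)

# xargs: the same list arrives through a pipe instead.
with_xargs=$(find . -type f -name "*.java" | xargs grep -l keyword)

echo "$with_backticks"
echo "$with_xargs"

cd /
rm -r "$work"
```

Both forms split the list on whitespace, so they misbehave on file names containing spaces, which is one more reason to follow the naming advice given earlier.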

6.5 Job control

6.5.1 ; — synchronous execution

If you want to execute several commands sequentially, you can separate them with semicolons (;).

$> ls /bin; ls /sbin

Each separate command will start when the previous one is completed, regardless of its exit status.

6.5.2 & and wait — asynchronous execution

You can execute several commands in parallel by separating them with &; each command starts immediately. This will print file.ps while simultaneously creating a compressed version, file.ps.gz:

$> lp file.ps & gzip file.ps


If you do not specify the last command, you return to the shell. Therefore, you can launch a command in the background by appending a & to it. However, the command will still use your terminal as standard input and output, so you have to use proper redirection when launching commands in the background. This runs a command in the background, saving its output in a file 'results' and discarding the errors.

$> command >results 2>/dev/null &

If you need a command sequence to stop and wait for a background program to complete, use the wait command.
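A minimal sketch of & together with wait. The two background tasks and their log file names are invented; sleep merely simulates work that takes a while.

```shell
work=$(mktemp -d)
cd "$work"

# Two hypothetical slow tasks, each started in the background with &.
( sleep 1; echo "first done"  > one.log ) &
( sleep 1; echo "second done" > two.log ) &

wait                                  # block until every background job has finished
summary=$(cat one.log two.log)        # safe to read now: both files exist
echo "$summary"

cd /
rm -r "$work"
```

Without the wait line, cat would most likely run before either log file had been written.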

6.5.3 Conditional synchronous execution

Occasionally, you want to execute a command only if the previous one succeeded (or not). For this, you can use the && and || operators.

&& acts the same as ; except that it will not continue if the first command fails.

|| will execute the second command only if the previous one failed.

$> cp a b && rm a

will copy file a to file b, then erase a only if the copy succeeded.

$> cp a b ; rm a

will erase a even if the copy failed.

$> mv a b || cp a b

will rename a to b, or fall back to making a copy if the rename failed (for instance, if a was write-protected).
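The behaviour of && and || can be observed safely in a scratch directory, as sketched below. The file names and the missing directory no-such-dir are made up so that one of the copies is guaranteed to fail.

```shell
work=$(mktemp -d)
cd "$work"

echo data > a
cp a b && rm a                             # the copy succeeds, so a is removed
if [ -e a ]; then after_ok=kept; else after_ok=removed; fi

echo data > c
cp c no-such-dir/d 2>/dev/null && rm c     # the copy fails, so rm never runs
if [ -e c ]; then after_fail=kept; else after_fail=removed; fi

fallback=""
cp c no-such-dir/d 2>/dev/null || fallback="copy failed"   # runs only on failure
echo "$after_ok $after_fail $fallback"

cd /
rm -r "$work"
```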

6.5.4 ps — showing running processes

You can use the ps command to show a list of the processes running on your machine, along with their IDs.

With no argument, ps shows the processes that you own and that were started from the same terminal.

The '-A' option lists all processes running on the machine. The '-l' option shows detailed information for each process.

6.5.5 kill — sending signals

UNIX processes can also communicate with signals. Signals can be used to interrupt, restart or destroy processes. The syntax is

$> kill -signal PID

to signal a process given its PID, or

$> kill -signal %job

to signal a process by job number (see below).

If you do not specify a signal, the TERM signal is assumed.

The most common signals are:

INT   interruption (program should interrupt what it's doing)
STOP  suspends the execution of the process (the execution can be resumed later)
CONT  resumes the execution of a suspended process (e.g. one suspended with the STOP signal)
TERM  termination (program should clean up and exit immediately)
KILL  kills the process (should only be used as a last resort)

You can get a list of available signals by typing

$> kill -l

Examples :

$> kill -TERM 2245

$> kill -TERM %2
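To experiment without touching a real program, you can signal a process you started yourself, as in this sketch. The sleep command is only a stand-in target, and $! (the PID of the last background command) is standard shell syntax not introduced in the text.

```shell
sleep 30 &                      # a long-running background process to practise on
pid=$!                          # $! holds the PID of the last background command
kill -TERM "$pid"               # ask it to terminate

status=0
wait "$pid" || status=$?        # collect the exit status of the terminated process
echo "$status"                  # above 128 when a signal ended it (143 = 128 + 15 in bash)
```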

6.5.6 shortcuts to send signals

You can send a SIGINT to the program currently running in a terminal by hitting Ctrl-C.

You can suspend it (so that it can continue later) by hitting Ctrl-Z, which sends SIGTSTP, the terminal stop signal (unlike SIGSTOP, a program can catch it).

6.5.7 jobs, fg and bg - controlling job execution

Each program either stopped or running in the background becomes a "Job" of the shell. You can display current jobs and their numbers with the command jobs.

You can bring a job that you stopped with Ctrl-Z, or that was running in the background, back to the foreground with the command

$> fg <job number>

You can run a stopped job in the background with

$> bg <job number>

If you don’t give a job number, fg and bg will assume the most recent job.

Finally, you can refer to a job with the construct %<job number> wherever kill expects a PID. For instance, to send the HUP signal to job number 5, do

$> kill -HUP %5


7 Printing

7.0.8 PostScript files

Beyond printing simple text files, most UNIX printers support the PostScript language to print more complex documents. You can send either text or PostScript files to the printer with lp.

7.0.9 lp — printing .ps and text files

$> lp -c file.ps

The -c option copies files to the spooler before printing. It can be needed in IN1 and IN3 because of configuration glitches.

7.0.10 a2ps — printing any file types ! (almost)

a2ps is a filtering program that produces PostScript files out of lots of different file types. You can send the files to the printer using lp:

$> a2ps -o MyClass.ps MyClass.java

$> lp MyClass.ps && rm MyClass.ps

8 Acknowledgements:

Thanks to the people who contributed to this document:

- Matthias Grossglauser
- Olivier Hochreutiner
- Sebastien Mathieu
