grep, diff, find
‣ grep searches for a pattern – Outputs line(s) containing pattern to stdout
‣ grep accepts both stdin and command line arg(s)
UNIX> seq 10 | grep 1
1
10
UNIX>
UNIX> cat > input.txt
1 haystack
2 haystack
3 needle
4 haystack
<CTRL-C>
UNIX> grep needle input.txt
3 needle
‣ Use double quotes for patterns that contain spaces
UNIX> cat > input2.txt
1,Marc Rubin,98753,Blue
2,Marc Ruben,96242,Green
3,Marc Reuben,97114,Brown
<CTRL-C>
UNIX> cat input2.txt | grep "Marc Rubin"
1,Marc Rubin,98753,Blue
‣ Search multiple files at once
UNIX> grep 3 input.txt input2.txt
input.txt:3 needle
input2.txt:1,Marc Rubin,98753,Blue
input2.txt:3,Marc Reuben,97114,Brown
UNIX>
UNIX> seq -352.5983 0.001 73.9531 | grep .653 | wc -l
3952
UNIX> ls -l /u/sa/br | grep qhan
drwx--S--- 8 qhan qhan 4096 Jul 17 14:28 qhan
‣ grep can make use of regular expressions – CS theory tangent: what type of machine recognizes regex?
‣ Beyond time and scope of course…
‣ More information: – http://www.robelle.com/smugbook/regexpr.html
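A minimal sketch of what regular expressions add to grep, reusing the sample records from the earlier slide. It assumes a grep that supports the -E (extended regex) option and works in a throwaway directory created with mktemp. The second half shows why the '.' in the earlier `grep .653` example matches any character unless it is escaped.

```shell
# Work in a throwaway directory so nothing in the real tree is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample data from the earlier slide.
printf '%s\n' \
  '1,Marc Rubin,98753,Blue' \
  '2,Marc Ruben,96242,Green' \
  '3,Marc Reuben,97114,Brown' > input2.txt

# -E enables extended regular expressions: match Rubin, Ruben, or Reuben.
all_spellings=$(grep -E 'R(u|eu)b[ie]n' input2.txt)
echo "$all_spellings"

# An unescaped '.' matches any single character, so '.653' also matches '9653'.
# Escaping it as '\.653' matches only a literal dot followed by 653.
any_char=$(printf '%s\n' '1.653' '19653' | grep -c '.653')
literal_dot=$(printf '%s\n' '1.653' '19653' | grep -c '\.653')
echo "unescaped: $any_char, escaped: $literal_dot"
```

Single quotes around the pattern keep the shell from interpreting the parentheses, brackets, and backslash before grep sees them.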
‣ diff compares two files, outputs differences to stdout – No differences, no output
UNIX> echo HELLO > f1.txt
UNIX> echo HELLO > f2.txt
UNIX> diff f1.txt f2.txt
UNIX>
UNIX> echo HELLO > f1.txt
UNIX> echo GOODBYE > f2.txt
UNIX> diff f1.txt f2.txt
1c1
< HELLO
---
> GOODBYE
‣ -y option splits output into two columns
‣ ‘|’ indicates differences
UNIX> diff -y f1.txt f2.txt
HELLO | GOODBYE
UNIX>
UNIX> cat > input1.txt
first
second
third
<CTRL-C>
UNIX>
UNIX> cat > input2.txt
first
s e c o n d
third
<CTRL-C>
UNIX>
UNIX> diff -y input1.txt input2.txt
first     first
second  | s e c o n d
third     third
UNIX> diff -y input1.txt input2.txt | grep '|'
second  | s e c o n d
‣ -w option ignores whitespace
UNIX> diff -y input1.txt input2.txt | grep '|'
second  | s e c o n d
UNIX> diff -w input1.txt input2.txt
UNIX>
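Because diff prints nothing when files match, scripts usually test its exit status instead of its output: 0 means no differences, 1 means differences were found. A small sketch, assuming the same input1.txt and input2.txt contents as above, recreated in a throwaway directory:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'first\nsecond\nthird\n' > input1.txt
printf 'first\ns e c o n d\nthird\n' > input2.txt

# Plain diff: the second lines differ, so the exit status is 1.
plain_status=0
diff input1.txt input2.txt > /dev/null || plain_status=$?

# With -w, whitespace is ignored, so the files compare as equal (status 0).
ignore_ws_status=0
diff -w input1.txt input2.txt > /dev/null || ignore_ws_status=$?

echo "plain: $plain_status, with -w: $ignore_ws_status"

# The exit status can drive an if statement directly.
if diff -w input1.txt input2.txt > /dev/null; then
  echo "files match apart from whitespace"
fi
```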
‣ find recursively searches filesystem for file(s)
‣ Syntax: find where [..] what [..] – where := where to start searching
– what := what to search for
‣ -name option to specify file name – “starting in current directory, find photo1.jpg”
UNIX> mkdir temp
UNIX> touch temp/photo1.jpg
UNIX> find . -name photo1.jpg
./temp/photo1.jpg
UNIX>
‣ Often use wildcards (*) – “find all .jpg files starting in temp directory”
UNIX> touch temp/photo2.jpg temp/photo3.jpg
UNIX> find temp –name "*.jpg"
temp/photo1.jpg
temp/photo2.jpg
temp/photo3.jpg
‣ -user option to specify file owner – “starting in /data/csci274, find all files owned by qhan”
UNIX> find /data/csci274/ -user qhan
/data/csci274/
/data/csci274/Assignments
/data/csci274/Assignments/14_network.tgz
/data/csci274/Assignments/14_network
/data/csci274/Assignments/14_network/file.txt
…
‣ -size option to specify file size (e.g., -10k, +100M) – see man page for more details.
‣ “starting in current directory, find all files larger than 450 MB”
UNIX> wc -c temp/huge.txt
471904256 temp/huge.txt
UNIX> find . -size +450M
./temp/huge.txt
UNIX>
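A small sketch of how the + and - prefixes behave, using the c suffix (bytes) so the comparison is exact. The file names small.bin and large.bin are hypothetical, -type f (regular files only) is added to keep the directory itself out of the results, and the sketch assumes a head that supports -c.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create files of exactly 100 and 2000 bytes (hypothetical names).
head -c 100 /dev/zero > small.bin
head -c 2000 /dev/zero > large.bin

# +1000c: strictly more than 1000 bytes.
bigger=$(find . -type f -size +1000c)

# -1000c: strictly fewer than 1000 bytes.
smaller=$(find . -type f -size -1000c)

# No prefix: exactly 2000 bytes.
exact=$(find . -type f -size 2000c)

echo "bigger:  $bigger"
echo "smaller: $smaller"
echo "exact:   $exact"
```

With larger units such as k or M, sizes are rounded up to whole units before comparing, so the man page is worth checking when a file sits near a boundary.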
‣ Often combine search expressions – “Starting in /data/csci274/, find all files with:
– .jpg extension
– owned by qhan
– larger than 100 kB”
UNIX> find /data/csci274/ -name "*.jpg" -user qhan -size +100k
/data/csci274/Assignments/14_network/image.jpg
‣ By default, multiple search expressions are joined by logical AND
‣ -or option used for logical OR
UNIX> find /data/csci274/ -name "driver.cpp" -or -name "dummy.cpp"
/data/csci274/Assignments/6_makefile/fourth/driver.cpp
/data/csci274/Assignments/6_makefile/fourth/dummy.cpp
/data/csci274/Assignments/6_makefile/second/driver.cpp
/data/csci274/Assignments/6_makefile/second/dummy.cpp
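When AND and OR appear in the same command, AND binds more tightly, so escaped parentheses are needed to group the OR part. A sketch under assumptions: the directory layout is made up for illustration, and -o is used as the portable spelling of -or.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p second fourth

# Hypothetical layout: files in "second" are empty, files in "fourth" are not.
touch second/driver.cpp second/dummy.cpp
echo 'int main() { return 0; }' > fourth/driver.cpp
echo 'int dummy() { return 0; }' > fourth/dummy.cpp

# Ungrouped: read as  driver.cpp  OR  (dummy.cpp AND non-empty),
# so the empty second/driver.cpp is still reported.
ungrouped=$(find . -name "driver.cpp" -o -name "dummy.cpp" -size +0c | LC_ALL=C sort)

# Grouped: (driver.cpp OR dummy.cpp) AND non-empty.
grouped=$(find . \( -name "driver.cpp" -o -name "dummy.cpp" \) -size +0c | LC_ALL=C sort)

echo "ungrouped:"; echo "$ungrouped"
echo "grouped:";   echo "$grouped"
```

The backslashes stop the shell from treating the parentheses as its own syntax; find receives them as plain arguments.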
UNIX> mkdir a a/b a/b/c
UNIX> touch a/f1.zzzz a/b/f2.zzzz a/b/c/f3.zzzz
UNIX> find a -name "*.zzzz"
a/b/c/f3.zzzz
a/b/f2.zzzz
a/f1.zzzz
‣ xargs: build and execute command lines from standard input
UNIX> find a -name "*.zzzz" | xargs rm
UNIX> find a -name "*.zzzz"
UNIX>
‣ Be VERY CAREFUL with xargs rm
UNIX> find / -user poor_guy | xargs rm
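One way to lower the risk is to preview the command before running it, and to pass file names in a form that survives spaces. The sketch below is one possible habit, not the only safe approach. It assumes a find and xargs that support -print0 and -0 (widely available, for example in GNU and BSD tools), and the file names are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p a/b
touch a/f1.zzzz "a/b/my file.zzzz" a/keep.txt

# Step 1: preview. Putting echo in front prints the rm command
# that xargs would build instead of running it.
find a -name "*.zzzz" | xargs echo rm

# Step 2: delete. -print0 and -0 separate names with NUL bytes, so a name
# containing spaces reaches rm as one argument; "--" ends option parsing.
find a -name "*.zzzz" -print0 | xargs -0 rm --

# Only the file that did not match the pattern should remain.
remaining=$(find a -type f)
echo "$remaining"
```

Without -print0 and -0, the name "a/b/my file.zzzz" would be split into two arguments, and rm would be asked to delete files that were never matched.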
‣ Use grep, diff, and find to search
‣ http://eecs.mines.edu/Courses/csci274/Assignments/10_searching.html