sed tips and tricks

Post on 16-Apr-2017

Sed – Tips and Tricks

Logan Palanisamy

Agenda

Basics
Bio Break
Intermediate Concepts
Q & A

What is sed

Non-interactive
Non-screen oriented
Line oriented
Input file can be any size
Input file is not affected

sed syntax

sed [options] 'cmd' in_file(s)
sed [options] 'cmd' in_file(s) [> out_file]
Standard input: sed [options] 'cmd' < in_file [> out_file]
Pipelined input: command | sed [options] 'cmd' [> out_file]

Simple examples

Example
    Explanation

sed 's/pat1/pat2/g' file
    Substitute all occurrences of pat1 with pat2
sed '/pat1/d' file
    Delete lines containing pat1 from the output
sed '/pat1/w newfile' file
    Save lines containing pat1 to newfile
sed -n '30,40p' file
    Print lines 30 to 40
sed '10q' file
    Print the top 10 lines
sed -e 's/pat1/pat2/' -e 's/pat3/pat4/' file
    Substitute pat1 with pat2, and pat3 with pat4
sed 's/pat1/pat2/;s/pat3/pat4/' file
    Substitute pat1 with pat2, and pat3 with pat4
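The commands in this table can be tried on a small sample. The following is a minimal sketch, assuming GNU sed; the sample text and the variable names are made up for illustration.

```shell
# Hypothetical three-line sample input
sample=$(printf 'red apple\ngreen pear\nred plum\n')

# Substitute every "red" with "blue"
subst=$(printf '%s\n' "$sample" | sed 's/red/blue/g')

# Delete the lines containing "pear"
deleted=$(printf '%s\n' "$sample" | sed '/pear/d')

# Print only lines 2 to 3 (-n suppresses the automatic printing)
ranged=$(printf '%s\n' "$sample" | sed -n '2,3p')

# Print the first 2 lines, then quit
top=$(printf '%s\n' "$sample" | sed '2q')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$subst" "$deleted" "$ranged" "$top"
```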

Different sed options/switches

Option
    Remarks

-e
    Used when multiple commands are given on the command line
-n
    Suppress automatic printing of the pattern space
-f script-file
    Execute the commands in script-file
-r
    Use extended regular expressions in the script. With this option, characters such as (, ), {, }, | become meta characters, and don't have to be escaped.
-s
    Consider files as separate rather than as a single continuous long stream
-i
    In-place editing of the input file

The "-f" option - example

cat mysed.txt
/pat1/ s/this/that/g
/pat2/ s/before/after/

sed -f mysed.txt in_file

Note: apostrophes are not used around the commands in the script file.
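The script-file example can be reproduced end to end. This is a sketch assuming GNU sed; the temporary directory and the sample input lines are hypothetical.

```shell
workdir=$(mktemp -d)

# Script file: one command per line, no surrounding apostrophes
cat > "$workdir/mysed.txt" <<'EOF'
/pat1/ s/this/that/g
/pat2/ s/before/after/
EOF

# Hypothetical input file
printf 'pat1: this and this\npat2: before\nother: this before\n' > "$workdir/in_file"

result=$(sed -f "$workdir/mysed.txt" "$workdir/in_file")
printf '%s\n' "$result"
rm -r "$workdir"
```

Only the lines matching each address are changed; the third line contains neither pat1 nor pat2 and passes through untouched.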

With and without "-s" option - Comparison

Lines in f1   Lines in f2   sed -n '1,10p' f1 f2                 sed -ns '1,10p' f1 f2
4             5             9 lines                              4 lines from f1, 5 lines from f2
6             6             6 lines from f1, 4 lines from f2     6 lines from f1, 6 lines from f2
12            10            10 lines from f1, no lines from f2   10 lines from f1, 10 lines from f2
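The last row of the comparison (12 lines in f1, 10 lines in f2) can be checked with a short sketch. It assumes GNU sed, and the file contents generated with seq are only placeholders.

```shell
workdir=$(mktemp -d)
seq 1 12 > "$workdir/f1"      # 12 lines
seq 101 110 > "$workdir/f2"   # 10 lines

# Without -s the two files form one stream, so line numbers keep counting
joined=$(sed -n '1,10p' "$workdir/f1" "$workdir/f2" | wc -l)

# With -s the line numbers restart for each file
separate=$(sed -ns '1,10p' "$workdir/f1" "$workdir/f2" | wc -l)

echo "without -s: $joined lines, with -s: $separate lines"
rm -r "$workdir"
```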

Address Specification

Range
    Remarks

10
    Just line 10
1,10
    Lines 1 to 10
10,$
    Line 10 to end of file
10,+3
    Line 10 and 3 lines below (lines 10, 11, 12 and 13)
10~3
    Every third line starting with line 10 (lines 10, 13, 16, ...)
10,~3
    Line 10 up to the next line number that is a multiple of 3 (lines 10, 11 and 12)
/pat1/
    All lines containing pat1
/pat1/,+3
    Lines containing pat1 and the three lines following each of them
/pat1/,~3
    From a line containing pat1 up to the next line number that is a multiple of 3
/pat1/,20
    From the line containing pat1 to line 20, if pat1 appears before line 20. Otherwise, just the line containing pat1
/pat1/,/pat2/
    From a line containing pat1 to the next line containing pat2
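Numbered input makes the address forms easy to compare. A small sketch, assuming GNU sed (the ~ and + forms are GNU extensions); seq and paste are used only to generate and flatten the sample.

```shell
# Every third line starting with line 10
step=$(seq 1 20 | sed -n '10~3p' | paste -s -d ' ' -)

# Line 10 up to the next multiple of 3
upto=$(seq 1 20 | sed -n '10,~3p' | paste -s -d ' ' -)

# Line 10 and the 3 lines below it
plus=$(seq 1 20 | sed -n '10,+3p' | paste -s -d ' ' -)

# From the line matching one pattern to the line matching another
between=$(seq 1 20 | sed -n '/^5$/,/^7$/p' | paste -s -d ' ' -)

printf '%s\n%s\n%s\n%s\n' "$step" "$upto" "$plus" "$between"
```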

Address Specification with negation

Range
    Remarks

10!
    All lines except 10 (! is the negation indicator)
1,10!
    Lines from 11 to end of the file
10,$!
    Lines 1 to 9
10,+3!
    All lines except 10, 11, 12 and 13
10~3!
    All lines except line 10 and every third line after that
10,~3!
    All lines except lines 10, 11, and 12
/pat1/!
    All lines not containing pat1
/pat1/,+3!
    All lines except the lines containing pat1 and the three lines following them
/pat1/,~3!
    All lines except those from a line containing pat1 up to the next multiple of 3
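Negated addresses can be checked the same way. A sketch assuming GNU sed, with seq providing placeholder input:

```shell
# Everything except lines 1 to 10
not_range=$(seq 1 12 | sed -n '1,10!p' | paste -s -d ' ' -)

# Everything except line 10 and the 3 lines below it
not_plus=$(seq 8 15 | sed -n '3,+3!p' | paste -s -d ' ' -)

# Lines that do NOT contain the digit 1
not_pat=$(seq 1 12 | sed -n '/1/!p' | paste -s -d ' ' -)

printf '%s\n%s\n%s\n' "$not_range" "$not_plus" "$not_pat"
```

In the second command the input starts at 8, so input line 3 holds the value 10 and the skipped lines are the values 10 to 13.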

Regular Expressions

Meta character
    Meaning

.
    Matches any single character except newline
*
    Matches zero or more of the character preceding it, e.g.: bugs*, table.*
^
    Denotes the beginning of the line. ^A denotes lines starting with A
$
    Denotes the end of the line. :$ denotes lines ending with :
\
    Escape character (\., \*, \[, \\, etc.)
[ ]
    Matches any one of the characters within the brackets, e.g. [aeiou], [a-z], [a-zA-Z], [0-9], [[:alpha:]], [a-z?,!]
[^ ]
    Matches any one character other than the ones inside the brackets, e.g. ^[^13579] denotes lines not starting with an odd digit, [^02468]$ denotes lines not ending with an even digit
\<, \>
    Match at the beginning or end of a word

Extended Regular Expressions

Meta character
    Meaning

|
    Alternation, e.g.: ho(use|me), the(y|m), (they|them)
+
    One or more occurrences of the previous character. a+ is the same as aa*
?
    Zero or one occurrence of the previous character
{n}
    Exactly n repetitions of the previous char or group
{n,}
    n or more repetitions of the previous char or group
{,m}
    Zero to m repetitions of the previous char or group
{n,m}
    n to m repetitions of the previous char or group
(....)
    Used for grouping
sed -r ...
    The "-r" option may have to be used on some versions of sed for extended regular expressions to work,
    e.g.: sed -r '/(pat1|pat2)/ s/pat3/pat4/'
          sed -rn '/(pat1|pat2){3,}/ p'
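Alternation and repetition counts can be seen on a few made-up words. A sketch assuming GNU sed with the -r option:

```shell
# Alternation inside a group: matches "house" and "home" but not "horse"
alt=$(printf 'house\nhome\nhorse\n' | sed -rn '/ho(use|me)/p' | paste -s -d ' ' -)

# A group repeated three or more times
rep=$(printf 'ab\nabab\nababab\n' | sed -rn '/(ab){3,}/p')

# + means one or more of the previous character
plus=$(printf 'ac\nabc\nabbc\n' | sed -rn '/ab+c/p' | paste -s -d ' ' -)

printf '%s\n%s\n%s\n' "$alt" "$rep" "$plus"
```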

Regular Expressions – Examples

Example
    Meaning

.{10,}
    10 or more characters. Without the -r option, the curly braces have to be escaped
[0-9]{3}-[0-9]{2}-[0-9]{4}
    Social Security number
\([0-9]{3}\)[0-9]{3}-[0-9]{4}
    Phone number (xxx)yyy-zzzz. In extended syntax the literal parentheses are escaped
[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}
    IP address format. The dots are escaped so that they match only a literal period
[0-9]{3}[ ]*[0-9]{3}
    Postal code in India
[0-9]{5}(-[0-9]{4})?
    US ZIP Code + 4
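Patterns like these are easiest to trust after running them against matching and non-matching input. The sketch below assumes GNU sed with -r; the ^ and $ anchors and the sample values are additions for the test and are not part of the slide.

```shell
# US ZIP code with an optional +4 part, anchored to the whole line
zip=$(printf '95014\n95014-1234\n9501\n' |
      sed -rn '/^[0-9]{5}(-[0-9]{4})?$/p' | paste -s -d ' ' -)

# An unescaped dot matches ANY character, so a non-address slips through
loose=$(printf '10.0.0.1\n10a0b0c1\n' |
        sed -rn '/^[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}$/p' | wc -l)

# Escaping the dots restricts the match to literal periods
strict=$(printf '10.0.0.1\n10a0b0c1\n' |
         sed -rn '/^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$/p')

echo "zip: $zip | loose matches: $loose | strict: $strict"
```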

Substitution – Format and Options

[address[!]]s/pat1/pat2/[options]
Options: g – global, w – write, i – ignore case, p – print, n – nth occurrence
:, ;, +, @ or any other character, including space, could also be used as the delimiter. Useful when / is part of the search or replacement string.
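A path rewrite shows why an alternative delimiter helps. A minimal sketch, assuming GNU sed; the paths are invented for the example.

```shell
# With / as the delimiter every slash in the patterns must be escaped
escaped=$(echo '/usr/local/bin/tool' | sed 's/\/usr\/local/\/opt/')

# With : as the delimiter the same substitution needs no escaping
colon=$(echo '/usr/local/bin/tool' | sed 's:/usr/local:/opt:')

# @ works the same way
at=$(echo '/usr/local/bin/tool' | sed 's@/usr/local@/opt@')

printf '%s\n%s\n%s\n' "$escaped" "$colon" "$at"
```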

Substitution - Examples

Example
    Explanation

sed 's/pat1/pat2/' fn
    Substitute the FIRST occurrence of pat1 with pat2
sed 's/pat1/pat2/g' fn
    Substitute ALL occurrences of pat1 with pat2
sed 's/pat1/pat2/3' fn
    Substitute the third occurrence of pat1 with pat2
sed 's/pat1/pat2/3g' fn
    Substitute all but the first two occurrences of pat1 with pat2
sed 's/pat1/pat2/gi' fn
    Substitute ALL occurrences of pat1 with pat2, ignoring the case
sed 's/pat1/pat2/giw new_file' fn
    Write to new_file the lines containing pat1, after substituting pat2
sed -n 's/pat1/pat2/gp' fn
    Print the lines containing pat1, after substituting pat2
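The effect of the occurrence flags is clearest on a line with several matches. A sketch assuming GNU sed (the combined 3g form and the i flag are GNU behavior); the sample line is made up.

```shell
line='a-a-a-a-a'

first=$(echo "$line" | sed 's/a/X/')      # first occurrence only
all=$(echo "$line" | sed 's/a/X/g')       # every occurrence
third=$(echo "$line" | sed 's/a/X/3')     # third occurrence only
from3=$(echo "$line" | sed 's/a/X/3g')    # third occurrence onwards
nocase=$(echo 'A-a' | sed 's/a/X/gi')     # ignore case

printf '%s\n%s\n%s\n%s\n%s\n' "$first" "$all" "$third" "$from3" "$nocase"
```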

Substitution – Examples contd.

Example
    Explanation

sed '/pat1/ s/pat2/pat3/g' fn
    Substitute all occurrences of pat2 with pat3 on lines containing pat1
sed '/pat1/! s/pat2/pat3/g' fn
    Substitute all occurrences of pat2 with pat3 on lines NOT containing pat1
sed '/pat1/,/pat2/ s/pat3/pat4/g' fn
    Substitute all occurrences of pat3 with pat4 on lines between pat1 and pat2 (inclusive)
sed '1,100 s/pat1/pat2/gi' fn
    Substitute ALL occurrences of pat1 with pat2, ignoring the case, on lines 1 to 100
sed '/pat1/,/pat2/! s/pat3/pat4/g' fn
    Substitute all occurrences of pat3 with pat4 on lines NOT between pat1 and pat2 (inclusive)
sed '1,100! s/pat1/pat2/3i' fn
    Substitute the third occurrence of pat1 with pat2, ignoring the case, from line 101 to end of file
sed '2~5 s/pat1/pat2/w new_file' fn
    On every 5th line starting with the 2nd line, substitute pat1 with pat2 and write the changed lines to new_file

Substitution – Examples contd.

Example
    Explanation

sed 's/\(pat1\|pat2\)/pat3/g' fn
    Substitute all occurrences of either pat1 or pat2 with pat3
sed -r 's/(pat1|pat2)/pat3/g' fn
    Same as above. With the -r option, parentheses and alternation characters don't have to be escaped
sed -r 's/abc(pat1|pat2)/pat3/g' fn
    Substitute all occurrences of either abcpat1 or abcpat2 with pat3
sed 's/\<pat1/pat3/g' fn
    Substitute all occurrences of pat1 that begin a word with pat3
sed 's/pat1\>/pat3/g' fn
    Substitute all occurrences of pat1 that end a word with pat3
sed 's/\<pat1\>/pat3/g' fn
    Substitute all occurrences of the word pat1 with pat3. Note: The angle brackets (< and >) have to be escaped even with the -r option.
sed 's/[a-z]/\u&/g' fn
    Convert all lower case letters to upper case letters
sed 's/./&&/g' fn
    Double each character
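Word boundaries and the & reference can be compared side by side. A sketch assuming GNU sed (\<, \> and \u are GNU extensions); the sample words are hypothetical.

```shell
text='cat concat cats'

begins=$(echo "$text" | sed 's/\<cat/dog/g')     # "cat" at the start of a word
ends=$(echo "$text" | sed 's/cat\>/dog/g')       # "cat" at the end of a word
whole=$(echo "$text" | sed 's/\<cat\>/dog/g')    # "cat" as a whole word

upper=$(echo 'abc' | sed 's/[a-z]/\u&/g')        # & is the matched text
doubled=$(echo 'abc' | sed 's/./&&/g')

printf '%s\n%s\n%s\n%s\n%s\n' "$begins" "$ends" "$whole" "$upper" "$doubled"
```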

Grouping and Back Referencing

Parts of strings on the search (left-hand) side can be grouped and referenced on the replacement (right-hand) side
Up to nine groups possible (\1, \2, ... \9)
Groups can be nested or referenced back on the search side
The same group can be referenced any number of times

Substitution with Grouping and Back Referencing. Examples

Command
    Explanation

sed 's/^\(.*\):\(.*\)/\2:\1/' fn
    Swap two fields delimited with :. "column A:column B" becomes "column B:column A"
sed -r 's/^(.*):(.*)/\2:\1/' fn
    Same as above. With the -r option, the parentheses don't have to be escaped.
sed -r 's/^([^:]*):([^:]*)/\2:\1/' fn
    Same as above. Exchanges only the first two columns even if there are more.
sed -r 's/^(This (.*) nested)/\2 \1/' fn
    Group 1 contains everything between "This .. nested". Group 2 contains just the characters between "This" and "nested".
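The difference between the greedy and the restricted field swap only shows up with three or more fields. A sketch assuming GNU sed; the sample strings are invented.

```shell
# Greedy .* grabs everything up to the LAST colon
swap_greedy=$(echo 'a:b:c' | sed 's/^\(.*\):\(.*\)/\2:\1/')

# [^:]* stops at the first colon, so only the first two fields swap
swap_first=$(echo 'a:b:c' | sed -r 's/^([^:]*):([^:]*)/\2:\1/')

# Nested groups: \1 is the outer group, \2 the inner one
nested=$(echo 'This is nested' | sed -r 's/^(This (.*) nested)/\2 \1/')

printf '%s\n%s\n%s\n' "$swap_greedy" "$swap_first" "$nested"
```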

Substitution with Grouping and Back Referencing. Examples

Command
    Explanation

sed -rn '/(\w)(\w)(\w)\w?\3\2\1/p' fn
    Print lines with six or seven character long palindromes
sed -r 's/(.)(.)(.)\3\2\1/\1\2\3\1\2\3/g' fn
    Convert six char palindrome strings to repetitive strings. Note: Any embedded or trailing run of six spaces will also match
sed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta' numbers.txt
    Add comma as thousands separator (Think how easy it would have been if we were to do this from the left side)
sed -r 's/(.*)/\1\1/' fn
    Repeat the contents of each line on the same line
sed -r 's:(.*):\1\1:' fn
    Same as above, but using : as the delimiter between search and replacement strings
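The thousands-separator command loops: each pass inserts one comma and "ta" branches back to label "a" while a substitution still succeeds. A sketch assuming GNU sed with -r; the numbers are sample values.

```shell
commas() {
    sed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta'
}

whole=$(echo '1234567' | commas)
decimal=$(echo '1234.5678' | commas)   # digits after the point are left alone

# Six-character palindrome turned into a repeated string
palin=$(echo 'abccba' | sed -r 's/(.)(.)(.)\3\2\1/\1\2\3\1\2\3/g')

# Whole line repeated on the same line
twice=$(echo 'ab' | sed -r 's/(.*)/\1\1/')

printf '%s\n%s\n%s\n%s\n' "$whole" "$decimal" "$palin" "$twice"
```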

Inserting text with "i" and "a"

Example
    Meaning

sed '4i\
my text 1\
my text 2' in_file
    Inserts two lines before line 4
sed '/pat1/ i\
my text 1\
my text 2' in_file
    Inserts two lines before every line that contains pat1
sed '4a\
my text 1\
my text 2' in_file
    Inserts two lines after line 4
sed '$a\
my text 1\
my text 2' in_file
    Appends two lines after the last line of the file
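In the multi-line form each text line except the last ends with a backslash. A sketch assuming GNU sed; the two-line input is a placeholder.

```shell
# Insert two lines BEFORE line 2
inserted=$(printf 'line 1\nline 2\n' | sed '2i\
my text 1\
my text 2')

# Append two lines AFTER the last line
appended=$(printf 'line 1\nline 2\n' | sed '$a\
my text 1\
my text 2')

printf '%s\n---\n%s\n' "$inserted" "$appended"
```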

Changing text

Example
    Meaning

sed '4c\
my text 1\
my text 2' in_file
    Replace line 4 with the two new lines
sed '/pat1/ c\
my text 1\
my text 2' in_file
    Replace all lines containing pat1 with the two new lines
sed '/pat1/,/pat2/ c\
my text 1\
my text 2' in_file
    Replace all lines between pat1 and pat2 with the two new lines

Change (c) affects the whole line. Substitute (s) affects just the matching words or strings on the line.

Deleting lines

Command
    Result

sed '10d' in_file
    Delete the 10th line
sed '1,10d' in_file
    Delete lines 1 to 10
sed '10,$d' in_file
    Delete lines from 10 to end of file
sed '10,+3d' in_file
    Delete line 10 and 3 lines below (lines 10, 11, 12 and 13)
sed '10~3d' in_file
    Delete every third line starting with line 10
sed '10,~3d' in_file
    Delete line 10 and up to the next multiple of 3 (lines 10, 11 and 12)
sed '/pat1/d' in_file
    Delete all lines containing pat1
sed '/pat1/,+3d' in_file
    Delete all lines containing pat1 and three lines following it
sed '/pat1/,~3d' in_file
    Delete lines containing pat1 and up to the next multiple of 3
sed '/pat1/!d' in_file
    Delete all lines NOT containing pat1
sed -r '/[0-9]{8,}/d' in_file
    Delete lines containing 8 or more consecutive digits

Printing lines

Command
    Result

sed -n '10,/pat1/p' in_file
    Print lines from 10 to the next line containing pat1
sed -n '10,+3p' in_file
    Print line 10 and 3 lines below (lines 10, 11, 12 and 13)
sed -n '10~3p' in_file
    Print every third line starting with line 10
sed -n '10,~3p' in_file
    Print line 10 and up to the next multiple of 3 (lines 10, 11 and 12)
sed -n '/pat1/p' in_file
    Print all lines containing pat1
sed -n '/pat1/,+3p' in_file
    Print all lines containing pat1 and three lines following it
sed -n '/pat1/,~3p' in_file
    Print lines containing pat1 and up to the next multiple of 3
sed -n '/pat1/,/pat2/p' in_file
    Print every block of lines that starts with a line containing pat1 and ends with a line containing pat2
sed -n '/pat1/!p' in_file
    Print all lines NOT containing pat1
sed -n '/([0-9]\{3\})[0-9]\{3\}-[0-9]\{4\}/p' in_file
    Print lines containing phone numbers

A note on the -r option

Command
    Result

sed -n '/([0-9]\{3\})[0-9]\{3\}-[0-9]\{4\}/p' in_file
    Print lines containing phone numbers like (408)806-8330, (408)349-3699
sed -nr '/\([0-9]{3}\)[0-9]{3}-[0-9]{4}/p' in_file
    Same as above with the -r option. Note the escaping of parentheses. Without escaping, ( and ) become meta characters, not part of the search string.

Printing with and without "-n" option - Comparison

Command
    Comments

sed '1,10p' fn
    Lines 1 to 10 are printed twice. The rest of the lines are printed once
sed -n '1,10p' fn
    Lines 1 to 10 are printed just once
sed 's/pat1/pat2/p' fn
    Lines containing pat1 are printed twice after substituting pat1 with pat2. Other lines are printed once
sed -n 's/pat1/pat2/p' fn
    Only lines containing pat1 are printed after substituting pat1 with pat2
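The doubling is easy to see on three numbered lines. A sketch assuming GNU sed; the input is a placeholder.

```shell
# Without -n: the p command prints lines 1-2, then every line is printed again
twice=$(seq 1 3 | sed '1,2p' | paste -s -d ' ' -)

# With -n: only what p prints appears
once=$(seq 1 3 | sed -n '1,2p' | paste -s -d ' ' -)

# The p flag of s prints only lines where the substitution happened
changed=$(printf 'pat1 a\nother\n' | sed -n 's/pat1/pat2/p')

printf '%s\n%s\n%s\n' "$twice" "$once" "$changed"
```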

Inserting/reading from a file

Command
    Result

sed '10r my_file' in_file
    Insert my_file after the 10th line
sed '1,10r my_file' in_file
    Insert my_file after each line between 1 and 10
sed '$r my_file' in_file
    Append my_file at the end
sed '10,+3r my_file' in_file
    Insert my_file after every line between 10 and 13
sed '10~3r my_file' in_file
    Insert my_file after every third line starting with line 10
sed '/pat1/,/pat2/r my_file' in_file
    Insert my_file after every line between pat1 and pat2

Writing selectively

Command
    Result

sed '1,10w nf' in_file
    Write lines 1 to 10 to new file "nf"
sed '/^pat1/w nf' in_file
    Write all lines beginning with pat1 to nf
sed '/pat1$/w nf' in_file
    Write lines that end with pat1 to nf
sed '/\<pat1\>/w nf' in_file
    Write lines that contain the WORD pat1 to nf
sed -e '1~2w odd_lines' -e '2~2w even_lines' in_file
    Write odd numbered lines to odd_lines, and even numbered lines to even_lines
sed '/pat1/,/pat2/w nf' in_file
    Write all lines between pat1 and pat2 to nf
sed -r '/[0-9]{3}-[0-9]{2}-[0-9]{4}/w ssn' in_file
    Write lines containing a Social Security Number nnn-nn-nnnn to the file ssn. Note the use of the -r option. Otherwise, { and } have to be escaped.
sed -r '/(.)(.)(.)\3\2\1/w palin.txt' in_file
    Write lines containing six-character palindromes to palin.txt
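Splitting a stream into odd and even lines shows the w command with two output files. A sketch assuming GNU sed; the temporary directory is hypothetical, and -n is added here only to keep the terminal quiet.

```shell
workdir=$(mktemp -d)

# Each -e expression ends its own "w" file name
seq 1 6 | sed -n -e "1~2w $workdir/odd_lines" -e "2~2w $workdir/even_lines"

odd=$(paste -s -d ' ' "$workdir/odd_lines")
even=$(paste -s -d ' ' "$workdir/even_lines")

echo "odd: $odd | even: $even"
rm -r "$workdir"
```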

Transliterating (like 'tr')

Command
    Result

y/source/dest/
    One to one, character by character substitution. The source and dest strings have to be of the same length
sed 'y/13579/aeiou/' fn
    Replace 1 by a, 3 by e, and so on
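A quick check of the one-to-one mapping, as a sketch with made-up input:

```shell
# Each character of the source list maps to the character at the same position
mapped=$(echo '13579 2468' | sed 'y/13579/aeiou/')

# Swapping two characters in one pass, which chained s commands cannot do directly
swapped=$(echo 'abba' | sed 'y/ab/ba/')

printf '%s\n%s\n' "$mapped" "$swapped"
```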

Quitting from sed

Command
    Result

sed '15q' in_file
    Quit after the 15th line. Prints the first 15 lines
sed '/\<[Ee]nd\>/q' in_file
    Quit after encountering the word End or end
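Unless -n is given, q prints the current line before quitting, so the line that triggers the quit is part of the output. A sketch assuming GNU sed; the inputs are placeholders.

```shell
# Numeric address: count the lines that come out
count=$(seq 1 20 | sed '15q' | wc -l)

# Pattern address: the matching line is the last one printed
upto_end=$(printf 'start\nThe End\nafter\n' | sed '/\<[Ee]nd\>/q')

echo "lines printed: $count"
printf '%s\n' "$upto_end"
```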

Grouping multiple commands with { and }

sed -n '/pat1/,/pat2/ {
s/pat3/pat4/g
s/pat5/pat6/g
w new_file
p
}' inp_file
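A runnable variant of the grouped command, with BEGIN/END, cat/dog and the file names standing in for the pat1..pat6 placeholders. It is a sketch assuming GNU sed and runs inside a temporary directory so that new_file does not land in the current one.

```shell
workdir=$(mktemp -d)
printf 'x cat\nBEGIN cat\ncat dog\nEND dog\ny dog\n' > "$workdir/inp_file"

# All four commands apply only to lines inside the BEGIN..END range
shown=$(cd "$workdir" && sed -n '/BEGIN/,/END/ {
s/cat/lion/g
s/dog/wolf/g
w new_file
p
}' inp_file)

saved=$(cat "$workdir/new_file")
printf '%s\n' "$shown"
rm -r "$workdir"
```

The printed lines and the contents of new_file are the same three lines; the lines outside the range are neither changed nor printed.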

One liners

Command
    Result

sed 'p' in_file
    Duplicate all the lines, one below the other
sed '/^$/ d' in_file
    Delete the blank lines from the input file
sed 's/./&:/80'
    Add a colon after the 80th character (replace the eightieth occurrence of the pattern with itself and a colon)
sed -r 's/^(.*) \1/\1/' fn
    Replace a string that is repeated at the start of the line with a single copy
sed G in_file
    Add a blank line after every line
find . -type f -name 'my_files.*.sql' -exec sed -i 's/TableA/TableB/g' {} +
    Replace all occurrences of TableA with TableB in all my SQL scripts

More one-liners: http://www.catonmat.net/blog/sed-one-liners-explained-part-one/
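Several of these one-liners can be verified on tiny inputs. A sketch assuming GNU sed; the occurrence number 3 stands in for 80 so the sample stays short.

```shell
# Duplicate every line
doubled=$(printf 'a\nb\n' | sed 'p' | paste -s -d ' ' -)

# Delete blank lines
no_blank=$(printf 'a\n\nb\n' | sed '/^$/d' | paste -s -d ' ' -)

# Add a colon after the 3rd character (3rd occurrence of ".")
colon=$(echo 'abcdef' | sed 's/./&:/3')

# Collapse a string repeated at the start of the line
dedup=$(echo 'abc abc' | sed -r 's/^(.*) \1/\1/')

printf '%s\n%s\n%s\n%s\n' "$doubled" "$no_blank" "$colon" "$dedup"
```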

Additional concepts not covered

labels, branching to them
h, H: copy/append pattern space to hold space
g, G: copy/append hold space to pattern space
x: exchange the contents of the hold and pattern spaces

References

sed & awk by Dale Dougherty & Arnold Robbins
http://www.grymoire.com/Unix/Sed.html
http://sed.sourceforge.net/sedfaq3.html
http://en.wikipedia.org/wiki/Sed
http://groups.yahoo.com/group/sed-users/
http://sed.sourceforge.net/sed1line.txt

Q & A

http://twiki.corp.yahoo.com/view/Main/LpalaniYahoo

Devel-sed@yahoo-inc.com

Unanswered questions

How to simulate tail with sed?
How to substitute the nth to mth occurrences of pat1 with pat2?
How to substitute the last N occurrences or the nth occurrence from the end?
How to identify a palindrome of any length?
