Shell Practice: Introduction to the sed stream editor
Searching
You can use the search function, among other things, to select the text sections you want to replace. The search pattern acts as the address, and it can be a regular expression. Table 5 shows some of the possibilities, and Table 6 provides some examples. In Table 6, sed reads its input both from a pipe and directly from the text file. With composite addressing (two addresses separated by a comma), the statement is applied to every line from the first line matching the first address up to and including the next line matching the second address.
Table 5
Patterns and Addressing
Action | Pattern |
---|---|
All lines | (null) |
Line 25 | 25 |
Not line 25 | 25! |
Lines 10 through 20 | 10,20 |
Last line | $ |
Not pattern | '/PATTERN/!' |
Character at beginning of line | ^CHAR |
String | /STRING/ |
Character set | [CHARS] |
Any letter | [[:alpha:]] |
Lowercase | [[:lower:]] |
Uppercase | [[:upper:]] |
Alphanumeric | [[:alnum:]] |
Digit | [[:digit:]] |
Hexadecimal digit | [[:xdigit:]] |
Tab and space | [[:blank:]] |
Whitespace (space, tab, newline, and so on) | [[:space:]] |
Control character | [[:cntrl:]] |
Printable characters (no control characters) | [[:print:]] |
Visible characters (without spaces) | [[:graph:]] |
Punctuation | [[:punct:]] |
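The addressing forms in Table 5 can be tried out with a short sketch. The file sample.txt and its contents are hypothetical and exist only for this illustration:

```shell
# Create a small hypothetical sample file in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' alpha START beta gamma END delta > "$tmpdir/sample.txt"

# Single line address: print only line 2.
line2=$(sed -n '2p' "$tmpdir/sample.txt")

# Numeric range: print lines 2 through 4.
range=$(sed -n '2,4p' "$tmpdir/sample.txt")

# Pattern range: from the first line matching START
# up to and including the next line matching END.
block=$(sed -n '/START/,/END/p' "$tmpdir/sample.txt")

# Negated pattern range: every line outside that block.
outside=$(sed -n '/START/,/END/!p' "$tmpdir/sample.txt")

printf '%s\n' "$block"
rm -r "$tmpdir"
```

Note that a pattern range is only opened by a line matching the first address. A line matching the second address has no effect unless a range is already open.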
Table 6
Sample Searches and Patterns
Search for | Pattern | Example | Figure |
---|---|---|---|
Term, Name | '/TERM/' | cat textdata.txt | sed -n '/Meier/p' | - |
All lines containing "man" or "Man" | '/[Mm]an/p' | sed -n '/[Mm]an/p' textdata.txt | Figure 2 |
All lines except 3 through 5 | '3,5!' | sed -n '3,5!'p textdata.txt | Figure 3 |
All lines except those containing "Man" | '/Man/!' | sed -n '/Man/!'p textdata.txt | Figure 4 |
Lines containing "H" or "G" | '/[HG]/' | sed -n '/[HG]/p' textdata.txt | - |
Lines not containing "H" or "G" | '/[H]\|[G]/!' | sed -n '/[H]\|[G]/!'p textdata.txt | Figure 5 |
Line 3 | 3 | cat textdata.txt | sed -n '3p' | - |
Last line | '$p' | cat textdata.txt | sed -n '$p' | - |
Address range: Output only the lines outside the block that starts at a line containing "R" and ends at the next line containing "M" (each followed by any character) | '/[R]./,/[M]./!' | sed -n '/[R]./,/[M]./!'p textdata.txt | Figure 6 |
All lines containing at least one alphanumeric character (skips empty and whitespace-only lines) | '/[[:alnum:]]/' | cat textdata.txt | sed -n '/[[:alnum:]]/'p | Figure 7 |
Note that in Figure 6, the j comes before J in the text file. In the first example in Table 6, none of the lines containing H and J are output, which works because the order in the command and text file are the same. The second example with the negated H and j shows, however, that a line containing H must first be found. That's why johann still appears in the output!
If you want to be certain that sed is doing what you need it to do, you can chain several calls in a pipe. The following command suppresses empty lines and lines containing "Man" (see Figure 8):
cat textdata.txt | sed -n '/[[:alnum:]]/p' | sed -n '/Man/!p'
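As a rough sketch of this kind of chained filtering, the following uses a hypothetical file heroes.txt (not the textdata.txt from the article) that contains an empty line and a whitespace-only line. A character class such as [:alnum:] only works inside a bracket expression, hence the doubled brackets:

```shell
# Hypothetical sample data with an empty and a whitespace-only line.
tmpdir=$(mktemp -d)
printf '%s\n' 'Iron Man' '' 'Batman' '   ' 'Hulk 42' > "$tmpdir/heroes.txt"

# Keep only lines with at least one alphanumeric character.
# Outer [] is the bracket expression, inner [:alnum:] names the class.
nonblank=$(sed -n '/[[:alnum:]]/p' "$tmpdir/heroes.txt")

# Chain a second sed call that drops lines containing "Man".
chained=$(sed -n '/[[:alnum:]]/p' "$tmpdir/heroes.txt" | sed -n '/Man/!p')

# The same filter in one sed call with two -e expressions:
# delete lines without alphanumerics, then print lines without "Man".
single=$(sed -n -e '/[[:alnum:]]/!d' -e '/Man/!p' "$tmpdir/heroes.txt")

printf '%s\n' "$chained"
rm -r "$tmpdir"
```

The search is case sensitive, so "Batman" survives the second filter while "Iron Man" does not.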
Substituting and Removing
You use the s instruction to replace matched expressions. The search and replace strings do not have to be the same length. Figure 9 shows the detailed syntax.
You can limit the search and replace statement to a specific line by preceding the command with the line number, as shown in the following example:
sed -n '5s/OLD/NEW/p' [TEXTFILE]
Or, for a range of lines:
sed -n '1,4s/OLD/NEW/p' [TEXTFILE]
You can also suppress changes to certain lines using the exclamation point:
sed -n '20,80!s/OLD/NEW/p' [TEXTFILE]
Furthermore, you can limit changes to lines that contain a certain string or pattern, which does not have to be the same as the search string:
sed -n '/PATTERN/s/OLD/NEW/gp' [TEXTFILE]
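The four addressing variants can be compared side by side in a small sketch. The file lines.txt and its contents are made up for this purpose:

```shell
# Hypothetical four-line sample file.
tmpdir=$(mktemp -d)
printf '%s\n' 'old one' 'old two' 'keep old three' 'old four' > "$tmpdir/lines.txt"

# Substitute on line 2 only.
single=$(sed -n '2s/old/NEW/p' "$tmpdir/lines.txt")

# Substitute on lines 1 through 2 (note the comma and the s).
range=$(sed -n '1,2s/old/NEW/p' "$tmpdir/lines.txt")

# Substitute on every line except lines 1 through 2.
negated=$(sed -n '1,2!s/old/NEW/p' "$tmpdir/lines.txt")

# Substitute only on lines that contain "keep".
matched=$(sed -n '/keep/s/old/NEW/p' "$tmpdir/lines.txt")

printf '%s\n' "$negated"
rm -r "$tmpdir"
```

Because of -n and the p flag, each call prints only the lines on which a substitution actually took place.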
To delete the matched string, replace it with an empty string, as in s/OLD//.
By default, only the first occurrence of the search string on each line is processed. To replace all instances, add the g (global) option at the end of the statement. With the -n option set, sed prints nothing on its own, so add the p (print) option if you want to see the changed lines. You can also write results to an output file with the w (write) option. Table 7 shows some short examples.
Table 7
Sample Search and Replace Statements
Action | Example | Figure |
---|---|---|
Replace pattern at the first occurrence only | cat textdata.txt | sed -n 's/e/E/p' | Figure 10 |
Replace pattern at every occurrence | cat textdata.txt | sed -n 's/e/E/gp' | Figure 10 |
Delete the word "Man" | sed -n 's/Man//gp' textdata.txt | Figure 11 |
Replace "Iron" with "Tin" on line 4 | cat textdata.txt | sed -n '4s/Iron/Tin/gp' | Figure 12 |
Replace "0" with "089" on all lines containing "Man" or "man" | sed -n '/[Mm]an/s/0/089/gp' textdata.txt | Figure 13 |
Replace "0" with "089" on all lines except those containing "Man" or "man" | sed -n '/[Mm]an/!s/0/089/gp' textdata.txt | Figure 14 |
Delete all digits, slashes (/), and hyphens (-) | cat textdata.txt | sed -n 's/[0-9\/-]//gp' | Figure 15 |
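The effect of the flags can be checked with a minimal sketch that works on short literal strings rather than on textdata.txt. The output file name changed.txt is an arbitrary choice for this illustration:

```shell
tmpdir=$(mktemp -d)

# Without g, only the first match on each line is replaced.
first=$(echo 'a-b-c' | sed 's/-/+/')

# With g, every match on the line is replaced.
all=$(echo 'a-b-c' | sed 's/-/+/g')

# An empty replacement deletes the matches.
gone=$(echo 'a-b-c' | sed 's/-//g')

# With w, lines on which a substitution happened are written to a file.
# The file name runs to the end of the script, so w must come last.
printf '%s\n' 'a-b' 'cd' | sed -n "s/-/+/w $tmpdir/changed.txt"
written=$(cat "$tmpdir/changed.txt")

printf '%s\n' "$first" "$all" "$gone" "$written"
rm -r "$tmpdir"
```

Only 'a-b' ends up in the output file, because the second input line contains no hyphen and is therefore not changed.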
You can see a more complex example in Listing 1. It converts the inconsistently formatted date syntax in testlist.txt to a common, unified, albeit European (DD/MM/YYYY) format. Be sure to press the Enter key immediately after the \ at the end of each line. Alternatively, you can omit the backslash and end each line with the pipe character, which also connects it to the line that follows; however, this results in a less clear screen display.
Line 1 reads the list and feeds it into the pipeline that starts in line 2. Line 2 replaces any leading space character with the digit 0. Line 3 replaces any minus signs in dates with spaces. Lines 4 and 5 substitute any month written as a word with its numeric value followed by a dot. Line 6 matches any two-digit number at the beginning of a line (^), with the first digit being 1 through 3 and the second any digit, together with the space character, and replaces the match with itself (&) followed by a dot.
To reuse part of the search pattern in the replacement, enclose it in parentheses, which you must escape with \. The sed statement in line 7 deletes all remaining space characters (through the g option for s).
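The two mechanisms just described, & for the whole match and escaped parentheses for parts of it, can be sketched as follows. The input strings are invented and are not taken from Listing 1:

```shell
# & stands for the entire matched text: pad a leading
# single-digit day (digit plus space) with a zero.
padded=$(echo '7 March' | sed 's/^[0-9] /0&/')

# Escaped parentheses \( \) capture groups that the replacement
# can reference as \1, \2, \3: reorder YYYY-MM-DD into DD.MM.YYYY.
swapped=$(echo '2024-03-07' | sed 's/\([0-9]*\)-\([0-9]*\)-\([0-9]*\)/\3.\2.\1/')

printf '%s\n' "$padded" "$swapped"
```

The groups are numbered by the position of their opening parenthesis, so \1 always refers to the leftmost group.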
The uniq command on the last line removes adjacent duplicate lines, so that each line is output only once. Figure 16 shows the results. You can also "carry over" all or part of the original string into the replacement pattern. Check out the following example:
echo "happy" | sed -n s'/happy/un&/'p
This example replaces happy with unhappy. You can also convert from lowercase to uppercase:
cat textdata.txt | sed -n s'/\([[:lower:]]\)/\U&/'pg
The \U before the & (a GNU sed extension) indicates that the output must be converted into uppercase. You can do the following:
cat textdata.txt | sed -n s'/\([[:upper:]]\)/\L&/'pg
to convert from uppercase to lowercase.
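A compact way to try both conversions is shown below. This sketch assumes GNU sed, because \U and \L are GNU extensions; the tr call is included as a portable alternative and is not part of the original examples:

```shell
# GNU sed: \U converts the matched text (&) to uppercase.
upper=$(echo 'iron man' | sed 's/[[:lower:]]/\U&/g')

# GNU sed: \L converts the matched text (&) to lowercase.
lower=$(echo 'IRON MAN' | sed 's/[[:upper:]]/\L&/g')

# Portable alternative where \U and \L are not available.
upper_tr=$(echo 'iron man' | tr '[:lower:]' '[:upper:]')

printf '%s\n' "$upper" "$lower" "$upper_tr"
```

The escaped parentheses from the examples above are not needed here, because & already refers to the whole match.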