.ds :? Advanced Editing .de PT .lt \\n(LLu .pc % .nr PN \\n% .if \\n%-1 .if o .tl '\s9\f2\*(:?\fP''\\n(PN\s0' .if \\n%-1 .if e .tl '\s9\\n(PN''\f2\*(:?\^\fP\s0' .lt \\n(.lu .. .tr _\(em .de UL .if n .ul .if n \\$3\\$1\\$2 .if t \\$3\f3\\$1\fP\\$2 .. .de IT .if t \\$3\f2\\$1\fP\\$2 .if n .ul .if n \\$3\\$1\\$2 .. .de UI \f3\\$1\fI\\$2\fR\\$3 .. .de P1 .if n .ls 1 .nf .if n .ta 5 10 15 20 25 30 35 40 45 50 55 60 .if t .ta .3i .6i .9i 1.2i 1.5i 1.8i .tr -\- . use first argument as indent if present .if \\n(.$ .DS I \\$1 .if !\\n(.$ .DS I 5 .. .de P2 .DE .tr -- .if n .ls 2 .. .if t .ds B \s6\|\v'.1m'\(sq\v'-.1m'\|\s0 .if n .ds B [] .if t .ds m \(mi .if n .ds m - .if t .ds n \(no .if n .ds n - .if t .ds s \v'.41m'\s+4*\s-4\v'-.41m' .if n .ds s * .if t .ds S \(sl .if n .ds S / .if t .ds d \s+4\&.\&\s-4 .if n .ds d \&.\& .if t .ds a \z@@ .if n .ds a @ .if t .ds . \s+2\fB.\fP\s-2 .if n .ds . . .if t .ds e \z\e\h'2u'\e .if n .ds e \e . 2=not last lines; 4= no -xx; 8=no xx- .tr *\(** .hy 14 .TL Advanced Editing on U\s-2NIX\s+2 .AU "MH 2C518" 6021 Brian W. Kernighan .AI .MH .AB .nr PS 9 .nr VS 11 This paper is meant to help secretaries, typists and programmers to make effective use of the .UX facilities for preparing and editing text. It provides explanations and examples of .IP \(bu special characters, line addressing and global commands in the editor .UL ed ; .IP \(bu commands for ``cut and paste'' operations on files and parts of files, including the .UL mv , .UL cp , .UL cat and .UL rm commands, and the .UL r , .UL w , .UL m and .UL t commands of the editor; .IP \(bu editing scripts and editor-based programs like .UL grep and .UL sed . .PP Although the treatment is aimed at non-programmers, new .UC UNIX users with any background should find helpful hints on how to get their jobs done more easily. 
.AE .CS 16 0 16 0 0 3 .if n .ls 2 .if t .2C .NH INTRODUCTION .PP Although .UX provides remarkably effective tools for text editing, that by itself is no guarantee that everyone will automatically make the most effective use of them. In particular, people who are not computer specialists _ typists, secretaries, casual users _ often use the system less effectively than they might. .PP This document is intended as a sequel to .ul A Tutorial Introduction to the UNIX Text Editor [1], providing explanations and examples of how to edit with less effort. (You should also be familiar with the material in .ul UNIX For Beginners [2].) Further information on all commands discussed here can be found in .ul The UNIX Programmer's Manual [3]. .PP Examples are based on observations of users and the difficulties they encounter. Topics covered include special characters in searches and substitute commands, line addressing, the global commands, and line moving and copying. There are also brief discussions of effective use of related tools, like those for file manipulation, and those based on .UL ed , like .UL grep and .UL sed . .PP A word of caution. There is only one way to learn to use something, and that is to .ul use it. Reading a description is no substitute for trying something. A paper like this one should give you ideas about what to try, but until you actually try something, you will not learn it. .NH SPECIAL CHARACTERS .PP The editor .UL ed is the primary interface to the system for many people, so it is worthwhile to know how to get the most out of .UL ed for the least effort. .PP The next few sections will discuss shortcuts and labor-saving devices. Not all of these will be instantly useful to any one person, of course, but a few will be, and the others should give you ideas to store away for future use. And as always, until you try these things, they will remain theoretical knowledge, not something you have confidence in. 
.SH The List command `l' .PP .UL ed provides two commands for printing the contents of the lines you're editing. Most people are familiar with .UL p , in combinations like .P1 1,$p .P2 to print all the lines you're editing, or .P1 s/abc/def/p .P2 to change `abc' to `def' on the current line. Less familiar is the .ul list command .UL l (the letter `\fIl\|\fR'), which gives slightly more information than .UL p . In particular, .UL l makes visible characters that are normally invisible, such as tabs and backspaces. If you list a line that contains some of these, .UL l will print each tab as .UL \z\(mi> and each backspace as .UL \z\(mi< . This makes it much easier to correct the sort of typing mistake that inserts extra spaces adjacent to tabs, or inserts a backspace followed by a space. .PP The .UL l command also `folds' long lines for printing _ any line that exceeds 72 characters is printed on multiple lines; each printed line except the last is terminated by a backslash .UL \*e , so you can tell it was folded. This is useful for printing long lines on short terminals. .PP Occasionally the .UL l command will print in a line a string of numbers preceded by a backslash, such as \*e07 or \*e16. These combinations are used to make visible characters that normally don't print, like form feed or vertical tab or bell. Each such combination is a single character. When you see such characters, be wary _ they may have surprising meanings when printed on some terminals. Often their presence means that your finger slipped while you were typing; you almost never want them. .SH The Substitute Command `s' .PP Most of the next few sections will be taken up with a discussion of the substitute command .UL s . Since this is the command for changing the contents of individual lines, it probably has the most complexity of any .UL ed command, and the most potential for effective use. .PP As the simplest place to begin, recall the meaning of a trailing .UL g after a substitute command. 
With .P1 s/this/that/ .P2 and .P1 s/this/that/g .P2 the first one replaces the .ul first `this' on the line with `that'. If there is more than one `this' on the line, the second form with the trailing .UL g changes .ul all of them. .PP Either form of the .UL s command can be followed by .UL p or .UL l to `print' or `list' (as described in the previous section) the contents of the line: .P1 s/this/that/p s/this/that/l s/this/that/gp s/this/that/gl .P2 are all legal, and mean slightly different things. Make sure you know what the differences are. .PP Of course, any .UL s command can be preceded by one or two `line numbers' to specify that the substitution is to take place on a group of lines. Thus .P1 1,$s/mispell/misspell/ .P2 changes the .ul first occurrence of `mispell' to `misspell' on every line of the file. But .P1 1,$s/mispell/misspell/g .P2 changes .ul every occurrence in every line (and this is more likely to be what you wanted in this particular case). .PP You should also notice that if you add a .UL p or .UL l to the end of any of these substitute commands, only the last line that got changed will be printed, not all the lines. We will talk later about how to print all the lines that were modified. .SH The Undo Command `u' .PP Occasionally you will make a substitution in a line, only to realize too late that it was a ghastly mistake. The `undo' command .UL u lets you `undo' the last substitution: the last line that was substituted can be restored to its previous state by typing the command .P1 u .P2 .SH The Metacharacter `\*.' .PP As you have undoubtedly noticed when you use .UL ed , certain characters have unexpected meanings when they occur in the left side of a substitute command, or in a search for a particular line. In the next several sections, we will talk about these special characters, which are often called `metacharacters'. .PP The first one is the period `\*.'. On the left side of a substitute command, or in a search with `/.../', `\*.' 
stands for .ul any single character. Thus the search .P1 /x\*.y/ .P2 finds any line where `x' and `y' occur separated by a single character, as in .P1 x+y x\-y x\*By x\*.y .P2 and so on. (We will use \*B to stand for a space whenever we need to make it visible.) .PP Since `\*.' matches a single character, that gives you a way to deal with funny characters printed by .UL l . Suppose you have a line that, when printed with the .UL l command, appears as .P1 .... th\*e07is .... .P2 and you want to get rid of the \*e07 (which represents the bell character, by the way). .PP The most obvious solution is to try .P1 s/\*e07// .P2 but this will fail. (Try it.) The brute force solution, which most people would now take, is to re-type the entire line. This is guaranteed, and is actually quite a reasonable tactic if the line in question isn't too big, but for a very long line, re-typing is a bore. This is where the metacharacter `\*.' comes in handy. Since `\*e07' really represents a single character, if we say .P1 s/th\*.is/this/ .P2 the job is done. The `\*.' matches the mysterious character between the `h' and the `i', .ul whatever it is. .PP Bear in mind that since `\*.' matches any single character, the command .P1 s/\*./,/ .P2 converts the first character on a line into a `,', which very often is not what you intended. .PP As is true of many characters in .UL ed , the `\*.' has several meanings, depending on its context. This line shows all three: .P1 \&\*.s/\*./\*./ .P2 The first `\*.' is a line number, the number of the line we are editing, which is called `line dot'. (We will discuss line dot more in Section 3.) The second `\*.' is a metacharacter that matches any single character on that line. The third `\*.' is the only one that really is an honest literal period. On the .ul right side of a substitution, `\*.' is not special. If you apply this command to the line .P1 Now is the time\*. .P2 the result will be .P1 \&\*.ow is the time\*. 
.P2 which is probably not what you intended. .SH The Backslash `\*e' .PP Since a period means `any character', the question naturally arises of what to do when you really want a period. For example, how do you convert the line .P1 Now is the time\*. .P2 into .P1 Now is the time? .P2 The backslash `\*e' does the job. A backslash turns off any special meaning that the next character might have; in particular, `\*e\*.' converts the `\*.' from a `match anything' into a period, so you can use it to replace the period in .P1 Now is the time\*. .P2 like this: .P1 s/\*e\*./?/ .P2 The pair of characters `\*e\*.' is considered by .UL ed to be a single real period. .PP The backslash can also be used when searching for lines that contain a special character. Suppose you are looking for a line that contains .P1 \&\*.PP .P2 The search .P1 /\*.PP/ .P2 isn't adequate, for it will find a line like .P1 THE APPLICATION OF ... .P2 because the `\*.' matches the letter `A'. But if you say .P1 /\*e\*.PP/ .P2 you will find only lines that contain `\*.PP'. .PP The backslash can also be used to turn off special meanings for characters other than `\*.'. For example, consider finding a line that contains a backslash. The search .P1 /\*e/ .P2 won't work, because the `\*e' isn't a literal `\*e', but instead means that the second `/' no longer \%delimits the search. But by preceding a backslash with another one, you can search for a literal backslash. Thus .P1 /\*e\*e/ .P2 does work. Similarly, you can search for a forward slash `/' with .P1 /\*e// .P2 The backslash turns off the meaning of the immediately following `/' so that it doesn't terminate the /.../ construction prematurely. .PP As an exercise, before reading further, find two substitute commands each of which will convert the line .P1 \*ex\*e\*.\*ey .P2 into the line .P1 \*ex\*ey .P2 .PP Here are several solutions; verify that each works as advertised. 
.P1 s/\*e\*e\*e\*.// s/x\*.\*./x/ s/\*.\*.y/y/ .P2 .PP A couple of miscellaneous notes about backslashes and special characters. First, you can use any character to delimit the pieces of an .UL s command: there is nothing sacred about slashes. (But you must use slashes for context searching.) For instance, in a line that contains a lot of slashes already, like .P1 //exec //sys.fort.go // etc... .P2 you could use a colon as the delimiter _ to delete all the slashes, type .P1 s:/::g .P2 .PP Second, if # and @ are your character erase and line kill characters, you have to type \*e# and \*e@; this is true whether you're talking to .UL ed or any other program. .PP When you are adding text with .UL a or .UL i or .UL c , backslash is not special, and you should only put in one backslash for each one you really want. .SH The Dollar Sign `$' .PP The next metacharacter, the `$', stands for `the end of the line'. As its most obvious use, suppose you have the line .P1 Now is the .P2 and you wish to add the word `time' to the end. Use the $ like this: .P1 s/$/\*Btime/ .P2 to get .P1 Now is the time .P2 Notice that a space is needed before `time' in the substitute command, or you will get .P1 Now is thetime .P2 .PP As another example, replace the second comma in the following line with a period without altering the first: .P1 Now is the time, for all good men, .P2 The command needed is .P1 s/,$/\*./ .P2 The $ sign here provides context to make specific which comma we mean. Without it, of course, the .UL s command would operate on the first comma to produce .P1 Now is the time\*. for all good men, .P2 .PP As another example, to convert .P1 Now is the time\*. .P2 into .P1 Now is the time? .P2 as we did earlier, we can use .P1 s/\*.$/?/ .P2 .PP Like `\*.', the `$' has multiple meanings depending on context. 
In the line
.P1
$s/$/$/
.P2
the first `$' refers to the last line of the file,
the second refers to the end of that line,
and the third is a literal dollar sign,
to be added to that line.
.SH
The Circumflex `^'
.PP
The circumflex (or hat or caret) `^' stands for the beginning of the line.
For example, suppose you are looking for a line that begins with `the'.
If you simply say
.P1
/the/
.P2
you will in all likelihood find several lines that contain `the'
in the middle before arriving at the one you want.
But with
.P1
/^the/
.P2
you narrow the context, and thus arrive at the desired one more easily.
.PP
The other use of `^' is of course to enable you to insert
something at the beginning of a line:
.P1
s/^/\*B/
.P2
places a space at the beginning of the current line.
.PP
Metacharacters can be combined.
To search for a line that contains
.ul
only
the characters
.P1
\&\*.PP
.P2
you can use the command
.P1
/^\*e\*.PP$/
.P2
.SH
The Star `*'
.PP
Suppose you have a line that looks like this:
.P1
\fItext \fR x          y \fI text \fR
.P2
where
.ul
text
stands for lots of text,
and there is some indeterminate number of spaces between the
.UL x
and the
.UL y .
Suppose the job is to replace all the spaces between
.UL x
and
.UL y
by a single space.
The line is too long to retype,
and there are too many spaces to count.
What now?
.PP
This is where the metacharacter `*' comes in handy.
A character followed by a star stands for
as many consecutive occurrences of that character as possible.
To refer to all the spaces at once, say
.P1
s/x\*B*y/x\*By/
.P2
The construction `\*B*' means `as many spaces as possible'.
Thus `x\*B*y' means `an x, as many spaces as possible, then a y'.
.PP
The star can be used with any character, not just space.
If the original example was instead
.P1
\fItext \fR x--------y \fI text \fR
.P2
then all `\-' signs can be replaced by a single space
with the command
.P1
s/x-*y/x\*By/
.P2
.PP
Finally, suppose that the line was
.P1
\fItext \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fR
.P2
Can you see what trap lies in wait for the unwary?
If you blindly type
.P1
s/x\*.*y/x\*By/
.P2
what will happen?
The answer, naturally, is that it depends.
If there are no other x's or y's on the line,
then everything works, but it's blind luck, not good management.
Remember that `\*.' matches
.ul
any
single character?
Then `\*.*' matches as many single characters as possible,
and unless you're careful, it can eat up a lot more of the line
than you expected.
If the line was, for example, like this:
.P1
\fItext \fRx\fI text \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fRy\fI text \fR
.P2
then saying
.P1
s/x\*.*y/x\*By/
.P2
will take everything from the
.ul
first
`x' to the
.ul
last
`y', which, in this example, is undoubtedly more than you wanted.
.PP
The solution, of course, is to turn off the special meaning of `\*.'
with `\*e\*.':
.P1
s/x\*e\*.*y/x\*By/
.P2
Now everything works, for `\*e\*.*' means `as many
.ul
periods
as possible'.
.PP
There are times when the pattern `\*.*' is exactly what you want.
For example, to change
.P1
Now is the time for all good men ....
.P2
into
.P1
Now is the time\*.
.P2
use `\*.*' to eat up everything after the `for':
.P1
s/\*Bfor\*.*/\*./
.P2
.PP
There are a couple of additional pitfalls associated with `*'
that you should be aware of.
Most notable is the fact that `as many as possible' means
.ul
zero
or more.
The fact that zero is a legitimate possibility
is sometimes rather surprising.
For example, if our line contained
.P1
\fItext \fR xy \fI text \fR x          y \fI text \fR
.P2
and we said
.P1
s/x\*B*y/x\*By/
.P2
the
.ul
first
`xy' matches this pattern, for it consists of an `x',
zero spaces, and a `y'.
The result is that the substitute acts on the first `xy', and does not touch the later one that actually contains some intervening spaces. .PP The way around this, if it matters, is to specify a pattern like .P1 /x\*B\*B*y/ .P2 which says `an x, a space, then as many more spaces as possible, then a y', in other words, one or more spaces. .PP The other startling behavior of `*' is again related to the fact that zero is a legitimate number of occurrences of something followed by a star. The command .P1 s/x*/y/g .P2 when applied to the line .P1 abcdef .P2 produces .P1 yaybycydyeyfy .P2 which is almost certainly not what was intended. The reason for this behavior is that zero is a legal number of matches, and there are no x's at the beginning of the line (so that gets converted into a `y'), nor between the `a' and the `b' (so that gets converted into a `y'), nor ... and so on. Make sure you really want zero matches; if not, in this case write .P1 s/xx*/y/g .P2 `xx*' is one or more x's. .SH The Brackets `[ ]' .PP Suppose that you want to delete any numbers that appear at the beginning of all lines of a file. You might first think of trying a series of commands like .P1 1,$s/^1*// 1,$s/^2*// 1,$s/^3*// .P2 and so on, but this is clearly going to take forever if the numbers are at all long. Unless you want to repeat the commands over and over until finally all numbers are gone, you must get all the digits on one pass. This is the purpose of the brackets [ and ]. .PP The construction .P1 [0123456789] .P2 matches any single digit _ the whole thing is called a `character class'. With a character class, the job is easy. The pattern `[0123456789]*' matches zero or more digits (an entire number), so .P1 1,$s/^[0123456789]*// .P2 deletes all digits from the beginning of all lines. .PP Any characters can appear within a character class, and just to confuse the issue there are essentially no special characters inside the brackets; even the backslash doesn't have a special meaning. 
To search for special characters, for example, you can say .P1 /[\*.\*e$^[]/ .P2 Within [...], the `[' is not special. To get a `]' into a character class, make it the first character. .PP It's a nuisance to have to spell out the digits, so you can abbreviate them as [0\-9]; similarly, [a\-z] stands for the lower case letters, and [A\-Z] for upper case. .PP As a final frill on character classes, you can specify a class that means `none of the following characters'. This is done by beginning the class with a `^': .P1 [^0-9] .P2 stands for `any character .ul except a digit'. Thus you might find the first line that doesn't begin with a tab or space by a search like .P1 /^[^(space)(tab)]/ .P2 .PP Within a character class, the circumflex has a special meaning only if it occurs at the beginning. Just to convince yourself, verify that .P1 /^[^^]/ .P2 finds a line that doesn't begin with a circumflex. .SH The Ampersand `&' .PP The ampersand `&' is used primarily to save typing. Suppose you have the line .P1 Now is the time .P2 and you want to make it .P1 Now is the best time .P2 Of course you can always say .P1 s/the/the best/ .P2 but it seems silly to have to repeat the `the'. The `&' is used to eliminate the repetition. On the .ul right side of a substitute, the ampersand means `whatever was just matched', so you can say .P1 s/the/& best/ .P2 and the `&' will stand for `the'. Of course this isn't much of a saving if the thing matched is just `the', but if it is something truly long or awful, or if it is something like `.*' which matches a lot of text, you can save some tedious typing. There is also much less chance of making a typing error in the replacement text. For example, to parenthesize a line, regardless of its length, .P1 s/\*.*/(&)/ .P2 .PP The ampersand can occur more than once on the right side: .P1 s/the/& best and & worst/ .P2 makes .P1 Now is the best and the worst time .P2 and .P1 s/\*.*/&? &!!/ .P2 converts the original line into .P1 Now is the time? 
Now is the time!!
.P2
.PP
To get a literal ampersand, naturally the backslash is used
to turn off the special meaning:
.P1
s/ampersand/\*e&/
.P2
converts the word into the symbol.
Notice that `&' is not special on the left side of a substitute,
only on the
.ul
right
side.
.SH
Substituting Newlines
.PP
.UL ed
provides a facility for splitting a single line into two or more
shorter lines by `substituting in a newline'.
As the simplest example, suppose a line has gotten unmanageably long
because of editing (or merely because it was unwisely typed).
If it looks like
.P1
\fItext \fR xy \fI text \fR
.P2
you can break it between the `x' and the `y' like this:
.P1
s/xy/x\*e
y/
.P2
This is actually a single command,
although it is typed on two lines.
Bearing in mind that `\*e' turns off special meanings,
it seems relatively intuitive that a `\*e' at the end of a line
would make the newline there no longer special.
.PP
You can in fact make a single line into several lines
with this same mechanism.
As a large example, consider underlining the word `very' in a long line
by splitting `very' onto a separate line,
and preceding it by the
.UL roff
or
.UL nroff
formatting command `.ul'.
.P1
\fItext \fR a very big \fI text \fR
.P2
The command
.P1
s/\*Bvery\*B/\*e
\&.ul\*e
very\*e
/
.P2
converts the line into four shorter lines,
preceding the word `very' by the line `.ul',
and eliminating the spaces around the `very',
all at the same time.
.PP
When a newline is substituted in,
dot is left pointing at the last line created.
.PP
.SH
Joining Lines
.PP
Lines may also be joined together,
but this is done with the
.UL j
command instead of
.UL s .
Given the lines
.P1
Now is
\*Bthe time
.P2
and supposing that dot is set to the first of them,
then the command
.P1
j
.P2
joins them together.
No blanks are added,
which is why we carefully showed a blank
at the beginning of the second line.
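.PP
As a sketch of how splitting and joining undo one another,
suppose (purely for illustration) that the buffer holds
the single line `Now is the time'.
Then the sequence
.P1
1s/is\*Bthe/is\*e
\*Bthe/
1,2jp
.P2
first breaks the line after `is',
leaving the blank at the front of the new second line,
and then joins the two pieces and prints the result.
Since
.UL j
adds no blanks of its own,
what is printed should be the original line again.
Had the blank been left at the end of the first piece instead,
the join would work equally well;
it is only when the blank is dropped altogether
that the words run together.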
.PP
All by itself, a
.UL j
command joins line dot to line dot+1,
but any contiguous set of lines can be joined.
Just specify the starting and ending line numbers.
For example,
.P1
1,$jp
.P2
joins all the lines into one big one and prints it.
(More on line numbers in Section 3.)
.SH
Rearranging a Line with \*e( ... \*e)
.PP
(This section should be skipped on first reading.)
Recall that `&' is a shorthand that stands for whatever
was matched by the left side of an
.UL s
command.
In much the same way you can capture separate pieces
of what was matched;
the only difference is that you have to specify
on the left side just what pieces you're interested in.
.PP
Suppose, for instance, that you have a file of lines
that consist of names in the form
.P1
Smith, A. B.
Jones, C.
.P2
and so on,
and you want the initials to precede the name, as in
.P1
A. B. Smith
C. Jones
.P2
It is possible to do this with a series of editing commands,
but it is tedious and error-prone.
(It is instructive to figure out how it is done, though.)
.PP
The alternative is to `tag' the pieces of the pattern
(in this case, the last name, and the initials),
and then rearrange the pieces.
On the left side of a substitution,
if part of the pattern is enclosed between \*e( and \*e),
whatever matched that part is remembered,
and available for use on the right side.
On the right side,
the symbol `\*e1' refers to whatever matched the first \*e(...\*e) pair,
`\*e2' to the second \*e(...\*e), and so on.
.PP
The command
.P1
1,$s/^\*e([^,]*\*e),\*B*\*e(\*.*\*e)/\*e2\*B\*e1/
.P2
although hard to read, does the job.
The first \*e(...\*e) matches the last name,
which is any string up to the comma;
this is referred to on the right side with `\*e1'.
The second \*e(...\*e) is whatever follows the comma and any spaces,
and is referred to as `\*e2'.
.PP
Of course, with any editing sequence this complicated,
it's foolhardy to simply run it and hope.
The global commands .UL g and .UL v discussed in section 4 provide a way for you to print exactly those lines which were affected by the substitute command, and thus verify that it did what you wanted in all cases. .NH LINE ADDRESSING IN THE EDITOR .PP The next general area we will discuss is that of line addressing in .UL ed , that is, how you specify what lines are to be affected by editing commands. We have already used constructions like .P1 1,$s/x/y/ .P2 to specify a change on all lines. And most users are long since familiar with using a single newline (or return) to print the next line, and with .P1 /thing/ .P2 to find a line that contains `thing'. Less familiar, surprisingly enough, is the use of .P1 ?thing? .P2 to scan .ul backwards for the previous occurrence of `thing'. This is especially handy when you realize that the thing you want to operate on is back up the page from where you are currently editing. .PP The slash and question mark are the only characters you can use to delimit a context search, though you can use essentially any character in a substitute command. .SH Address Arithmetic .PP The next step is to combine the line numbers like `\*.', `$', `/.../' and `?...?' with `+' and `\-'. Thus .P1 $-1 .P2 is a command to print the next to last line of the current file (that is, one line before line `$'). For example, to recall how far you got in a previous editing session, .P1 $-5,$p .P2 prints the last six lines. (Be sure you understand why it's six, not five.) If there aren't six, of course, you'll get an error message. .PP As another example, .P1 \&\*.-3,\*.+3p .P2 prints from three lines before where you are now (at line dot) to three lines after, thus giving you a bit of context. By the way, the `+' can be omitted: .P1 \&\*.-3,\*.3p .P2 is absolutely identical in meaning. .PP Another area in which you can save typing effort in specifying lines is to use `\-' and `+' as line numbers by themselves. 
.P1 - .P2 by itself is a command to move back up one line in the file. In fact, you can string several minus signs together to move back up that many lines: .P1 --- .P2 moves up three lines, as does `\-3'. Thus .P1 -3,+3p .P2 is also identical to the examples above. .PP Since `\-' is shorter than `\*.\-1', constructions like .P1 -,\*.s/bad/good/ .P2 are useful. This changes `bad' to `good' on the previous line and on the current line. .PP `+' and `\-' can be used in combination with searches using `/.../' and `?...?', and with `$'. The search .P1 /thing/-- .P2 finds the line containing `thing', and positions you two lines before it. .SH Repeated Searches .PP Suppose you ask for the search .P1 /horrible thing/ .P2 and when the line is printed you discover that it isn't the horrible thing that you wanted, so it is necessary to repeat the search again. You don't have to re-type the search, for the construction .P1 // .P2 is a shorthand for `the previous thing that was searched for', whatever it was. This can be repeated as many times as necessary. You can also go backwards: .P1 ?? .P2 searches for the same thing, but in the reverse direction. .PP Not only can you repeat the search, but you can use `//' as the left side of a substitute command, to mean `the most recent pattern'. .P1 /horrible thing/ .ft I .... ed prints line with `horrible thing' ... .ft R s//good/p .P2 To go backwards and change a line, say .P1 ??s//good/ .P2 Of course, you can still use the `&' on the right hand side of a substitute to stand for whatever got matched: .P1 //s//&\*B&/p .P2 finds the next occurrence of whatever you searched for last, replaces it by two copies of itself, then prints the line just to verify that it worked. 
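.PP
As a sketch of how these shorthands combine,
suppose (hypothetically) that the word `recieve'
has been misspelled on several lines.
One possible sequence is
.P1
/recieve/
s//receive/p
//s//receive/p
//s//receive/p
.P2
The first command finds the first offending line;
the second corrects it, using the remembered pattern as the left side;
and each later command finds the next offending line
and corrects it in one step.
The remembered pattern stays `recieve' throughout,
because the empty left side of the substitute re-uses it
rather than replacing it.
When no line contains the pattern any longer,
the search fails,
.UL ed
replies with `?',
and dot is left where it was.
Bear in mind that each substitute here changes only the first
occurrence on its line;
add a trailing
.UL g
if a line might contain more than one.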
.SH
Default Line Numbers and the Value of Dot
.PP
One of the most effective ways to speed up your editing
is always to know what lines will be affected by a command
if you don't specify the lines it is to act on,
and on what line you will be positioned (i.e., the value of dot)
when a command finishes.
If you can edit without specifying unnecessary line numbers,
you can save a lot of typing.
.PP
As the most obvious example, if you issue a search command like
.P1
/thing/
.P2
you are left pointing at the next line that contains `thing'.
Then no address is required with commands like
.UL s
to make a substitution on that line,
or
.UL p
to print it,
or
.UL l
to list it,
or
.UL d
to delete it,
or
.UL a
to append text after it,
or
.UL c
to change it,
or
.UL i
to insert text before it.
.PP
What happens if there was no `thing'?
Then you are left right where you were _ dot is unchanged.
This is also true if you were sitting on the only `thing'
when you issued the command.
The same rules hold for searches that use `?...?';
the only difference is the direction in which you search.
.PP
The delete command
.UL d
leaves dot pointing at the line that followed the last deleted line.
When line `$' gets deleted, however, dot points at the
.ul
new
line `$'.
.PP
The line-changing commands
.UL a ,
.UL c
and
.UL i
by default all affect the current line _
if you give no line number with them,
.UL a
appends text after the current line,
.UL c
changes the current line,
and
.UL i
inserts text before the current line.
.PP
.UL a ,
.UL c ,
and
.UL i
behave identically in one respect _
when you stop appending, changing or inserting,
dot points at the last line entered.
This is exactly what you want for typing and editing on the fly.
For example, you can say
.P1
.ta 1.5i
a
 ... text ...
 ... botch ...	(minor error)
\&\*.
s/botch/correct/	(fix botched line)
a
 ... more text ...
.P2
without specifying any line number for the substitute command
or for the second append command.
Or you can say
.P1 2
.ta 1.5i
a
 ... text ...
 ... horrible botch ...	(major error)
\&\*.
c	(replace entire line)
 ... fixed up line ...
.P2
.PP
You should experiment to determine what happens if you add
.ul
no
lines with
.UL a ,
.UL c
or
.UL i .
.PP
The
.UL r
command will read a file into the text being edited,
either at the end if you give no address,
or after the specified line if you do.
In either case, dot points at the last line read in.
Remember that you can even say
.UL 0r
to read a file in at the beginning of the text.
(You can also say
.UL 0a
or
.UL 1i
to start adding text at the beginning.)
.PP
The
.UL w
command writes out the entire file.
If you precede the command by one line number,
that line is written,
while if you precede it by two line numbers,
that range of lines is written.
The
.UL w
command does
.ul
not
change dot:
the current line remains the same,
regardless of what lines are written.
This is true even if you say something like
.P1
/^\*e\*.AB/,/^\*e\*.AE/w abstract
.P2
which involves a context search.
.PP
Since the
.UL w
command is so easy to use,
you should save what you are editing regularly as you go along
just in case the system crashes,
or in case you do something foolish,
like clobbering what you're editing.
.PP
The least intuitive behavior, in a sense, is that of the
.UL s
command.
The rule is simple _
you are left sitting on the last line that got changed.
If there were no changes, then dot is unchanged.
.PP
To illustrate,
suppose that there are three lines in the buffer,
and you are sitting on the middle one:
.P1
x1
x2
x3
.P2
Then the command
.P1
\&-,+s/x/y/p
.P2
prints the third line, which is the last one changed.
But if the three lines had been
.P1
x1
y2
y3
.P2
and the same command had been issued
while dot pointed at the second line,
then the result would be to change and print only the first line,
and that is where dot would be set.
.SH
Semicolon `;'
.PP
Searches with `/.../' and `?...?'
start at the current line and move forward or backward respectively until they either find the pattern or get back to the current line. Sometimes this is not what is wanted. Suppose, for example, that the buffer contains lines like this: .P1 \*. \*. \*. ab \*. \*. \*. bc \*. \*. .P2 Starting at line 1, one would expect that the command .P1 /a/,/b/p .P2 prints all the lines from the `ab' to the `bc' inclusive. Actually this is not what happens. .ul Both searches (for `a' and for `b') start from the same point, and thus they both find the line that contains `ab'. The result is to print a single line. Worse, if there had been a line with a `b' in it before the `ab' line, then the print command would be in error, since the second line number would be less than the first, and it is illegal to try to print lines in reverse order. .PP This is because the comma separator for line numbers doesn't set dot as each address is processed; each search starts from the same place. In .UL ed , the semicolon `;' can be used just like comma, with the single difference that use of a semicolon forces dot to be set at that point as the line numbers are being evaluated. In effect, the semicolon `moves' dot. Thus in our example above, the command .P1 /a/;/b/p .P2 prints the range of lines from `ab' to `bc', because after the `a' is found, dot is set to that line, and then `b' is searched for, starting beyond that line. .PP This property is most often useful in a very simple situation. Suppose you want to find the .ul second occurrence of `thing'. You could say .P1 /thing/ // .P2 but this prints the first occurrence as well as the second, and is a nuisance when you know very well that it is only the second one you're interested in. The solution is to say .P1 /thing/;// .P2 This says to find the first occurrence of `thing', set dot to that line, then find the second and print only that. .PP Closely related is searching for the second previous occurrence of something, as in .P1 ?something?;?? 
.P2 Printing the third or fourth or ... in either direction is left as an exercise. .PP Finally, bear in mind that if you want to find the first occurrence of something in a file, starting at an arbitrary place within the file, it is not sufficient to say .P1 1;/thing/ .P2 because this fails if `thing' occurs on line 1. But it is possible to say .P1 0;/thing/ .P2 (one of the few places where 0 is a legal line number), for this starts the search at line 1. .SH Interrupting the Editor .PP As a final note on what dot gets set to, you should be aware that if you hit the interrupt or delete or rubout or break key while .UL ed is doing a command, things are put back together again and your state is restored as much as possible to what it was before the command began. Naturally, some changes are irrevocable _ if you are reading or writing a file or making substitutions or deleting lines, these will be stopped in some clean but unpredictable state in the middle (which is why it is not usually wise to stop them). Dot may or may not be changed. .PP Printing is more clear cut. Dot is not changed until the printing is done. Thus if you print until you see an interesting line, then hit delete, you are .ul not sitting on that line or even near it. Dot is left where it was when the .UL p command was started. .NH GLOBAL COMMANDS .PP The global commands .UL g and .UL v are used to perform one or more editing commands on all lines that either contain .UL g ) ( or don't contain .UL v ) ( a specified pattern. .PP As the simplest example, the command .P1 g/UNIX/p .P2 prints all lines that contain the word `UNIX'. The pattern that goes between the slashes can be anything that could be used in a line search or in a substitute command; exactly the same rules and limitations apply. .PP As another example, then, .P1 g/^\*e\*./p .P2 prints all the formatting commands in a file (lines that begin with `\*.'). 
.PP The .UL v command is identical to .UL g , except that it operates on those lines that do .ul not contain an occurrence of the pattern. (Don't look too hard for mnemonic significance to the letter `v'.) So .P1 v/^\*e\*./p .P2 prints all the lines that don't begin with `\*.' _ the actual text lines. .PP The command that follows .UL g or .UL v can be anything: .P1 g/^\*e\*./d .P2 deletes all lines that begin with `\*.', and .P1 g/^$/d .P2 deletes all empty lines. .PP Probably the most useful command that can follow a global is the substitute command, for this can be used to make a change and print each affected line for verification. For example, we could change the word `Unix' to `UNIX' everywhere, and verify that it really worked, with .P1 g/Unix/s//UNIX/gp .P2 Notice that we used `//' in the substitute command to mean `the previous pattern', in this case, `Unix'. The .UL p command is done on every line that matches the pattern, not just those on which a substitution took place. .PP The global command operates by making two passes over the file. On the first pass, all lines that match the pattern are marked. On the second pass, each marked line in turn is examined, dot is set to that line, and the command executed. This means that it is possible for the command that follows a .UL g or .UL v to use addresses, set dot, and so on, quite freely. .P1 g/^\*e\*.PP/+ .P2 prints the line that follows each `.PP' command (the signal for a new paragraph in some formatting packages). Remember that `+' means `one line past dot'. And .P1 g/topic/?^\*e\*.SH?1 .P2 searches for each line that contains `topic', scans backwards until it finds a line that begins `.SH' (a section heading) and prints the line that follows that, thus showing the section headings under which `topic' is mentioned. Finally, .P1 g/^\*e\*.EQ/+,/^\*e\*.EN/-p .P2 prints all the lines that lie between lines beginning with `.EQ' and `.EN' formatting commands.
.PP The .UL g and .UL v commands can also be preceded by line numbers, in which case the lines searched are only those in the range specified. .SH Multi-line Global Commands .PP It is possible to do more than one command under the control of a global command, although the syntax for expressing the operation is not especially natural or pleasant. As an example, suppose the task is to change `x' to `y' and `a' to `b' on all lines that contain `thing'. Then .P1 g/thing/s/x/y/\*e s/a/b/ .P2 is sufficient. The `\*e' signals the .UL g command that the set of commands continues on the next line; it terminates on the first line that does not end with `\*e'. (As a minor blemish, you can't use a substitute command to insert a newline within a .UL g command.) .PP You should watch out for this problem: the command .P1 g/x/s//y/\*e s/a/b/ .P2 does .ul not work as you expect. The remembered pattern is the last pattern that was actually executed, so sometimes it will be `x' (as expected), and sometimes it will be `a' (not expected). You must spell it out, like this: .P1 g/x/s/x/y/\*e s/a/b/ .P2 .PP It is also possible to execute .UL a , .UL c and .UL i commands under a global command; as with other multi-line constructions, all that is needed is to add a `\*e' at the end of each line except the last. Thus to add a `.nf' and `.sp' command before each `.EQ' line, type .P1 g/^\*e\*.EQ/i\*e \&\*.nf\*e \&\*.sp .P2 There is no need for a final line containing a `\*.' to terminate the .UL i command, unless there are further commands being done under the global. On the other hand, it does no harm to put it in either. 
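.PP
If you would like to watch a two-command global at work without typing it interactively, the following sketch feeds the commands to
.UL ed
from the shell.
It is only an illustration: the scratch directory, the file name `sample' and its three lines are made up for the purpose, and the
.UL \-s
flag (which keeps
.UL ed
from printing byte counts) is assumed to be available on your system.

```shell
# Work in a scratch directory so nothing of yours is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical sample file: two lines contain `thing', one does not.
printf '%s\n' 'x thing a' 'x other a' 'a thing x' >sample

# On every line that contains `thing', change the first x to y and
# then the first a to b.  The here-document is quoted ('EOF') so the
# shell passes the trailing backslash through to ed untouched.
result=$(ed -s sample <<'EOF'
g/thing/s/x/y/\
s/a/b/
,p
Q
EOF
)
printf '%s\n' "$result"
```

The first and third lines should come back with both substitutions made, and the middle line should be untouched.
The
.UL Q
command quits without writing, so `sample' itself is not changed.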
.NH CUT AND PASTE WITH UNIX COMMANDS .PP One editing area in which non-programmers seem not very confident is in what might be called `cut and paste' operations _ changing the name of a file, making a copy of a file somewhere else, moving a few lines from one place to another in a file, inserting one file in the middle of another, splitting a file into pieces, and splicing two or more files together. .PP Yet most of these operations are actually quite easy, if you keep your wits about you and go cautiously. The next several sections talk about cut and paste. We will begin with the .UX commands for moving entire files around, then discuss .UL ed commands for operating on pieces of files. .SH Changing the Name of a File .PP You have a file named `memo' and you want it to be called `paper' instead. How is it done? .PP The .UX program that renames files is called .UL mv (for `move'); it `moves' the file from one name to another, like this: .P1 mv memo paper .P2 That's all there is to it: .UL mv from the old name to the new name. .P1 mv oldname newname .P2 Warning: if there is already a file around with the new name, its present contents will be silently clobbered by the information from the other file. The one exception is that you can't move a file to itself _ .P1 mv x x .P2 is illegal. .SH Making a Copy of a File .PP Sometimes what you want is a copy of a file _ an entirely fresh version. This might be because you want to work on a file, and yet save a copy in case something gets fouled up, or just because you're paranoid. .PP In any case, the way to do it is with the .UL cp command. .UL cp \& ( stands for `copy'; the .UC UNIX system is big on short command names, which are appreciated by heavy users, but sometimes a strain for novices.) Suppose you have a file called `good' and you want to save a copy before you make some dramatic editing changes. 
Choose a name _ `savegood' might be acceptable _ then type .P1 cp good savegood .P2 This copies `good' onto `savegood', and you now have two identical copies of the file `good'. (If `savegood' previously contained something, it gets overwritten.) .PP Now if you decide at some time that you want to get back to the original state of `good', you can say .P1 mv savegood good .P2 (if you're not interested in `savegood' any more), or .P1 cp savegood good .P2 if you still want to retain a safe copy. .PP In summary, .UL mv just renames a file; .UL cp makes a duplicate copy. Both of them clobber the `target' file if it already exists, so you had better be sure that's what you want to do .ul before you do it. .SH Removing a File .PP If you decide you are really done with a file forever, you can remove it with the .UL rm command: .P1 rm savegood .P2 throws away (irrevocably) the file called `savegood'. .SH Putting Two or More Files Together .PP The next step is the familiar one of collecting two or more files into one big one. This will be needed, for example, when the author of a paper decides that several sections need to be combined into one. There are several ways to do it, of which the cleanest, once you get used to it, is a program called .UL cat . (Not .ul all .UC UNIX programs have two-letter names.) .UL cat is short for `concatenate', which is exactly what we want to do. .PP Suppose the job is to combine the files `file1' and `file2' into a single file called `bigfile'. If you say .P1 cat file .P2 the contents of `file' will get printed on your terminal. If you say .P1 cat file1 file2 .P2 the contents of `file1' and then the contents of `file2' will .ul both be printed on your terminal, in that order. So .UL cat combines the files, all right, but it's not much help to print them on the terminal _ we want them in `bigfile'. .PP Fortunately, there is a way. You can tell the system that instead of printing on your terminal, you want the same information put in a file. 
The way to do it is to add to the command line the character .UL > and the name of the file where you want the output to go. Then you can say .P1 cat file1 file2 >bigfile .P2 and the job is done. (As with .UL cp and .UL mv , you're putting something into `bigfile', and anything that was already there is destroyed.) .PP This ability to `capture' the output of a program is one of the most useful aspects of the .UC UNIX system. Fortunately it's not limited to the .UL cat program _ you can use it with .ul any program that prints on your terminal. We'll see some more uses for it in a moment. .PP Naturally, you can combine several files, not just two: .P1 cat file1 file2 file3 ... >bigfile .P2 collects a whole bunch. .PP Question: is there any difference between .P1 cp good savegood .P2 and .P1 cat good >savegood .P2 Answer: for most purposes, no. You might reasonably ask why there are two programs in that case, since .UL cat is obviously all you need. The answer is that .UL cp will do some other things as well, which you can investigate for yourself by reading the manual. For now we'll stick to simple usages. .SH Adding Something to the End of a File .PP Sometimes you want to add one file to the end of another. We have enough building blocks now that you can do it; in fact before reading further it would be valuable if you figured out how. To be specific, how would you use .UL cp , .UL mv and/or .UL cat to add the file `good1' to the end of the file `good'? .PP You could try .P1 cat good good1 >temp mv temp good .P2 which is probably most direct. You should also understand why .P1 cat good good1 >good .P2 doesn't work. (Don't practice with a good `good'!) .PP The easy way is to use a variant of .UL > , called .UL >> . In fact, .UL >> is identical to .UL > except that instead of clobbering the old file, it simply tacks stuff on at the end. Thus you could say .P1 cat good1 >>good .P2 and `good1' is added to the end of `good'. 
(And if `good' didn't exist, this makes a copy of `good1' called `good'.) .NH CUT AND PASTE WITH THE EDITOR .PP Now we move on to manipulating pieces of files _ individual lines or groups of lines. This is another area where new users seem unsure of themselves. .SH Filenames .PP The first step is to ensure that you know the .UL ed commands for reading and writing files. Of course you can't go very far without knowing .UL r and .UL w . Equally useful, but less well known, is the `edit' command .UL e . Within .UL ed , the command .P1 e newfile .P2 says `I want to edit a new file called .ul newfile, without leaving the editor.' The .UL e command discards whatever you're currently working on and starts over on .ul newfile. It's exactly the same as if you had quit with the .UL q command, then re-entered .UL ed with a new file name, except that if you have a pattern remembered, then a command like .UL // will still work. .PP If you enter .UL ed with the command .P1 ed file .P2 .UL ed remembers the name of the file, and any subsequent .UL e , .UL r or .UL w commands that don't contain a filename will refer to this remembered file. Thus .P1 2 .ta .5i .6i .7i ed file1 ... (editing) ... w (writes back in file1) e file2 (edit new file, without leaving editor) ... (editing on file2) ... w (writes back on file2) .P2 (and so on) does a series of edits on various files without ever leaving .UL ed and without typing the name of any file more than once. (As an aside, if you examine the sequence of commands here, you can see why many UNIX systems use .UL e as a synonym for .UL ed .) .PP You can find out the remembered file name at any time with the .UL f command; just type .UL f without a file name. You can also change the remembered file name with .UL f ; a useful sequence is .P1 ed precious f junk ... (editing) ... .P2 which gets a copy of a precious file, then uses .UL f to guarantee that a careless .UL w command won't clobber the original.
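.PP
To convince yourself that the
.UL f
trick really does protect the original, you can try something like the sketch below.
The one-line contents of `precious' and the scratch directory are invented for the demonstration, and the
.UL \-s
flag is assumed to be supported by your
.UL ed .

```shell
# Work in a scratch directory so nothing of yours is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical precious file with a single line.
printf '%s\n' 'original text' >precious

# Change the remembered name to `junk' before editing; the bare w
# that follows then writes to `junk', not to `precious'.
# (ed echoes the new remembered name in response to f.)
ed -s precious <<'EOF'
f junk
s/original/edited/
w
q
EOF

cat precious    # still the original line
cat junk        # the edited copy
```

After the run, `precious' should still hold its original line and `junk' the edited one.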
.SH Inserting One File into Another .PP Suppose you have a file called `memo', and you want the file called `table' to be inserted just after the reference to Table 1. That is, in `memo' somewhere is a line that says .IP Table 1 shows that ... .LP and the data contained in `table' has to go there, probably so it will be formatted properly by .UL nroff or .UL troff . Now what? .PP This one is easy. Edit `memo', find `Table 1', and add the file `table' right there: .P1 ed memo /Table 1/ .ft I Table 1 shows that ... [response from ed] .ft \&\*.r table .P2 The critical line is the last one. As we said earlier, the .UL r command reads a file; here you asked for it to be read in right after line dot. An .UL r command without any address adds lines at the end, so it is the same as .UL $r . .SH Writing out Part of a File .PP The other side of the coin is writing out part of the document you're editing. For example, maybe you want to split out into a separate file that table from the previous example, so it can be formatted and tested separately. Suppose that in the file being edited we have .P1 \&\*.TS ...[lots of stuff] \&\*.TE .P2 which is the way a table is set up for the .UL tbl program. To isolate the table in a separate file called `table', first find the start of the table (the `.TS' line), then write out the interesting part: .P1 /^\*e\*.TS/ .ft I \&\*.TS [ed prints the line it found] .ft R \&\*.,/^\*e\*.TE/w table .P2 and the job is done. If you are confident, you can do it all at once with .P1 /^\*e\*.TS/;/^\*e\*.TE/w table .P2 .PP The point is that the .UL w command can write out a group of lines, instead of the whole file. In fact, you can write out a single line if you like; just give one line number instead of two. For example, if you have just typed a horribly complicated line and you know that it (or something like it) is going to be needed later, then save it _ don't re-type it. In the editor, say .P1 a \&...lots of stuff... \&...horrible line... \&\*. 
\&\*.w temp a \&\*.\*.\*.more stuff\*.\*.\*. \&\*. \&\*.r temp a \&\*.\*.\*.more stuff\*.\*.\*. \&\*. .P2 This last example is worth studying, to be sure you appreciate what's going on. .SH Moving Lines Around .PP Suppose you want to move a paragraph from its present position in a paper to the end. How would you do it? As a concrete example, suppose each paragraph in the paper begins with the formatting command `.PP'. Think about it and write down the details before reading on. .PP The brute force way (not necessarily bad) is to write the paragraph onto a temporary file, delete it from its current position, then read in the temporary file at the end. Assuming that you are sitting on the `.PP' command that begins the paragraph, this is the sequence of commands: .P1 \&\*.,/^\*e\*.PP/-w temp \&\*.,//-d $r temp .P2 That is, from where you are now (`\*.') until one line before the next `\*.PP' (`/^\*e\*.PP/\-') write onto `temp'. Then delete the same lines. Finally, read `temp' at the end. .PP As we said, that's the brute force way. The easier way (often) is to use the .ul move command .UL m that .UL ed provides _ it lets you do the whole set of operations at one crack, without any temporary file. .PP The .UL m command is like many other .UL ed commands in that it takes up to two line numbers in front that tell what lines are to be affected. It is also .ul followed by a line number that tells where the lines are to go. Thus .P1 line1, line2 m line3 .P2 says to move all the lines between `line1' and `line2' after `line3'. Naturally, any of `line1' etc., can be patterns between slashes, $ signs, or other ways to specify lines. .PP Suppose again that you're sitting at the first line of the paragraph. Then you can say .P1 \&\*.,/^\*e\*.PP/-m$ .P2 That's all. .PP As another example of a frequent operation, you can reverse the order of two adjacent lines by moving the first one to after the second. Suppose that you are positioned at the first. Then .P1 m+ .P2 does it. 
It says to move line dot to after one line after line dot. If you are positioned on the second line, .P1 m-- .P2 does the interchange. .PP As you can see, the .UL m command is more succinct and direct than writing, deleting and re-reading. When is brute force better anyway? This is a matter of personal taste _ do what you have most confidence in. The main difficulty with the .UL m command is that if you use patterns to specify both the lines you are moving and the target, you have to take care that you specify them properly, or you may well not move the lines you thought you did. The result of a botched .UL m command can be a ghastly mess. Doing the job a step at a time makes it easier for you to verify at each step that you accomplished what you wanted to. It's also a good idea to issue a .UL w command before doing anything complicated; then if you goof, it's easy to back up to where you were. .SH Marks .PP .UL ed provides a facility for marking a line with a particular name so you can later reference it by name regardless of its actual line number. This can be handy for moving lines, and for keeping track of them as they move. The .ul mark command is .UL k ; the command .P1 kx .P2 marks the current line with the name `x'. If a line number precedes the .UL k , that line is marked. (The mark name must be a single lower case letter.) Now you can refer to the marked line with the address .P1 \(fmx .P2 .PP Marks are most useful for moving things around. Find the first line of the block to be moved, and mark it with .ul \(fma. Then find the last line and mark it with .ul \(fmb. Now position yourself at the place where the stuff is to go and say .P1 \(fma,\(fmbm\*. .P2 .PP Bear in mind that only one line can have a particular mark name associated with it at any given time. .SH Copying Lines .PP We mentioned earlier the idea of saving a line that was hard to type or used often, so as to cut down on typing time. 
Of course this could be more than one line; then the saving is presumably even greater. .PP .UL ed provides another command, called .UL t (for `transfer') for making a copy of a group of one or more lines at any point. This is often easier than writing and reading. .PP The .UL t command is identical to the .UL m command, except that instead of moving lines it simply duplicates them at the place you named. Thus .P1 1,$t$ .P2 duplicates the entire contents that you are editing. A more common use for .UL t is for creating a series of lines that differ only slightly. For example, you can say .P1 .ta 1i a \&.......... x ......... (long line) \&\*. t\*. (make a copy) s/x/y/ (change it a bit) t\*. (make third copy) s/y/z/ (change it a bit) .P2 and so on. .SH The Temporary Escape `!' .PP Sometimes it is convenient to be able to temporarily escape from the editor to do some other .UX command, perhaps one of the file copy or move commands discussed in section 5, without leaving the editor. The `escape' command .UL ! provides a way to do this. .PP If you say .P1 !any UNIX command .P2 your current editing state is suspended, and the .UX command you asked for is executed. When the command finishes, .UL ed will signal you by printing another .UL ! ; at that point you can resume editing. .PP You can really do .ul any .UX command, including another .UL ed . (This is quite common, in fact.) In this case, you can even do another .UL ! . .NH SUPPORTING TOOLS .PP There are several tools and techniques that go along with the editor, all of which are relatively easy once you know how .UL ed works, because they are all based on the editor. In this section we will give some fairly cursory examples of these tools, more to indicate their existence than to provide a complete tutorial. More information on each can be found in [3]. .SH Grep .PP Sometimes you want to find all occurrences of some word or pattern in a set of files, to edit them or perhaps just to verify their presence or absence. 
It may be possible to edit each file separately and look for the pattern of interest, but if there are many files this can get very tedious, and if the files are really big, it may be impossible because of limits in .UL ed . .PP The program .UL grep was invented to get around these limitations. The search patterns that we have described in the paper are often called `regular expressions', and `grep' stands for .P1 g/re/p .P2 That describes exactly what .UL grep does _ it prints every line in a set of files that contains a particular pattern. Thus .P1 grep \(fmthing\(fm file1 file2 file3 ... .P2 finds `thing' wherever it occurs in any of the files `file1', `file2', etc. .UL grep also indicates the file in which the line was found, so you can later edit it if you like. .PP The pattern represented by `thing' can be any pattern you can use in the editor, since .UL grep and .UL ed use exactly the same mechanism for pattern searching. It is wisest always to enclose the pattern in the single quotes \(fm...\(fm if it contains any non-alphabetic characters, since many such characters also mean something special to the .UX command interpreter (the `shell'). If you don't quote them, the command interpreter will try to interpret them before .UL grep gets a chance. .PP There is also a way to find lines that .ul don't contain a pattern: .P1 grep -v \(fmthing\(fm file1 file2 ... .P2 finds all lines that don't contain `thing'. The .UL \-v must occur in the position shown. Given .UL grep and .UL grep\ \-v , it is possible to do things like selecting all lines that contain some combination of patterns. For example, to get all lines that contain `x' but not `y': .P1 grep x file... | grep -v y .P2 (The notation | is a `pipe', which causes the output of the first command to be used as input to the second command; see [2].)
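.PP
Here is a small worked version of the `x' but not `y' pipeline.
The two files and the words in them are invented for the illustration; the only commands used are the
.UL grep
and
.UL grep\ \-v
already described.

```shell
# Work in a scratch directory so nothing of yours is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical input: some lines have x, some y, some both, some neither.
printf '%s\n' 'box' 'boxy' 'boy' 'bee' >file1
printf '%s\n' 'fox' 'foxy' >file2

# First grep keeps lines containing x (and prefixes each with its
# file name, since more than one file was given); the second grep
# then discards any of those lines that also contain y.
result=$(grep x file1 file2 | grep -v y)
printf '%s\n' "$result"
```

Only `box' and `fox' should survive, each labelled with the file it came from.
Bear in mind that the second
.UL grep
sees the file name prefix as part of each line, so a `y' in a file name would also cause the line to be dropped.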
.SH Editing Scripts .PP If a fairly complicated set of editing operations is to be done on a whole set of files, the easiest thing to do is to make up a `script', i.e., a file that contains the operations you want to perform, then apply this script to each file in turn. .PP For example, suppose you want to change every `Unix' to `UNIX' and every `Gcos' to `GCOS' in a large number of files. Then put into the file `script' the lines .P1 g/Unix/s//UNIX/g g/Gcos/s//GCOS/g w q .P2 Now you can say .P1 ed file1 <script ed file2 <script \&... .P2 This causes .UL ed to take its commands from the prepared script. Notice that the whole job has to be planned in advance. .PP And of course by using the .UX command interpreter, you can cycle through a set of files automatically, with varying degrees of ease. .SH Sed .PP .UL sed (`stream editor') is a version of the editor with restricted capabilities but which is capable of processing unlimited amounts of input. Basically .UL sed copies its input to its output, applying one or more editing commands to each line of input. .PP As an example, suppose that we want to do the `Unix' to `UNIX' part of the example given above, but without rewriting the files. Then the command .P1 sed \(fms/Unix/UNIX/g\(fm file1 file2 ... .P2 applies the command `s/Unix/UNIX/g' to all lines from `file1', `file2', etc., and copies all lines to the output. The advantage of using .UL sed in such a case is that it can be used with input too large for .UL ed to handle. All the output can be collected in one place, either in a file or perhaps piped into another program. .PP If the editing transformation is so complicated that more than one editing command is needed, commands can be supplied from a file, or on the command line, with a slightly more complex syntax. To take commands from a file, for example, .P1 sed -f cmdfile input-files... .P2 .PP .UL sed has further capabilities, including conditional testing and branching, which we cannot go into here. 
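.PP
As a rough sketch of how the pieces fit together, the following applies a two-command file to each of a set of files with
.UL sed\ \-f ,
using the shell's
.UL for
loop as one possible way of cycling through them.
The file contents, the name `cmdfile' and the `.new' suffix for the output files are all assumptions made for the example.

```shell
# Work in a scratch directory so nothing of yours is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical input files.
printf '%s\n' 'Unix and Gcos' 'Unix on Unix' >file1
printf '%s\n' 'Gcos only' >file2

# The editing commands, one per line, kept in a file.
printf '%s\n' 's/Unix/UNIX/g' 's/Gcos/GCOS/g' >cmdfile

# Apply the same commands to each file in turn.  sed writes the
# edited text to its output, so the originals are left alone and
# each result is collected in a separate `.new' file.
for f in file1 file2
do
	sed -f cmdfile "$f" >"$f.new"
done

cat file1.new file2.new
```

The `.new' files should contain the corrected spellings, while `file1' and `file2' remain exactly as they were.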
.SH Acknowledgement .PP I am grateful to Ted Dolotta for his careful reading and valuable suggestions. .SH References .IP [1] Brian W. Kernighan, .ul A Tutorial Introduction to the UNIX Text Editor, Bell Laboratories internal memorandum. .IP [2] Brian W. Kernighan, .ul UNIX For Beginners, Bell Laboratories internal memorandum. .IP [3] Ken L. Thompson and Dennis M. Ritchie, .ul The UNIX Programmer's Manual. Bell Laboratories. .sp .I "May 1979"