.ds :? Advanced Editing
.de PT
.lt \\n(LLu
.pc %
.nr PN \\n%
.if \\n%-1 .if o .tl '\s9\f2\*(:?\fP''\\n(PN\s0'
.if \\n%-1 .if e .tl '\s9\\n(PN''\f2\*(:?\^\fP\s0'
.lt \\n(.lu
..
.tr _\(em
.de UL
.if n .ul
.if n \\$3\\$1\\$2
.if t \\$3\f3\\$1\fP\\$2
..
.de IT
.if t \\$3\f2\\$1\fP\\$2
.if n .ul
.if n \\$3\\$1\\$2
..
.de UI
\f3\\$1\fI\\$2\fR\\$3
..
.de P1
.if n .ls 1
.nf
.if n .ta 5 10 15 20 25 30 35 40 45 50 55 60
.if t .ta .3i .6i .9i 1.2i 1.5i 1.8i
.tr -\-
. use first argument as indent if present
.if \\n(.$ .DS I \\$1
.if !\\n(.$ .DS I 5
..
.de P2
.DE
.tr --
.if n .ls 2
..
.if t .ds B \s6\|\v'.1m'\(sq\v'-.1m'\|\s0
.if n .ds B []
.if t .ds m \(mi
.if n .ds m -
.if t .ds n \(no
.if n .ds n -
.if t .ds s \v'.41m'\s+4*\s-4\v'-.41m'
.if n .ds s *
.if t .ds S \(sl
.if n .ds S /
.if t .ds d \s+4\&.\&\s-4
.if n .ds d \&.\&
.if t .ds a \z@@
.if n .ds a @
.if t .ds . \s+2\fB.\fP\s-2
.if n .ds . .
.if t .ds e \z\e\h'2u'\e
.if n .ds e \e
. 2=not last lines; 4= no -xx; 8=no xx-
.tr *\(**
.hy 14
.TL
Advanced Editing on U\s-2NIX\s+2
.AU "MH 2C518" 6021
Brian W. Kernighan
.AI
.MH
.AB
.nr PS 9
.nr VS 11
This paper is meant to help
secretaries, typists and programmers
to make effective use of the
.UX
facilities
for preparing and editing text.
It provides explanations and examples of
.IP \(bu
special characters, line addressing and global commands in the editor
.UL ed ;
.IP \(bu
commands for ``cut and paste'' operations on files
and parts of files,
including the
.UL mv ,
.UL cp ,
.UL cat
and
.UL rm
commands,
and the
.UL r ,
.UL w ,
.UL m
and
.UL t
commands of the editor;
.IP \(bu
editing scripts and
editor-based programs like
.UL grep
and
.UL sed .
.PP
Although the treatment is aimed
at non-programmers,
new
.UX
users
with any background
should find helpful hints
on how to get their jobs done
more easily.
.AE
.CS 16 0 16 0 0 3
.if n .ls 2
.if t .2C
.NH
INTRODUCTION
.PP
Although
.UX
provides remarkably effective tools for text editing,
that by itself is no guarantee
that everyone will automatically
make the most effective use of them.
In particular, people who are not computer specialists _
typists, secretaries, casual users _
often use the system less effectively than they might.
.PP
This document is intended as a sequel to
.ul
A Tutorial Introduction to the UNIX Text Editor
[1],
providing explanations and examples of how to edit with less effort.
(You should also be familiar with the material in
.ul
UNIX For Beginners
[2].)
Further information on all commands discussed here can be found in
.ul
The UNIX Programmer's Manual
[3].
.PP
Examples are based on observations
of users
and the difficulties they encounter.
Topics covered include special characters in searches and substitute commands,
line addressing, the global commands,
and line moving and copying.
There are also brief discussions of
effective use
of related tools, like those for file manipulation,
and those based on
.UL ed ,
like
.UL grep
and
.UL sed .
.PP
A word of caution.
There is only one way to learn to use something,
and that is to
.ul
use
it.
Reading a description is no substitute
for trying something.
A paper like this one should
give you ideas about what to try,
but until you actually try something,
you will not learn it.
.NH
SPECIAL CHARACTERS
.PP
The editor
.UL ed
is the primary interface to the system
for many people, so
it is worthwhile to know
how to get the most out of
.UL ed
for the least effort.
.PP
The next few sections will discuss
shortcuts
and labor-saving devices.
Not all of these will be instantly useful
to any one person, of course,
but a few will be,
and the others should give you ideas to store
away for future use.
And as always,
until you try these things,
they will remain theoretical knowledge,
not something you have confidence in.
.SH
The List Command `l'
.PP
.UL ed
provides two commands for printing the contents of the lines
you're editing.
Most people are familiar with
.UL p ,
in combinations like
.P1
1,$p
.P2
to print all the lines you're editing,
or
.P1
s/abc/def/p
.P2
to change
`abc'
to
`def'
on the current line and print the result.
Less familiar is the
.ul
list
command
.UL l
(the letter `\fIl\|\fR'),
which gives slightly more information than
.UL p .
In particular,
.UL l
makes visible characters that are normally invisible,
such as tabs and backspaces.
If you list a line that contains some of these,
.UL l
will print each tab as
.UL \z\(mi>
and each backspace as
.UL \z\(mi< .
This makes it much easier to correct the sort of typing mistake
that inserts extra spaces adjacent to tabs,
or inserts a backspace followed by a space.
.PP
The
.UL l
command
also `folds' long lines for printing _
any line that exceeds 72 characters is printed on multiple lines;
each printed line except the last is terminated by a backslash
.UL \*e ,
so you can tell it was folded.
This is useful for printing long lines on short terminals.
.PP
Occasionally the
.UL l
command will print in a line a string of numbers preceded by a backslash,
such as \*e07 or \*e16.
These combinations are used to make visible characters that normally don't print,
like form feed or vertical tab or bell.
Each such combination is a single character.
When you see such characters, be wary _
they may have surprising meanings when printed on some terminals.
Often their presence means that your finger slipped while you were typing;
you almost never want them.
.SH
The Substitute Command `s'
.PP
Most of the next few sections will be taken up with a discussion
of the
substitute
command
.UL s .
Since this is the command for changing the contents of individual
lines,
it probably has the most complexity of any
.UL ed
command,
and the most potential for effective use.
.PP
As the simplest place to begin,
recall the meaning of a trailing
.UL g
after a substitute command.
With
.P1
s/this/that/
.P2
and
.P1
s/this/that/g
.P2
the
first
one replaces the
.ul
first
`this' on the line
with `that'.
If there is more than one `this' on the line,
the second form
with the trailing
.UL g
changes
.ul
all
of them.
.PP
Either form of the
.UL s
command can be followed by
.UL p
or
.UL l
to `print' or `list' (as described in the previous section)
the contents of the line:
.P1
s/this/that/p
s/this/that/l
s/this/that/gp
s/this/that/gl
.P2
are all legal, and mean slightly different things.
Make sure you know what the differences are.
.PP
Of course, any
.UL s
command can be preceded by one or two `line numbers'
to specify that the substitution is to take place
on a group of lines.
Thus
.P1
1,$s/mispell/misspell/
.P2
changes the
.ul
first
occurrence of
`mispell' to `misspell' on every line of the file.
But
.P1
1,$s/mispell/misspell/g
.P2
changes
.ul
every
occurrence in every line
(and this is more likely to be what you wanted in this
particular case).
.PP
You should also notice that if you add a
.UL p
or
.UL l
to the end of any of these substitute commands,
only the last line that got changed will be printed,
not all the lines.
We will talk later about how to print all the lines
that were modified.
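.PP
As a small worked case
(the sample line here is invented for illustration),
suppose the current line is
.P1
this one or this one
.P2
Then
.P1
s/this/that/p
.P2
should print
.P1
that one or this one
.P2
while, starting again from the original line,
.P1
s/this/that/gp
.P2
should print
.P1
that one or that one
.P2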
.SH
The Undo Command `u'
.PP
Occasionally you will make a substitution in a line,
only to realize too late that it was a ghastly mistake.
The `undo' command
.UL u
lets you `undo' the last substitution:
the last line that was substituted can be restored to
its previous state by typing the command
.P1
u
.P2
.SH
The Metacharacter `\*.'
.PP
As you have undoubtedly noticed
when you use
.UL ed ,
certain characters have unexpected meanings
when they occur in the left side of a substitute command,
or in a search for a particular line.
In the next several sections, we will talk about
these special characters,
which are often called `metacharacters'.
.PP
The first one is the period `\*.'.
On the left side of a substitute command,
or in a search with `/.../',
`\*.' stands for
.ul
any
single character.
Thus the search
.P1
/x\*.y/
.P2
finds any line where `x' and `y' occur separated by
a single character, as in
.P1
x+y
x\-y
x\*By
x\*.y
.P2
and so on.
(We will use \*B to stand for a space whenever we need to
make it visible.)
.PP
Since `\*.' matches a single character,
that gives you a way to deal with funny characters
printed by
.UL l .
Suppose you have a line that, when printed with the
.UL l
command, appears as
.P1
.... th\*e07is ....
.P2
and you want to get rid of the
\*e07
(which represents the bell character, by the way).
.PP
The most obvious solution is to try
.P1
s/\*e07//
.P2
but this will fail. (Try it.)
The brute force solution, which most people would now take,
is to re-type the entire line.
This is guaranteed to work, and is actually quite a reasonable tactic
if the line in question isn't too big,
but for a very long line,
re-typing is a bore.
This is where the metacharacter `\*.' comes in handy.
Since `\*e07' really represents a single character,
if we say
.P1
s/th\*.is/this/
.P2
the job is done.
The `\*.' matches the mysterious character between the `h' and the `i',
.ul
whatever it is.
.PP
Bear in mind that since `\*.' matches any single character,
the command
.P1
s/\*./,/
.P2
converts the first character on a line into a `,',
which very often is not what you intended.
.PP
As is true of many characters in
.UL ed ,
the `\*.' has several meanings, depending
on its context.
This line shows all three:
.P1
\&\*.s/\*./\*./
.P2
The first `\*.' is a line number,
the number of
the line we are editing,
which is called `line dot'.
(We will discuss line dot more in Section 3.)
The second `\*.' is a metacharacter
that matches any single character on that line.
The third `\*.' is the only one that really is
an honest literal period.
On the
.ul
right
side of a substitution, `\*.'
is not special.
If you apply this command to the line
.P1
Now is the time\*.
.P2
the result will
be
.P1
\&\*.ow is the time\*.
.P2
which is probably not what you intended.
.SH
The Backslash `\*e'
.PP
Since a period means `any character',
the question naturally arises of what to do
when you really want a period.
For example, how do you convert the line
.P1
Now is the time\*.
.P2
into
.P1
Now is the time?
.P2
The backslash `\*e' does the job.
A backslash turns off any special meaning that the next character
might have; in particular,
`\*e\*.' converts the `\*.' from a `match anything'
into a period, so
you can use it to replace
the period in
.P1
Now is the time\*.
.P2
like this:
.P1
s/\*e\*./?/
.P2
The pair of characters `\*e\*.' is considered by
.UL ed
to be a single real period.
.PP
The backslash can also be used when searching for lines
that contain a special character.
Suppose you are looking for a line that contains
.P1
\&\*.PP
.P2
The search
.P1
/\*.PP/
.P2
isn't adequate, for it will find
a line like
.P1
THE APPLICATION OF ...
.P2
because the `\*.' matches the letter `A'.
But if you say
.P1
/\*e\*.PP/
.P2
you will find only lines that contain `\*.PP'.
.PP
The backslash can also be used to turn off special meanings for
characters other than `\*.'.
For example, consider finding a line that contains a backslash.
The search
.P1
/\*e/
.P2
won't work,
because the `\*e' isn't a literal `\*e', but instead means that the second `/'
no longer \%delimits the search.
But by preceding a backslash with another one,
you can search for a literal backslash.
Thus
.P1
/\*e\*e/
.P2
does work.
Similarly, you can search for a forward slash `/' with
.P1
/\*e//
.P2
The backslash turns off the meaning of the immediately following `/' so that
it doesn't terminate the /.../ construction prematurely.
.PP
As an exercise, before reading further, find two substitute commands each of which will
convert the line
.P1
\*ex\*e\*.\*ey
.P2
into the line
.P1
\*ex\*ey
.P2
.PP
Here are several solutions;
verify that each works as advertised.
.P1
s/\*e\*e\*e\*.//
s/x\*.\*./x/
s/\*.\*.y/y/
.P2
.PP
A couple of miscellaneous notes about
backslashes and special characters.
First, you can use any character to delimit the pieces
of an
.UL s
command: there is nothing sacred about slashes.
(But for context searching you must use slashes,
or question marks when searching backwards.)
For instance, in a line that contains a lot of slashes already, like
.P1
//exec //sys.fort.go // etc...
.P2
you could use a colon as the delimiter _
to delete all the slashes, type
.P1
s:/::g
.P2
.PP
Second, if # and @ are your character erase and line kill characters,
you have to type \*e# and \*e@;
this is true whether you're talking to
.UL ed
or any other program.
.PP
When you are adding text with
.UL a
or
.UL i
or
.UL c ,
backslash is not special, and you should only put in
one backslash for each one you really want.
.SH
The Dollar Sign `$'
.PP
The next metacharacter, the `$', stands for `the end of the line'.
As its most obvious use, suppose you have the line
.P1
Now is the
.P2
and you wish to add the word `time' to the end.
Use the $ like this:
.P1
s/$/\*Btime/
.P2
to get
.P1
Now is the time
.P2
Notice that a space is needed before `time' in
the substitute command,
or you will get
.P1
Now is thetime
.P2
.PP
As another example, replace the second comma in
the following line with a period without altering the first:
.P1
Now is the time, for all good men,
.P2
The command needed is
.P1
s/,$/\*./
.P2
The $ sign here provides context to make specific which comma we mean.
Without it, of course, the
.UL s
command would operate on the first comma to produce
.P1
Now is the time\*. for all good men,
.P2
.PP
As another example, to convert
.P1
Now is the time\*.
.P2
into
.P1
Now is the time?
.P2
as we did earlier, we can use
.P1
s/\*.$/?/
.P2
.PP
Like `\*.', the `$'
has multiple meanings depending on context.
In the line
.P1
$s/$/$/
.P2
the first `$' refers to the
last line of the file,
the second refers to the end of that line,
and the third is a literal dollar sign,
to be added to that line.
.SH
The Circumflex `^'
.PP
The circumflex (or hat or caret)
`^' stands for the beginning of the line.
For example, suppose you are looking for a line that begins
with `the'.
If you simply say
.P1
/the/
.P2
you will in all likelihood find several lines that contain `the' in the middle before
arriving at the one you want.
But with
.P1
/^the/
.P2
you narrow the context, and thus arrive at the desired one
more easily.
.PP
The other use of `^' is of course to enable you to insert
something at the beginning of a line:
.P1
s/^/\*B/
.P2
places a space at the beginning of the current line.
.PP
Metacharacters can be combined. To search for a
line that contains
.ul
only
the characters
.P1
\&\*.PP
.P2
you can use the command
.P1
/^\*e\*.PP$/
.P2
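.PP
As two further sketches to try,
the search
.P1
/^$/
.P2
looks for a line with nothing at all between its beginning and its end,
that is, an empty line,
while
.P1
/^the$/
.P2
looks for a line that consists of the word `the' and nothing else.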
.SH
The Star `*'
.PP
Suppose you have a line that looks like this:
.P1
\fItext \fR x          y \fI text \fR
.P2
where
.ul
text
stands
for lots of text,
and there is some indeterminate number of spaces between the
.UL x
and the
.UL y .
Suppose the job is to replace all the spaces between
.UL x
and
.UL y
by a single space.
The line is too long to retype, and there are too many spaces
to count.
What now?
.PP
This is where the metacharacter `*'
comes in handy.
A character followed by a star
stands for as many consecutive occurrences of that
character as possible.
To refer to all the spaces at once, say
.P1
s/x\*B*y/x\*By/
.P2
The construction
`\*B*'
means
`as many spaces as possible'.
Thus `x\*B*y' means `an x, as many spaces as possible, then a y'.
.PP
The star can be used with any character, not just space.
If the original example was instead
.P1
\fItext \fR x--------y \fI text \fR
.P2
then all `\-' signs can be replaced by a single space
with the command
.P1
s/x-*y/x\*By/
.P2
.PP
Finally, suppose that the line was
.P1
\fItext \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fR
.P2
Can you see what trap lies in wait for the unwary?
If you blindly type
.P1
s/x\*.*y/x\*By/
.P2
what will happen?
The answer, naturally, is that it depends.
If there are no other x's or y's on the line,
then everything works, but it's blind luck, not good management.
Remember that `\*.' matches
.ul
any
single character?
Then `\*.*' matches as many single characters as possible,
and unless you're careful, it can eat up a lot more of the line
than you expected.
If the line was, for example, like this:
.P1
\fItext \fRx\fI text \fR x\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.\*.y \fI text \fRy\fI text \fR
.P2
then saying
.P1
s/x\*.*y/x\*By/
.P2
will take everything from the
.ul
first
`x' to the
.ul
last
`y',
which, in this example, is undoubtedly more than you wanted.
.PP
The solution, of course, is to turn off the special meaning of
`\*.' with
`\*e\*.':
.P1
s/x\*e\*.*y/x\*By/
.P2
Now everything works, for `\*e\*.*' means `as many
.ul
periods
as possible'.
.PP
There are times when the pattern `\*.*' is exactly what you want.
For example, to change
.P1
Now is the time for all good men ....
.P2
into
.P1
Now is the time\*.
.P2
use `\*.*' to eat up the `for' and everything after it:
.P1
s/\*Bfor\*.*/\*./
.P2
.PP
There are a couple of additional pitfalls associated with `*' that you should be aware of.
Most notable is the fact that `as many as possible' means
.ul
zero
or more.
The fact that zero is a legitimate possibility is
sometimes rather surprising.
For example, if our line contained
.P1
\fItext \fR xy \fI text \fR x          y \fI text \fR
.P2
and we said
.P1
s/x\*B*y/x\*By/
.P2
the
.ul
first
`xy' matches this pattern, for it consists of an `x',
zero spaces, and a `y'.
The result is that the substitute acts on the first `xy',
and does not touch the later one that actually contains some intervening spaces.
.PP
The way around this, if it matters, is to specify a pattern like
.P1
/x\*B\*B*y/
.P2
which says `an x, a space, then as many more spaces as possible, then a y',
in other words, one or more spaces.
.PP
The other startling behavior of `*' is again related to the fact
that zero is a legitimate number of occurrences of something
followed by a star. The command
.P1
s/x*/y/g
.P2
when applied to the line
.P1
abcdef
.P2
produces
.P1
yaybycydyeyfy
.P2
which is almost certainly not what was intended.
The reason for this behavior is that zero is a legal number
of matches,
and there are no x's at the beginning of the line
(so that gets converted into a `y'),
nor between the `a' and the `b'
(so that gets converted into a `y'), nor ...
and so on.
Make sure you really want zero matches;
if not, in this case write
.P1
s/xx*/y/g
.P2
`xx*' is one or more x's.
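.PP
Putting these pieces together,
here is one way to squeeze every run of spaces on a line
down to a single space
(the sample line is made up for the purpose).
If the current line is
.P1
Now\*B\*B\*Bis\*B\*Bthe\*Btime
.P2
then
.P1
s/\*B\*B*/\*B/gp
.P2
should print
.P1
Now\*Bis\*Bthe\*Btime
.P2
The pattern insists on at least one space,
so the zero-occurrence trap described above does not arise.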
.SH
The Brackets `[ ]'
.PP
Suppose that you want to delete any numbers
that appear
at the beginning of all lines of a file.
You might first think of trying a series of commands like
.P1
1,$s/^1*//
1,$s/^2*//
1,$s/^3*//
.P2
and so on,
but this is clearly going to take forever if the numbers are at all long.
Unless you want to repeat the commands over and over until
finally all numbers are gone,
you must get all the digits on one pass.
This is the purpose of the brackets [ and ].
.PP
The construction
.P1
[0123456789]
.P2
matches any single digit _
the whole thing is called a `character class'.
With a character class, the job is easy.
The pattern `[0123456789]*' matches zero or more digits (an entire number), so
.P1
1,$s/^[0123456789]*//
.P2
deletes all digits from the beginning of all lines.
.PP
Any characters can appear within a character class,
and just to confuse the issue there are essentially no special characters
inside the brackets;
even the backslash doesn't have a special meaning.
To search for special characters, for example, you can say
.P1
/[\*.\*e$^[]/
.P2
Within [...], the `[' is not special.
To get a `]' into a character class,
make it the first character.
.PP
It's a nuisance to have to spell out the digits,
so you can abbreviate them as
[0\-9];
similarly, [a\-z] stands for the lower case letters,
and
[A\-Z] for upper case.
.PP
As a final frill on character classes, you can specify a class
that means `none of the following characters'.
This is done by beginning the class with a `^':
.P1
[^0-9]
.P2
stands for `any character
.ul
except
a digit'.
Thus you might find the first line that doesn't begin with a tab or space
by a search like
.P1
/^[^(space)(tab)]/
.P2
.PP
Within a character class,
the circumflex has a special meaning
only if it occurs at the beginning.
Just to convince yourself, verify that
.P1
/^[^^]/
.P2
finds a line that doesn't begin with a circumflex.
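.PP
Character classes combine naturally with the other metacharacters.
As a sketch, writing (space) and (tab) for the actual characters as above,
.P1
1,$s/[(space)(tab)]*$//
.P2
should strip any trailing spaces and tabs from every line of the file.
Since zero occurrences is a legal match,
lines with nothing to strip also match, and are left as they were.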
.SH
The Ampersand `&'
.PP
The ampersand `&' is used primarily to save typing.
Suppose you have the line
.P1
Now is the time
.P2
and you want to make it
.P1
Now is the best time
.P2
Of course you can always say
.P1
s/the/the best/
.P2
but it seems silly to have to repeat the `the'.
The `&' is used to eliminate the repetition.
On the
.ul
right
side of a substitute, the ampersand means `whatever
was just matched', so you can say
.P1
s/the/& best/
.P2
and the `&' will stand for `the'.
Of course this isn't much of a saving if the thing
matched is just `the', but if it is something truly long or awful,
or if it is something like `\*.*'
which matches a lot of text,
you can save some tedious typing.
There is also much less chance of making a typing error
in the replacement text.
For example, to parenthesize a line,
regardless of its length,
.P1
s/\*.*/(&)/
.P2
.PP
The ampersand can occur more than once on the right side:
.P1
s/the/& best and & worst/
.P2
makes
.P1
Now is the best and the worst time
.P2
and
.P1
s/\*.*/&? &!!/
.P2
converts the original line into
.P1
Now is the time? Now is the time!!
.P2
.PP
To get a literal ampersand, naturally the backslash is used to turn off the special meaning:
.P1
s/ampersand/\*e&/
.P2
converts the word into the symbol.
Notice that `&' is not special on the left side
of a substitute, only on the
.ul
right
side.
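.PP
The ampersand is at its most useful when the pattern is one
you could not retype literally.
For instance, given the (invented) line
.P1
call 5551212 before noon
.P2
the command
.P1
s/[0-9][0-9]*/(&)/p
.P2
should print
.P1
call (5551212) before noon
.P2
since `&' stands for whichever digits the pattern happened to match.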
.SH
Substituting Newlines
.PP
.UL ed
provides a facility for splitting a single line into two or more shorter lines by `substituting in a newline'.
As the simplest example, suppose a line has gotten unmanageably long
because of editing (or merely because it was unwisely typed).
If it looks like
.P1
\fItext \fR xy \fI text \fR
.P2
you can break it between the `x' and the `y' like this:
.P1
s/xy/x\*e
y/
.P2
This is actually a single command,
although it is typed on two lines.
Bearing in mind that `\*e' turns off special meanings,
it seems relatively intuitive that a `\*e' at the end of
a line would make the newline there
no longer special.
.PP
You can in fact make a single line into several lines
with this same mechanism.
As a large example, consider underlining the word `very'
in a long line
by splitting `very' onto a separate line,
and preceding it by the
.UL roff
or
.UL nroff
formatting command `.ul'.
.P1
\fItext \fR a very big \fI text \fR
.P2
The command
.P1
s/\*Bvery\*B/\*e
\&.ul\*e
very\*e
/
.P2
converts the line into four shorter lines,
preceding the word `very' by the
line
`.ul',
and eliminating the spaces around the `very',
all at the same time.
.PP
When a newline is substituted
in, dot is left pointing at the last line created.
.PP
.SH
Joining Lines
.PP
Lines may also be joined together,
but this is done with the
.UL j
command
instead of
.UL s .
Given the lines
.P1
Now is
\*Bthe time
.P2
and supposing that dot is set to the first of them,
then the command
.P1
j
.P2
joins them together.
No blanks are added,
which is why we carefully showed a blank
at the beginning of the second line.
.PP
All by itself,
a
.UL j
command
joins line dot to line dot+1,
but any contiguous set of lines can be joined.
Just specify the starting and ending line numbers.
For example,
.P1
1,$jp
.P2
joins all the lines into one big one
and prints it.
(More on line numbers in Section 3.)
.SH
Rearranging a Line with \*e( ... \*e)
.PP
(This section should be skipped on first reading.)
Recall that `&' is a shorthand that stands for whatever
was matched by the left side of an
.UL s
command.
In much the same way you can capture separate pieces
of what was matched;
the only difference is that you have to specify
on the left side just what pieces you're interested in.
.PP
Suppose, for instance, that
you have a file of lines that consist of names in the form
.P1
Smith, A. B.
Jones, C.
.P2
and so on,
and you want the initials to precede the name, as in
.P1
A. B. Smith
C. Jones
.P2
It is possible to do this with a series of editing commands,
but it is tedious and error-prone.
(It is instructive to figure out how it is done, though.)
.PP
The alternative
is to `tag' the pieces of the pattern (in this case,
the last name, and the initials),
and then rearrange the pieces.
On the left side of a substitution,
if part of the pattern is enclosed between
\*e( and \*e),
whatever matched that part is remembered,
and available for use on the right side.
On the right side,
the symbol `\*e1' refers to whatever
matched the first \*e(...\*e) pair,
`\*e2' to the second \*e(...\*e),
and so on.
.PP
The command
.P1
1,$s/^\*e([^,]*\*e),\*B*\*e(\*.*\*e)/\*e2\*B\*e1/
.P2
although hard to read, does the job.
The first \*e(...\*e) matches the last name,
which is any string up to the comma;
this is referred to on the right side with `\*e1'.
The second \*e(...\*e) is whatever follows
the comma and any spaces,
and is referred to as `\*e2'.
.PP
Of course, with any editing sequence this complicated,
it's foolhardy to simply run it and hope.
The global commands
.UL g
and
.UL v
discussed in Section 4
provide a way for you to print exactly those
lines which were affected by the
substitute command,
and thus verify that it did what you wanted
in all cases.
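.PP
To see the pieces at work, trace the command by hand
on the first sample line,
.P1
Smith, A. B.
.P2
The first \*e(...\*e) matches `Smith';
the comma and the space after it are matched but not remembered;
the second \*e(...\*e) matches `A. B.'.
The right side `\*e2\*B\*e1' therefore produces
.P1
A. B. Smith
.P2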
.NH
LINE ADDRESSING IN THE EDITOR
.PP
The next general area we will discuss is that of
line addressing in
.UL ed ,
that is, how you specify what lines are to be
affected by editing commands.
We have already used constructions like
.P1
1,$s/x/y/
.P2
to specify a change on all lines.
And most users are long since familiar with
using a single newline (or return) to print the next line,
and with
.P1
/thing/
.P2
to find a line that contains `thing'.
Less familiar, surprisingly enough, is the
use of
.P1
?thing?
.P2
to scan
.ul
backwards
for the previous occurrence of `thing'.
This is especially handy when you realize that the thing
you want to operate on is back up the page from
where you are currently editing.
.PP
The slash and question mark are the only characters you can
use to delimit a context search, though you can use
essentially any character in a substitute command.
.SH
Address Arithmetic
.PP
The next step is to combine the line numbers
like `\*.', `$', `/.../' and `?...?'
with `+' and `\-'.
Thus
.P1
$-1
.P2
is a command to print the next to last line of
the current file (that is, one line before line `$').
For example, to recall how far you got in a previous editing session,
.P1
$-5,$p
.P2
prints the last six lines.
(Be sure you understand why it's six, not five.)
If there aren't six, of course, you'll get an error message.
.PP
As another example,
.P1
\&\*.-3,\*.+3p
.P2
prints from three lines before where you are now
(at line dot)
to three lines after,
thus giving you a bit of context.
By the way, the `+' can be omitted:
.P1
\&\*.-3,\*.3p
.P2
is absolutely identical in meaning.
.PP
Another area in which you can save typing effort
in specifying lines is to use `\-' and `+' as line numbers
by themselves.
.P1
-
.P2
by itself is a command to move back up one line in the file.
In fact, you can string several minus signs together to move
back up that many lines:
.P1
---
.P2
moves up three lines, as does `\-3'.
Thus
.P1
-3,+3p
.P2
is also identical to the examples above.
.PP
Since `\-' is shorter than `\*.\-1',
constructions like
.P1
-,\*.s/bad/good/
.P2
are useful. This changes `bad' to `good' on the previous line and
on the current line.
.PP
`+' and `\-' can be used in combination with searches using `/.../' and `?...?',
and with `$'.
The search
.P1
/thing/--
.P2
finds the line containing `thing', and positions you
two lines before it.
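.PP
As a sketch of how these combine,
.P1
/thing/-2,/thing/+2p
.P2
should print the next line containing `thing'
together with two lines of context on either side.
Both searches begin from the current line,
so they find the same `thing';
this assumes that at least two lines exist on each side of it.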
.SH
Repeated Searches
.PP
Suppose you ask for the search
.P1
/horrible thing/
.P2
and when the line is printed you discover that it
isn't the horrible thing that you wanted,
so it is necessary to repeat the search.
You don't have to re-type the search,
for the construction
.P1
//
.P2
is a shorthand for `the previous thing that was searched for',
whatever it was.
This can be repeated as many times as necessary.
You can also go backwards:
.P1
??
.P2
searches for the same thing,
but in the reverse direction.
.PP
Not only can you repeat the search, but you can
use `//' as the left side of a substitute command,
to mean
`the most recent pattern'.
.P1
/horrible thing/
.ft I
.... ed prints line with `horrible thing' ...
.ft R
s//good/p
.P2
To go backwards and change a line, say
.P1
??s//good/
.P2
Of course, you can still use the `&' on the right hand side of a substitute to stand for
whatever got matched:
.P1
//s//&\*B&/p
.P2
finds the next occurrence of whatever you searched for last,
replaces it by two copies of itself,
then prints the line just to verify that it worked.
.SH
Default Line Numbers and the Value of Dot
.PP
One of the most effective ways to speed up your editing
is always to know what lines will be affected
by a command if you don't specify the lines it is to act on,
and on what line you will be positioned (i.e., the value of dot) when a command finishes.
If you can edit without specifying unnecessary
line numbers, you can save a lot of typing.
.PP
As the most obvious example, if you issue a search command
like
.P1
/thing/
.P2
you are left pointing at the next line that contains `thing'.
Then no address is required with commands like
.UL s
to make a substitution on that line,
or
.UL p
to print it,
or
.UL l
to list it,
or
.UL d
to delete it,
or
.UL a
to append text after it,
or
.UL c
to change it,
or
.UL i
to insert text before it.
.PP
What happens if there was no `thing'?
Then you are left right where you were _
dot is unchanged.
This is also true if you were sitting
on the only `thing' when you issued the command.
The same rules hold for searches that use
`?...?'; the only difference is the direction
in which you search.
.PP
The delete command
.UL d
leaves dot pointing
at the line that followed the last deleted line.
When line `$' gets deleted,
however,
dot points at the
.ul
new
line `$'.
.PP
The line-changing commands
.UL a ,
.UL c
and
.UL i
by default all affect the current line _
if you give no line number with them,
.UL a
appends text after the current line,
.UL c
changes the current line,
and
.UL i
inserts text before the current line.
.PP
.UL a ,
.UL c ,
and
.UL i
behave identically in one respect _
when you stop appending, changing or inserting,
dot points at the last line entered.
This is exactly what you want for typing and editing on the fly.
For example, you can say
.P1
.ta 1.5i
a
... text ...
... botch ... (minor error)
\&\*.
s/botch/correct/ (fix botched line)
a
... more text ...
.P2
without specifying any line number for the substitute command or for
the second append command.
Or you can say
.P1 2
.ta 1.5i
a
... text ...
... horrible botch ... (major error)
\&\*.
c (replace entire line)
... fixed up line ...
.P2
.PP
You should experiment to determine what happens if you add
.ul
no
lines with
.UL a ,
.UL c
or
.UL i .
.PP
The
.UL r
command will read a file into the text being edited,
either at the end if you give no address,
or after the specified line if you do.
In either case, dot points at the last line read in.
Remember that you can even say
.UL 0r
to read a file in at the beginning of the text.
(You can also say
.UL 0a
or
.UL 1i
to start adding text at the beginning.)
.PP
The
.UL w
command writes out the entire file.
If you precede the command by one line number,
that line is written,
while if you precede it by two line numbers,
that range of lines is written.
The
.UL w
command does
.ul
not
change dot:
the current line remains the same,
regardless of what lines are written.
This is true even if you say something like
.P1
/^\*e\*.AB/,/^\*e\*.AE/w abstract
.P2
which involves a context search.
.PP
Since the
.UL w
command is so easy to use,
you should save what you are editing regularly
as you go along
just in case the system crashes, or in case you do something foolish,
like clobbering what you're editing.
.PP
The least intuitive behavior, in a sense, is that of the
.UL s
command.
The rule is simple _
you are left sitting on the last line that got changed.
If there were no changes, then dot is unchanged.
.PP
To illustrate,
suppose that there are three lines in the buffer, and you are sitting on
the middle one:
.P1
x1
x2
x3
.P2
Then the command
.P1
\&-,+s/x/y/p
.P2
prints the third line, which is the last one changed.
But if the three lines had been
.P1
x1
y2
y3
.P2
and the same command had been issued while
dot pointed
at the second line, then the result
would be to change and print only the first line,
and that is where dot would be set.
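.PP
If you want to convince yourself of this rule,
one way (a sketch, not the only one) is to follow the substitute
with `\*.=', which prints the line number of dot.
With the second set of lines above and dot at line 2,
the session would go like this:
.P1
.ta 1.5i
\&-,+s/x/y/p
y1	(only line 1 contained an `x')
\&\*.=
1	(dot is at the line that was changed)
.P2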
.SH
Semicolon `;'
.PP
Searches with `/.../' and `?...?' start
at the current line and move
forward or backward respectively
until they either find the pattern or get back to the current line.
Sometimes this is not what is wanted.
Suppose, for example, that the buffer contains lines like this:
.P1
\*.
\*.
\*.
ab
\*.
\*.
\*.
bc
\*.
\*.
.P2
Starting at line 1, one would expect that the command
.P1
/a/,/b/p
.P2
prints all the lines from the `ab' to the `bc' inclusive.
Actually this is not what happens.
.ul
Both
searches
(for `a' and for `b')
start from the same point, and thus they both find the line
that contains `ab'.
The result is to print a single line.
Worse, if there had been a line with a `b' in it
before the `ab' line, then the print command
would be in error, since the second line number
would be less than the first, and it is illegal to
try to print lines in reverse order.
.PP
This is because the comma separator
for line numbers doesn't set dot as each address is processed;
each search starts from the same place.
In
.UL ed ,
the semicolon `;' can be used just like comma,
with the single difference that use of a semicolon
forces dot to be set at that point
as the line numbers are being evaluated.
In effect, the semicolon `moves' dot.
Thus in our example above, the command
.P1
/a/;/b/p
.P2
prints the range of lines from `ab' to `bc',
because after the `a' is found, dot is set to that line,
and then `b' is searched for, starting beyond that line.
.PP
This property is most often useful in a very simple situation.
Suppose you want to find the
.ul
second
occurrence of `thing'.
You could say
.P1
/thing/
//
.P2
but this prints the first occurrence as well as the second,
and is a nuisance when you know very well that it is only
the second one you're interested in.
The solution is to say
.P1
/thing/;//
.P2
This says to find the first occurrence of `thing', set dot to that line, then find the second
and print only that.
.PP
Closely related is searching for the second previous
occurrence of something, as in
.P1
?something?;??
.P2
Printing the third or fourth or ...
in either direction is left as an exercise.
.PP
Finally, bear in mind that if you want to find the first occurrence of
something in a file, starting at an arbitrary place within the file,
it is not sufficient to say
.P1
1;/thing/
.P2
because this fails if `thing' occurs on line 1.
But it is possible to say
.P1
0;/thing/
.P2
(one of the few places where 0 is a legal line number),
for this starts the search at line 1.
.SH
Interrupting the Editor
.PP
As a final note on what dot gets set to,
you should be aware that if you hit the interrupt or delete
or rubout or break key
while
.UL ed
is doing a command, things are put back together again and your state
is restored as much as possible to what it was before the command
began.
Naturally, some changes are irrevocable _
if you are reading or writing a file or making substitutions or deleting lines, these will be stopped
in some clean but unpredictable state in the middle
(which is why it is not usually wise to stop them).
Dot may or may not be changed.
.PP
Printing is more clear cut.
Dot is not changed until the printing is done.
Thus if you print until you see an interesting line,
then hit delete, you are
.ul
not
sitting on that line or even near it.
Dot is left where it was when the
.UL p
command was started.
.NH
GLOBAL COMMANDS
.PP
The global commands
.UL g
and
.UL v
are used to perform one or more editing commands on all lines that either
contain
.UL g ) (
or don't contain
.UL v ) (
a specified pattern.
.PP
As the simplest example, the command
.P1
g/UNIX/p
.P2
prints all lines that contain the word `UNIX'.
The pattern that goes between the slashes can be anything
that could be used in a line search or in a substitute command;
exactly the same rules and limitations apply.
.PP
As another example, then,
.P1
g/^\*e\*./p
.P2
prints all the formatting commands in a file (lines that begin with `\*.').
.PP
The
.UL v
command is identical to
.UL g ,
except that it operates on those lines that do
.ul
not
contain an occurrence of the pattern.
(Don't look too hard for mnemonic significance to
the letter `v'.)
So
.P1
v/^\*e\*./p
.P2
prints all the lines that don't begin with `\*.' _
the actual text lines.
.PP
The command that follows
.UL g
or
.UL v
can be anything:
.P1
g/^\*e\*./d
.P2
deletes all lines that begin with `\*.',
and
.P1
g/^$/d
.P2
deletes all empty lines.
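.PP
The same job can be phrased the other way round with
.UL v .
As a sketch, relying only on the rule that `\*.' in a pattern matches any single character,
.P1
v/\*./d
.P2
deletes every line that does not contain at least one character,
which is another way of saying every empty line.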
.PP
Probably the most useful command that can follow a global is the
substitute command, for this can be used to make a change
and print each affected line for verification.
For example, we could change the word `Unix' to `UNIX'
everywhere, and verify that
it really worked,
with
.P1
g/Unix/s//UNIX/gp
.P2
Notice that we used `//' in the substitute command to mean
`the previous pattern', in this case, `Unix'.
The
.UL p
command is done on every line
that matches the pattern,
not just those on which a substitution took place.
.PP
The global command operates by making
two passes over the file.
On the first pass, all lines that match the pattern are marked.
On the second pass, each marked line in turn is examined,
dot is set to that line, and the command executed.
This means that it is possible for the command that follows a
.UL g
or
.UL v
to use addresses, set dot, and so on, quite freely.
.P1
g/^\*e\*.PP/+
.P2
prints the line that follows each `.PP' command (the signal for
a new paragraph in some formatting packages).
Remember that `+' means `one line past dot'.
And
.P1
g/topic/?^\*e\*.SH?1
.P2
searches for each line that contains `topic', scans backwards until it finds
a line that begins `.SH' (a section heading) and prints the line
that follows that,
thus showing the section headings under which `topic' is mentioned.
Finally,
.P1
g/^\*e\*.EQ/+,/^\*e\*.EN/-p
.P2
prints all the lines that lie between
lines beginning with `.EQ' and `.EN' formatting commands.
.PP
The
.UL g
and
.UL v
commands can also be
preceded by line numbers, in which case the lines searched
are only those in the range specified.
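.PP
For instance, as an illustrative sketch
(the line numbers are arbitrary, and the buffer is assumed to hold at least twenty lines),
.P1
1,20g/Unix/s//UNIX/gp
.P2
makes the change only where `Unix' appears in the first twenty lines,
and
.P1
\&\*.,$v/^\*e\*./p
.P2
prints the text lines, but not the formatting commands,
from the current line to the end of the file.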
.SH
Multi-line Global Commands
.PP
It is possible to do more than one command under the control of a
global command, although the syntax for expressing the operation
is not especially natural or pleasant.
As an example,
suppose the task is to change `x' to `y' and `a' to `b' on all lines
that contain `thing'.
Then
.P1
g/thing/s/x/y/\*e
s/a/b/
.P2
is sufficient.
The `\*e' signals the
.UL g
command that the set of commands continues on the next line;
it terminates on the first line that does not end with `\*e'.
(As a minor blemish, you can't use a substitute command
to insert a newline within a
.UL g
command.)
.PP
You should watch out for this problem:
the command
.P1
g/x/s//y/\*e
s/a/b/
.P2
does
.ul
not
work as you expect.
The remembered pattern is the last pattern that was actually
executed,
so sometimes it will be
`x' (as expected), and sometimes it will be `a'
(not expected).
You must spell it out, like this:
.P1
g/x/s/x/y/\*e
s/a/b/
.P2
.PP
It is also possible to execute
.UL a ,
.UL c
and
.UL i
commands under a global command; as with other multi-line constructions,
all that is needed is to add a `\*e' at the end of each line except the last.
Thus to add a `.nf' and `.sp' command before each `.EQ' line, type
.P1
g/^\*e\*.EQ/i\*e
\&\*.nf\*e
\&\*.sp
.P2
There is no need for a final line containing a
`\*.' to terminate the
.UL i
command,
unless there are further commands
being done under the global.
On the other hand, it does no harm to put it in either.
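.PP
To see when the terminating `\*.' is needed,
here is a sketch in which a second command follows the appended text.
It adds a `.fi' line after each `.EN' line,
then prints the `.EN' line to show where the change was made
(the choice of `.fi' is only for illustration):
.P1
g/^\*e\*.EN/a\*e
\&\*.fi\*e
\&\*.\*e
\&-p
.P2
The line containing only `\*.' ends the text being appended,
so that
.UL ed
treats `\-p' as a command rather than as more text.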
.NH
CUT AND PASTE WITH UNIX COMMANDS
.PP
One editing area in which non-programmers
seem less than confident
is what might be called
`cut and paste' operations _
changing the name of a file,
making a copy of a file somewhere else,
moving a few lines from one place to another in a file,
inserting one file in the middle of another,
splitting a file into pieces,
and
splicing two or more files together.
.PP
Yet most of these operations are actually quite easy,
if you keep your wits about you
and go cautiously.
The next several sections talk about cut and paste.
We will begin with the
.UX
commands
for moving entire files around,
then discuss
.UL ed
commands
for operating on pieces of files.
.SH
Changing the Name of a File
.PP
You have a file named
`memo'
and you want it to be called
`paper'
instead.
How is it done?
.PP
The
.UX
program that renames files
is called
.UL mv
(for `move');
it `moves' the file from one name to another, like this:
.P1
mv memo paper
.P2
That's all there is to it:
.UL mv
from the old name to the new name.
.P1
mv oldname newname
.P2
Warning: if there is already a file around with the new name,
its present contents will be
silently
clobbered
by the information from the other file.
The one exception is that you can't move a file
to itself _
.P1
mv x x
.P2
is illegal.
.SH
Making a Copy of a File
.PP
Sometimes what you want is a copy of a file _
an entirely fresh version.
This might be because you want to work on a file, and
yet save a copy in case something gets fouled up,
or just because you're paranoid.
.PP
In any case, the way to do it is with the
.UL cp
command.
.UL cp \& (
stands for `copy';
the
.UC UNIX
system
is big on short command names,
which are appreciated by heavy users,
but sometimes a strain for novices.)
Suppose you have a file called
`good'
and
you want to save a copy before you make some
dramatic editing changes.
Choose a name _
`savegood'
might be acceptable _ then type
.P1
cp good savegood
.P2
This copies
`good'
onto
`savegood',
and you now have two identical copies of the file
`good'.
(If
`savegood'
previously contained something,
it gets overwritten.)
.PP
Now if you decide at some time that you want to get
back to the original state of
`good',
you can say
.P1
mv savegood good
.P2
(if you're not interested in
`savegood'
any more), or
.P1
cp savegood good
.P2
if you still want to retain a safe copy.
.PP
In summary,
.UL mv
just renames a file;
.UL cp
makes a duplicate copy.
Both of them clobber the `target' file
if it already exists, so you had better
be sure that's what you want to do
.ul
before
you do it.
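.PP
One cautious habit, offered here only as a suggestion,
is to ask whether the target name is already in use before you copy or move onto it.
The
.UL ls
command lists files,
so
.P1
ls savegood
cp good savegood
.P2
lets you see first whether a file called `savegood' already exists;
if
.UL ls
complains that it cannot find the file, the copy will not clobber anything.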
.SH
Removing a File
.PP
If you decide you are really done with a file
forever, you can remove it
with the
.UL rm
command:
.P1
rm savegood
.P2
throws away (irrevocably) the file called
`savegood'.
.SH
Putting Two or More Files Together
.PP
The next step is the familiar one of collecting two or more
files into one big one.
This will be needed, for example,
when the author of a paper
decides that several sections need to be combined
into one.
There are several ways to do it,
of which the cleanest, once you get used to it,
is a program called
.UL cat .
(Not
.ul
all
.UC UNIX
programs have two-letter names.)
.UL cat
is short for
`concatenate', which is exactly
what we want to do.
.PP
Suppose the job is to combine the files
`file1'
and
`file2'
into a single file called
`bigfile'.
If you say
.P1
cat file
.P2
the contents of
`file'
will get printed on your terminal.
If you say
.P1
cat file1 file2
.P2
the contents of
`file1'
and then the contents of
`file2'
will
.ul
both
be printed on your terminal,
in that order.
So
.UL cat
combines the files, all right,
but it's not much help to print them on the terminal _
we want them in
`bigfile'.
.PP
Fortunately, there is a way.
You can tell
the system
that instead of printing on your terminal,
you want the same information put in a file.
The way to do it is to add to the command line
the character
.UL >
and the name of the file
where you want the output to go.
Then you can say
.P1
cat file1 file2 >bigfile
.P2
and the job is done.
(As with
.UL cp
and
.UL mv ,
you're putting something into
`bigfile',
and anything that was already there is destroyed.)
.PP
This ability to
`capture' the output of a program
is one of the most useful aspects of
the
.UC UNIX
system.
Fortunately it's not limited to the
.UL cat
program _
you can use it with
.ul
any
program that prints on your terminal.
We'll see some more uses for it in a moment.
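.PP
As one small sketch, the
.UL ls
command normally prints the names of your files on the terminal;
.P1
ls >filelist
.P2
puts the list into a file instead
(`filelist' is just a name chosen for illustration).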
.PP
Naturally, you can combine several files,
not just two:
.P1
cat file1 file2 file3 ... >bigfile
.P2
collects a whole bunch.
.PP
Question:
is there any difference between
.P1
cp good savegood
.P2
and
.P1
cat good >savegood
.P2
Answer: for most purposes, no.
You might reasonably ask why there are two programs
in that case,
since
.UL cat
is obviously all you need.
The answer is that
.UL cp
will do some other things as well,
which you can investigate for yourself
by reading the manual.
For now we'll stick to simple usages.
.SH
Adding Something to the End of a File
.PP
Sometimes you want to add one file to the end of another.
We have enough building blocks now that you can do it;
in fact before reading further it would be valuable
if you figured out how.
To be specific,
how would you use
.UL cp ,
.UL mv
and/or
.UL cat
to add the file
`good1'
to the end of the file
`good'?
.PP
You could try
.P1
cat good good1 >temp
mv temp good
.P2
which is probably most direct.
You should also understand why
.P1
cat good good1 >good
.P2
doesn't work.
(Don't practice with a good `good'!)
.PP
The easy way is to use a variant of
.UL > ,
called
.UL >> .
In fact,
.UL >>
is identical to
.UL >
except that instead of clobbering the old file,
it simply tacks stuff on at the end.
Thus you could say
.P1
cat good1 >>good
.P2
and
`good1'
is added to the end of
`good'.
(And if
`good'
didn't exist,
this makes a copy of
`good1'
called
`good'.)
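.PP
This also means a big file can be built up a piece at a time.
As a sketch, with hypothetical files `chap1', `chap2' and `chap3',
.P1
cat chap1 >book
cat chap2 >>book
cat chap3 >>book
.P2
leaves `book' with the same contents as
.P1
cat chap1 chap2 chap3 >book
.P2
would have.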
.NH
CUT AND PASTE WITH THE EDITOR
.PP
Now we move on to manipulating pieces of files _
individual lines or groups of lines.
This is another area where new users seem
unsure of themselves.
.SH
Filenames
.PP
The first step is to ensure that you know the
.UL ed
commands for reading and writing files.
Of course you can't go very far without knowing
.UL r
and
.UL w .
Equally useful, but less well known, is the `edit' command
.UL e .
Within
.UL ed ,
the command
.P1
e newfile
.P2
says `I want to edit a new file called
.ul
newfile,
without leaving the editor.'
The
.UL e
command discards whatever you're currently working on
and starts over on
.ul
newfile.
It's exactly the same as if you had quit with the
.UL q
command, then re-entered
.UL ed
with a new file name,
except that if you have a pattern remembered, then a command
like
.UL //
will still work.
.PP
If you enter
.UL ed
with the command
.P1
ed file
.P2
.UL ed
remembers the name of the file,
and any subsequent
.UL e ,
.UL r
or
.UL w
commands that don't contain a filename
will refer to this remembered file.
Thus
.P1 2
.ta .5i .6i .7i
ed file1
\&... (editing) ...
w (writes back in file1)
e file2 (edit new file, without leaving editor)
\&... (editing on file2) ...
w (writes back on file2)
.P2
(and so on) does a series of edits on various files
without ever leaving
.UL ed
and without typing the name of any file more than once.
(As an aside, if you examine the sequence of commands here,
you can see why many
.UC UNIX
systems use
.UL e
as a synonym
for
.UL ed .)
.PP
You can find out the remembered file name at any time
with the
.UL f
command;
just type
.UL f
without a file name.
You can also change the remembered file name with
.UL f ;
a useful sequence is
.P1
ed precious
f junk
\&... (editing) ...
.P2
which gets a copy of a precious file,
then uses
.UL f
to guarantee that a careless
.UL w
command won't clobber the original.
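.PP
A slightly fuller sketch of that sequence
shows how you might check the remembered name before writing
(the comments at the right are explanation, not part of what you type):
.P1
.ta 1.5i
ed precious
f junk	(remembered name is now `junk')
\&... (editing) ...
f	(ed prints the remembered name, `junk')
w	(writes `junk'; `precious' is untouched)
.P2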
.SH
Inserting One File into Another
.PP
Suppose you have a file called
`memo',
and you want the file called
`table'
to be inserted just after the reference to
Table 1.
That is, in
`memo'
somewhere is a line that says
.IP
Table 1 shows that ...
.LP
and the data contained in
`table'
has to go there,
probably so it will be formatted
properly by
.UL nroff
or
.UL troff .
Now what?
.PP
This one is easy.
Edit
`memo',
find
`Table 1',
and add the file
`table'
right there:
.P1
ed memo
/Table 1/
.ft I
Table 1 shows that ... [response from ed]
.ft
\&\*.r table
.P2
The critical line is the last one.
As we said earlier, the
.UL r
command reads a file;
here you asked for it to be read in right after
line dot.
An
.UL r
command without any address
adds lines at the end,
so it is the same as
.UL $r .
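.PP
Since
.UL r
accepts any line address in front of it,
the search and the read can also be combined on one line.
As a sketch, under the same assumptions about `memo' and `table',
.P1
/Table 1/r table
.P2
reads `table' in after the next line that contains `Table 1',
without first stopping to print that line.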
.SH
Writing out Part of a File
.PP
The other side of the coin is writing out part of
the document you're editing.
For example, maybe
you want to split out into a separate file
that table from the previous example,
so it can be formatted and tested separately.
Suppose that in the file being edited
we have
.P1
\&\*.TS
\&...[lots of stuff]
\&\*.TE
.P2
which is the way a table is set up for the
.UL tbl
program.
To isolate
the table
in a separate file called
`table',
first find the start of the table
(the `.TS' line), then write out the interesting part:
.P1
/^\*e\*.TS/
.ft I
\&\*.TS [ed prints the line it found]
.ft R
\&\*.,/^\*e\*.TE/w table
.P2
and the job is done.
If you are confident, you can do it all at once with
.P1
/^\*e\*.TS/;/^\*e\*.TE/w table
.P2
.PP
The point is that the
.UL w
command can
write out a group of lines, instead of the whole file.
In fact, you can write out a single line if you like;
just give one line number instead of two.
For example, if you have just typed a horribly complicated line
and you know that it (or something like it) is going to be needed later,
then save it _ don't re-type it.
In the editor, say
.P1
a
\&...lots of stuff...
\&...horrible line...
\&\*.
\&\*.w temp
a
\&\*.\*.\*.more stuff\*.\*.\*.
\&\*.
\&\*.r temp
a
\&\*.\*.\*.more stuff\*.\*.\*.
\&\*.
.P2
This last example is worth studying, to be sure you appreciate
what's going on.
.SH
Moving Lines Around
.PP
Suppose you want to
move a paragraph from its present position in a paper
to the end.
How would you do it?
As a concrete example, suppose each paragraph in the paper
begins with the formatting command
`.PP'.
Think about it and write down the details before reading on.
.PP
The brute force way
(not necessarily bad)
is to write the paragraph onto a temporary file,
delete it from its current position,
then read in the temporary file at the end.
Assuming that you are sitting on the
`.PP' command that begins
the paragraph, this is the sequence of commands:
.P1
\&\*.,/^\*e\*.PP/-w temp
\&\*.,//-d
$r temp
.P2
That is, from where you are now
(`\*.')
until one line before the next `\*.PP'
(`/^\*e\*.PP/\-')
write onto
`temp'.
Then delete the same lines.
Finally, read
`temp'
at the end.
.PP
As we said, that's the brute force way.
The easier way (often)
is to use the
.ul
move
command
.UL m
that
.UL ed
provides _
it lets you do the whole set of operations
at one crack,
without any temporary file.
.PP
The
.UL m
command
is like many other
.UL ed
commands in that it takes up to two line numbers in front
that tell what lines are to be affected.
It is also
.ul
followed
by a line number that tells where the lines are to go.
Thus
.P1
line1, line2 m line3
.P2
says to move all the lines between
`line1'
and
`line2'
after
`line3'.
Naturally, any of
`line1'
etc., can be patterns between slashes,
$
signs, or other ways to specify lines.
.PP
Suppose again that you're sitting at the first line of the
paragraph.
Then you can say
.P1
\&\*.,/^\*e\*.PP/-m$
.P2
That's all.
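.PP
The target does not have to be the end.
As a sketch, assuming once more that you are sitting on the `.PP' that begins the paragraph
and that another `.PP' follows it somewhere,
.P1
\&\*.,/^\*e\*.PP/-m0
.P2
moves the paragraph to the very beginning of the file,
since line 0 as a target means `before the first line'.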
.PP
As another example of a frequent operation,
you can reverse the order of two adjacent lines
by moving the first one
to after the second.
Suppose that you are positioned at the first.
Then
.P1
m+
.P2
does it.
It says to move line dot to just after the line that follows it.
If you are positioned on the second line,
.P1
m--
.P2
does the interchange.
.PP
As you can see, the
.UL m
command is more succinct and direct than
writing, deleting and re-reading.
When is brute force better anyway?
This is a matter of personal taste _
do what you have most confidence in.
The main difficulty with the
.UL m
command
is that if you use patterns to specify both the lines
you are moving and the target,
you have to take care that you specify them properly,
or you may well not move the lines you thought you did.
The result of a botched
.UL m
command can be a ghastly mess.
Doing the job a step at a time
makes it easier for you to verify at each step
that you accomplished what you wanted to.
It's also a good idea to issue a
.UL w
command
before doing anything complicated;
then if you goof, it's easy to back up
to where you were.
.SH
Marks
.PP
.UL ed
provides a facility for marking a line
with a particular name so you can later reference it
by name
regardless of its actual line number.
This can be handy for moving lines,
and for keeping track of them as they move.
The
.ul
mark
command is
.UL k ;
the command
.P1
kx
.P2
marks the current line with the name `x'.
If a line number precedes the
.UL k ,
that line is marked.
(The mark name must be a single lower case letter.)
Now you can refer to the marked line with the address
.P1
\(fmx
.P2
.PP
Marks are most useful for moving things around.
Find the first line of the block to be moved, and mark it
with
.ul
\(fma.
Then find the last line and mark it with
.ul
\(fmb.
Now position yourself at the place where the stuff is to go
and say
.P1
\(fma,\(fmbm\*.
.P2
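.PP
Put together, a complete session might look like the following sketch,
which moves the next table (the lines from `.TS' through `.TE') to the end of the file.
The choice of a table as the block is only for illustration,
and dot is assumed not to be inside a table to begin with:
.P1
.ta 1.5i
/^\*e\*.TS/ka	(mark the first line of the block)
/^\*e\*.TE/kb	(mark the last line)
$	(go to where the block should end up)
\(fma,\(fmbm\*.	(move the block after dot)
.P2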
.PP
Bear in mind that only one line can have a particular
mark name associated with it
at any given time.
.SH
Copying Lines
.PP
We mentioned earlier the idea of saving a line
that was hard to type or used often,
so as to cut down on typing time.
Of course this could be more than one line;
then the saving is presumably even greater.
.PP
.UL ed
provides another command,
called
.UL t
(for `transfer')
for making a copy of a group of one or more lines
at any point.
This is often easier than writing and reading.
.PP
The
.UL t
command is identical to the
.UL m
command, except that instead of moving lines
it simply duplicates them at the place you named.
Thus
.P1
1,$t$
.P2
duplicates the entire contents that you are editing.
A more common use for
.UL t
is for creating a series of lines that differ only slightly.
For example, you can say
.P1
.ta 1i
a
\&.......... x ......... (long line)
\&\*.
t\*. (make a copy)
s/x/y/ (change it a bit)
t\*. (make third copy)
s/y/z/ (change it a bit)
.P2
and so on.
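.PP
The lines to be copied can be picked out by patterns as well.
As a sketch, using the semicolon described earlier,
.P1
/^\*e\*.TS/;/^\*e\*.TE/t$
.P2
finds the next table and puts a copy of it at the end of the file,
leaving the original where it was.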
.SH
The Temporary Escape `!'
.PP
Sometimes it is convenient to be able
to temporarily escape from the editor to do
some other
.UX
command,
perhaps one of the file copy or move commands
discussed in section 5,
without leaving the editor.
The `escape' command
.UL !
provides a way to do this.
.PP
If you say
.P1
!any UNIX command
.P2
your current editing state is suspended,
and the
.UX
command you asked for is executed.
When the command finishes,
.UL ed
will signal you by printing another
.UL ! ;
at that point you can resume editing.
.PP
You can really do
.ul
any
.UX
command, including another
.UL ed .
(This is quite common, in fact.)
In this case, you can even do another
.UL ! .
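.PP
For example, here is a sketch of making a safety copy in the middle of an editing session.
The file names are hypothetical,
and the character count that
.UL ed
prints in reply to
.UL w
is not shown:
.P1
.ta 1.5i
w	(make sure the saved file is up to date)
!cp memo savememo
!	(ed signals that the command is done)
.P2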
.NH
SUPPORTING TOOLS
.PP
There are several tools and techniques that go along with the
editor, all of which are relatively easy once you
know how
.UL ed
works,
because they are all based on the editor.
In this section we will give some fairly cursory examples
of these tools,
more to indicate their existence than to provide
a complete tutorial.
More information on each can be found in
[3].
.SH
Grep
.PP
Sometimes you want to find all occurrences of some word or pattern in
a set of files, to edit them
or perhaps just to verify their presence or absence.
It may be possible to edit each file separately and look
for the pattern of interest, but if there are many files
this can get very tedious,
and if the files are really big,
it may be impossible because of limits in
.UL ed .
.PP
The program
.UL grep
was invented to get around these limitations.
The search patterns that we have described in the paper are often
called `regular expressions', and
`grep' stands for
.P1
g/re/p
.P2
That describes exactly what
.UL grep
does _
it prints every line in a set of files that contains a
particular pattern.
Thus
.P1
grep \(fmthing\(fm file1 file2 file3 ...
.P2
finds `thing' wherever it occurs in any of the files
`file1',
`file2',
etc.
.UL grep
also indicates the file in which the line was found,
so you can later edit it if you like.
.PP
The pattern represented by `thing' can be any
pattern you can use in the editor,
since
.UL grep
and
.UL ed
use exactly the same mechanism for
pattern searching.
It is wisest always to enclose the pattern in
single quotes \(fm...\(fm if it contains any non-alphabetic
characters, since many such characters also mean something
special to the
.UX
command interpreter
(the `shell').
If you don't quote them, the command interpreter will
try to interpret them before
.UL grep
gets a chance.
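.PP
As a sketch, to find the paragraph commands in a set of files you might type
.P1
grep \(fm^\*e\*.PP\(fm file1 file2
.P2
Here the quotes matter:
the pattern contains `^' and `\*e', which the command interpreter could otherwise act on
before
.UL grep
ever saw them.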
.PP
There is also a way to find lines that
.ul
don't
contain a pattern:
.P1
grep -v \(fmthing\(fm file1 file2 ...
.P2
finds all lines that
don't contain `thing'.
The
.UL \-v
must occur in the position shown.
Given
.UL grep
and
.UL grep\ \-v ,
it is possible to do things like selecting all lines that
contain some combination of patterns.
For example, to get all lines that contain `x' but not `y':
.P1
grep x file... | grep -v y
.P2
(The notation | is a `pipe',
which causes the output of the first command to be used as
input to the second command; see [2].)
.SH
Editing Scripts
.PP
If a fairly complicated set of editing operations
is to be done on a whole set of files,
the easiest thing to do is to make up a `script',
i.e., a file that contains the operations you want to perform,
then apply this script to each file in turn.
.PP
For example, suppose you want to change every
`Unix' to `UNIX' and every `Gcos' to `GCOS' in a large number of files.
Then put into the file `script' the lines
.P1
g/Unix/s//UNIX/g
g/Gcos/s//GCOS/g
w
q
.P2
Now you can say
.P1
ed file1 <script
ed file2 <script
\&...
.P2
This causes
.UL ed
to take its commands from the prepared script.
Notice that the whole job has to be planned in advance.
.PP
And of course by using the
.UX
command interpreter, you can
cycle through a set of files
automatically, with varying degrees of ease.
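.PP
As a sketch of one such way,
assuming your command interpreter provides a
.UL for
loop (the standard shell does),
.P1
for i in file1 file2 file3
do
	ed $i <script
done
.P2
runs
.UL ed
on each file in turn, feeding it the same script each time.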
.SH
Sed
.PP
.UL sed
(`stream editor')
is a version of the editor with restricted capabilities
that can nevertheless process unlimited amounts of input.
Basically
.UL sed
copies its input to its output, applying one or more
editing commands to each line of input.
.PP
As an example, suppose that we want to do the `Unix' to `UNIX'
part of the
example given above,
but without rewriting the files.
Then the command
.P1
sed \(fms/Unix/UNIX/g\(fm file1 file2 ...
.P2
applies the command
`s/Unix/UNIX/g'
to all lines from `file1', `file2', etc.,
and copies all lines to the output.
The advantage of using
.UL sed
in such a case is that it can be used
with input too large for
.UL ed
to handle.
All the output can be collected in one place,
either in a file or perhaps piped into another program.
.PP
If the editing transformation is so complicated
that
more than one editing command is needed,
commands can be supplied from a file,
or on the command line,
with a slightly more complex syntax.
To take commands from a file, for example,
.P1
sed -f cmdfile input-files...
.P2
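.PP
As a sketch, the file `cmdfile' for the earlier `Unix' and `Gcos' changes
would contain one editing command per line:
.P1
s/Unix/UNIX/g
s/Gcos/GCOS/g
.P2
The same two commands can be given on the command line by
marking each with
.UL \-e :
.P1
sed -e \(fms/Unix/UNIX/g\(fm -e \(fms/Gcos/GCOS/g\(fm file1 file2 ...
.P2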
.PP
.UL sed
has further capabilities, including conditional testing
and branching, which we cannot go into here.
.SH
Acknowledgement
.PP
I am grateful to Ted Dolotta
for his careful reading and valuable suggestions.
.SH
References
.IP [1]
Brian W. Kernighan,
.ul
A Tutorial Introduction to the UNIX Text Editor,
Bell Laboratories internal memorandum.
.IP [2]
Brian W. Kernighan,
.ul
UNIX For Beginners,
Bell Laboratories internal memorandum.
.IP [3]
Ken L. Thompson and Dennis M. Ritchie,
.ul
The UNIX Programmer's Manual,
Bell Laboratories.
.sp
.I "May 1979"