Minix1.5/commands/elvis/READ_ME

			 	   E L V I S

Elvis is a clone of vi/ex.  It boasts about 97% of the vi commands and about
92% of the ex commands.  It is generally quite fast.  It can edit files that
are larger than a single process' data space.  Elvis also has a few features
that the real vi lacks.  Several related programs are included, too.

Elvis runs under BSD/SysV UNIX, MINIX-ST, and MINIX-PC.
Elvis should be fairly easy to port to any OS that has the termcap functions.

It cannot be compiled on MINIX using the standard ACK compiler because asld
runs out of memory while linking.  If you must recompile it, you will have to
use some other C compiler and cross compiler on MS-DOS.  This problem will
be remedied in MINIX 2.0 when a new ACK compiler becomes available.


------------------------------- COPYING -----------------------------
The copyright of Elvis is controlled by the author, Steve Kirkendall.
(That's me.  Hello!)

This document describes the restrictions on how Elvis can be distributed.
The restrictions are, basically, that anybody can make & distribute copies,
and that nobody except me can say otherwise.

You can make any number of copies of Elvis, and distribute it in either
source code form or executable form, either alone or as part of some
other package, to anybody you wish, provided that the following conditions
are met:

	* If you are distributing Elvis as part of a package, then you must
	  either distribute the whole package for free, or you must be willing
	  to distribute Elvis separately for free to anybody who wants it.

	  You can charge up to $6 for disks and labor, and still say the
	  software is free.

	* If you are distributing Elvis via a BBS or a network, then you
	  may charge for connection time & subscription fees, but there must
	  not be any surcharge for downloading Elvis.

	* This document must be included with each copy that you distribute.

	* The name "Elvis" cannot be changed -- not even in advertisements.
	  In particular, I don't want anybody claiming that Elvis is the
	  real "vi" from BSD.  It's okay to call it "Elvis - a clone of vi",
	  though.

	* This agreement cannot be modified without my permission.

My address is:	Steve Kirkendall
		16820 SW Tallac Way
		Beaverton, OR 97006

My phone# is:	(503) 642-9905

Email:		kirkenda@cs.pdx.edu
		...uunet!tektronix!psueea!eecs!kirkenda


------------------------------- Termcap -----------------------------
REQUIRED NUMERIC CAPABILITIES
	:co#:	number of columns on the screen (characters per line)
	:li#:	number of lines on the screen

REQUIRED STRING CAPABILITIES
	:ce=:	clear to end-of-line
	:cl=:	home the cursor & clear the screen
	:cm=:	move the cursor to a given row/column
	:up=:	move the cursor up one line

BOOLEAN CAPABILITIES
	:am:	auto margins - wrap when a char is written to the last column?
	:pt:	physical tabs?

OPTIONAL STRING CAPABILITIES
	:al=:	insert a blank row on the screen
	:dl=:	delete a row from the screen
	:cd=:	clear to end of display
	:ei=:	end insert mode
	:ic=:	insert a blank character
	:im=:	start insert mode
	:dc=:	delete a character
	:sr=:	scroll reverse (insert a row at the top of the screen)
	:vb=:	visible bell

OPTIONAL STRINGS RECEIVED FROM THE KEYBOARD
	:kd=:	sequence sent by the <down arrow> key
	:kl=:	sequence sent by the <left arrow> key
	:kr=:	sequence sent by the <right arrow> key
	:ku=:	sequence sent by the <up arrow> key
	:PU=:	sequence sent by the <PgUp> key
	:PD=:	sequence sent by the <PgDn> key
	:HM=:	sequence sent by the <Home> key
	:EN=:	sequence sent by the <End> key

OPTIONAL CAPABILITIES THAT DESCRIBE CHARACTER ATTRIBUTES
	:so=: :se=:	start/end standout mode (We don't care about :sg#:)
	:us=: :ue=:	start/end underlined mode
	:VB=: :Vb=:	start/end boldface mode
	:as=: :ae=:	start/end alternate character set (italics)
	:ug#:		visible gap left by :us=:, :ue=:, :VB=:, or :Vb=:


------------------------------- Cflags -----------------------------
Elvis uses many preprocessor symbols to control compilation...

-DM_SYSV
	If defined, then Elvis uses SysV ioctl() calls to control the tty;
	normally it uses V7/BSD/MINIX ioctl() calls.

-DDATE=\'\"`date`\"\'
	DATE should be defined to be a string constant.  It is printed by the
	:version command as the compilation date of the program.

	It is only used in cmd1.c, and even there you may leave it undefined
	without causing an urp.

	The form shown above only works if you use "eval".  See the Makefile.

-DTMPNAME=\"/tmp/vi%04x%04x\"
	This allows you to use a different name for Elvis' temporary files.
	The default value is defined near the top of vi.h, so you only need
	to use this on the commandline if the default name is wrong.

	It should contain two "%d" or "%x" formats, which are replaced by
	the inode number and device major/minor number.

-DCUTNAME=\"/tmp/cut%04x%04x\"
	This is similar to TMPNAME, but is used to generate names for old
	temp files which are being kept around because they are refered to
	by cut buffers.

	It should contain two "%d" or "%x" formats, which are replaced by the
	Elvis' getpid() and file-descriptor numbers.

-DCRUNCH
	This option causes some large & often-used macros to be replaced by
	equivelent functions.  It reduces the size of the ".text" segment by
	about 4K, and you don't sacrifice any features -- just a little speed.

-DSET_NOCHARATTR
	Permanently disables the charattr option.  This reduces the size of
	your ".text" segment by about 850 bytes.

-DNO_RECYCLE
	Normally, Elvis will recycle space from the tmp file which contains
	totally obsolete text.  This flag disables this recycling.  Without
	recycling, the ".text" segment is about 1K smaller that it would
	otherwise be, but the tmp file grows much faster.  If you have a lot
	of free space on your harddisk, but Elvis is too bulky to run with
	recycling, then try it without recycling.

-DDEBUG
	This adds the ":debug" and ":validate" commands, and also adds many
	internal consistency checks.  It increases the size of the ".text"
	segment by about 5K.


------------------------------- Mods -----------------------------
A few ideas for modifications...

MODE INDICATORS
	Elvis always reads keystrokes via the getkey() function.  This function
	is called with an argument which describes the context in which the
	character will be processed:

		WHEN_EX		- called from the vgets() function to read
				  a single line of text.  Either EX command
				  mode, EX text entry mode, or VI while reading
				  a search string.
		WHEN_VICMD	- VI mode, getting a command character.
		WHEN_VIINP	- VI's input mode.
		WHEN_VIREP	- VI's replace mode (the R command).
		0		- misc times, e.g. "HIT A KEY TO CONTINUE"

	So, the getkey() function would be a good place to add some kind of
	mode indicator.  Like, you could change the shape of the cursor for
	input mode vs. VI command mode.

ARROW KEYS IN INPUT MODE
	The arrow keys are not normally mapped during input mode.  It might
	be fun, though, to map them to ESC + [hjkl] + a.  This way, if you
	hit an arrow key while in input mode, elvis would take you out of input
	mode momentarily, move the cursor, and drop you back into input mode.

	Neat, huh?

	Something similar could be done with replace mode.

WRAP LONG LINES (INSTEAD OF SCROLLING SIDEWAYS)
	This would mostly require changes to redraw(), mark2phys(), and
	drawtext().  All of these are in the file "redraw.c".

ADD MORE SUPPORT FOR NON-ASCII CHARACTER SETS
	Elvis displays 8-bit character sets just fine, but is a bit weak in
	the input and search departments.

	For input, something similar to :map would be nice.  Actually, :abbr
	is a little closer.  How about ":digraph" to map a specified pair of
	ASCII characters into a single non-ASCII character?

REWRITE THE REGULAR EXPRESSION PARSER AND THE SEARCHING CODE
	The current doesn't allow you to search for non-ASCI characters.
	It could probably be made smaller & faster.

Suggestions are welcome.


------------------------------- internal -----------------------------
INTERNAL TEXT REPRESENTATION
	When elvis starts up, the file is copied into a temporary file.  Small
	amounts of extra space are inserted into the temporary file to insure
	that no text lines cross block boundaries; this speeds up processing.
	The "extra space" is filled with NUL charcters; the input file must
	not contain any NULs, to avoid confusion.

	The first block of the temporary file is an array of shorts which
	describe the order of the blocks; i.e. header[1] is the block number
	of the first block, and so on.  This limits the temporary file to
	512 active blocks, so the largest file you can edit is about 400K
	bytes long.  I hope that's enough!

	When blocks are altered, they are rewritten to a *different* block
	in the file, and the in-core version of the header block is updated
	accordingly.  The in-core header block will be copied to the temp
	file immediately before the next change... or, to undo this change,
	swap the old header (from the temp file) with the new (in-core)
	header.

	Elvis maintains another in-core array which contains the line-number
	of the last line in every block.  This allows you to go directly to a
	line, given its line number.

IMPLEMENTATION OF EDITING
	There are three basic operations which affect text:

		* delete text	- delete(from, to)
		* add text	- add(at, text)
		* yank text	- cut(from, to)

	To yank text, all text between two text positions is copied into
	a cut buffer.  The original text is not changed.  To copy the text
	into a cut buffer, you need only remember which physical blocks that
	contain the cut text, the offset into the first block of the start of
	the cut, the offset into the last block of the end of the cut, and
	what kind of cut it was.  (Cuts may be either character cuts or line
	cuts; the kind of a cut affects the way it is later "put".)  This is
	implemented in the function cut().

	To delete text, you must modify the first and last blocks, and remove
	any reference to the intervening blocks in the header's list.  The
	text to be deleted is specified by two marks.  This is implemented in
	the function delete();

	To add text, you must specify the text to insert (as a NUL-terminated
	string) and the place to insert it (as a mark).  The block into which
	the text is to be inserted may need to be split into as many as four
	blocks, with new intervening blocks needed as well... or it could be
	as simple as modifying the block.  This is implemented in the function
	add().

	Other interesting functions are paste() (to copy text from a cut buffer
	into the file), modify() (for an efficient way to implement a combined
	delete/add sequence), and input() (to get text from the user & insert
	it into the file).

	When text is modified, an internal file-revision counter, called
	"changes", is incremented.  This counter is used to detect when
	certain caches are out of date.  (The "changes" counter is also
	incremented when we switch to a different file, and also in one
	or two similar situations -- all related to invalidating caches.)

MARKS AND THE CURSOR
	Marks are places within the text.  They are represented internally
	as a long variable which is split into two bitfields: a line number
	and a character index.  Line numbers start with 1, and character
	indexes start with 0.

	Since line numbers start with 1, it is impossible for a set mark to
	have a value of 0L.  0L is therefore used to represent unset marks.

	When you do the "delete text" change, any marks that were part of
	the deleted text are unset, and any marks that were set to points
	after it are adjusted.  Similarly, marks are adjusted after new text
	is inserted.

	The cursor is represented as a mark.

EX COMMAND INTERPRETATION
	EX commands are parsed, and the command name is looked up in an array
	of structures which also contain a pointer to the function that
	implements the command, and a description of the arguments that the
	command can take.  If the command is recognized and its arguments
	are legal, then the function is called.

	Each function performs its task; this may cause the cursor to be moved
	to a different line, or whatever.

SCREEN CONTROL
	The screen is updated via a package which looks like the "curses"
	library, but which is actually implemented in a simpler, faster way.
	Most curses operations are implemented as macros which copy characters
	into a large I/O buffer, which is then written with a single large
	write() call as part of the refresh() operation.

	The functions which modify text remember where text has been modified;
	the screen redrawing function needs this information to help it reduce
	the amount of text that is redrawn each time.


------------------------------- regexp -----------------------------
Regular Expressions

Syntax

The code for handling regular expressions is derived from Henry Spencer's
regexp package.  However, I have modified the syntax to resemble that of
the real vi.

ELVIS' regexp package treats the following one- or two-character strings
(called meta-characters) in special ways:

	\( \)	Used to control grouping
	^	Matches the beginning of a line
	$	Matches the end of a line
	\<	Matches the beginning of a word
	\>	Matches the end of a word
	.	Matches any single character
	[ ]	Matches any single character inside the brackets
	*	The preceding may be repeated 0 or more times
	+	The preceding may be repeated 1 or more times
	?	The preceding is optional
	\|	Separates two alternatives

Anything else is treated as a normal character which must match exactly.
The special strings may all be preceded by a backslash to force them to
be treated normally.

For example, "\(for\|back\)ward" will find "forward" or "backward", and
"\<text\>" will find "text" but not "context".


Options

ELVIS has two options which affect the way regular expressions are used.
These options may be examined or set via the :set command.

The first option is called "[no]magic".  This is a boolean option, and it is
"magic" (TRUE) by default.  While in magic mode, all of the meta-characters
behave as described above.  In nomagic mode, only ^ and $ retain their
special meaning.

The second option is called "[no]ignorecase".  This is a boolean option, and
it is "noignorecase" (FALSE) by default.  While in ignorecase mode, the
searching mechanism will not distinguish between an uppercase letter and its
lowercase form.  In noignorecase mode, uppercase and lowercase are treated
as being different.

Also, the "[no]wrapscan" option affects searches.

Substitutions

The :s command has at least two arguments: a regular expression, and a
substitution string.  The text that matched the regular expression is
replaced by text which is derived from the substitution string.

Most characters in the substitution string are copied into the text literally
but a few have special meaning:

	&	Causes a copy of the original text to be inserted
	\1	Inserts a copy of that portion of the original text which
		  matched the first set of \( \) parentheses.
	\2 - \9	Does the same for the second (etc.) pair of \( \).

These may be preceded by a backslash to force them to be treated normally.