[TUHS] Data structures in Unix editors
Mary Ann Horton
mah at mhorton.net
Tue Apr 6 09:23:00 AEST 2021
This is actually what ed, ex, and vi do as well, albeit not with a
The buffer is an array of integer offsets into a file. Line n in the
buffer is pointed at by buffer[n], whose contents are an offset into the
temp file. That offset points to a null terminated line of buffer text.
When a new line of text is created (or changed) the new line is appended
to the temp file and the corresponding offset goes into the buffer array.
One big reason for this was that ed was originally written for a 16 bit
machine, and buffers didn't fit in memory.
This works very well to insert or delete lines, you only have to copy
the line references in the buffer array. It's also convenient for undo,
since putting things back just means restoring the original array. My
enhanced ed (hed at Wisconsin, Portable ed at Bell Labs) had a second
array for undo, and a full copy of the array was made before a change.
Bill Joy wrote a fancier implementation in vi, where only lines added or
deleted were saved, command-specific. Undo knew which command it had to
undo and special-cased each one.
The big disadvantage to this structure is that ends of lines are not
just newline characters. You can't backspace over a newline, or change
newlines to something else.
On 4/1/21 5:56 AM, Tony Finch wrote:
> David C. Brock <dbrock at computerhistory.org> wrote:
>> I’d like to read similar discussions of the data structures for ed, em,
>> ex/vi. If anyone has suggestions of references, they would be very
> A curious one is nvi, which uses the Berkeley DB RECNO interface to access
> a text file as an array of lines (RECNO = record number).
More information about the TUHS