4.4BSD/usr/src/contrib/bib/man/invert.1

Compare this file to the similar file:
Show the results in this format:

.\"	"@(#)invert.1	4.5	5/27/93";
.TH INVERT 1 "2 September 1983"
.UC 4
.SH NAME
invert, lookup \- create and access an inverted index
.SH SYNOPSIS
.B invert
[option ... ] file ...
.ns
.PP
.B lookup
[option ... ]
.SH DESCRIPTION
.I Invert
creates an inverted index to one or more files.
.I Lookup
retrieves records from files for which an inverted index exists.
The inverted indices are intended for use with
.IR bib (1).
.PP
.I Invert
creates one inverted index to all of its input files.
The index must be stored in the current directory and may not be moved.
Input files may be absolute path names or paths relative to the current
directory.
Each input file is viewed as a set of records;
each record consists of non-blank lines;
records are separated by blank lines.
.PP
.I Lookup
retrieves records based on its input
.I (stdin).
Each line of input is a retrieval request.
All records that contain all of the keywords in the retrieval request
are sent to
.I stdout.
If there are no matching references, ``No references found.'' is sent to
.I stdout.
.I \ Lookup
first searches in the user's private index (default INDEX)
and then, if no references are found,
in the system index (/usr/dict/papers/INDEX).
The system index was produced using
.I invert
with the default options;
in general, the user is advised to use the defaults.
.PP
Keywords are a sequence of non-white space characters
with non-alphanumeric characters removed.
Keywords must be at least two characters and are truncated
(default length is 6).
Some common words are ignored.
Some lines of input are ignored for the purpose of collecting keywords.
.PP
The following options are available for
.I invert:
.IP "\-c \fIfile\fP" 8m
.ns
.IP \-c\fIfile\fP
File contains common words, one per line.
Common words are not used as keys.
(Default /usr/new/lib/bmac/common.)
.IP "\-k \fIi\fP"
.ns
.IP \-k\fIi\fP
Maximum number of keys kept per record. (Default 100)
.IP "\-l \fIi\fP"
.ns
.IP \-l\fIi\fP
Maximum length of keys. (Default 6)
.IP "\-p \fIfile\fP"
.ns
.IP \-p\fIfile\fP
File is the name of the private index file (output of
.IR invert ).
(Default is INDEX.)
The index must be stored in the current directory.
(Be careful of the second form.
The shell will not know to expand the file name.
E.g. \-p~/index won't work; use \-p\ ~/index.)
.IP \-s
Silent.
Suppress statistics.
.IP -%\fIstr\fP
Ignore lines that begin with %x
where x is in
.I str.
(Default is CNOPVX. See
.IR bib (1)
for explanation of field names.)
.PP
.I Lookup
has only the options
.BR c ,
.BR l ,
and
.B  p
with the same meanings as
.I bib.
In particular, the
.B p
option can be followed by a list of comma separated index files.
These are searched in order from left to right until at least one reference
is found.
.SH FILES
INDEX                    inverted index
.br
/usr/tmp/invertxxxxxx    scratch file for invert
.br
/usr/new/lib/bmac/common     default list of common words
.br
/usr/dict/papers/INDEX   default system index
.SH SEE ALSO
\fIA UNIX Bibliographic Database Facility\fP,
Timothy A. Budd and Gary M. Levin,
University of Arizona Technical Report 82-1, 1982.
.br
bib(1)
.SH DIAGNOSTICS
Messages indicating trouble accessing files are sent on
.I  stderr.
There is an explicit message on
.I stdout
from
.I lookup
if no references are found.
.LP
.I Invert
produces a one line message of the form,
\*(oq%D\ documents\ \ \ %D distinct\ keys\ \ %D\ key\ occurrences\*(cq.
This can be suppressed with the \-s option.
.LP
The message \*(oqlocate: first key (%s) matched too many refs\*(cq
indicates that the first key matched more references than could be stored
in memory.
The simple solution is to use a less frequently occurring key as the first
key in the citation.
.SH BUGS
No attempt is made to check the compatibility between an index
and the files indexed.
The user must create a new index whenever
the files that are indexed are modified.
.LP
Attempting to invert a file containing unprintable characters can
cause chaos.