
.th SPEAK I 8/15/73
.if t .ds A \o"a\(ga"
.if n .ds A a`
.if t .ds v \|\(bv
.sh NAME
speak \*- word to voice translator
.sh SYNOPSIS
.bd speak
[
.bd \*-epsv
] [ vocabulary [ output ] ]
.sh DESCRIPTION
.it Speak
turns a stream of words
into utterances and outputs them to a voice synthesizer,
or to a specified output file.
It has facilities for maintaining a vocabulary.
It receives the following kinds of line from the standard input:
.s3
.lp +5 3
\*-	working lines: text of words separated by blanks
.lp +5 3
\*-	phonetic lines: strings of phonemes for one word preceded
and separated by commas.
The phonemes may be followed by comma-percent then a `replacement
part' \*- an ASCII string with no spaces.
The phonetic code is given in vs(VII).
.lp +5 3
\*-	empty lines
.lp +5 3
\*-	command lines: beginning with
.bd !.
The following command lines
are recognized:
.s3
.lp +15 10
\fB!r\fR file	replace coded vocabulary from file
.lp +15 10
\fB!w\fR file	write coded vocabulary on file
.lp +15 10
\fB!p\fR	print parsing for working word
.lp +15 10
\fB!l\fR	list vocabulary on standard output with phonetics
.lp +15 10
\fB!c\fR word	copy phonetics from working word to
specified word
.lp +15 10
\fB!d\fR	print phonetics for working word
.s3
.i0
Each working line replaces its predecessor.
Its first word is the `working word'.
Each phonetic line replaces the phonetics stored for the
working word.
In particular, a phonetic line of comma only deletes the
entry for the working word.
Each working line, phonetic line or empty line
causes the working line to be uttered.
The process terminates at the end of input.
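.s3
As an illustration only, the kinds of input line might be told apart
as in the following Python sketch.
The function names and the phoneme names p1 and p2 are hypothetical,
the real phonetic code is not reproduced here,
and treating a line of blanks as empty is an assumption.
.nf
```python
def classify_line(line):
    """Classify one input line as empty, command, phonetic or working."""
    text = line.strip()
    if not text:
        return "empty"
    if text.startswith("!"):
        return "command"      # for example: !w file
    if text.startswith(","):
        return "phonetic"     # phonemes are preceded by a comma
    return "working"          # words separated by blanks


def parse_phonetic(line):
    """Split a phonetic line into (phonemes, replacement part or None)."""
    head, sep, replacement = line.strip().partition(",%")
    phonemes = [p for p in head.split(",") if p]
    return phonemes, (replacement if sep else None)


def working_word(line):
    """The working word is the first word of a working line."""
    return line.split()[0]
```
.fi
A phonetic line of comma only parses to an empty phoneme list,
which corresponds to deleting the entry for the working word.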
.s3
Unknown words are pronounced by rules, and failing that,
are spelled.
Spelling is done by taking each character of
the word, prefixing it with *, and looking it up.
Unspellable words burp.
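.s3
The spelling fallback described above can be sketched as follows.
This is a minimal Python illustration, not the program's own code:
the vocabulary is assumed to be a dictionary from stored words to
phonetic strings, and the name spell is hypothetical.
.nf
```python
def spell(word, vocabulary):
    """Look up each character of word, prefixed with *, in vocabulary.

    Returns the list of phonetic strings, or None when some character
    has no entry (the word is then unspellable).
    """
    phonetics = []
    for ch in word:
        entry = vocabulary.get("*" + ch)
        if entry is None:
            return None
        phonetics.append(entry)
    return phonetics
```
.fi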
.s3
.it Speak
is initialized with a coded vocabulary stored in file
/usr/lib/speak.m.
The vocabulary option substitutes a different file for
/usr/lib/speak.m.
.s3
A set of single letter options may
appear in any order preceded by
.bd \*-.
Their meanings are:
.s3
.lp +8 4
\fB\*-e\fR	suppress English steps (4\*-8) below
.lp +8 4
\fB\*-p\fR	suppress pronunciation by rule
.lp +8 4
\fB\*-s\fR	suppress spelling
.lp +8 4
\fB\*-v\fR	suppress voice output
.s3
.i0
.ne4
The steps of pronunciation by rule are:
.s3
.lp +5 5
(1)	If there were no lower case letters in the working line,
fold all upper case letters to lower.
.lp +5 5
(2)	Fold an initial cap to lower case, and try again.
.lp +5 5
(3)	If the word has only one letter, or has no lower case vowels, quit.
.lp +5 5
(4)	If there is a final 
.it s,
strip it.
.lp +5 5
(5)	Replace final \*-ie by \*-y.
.lp +5 5
(6)	If any changes have been made, try whole word again.
.lp +5 5
(7)	Locate probable long vowels and capitalize them.
Mark probable silent \fIe\fR's.
.lp +5 5
(8)	Put back the 
.it s
stripped in (4), if any.
.lp +5 5
(9)	Place # before and after word.
.lp +5 5
(10)	Prefix word with 
.it %,
and look up longest initial match
in the stored table of words; if none, quit.
.lp +5 5
(11)	Use phonemes from the stored phonetic string as
pronunciation, and replace the matched stuff by the
replacement part of the phonetic string.
.lp +5 5
(12)	If anything remains, go to (10).
.s3
.i0
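Steps (9) through (12) amount to a longest-match rewriting loop.
The Python sketch below is one possible reading of them, under
stated assumptions: the stored table is taken to be a dictionary whose
keys begin with %, each value is a pair of a phoneme list and a
replacement part, the phoneme names in any table are placeholders,
and the max_rounds guard against endless rewriting is hypothetical.
.nf
```python
def pronounce_by_table(word, table, max_rounds=100):
    """Return a list of phonemes for word, or None if the rules give up."""
    rest = "#" + word + "#"                    # step (9)
    phonemes = []
    rounds = 0
    while rest:
        rounds += 1
        if rounds > max_rounds:                # assumed safety guard
            return None
        key = "%" + rest                       # step (10)
        match = None
        for n in range(len(key), 1, -1):       # longest initial match first
            if key[:n] in table:
                match = key[:n]
                break
        if match is None:
            return None                        # no match: quit
        phons, replacement = table[match]      # step (11)
        phonemes.extend(phons)
        rest = replacement + key[len(match):]
    return phonemes                            # nothing remains, step (12)
```
.fi
A caller would fall back to spelling when the function returns None,
unless spelling has been suppressed.
.s3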
Long vowels are located this way in step (7):
.s3
.lp +5 5
(1)	A \fIu\fR appearing in the context
[^aeiou]u[^aeiouwxy][aieouy] is assumed long.
(The notation is that of regular expressions, \*A la ed(I).)
.ft I
(pustUlous)
.ft R
.lp +5 5
(2)	One of [aeo] appearing in the context 
[aeo][^aehiouwxy][ie][aou]
or in the context [aeo][^aehiouwxy]ien is assumed long.
The digram \fIth\fR behaves as a single letter in this test.
.ft I
(rAdium, facEtious, quOtient, carpAthian)
.ft R
.lp +5 5
(3)	If the first vowel in the word is \fIi\fR followed by one of
\fIaou,\fR it is assumed long.
.ft I
(Iodine, dIameter, trIumph)
.ft R
.lp +5 5
(4)	If the only vowel in the word is final \fIe,\fR the vowel is
assumed long.
.ft I
(bE, shE)
.ft R
.lp +5 5
(5)	If the only vowels in the word appear in the pattern
[aeiouy][^aeiouwxy]S, where S is one of the suffixes
.br
.dt
	\*-al	\*-le	\*-re	\*-y
.br
.lp +5 0
then the first vowel is assumed long.
.ft I
(glObal, tAble, lUcre, lAdy)
.ft R
.lp +5 5
(6)	If no suffix was found in (5),
as many of these suffixes as possible are isolated from
right to left.
Stripping stops once \fIe\fR has been stripped,
and \fIe\fR is never stripped before a suffix beginning with \fIe\fR.
Each suffix is marked by inserting \*v just before the first letter, or
just after \fIe\fR in those suffixes that begin with \fIe\fR.
.br
.dt
	\*-able	\*-ably	\*-e	\*-ed	\*-en
.br
	\*-er	\*-ery	\*-est	\*-ful	\*-ly
.br
	\*-ing	\*-less	\*-ment	\*-ness	\*-or
.lp +5 0
.ft I
(care\*vful\*vly, maj\*vor, fine\*vry, state\*v, caree\*vr)
.ft R
.lp +5 5
(7)	If the word, exclusive of suffixes, ends in \fIi\fR or \fIy\fR,
and contains no earlier vowel, then \fIi\fR or \fIy\fR is assumed long.
.ft I
(pY \fR(from pie),\fI crY\*ving, lIe\*vd)
.ft R
.lp +5 5
(8)	If the first suffix begins with one of [aeio],
then the vowel [aeiouy] in an immediately
preceding pattern [^aeo][aeiouy][^aeiouwxy]
is assumed long.
The digram \fIth\fR behaves as a single letter in this test.
.ft I
(cAre\*vful\*vly, bAthe\*vd, mAj\*vor, pOt\*vable, port\*vable)
.ft R
.lp +5 5
(9)	In these exceptional cases no vowel is assumed long in the
preceding step:
.lp +10 5
(i)	before \fIg\fR, if there are any earlier vowels
.ft I
(postage\*v, stAge\*v, college\*v)
.ft R
.lp +10 5
(ii)	\fIe\fR is not long before \fIl\fR
.ft I
(travele\*vd)
.ft R
.lp +5 5
(10)	If the first suffix begins with one of [aeio],
and the word exclusive of suffixes ends in 
[aeiouyAEIOUY]th, then
the digram \fIth\fR is capitalized.
.ft I
(breaTH\*ving, blITHe\*vly)
.ft R
.lp +5 5
(11)	An attempt is made to
recognize silent \fIe\fR in the middle of compound words.
Such an \fIe\fR is marked by a following \*v, and preceding vowels, other than
\fIe\fR, are assumed long as in step (8).
Silent \fIe\fR is marked in the context 
[bdgmnprst][bdgpt]le[^aeioruy\*v]S,
where S is any
string that contains [aeiouy] but does not contain \*v or the end of the word.
Silent \fIe\fR is also marked in the context
[^aeiu][aiou][^aeiouwxy]e[^aeinoruy]S.
.ft I
(simple\*vton, fAce\*vguard, cAve\*vman, cavernous)
.ft R
.s0
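.i0
.s3
Two parts of the long vowel procedure lend themselves to a short sketch:
the context tests of rules (1) to (3) and the suffix marking of rule (6).
The Python below is an interpretation, not the program's own code.
The function names are hypothetical, the character | stands in for the
suffix marker, the longest suffix is assumed to be tried first,
a stem of at least one letter is assumed to remain,
and the precondition on rule (5) is ignored.
.nf
```python
import re

MARK = "|"   # stands in for the suffix marker printed in the examples
SUFFIXES = ["able", "ably", "e", "ed", "en", "er", "ery", "est",
            "ful", "ly", "ing", "less", "ment", "ness", "or"]


def context_long_vowels(word):
    """Capitalize vowels judged long by rules (1) to (3)."""
    word = word.lower()
    long_at = set()
    # Rule (1): u in the context [^aeiou]u[^aeiouwxy][aeiouy].
    for m in re.finditer("(?<=[^aeiou])u(?=[^aeiouwxy][aeiouy])", word):
        long_at.add(m.start())
    # Rule (2): one of [aeo] before a consonant (th counts as one letter)
    # followed by [ie][aou] or by ien.
    for m in re.finditer("[aeo](?=(?:th|[^aehiouwxy])(?:[ie][aou]|ien))", word):
        long_at.add(m.start())
    # Rule (3): the first vowel is i followed by one of a, o, u.
    m = re.match("[^aeiou]*i(?=[aou])", word)
    if m:
        long_at.add(m.end() - 1)
    return "".join(c.upper() if i in long_at else c
                   for i, c in enumerate(word))


def mark_suffixes(word):
    """Isolate suffixes from right to left and insert MARK, as in rule (6)."""
    stem_end = len(word)
    cuts = set()          # positions at which MARK is inserted
    last = ""             # the suffix stripped most recently
    while True:
        found = [s for s in SUFFIXES
                 if word[:stem_end].endswith(s) and len(s) < stem_end]
        if last.startswith("e"):
            # e is not stripped before a suffix beginning with e
            found = [s for s in found if s != "e"]
        if not found:
            break
        suffix = max(found, key=len)
        start = stem_end - len(suffix)
        # mark before the suffix, or just after its leading e
        cuts.add(start + 1 if suffix.startswith("e") else start)
        stem_end = start
        last = suffix
        if suffix == "e":
            break         # stripping stops once e has been stripped
    for pos in sorted(cuts, reverse=True):
        word = word[:pos] + MARK + word[pos:]
    return word
```
.fi
With these assumptions mark_suffixes reproduces the marking shown
in the examples for rule (6), for instance care|ful|ly and caree|r.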
.sh FILES
/usr/lib/speak.m
.sh "SEE ALSO"
vs(VII), vs(IV)
.sh DIAGNOSTICS
`?' for an unknown command beginning with
.bd !,
or for an
unreadable or unwritable vocabulary file.
.sh BUGS
Vocabulary overflow is unchecked.
Excessively long words cause dumps.
Space is not reclaimed from deleted entries.