[TUHS] Was the compressed dictionary used?
Warner Losh
imp at bsdimp.com
Fri Jan 3 01:12:45 AEST 2025
On Thu, Jan 2, 2025, 7:51 AM Douglas McIlroy <douglas.mcilroy at dartmouth.edu>
wrote:
> I am not aware that the compressed dictionary was used for anything.
> Steve Johnson's first shell-script spelling-checker did make a pass
> over a dictionary, but not Webster's second, which would have caused
> lots of false negatives because it contains so many exotic small words
> that could result from typos.
Where did the Webster's Second file come from? Did the labs give the public
domain paper dictionary to the equivalent of a typing pool and have them
enter it? Did it come from elsewhere? Or something else? How was it
checked for accuracy?
Warner
> My production spell aggressively stripped
> affixes and used hashing and other coding tricks to keep its
> "dictionary" in the limited memory of a PDP-11. (The whole story is
> told in https://www.cs.dartmouth.edu/~doug/spell.pdf and insightfully
> described by Jon Bentley in
> https://dl.acm.org/doi/pdf/10.1145/3532.315102.) When larger memory
> became available, these heroics were replaced by basic common-prefix
> coding patterned after Morris and Thompson, just as Arnold surmised.
>
> On Thu, Jan 2, 2025 at 7:41 AM <arnold at skeeve.com> wrote:
> >
> > Hi.
> >
> > The paper on compressing the dictionary was interesting. In the day
> > of 20 meg disks, compressing a ~ 2.5 meg file down to ~ .5 meg is
> > a big savings.
> >
> > Was the compressed dictionary put into use? I could imagine that
> > spell(1) at least would have needed some library routines to return
> > a stream of words from it.
> >
> > Just wondering. Thanks,
> >
> > Arnold
>
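
For readers unfamiliar with the "basic common-prefix coding" mentioned in
the quoted text, here is a minimal sketch of the general idea (often called
front coding). It is not the Morris and Thompson format or the code spell
actually used; the function names and the (length, suffix) pair
representation are assumptions made for illustration. Each word in a sorted
list is stored as the number of leading characters it shares with the
previous word, plus the remaining suffix. The decoder is written as a
generator, which is roughly the "stream of words" routine Arnold asks about.

```python
def front_encode(words):
    """Encode a sorted word list as (shared_prefix_length, suffix) pairs.

    Hypothetical helper for illustration only; a real on-disk format
    would pack the count and suffix into bytes.
    """
    encoded = []
    prev = ""
    for word in words:
        # Count the leading characters shared with the previous word.
        n = 0
        limit = min(len(prev), len(word))
        while n < limit and prev[n] == word[n]:
            n += 1
        encoded.append((n, word[n:]))
        prev = word
    return encoded


def front_decode(pairs):
    """Yield the original words, one at a time, from front-coded pairs."""
    prev = ""
    for n, suffix in pairs:
        # Rebuild the word from the previous word's prefix plus the suffix.
        word = prev[:n] + suffix
        yield word
        prev = word


if __name__ == "__main__":
    words = ["spell", "spelled", "speller", "spelling", "spend"]
    pairs = front_encode(words)
    print(pairs)
    print(list(front_decode(pairs)))
```

The savings come from sorted dictionaries having long runs of words with
shared prefixes, so most entries shrink to a small count and a short suffix.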