[TUHS] [groff] The hyphenation algorithm produces wrong results

Bakul Shah bakul at bitblocks.com
Mon Mar 5 08:36:58 AEST 2018



> On Mar 4, 2018, at 1:50 PM, Ralph Corderoy <ralph at inputplus.co.uk> wrote:
> 
> Hi Doug,
> 
> Bakul wrote:
>> I remembered reading about Knuth's line-breaking algorithm in Software
>> Practice & Experience in early eighties and being quite impressed with
>> it.  So maybe that clear description of the algorithm has something
>> to do with it?  Ah, here it is:
>> 
>> “Breaking Paragraphs into Lines” by Donald E. Knuth & Michael F. Plass,
>> SP&E, Volume 11, Issue 11, Nov. 1981
> 
> That's more to do with TeX looking at the whole paragraph when deciding
> where to split lines.  Hyphenation is part of that because a word might
> help out by being the ideal thing to split and have the rest of the
> lines sit easily in their length, but TeX's hyphenation algorithm is
> distinct again.
> 
> Ted Harding gives some background on the groff list back in 2001,
> https://lists.gnu.org/archive/html/groff/2001-03/msg00026.html
> but I expect groff used TeX's algorithm because it was published, could
> handle multiple languages, e.g. hyphen.us, and the data files were
> available to contort into what groff ended up using in its simplified
> TeX algorithm.
> 
>    $ cd /usr/share/groff/1.22.3/tmac
>    $ ls hyphen*
>    hyphen.den  hyphenex.cs hyphenex.us hyphen.sv   hyphen.us
>    hyphen.cs   hyphen.det  hyphenex.de hyphen.fr
>    $
> 
> They've comments explaining their content.
> 
> Werner Lemberg on the groff list probably knows for certain as he had to
> fathom all this out before becoming groff's excellent maintainer for
> many years.
> 
> -- 
> Cheers, Ralph.
> https://plus.google.com/+RalphCorderoy

Thanks, Ralph and Toby. “Because it was clearly described and published”
was the point I was trying to make, and I should’ve stopped there :). The
SP&E article had made a strong impression on me, and that is what I
instantly thought of.


