[TUHS] Paragraphs formatted differently depending on previous ones
G. Branden Robinson <g.branden.robinson at gmail.com>
Sat May  3 22:58:41 AEST 2025
[looping the groff list back in; Doug emailed TUHS instead]

At 2025-05-03T08:14:18-0400, Douglas McIlroy wrote:
> > The relevant function fits on one screen, if your terminal window is
> > at least 36 lines high.  :)  (Much of it is given over to comments.)
> 
> >   https://git.savannah.gnu.org/cgit/groff.git/tree/src/roff/troff/env.cpp?id=d96a9c58bbe296b065fa250e3ea1e1a410cdde81#n2185
> 
> Actually there's still another function, spread_space that contains
> the inner R-L and L-R loops.
Yes.  `distribute_space()` is in "env.cpp" (environment handling) and
operates on the output line.  `spread_space()` is in "node.cpp" and is
what alters the width of `word_space_node` (and derived
`unbreakable_space_node`) objects on the line.  Whereas in troff mode,
often every adjustable space on an underset line experiences adjustment,
in nroff mode the converse is frequently true: only some of the spaces
on the line are widened, as shown below.

Some of this stuff will be more visible for debugging purposes with the
new `pline` request and improved `pm` request in the forthcoming groff
1.24.0 release.

Here's an altered version of the adjustment demonstrator I cooked up for
Alex.  It uses a shorter line length and fewer repetitions of "alex",
but still illustrates alternating adjustment "parity", as I term it.
$ { echo .ll 15n; echo .di dd; for n in $(seq 7); do echo alex; done; \
  printf '.pl \\n(nlu\n'; echo .di; echo .pm dd; echo .dd; } \
  | nroff 2>/dev/null | cat -s
alex alex  alex
alex  alex alex
alex
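
For readers who want to play with the idea away from the C++ source, the
alternating "parity" seen above can be imitated in a few lines of
Python.  This is only a sketch under stated assumptions, not a
transcription of `distribute_space()` or `spread_space()`: the names
`adjust_line` and `adjust_paragraph` are made up for illustration, every
character cell is taken to be 1n wide, and the direction is assumed to
start from the right and flip after each adjusted line.

```python
def adjust_line(words, line_length, from_right):
    """Pad the gaps between words until the line is line_length cells wide."""
    gaps = [1] * (len(words) - 1)          # one cell per inter-word space
    shortfall = line_length - (sum(len(w) for w in words) + len(gaps))
    if not gaps or shortfall <= 0:
        return " ".join(words)             # nothing to adjust
    # Walk the gaps right-to-left or left-to-right, one extra cell at a time.
    order = range(len(gaps) - 1, -1, -1) if from_right else range(len(gaps))
    while shortfall > 0:
        for i in order:
            if shortfall == 0:
                break
            gaps[i] += 1
            shortfall -= 1
    out = words[0]
    for gap, word in zip(gaps, words[1:]):
        out += " " * gap + word
    return out


def adjust_paragraph(lines, line_length):
    """Adjust every line but the last, flipping direction each time."""
    from_right = True                      # assumed starting direction
    result = []
    for words in lines[:-1]:
        result.append(adjust_line(words, line_length, from_right))
        from_right = not from_right        # alternate the "parity"
    result.append(" ".join(lines[-1]))     # the final line is left unadjusted
    return result


paragraph = [["alex"] * 3, ["alex"] * 3, ["alex"]]
for line in adjust_paragraph(paragraph, 15):
    print(line)
# prints:
# alex alex  alex
# alex  alex alex
# alex
```

With a 15n line and three 4-character words, each adjusted line is one
cell short, so exactly one space per line is doubled, and the doubled
space changes sides from one line to the next.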
If we discard normal output with the `-z` option, reënable output to
standard error, and send that to jq(1) for formatting, we get more
information, which I'll relegate to a footnote because it's lengthy.[1]
It also serves to illustrate how we can dump diversions, and the
intriguing properties thereof in GNU troff.
$ { echo .ll 15n; echo .di dd; for n in $(seq 7); do echo alex; done; \
  printf '.pl \\n(nlu\n'; echo .di; echo .pm dd; } | nroff -z 2>&1 | jq
> The whole thing has become astonishingly complicated compared to what
> I remember as a few (carefully crafted) lines of code in the early
> roff.
The first computer I ever touched, and programmed, had 16 KB of RAM.
Necessity is a mother in more than one sense.  ;-)

I'm doing what I can with the GNU troff code base to make it more
intelligible.  Among the windmills I'm tilting at are improved type
annotations (like using `bool` for Booleans instead of integers for Yet
Another Purpose), explicitly annotated null pointers, and above all,
more meaningful variable and function names.  Kernighan and Plauger,
then Pike, beat this drum repeatedly in their books on programming
style, but we're still not rid of hackers who think naming a variable
`bflag` is a good idea.
> I admire your intrepid forays into the groff woods, of which this part
> must be among the less murky.
Thank you!  The reformed handling of device extension requests/escapes
so that they could encode Unicode characters, and their conversion into
nodes, was almost more murk than I could stand.  I think it might have
helped to have some of the new introspection features 12-18 months ago,
but we have them now.  There's always more to learn, and to document.

For those who hadn't noticed, I'll put the relevant part of the "NEWS"
file in another footnote.[2]

I'm on the verge of adding another, `pftr`, to dump the dictionary of
font translations (remappings), because mdoc is proving to be a
headache this week with Savannah #66126.
Regards,
Branden
[1] Observe how some of the `word_space_node`s have an `hunits` value of
    `24`, and others `48`, while the `width` in each `width_list` stays
    at `24`.  The nodes with `hunits` of `48` are the "adjusted" spaces.
$ { echo .ll 15n; echo .di dd; for n in $(seq 7); do echo alex; done; \
  printf '.pl \\n(nlu\n'; echo .di; echo .pm dd; } | nroff -z 2>&1 | jq
{
  "name": "dd",
  "file name": "<standard input>",
  "starting line number": 2,
  "length": 35,
  "contents": "\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\n\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\n",
  "node list": [
    {
      "type": "line_start_node",
      "diversion level": 0,
      "is_special_node": false
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "a"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "l"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "e"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "x"
    },
    {
      "type": "word_space_node",
      "diversion level": 0,
      "is_special_node": false,
      "hunits": 48,
      "undiscardable": true,
      "is hyphenless breakpoint": false,
      "terminal_color": "default",
      "width_list": [
        {
          "width": 24,
          "sentence_width": 24
        }
      ],
      "unformat": false
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "a"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "l"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "e"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "x"
    },
    {
      "type": "word_space_node",
      "diversion level": 0,
      "is_special_node": false,
      "hunits": 24,
      "undiscardable": true,
      "is hyphenless breakpoint": false,
      "terminal_color": "default",
      "width_list": [
        {
          "width": 24,
          "sentence_width": 24
        }
      ],
      "unformat": false
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "a"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "l"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "e"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "x"
    },
    {
      "type": "vertical_size_node",
      "diversion level": 0,
      "is_special_node": false,
      "vunits": -40
    },
    {
      "type": "vertical_size_node",
      "diversion level": 0,
      "is_special_node": false,
      "vunits": 0
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "a"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "l"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "e"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "x"
    },
    {
      "type": "word_space_node",
      "diversion level": 0,
      "is_special_node": false,
      "hunits": 24,
      "undiscardable": true,
      "is hyphenless breakpoint": false,
      "terminal_color": "default",
      "width_list": [
        {
          "width": 24,
          "sentence_width": 24
        }
      ],
      "unformat": false
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "a"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "l"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "e"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "x"
    },
    {
      "type": "word_space_node",
      "diversion level": 0,
      "is_special_node": false,
      "hunits": 48,
      "undiscardable": true,
      "is hyphenless breakpoint": false,
      "terminal_color": "default",
      "width_list": [
        {
          "width": 24,
          "sentence_width": 24
        }
      ],
      "unformat": false
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "a"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "l"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "e"
    },
    {
      "type": "glyph_node",
      "diversion level": 0,
      "is_special_node": false,
      "character": "x"
    },
    {
      "type": "vertical_size_node",
      "diversion level": 0,
      "is_special_node": false,
      "vunits": -40
    },
    {
      "type": "vertical_size_node",
      "diversion level": 0,
      "is_special_node": false,
      "vunits": 0
    }
  ]
}
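
Picking the adjusted spaces out of a dump like the one above by eye gets
tedious, so here is a small Python sketch that does it.  The `SAMPLE`
text is an abbreviated stand-in for the real dump, keeping only the keys
the code reads; `space_report` is a made-up name; and treating the sum
of the `width` entries in `width_list` as a space's unadjusted width is
my assumption for this illustration, not something the dump itself
states.

```python
import json

# Abbreviated stand-in for the `pm` dump: only the keys read below are kept.
SAMPLE = """
{
  "name": "dd",
  "node list": [
    {"type": "glyph_node", "character": "x"},
    {"type": "word_space_node", "hunits": 48,
     "width_list": [{"width": 24, "sentence_width": 24}]},
    {"type": "glyph_node", "character": "a"},
    {"type": "word_space_node", "hunits": 24,
     "width_list": [{"width": 24, "sentence_width": 24}]}
  ]
}
"""


def space_report(dump):
    """Return (hunits, adjusted) for each word_space_node, in dump order."""
    report = []
    for node in dump["node list"]:
        if node["type"] != "word_space_node":
            continue
        # Assumption: the width_list entries give the unadjusted width.
        natural = sum(entry["width"] for entry in node["width_list"])
        report.append((node["hunits"], node["hunits"] > natural))
    return report


for hunits, adjusted in space_report(json.loads(SAMPLE)):
    print(hunits, "adjusted" if adjusted else "unadjusted")
# prints:
# 48 adjusted
# 24 unadjusted
```

To run it on real output, one could replace `SAMPLE` with text read from
standard input and feed it the JSON that the command above writes to
standard error.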
[2] NEWS:
---snip---
*  A new request, `pcolor`, reports to the standard error stream details
   of each color name specified as an argument, including its color
   space identifier and channel value assignments.  Without arguments,
   all defined colors are listed.  (A device's default stroke and/or
   fill colors, "default", are not listed since they are immutable and
   their details unknown to the formatter.)
*  A new request, `pchar`, reports to the standard error stream data
   about any ordinary, special, or indexed character arguments.
*  A new request, `pcomposite`, reports to the standard error stream the
   list of defined composite character mappings.
*  A new request, `phw`, reports to the standard error stream the
   list of hyphenation exceptions associated with the current
   hyphenation language.
*  A new request, `pline`, reports to the standard error stream the list
   of output nodes (an internal data structure) corresponding to the
   pending output line.  The list is empty if no such nodes exist.
*  The `pm` request now interprets any arguments as a sequence of macro,
   string, or diversion names, and reports their contents.
*  The `pnr` request now additionally reports the autoincrementation
   amount and interpolation format of each register (if it is not
   string-valued).
*  The `pnr` request now accepts arguments.  It treats each as
   identifying a register and reports its properties to the standard
   error stream.
*  A new request, `pstream`, reports to the standard error stream the
   name of each stream opened with the `open` or `opena` requests, the
   name of the file backing it, and its mode (writing or appending).
---end snip---