[TUHS] Systematic approach to command-line interfaces [ meta issues ]

Douglas McIlroy douglas.mcilroy at dartmouth.edu
Mon Aug 2 04:17:39 AEST 2021


I have considerable sympathy with the general idea of formally
specifying and parsing inputs. Langsec people make a strong case
for doing so. The white paper, "A systematic approach to modern
Unix-like command interfaces", proposes to "simplify parsing by
facilitating the creation of easy-to-use 'grammar-based' parsers".

I'm not clear on what is meant by "parser". A plain parser is a
beast that builds a parse tree according to a grammar. For most
standard Unix programs, the parse tree has two kinds of leaves:
non-options and options with k parameters. Getopt restricts
k to {0,1}.
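
As a minimal sketch of that flat tree, assuming Python's standard
getopt module as a stand-in for getopt(3), and a made-up command
line in which -v takes no parameter and -o takes one:

```python
import getopt

# Hypothetical command line: -v has k = 0 parameters, -o has k = 1;
# everything after the options is a non-option.
argv = ["-v", "-o", "out.txt", "in1.txt", "in2.txt"]

# In the option string, a trailing colon marks an option with one parameter.
opts, operands = getopt.getopt(argv, "vo:")

# The "parse tree" is flat: one leaf per option, one per non-option.
leaves = [("option", name, value) for name, value in opts]
leaves += [("non-option", arg) for arg in operands]

for leaf in leaves:
    print(leaf)
```

The leaf tuples are only an illustration of the two kinds of leaves;
getopt itself just hands back the two lists.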

Aside from fiddling with argc and argv, I see little difference
between working with a parse tree for arguments that could be
handled by getopt and using getopt directly.

A more general parser could handle more elaborate grammatical
constraints on options, for example, field specs in sort(1),
requirements on presence of options in tar(1), or representation
of multiple parameters in cut(1).
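
To make the cut(1) case concrete, here is a hedged sketch of a
parser for a list parameter such as "1-3,5,7-". The grammar in the
comments is a simplification assumed for illustration, not the
actual cut implementation, and the function names are hypothetical:

```python
def parse_number(text):
    """Accept only a run of ASCII digits as a field number."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a field number: {text!r}")
    return int(text)

def parse_range(item):
    """range := N | N- | -M | N-M

    Returns (low, high); high is None for an open-ended range.
    """
    low, dash, high = item.partition("-")
    if not dash:
        n = parse_number(low)          # a single field, N
        return (n, n)
    if not low and not high:
        raise ValueError("a bare '-' is not a range")
    return (parse_number(low) if low else 1,
            parse_number(high) if high else None)

def parse_list(spec):
    """list := range ("," range)*"""
    return [parse_range(item) for item in spec.split(",")]

print(parse_list("1-3,5,7-"))
```

A fuller version would also reject field number 0 and decreasing
ranges; those checks are left out to keep the grammar visible.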

In realizing the white paper's desire to "have the parser
provide the results to the program", it's likely that the mechanism
will, like Yacc, go beyond parsing and invoke semantic actions
as it identifies tree nodes.
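
A rough sketch of what such a mechanism might look like, in the
spirit of Yacc's actions but far simpler: each rule pairs an option
with a parameter count k and an action that fires as soon as the
node is recognized. The function parse, the rules table, and the -r
option are all hypothetical, invented for this illustration:

```python
def parse(argv, rules, on_operand):
    """Walk argv; rules maps an option name to (k, action).

    When an option with its k parameters is recognized, call
    action(*params) at once, without building a tree first.
    """
    i = 0
    while i < len(argv) and argv[i].startswith("-") and argv[i] != "-":
        name = argv[i]
        i += 1
        if name == "--":               # conventional end-of-options marker
            break
        if name not in rules:
            raise ValueError(f"unknown option: {name}")
        k, action = rules[name]
        params = argv[i:i + k]
        if len(params) < k:
            raise ValueError(f"{name} needs {k} parameter(s)")
        action(*params)                # semantic action for this node
        i += k
    for operand in argv[i:]:
        on_operand(operand)            # semantic action for a non-option

settings = {"verbose": False, "range": None, "files": []}
rules = {
    "-v": (0, lambda: settings.update(verbose=True)),
    "-r": (2, lambda lo, hi: settings.update(range=(int(lo), int(hi)))),
}
parse(["-v", "-r", "10", "20", "a.txt"], rules, settings["files"].append)
print(settings)
```

Note that -r takes k = 2 parameters, which is outside what getopt
allows; the program itself never touches argv indices.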

Pioneer Yaccification of some commands might be a worthy demo.

Doug

