[TUHS] What is this 1972 C/NB program?
Thalia Archibald via TUHS
tuhs at tuhs.org
Thu May 14 11:57:27 AEST 2026
This mysterious program is in the s1 tape, but has not yet been identified.
Does it look familiar?
https://github.com/DoctorWkt/unix-jun72/blob/master/src/cmd/unknown.c
It looks like it processes line continuations. It filters each file down to
runs of lines joined by a trailing hyphen, keeping the hyphen and stripping the
whitespace that follows the line break. Only letters and hyphens are allowed in
such runs.
[a-zA-Z-]+(-\n[ \t\n]*[a-zA-Z-]+)*
But there are bugs, so the grammar it actually accepts is the following, where
EOF is included in the negated sets as NUL:
([a-zA-Z]|-[^\n])+(-\n[ \t\n]*[^ \t\n]([a-zA-Z]|-[^\n])*)*
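For anyone who wants to poke at the intended grammar, here is a minimal sketch that checks a whole string against it with POSIX regcomp/regexec. The helper name matches_intended is made up for illustration, the pattern is anchored at both ends, and the whitespace class is taken to be space, tab, and newline, as in the second pattern. It covers only the intended grammar: the buggy variant needs NUL inside its negated sets, which a C string cannot carry.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s matches the intended grammar in full,
   0 if it does not, and -1 if the pattern fails to compile. */
int matches_intended(const char *s)
{
    /* The C escapes become real tab and newline characters in the pattern. */
    static const char pat[] = "^[a-zA-Z-]+(-\n[ \t\n]*[a-zA-Z-]+)*$";
    regex_t re;
    int rc;

    assert(s != NULL);
    if (regcomp(&re, pat, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

REG_NEWLINE is deliberately left off, so a newline is an ordinary character and the anchors apply only to the ends of the string.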
Could this be a sort of preprocessor? Perhaps for some sort of a configuration
language?
It reads each file named in its arguments, first printing the filename with the
format "%s:\n \n". The space on the otherwise empty line is strange.
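To make the odd header concrete, here is a small sketch that formats it into a buffer with snprintf. The helper name format_header is an assumption for illustration; the original presumably prints straight to the output rather than building a string.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the per-file header into buf and returns
   the length that snprintf reports. */
int format_header(char *buf, size_t cap, const char *name)
{
    assert(buf != NULL && name != NULL);
    /* Name, colon, newline, then a line holding a single space. */
    return snprintf(buf, cap, "%s:\n \n", name);
}
```

For the name "a.txt" this yields the nine bytes "a.txt:", newline, space, newline.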
It uses & and | for conditionals and is typed, characteristic of early C and NB.
It uses this horrid indentation style, which doesn't match ken's or dmr's.
Do you recognize whose it is? Example:
while((b[++i] = get(ifile)) != 0)
{if((b[i] >= 'a' & b[i] <= 'z') |
(b[i] >= 'A' & b[i] <= 'Z'))
{c[j++] = b[i];
goto cont;
}
I've taken some liberties to simplify it below. I changed it to operate on a
single file read from standard input. Without otherwise changing its behavior,
I reformatted it, replaced an unnecessary buffer with a single char, and
simplified the control flow. See the above link for the original.
char c[60];
main(argc, argv)
int argc;
char *argv[];
{
    char b;
    int isw, j, k;
    isw = j = 0;
    while((b = getchar()) != 0) {
        if((b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')) {
            c[j++] = b;
            continue;
        }
        if(b == '-') {
            c[j++] = b;
            if((b = getchar()) != '\n') {
                c[j++] = b;
                continue;
            }
            if(j == 1) {
                isw = j = 0;
                continue;
            }
            isw = 1;
            while(((b = getchar()) == ' ') || (b == '\t') || (b == '\n'))
                ;
            c[j++] = b;
            continue;
        }
        if(isw == 1) {
            k = 0;
            c[j++] = '\n';
            while(k < j)
                putchar(c[k++]);
        }
        isw = j = 0;
    }
}
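For experimenting on a modern compiler, here is a rough sketch of the same logic as a string-to-string function. It is an approximation, not the original: the names filter_runs, next, keep, and RUN_MAX are made up, the input's terminating NUL stands in for the 0 that getchar returned at end of file, and a bounds check on the run buffer is added that the 1972 code does not have.

```c
#include <assert.h>
#include <stddef.h>

#define RUN_MAX 60  /* size of the original c[] buffer */

/* Return the current character and advance, but stay on the terminating
   NUL so that repeated reads at "EOF" keep returning 0. */
static char next(const char **p)
{
    char ch = **p;
    if (ch != '\0')
        (*p)++;
    return ch;
}

/* Append ch to the run buffer, silently dropping it when the run is full.
   The 1972 code has no such check; this guard is an addition. */
static void keep(char *c, size_t *j, char ch)
{
    if (*j < RUN_MAX - 1)   /* leave one slot for the final newline */
        c[(*j)++] = ch;
}

/* Hypothetical port: filter the NUL-terminated string in into out
   (capacity cap, always NUL-terminated) and return the bytes written. */
size_t filter_runs(const char *in, char *out, size_t cap)
{
    char c[RUN_MAX];
    char b;
    size_t j = 0, k, n = 0;
    int isw = 0;

    assert(in != NULL && out != NULL && cap > 0);
    while ((b = next(&in)) != '\0') {
        if ((b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')) {
            keep(c, &j, b);
            continue;
        }
        if (b == '-') {
            keep(c, &j, b);
            if ((b = next(&in)) != '\n') {
                keep(c, &j, b);
                continue;
            }
            if (j == 1) {           /* a lone "-\n" starts nothing */
                isw = 0;
                j = 0;
                continue;
            }
            isw = 1;                /* the run now spans a continuation */
            while ((b = next(&in)) == ' ' || b == '\t' || b == '\n')
                ;
            keep(c, &j, b);
            continue;
        }
        if (isw == 1) {             /* run ended: emit it with a newline */
            c[j++] = '\n';
            for (k = 0; k < j && n + 1 < cap; k++)
                out[n++] = c[k];
        }
        isw = 0;
        j = 0;
    }
    out[n] = '\0';
    return n;
}
```

Given "foo-\n  bar baz", this writes "foo-bar\n": the hyphen is kept, the whitespace after the line break is dropped, and "baz" is discarded because it never crossed a continuation. A run that reaches the end of input without a following delimiter is never flushed, so "foo-\nbar" alone produces nothing.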
Thalia