[TUHS] origin of null-terminated strings

Douglas McIlroy douglas.mcilroy at dartmouth.edu
Fri Dec 16 13:02:18 AEST 2022


I think this cited quote from
https://www.joelonsoftware.com/2001/12/11/ is urban legend.

    Why do C strings [have a terminating NUl]? It’s because the PDP-7
microprocessor, on which UNIX and the C programming language were
invented, had an ASCIZ string type. ASCIZ meant “ASCII with a Z (zero)
at the end.”

This assertion seems unlikely since neither C nor the library string
functions existed on the PDP-7. In fact the "terminating character" of
a string in the PDP-7 language B was the pair '*e'. A string was a
sequence of words, packed two characters per word. For odd-length
strings half of the final one-character word was effectively
NUL-padded as described below.

One might trace null termination to the original (1965) proposal for
ASCII,  https://dl.acm.org/doi/10.1145/363831.363839. There the only
role specifically suggested for NUL is to "serve to accomplish time
fill or media fill." With character-addressable hardware (not the
PDP-7), it is only a small step from using NUL as terminal padding to
the convention of null termination in all cases.

Ken would probably know for sure whether there's any  truth in the
attribution to ASCIZ.

Doug


More information about the TUHS mailing list