[TUHS] origin of null-terminated strings

Steve Nickolas usotsuki at buric.co
Fri Dec 16 13:17:51 AEST 2022


On Thu, 15 Dec 2022, Douglas McIlroy wrote:

> I think this cited quote from
> https://www.joelonsoftware.com/2001/12/11/ is urban legend.
>
>    Why do C strings [have a terminating NUL]? It’s because the PDP-7
> microprocessor, on which UNIX and the C programming language were
> invented, had an ASCIZ string type. ASCIZ meant “ASCII with a Z (zero)
> at the end.”
>
> This assertion seems unlikely since neither C nor the library string
> functions existed on the PDP-7. In fact the "terminating character" of
> a string in the PDP-7 language B was the pair '*e'. A string was a
> sequence of words, packed two characters per word. For odd-length
> strings half of the final one-character word was effectively
> NUL-padded as described below.
>
> One might trace null termination to the original (1965) proposal for
> ASCII,  https://dl.acm.org/doi/10.1145/363831.363839. There the only
> role specifically suggested for NUL is to "serve to accomplish time
> fill or media fill." With character-addressable hardware (not the
> PDP-7), it is only a small step from using NUL as terminal padding to
> the convention of null termination in all cases.
>
> Ken would probably know for sure whether there's any  truth in the
> attribution to ASCIZ.
>
> Doug
>

For what it's worth, when I code for the Apple //e (using 65C02 
assembler), I use C strings.  I can just do something like

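prstr:   ldy       #$00      ; Y indexes into the string
@1:      lda       msg, y    ; load next char; sets Z if it is NUL
         beq       @2        ; string terminator
         ora       #$80      ; firmware wants high bit on
         jsr       $FDED     ; write char (COUT)
         iny
         bne       @1        ; loop; falls through if Y wraps to zero
@2:      rts

msg:     .byte     "Hello, cruel world.", 13, 0   ; text, CR, NUL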

and using a NUL terminator just makes sense here because of how simple it
is to check for: LDA sets the 6502's zero flag whenever the byte it loads
is zero, and BEQ and BNE branch on that flag, so no separate compare is
needed.
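
For comparison, here is a rough C sketch of the same idea. It is only an
illustration: the function name set_high_bit, the output buffer, and the
cap parameter are made up for this example and are not part of any
firmware or library. It writes into a buffer instead of calling a ROM
routine so that it stays portable.

```c
#include <stddef.h>

/* Hypothetical helper, not a real firmware or library routine.
 * Walks a NUL-terminated string and stores each character with its
 * high bit set into out, stopping at the terminator or after cap
 * characters, whichever comes first. Returns the number stored. */
size_t set_high_bit(const char *msg, unsigned char *out, size_t cap)
{
    size_t n = 0;

    /* The loop condition is the terminator test: a zero byte ends the
     * string, much as LDA followed by BEQ does in the assembly. */
    while (n < cap && msg[n] != '\0') {
        out[n] = (unsigned char)((unsigned char)msg[n] | 0x80u);
        n++;
    }
    return n;
}
```

The cap argument plays roughly the part that the 8-bit Y register plays
in the assembly, where the loop gives up once the index wraps around.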

-uso.

