[TUHS] origin of null-terminated strings
Steve Nickolas
usotsuki at buric.co
Fri Dec 16 13:17:51 AEST 2022
On Thu, 15 Dec 2022, Douglas McIlroy wrote:
> I think this cited quote from
> https://www.joelonsoftware.com/2001/12/11/ is urban legend.
>
> Why do C strings [have a terminating NUL]? It’s because the PDP-7
> microprocessor, on which UNIX and the C programming language were
> invented, had an ASCIZ string type. ASCIZ meant “ASCII with a Z (zero)
> at the end.”
>
> This assertion seems unlikely since neither C nor the library string
> functions existed on the PDP-7. In fact the "terminating character" of
> a string in the PDP-7 language B was the pair '*e'. A string was a
> sequence of words, packed two characters per word. For odd-length
> strings half of the final one-character word was effectively
> NUL-padded as described below.
>
> One might trace null termination to the original (1965) proposal for
> ASCII, https://dl.acm.org/doi/10.1145/363831.363839. There the only
> role specifically suggested for NUL is to "serve to accomplish time
> fill or media fill." With character-addressable hardware (not the
> PDP-7), it is only a small step from using NUL as terminal padding to
> the convention of null termination in all cases.
>
> Ken would probably know for sure whether there's any truth in the
> attribution to ASCIZ.
>
> Doug
>
For what it's worth, when I code for the Apple //e (using 65C02
assembler), I use C strings. I can just do something like
prstr:  ldy #$00
@1:     lda msg, y
        beq @2          ; string terminator
        ora #$80        ; firmware wants high bit on
        jsr $FDED       ; write char
        iny
        bne @1
@2:     rts
msg:    .byte "Hello, cruel world.", 13, 0
and using a NUL terminator just makes sense here because of how simple it
is to check for (BEQ and BNE check the 6502's zero flag, which LDA
automatically sets).
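
For comparison, here is a rough C sketch of the same loop. It is only an illustration: the name prstr_hi, the out buffer, and the cap guard are assumptions added here, with the buffer standing in for the firmware output call at $FDED. The 8-bit index mirrors the Y register, so like the assembly it stops after at most 256 characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies msg into out with the high bit set on each
   byte, stopping at the NUL terminator. Returns the number of bytes written.
   The out buffer stands in for the firmware output routine at $FDED. */
size_t prstr_hi(const char *msg, uint8_t *out, size_t cap)
{
    assert(msg != NULL && out != NULL);

    size_t n = 0;
    uint8_t y = 0;                        /* plays the role of the Y register */

    do {
        uint8_t a = (uint8_t)msg[y];      /* lda msg, y */
        if (a == 0)                       /* beq @2: a zero byte ends the string */
            break;
        if (n == cap)                     /* not in the assembly: buffer guard */
            break;
        out[n++] = (uint8_t)(a | 0x80);   /* ora #$80, then the "write char" step */
        y++;                              /* iny */
    } while (y != 0);                     /* bne @1: give up if Y wraps to zero */

    return n;
}
```

The test against zero in the loop is the whole terminator check, which is the same thing BEQ gets for free after LDA.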
-uso.