[TUHS] origin of null-terminated strings

Dan Cross crossd at gmail.com
Sat Dec 17 02:10:31 AEST 2022


On Fri, Dec 16, 2022 at 8:42 AM Dan Halbert <halbert at halwitz.org> wrote:
> ASCIZ was an assembler directive used for a number of different DEC computers, and also the name for null-terminated strings. I learned it for the PDP-10, but I'm sure it existed on other machines. It is in some PDP-10 documentation I am looking at right now. Anyone who used DEC and did assembly programming would have known about it. Various system calls took ASCIZ strings.

This raises something I've always been curious about. To what extent were
the Unix folks at Bell Labs already familiar with DEC systems before the PDP-7?

It strikes me that much of the published work was centered around IBM and GE
systems (e.g., Ken's wonderful paper on regular expressions, and of course the
Multics work). Were there other Digital machines floating around? I know a
proposal was written to get a PDP-10 for operating systems research, but it
wasn't approved.

Relatedly, was any thought given to trying to get a 360 system?

On 12/16/22 04:13, Dr Iain Maoileoin wrote:
> ASCIZ
> Lost in the mists of time in my mind.

Origin, perhaps, but it exists in contemporary assemblers. Like most
sane people, I try to avoid being in assembler for too long, but when
you're first turning on a machine it is useful to be able to squirt a
message out of the UART if something goes dramatically wrong, and the
directive is handy for that.

It seems to have made its way into Research assembler via BSD; it's in
locore.s in 8th Edition, for instance, but doesn't appear before that.  The
"UNIX Assembler Manual" describes "String Statements" for the 7th
Edition assembler; strings are sequences of ASCII characters between
'<' and '>'.  But it doesn't say that they're NUL terminated, and they are
not: adding the terminator was manual via the familiar `\0` escape
sequence.

        - Dan C.


> I remember running into a .asciz directive in the 70s “somewhere”.
> It was an assembler directive in one of the RT11 systems??? or perhaps the unix bootstrap and/or “.s” files - when I get some time I will go read some old code/manuals.
>
> I
>
> Yes, it put a null byte at the end of a string.
>
> On 16 Dec 2022, at 03:14, Ken Thompson <kenbob at gmail.com> wrote:
>
> asciz -- this is the first time i heard of it.
> doug -- yes.
>
>
> On Thu, Dec 15, 2022 at 7:04 PM Douglas McIlroy <douglas.mcilroy at dartmouth.edu> wrote:
>>
>> I think this cited quote from
>> https://www.joelonsoftware.com/2001/12/11/ is urban legend.
>>
>>     Why do C strings [have a terminating NUL]? It’s because the PDP-7
>> microprocessor, on which UNIX and the C programming language were
>> invented, had an ASCIZ string type. ASCIZ meant “ASCII with a Z (zero)
>> at the end.”
>>
>> This assertion seems unlikely since neither C nor the library string
>> functions existed on the PDP-7. In fact the "terminating character" of
>> a string in the PDP-7 language B was the pair '*e'. A string was a
>> sequence of words, packed two characters per word. For odd-length
>> strings half of the final one-character word was effectively
>> NUL-padded as described below.
>>
>> One might trace null termination to the original (1965) proposal for
>> ASCII, https://dl.acm.org/doi/10.1145/363831.363839. There the only
>> role specifically suggested for NUL is to "serve to accomplish time
>> fill or media fill." With character-addressable hardware (not the
>> PDP-7), it is only a small step from using NUL as terminal padding to
>> the convention of null termination in all cases.
>>
>> Ken would probably know for sure whether there's any truth in the
>> attribution to ASCIZ.
>>
>> Doug
>
>
>

