[TUHS] Minimum Array Sizes in 16 bit C (was Maximum)
Luther Johnson
luther.johnson at makerlisp.com
Sun Sep 29 03:52:16 AEST 2024
In the compilers I'm talking about, you pass a structure by passing a
pointer to it - but the receiving function knows the argument is a
structure, and not a pointer to a structure, so it knows it needs to use
the pointer to copy to its own local version.
On 09/28/2024 10:47 AM, Luther Johnson wrote:
> I don't know that structure copying breaks any complexity or bounds on
> execution time rules. Many compilers may be different, but in the
> generated code I've seen, when you pass in a structure to a function,
> the receiving function copies it to the stack. In the portable C
> compiler, when you return a structure as a result, it is first copied
> to a static area, a pointer to that area is returned, then the caller
> copies that out to wherever it's meant to go, either a variable that's
> being assigned (which could be on the stack or elsewhere), or to a place
> on the stack that was reserved for it because that result will now be an
> argument to another function to be called. So there's some copying, but
> that's proportional to the size of the structure, it's linear, and
> there's no dynamic memory allocation going on.
>
> On 09/28/2024 09:58 AM, G. Branden Robinson wrote:
>> At 2024-09-28T09:34:14-0400, Douglas McIlroy wrote:
>>>> C's refusal to specify dynamic memory allocation in the language
>>>> runtime (as opposed to, eventually, the standard library)
>>> This complaint overlooks one tenet of C: every operation in what you
>>> call "language runtime" takes O(1) time. Dynamic memory allocation
>>> is not such an operation.
>> A fair point. Let me argue about it anyway. ;-)
>>
>> I'd make three observations. First, K&R did _not_ tout this in their
>> book presenting ANSI C. I went back and checked the prefaces,
>> introduction, and the section presenting a simple malloc()/free()
>> implementation. The tenet you claim for the language is not explicitly
>> articulated and, if I squint really hard, I can only barely perceive
>> (imagine?) it deeply between the lines in some of the prefatory material
>> to which K&R mostly confine their efforts to promote the language. In
>> my view, a "tenet" is something more overt: the sort of thing U.S.
>> politicians try to get hung on the walls of every public school
>> classroom, like Henry Spencer's Ten Commandments of C[1] (which itself
>> does not mention this "core language has only O(1) features" principle).
>>
>> Second, in reviewing K&R I was reminded that structure copying is part
>> of the language. ("There are no operations that manipulate an entire
>> array or string, although structures may be copied as a unit."[2])
>> Doesn't that break the tenet right there?
>>
>> Third, and following on from the foregoing, your point reminded me of my
>> youth programming non-pipelined machines with no caches. You could set
>> your watch by (just about) any instruction in the set--and often did,
>> because we penurious microcomputer guys often lacked hardware real-time
>> clocks, too. That is to say, for a while, every instruction had an
>> exactly predictable and constant cycle count. (The _value_ of that
>> constant might depend on the addressing mode, because that would have
>> consequences on memory fetches, but the principle stood.) When the Z80
>> extended the 8080's instruction set, they ate from Tree of Knowledge
>> with block-copy instructions like LDIR and LDDR, and all of a sudden you
>> had instructions with O(n) cycle counts. But as a rule, programmers
>> seemed to welcome this instead of recognizing it as knowing sin, because
>> you generally knew worst-case how many bytes you'd be copying and take
>> that into account. (The worst worst case was a mere 64kB!)
>>
>> Further, Z80 home computers in low-end configurations (that is, no disk
>> drives) often did a shocking thing: they ran with all interrupts masked.
>> All the time. The one non-maskable interrupt was RESET, after which you
>> weren't going to be resuming execution of your program anyway. Not from
>> the same instruction, at least. As I recall the TRS-80 Model I/III/4
>> didn't even have logic on the motherboard to decode the Z80's "interrupt
>> mode 2", which was vectored, I think. Even in the "high-end"
>> configurations of these tiny machines, you got a whopping ONE interrupt
>> to play with ("IM 1").
>>
>> Later, when the Hitachi 6309 smuggled similar block-transfer decadence
>> into its extensions to the Motorola 6809 (to the excitement of us
>> semi-consciously Unix-adjacent OS-9 users) they faced a starker problem,
>> because the 6809 didn't wall off interrupts in the same way the 8080 and
>> Z80 did. They therefore presented the programmer with the novelty of the
>> restartable instruction, and a new generation of programmers became
>> acquainted with the hard lessons time-sharing minicomputer people were
>> familiar with.
>>
>> My point in this digression is that, in my opinion, it's tough to hold
>> fast to the O(1) tenet you claim for C's core language and to another at
>> the same time: the language's identity as a "portable assembly
>> language". Unless every programmer has control over the compiler--and
>> they don't--you can't predict when the compiler will emit an O(n) block
>> transfer instruction. You'll just have to look at the disassembly.
>>
>> _Maybe_ you can retain purity by...never copying structs. I don't think
>> lint or any other tool ever checked for this. Your advocacy of this
>> tenet is the first time I've heard it presented.
>>
>> If you were to suggest to me that most of the time I've spent in my life
>> arguing with C advocates was with rotten exemplars of the species and
>> therefore was time wasted, I would concede the point.
>>
>> There's just so danged _many_ of them...
>>
>>> Your hobbyhorse awakened one of mine.
>>>
>>> malloc was in v7, before the C standard was written. The standard
>>> spinelessly buckled to allow malloc(0) to return 0, as some
>>> implementations gratuitously did.
>> What was the alternative? There was no such thing as an exception, and
>> if a pointer was an int and an int was as wide as a machine address,
>> there'd be no way to indicate failure in-band, either.
>>
>> If the choice was that or another instance of atoi()'s wincingly awful
>> "does this 0 represent an error or successful conversion of a zero
>> input?" land mine, ANSI might have made the right choice.
>>
>>> I can't imagine that any program ever actually wanted the feature. Now
>>> it's one more undefined behavior that lurks in thousands of programs.
>> Hoare admitted to only one billion-dollar mistake. No one dares count
>> how many to write in C's ledger. This was predicted, wasn't it?
>> Everyone loved C because it was fast: it was performant, because it
>> never met a runtime check it didn't eschew--recall again Kernighan
>> punking Pascal on this exact point--and it was quick for the programmer
>> to write because it never met a _compile_-time check it didn't eschew.
>> C was born as a language for wizards who never made mistakes.
>>
>> The problem is that, like James Madison's fictive government of angels,
>> such entities don't exist. The staff of the CSRC itself may have been
>> overwhelmingly populated with frank, modest, and self-deprecating
>> people--and I'll emphasize here that I'm aware of no accounts that this
>> is anything but true--but C unfortunately played a part in stoking a
>> culture of pretension among software developers. "C is a language in
>> which wizards program. I program in C. Therefore I'm a wizard." is how
>> the syllogism (spot the fallacy) went. I don't know who does more
>> damage--the people who believe their own BS, or the ones who know
>> they're scamming their colleagues.
>>
>>> There are two arguments for malloc(0). Most importantly, it caters for
>>> a limiting case for aggregates generated at runtime--an instance of
>>> Kernighan's Law, "Do nothing gracefully". It also provides a way to
>>> create a distinctive pointer to impart some meta-information, e.g.
>>> "TBD" or "end of subgroup", distinct from the null pointer, which
>>> merely denotes absence.
>> I think I might be confused now. I've frequently seen arrays of structs
>> initialized from lists of literals ending in a series of "NULL"
>> structure members, in code that antedates or ignores C99's wonderful
>> feature of designated initializers for aggregate types.[3] How does
>> malloc(0) get this job done and what benefit does it bring?
>>
>> Last time I posted to TUHS I mentioned a proposal for explicit tail-call
>> elimination in C. I got the syntax wrong. The proposal was "return
>> goto;". The WG14 document number is N2920 and it's by Alex Gilding.
>> Good stuff.[4] I hope we see it in C2y.
>>
>> Predictably, I must confess that I didn't make much headway on
>> Schiller's 1975 "secure kernel" paper. Maybe next time.
>>
>> Regards,
>> Branden
>>
>> [1] https://web.cs.dal.ca/~jamie/UWO/C/the10fromHenryS.html
>>
>> I can easily imagine that the tenet held at _some_ point in C's
>> history. It's _got_ to be the reason that the language
>> relegates memset() and memcpy() to the standard library (or to the
>> programmer's own devising)! :-O
>>
>> [2] Kernighan & Ritchie, _The C Programming Language_, 2nd edition, p. 2
>>
>> Having thus admitted the camel's nose to the tent, K&R would have
>> done the world a major service by making memset(), or at least
>> bzero(), a language feature, the latter perhaps by having "= 0"
>> validly apply to an lvalue of non-primitive type. Okay,
>> _potentially_ a major service. You'd still need the self-regarding
>> wizard programmers to bother coding it, which they wouldn't in many
>> cases "because speed". Move fast, break stuff.
>>
>> C++ screwed this up too, and stubbornly stuck by it for a long
>> time.
>>
>> https://cplusplus.github.io/CWG/issues/178.html
>>
>> [3] https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
>> [4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2920.pdf
>
>