[TUHS] C history question: why is signed integer overflow UB?
Luther Johnson
luther.johnson at makerlisp.com
Sat Aug 16 03:36:59 AEST 2025
Or one's complement on those machines. But the idea was that this case
is out of bounds, so the compiler doesn't have to worry about whether
transforming a computation by substituting or rearranging expressions
would change its result, whatever the machine-specific behavior was.
On 08/15/2025 10:31 AM, Luther Johnson wrote:
> My belief is that this was done so compilers could employ
> optimizations that did not have to consider or maintain
> implementation-specific behavior when integers would wrap. I don't
> agree with this: I think 2's complement behavior on integers, as an
> implementation-specific behavior, can be well specified and
> well understood, machine by machine. But I think this is one of the
> places where compilers and benchmarks conspire to subvert the obvious
> and change the language to "language-legally" allow optimizations that
> can break the formerly expected 2's complement,
> implementation-specific behavior.
>
> I'm sure many people will disagree, but I think this is part of the
> slippery slope of modern C, and part of how it stopped being more
> usefully, directly, tied to the machine underneath.
>
> On 08/15/2025 10:17 AM, Dan Cross wrote:
>> [Note: A few folks Cc'ed directly]
>>
>> This is not exactly a Unix history question, but given the close
>> relationship between C's development and that of Unix, perhaps it is
>> both topical and someone may chime in with a definitive answer.
>>
>> Starting with the 1990 ANSI/ISO C standard, and continuing on to the
>> present day, C has specified that signed integer overflow is
>> "undefined behavior"; unsigned integer arithmetic is defined to be
>> modular, and unsigned integer operations thus cannot meaningfully
>> overflow, since they're always taken mod 2^b, where b is the number of
>> bits in the datum (assuming unsigned int or larger, since type
>> promotion of smaller things gets weird).
>>
>> But why is signed overflow UB? My belief has always been that signed
>> integer overflow across various machines has non-deterministic
>> behavior, in part because some machines would trap on overflow (e.g.,
>> Unisys 1100 series mainframes) while others used non-2's-complement
>> representations for signed integers (again, the Unisys 1100 series,
>> which used 1's complement), and so the results could not be precisely
>> defined: even if it did not trap, overflowing a 1's complement machine
>> yielded a different _value_ than on 2's complement. And around the
>> time of initial standardization, targeting those machines was still an
>> important use case. So while 2's complement with silent wrap-around
>> was common, it could not be assumed, and once machines that generated
>> traps on overflow were brought into the mix, it was safer to simply
>> declare behavior on overflow undefined.
>>
>> But is that actually the case?
>>
>> Thanks in advance.
>>
>> - Dan C.
>>
>