[TUHS] C history question: why is signed integer overflow UB?
Luther Johnson <luther.johnson at makerlisp.com>
Sat Aug 16 04:25:32 AEST 2025

I hear and understand what you're saying. What I'm trying to point out
is that in C as it was originally implemented, for expressions like
"a + b", "a >> 1", and "++a", C "does what the machine does". That's a
very different thing from having rational, safe, predictable language
semantics for operations on types - but it was also a strength, and a
simple way to describe what C would do: defer to machine semantics. I
believe one place in C89/C90 where this is stated explicitly, as "do
what the machine does", is "-1 >> 1" (the result of right-shifting a
negative value is implementation-defined), as opposed to "-1 / 2".  On
most machines, this program:
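
#include <stdio.h>

int main(void)
{
     /* Right-shifting a negative value is implementation-defined;
        an arithmetic shift instruction keeps the sign bit. */
     printf("%d\n", -1 >> 1);
     /* Division truncates toward zero (guaranteed since C99). */
     printf("%d\n", -1 / 2);
     return 0;
}
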
prints:
-1
0
directly reflecting the underlying machine shift and divide instructions 
- but if you made an appeal to rational integer type semantics, you 
might decide for it to do something else.

Old C was one way. Modern C has gone another way: good tools and
rational semantics for safer and/or higher-performance code, or some
balance between those and other goals. Old C just did what the machine
did, and was a high-leverage tool - but you had to understand your machine.

On 08/15/2025 11:02 AM, Nevin Liber wrote:
> On Fri, Aug 15, 2025 at 12:32 PM Luther Johnson
> <luther.johnson at makerlisp.com> wrote:
>
>     My belief is that this was done so compilers could employ
>     optimizations that did not have to consider or maintain
>     implementation-specific behavior when integers would wrap. I don't
>     agree with this, I think 2's complement behavior on integers as an
>     implementation-specific behavior can be well-specified, and
>     well-understood, machine by machine, but I think this is one of the
>     places where compilers and benchmarks conspire to subvert the
>     obvious and change the language to "language-legally" allow
>     optimizations that can break the used-to-be-expected 2's complement
>     implementation-specific behavior.
>
>
> It isn't just about optimizations.
>
> Unsigned math in C is well defined here.  The problem is that its
> wrapping behavior is almost always (but not always) a bug.  Because of
> that, for instance, one cannot write a no-false-positive sanitizer to
> catch this, because it cannot tell the difference between an accidental
> bug and a deliberate use.  This is a well-defined case with a very
> reasonable definition which most of the time leads to bugs.
>
> There are times folks want the wrapping behavior.  There are times 
> folks want saturating behavior.  There are times folks want such code 
> to error out.  There are times folks want the optimizing behavior 
> because their code doesn't go anywhere near wrapping.
>
> Ultimately, one needs different functions for the different 
> behaviors, but if you only have one spelling for that operation, you 
> can only get one behavior.  A given type has to pick one of the above 
> behaviors for a given spelling of an operation.
>
> You can, of course, disagree with what C picked here (many do), but it 
> is unlikely to change in the future.
>
> Not that it hasn't been tried.  In 2018 there was a proposal for C++,
> P0907R0 "Signed Integers are Two's Complement"
> <https://wg21.link/P0907R0>, and if you look at the next revision of
> that paper, P0907R1 <https://wg21.link/P0907R1>, there was no consensus
> for the wrapping behavior.  Quoting the paper:
>
>   * Performance concerns, whereby defining the behavior prevents
>     optimizers from assuming that overflow never occurs;
>   * Implementation leeway for tools such as sanitizers;
>   * Data from Google suggesting that over 90% of all overflow is a
>     bug, and defining wrapping behavior would not have solved the bug.
>
> Fun fact:  in C++ std::atomic<int> does wrap, so you can actually get 
> the behavior you want.  I haven't looked to see if that is also true 
> using C's _Atomic type qualifier.
>
> Full disclosure:  I am on the WG21 (C++) Committee and am starting to 
> participate on the WG14 (C) Committee.
> -- 
>  Nevin ":-)" Liber  <nevin at eviloverlord.com>  +1-847-691-1404