[TUHS] I can't drive 55: "GOTO considered harmful" 55th anniversary

Sat Mar 11 01:54:24 AEST 2023

On Fri, Mar 10, 2023 at 6:37 AM Noel Chiappa <jnc at mercury.lcs.mit.edu> wrote:
>
>     > From: Warner Losh
>
>     > In C I use it all the time to do goto err for common error recovery
>     > because C doesn't have anything better.
>
> That's because C doesn't have 'conditions'. (Apparently, from the following
> posts, most people here are unfamiliar with them. Too bad, they are a great
> idea. OK summary here:
>
>   http://gunkies.org/wiki/Condition_handler
>
> for those who are unfamiliar with the idea.)

I don't know if I'd say they're a great idea.

The problem with exceptions (nee conditions, though I most often
associate the term "condition" with Lisp, and in particular Common
Lisp's implementation has a rather different flavor in the ability to
restart execution _at the point where the condition was raised_, even
if the handler is conceptually much higher up in the call stack) is
that they introduce non-linear control flow, which can be very
difficult to reason about. This is especially challenging in languages
(such as C) where code that allocates resources must deallocate them
manually, without some notion of RAII or finalizers for arbitrary
objects. It's really easy to introduce leaks in code with exceptions,
and while often the leaked resource is memory, where you're ok if
you're in a garbage-collected language, you're gonna have a bad day
when it's something like file descriptors (which are much scarcer than
memory).
Unless pretty much everything is behind a stack guard, or whatever the
moral equivalent in your language is, you're constrained to handling
the errors at many places in the call stack, in which case, why
bother?

But the point about error handling and the use of `goto` in C in lieu
of something better is well taken, and conditions are _a_ reasonable
mechanism for dealing with the issue. I'd argue that a `Result` monad
and some short-circuiting sugar, used in conjunction with RAII, is
another, and a better one. For example, in Rust the result type
interacts with the `?` operator so that, if a call returns a
`Result<T, E>`, the `T` is unwrapped if the result is `Ok(T)`;
otherwise, the enclosing function returns early with the `Err(E)`.
So one can write code that has the brevity of exceptions without
introducing the control-flow weirdness:

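    use std::io::Write; // trait that provides `write` on TcpStream

    fn make_request(host: &str) -> std::io::Result<()> {
        let req = "hi\r\n".as_bytes();
        // Each `?` returns early with the Err if that step fails.
        std::net::TcpStream::connect(host)?.write(req)?;
        Ok(())
    }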

(Note that the TCP stream will be "dropped" after the call to `write`,
and the drop impl on the TcpStream type will close the socket.)
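
To make the drop behavior concrete, here is a minimal sketch. `Guard`,
`failing_step`, and `use_resource` are hypothetical names standing in
for a real resource such as a socket; none of them come from the
standard library. All it tries to show is that a value's `Drop` impl
still runs when `?` returns early:

```rust
use std::cell::Cell;

// Hypothetical stand-in for a resource such as a socket.
// Its Drop impl records that the resource was released.
struct Guard<'a> {
    released: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.released.set(true);
    }
}

// Hypothetical step that always fails, to force the early return.
fn failing_step() -> Result<(), String> {
    Err("boom".to_string())
}

fn use_resource(released: &Cell<bool>) -> Result<(), String> {
    let _guard = Guard { released };
    failing_step()?; // returns Err here; `_guard` is dropped on the way out
    Ok(())
}
```

There is no cleanup code on the error path in `use_resource`; the
release happens in `drop`, which is why the early return does not leak.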

Combined with pattern matching on the error type, this is quite expressive.
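
As a rough illustration, assuming a hypothetical `describe` helper
(not part of any library) that classifies an `io::Result<()>` such as
the one `make_request` returns:

```rust
use std::io::{self, ErrorKind};

// Hypothetical helper: maps the outcome of a request to a short label.
fn describe(result: &io::Result<()>) -> &'static str {
    match result {
        Ok(()) => "sent",
        Err(e) => match e.kind() {
            ErrorKind::ConnectionRefused => "refused",
            ErrorKind::TimedOut => "timed out",
            // ErrorKind is non-exhaustive, so a catch-all arm is required.
            _ => "other failure",
        },
    }
}
```

The caller decides which failures deserve special treatment, and the
compiler insists on the catch-all arm, so nothing is silently ignored.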

> I was at one point writing a compiler using a recursive descent parser, and
> decided the code would be a lot simpler/cleaner if I had them. (If, for
> example, one discovers an un-expected 'end of file', there can be
> an extremely large number of procedure invocations in between where that is
> discovered, and where it is desirable to handle it. So every possible
> intervening procedure would have to have an 'unexpected EOF' return value,
> one would have to write code in every possible intervening procedure to
> _handle_ an 'unexpected EOF' return value, etc.)
>
> (Yes, I could have used setjmp/longjmp; those are effectively a limited
> version of condition handlers.)
>
> Since C has a stack, it was relatively easy to implement, without any compiler
> support: on() became a macro for 'if _on("condition_name")'; _on() was a
> partially-assembler procedure which i) stacked the handler (I forget where I
> put it; I may have created a special 'condition stack', to avoid too many
> changes to the main C stack), and ii) patched the calling procedure's return
> point to jump to some code that unstacked the handler, and iii) returned
> 'false'. If the condition occurred, a return from _on() was simulated,
> returning 'true', etc.
>
> So the code had things like:
>
>         on ("unexpected EOF") {
>                 code to deal with it
>                 }
>
> With no compiler support, it added a tiny bit of overhead
> (stacking/unstacking conditions), but not too bad.
>
> John Wroclawski and someone implemented a very similar thing
> entirely in C; IIRC it was built on top of setjmp/longjmp. I don't
> recall how it dealt with un-stacking handlers on exit (which mine
> did silently).

The plan9 kernel has something remarkably similar; there is a
per-process error stack containing the local equivalent of a bunch of
`jmp_buf`s. One could write `if (waserror()) { /* handle cleanup */
}`, where `waserror` would push a jmp_buf onto the stack a la the
`setjmp` equivalent. Code later on could call `error(Ewhatever)`, and
that would cache the error somewhere in the proc struct and invoke the
`longjmp` equivalent to jump back to the label at the top of the
stack, where `waserror()` would now return 1. Code on the non-error
path had to `poperror()` manually to pop the stack. I'm told that the plan9 C
compilers were caller-save in part to keep these state labels svelte.

This got pulled into the Akaros kernel at one point, and for a while,
we had someone working with us who was pretty prominent in the Linux
community. What was interesting was that he was so used to the `goto
err;` convention from that world that he just could not wrap his head
around how the `waserror()` stuff worked; at one point there was a
sequence like,

    char *foo;

    if (waserror()) {
        free(foo);
        return -1;
    }
    foo = malloc(len);
    something();
    poperror();
    free(foo);
    return 0;

...and the guy just couldn't get how the code inside of the
`waserror()` block wasn't trashing the system, since obviously the
malloc was done after `waserror()`, and so the pointer was meaningless
at that point. (The block only runs if a later `error()` jumps back
to it, by which time `foo` has been assigned.) It took quite a while
to explain what was going on.

Btw: I was once told by a reliable authority that the Go developers
considered implementing exceptions, but decided against it because of
the cognitive load it imposes.

        - Dan C.