[TUHS] the true reason why c++ always wins
Dan Cross via TUHS
tuhs at tuhs.org
Fri May 29 08:24:36 AEST 2026
On Thu, May 28, 2026 at 12:01 AM G. Branden Robinson via TUHS
<tuhs at tuhs.org> wrote:
> [snip; PCLSRing]
> The fundamental solution
> ------------------------
>
> ...arises from consideration of this clause of Stevens.
>
> 'if a process caught a signal while the process was blocked in a "slow"
> system call'
>
> My thesis: An OS kernel should not _have_ "slow" system calls.
Bakul hit this; "slow" and "fast" system calls are a misnomer. The
real distinction is "interruptible" and "non-interruptible" operations
in the kernel initiated by system calls. This is relevant when it
comes to deciding when you deliver a signal: if you're blocked in an
interruptible operation and a signal arrives, then the system call is
effectively aborted and the kernel delivers the signal (approximately)
immediately. After the signal is handled, presuming the process
continues running, the system call returns to the user program with an
error and errno is set to `EINTR`.
On the other hand, if a process is blocked in a non-interruptible
operation and a signal arrives, then signal delivery is deferred until
after that operation completes. Often we refer to non-interruptible
operations as "fast" because the expectation is that they'll complete
quickly, and trying to interrupt them to deliver a signal (...and
possibly clean up a bunch of state) is not worth it.
A problem is that many of the same system calls are overloaded for
both interruptible and non-interruptible operations. So for example,
reading a disk block is (presumably) pretty quick; if the process is
waiting for a block to be read from a storage device, don't bother
interrupting that operation since it'll be unblocked pretty soon
anyway. By contrast, if the process is blocked waiting to read data from
a network connection, or a pipe, or a terminal device, go ahead and
interrupt that, since the read may not be fulfilled any time soon: the
distant end might have gone down; the program writing to the pipe
might be waiting for something, or the user might have gone out to
lunch or left for the day...who knows when someone might wander by and
type a character? It could be hours or days.
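Because the same read() call covers both kinds of operation, a user
program generally can't tell in advance which one it will get, and so
the usual userland idiom is to wrap the call in a retry loop. A sketch,
assuming POSIX; `read_retry` is a hypothetical helper name, not a
standard function:

```c
#include <errno.h>
#include <unistd.h>

/*
 * Like read(), but if the call was aborted by a caught signal
 * (returns -1 with errno == EINTR), simply issue it again.
 * Any other result, including other errors, is passed through.
 */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    return n;
}
```

Whether blindly retrying is the right policy depends on the program;
sometimes the signal is precisely the thing that should make you stop
waiting.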
Btw, this touches on an earlier topic
(https://www.tuhs.org/pipermail/tuhs/2026-May/033745.html); I don't
know if you saw that, however, as I wrote it right around the time the
list had a hiccup.
> The modern Linux kernel is festooned with things called "worker threads"
> in an attempt to solve the same problem.[2] (I presume successfully.)
Not really. "Worker threads" are a means for structuring
_concurrency_, that is multiple logical flows of execution in a
program. They're implemented in Linux, as in a bunch of other systems,
because they're a convenient programming abstraction, and having a
schedulable execution context is useful. But they're largely
orthogonal to the whole fast vs slow thing and how those interact with
signals (and thus `EINTR`): if every system call was asynchronous,
having threads in a kernel would still be handy.
But the link you referred to in your footnote is about "workqueues",
which are built with a pool of worker threads, but that's different
again. Threads might be thought of as synchronous primitives: or at
least they can be used as a building block for writing synchronous
programs. They do something, and then the next thing, in program
order. Sure, something a thread does might block, and that will
usually result in the thread "yielding" control of execution to a
scheduler, which may pick some other (runnable) thread and context
switch to it, but that's fine: from a logical perspective, the
thread doesn't really know.
Work queues are different: they're an asynchronous thing. You wrap up
a little description of some work you want to have done and toss it to
someone else to run on your behalf, asynchronously. You don't wait for
it to be done before you move on to the next thing in your synchronous
thread of execution.
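To make the distinction concrete, here is a userspace sketch of the
idea using POSIX threads. This is an analogy only, and emphatically
not the Linux workqueue API; every name in it (`struct workqueue`,
`wq_start`, `wq_submit`, `wq_stop`, `wq_bump`) is invented for
illustration, and it uses a single worker and a small fixed-size ring
where a real implementation would use a pool.

```c
#include <pthread.h>
#include <stddef.h>

#define WQ_CAP 16

struct work {                   /* "a little description of some work" */
    void (*fn)(void *);
    void *arg;
};

struct workqueue {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    struct work     items[WQ_CAP];
    size_t          head, count;
    int             stopping;
    pthread_t       worker;
};

/* The worker: a plain synchronous thread that runs items in order. */
static void *wq_worker(void *p)
{
    struct workqueue *q = p;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0 && !q->stopping)
            pthread_cond_wait(&q->nonempty, &q->lock);
        if (q->count == 0) {            /* stopping and fully drained */
            pthread_mutex_unlock(&q->lock);
            return NULL;
        }
        struct work w = q->items[q->head];
        q->head = (q->head + 1) % WQ_CAP;
        q->count--;
        pthread_mutex_unlock(&q->lock);
        w.fn(w.arg);                    /* run the item outside the lock */
    }
}

/* Returns 0 on success, like pthread_create(). */
int wq_start(struct workqueue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->count = 0;
    q->stopping = 0;
    return pthread_create(&q->worker, NULL, wq_worker, q);
}

/* Hand off work and return at once; -1 if the queue is full or stopping. */
int wq_submit(struct workqueue *q, void (*fn)(void *), void *arg)
{
    int rc = -1;

    pthread_mutex_lock(&q->lock);
    if (q->count < WQ_CAP && !q->stopping) {
        size_t tail = (q->head + q->count) % WQ_CAP;
        q->items[tail].fn = fn;
        q->items[tail].arg = arg;
        q->count++;
        rc = 0;
        pthread_cond_signal(&q->nonempty);
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Let the worker drain what is queued, then join it. */
void wq_stop(struct workqueue *q)
{
    pthread_mutex_lock(&q->lock);
    q->stopping = 1;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->worker, NULL);
    pthread_mutex_destroy(&q->lock);
    pthread_cond_destroy(&q->nonempty);
}

/* Example work item: increment the int that arg points to. */
void wq_bump(void *arg)
{
    int *counter = arg;
    ++*counter;
}
```

The point to notice is that `wq_submit()` never waits for the work to
run; the caller carries on with its own synchronous flow, and only
`wq_stop()` blocks, and then only because it joins the worker.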
> But because Linux is still monolithic--and has to be, lest its lead
> author's now largely achieved objective of "world domination"[3] be
> squandered--it has an ever-growing set of worker threads as its
> revisions steadily find more things for idle cores to do.
This is conflating parallelism and concurrency; the two are not the
same. Concurrency is a way to structure the flow of a program;
threads, flow through state machines, and similar _software_
constructs are the basic unit of work there. Parallelism is distinct,
and is about simultaneity: hardware threads, usually cores or CPUs or
some other _hardware_ computing block, are the unit of work there.
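A toy illustration of concurrency with no parallelism at all: two
logical flows, each a trivial state machine, interleaved by a single
thread of execution. This is only a sketch; `struct counter_task`,
`task_step`, and `run_interleaved` are hypothetical names.

```c
#include <stddef.h>

/* A tiny state machine: counts from its current value up to limit. */
struct counter_task {
    int value;
    int limit;
};

/* Advance one task by exactly one step, if it has work left. */
void task_step(struct counter_task *t)
{
    if (t->value < t->limit)
        t->value++;
}

/*
 * Round-robin over all tasks, one step each per pass, until every
 * task has finished. Returns the total number of steps taken.
 * Nothing here ever runs simultaneously with anything else.
 */
int run_interleaved(struct counter_task *tasks, size_t n)
{
    int steps = 0;
    int active;

    do {
        active = 0;
        for (size_t i = 0; i < n; i++) {
            if (tasks[i].value < tasks[i].limit) {
                task_step(&tasks[i]);
                steps++;
                active = 1;
            }
        }
    } while (active);

    return steps;
}
```

Both flows make progress "at the same time" in the logical sense, yet
one core suffices. Running them on two cores would add parallelism
without changing the structure of either flow.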
The number of threads that exist purely within the Linux kernel
increases because people keep discovering that there are things that
the kernel needs to do concurrently, and which are conveniently
expressed as threads: threads have nice properties, like independent
scheduling, and the ability to block, and so on, that you _don't_ have
if (say) you try to do everything directly from an interrupt handler
like in Unix systems of yore.
> This strategy isn't wrong or stupid, but in my view it quietly concedes
> an argument that Torvalds and his acolytes declared themselves as having
> prevailed in over 30 years ago.
>
> You _could_ reach the same point, with more easily managed permission/
> security boundaries (I claim), by employing a microkernel design.
Eh....I don't know about that. Unix has been "multithreaded" since
the early- or mid-1970s; we've been overlapping IO with computation
since '72 or '73 or something. The situation in Linux isn't all that
different. But I wouldn't call early Unix or Linux "microkernels" as
that term is generally known; that's more about explicit control flow
via passing messages, and separation of responsibilities into discrete
tasks that are mutually ignorant of one another. Is that a "better"
organization? Maybe, but consider that early versions of Minix didn't
even use an MMU, so you don't always have an address space boundary
between those tasks; and of course usually messages are implemented
using shared memory. ';-}
Put another way, the existence of threads or asynchronous work queues
in a monolithic kernel isn't really an argument for or against
microkernels conceptually; they're orthogonal.
> And let us please forever erase, or at least suffix a fat asterisk to,
> the name "Mach" from association with microkernels. That was indeed the
> status its promulgators hoped for, but by starting with the BSD kernel
> and cutting it down, instead of building one up from nothing, they
> picked an approach that frustrated their objectives.
>
> https://cs.nyu.edu/~mwalfish/classes/15fa/ref/liedtke93improving.pdf
>
> Thus my enthusiasm for microkernels. Yes, context switches are costly,
> and mode switches are more costly still. Where these costs are
> unbearable, you either need more CPU, more memory, or to move away from
> a general-purpose OS kernel. Solve your problem in a free-standing, not
> a hosted, runtime environment.
>
> As we've seen, any proper engineering problem requires making tradeoffs
> somewhere. (If it doesn't, it's an arithmetic problem.) An OS kernel,
> in days of yore a "job monitor", was supposed to be a small, lean thing
> of minimal footprint that perturbed the availability of CPU cycles and
> RAM storage to the "real programs"--the _jobs_--as little as possible.
This is a definitional question. I like Roscoe's definition: the
operating system is, "that body of software that, 1) multiplexes the
machine's hardware resources, 2) abstracts the hardware platform, and 3)
protects software principals from each other (using the hardware)".
[https://www.usenix.org/conference/osdi21/presentation/fri-keynote]
Nothing in there says that it has to be a priori small, though of
course it should strive for efficiency. The point is to provide useful
abstractions for software. I think saying, "if the cost is unbearable
you need more hardware" is imposing an unreasonable burden on that
software.
> But if we count lines of code, monolithic (or semi-monolithic, "modular")
> kernels are some of the biggest software projects in the world, with the
> ones we've all heard of weighing in at tens of millions of lines.
>
> https://interestingengineering.com/lists/whats-the-biggest-software-package-by-lines-of-code
>
> Zero-copy operations are not a silver bullet, either. They can be
> beautiful but you then have to spend more time considering whose
> responsibility it is to validate the data in shared buffers, lest
> you end up with _no one_ taking responsibility for it...and suffering
> security pwnage.
It's interesting that you mention L4, then, which made use of
zero-copy techniques to speed up message passing. This is a
performance thing, however, and the issues transcend the
design metaphor for the kernel writ large. Consider DMA; if a program
wants to cancel an IO operation that is in flight, then you've got to
be able to reliably cancel it _or_ pin the memory it's using until the
operation is done. That's the same whether the kernel is a ukernel or
not.
> While researching this message I happened across the following paper
> proposing a dedicated memory allocator for I/O operations in the Linux
> kernel.
>
> https://netdevconf.info/0x15/papers/1/maio_netdev0x15.pdf
>
> I don't know what has become of that work. To me, the very existence of
> {e,}BPF is a concession that monokernels are too complex, and the
> proposed MAIO (from the foregoing paper) suggests the same.
Well, eBPF et al are attempts to add flexibility to systems by
dynamically instrumenting or changing their behavior. Again, I suspect
that's largely orthogonal to microkernel vs monolithic kernel. Perhaps
you are suggesting that a microkernel could swap in a new task for
something whose behavior you want to change, but the ukernel
might not allow you to do that; perhaps the task is compiled in, or
perhaps it must be specially blessed in some way; instrumenting it as
one might with eBPF or something similar may be administratively
banned.
> All right, tell me how I'm wrong! :)
Wrong? I don't know. Rather, I think this is jumping around between
multiple places and drawing conclusions that don't necessarily follow.
- Dan C.
> [1] https://lkml.org/lkml/2012/12/23/75
>
> As is the way of iron rules, people have found ways to game them.
>
> https://lwn.net/Articles/1070072/
>
> [2] https://docs.kernel.org/core-api/workqueue.html
> [3] https://www.linuxjournal.com/article/36