[TUHS] FreeBSD behind the times? (was: Favorite unix design principles?)

John Gilmore gnu at toad.com
Sat Feb 6 11:18:34 AEST 2021


On Thu, Feb 04, 2021 at 09:17:54PM -0800, Bakul Shah wrote:
> Write(2)ing to a mapped page sounds pretty dodgy. Likely to get you
> in trouble in any case. Similarly read(2)ing. 

Uh, no.  You misunderstand completely.

The purpose of the kernel is to provide a reliable interface to system
facilities, that lets processes NOT DEPEND on what other processes are
doing.

The decision about whether Tool X uses mmap() versus read() to access a
file, or mmap() versus write() to change one, is a decision that DOES
NOT DEPEND on what Tool Y is doing.  Tools X and Y may have been written
by different groups in different decades.  Tool X may have been written
to use stdio, which used read().  Three years later, stdio got rewritten
to use mmap() for speed, but that's invisible to the author of Tool X.
And maybe an end user in 2025 decides to use both Tool X and Tool Y on
the same file.  So only much later will any malign interactions between
read/write and mmap actually be noticed by end users.  And the fix is
not to create new dependencies between Tool X, stdio, and Tool Y.  It is
to fix the kernel so they do not depend on each other!
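
To make the expectation concrete, here is a minimal sketch (an
illustration, not code from any of the tools discussed) of the property
a kernel should provide: data stored with write() through one descriptor
shows up in a shared mapping of the same file, and a store through the
mapping shows up in a later read().  The function name is hypothetical,
and the sketch assumes a Unix system where Python's mmap.ACCESS_WRITE
gives a shared, writable mapping.

```python
import mmap
import os
import tempfile


def write_then_map_roundtrip():
    """Return (bytes seen via the mapping, bytes seen via pread)."""
    fd, path = tempfile.mkstemp()
    try:
        # Give the file one page of content so it can be mapped.
        os.write(fd, b"A" * mmap.PAGESIZE)

        # "Tool Y": maps the file shared and writable.
        with mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_WRITE) as m:
            # "Tool X": an independent descriptor using plain read/write.
            fd2 = os.open(path, os.O_RDWR)
            try:
                # Tool X writes; Tool Y should see it through the mapping.
                os.pwrite(fd2, b"written", 0)
                seen_via_mmap = m[:7]

                # Tool Y stores through the mapping; Tool X should read it.
                m[100:106] = b"mapped"
                seen_via_read = os.pread(fd2, 6, 100)
            finally:
                os.close(fd2)
    finally:
        os.close(fd)
        os.unlink(path)
    return seen_via_mmap, seen_via_read


if __name__ == "__main__":
    print(write_then_map_roundtrip())
```

On a kernel that keeps file I/O and mappings coherent, neither tool has
to know that the other one exists for both values to come back as
written.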

Here is a real-life example from my own experience.

There is a long-standing bug in the Linux kernel, in which the inotify
interface simply does not work on nested file systems.  This caused a
long-standing bug in Ubuntu, which I reported in 2012 here:

  https://bugs.launchpad.net/ubuntu/+source/rpcbind/+bug/977847

The symptom was that after booting from a LiveCD image, "apt-get
install" for system services (in my case an NFS client package) wouldn't
work.  Turned out the system startup scripts used inotify() to notice
and start newly installed system services.  The root cause was that
inotify failed because the root file system was an "overlayfs" that
overlaid a RAM disk on top of the read-only LiveCD file system.  The
people who implemented "overlayfs" didn't think inotify() was important,
or they thought it would be too much work to make it actually meet its
specs, so they just made it ignore changes to the files in the overlaid
file system.  So the startup daemon's inotify() would never report the
creation of the new files for the new services, because those files
were in the overlaying RAM disk.  As a result it would not start them,
and the user would notice the error.
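
For readers who have not used the interface, here is a minimal sketch of
what such a startup daemon relies on: watch a directory, and expect an
event when a file is created in it.  This is an illustration, not the
code from the Ubuntu startup scripts.  It assumes Linux, calls the libc
functions inotify_init() and inotify_add_watch() through ctypes because
the Python standard library has no inotify wrapper, and takes the
IN_CREATE value from <sys/inotify.h>.  The function name and the file
name are hypothetical.

```python
import ctypes
import os
import select
import struct
import tempfile

IN_CREATE = 0x00000100   # from <sys/inotify.h>
EVENT_HEADER = "iIII"    # struct inotify_event: wd, mask, cookie, len
EVENT_HEADER_SIZE = struct.calcsize(EVENT_HEADER)


def watch_for_creation(directory, new_name, timeout=2.0):
    """Watch `directory`, create `new_name` in it, return names reported."""
    libc = ctypes.CDLL(None, use_errno=True)
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    try:
        wd = libc.inotify_add_watch(fd, os.fsencode(directory), IN_CREATE)
        if wd < 0:
            raise OSError(ctypes.get_errno(), "inotify_add_watch failed")

        # Stand-in for "apt-get install" dropping a new service file.
        open(os.path.join(directory, new_name), "w").close()

        # Wait for the kernel to report something, but not forever.
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            return []  # no event delivered within the timeout

        buf = os.read(fd, 4096)
        names = []
        offset = 0
        while offset < len(buf):
            _wd, mask, _cookie, length = struct.unpack_from(
                EVENT_HEADER, buf, offset)
            offset += EVENT_HEADER_SIZE
            # The name field is NUL-padded to `length` bytes.
            name = buf[offset:offset + length].rstrip(b"\0")
            offset += length
            if mask & IN_CREATE:
                names.append(os.fsdecode(name))
        return names
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(watch_for_creation(d, "new-service.conf"))
```

On a file system where inotify meets its specs, the returned list
contains the new file name.  The failure described above would show up
here as an empty list, with nothing to tell the caller that events are
being dropped.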

The underlying overlayfs bug was reported in 2011 here:

  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/882147

As far as I know it has never been fixed.  (The bug report was
closed in 2019 for one of the usual bogus reasons.)

The problem came because real tools (like systemd, or the tail command)
actually started using inotify, assuming that as a well documented
kernel interface, it would actually meet its specs.  And because a
completely unrelated other real tool (like the LiveCD installer)
actually started using overlayfs, assuming that as a well documented
kernel interface, it too would actually meet its specs.  And then one
day somebody tried to use both those tools together and they failed.

That's why telling people "Don't use mmap() on the same file that you
use read() on" is an invalid attitude for a Real Kernel Maintainer.
Props to Larry McVoy for caring about this.  Boos to the Linux
maintainers of overlayfs who didn't give a shit.

	John
	

