[TUHS] ATC/OSDI'21 joint keynote: It's Time for Operating Systems to Rediscover Hardware (Timothy Roscoe)

Marshall Conover marzhall.o at gmail.com
Fri Sep 3 06:12:58 AEST 2021


Kevin, I think that's a great framing of why this talk actually seemed
inverted in its focus for me, and a good identification of why the
presenter might see OS development stalling out and ossifying around
Linux.

I come from the opposite side of the presenter here: my frustration as
a backend dev and user has been that modern OSs still think presenting
an abstraction over my resources means making it easy to use one
single machine (or, as the presenter brings up, a subset of the
machine). Instead, my resources are spread out among many machines and
a number of remote web services, for which I'd like to have one
seamless interface - both for development and use. From an OS
perspective, Plan 9 and its daughter systems have come the closest
I've seen to addressing this by intentionally thinking about the
problem and creating an API system for representing resources that
reaches across networks, and a mutable namespace for using and
manipulating those APIs. Prominent operating systems today have pulled
other ideas from Plan 9, but they seem to have missed the importance
of having an opinion on the distributed nature of modern computing. As
a result, their development has been relegated to what they already
do: being a platform for the things that actually provide an
abstraction over my resources.

And userspace systems have filled the demand for abstracting
distributed resource usage to demonstrable business success, if
questionable architectural success (as in, they can still be a
confusing pain in the buns and require excess work sometimes). As a
dev, the systems that have come the closest to presenting one unified
abstraction over my resources are the meta-services offered by Google,
MS and Amazon, such as GCP, Azure and AWS.

I think the distributed nature of things today is also potentially why
the focus of the conference is on distributed systems now, as lamented
by the presenter. Granted that I'm not the sharpest bulb in the
drawer, but I can't think of a way an OS taking more direct control of
the internal hardware of an individual computer would impact me beyond
the security issues mentioned in the talk. However, I can think of a
number of ways an OS being opinionated about working with networked
machines would greatly improve my situation. Boy, it would be great to
just spin up a cluster of machines, install one OS on all of them, and
treat them as one resource. That's the dream the k8s mentality
promises, and MS and Amazon are already walking towards being this
sort of one-stop shop: "want cluster computing? Press a button to spin
up a cluster with ECS, and store your containers in ECR. Want to run a
program or twelve somewhere on the cluster? Just tell us which one and
how many. Worried about storage? Just tell us what size storage it
needs. We've got you covered!" None of it is perfect, but it shows
that there's heavy demand for a system where users don't have to think
about how to architect and maintain arbitrary groupings of their
resources as necessitated by how OSs think of their job now, and
instead just want to feel as if they're writing and running programs
on one big 'thing'.

So I think the ossification around Linux mentioned in the talk might
come down to this: unless operating systems start doing something more
than being a host for the tools that actually provide an abstraction
over all my resources, there's no real reason to make them do anything
else. If
you're not making it easier to use my resources than k8s or Azure, why
would I want you?

Cheers,

Marshall






On Thu, Sep 2, 2021 at 11:42 AM Kevin Bowling <kevin.bowling at kev009.com> wrote:
>
> On Wed, Sep 1, 2021 at 3:00 PM Dan Cross <crossd at gmail.com> wrote:
> >
> > One of the things I really appreciate about participating in this community and studying Unix history (and the history of other systems) is that it gives one firm intellectual ground from which to evaluate where one is going: without understanding where one is and where one has been, it's difficult to assert that one isn't going sideways or completely backwards. Maybe either of those outcomes is appropriate at times (paradigms shift; we make mistakes; etc) but generally we want to be moving mostly forward.
> >
> > The danger when immersing ourselves in history, where we must consider and appreciate the set of problems that created the evolutionary paths leading to the systems we are studying, is that our thinking can become calcified in assuming that those systems continue to meet the needs of the problems of today. It is therefore always important to reevaluate our base assumptions in light of either disconfirming evidence or (in our specific case) changing environments.
> >
> > To that end, I found Timothy Roscoe's (ETH) joint keynote address at ATC/OSDI'21 particularly compelling. He argues that what we consider the "operating system" is only controlling a fraction of a modern computer these days, and that in many ways our models for what we consider "the computer" are outdated and incomplete, resulting in systems that are artificially constrained, insecure, and with separate components that do not consider each other and therefore frequently conflict. Further, hardware is ossifying around the need to present a system interface that can be controlled by something like Linux (used as a proxy more generally for a Unix-like operating system), simultaneously broadening the divide and making it ever more entrenched.
> >
> > Another theme in the presentation is that, to the limited extent the broader systems research community is actually approaching OS topics at all, it is focusing almost exclusively on Linux in lieu of novel systems; where non-Linux systems are featured (something like 3 accepted papers between SOSP and OSDI in the last two years out of $n$), the described systems are largely Linux-like. Here the presentation reminded me of Rob Pike's "Systems Software Research is Irrelevant" talk (slides of which are available in various places, though I know of no recording of that talk).
> >
> > Roscoe's argument is that all of this should be seen as both a challenge and an opportunity for new research into operating systems specifically: what would it look like to take a holistic approach towards the hardware when architecting a new system to drive all this hardware? We have new tools that can make this tractable, so why don't we do it? Part of it is bias, but part of it is that we've lost sight of the larger picture. My own question is, have we become entrenched in the world of systems that are "good enough"?
> >
> > Things he does NOT mention are system interfaces to userspace software; he doesn't seem to have any quibbles with, say, the Linux system call interface, the process model, etc. He's mostly talking about taking into account the hardware. Also, in fairness, his highlighting a "small" portion of the system and saying, "that's what the OS drives!" sort of reminds me of the US voter maps that show vast tracts of largely unpopulated land colored a certain shade as having voted for a particular candidate, without normalizing for population (land doesn't vote, people do, though in the US there is a relationship between how these things impact the overall election for, say, the presidency).
> >
> > I'm curious about other people's thoughts on the talk and the overall topic?
> >
> > https://www.youtube.com/watch?v=36myc8wQhLo
> >
> >         - Dan C.
>
>
> One thing I've realized as the unit of computing becomes more and more
> abundant (one off
> HW->mainframes->minis->micros->servers->VMs->containers) the OS
> increasingly becomes less visible and other software components become
> more important.  It's an implementation detail like a language runtime
> and software developers are increasingly ill equipped to work at this
> layer.  Public cloud/*aaS is a major blow to interesting general
> purpose OS work in commercial computing since businesses increasingly
> outsource more and more of their workloads. The embedded (which
> includes phones/Fuchsia, accelerator firmware/payload, RTOS etc) and
> academic (e.g. Cambridge CHERI) world may have to sustain OS research
> for the foreseeable future.
>
> There is plenty of systems work going on, but it takes place in
> different ways: userspace systems are completely viable and do not
> require switching to microkernels.  Intel's DPDK/SPDK as one
> ecosystem, Kubernetes as another - there is a ton of rich systems work
> in this ecosystem with eBPF/XDP etc, and I used to dismiss it but it
> is no longer possible to do so rationally.  I would go as far as
> saying Kubernetes is _the_ datacenter OS and has subsumed Linux itself
> as the primary system abstraction for the next while.  Even Microsoft
> has a native implementation on Server 2022.  It looks different and
> smells different, but being able to program compute/storage/network
> fabric with one abstraction is the holy grail of cluster computing and
> interestingly it lets you swap the lower layer implementations out
> with less risk but also less fanfare.
>
> Regards,
> Kevin

