[COFF] Capabilities (was Re: Other OSes?)

Tue Jul 17 02:59:08 AEST 2018

On Mon, 16 Jul 2018 10:49:52 -0400 Dan Cross <crossd at gmail.com> wrote:
>
> On Tue, Jul 10, 2018 at 1:30 AM Bakul Shah <bakul at bitblocks.com> wrote:
>
> > I tend to think they are orthogonal.  Namespaces map names to
> > objects -- they control what objects you can see.
> >
>
> In many ways, capabilities do the same thing: by not being able to name a
> resource, I cannot access it.
>
> _A_ way to think of plan9 style namespaces is as objects, with the names
> of files exposed by a particular filesystem being operations on those
> objects (I'm paraphrasing Russ Cox here, I think). In the plan9 world, most
> useful resources are implemented as filesystems. In that sense, not only
> does a particular namespace present a program with the set of resources it
> can access, but it also defines what I can do with those resources.
>
> > Capabilities control what you can do with such an object. Cap
> > operations such as revoke, grant, attenuating rights
> > (subsetting the allowed ops) don't translate to namespaces.
> >
>
> I disagree. By way of example, I again reiterate the question I've asked
> several times now: there's a capability in e.g. Capsicum to allow a program
> to invoke the connect(2) system call: how does the capability system allow
> me to control the 5 tuple one might connect(2) to? I gave an example where,
> in the namespace world, I can interpose a proxy namespace that emulates the
> networking filesystem and restrict what I can do with the network.

You'd do something like "networkStackCap.connect(5-tuple)" where
"networkStackCap" is a capability to a network stack service (or
object) you were given. It could be a proxy server or a NAT server
that rewrites things, or a remote handle on a service on another
node. A client starts with some given caps. It can gain new caps
only through operations on its existing caps. Whether you represent
the 5-tuple as a path string or a tuple of 5 numbers or something
else is up to how the network stack expects it.
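
A minimal sketch of the above, in Python. All names here
(FiveTuple, NetworkStackCap, RestrictedNetCap, client) are made
up for illustration and are not any real capability system's API.
Python does not enforce encapsulation, so this only shows the
shape of the idea: the client is handed one cap and cannot tell
whether it is the real stack or a restricting proxy.

```python
from typing import Callable, NamedTuple


class FiveTuple(NamedTuple):
    proto: str
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


class NetworkStackCap:
    """Hypothetical capability to a network stack service."""

    def connect(self, t: FiveTuple) -> str:
        # A real stack would hand back a new connection cap;
        # a string stands in for it here.
        return f"conn:{t.proto}:{t.dst_addr}:{t.dst_port}"


class RestrictedNetCap:
    """Proxy cap: same interface, fewer reachable 5-tuples."""

    def __init__(self, inner, allowed: Callable[[FiveTuple], bool]):
        self._inner = inner
        self._allowed = allowed

    def connect(self, t: FiveTuple) -> str:
        if not self._allowed(t):
            raise PermissionError(
                f"connect to {t.dst_addr}:{t.dst_port} denied")
        return self._inner.connect(t)


def client(net_cap) -> str:
    # The client only holds net_cap. It gains the connection cap
    # solely by invoking an operation on a cap it already has.
    return net_cap.connect(
        FiveTuple("tcp", "10.0.0.2", 40000, "10.0.0.1", 443))


full = NetworkStackCap()
https_only = RestrictedNetCap(
    full, lambda t: t.proto == "tcp" and t.dst_port == 443)
```

Whoever starts the client decides whether it gets "full" or
"https_only"; the client code is the same either way.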

> > A cap is much like a file descriptor but it doesn't have to be
> > limited to just open files or directories.
> >
> > As an example, a client may invoke "read(fd, buffer, count)"
> > or "write(fd, buffer, count)". This call may be serviced by a
> > remote file server. Here "buffer" would be a cap on a memory
> > range within the client process -- we don't want the server to
> > have any more access to client's memory. When the fileserver
> > is ready to read or write, it has to securely arrange data
> > transfer to/from this memory range[1].
> >
>
> So in other words, "buffer" is a name of an object and by being able to
> name that object, I can do something useful with it, subject to the
> limitations imposed by the object itself (e.g., `read(fd, buffer, count);`
> will fault if `buffer` points to read-only memory).

Right, but then in order to access the contents of this named
buffer object you need to pass it another named buffer
object....  Where is the bottom turtle? In the Unix/plan9 world
the buffer is strictly local and you have to play games (such as
copyin/copyout or "meltdown" friendly mapping).
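
To make the "buffer is a cap on a memory range" example from the
quoted text concrete, here is a rough single-process sketch in
Python. A memoryview slice stands in for the cap; FileServer and
its data are hypothetical. It shows only the bounded range, not
the secure transfer between nodes that a real system would need.

```python
class FileServer:
    """Stands in for a remote file server (hypothetical)."""

    def __init__(self, data: bytes):
        self._data = data

    def read(self, buffer: memoryview, count: int) -> int:
        # The server can touch only the range it was handed,
        # not the rest of the client's memory.
        n = min(count, len(buffer), len(self._data))
        buffer[:n] = self._data[:n]
        return n


client_memory = bytearray(b"." * 16)

# Cap on bytes 4..11 of the client's memory, nothing more.
buffer_cap = memoryview(client_memory)[4:12]

server = FileServer(b"hello")
n = server.read(buffer_cap, 8)

# Attenuated cap: same range, but read-only (Python 3.8+).
# A read() into it fails with TypeError.
ro_cap = buffer_cap.toreadonly()
```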

One way to compare them is this: imagine in a game of
adventure you are exploring a place with many passages and
rooms. In the cap world some of these rooms have a lock and
you need the right key to access them (and in turn they may
lead to other locked or open rooms). You can only access those
rooms for which you were given keys when you started or the
keys you found during your exploration.

In the plan9 world you can't even see the door of a room if
you are not allowed to access it. You need to read some magic
scroll (mount) and new doors would magically appear! And there
are some global objects that anyone can access (e.g. #e,
#c, #k, #M, #s etc.). In contrast cap clients are inherently
sandboxed. They can't tell if it is Live or it is Memorex. 

plan9 / unix filesystems control access to a collection of
objects. You still need rwxrwxrwx mode bits, which are not
fine-grained enough -- the same for each group of users.

Caps can control access to individual objects and you can
subset this access (e.g. the equivalent of a valet key that
doesn't allow access to the glove box or trunk, or a trunk key
that doesn't allow driving a car). You can even revoke access
(i.e. change the locks). None of these map to namespaces
without adding much more complication.
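
The valet key can be sketched roughly like this in Python. Car
and Key are invented names, and since Python lets anyone reach
into "_car", this illustrates the operations (attenuate, revoke)
and not a real enforcement mechanism.

```python
class Car:
    def drive(self):
        return "driving"

    def open_trunk(self):
        return "trunk open"

    def open_glove_box(self):
        return "glove box open"


class Key:
    """Hypothetical cap: a revocable, attenuable handle on a car."""

    def __init__(self, car, ops):
        self._car = car
        self._ops = frozenset(ops)
        self._revoked = False
        self._children = []

    def invoke(self, op):
        if self._revoked:
            raise PermissionError("key revoked")
        if op not in self._ops:
            raise PermissionError(f"{op} not permitted by this key")
        return getattr(self._car, op)()

    def attenuate(self, ops):
        # A derived key carries at most a subset of our rights.
        child = Key(self._car, self._ops & frozenset(ops))
        child._revoked = self._revoked
        self._children.append(child)
        return child

    def revoke(self):
        # Changing the locks: this key and all keys derived
        # from it stop working.
        self._revoked = True
        for child in self._children:
            child.revoke()


master = Key(Car(), {"drive", "open_trunk", "open_glove_box"})
valet = master.attenuate({"drive"})
```

Here valet.invoke("drive") works, valet.invoke("open_trunk")
raises PermissionError, and after valet.revoke() the valet key
is dead while the master key keeps working.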

Even plan9's namespaces are rather expensive in practice.  Maybe
that is because they evolved from unix, but for arbitrary objects
things like owner, group, access and modification times etc.
don't really matter.

> > [1] In the old days we used copyin/copyout for this sort of
> > data transfer between a user process address space and the
> > kernel. This is analogous.
> >
>
> It's an aside, but it's astonishing to see how that's been bastardized by
> the use of ioctl() for such things in the Linux world.

The unix filesystem abstraction is very good but it doesn't
cover some uses, hence it has become a leaky abstraction. In
the plan9 world you have ctl files but the commands they accept
are still arbitrary (specific to the object).

The great invention of unix was a set of a few abstractions
that served extremely well for a majority of tasks. These can
be used with caps.