[TUHS] Tech Sq elevator [ really type-checking ]
Jon Steinhart
jon at fourwinds.com
Mon Jan 13 10:44:47 AEST 2020
Bakul Shah writes:
> On Jan 12, 2020, at 3:40 PM, Jon Steinhart <jon at fourwinds.com> wrote:
> >
> > Kevin Bowling writes:
> >>
> >> I am regularly surprised by how surprising type systems are to
> >> computing professionals.
> >
> > I am currently astonished at this. Unfortunately, I need to make a hopefully
> > minor change to the linux kernel to support something that I want to do in my
> > current project. But, this is my first time looking at the internals which is
> > way different that my recollection of UNIX kernels. It's being enough of an
> > adventure that I'm writing up a travelogue of my journey through the code.
> > While I swore that I was done writing books this is sure looking like another
> > one :-)
> >
> > So I came across this piece of what I consider to be bad programming that's
> > all over the place...
> >
> > One of my programming style rules is to program in the language in which you're
> > programming. The canonical example of not doing this is the Bourne shell which
> > was originally written using macros to redefine C to look like Algol68.
> >
> > Linux contains several sets of list_for_each_entry() macros that are essentially
> > obfuscated for loops that generate inefficient code. To make things worse, the
> > way that they're implemented is by embedding list_head structures into other
> > structures.
> >
> > In the diagram below, the labels above boxes are structure names. Names inside
> > of boxes are structure member names. super_blocks, s_list, s_mounts, and
> > mnt_instance are all list_head structures. (Trick question, how many lines are
> > in this diagram :-) )
> >
> > +-----------------------------------------+
> > | super_block |
> > | +--------------+ +----------+ |
> > +->| super_blocks |<->| s_list |<- ... -+
> > +--------------+ +----------+
> > | | mount mount
> > | ... | +--------------+ +--------------+
> > | | | ... | | ... |
> > +----------+ +--------------+ +--------------+
> > +->| s_mounts |<->| mnt_instance |<->| mnt_instance |<- ... -+
> > | +----------+ +--------------+ +--------------+ |
> > | | ... | | ... | |
> > | +--------------+ +--------------+ |
> > +------------------------------------------------------------+
> >
> > The bizarre thing to me is that the list_head structures point to other list_head
> > structures that are embedded in other structures. When one needs to access a
> > non list-head member of the structure one has to pass both the structure type and
> > the list_head member name to a macro that figures out how to subtract the offset
> > of the list_head member of the structure from the address of that list_head to
> > get the address of the structure, and then casts that as the structure type so
> > that members can be accessed.
>
> There is similar code in FreeBSD kernel. Embedding head and next ptrs reduces
> memory allocation and improves cache locality somewhat. Since C doesn't have
> generics, they try to gain the same functionality with macros. See
>
> https://github.com/freebsd/freebsd/blob/master/sys/sys/queue.h
>
> Not that this is the same as what Linux does (which I haven't dug into) but
> I suspect they may have had similar motivation.
Not sure that I understand.
A list_head structure takes exactly the same amount of space as a pair of pointers,
so the memory allocation should be identical.
I also don't see the cache locality. Both the embedded list_head and the start
of the structure are accessed in either case; once the programmer has played
guess-the-type the list macro returns a pointer to the start of the structure.
So again, I'm just commenting on what I'm seeing in linux, and while it is
possible that I may be misunderstanding something so far I don't think so.
Jon
More information about the TUHS
mailing list