[TUHS] kernel boots kernel in 1977

Fri Sep 20 03:19:46 AEST 2024

On Wed, Sep 18, 2024 at 9:54 PM Bakul Shah <bakul at iitbombay.org> wrote:
> > On Sep 18, 2024, at 5:53 PM, Dan Cross <crossd at gmail.com> wrote:
> > On Wed, Sep 18, 2024 at 8:05 PM Bakul Shah via TUHS <tuhs at tuhs.org> wrote:
> >> Can you not avoid resetting the machine? This can be treated almost as sleep in the old kernel, wakeup in the new one! You do have to reset devices individually (which may not always work if it requires assistance from some undocumented firmware).
> >
> > Perhaps this is what you mean when you mention assistance from
> > firmware, Bakul, but it may be useful to consider that _many_ devices
> > are touched by e.g. a BIOS or UEFI or whatever well before the OS is
> > even loaded.
>
> Right but presumably the old kernel leaves them in a good enough state.

I suspect this is one of the thornier parts of the whole problem. In
some sense, kexec is similar to live migration of a VM: it's certainly
possible to do, but devices have to be quiesced and in a state where
they are ready to migrate; outstanding IOs may cause problems with
synchronization between the source and destination. Similarly, if the
outgoing kernel in a kexec cannot adequately ensure that device state
is going to be (at a minimum) discoverable in the incoming kernel,
you're going to have a bad time. And if there's an outstanding DMA
request, the device may write into memory the incoming kernel has
already reused for something else, and then you're going to have a
very bad day.
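
To make the quiescing requirement concrete, here is a toy model of the
check an outgoing kernel would want to pass before jumping to the new
one. This is only a sketch: the Device class, its fields, and
unquiesced() are all hypothetical names invented for illustration, and
"draining" DMA is modeled as simply zeroing a counter rather than
waiting on real hardware.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Toy stand-in for a device owned by the outgoing kernel (hypothetical)."""
    name: str
    irq_enabled: bool = True
    dma_enabled: bool = True
    inflight_dma: int = 0  # outstanding DMA requests

    def quiesce(self) -> None:
        # Stop the device from starting new transfers first...
        self.dma_enabled = False
        # ...then drain what is already in flight (modeled as instantaneous)...
        self.inflight_dma = 0
        # ...and only then mask its interrupts.
        self.irq_enabled = False

    def is_quiet(self) -> bool:
        return (not self.dma_enabled
                and not self.irq_enabled
                and self.inflight_dma == 0)


def unquiesced(devices):
    """Return the names of devices that would make a handoff unsafe."""
    return [d.name for d in devices if not d.is_quiet()]
```

The handoff would proceed only when unquiesced() returns an empty
list; anything left in it is a device that could still scribble on
memory or raise an interrupt under the incoming kernel.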

> > If one steps back and considers the utility of a BIOS/UEFI (and I
> > often lump these into the same category), there are three principal
> > reasons for it: 1) back in the bad old days, we could offload common
> > IO functions into code stored on a ROM, freeing up precious RAM for
> > programs. 2) firmware provides a layer of indirection between the
> > system and the host software, allowing both to vary while continuing
> > to work with newer versions of the other. And finally 3) firmware
> > facilitates bootstrapping the system by providing the host some way to
> > access devices and locate and load an OS image, er, before the OS
> > image is loaded. SOMETHING has to get enough code loaded from
> > somewhere to start the system; often times that's firmware.
>
> The new OS image is already in memory but may need to be copied to
> the right place. The devices were already working (but may need to
> have their interrupts disabled and any DMA stopped etc.).

Yes, sorry, I was trying to explain why firmware is in the loop for
those who may not be familiar.
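
For PCI devices specifically, one generic way to do the "interrupts
disabled and any DMA stopped" part is through the command register in
configuration space: clearing Bus Master Enable (bit 2) stops the
device from initiating transactions, and setting Interrupt Disable
(bit 10) masks legacy INTx. A minimal sketch of the bit manipulation
follows; the function name is made up, and a real kernel would also
have to deal with device-specific state that this does not touch.

```python
# Bit positions in the 16-bit PCI command register (config offset 0x04).
PCI_COMMAND_MASTER = 0x0004        # Bus Master Enable
PCI_COMMAND_INTX_DISABLE = 0x0400  # Interrupt Disable (legacy INTx only)


def quiesce_command(cmd: int) -> int:
    """Return the command register value with bus mastering off and INTx masked.

    Hypothetical helper: it only computes the new value; actually writing
    it back to the device's config space is left to the caller.
    """
    cmd &= ~PCI_COMMAND_MASTER
    cmd |= PCI_COMMAND_INTX_DISABLE
    return cmd & 0xFFFF
```

Note that the I/O and memory decode enables (bits 0 and 1) are left
alone here, so the incoming kernel can still reach the device's
registers.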

> > Anyway, the last two suggest that device state can be arbitrarily
> > munged before the OS takes over, and an actual reset at the device
> > level might wipe out some state the OS depends on. Consider, for
> > example, programming PCI BARs; on a "modern" x86-64 system with UEFI,
> > this is done by firmware in the PEI layer, and the OS may expect that
> > to already be set up by the time it is probing buses. An actual
> > honest-to-goodness reset will probably wipe the BARs, requiring the
> > host OS to program them (ironically, many OSes are already equipped to
> > do so, as they have to handle these cases for e.g. PCI hotplug events,
> > though many don't do it in the "ordinary" discovery and initialization
> > phase of boot).
>
> All that is done on powerup.

That's true, but it's non-trivial, and done by opaque firmware that
one has no control over; in particular, it's hard to get the firmware
to cooperate in the kexec protocol.
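
As an illustration of what "programming the BARs" involves once the
OS has to do it itself: the usual sizing probe is to write all ones to
a BAR, read it back, and decode the size from the bits the device let
you set. Below is a sketch of just the decode step for a 32-bit memory
BAR; the function name is hypothetical, and 64-bit and I/O BARs are
deliberately left out.

```python
def mem_bar32_size(readback: int) -> int:
    """Size in bytes of a 32-bit memory BAR.

    `readback` is the value read from the BAR after writing 0xFFFFFFFF
    to it. Hypothetical helper covering only 32-bit memory BARs.
    """
    if readback & 0x1:
        raise ValueError("I/O space BAR, not handled in this sketch")
    # The low four bits of a memory BAR are type/prefetch flags, not address.
    mask = readback & 0xFFFFFFF0
    if mask == 0:
        return 0  # BAR not implemented
    # The device hardwires the low address bits to zero; the two's
    # complement of the mask is the size of the region it decodes.
    return (~mask + 1) & 0xFFFFFFFF
```

For example, a readback of 0xFFFF0000 decodes to 0x10000 (64 KiB).
After sizing, the OS still has to pick a non-conflicting address and
write it back, which is the part firmware has normally already done.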

> > I suppose the point is that a reset is great because it really does
> > wipe out state, but it may also be a bummer because, well, it really
> > does wipe out state. :-)
>
> :-) I was speculating that kernel to kernel warmboot should be doable.

Oh, sorry; I misunderstood and thought you were asking, "why can't
you reset the machine?" My bad. I absolutely agree that it is doable,
and that we have several existence proofs showing just that.

        - Dan C.