[TUHS] Is it time to resurrect the original dsw (delete with switches)?

Theodore Ts'o tytso at mit.edu
Tue Aug 31 00:42:53 AEST 2021


On Mon, Aug 30, 2021 at 09:06:03AM -0400, Norman Wilson wrote:
> Not to get into what is something of a religious war,
> but this was the paper that convinced me that silent
> data corruption in storage is worth thinking about:
> 
> http://www.cs.toronto.edu/~bianca/papers/fast08.pdf
> 
> A key point is that the character of the errors they
> found suggests it's not just the disks one ought to worry
> about, but all the hardware and software (much of the latter
> inside disks and storage controllers and the like) in the
> storage stack.

There's nothing I'd disagree with in this paper.  I'll note though
that part of the paper's findings is that silent data corruption
occurred roughly 1.5 orders of magnitude less often than latent
sector errors (e.g., UERs).

The probability of at least one silent data corruption during the
study period (41 months, although not all disks would have been in
service during that entire time) was P = 0.0086 for nearline disks and
P = 0.00065 for enterprise disks.  And P(1st error) remained constant
over disk age, and per the authors' analysis, it was unclear whether
P(1st error) changed as disk size increased --- which is
representative of non media-related failures (not surprising given how
much checksumming and ECC checking is done by the HDDs).
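
To put those figures on a more familiar scale, here is a rough sketch
in Python that converts the 41-month probabilities into per-year
probabilities.  It assumes a constant hazard rate (in line with
P(1st error) staying constant over disk age) and that every disk was
in service for the full 41 months, which was not the case, so the
results likely understate the true annual figures.  The function and
constant names are made up for illustration and do not come from the
paper.

```python
STUDY_MONTHS = 41        # study length quoted above

P_NEARLINE = 0.0086      # P(at least one silent corruption), nearline
P_ENTERPRISE = 0.00065   # same, enterprise


def annualized(p_study, months=STUDY_MONTHS):
    """Per-year probability of a first silent corruption.

    Assumes a constant hazard rate and a disk in service for the whole
    study period; both are simplifications, not findings of the paper.
    """
    # Survival over the study is (1 - p_study); scale the exponent to 12 months.
    return 1.0 - (1.0 - p_study) ** (12.0 / months)


if __name__ == "__main__":
    for name, p in (("nearline", P_NEARLINE), ("enterprise", P_ENTERPRISE)):
        print(f"{name:10s} study P = {p:.5f}   per-year P = {annualized(p):.6f}")

    # How much more often nearline disks saw a first corruption.
    print(f"nearline / enterprise ratio = {P_NEARLINE / P_ENTERPRISE:.1f}")

    # "1.5 orders of magnitude" expressed as a plain factor.
    print(f"10 ** 1.5 = {10 ** 1.5:.1f}")
```

Under those assumptions this works out to roughly 0.25% per year for
nearline disks and 0.019% per year for enterprise disks, a ratio of
about 13 between the two classes; 1.5 orders of magnitude is a factor
of about 32.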

A likely supposition IMHO is that the use of more costly enterprise
disks correlated with higher quality hardware in the rest of the
storage stack --- so things like ECC memory really do matter.

> As Ted has said, there are philosophical reasons why some prefer to
> avoid it, but if you don't subscribe to those it's a fine answer.

WRT running ZFS on Linux, I wouldn't call it philosophical reasons,
but rather legal risks.  Life is not perfect, so you can't drive any
kind of risk (including risks of hardware failure) down to zero.

Whether you should be comfortable with the legal risks in this case
very much depends on who you are and what your risk profile might be,
and you should contact a lawyer if you want legal advice.  Clearly the
lawyers at companies like Red Hat and SuSE have given very different
answers from the lawyers at Canonical.  In addition, the answer for hobbyists
and academics might be quite different from a large company making
lots of money and more likely to attract the attention of the
leadership and lawyers at Oracle.

Cheers,

					- Ted

