[TUHS] ECC memory

John Gilmore gnu at toad.com
Tue Jul 7 09:23:37 AEST 2020


Chris Torek <torek at torek.net> wrote:
> I keep thinking I'll replace it with a new box that does have ECC,
> but haven't gotten around to it yet.  I see some consumer-priced
> AMD CPUs have at least theoretical ECC support but I haven't found
> anything that says the ECC actually works, and have seen a few
> articles that hint that it doesn't.

All the AMD Ryzen CPUs and chipsets have built-in ECC.  It's easy since
the CPU pins talk directly to main memory.  This is one among thousands
of reasons to avoid buying Intel CPUs.

The machine I'm typing on has ECC memory, and I bought it in 2019.  You
have to pick a motherboard that wasn't designed by dolts.  (I got the
Gigabyte Aorus "AX-370-GAMING5", which of course is no longer
manufactured.)  And you have to spend an extra $5 or $10 on your DIMMs.
Get the motherboard maker's "Qualified Vendor List", e.g.:

    http://download.gigabyte.us/FileList/Memory/mb_memory_ga-ax370-gaming5_pinnacle_v4.pdf

Be sure there *are* some approved ECC DIMMs on the list.  Buy those.
You may or may not have to fiddle with something in the BIOS settings.

Check dmesg when you first boot the machine (booting the installer is
fine).  Make sure the Linux kernel sees the DIMMs, isn't subverted by
the BIOS, and understands the CPU/memory control registers.  When it
works, you'll see something like:

[    0.180161] EDAC MC: Ver: 3.0.0
[    9.389338] EDAC amd64: Node 0: DRAM ECC enabled.
[    9.389339] EDAC amd64: F17h detected (node 0).
[    9.389375] EDAC MC: UMC0 chip selects:
[    9.389376] EDAC amd64: MC: 0:     0MB 1:     0MB
[    9.389376] EDAC amd64: MC: 2:  4096MB 3:  4096MB
[    9.389378] EDAC MC: UMC1 chip selects:
[    9.389379] EDAC amd64: MC: 0:     0MB 1:     0MB
[    9.389379] EDAC amd64: MC: 2:  4096MB 3:  4096MB
[    9.389380] EDAC amd64: using x8 syndromes.
[    9.389380] EDAC amd64: MCT channel count: 2
[    9.389422] EDAC MC0: Giving out device to module amd64_edac controller F17h: DEV 0000:00:18.3 (INTERRUPT)
[    9.389428] EDAC PCI0: Giving out device to module amd64_edac controller EDAC PCI controller: DEV 0000:00:18.0 (POLLED)
[    9.389429] AMD64 EDAC driver v3.5.0

Success!  Non-flaky main memory in a PC clone!  Cost: about $50 plus
paying attention.
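
If you'd rather not eyeball the log, the dmesg check can be scripted.
What follows is only a rough sketch in Python, assuming the amd64_edac
messages look like the ones quoted above; the function name
summarize_edac and both regular expressions are made up for
illustration, and other kernel versions or EDAC drivers may word
these lines differently.

```python
import re

# Assumed format, taken from the log above:
#   "EDAC amd64: Node 0: DRAM ECC enabled."
ECC_STATE = re.compile(r"EDAC amd64: Node (\d+): DRAM ECC (enabled|disabled)")
# One "N:  4096MB" pair from a chip-select line.
CHIP_SELECT = re.compile(r"(\d+):\s+(\d+)MB")
# Prefix that marks a chip-select size line in the log above.
CS_PREFIX = "EDAC amd64: MC:"


def summarize_edac(dmesg_text):
    """Return ({node: ecc_enabled}, total MB listed on chip-select lines)."""
    ecc_by_node = {}
    total_mb = 0
    for line in dmesg_text.splitlines():
        state = ECC_STATE.search(line)
        if state:
            ecc_by_node[int(state.group(1))] = state.group(2) == "enabled"
            continue
        if CS_PREFIX in line:
            # Only parse what follows the prefix, so the timestamp
            # digits are never mistaken for a chip-select entry.
            payload = line.split(CS_PREFIX, 1)[1]
            total_mb += sum(int(mb) for _, mb in CHIP_SELECT.findall(payload))
    return ecc_by_node, total_mb
```

Feed it the text that dmesg prints.  An empty dictionary means the
kernel never reported an ECC state in the format assumed here, which
is worth investigating before you trust the memory.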

	John
	

