[TUHS] SunOS code?

Kevin Bowling kevin.bowling at kev009.com
Sun Sep 2 02:27:46 AEST 2018

On Fri, Aug 31, 2018 at 4:02 PM, Arthur Krewat <krewat at kilonet.net> wrote:
> On 8/31/2018 5:58 PM, Larry McVoy wrote:
>> Solaris was Sys Vr4 (which, if I recall correctly, differed from r3
>> largely due to some stuff being ported over from SunOS).  Both the kernel
>> and user space went to a Sys V compat system, it no longer felt anything
>> like BSD.
> I would be very interested in anyone's recollections of how Solaris
> eventually turned out performance-wise, say version 9+, compared to other
> operating systems. SunOS, Linux, AIX, etc.

Linux started pulling away fast even on high-end systems by the early
2000s.  IBM and SGI dumped a ton of money, knowledge, and talent into
this.  By Linux kernel 2.6 the race was entirely won.

After this, HP-UX, AIX, and Solaris persist mainly in mainframe-like
vertical stacks used to host mission-critical applications that are
sold in bundles or "solutions".

> I find it's about equal, and even exceeds Linux in terms of its NUMA
> support and multi-processor support. I need to move some systems away from
> Solaris and off to Linux, and I find its NUMA support lacking in certain
> ways.

This is pure fantasy.  To understand Linux performance on high core
count and multi-socket machines is to have at least passing knowledge
of Paul McKenney's genius work on RCU [1] and NUMA [2] at Sequent [3]
and on Linux.  IBM bought Sequent, made a favorable patent grant of
RCU for Linux, and the rest is history.

There is a single feature I have seen in Solaris NUMA that should be
implemented elsewhere.  It does a micro-benchmark on boot to figure
out the inter-core latency map.  On stupid technology like Intel's
ACPI and Xeon Cluster-on-Die and Sub-NUMA-Clustering, you get bogus
data back in the SRAT describing where the cores are on the on-chip
network.  It just chops things in half and doesn't reflect where the
cores were actually fused off for yield or binning reasons, which is
statistically almost always asymmetric.  On better-engineered
technology like IBM's POWER8/9 and OPAL firmware, you get the real
locality information of where cores and cache groups actually are.
Solaris' neat little micro-benchmark would work optimally even on the
brain damaged data from Intel.
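
To make the idea concrete, here is a minimal sketch of how a measured
core-to-core latency map could be turned into locality groups.  This is
not Solaris' actual algorithm, which I have not seen; the function name
locality_groups, the 60 ns threshold, and the latency numbers are all
made up for illustration.  It assumes the boot-time benchmark has
already produced a matrix of pairwise latencies, and it simply groups
cores that are "near" each other using union-find.

```python
from typing import Dict, List


def locality_groups(latency_ns: List[List[float]],
                    threshold_ns: float) -> List[List[int]]:
    """Group cores whose measured pairwise latency is <= threshold_ns.

    Groups are the connected components of the "near" graph, found
    with a small union-find. Hypothetical helper, for illustration only.
    """
    n = len(latency_ns)
    parent = list(range(n))

    def find(x: int) -> int:
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a in range(n):
        for b in range(a + 1, n):
            # Measurements may differ by direction; use the worse one.
            if max(latency_ns[a][b], latency_ns[b][a]) <= threshold_ns:
                ra, rb = find(a), find(b)
                if ra != rb:
                    parent[max(ra, rb)] = min(ra, rb)

    groups: Dict[int, List[int]] = {}
    for core in range(n):
        groups.setdefault(find(core), []).append(core)
    return sorted(groups.values())


# Made-up measurements for a 6-core part where fusing left an
# asymmetric 4 + 2 split: 40 ns inside a cluster, 90 ns across.
CLUSTER_OF = [0, 0, 0, 0, 1, 1]
MEASURED = [
    [0.0 if i == j else (40.0 if CLUSTER_OF[i] == CLUSTER_OF[j] else 90.0)
     for j in range(6)]
    for i in range(6)
]

print(locality_groups(MEASURED, threshold_ns=60.0))
```

With these invented numbers the measured map yields a 4 + 2 grouping,
whereas a table that "chops things in half" would claim 3 + 3 for the
same part.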

[1] http://www2.rdrop.com/~paulmck/RCU/
[2] http://www2.rdrop.com/~paulmck/scalability/
[3] http://www2.rdrop.com/~paulmck/techreports/stingcacm3.1999.08.04a.pdf

