[TUHS] scaling on TCP socket connections

Larry McVoy lm at mcvoy.com
Fri Mar 10 11:32:16 AEST 2023


SGI made TCP go very fast on 200MHz MIPS processors.  The tricks were to
mark the page copy-on-write on output, so the driver could use the page
without copying, and to page flip on input.  It only worked for requests
that were a multiple of the page size.
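
As a rough illustration of that restriction (a sketch, not SGI's code),
a request could be checked like this before taking a page-flipping path.
The function name and the alignment check are assumptions; the only
thing stated above is the multiple-of-page-size rule.

```python
import mmap

# Page size of the machine running this sketch (not the MIPS systems above).
PAGE_SIZE = mmap.PAGESIZE


def zero_copy_eligible(addr, length, page_size=PAGE_SIZE):
    """Return True if a buffer could be handed over as whole pages.

    Hypothetical helper: the length must be a non-zero multiple of the
    page size, and (an assumption) the buffer must start on a page boundary.
    """
    return length > 0 and addr % page_size == 0 and length % page_size == 0
```

Anything that fails the check would have to fall back to an ordinary
copying path.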

SGI had very fast sequential I/O through XFS (the trick there was big
I/O requests on files opened with O_DIRECT; the volume manager split
the request into chunks, sending each chunk to a disk controller.
When you get the stupid 200MHz processors out of the way and
spin up a boatload of DMA engines, yeah, you can scale up until
the bus is full).
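
The chunking can be sketched as plain round-robin striping.  This is a
generic illustration, not the actual XLV layout; split_request, the
stripe unit, and the disk count are all made-up names and parameters.

```python
def split_request(offset, length, stripe_unit, n_disks):
    """Split one large byte range into per-disk chunks.

    Assumes simple round-robin striping (an assumption, not XLV's
    documented layout).  Returns (disk_index, disk_offset, chunk_length)
    tuples in the order the bytes appear in the original request.
    """
    chunks = []
    end = offset + length
    while offset < end:
        stripe_no = offset // stripe_unit      # which stripe unit we are in
        within = offset % stripe_unit          # offset inside that unit
        n = min(stripe_unit - within, end - offset)
        disk = stripe_no % n_disks             # round-robin across disks
        disk_offset = (stripe_no // n_disks) * stripe_unit + within
        chunks.append((disk, disk_offset, n))
        offset += n
    return chunks


# A 256 KiB request over 4 disks with a 64 KiB stripe unit gives one
# chunk per disk, so all four can transfer at the same time.
print(split_request(0, 4 * 65536, 65536, 4))
```

The bigger the request, the more disks get a chunk at once, which is
why large requests matter here.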

And they had fast TCP over Hippi.

I showed up and asked, "Why hasn't anyone plugged XFS into TCP?"  In a
few weeks I wrote a user level server that demonstrated that it could
work, and that turned into this talk:

http://mcvoy.com/lm/papers/bds.pdf

In case you don't go look: it made O_DIRECT work on NFS and gave much
faster NFS sequential I/O performance.  Real NFS was 18MB/sec, my stuff
was 67MB/sec for a single file, and it scaled to 100s of MB/sec.

Everyone thought I was smart for doing that, but the reality was that
the XFS/XLV folks and the TCP folks had done all the hard work; I just
did some plumbing to connect the two fat pipes.

I did eventually push it into the kernel because of stuff like the
page cache coherency but even that was easy.

And I found the SGI writeup of it; it has more perf numbers:

http://mcvoy.com/lm/papers/bdspro.pdf

If anyone cares I can post a link to the user level server code.

On Thu, Mar 09, 2023 at 04:59:54PM -0800, Tom Lyon wrote:
> Sun chose UDP for NFS at a point when few if any people believed TCP could
> go fast.
> I remember (early  80s) being told that one couldn't use TCP/IP in LANs
> because they were WAN protocols.  In the late 80s, WAN people were saying
> you couldn't use TCP/IP because they were LAN protocols.
> 
> But UDP for NFS was more attractive because it was not byte stream
> oriented, and didn't require copying to save for retransmissions.  And
> there was hope we'd be able to do zero copy transmissions from the servers
> - also the reason for inventing Jumbo packets to match the 8K page size of
> Sun3 systems.
> 
> I did get zero copy serving working with ND (network disk block protocol) -
> but it was terribly specific to particular hardware components.
> 
> On Thu, Mar 9, 2023 at 4:24 PM ron minnich <rminnich at gmail.com> wrote:
> 
> > Ca. 1981, if memory serves, having even small numbers of TCP connections
> > was not common.
> >
> > I was told at some point that Sun used UDP for NFS for that reason. It was
> > a reasonably big deal when we started to move to TCP for NFS ca 1990 (my
> > memory of the date -- I know I did it on my own for SunOS as an experiment
> > when I worked at the SRC -- it seemed to come into general use about that
> > time).
> >
> > What kind of numbers for TCP connections would be reasonable in 1980, 90,
> > 2000, 2010?
> >
> > I sort of think I know, but I sort of think I'm probably wrong.
> >

-- 
---
Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat

