[TUHS] Research Datakit notes

Lawrence Stewart stewart at serissa.com
Wed Jun 29 07:32:03 AEST 2022


I’ll argue there is quite a lot known about where to put network functionality, much
of it from HPC.  If you want minimum latency and minimum variance of latency, both
of which are important to big applications, you make the network reliable and move
functionality into the host adapters and the switches.  The code path at each end
of a very good MPI implementation will be under 200 machine instructions, all in
user mode.  There is no time for retries or variable code paths.

This doesn’t work on WANs, of course, or at consumer price points.

(I think there is still a lot to do, because the best networks still hover around
800 nanoseconds from calling SEND to returning from RECV, and I think it could be 100.)

-L

> On 2022, Jun 28, at 11:50 AM, Noel Chiappa <jnc at mercury.lcs.mit.edu> wrote:
> 
>> From: Rob Pike
> 
>> having the switch do some of the call validation and even maybe
>> authentication (I'm not sure...) sounds like it takes load off the host.
> 
> I don't have enough information to express a judgement in this particular
> case, but I can say a few things about how one would go about analyzing
> questions of 'where should I put function [X]; in the host, or in the
> 'network' (which almost inevitably means 'in the switches')'.
> 


