Below are some thoughts, and hopefully answers, to your questions.

On Wed, Feb 22, 2023 at 3:16 PM segaloco via TUHS <tuhs@tuhs.org> wrote:
Good day all, figured I'd start a thread on this matter as I'm starting to piece enough together to articulate the questions arising in my research.

So based on my analysis of the 3B20S UNIX 4.1 manual I've been working through, all evidence points to the formalized SGS package and COFF originating tightly coupled to the 3B-20 line, then growing legs to support VAX, but never quite absorbing PDP-11 in entirety. That said, there are bits and pieces of the manual pages for the object format libraries that suggest there was some providence for PDP-11 in the development of COFF as well.

Where this has landed though is a growing curiosity regarding:
  1. Whether SGS and COFF were tightly coupled to one another from the outset, with SGS being supported by the general library routines being developed for the COFF format
@scj - any enlightenment -- your team in USG must have been part of all that. 
  1. Whether COFF was envisioned as a one-size-fits-all object format from its inception or started as an experiment in 3B-20 development that wound up being general enough for other platforms
That I cannot say, but I can say that for the UNIX source licensees (i.e. not the universities in the Research system, or those inside the Bell System), it was used in the "consider it standard" campaign that AT&T marketing in NC was starting to push.  This was around the time that PCC2 was coming out to replace the original PCC, but I remember that getting PCC2 cost extra.

Most of the BSD-based kernels (DEC, HP, etc.) were originally using a modified a.out of their own flavor, but I think almost all of them switched to COFF after the System III license.  What I have forgotten is whether that switch was a requirement of, or otherwise mixed up in, the license.

I do remember this was right around when gcc first started coming out, and they had a tool called robitussin to "cure coffs", as they were using a.out when they could.

  1. If, prior to this format, there were any other efforts to produce a unifying binary format and set of development tools, or if COFF was a happy accident from what were a myriad of different architectural toolset streams
MIT had a modified a.out format for the NU machine ports - that might have been called b.out.
CMU had Mach-O, which again was an extended a.out but even more flexible.
  1. One of the curious things is how VAX for a brief moment did have its own set of tools and a.out particulars before SGS/COFF.
Why is that curious? All original VAX development was just using the original PCC stream from V7 (and was pre-Judge Greene; more on that in a minute).

What I don't remember is whether PCC2 was COFF when introduced, or whether COFF came first, but I think they were separate things - again, someone like scj would be authoritative.

The three things that have to care are the assembler (as), the linker (ld), and the program-loading code in the kernel itself.
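
To make that concrete, here is a rough C sketch of the kind of check each of those pieces ends up doing: look at the leading 16-bit magic number and decide which object format it is dealing with. The a.out values (0407, 0410, 0411) are the V7 ones; the VAX COFF values (0570, 0575) are from my recollection of the System V headers and should be verified before relying on them. The names obj_kind and classify_magic are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical classification of an object file by its leading magic word. */
enum obj_kind { OBJ_UNKNOWN, OBJ_AOUT, OBJ_COFF_VAX };

enum obj_kind classify_magic(uint16_t magic)
{
    switch (magic) {
    case 0407:  /* V7 a.out: normal (impure) executable */
    case 0410:  /* V7 a.out: read-only (shared) text */
    case 0411:  /* V7 a.out: separate I and D space */
        return OBJ_AOUT;
    case 0570:  /* assumed: VAX COFF, writable text */
    case 0575:  /* assumed: VAX COFF, read-only text */
        return OBJ_COFF_VAX;
    default:
        return OBJ_UNKNOWN;
    }
}
```

Everything else (cc, ar, nm and friends aside) can mostly treat the object file as an opaque blob, which is why those three are the ones that matter when the format changes.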

 

 
For instance, many of the VAX-targeted utilities in 3.0/System III bear little in common option/manual-wise with the general common SGS utilities in System V. The "not on PDP-11" pages for various SGS components in System V much more closely resemble the 3B-20 utilities in 4.1 than any of the non PDP-11/VAX-only bits in System III.

Some examples:
  • The VAX assembler in System III contains a -dN option indicating the number of bytes to set aside for forward/external references for the linker to fill in.
  • The VAX assembler in System V contains among others the -n and -m options from 4.1 which indicate to disable address optimization and use m4 respectively
  • The System V assembler goes on to also include -R (remove input file after completion) -r (VAX only, add .data contents to .text instead) and options -b, -w, and -l to replace the -d1, -d2, and -d4 options indicated in the previous VAX assembler
  • System V further adds a -V to all the SGS software indicating the version of the software. This is new circa 5.0, absent from the 4.1 manual like the R, r, b, w, and l options

  • The 4.1 manual's singular ar(1) entry still agrees with the System III version. No arcv(1) is listed, implying the old ar format never made it to 3B-20
Hmm, this is confusing. The change from the old V[456] ar format to the new ar format happened between Research V6 and Research V7.  By the time of any VAX development the old format had pretty much been killed. I'd check what PWB 1.0 and 2.0 used. The new ar format was independent of what was stored in it.

i.e. V7: man 5 ar 
  AR(5)                                                       AR(5)

     NAME
          ar - archive (library) file format

     SYNOPSIS
          #include <ar.h>

     DESCRIPTION
          The archive command ar is used to combine several files into
          one.  Archives are used mainly as libraries to be searched
          by the link-editor ld.

          A file produced by ar has a magic number at the start, fol-
          lowed by the constituent files, each preceded by a file
          header.  The magic number and header layout as described in
          the include file are:

          #define ARMAG 0177545
          struct ar_hdr {
                  char ar_name[14];
                  long ar_date;
                  char ar_uid;
                  char ar_gid;
                  int  ar_mode;
                  long ar_size;
          };
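
As a rough illustration of why ar itself does not care about the object format, here is a minimal C sketch that decodes one of these V7 headers from raw bytes. It assumes the PDP-11 on-disk layout: 16-bit int, 32-bit long stored high word first with each word little-endian, giving a 26-byte header. The v7_ names are invented for this example and are not from any real library.

```c
#include <stdint.h>
#include <string.h>

#define V7_ARMAG   0177545u  /* magic word at the start of the archive */
#define V7_HDR_LEN 26        /* 14 + 4 + 1 + 1 + 2 + 4 bytes on a PDP-11 */

/* Host-side copy of one member header (hypothetical type). */
struct v7_ar_member {
    char     name[15];       /* 14 bytes from the file plus a terminating NUL */
    uint32_t date;
    uint8_t  uid, gid;
    uint16_t mode;
    uint32_t size;
};

/* PDP-11 16-bit word: little-endian. */
static uint16_t pdp_word(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* PDP-11 32-bit long: high-order word first, each word little-endian. */
static uint32_t pdp_long(const unsigned char *p)
{
    return ((uint32_t)pdp_word(p) << 16) | pdp_word(p + 2);
}

/* Returns nonzero if the buffer starts with the V7 archive magic. */
int v7_ar_is_archive(const unsigned char *buf, size_t len)
{
    return len >= 2 && pdp_word(buf) == V7_ARMAG;
}

/* Decode one V7_HDR_LEN-byte header; the caller guarantees the length. */
void v7_ar_decode(const unsigned char *raw, struct v7_ar_member *out)
{
    memcpy(out->name, raw, 14);
    out->name[14] = '\0';
    out->date = pdp_long(raw + 14);
    out->uid  = raw[18];
    out->gid  = raw[19];
    out->mode = pdp_word(raw + 20);
    out->size = pdp_long(raw + 22);
}
```

Note that nothing in the header says anything about what the member contains; an a.out object and any other file are archived the same way.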



 
  • The System V manual has both this ar(1) version as well as the new COFF-supporting version.
Why would ar(1) care?

 
  • Not sure if this implies the VAX ar format was expanded to support the COFF stuff for a little while until they decided on a new one or what.

  • The System III ld (which is implied to support PDP and VAX) survives in System V, but is cut down to supporting PDP-11 only
  • The COFF-ish ld shows up in 4.1, is then extended to VAX presumably in the same breath as the other COFF-supporting bits by Sys V, leading to two copies like many others, PDP-11-specific stuff and then COFF-specific stuff

The picture that starts to form in the context of all of this is, for a little while in the late 70s/early 80s, the software development environments for PDP-11, VAX-11, and 3B-20 were interplaying with each other in often times inconsistent ways. Taking a peek at the 32V manuals, the VAX tools in System III appear to originate with that project, which makes sense. If I'm understanding the timeline, COFF starts to emerge from the 3B-20 project and USG probably decides that's the way to go, a unified format, but with PDP-11 pretty much out the door support wise already, there was little reason to apply that to PDP-11 as well, so the PDP-11 tools get their swan song in System V, original VAX-11 tools from 32V are likely killed off in 4.x, and the stuff that started with the 3B-20 group goes on to dominate the object file format
That makes sense - but be careful - the 3B and WE32000 ISA may have been the driver, but I would expect that the compiler folk in Summit were more in the driver's seat.  The 3B20 kernel would use what they were getting from the tools team and core kernel team in USG.

Remember the politics at the time: Judge Greene has unleashed AT&T and they are now allowed to be in the biz, and the sales/marketing folks at AT&T were pushing the 3B20 and the WE32000 - so there are big forces behind the scenes that are not obvious/clear.

 
and development software stuff until ELF comes along some time later.
Yep - I never quite understood what the push for ELF over COFF was, after all the effort to drive COFF down people's throats.  Note Microsoft "embraced and extended" COFF as their format -- originally because of Xenix, I believe.  Someone like Paul W may have some insights on this, and that was before the 3B20.

What was the format that the original Xenix used - when it was targeting PDP-11, 68000, x86 and Z8000?  Again, I'm fuzzy on the details here. But I do remember that, during the license discussions that would lead to System III, that was one of the things the Microsoft team was worried about -- IIRC it was Bob Greenberg pushing all that.  I lost contact with Bob a few years ago, but if we can find him, I would expect Bob to know what Xenix was doing.  And again, that negotiation >>starts<< pre-Judge Greene, but finishes up soon afterwards.

I guess other questions this raises are:
  1. Were the original VAX tools built with any attention to compatibility with the PDP-11 bits Ken and Dennis wrote many years prior (based on some option discrepancies, possibly not?)
Hrmph... folks started with the PDP-11 tools and changed them as needed. I'm not sure compatibility is the right term.  They were retargeted and moved forward by people trying to support a new machine they had gotten and who did not want to run DEC's OS.
  1. Do the VAX utilities derive from the Interdata 8/32 work or if there was actually another stream of tools as part of that project?
I guess I don't understand the question.  The original V7 tools were retargeted.  When useful features were added, they might be offered/returned to other folks, but remember, Research is not "supporting" UNIX.  USG is where people start to think in terms of multiple targets >>before Judge Greene<<, and then after Judge Greene there was a push to stop using non-AT&T based equipment or chips in the Bell System and to make what Western Electric was selling attractive [which sometimes was a little bit of putting lipstick on porcine, as it were].  For instance, Rob and Bart's original JERQ was 68000 based, but by the time it became a product as the 5620 it had to be reworked around the WE32000.

 
  1. Was there any interplay between the existing tool streams (original PDP-11, 32V's VAX utilities, possibly Interdata 8/32) and the eventual COFF/SGS stuff, or was the latter pretty well siloed in 3B-20 land until deployment with 4.1?
I think you are putting too much on the 3B program itself.  The 3B was the task at hand at the time and a solid opportunity to apply the business choices being made.  You need to look at the greater business to understand a lot of the choices.  A lot of things were happening in parallel in the market that had other impacts on technology and how it was delivered -- the 3B program was the "technology train" leaving the station that some of them got attached to/delivered using.

But, as I said to you when we chatted, you really should not underestimate what was happening (or not happening) as AT&T changed its business focus - pre/post-Judge Greene. It was a large company with lots of different spheres of interest (read: different executives), each being measured on different things that they might value.