[TUHS] CB-UNIX 2.3 Manual Split

segaloco via TUHS tuhs at tuhs.org
Mon May 29 06:16:38 AEST 2023


Hello, I'm just emailing to let folks know that I've managed to split up the CB-UNIX 2.3 manual in the archive[1] into individual items.  I've moved the original contents of this directory into the "raw" directory, and the split PDFs of the manual now live under "man".  I intend to do the same with the source code scans, breaking them up into individual files, which will eventually go in an accompanying "sys" folder.

As for my process, I decided to throw together a little something to facilitate splitting up PDFs from a simple table.  I've created two scripts[2], pdfslice and pdfbutcher.  The former is an interface on top of Ghostscript that takes a particular page-bound slice out of a PDF on stdin and drops it on stdout.  The latter reads a tab-delimited list of slices out of a table, butchering the PDFs down into their various cuts.  The format is dirt simple: the source PDF name, the start page, the end page, and the destination file prefix, to which .pdf is appended on output.  This isn't by any means a formal or robust solution, just something that came together easily and works for my application.  I'm sure this could be made much more efficient; it operates on one slice at a time, with all the opening and closing that entails for each slice, but it gets the job done.  Feel free to use it for whatever, just don't complain to me when it eats your favorite file or scribbles all over your disk.  Also, an example input file is included[3] for the curious.
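For anyone who wants the gist without opening the scripts, here is a rough Python sketch of the table-driven idea.  It is not the actual pdfslice/pdfbutcher code: the function names and the file names in the usage note are made up for illustration, and it hands Ghostscript file names rather than working stdin-to-stdout.  The Ghostscript options used (-sDEVICE=pdfwrite, -dFirstPage, -dLastPage, -sOutputFile) are the standard ones for page-range extraction.

```python
import csv
import io


def read_cuts(table_text):
    """Parse tab-delimited rows: source PDF, start page, end page, destination prefix."""
    cuts = []
    for row in csv.reader(io.StringIO(table_text), delimiter="\t"):
        if not row:
            continue  # tolerate blank lines in the table
        source, first, last, prefix = row
        cuts.append((source, int(first), int(last), prefix))
    return cuts


def slice_command(source, first, last, prefix):
    """Build a Ghostscript argv that writes pages first..last of source to prefix.pdf."""
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite",
        f"-dFirstPage={first}",
        f"-dLastPage={last}",
        f"-sOutputFile={prefix}.pdf",  # ".pdf" is appended to the prefix
        source,
    ]


def plan(table_text):
    """Return one Ghostscript argv per row of the table, in order."""
    return [slice_command(*cut) for cut in read_cuts(table_text)]
```

Given a table such as "manual.pdf<TAB>1<TAB>3<TAB>man/a.1", plan() returns one argument list per row; passing each to subprocess.run() would perform the cuts one slice at a time, re-opening the source for every slice just as described above.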

As for the CB-UNIX pages, my hunch is that whoever owned this manual had a CB-UNIX 2.1 manual originally and "upgraded" it with supplied pages for 2.3, as was conventional with documentation updates.  For this reason, there are a few random blank pages and several locally printed pages strewn throughout.  In any case where a blank page was encountered, it was retained in the document for the manpage it followed.  In other words, if there is a blank page between a.1 and b.1, it is appended to a.1.  The likely reason for blank pages on the back of 2.1 pages is that new copies of the same 2.1 pages were provided with the replacements to keep the page spacing correct with respect to the pages not being replaced.  That's my hunch anyway.
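To make the blank-page rule concrete, here is a small Python sketch of it.  This is only an illustration of the rule as stated, not how the table was actually produced; the function name and the per-page label list are assumptions made for the example.

```python
def group_pages(labels):
    """Turn per-page labels into (name, first_page, last_page) ranges.

    labels[i] is the manpage name found on page i+1, or None for a blank page.
    Page numbers are 1-based and inclusive.  A blank page is attached to the
    manpage it follows.
    """
    ranges = []
    for page, label in enumerate(labels, start=1):
        if ranges and (label is None or label == ranges[-1][0]):
            # Same manpage continues, or a blank page trails it: extend the range.
            name, first, _ = ranges[-1]
            ranges[-1] = (name, first, page)
        elif label is not None:
            # A new manpage starts here.
            ranges.append((label, page, page))
        # A blank page before any manpage has nothing to attach to and is skipped.
    return ranges
```

For instance, with a.1 on pages 1 and 2, a blank page 3, and b.1 on page 4, the blank page ends up in the a.1 range (pages 1 to 3) and b.1 starts at page 4.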

There are also manpages here and there that are missing a page or, more likely, were supposed to be removed in the 2.1->2.3 update and simply weren't.  Plus, there are a few instances where more than one copy of a non-local version of a page is present (in other words, situations where the original 2.1 page wasn't removed but a 2.3 or other newer page was also added).  In all these circumstances, the 2.1 page is the one with the normal name and the 2.3 page has been given the suffix .1l instead of .1, despite not being in the "local pages" PDFs.  I'm open to suggestions, but my reasoning was that if the 2.1 page was the original page for that actual binder, and the 2.3 page was added alongside it rather than replacing it, then the 2.3 page is for all intents and purposes a local addition.  When in doubt, [4] should be a reasonably complete list of which non-l-suffixed pages aren't from 2.1.  Anything else non-local should originate from the previous manual.  Also, where there were duplicate pages that couldn't otherwise be resolved this way, the older of the two pages is marked with a .o in the path before the manual section, in keeping with the CB-UNIX convention for old versions of pages.

As usual, please let me know if anything seems amiss.  I'll admit that after a few spot checks I didn't check each and every page my script popped out for accuracy, but everything I did check had the right pages.  If you do find something off and want to try to slice it right, the scripts above include manpages that should give you a good idea of how to use them if simply reading the scripts isn't clear.

- Matt G.

[1] - https://www.tuhs.org/Archive/Distributions/USDL/CB_Unix/
[2] - https://gitlab.com/segaloco/misc/-/tree/master/scripts
[3] - https://pastebin.com/9s2ene9g
[4] - https://pastebin.com/jHw7JeDc

P.S. Wholly unrelated but just out of curiosity, if anyone knows the 16650 UART well and has some time, can you please email me privately?  It's tangential to a UNIX-y project but I'll spare the details here.

