uuencode/decode for the million-and-first time...

Richard L. Goerwitz goer at ellis.uchicago.edu
Sun Feb 10 03:56:36 AEST 1991


"This" = uuen/decode:
>
>This seems to be wanted frequently....so here's saving somebody a trip
>to find the archives. This is as sent to me.
>

Often at EBCDIC sites, uuencoded files get mangled.  There are lots
of ways around this, but the most popular seems to be xxen/decode.
Here's a little Icon program that handles both xxencoded and
uuencoded files.  Incidentally, it works under MS-DOS as well as
Unix.
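
For anyone who hasn't looked inside the two formats: as far as the
code in the archive shows, both pack each group of three input bytes
into four 6-bit values, and they differ only in which printing
character stands for each value.  Uuencode writes char(value + 32)
(newer BSD versions write ` in place of a space), while xxencode looks
the value up in the alphabet "+-", digits, upper case, lower case.
The following is only a minimal sketch of that idea, not code from the
archive; the procedure name sextets is made up for illustration.

```icon
# Illustration only; not part of the archive.
# sextets() is a hypothetical helper name, not taken from iiencode.icn.

procedure sextets(s)
    # Split a 3-byte string into four 6-bit values (0-63).
    local b1, b2, b3
    b1 := ord(s[1])
    b2 := ord(s[2])
    b3 := ord(s[3])
    return [ishift(b1, -2),
            ior(iand(ishift(b1, 4), 8r060), ishift(b2, -4)),
            ior(iand(ishift(b2, 2), 8r074), ishift(b3, -6)),
            iand(b3, 8r077)]
end

procedure main()
    local xxchars, v, uu, xx
    # xxencode alphabet: +, -, digits, upper case, lower case (64 chars).
    xxchars := "+-" || &digits || &ucase || &lcase
    uu := xx := ""
    every v := !sextets("Cat") do {
        # uuencode: value + 32, with ` standing in for a space (value 0).
        uu ||:= (if v = 0 then "`" else char(v + 32))
        # xxencode: look the value up in the 64-character alphabet.
        xx ||:= xxchars[v + 1]
    }
    write("uuencode: ", uu)    # writes "uuencode: 0V%T"
    write("xxencode: ", xx)    # writes "xxencode: Eq3o"
end
```

The xxencode characters are all letters, digits, "+" and "-", which is
why they tend to survive ASCII<->EBCDIC translation better than the
punctuation that uuencode produces.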

-Richard (goer at sophist.uchicago.edu)

---- Cut Here and feed the following to sh ----
#!/bin/sh
# This is a shell archive (produced by shar 3.49)
# To extract the files from this archive, save it to a file, remove
# everything above the "!/bin/sh" line above, and type "sh file_name".
#
# made 01/27/1991 23:52 UTC by goer at sophist.uchicago.edu
# Source directory /u/richard/Iiencode
#
# existing files will NOT be overwritten unless -c is specified
# This format requires very little intelligence at unshar time.
# "if test", "cat", "rm", "echo", "true", and "sed" may be needed.
#
#                                                                          
#                                                                          
#
# This shar contains:
# length  mode       name
# ------ ---------- ------------------------------------------
#   6515 -r--r--r-- iiencode.icn
#   6103 -r--r--r-- iidecode.icn
#   1573 -rw-r--r-- README
#
if test -r _shar_seq_.tmp; then
	echo 'Must unpack archives in sequence!'
	echo Please unpack part `cat _shar_seq_.tmp` next
	exit 1
fi
# ============= iiencode.icn ==============
if test -f 'iiencode.icn' -a X"$1" != X"-c"; then
	echo 'x - skipping iiencode.icn (File already exists)'
	rm -f _shar_wnt_.tmp
else
> _shar_wnt_.tmp
echo 'x - extracting iiencode.icn (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'iiencode.icn' &&
X############################################################################
X#
X#	Name:	 iiencode.icn
X#
X#	Title:	 iiencode (port of the Unix/C uuencode program to Icon)
X#
X#	Author:	 Richard L. Goerwitz
X#
X#	Version: 1.7
X#
X############################################################################
X#
X#  This is an Icon port of the Unix/C uuencode utility.  Since
X#  uuencode is publicly distributable BSD code, I simply grabbed a
X#  copy, and rewrote it in Icon.  The only basic functional changes I
X#  made to the program were:  1) To simplify the notion of file mode
X#  (everything is encoded with 0644 permissions), and 2) to add sup-
X#  port for xxencode format (which will generally pass unscathed even
X#  through EBCDIC sites).
X#
X#  Iiencode's usage is compatible with that of the Unix uuencode
X#  command, i.e. a first (optional) argument gives the name of the
X#  file to be encoded.  If this is omitted, iiencode just uses the
X#  standard input.  The second argument specifies the name the
X#  encoded file should be given when it is ultimately decoded.
X#
X#  Extensions to the base uuencode command options include -x and -o.
X#  An -x tells iiencode to use xxencode (rather than uuencode) format.
X#  Option -o causes the following argument to be used as the file
X#  iiencode is to write its output to (the default is &output).  Note
X#  that, on systems with newline translation (e.g. MS-DOS), the -o
X#  argument should always be used.
X#
X#    iiencode [infile] [-x] remote-filename [-o output-filename]
X#
X#  BUGS:  Slow.  I decided to go for clarity and symmetry, rather than
X#  speed, and so opted to do things like use ishift(i,j) instead of
X#  straight multiplication (which under Icon v8 is much faster).  Note
X#  that I followed the format of the newest BSD release, which refuses
X#  to output spaces.  If you want to change things back around so that
X#  spaces are output, look for the string "BSD" in my comments, and
X#  then (un)comment the appropriate sections of code.
X#
X#  NOTE ON MS-DOS:  Systems for which newline translation is necessary
X#  can encode files.  The problem is that, since iiencode sends coded
X#  files to the standard output, it is impossible to avoid sending out
X#  OS-specific sequences at the end of each line.  While most uudecode
X#  programs will be able to handle the resulting file, they will not
X#  always decode the file *name* properly.  Binary files simply won't
X#  work, unless the program is modified to write to a file instead of
X#  the standard output.  If you do this, make sure you open the file
X#  for writing in untranslated mode.  If someone modifies this program
X#  so that it works really well under DOS, please send me the results.
X#
X############################################################################
X#
X#  See also: iidecode.icn
X#
X############################################################################
X
X
Xprocedure main(a)
X
X    local ofs, in_filename, out_filename, in, out, is_xx, remotename
X
X    usage := "usage:  iiencode [infile] [-x] _
X	remote-filename	[-o output-filename]"
X
X    # Parse arguments.
X    ofs := 0
X    while (ofs +:= 1) <= *a do {
X        case a[ofs] of {
X	    "-x"    : is_xx := 1
X	    "-o"    : out_filename := a[ofs +:= 1] | stop(usage)
X	    default : {
X		if not (/in_filename := a[ofs]) then
X		    remotename := a[ofs]
X	    }
X	}
X    }
X
X    # If remotename is null, set it to in_filename.  If it's still
X    # null, then abort with usage message.
X    if /(/remotename :=: in_filename) then {
X        write(&errout,usage)
X        exit(2)
X    }
X
X    # If no input filename was supplied, use &input.
X    if /in_filename then
X	/in := &input
X    else
X	in := open(in_filename) |
X	stop(&errout,"Can't open input file, ",in_filename,".\n",usage)
X
X    # If an output filename was specified, open it for writing.
X    if \out_filename then
X	out := open(out_filename, "wu") |
X	    stop("Can't open output file, ",out_filename,".\n",usage)
X    # Set null out to &output; advise DOS users to use -o option.
X    else {
X	out := &output
X	if find("MS-DOS",&features) then
X	    write(&errout, "Okay, but the -o option is recommended for DOS.")
X    }
X
X    # This generic version of uuencode treats file modes in a primitive
X    # manner so as to be usable in a number of environments.  Please
X    # don't get fancy and change this unless you plan on keeping your
X    # modified version on-site (or else modifying the code in such a
X    # way as to avoid dependence on a specific operating system).
X    writes(out, "begin 644 ",remotename,"\n")
X
X    encode(out, in, is_xx)
X
X    writes(out, "end\n")
X
X    every close(in|out)
X    exit(0)
X
Xend
X
X
X
Xprocedure encode(out, in, is_xx)
X
X    # Copy from in to out, encoding as you go along.
X
X    local line
X
X    if \is_xx then
X	ENC := xxENC
X
X    # 1 (up to) 45 character segment
X    while line := reads(in, 45) do {
X	writes(out, ENC(*line))
X	line ? {
X	    while outdec(move(3), out)
X	    pos(0) | outdec(left(tab(0), 3, " "), out)
X	}
X	writes(out, "\n")
X    }
X    # Uuencode adds a space and newline here, which is decoded later
X    # as a zero-length line (signals the end of the decoded text).
X    # writes(" \n")
X    # The new BSD code (compatible with the old) avoids outputting
X    # spaces by writing a ` (see also how it handles ENC() below).
X    if \is_xx
X    then writes(out, "+\n")
X    else writes(out, "`\n")
X    
Xend
X
X
X
Xprocedure outdec(s, out)
X
X    # Output one group of 3 bytes (s) to out.  This is one case
X    # where C is actually more elegant than Icon.  Note well!
X
X    local c1, c2, c3, c4
X
X    c1 := ishift(ord(s[1]),-2)
X    c2 := ior(iand(ishift(ord(s[1]),+4), 8r060),
X	      iand(ishift(ord(s[2]),-4), 8r017))
X    c3 := ior(iand(ishift(ord(s[2]),+2), 8r074),
X	      iand(ishift(ord(s[3]),-6), 8r003))
X    c4 := iand(ord(s[3]),8r077)
X    every writes(out, ENC(c1 | c2 | c3 | c4))
X
X    return
X
Xend
X
X
X
Xprocedure ENC(c)
X
X    # ENC is the basic 1 character encoding procedure to make a char
X    # printing.
X
X    # New BSD code doesn't output spaces...
X    return " " ~== char(iand(c, 8r077) + 32) | "`"
X    # ...the way the old code does:
X    # return char(iand(c, 8r077) + 32)
X
Xend
X
X
X
Xprocedure xxENC(c)
X
X    # xxENC is the xxencode counterpart of ENC: the basic 1 character
X    # encoding procedure to make a char printing.
X
X    local k, ordval
X    static ordtbl
X    initial {
X	ordval := -1
X	ordtbl := table()
X	every k := !"+-0123456789ABCDEFGHIJKLMNOPQRST_
X		     UVWXYZabcdefghijklmnopqrstuvwxyz"
X	do insert(ordtbl, ordval +:= 1, k)
X	oversizes := 0
X    }
X
X    return ordtbl[iand(c, 8r077)]
X
Xend
SHAR_EOF
true || echo 'restore of iiencode.icn failed'
rm -f _shar_wnt_.tmp
fi
# ============= iidecode.icn ==============
if test -f 'iidecode.icn' -a X"$1" != X"-c"; then
	echo 'x - skipping iidecode.icn (File already exists)'
	rm -f _shar_wnt_.tmp
else
> _shar_wnt_.tmp
echo 'x - extracting iidecode.icn (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'iidecode.icn' &&
X############################################################################
X#
X#	Name:	 iidecode.icn
X#
X#	Title:	 iidecode (port of the Unix/C uudecode program to Icon)
X#
X#	Author:	 Richard L. Goerwitz
X#
X#	Version: 1.7
X#
X############################################################################
X#
X#  This is an Icon port of the Unix/C uudecode utility.  Since
X#  uudecode is publicly distributable BSD code, I simply grabbed a
X#  copy, and rewrote it in Icon.  The only basic functional changes I
X#  made to the program were:  1) To simplify the notion of file mode
X#  (the mode given on the "begin" line is ignored), and 2) to add a
X#  command-line switch for xxencoded files (similar to uuencoded
X#  files, but capable of passing unscathed through non-ASCII EBCDIC
X#  sites).
X#
X#         usage:  iidecode [infile] [-x]
X#
X#  Usage is compatible with that of the UNIX uudecode command, i.e. a
X#  first (optional) argument gives the name of the file to be decoded.
X#  If this is omitted, iidecode just uses the standard input.  The -x
X#  switch (peculiar to iidecode) forces use of the xxdecoding
X#  algorithm.  If you try to decode an xxencoded file without
X#  specifying -x on the command line, iidecode will try to forge ahead
X#  anyway.  If it thinks you've made a mistake, iidecode will inform
X#  you after the decode is finished.
X#
X#  BUGS:  Slow.  I decided to go for clarity and symmetry, rather than
X#  speed, and so opted to do things like use ishift(i,j) instead of
X#  straight multiplication (which under Icon v8 is much faster).
X#
X############################################################################
X#
X#  See also: iiencode.icn
X#
X############################################################################
X
X
Xglobal oversizes
X
Xprocedure main(a)
X
X    local ARG, in, out, dest, is_xx
X
X    # Check for correct number of args.
X    if *a > 2 then {
X	write(&errout,"usage:  iidecode [infile] [-x]")
X	exit (2)
X    }
X
X    # Check for optional input filename and -x
X    every ARG := !a do {
X	if ARG == "-x" then
X	    is_xx := 1
X	else {
X	    if not (in := open(ARG, "r")) then {
X		write(&errout,"Can't open input file, ",ARG,".")
X		write(&errout,"usage:  iidecode [infile] [-x]")
X		exit(1)
X	    }
X	}
X    }
X    /in := &input
X
X    # Find the "begin" line, and determine the destination file name.
X    !in ? {
X	tab(match("begin ")) &
X	tab(many(&digits))   &	# mode ignored
X	tab(many(' '))       &
X	dest := trim(tab(0),'\r') # concession to MS-DOS
X    }
X
X    # If dest is null, the begin line either isn't present, or is
X    # corrupt (which necessitates our aborting with an error msg.).
X    if /dest then {
X	write(&errout,"No begin line.")
X	exit(3)
X    }
X
X    # Tilde expansion is heavily Unix dependent, and we can't always
X    # safely write the file to the current directory.  Our only choice
X    # is to abort.
X    if match("~",dest) then {
X	write(&errout,"Please remove ~ from input file begin line.")
X	exit(4)
X    }
X       
X    out := open(dest, "wu") |
X	stop("Can't open output file, ",dest,".")
X    decode(in, out, is_xx)	# decode stops at the zero-length line
X    if not match("end", !in) then {
X	write(&errout,"No end line.\n")
X	exit(5)
X    }
X
X    # Check global variable oversizes (set by DEC) to see if we used the
X    # correct decoding algorithm.
X    if \is_xx then {
X	if oversizes = 0 then {
X	    write(&errout, "Input file appears to have been uuencoded.")
X	    write(&errout, "Try invoking iidecode without the -x arg.")
X	}
X    }
X    else {
X	if oversizes > 1 then {
X	    write(&errout, "Input file is either corrupt, or xxencoded.")
X	    write(&errout, "Please check the output; try the -x option.")
X	}
X    }
X
X    every close(\in | out)
X
X    exit(0)
X
Xend
X
X
X
Xprocedure decode(in, out, is_xx)
X    
X    # Copy from in to out, decoding as you go along.
X
X    local line, chunk
X
X    if \is_xx then
X	DEC := xxDEC
X
X    while line := read(in) do {
X
X	if *line = 0 then {
X	    write(&errout,"Short file.\n")
X	    exit(10)
X	}
X
X	line ? {
X	    n := DEC(ord(move(1)))
X
X	    if not ((*line-1) % 4 = 0, n <= ((*line / 4)*3)) then {
X		write(&errout,"Short and/or corrupt line:\n",line)
X		if /is_xx & oversizes > 1 then
X		    write(&errout,"Try -x option?")
X                exit(15)
X            }
X
X	    # Uuencode signals the end of the coded text by a space
X	    # and a line (i.e. a zero-length line, coded as a space).
X	    if n <= 0 then break
X	    
X	    while (n > 0) do {
X		chunk := move(4) | tab(0)
X		outdec(chunk, out, n)
X		n -:= 3
X	    }
X	}
X    }
X    
X    return
X
Xend
X
X
X
Xprocedure outdec(s, f, n)
X
X    # Output a group of 3 bytes (4 input characters).  N is used to
X    # tell us not to output all of the chars at the end of the file.
X
X    local c1, c2, c3
X
X    c1 := iand(
X	       ior(
X		   ishift(DEC(ord(s[1])),+2),
X		   ishift(DEC(ord(s[2])),-4)
X		   ),
X	       8r0377)
X    c2 := iand(
X	       ior(
X		   ishift(DEC(ord(s[2])),+4),
X		   ishift(DEC(ord(s[3])),-2)
X		   ),
X	       8r0377)
X    c3 := iand(
X	       ior(
X		   ishift(DEC(ord(s[3])),+6),
X		   DEC(ord(s[4]))
X		   ),
X	       8r0377)
X
X    if (n >= 1) then
X	writes(f,char(c1))
X    if (n >= 2) then
X	writes(f,char(c2))
X    if (n >= 3) then
X	writes(f,char(c3))
X
Xend	
X
X
X
Xprocedure DEC(c)
X
X    # global oversizes
X    initial oversizes := 0
X
X    # Count characters greater than or equal to 'a' (ASCII 97).
X    # If we get a lot of these, the file is corrupt, or perhaps
X    # xxencoded (in which case -x should have been specified).
X    if c >= 97 then
X	oversizes +:= 1
X
X    # Subtract 32 and mask off seventh and higher bits.
X    return iand(c - 32, 8r077)
X
Xend
X
X
X
Xprocedure xxDEC(c)
X
X    local k, ordval
X    static ordtbl
X    # global oversizes
X    initial {
X	ordval := -1
X	ordtbl := table()
X	every k := ord(!"+-0123456789ABCDEFGHIJKLMNOPQRST_
X		         UVWXYZabcdefghijklmnopqrstuvwxyz")
X	do insert(ordtbl, k, ordval +:= 1)
X	oversizes := 0
X    }
X
X    # Mask off eighth and higher bits.
X    new_c := iand(c, 8r177)
X
X    # Count characters greater than or equal to 'a' (ASCII 97).
X    # If we find none of these, the file probably wasn't xxencoded.
X    if new_c >= 97 then
X	oversizes +:= 1
X
X    # Map to 0-63 range (00111111 or less), mask off extra bits.
X    return iand(ordtbl[new_c], 8r077)
X
Xend
SHAR_EOF
true || echo 'restore of iidecode.icn failed'
rm -f _shar_wnt_.tmp
fi
# ============= README ==============
if test -f 'README' -a X"$1" != X"-c"; then
	echo 'x - skipping README (File already exists)'
	rm -f _shar_wnt_.tmp
else
> _shar_wnt_.tmp
echo 'x - extracting README (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'README' &&
X
XIncluded in this package are two Icon source files, iiencode.icn and
Xiidecode.icn.  When compiled, these will yield icode files which
Xemulate the Unix uuencode and uudecode commands.  They are supposed to
Xbe completely compatible with all existing uuen/decode versions, and
Xare patterned after the latest publicly distributable BSD C source code.
X
XFor those who are working at sites subject to the vicissitudes of
XASCII<->EBCDIC translation, an extra switch (-x) is included that
Xmakes iiencode/iidecode produce/extract xxencode-format files (which
Xdo not get mangled when moving through non-ASCII sites).  Before using
Xthis switch with iiencode, make sure that the person to receive the
Xcoded transmission can unpack xxencoded files.
X
XIiencode and iidecode have been tested under Unix and Xenix, and work
Xfine.  They have received brief testing under MS-DOS, and appear to
Xwork there as well.
X
XFor systems which use some other sequence in place of UNIX LFs (e.g.
XMS-DOS), there is a special iiencode switch, -o filename, which
Xdirects iiencode to send its output to filename instead of &output.
XThis makes it possible to strip out trailing carriage returns, and
Xwrite an output file in UNIX format.
X
XIien/decode are written in Icon, which is extremely portable.  Anyone
Xwith an Icon interpreter or compiler can install them.  And, since
XIcon is free (obtainable from the U. of Arizona at cs.arizona.edu),
Xthe programs really are without cost of any kind.
X
XNote:  They aren't very fast, especially when the -x switch is used.
X
X-Richard (goer at sophist.uchicago.edu)
X
SHAR_EOF
true || echo 'restore of README failed'
rm -f _shar_wnt_.tmp
fi
exit 0
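
(Text after the "exit 0" line is never read by sh, so the note below
does not affect unpacking.)

Each data line in an encoded file starts with one character giving
the number of bytes on that line, followed by groups of four
characters that each carry three bytes.  The sketch below decodes a
single line in either format.  It is only an illustration of the line
layout under that reading of the code, not the author's routine: the
names uuval, xxval, byte and decode_line are hypothetical, and the
sketch does none of the error checking that iidecode.icn performs.

```icon
# Illustration only; these procedures are hypothetical and are not
# taken from iidecode.icn.

procedure uuval(c)
    # uuencode: subtract 32 and keep the low six bits, so that both
    # a space and ` give 0.
    return iand(ord(c) - 32, 8r077)
end

procedure xxval(c)
    # xxencode: position in the 64-character alphabet, counting from 0.
    return find(c, "+-" || &digits || &ucase || &lcase) - 1
end

procedure byte(hi, lo)
    # Combine two partial values and keep the low eight bits.
    return char(iand(ior(hi, lo), 8r377))
end

procedure decode_line(line, val)
    # val is the procedure that maps one character to a 6-bit value.
    local v, n, out, i
    v := []
    every put(v, val(!line))
    n := get(v)                 # first value is the byte count
    out := ""
    every i := 1 to *v - 3 by 4 do {
        out ||:= byte(ishift(v[i], 2), ishift(v[i + 1], -4))
        out ||:= byte(ishift(v[i + 1], 4), ishift(v[i + 2], -2))
        out ||:= byte(ishift(v[i + 2], 6), v[i + 3])
    }
    # Drop any padding bytes from the final group.
    return out[1+:n]
end

procedure main()
    write(decode_line("#0V%T", uuval))    # writes "Cat"
    write(decode_line("1Eq3o", xxval))    # writes "Cat"
end
```

Passing the value-mapping procedure as an argument is just one way to
share the bit shuffling between the two formats; the archive's own
code instead reassigns the global DEC to xxDEC when -x is given.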



More information about the Alt.sources mailing list