ASCII, EBCDIC, and other character sets

Erik E. Fair fair at ucbarpa.BERKELEY.EDU
Tue Oct 22 20:10:17 AEST 1985


Internationalization is a laudable goal. However, we have the weight of
history to deal with, both in terms of software already written, and in
terms of the various representation problems associated with a truly
international character set.

I wonder if any `old-timers' out there would describe how ASCII came
about in the first place. I'm too young to remember the standardisation
effort (or any of the pain that was involved for the people who had to
convert hardware or software), and I count myself fortunate that I have
never been exposed to IBM mainframes (thereby missing the joys of
EBCDIC). I'll bet that there was a lot of screaming at the time.

I expect that if we stick the extra European characters in an 8-bit
code, with ASCII as a subset, we'll end up looking like EBCDIC, in the
sense that our character set will be fragmented, and things that fall
out neatly with ASCII (e.g. sorting) will not be so neat any more.
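
To make the sorting point concrete, here is a small sketch in Python. It
is an illustration only: the post names no particular 8-bit code, so
ISO 8859-1 (Latin-1) is assumed as a stand-in for "an 8-bit code with
ASCII as a subset", and Python's cp037 codec is assumed as a
representative EBCDIC code page. The helper name byte_order is made up
for this example.

```python
# Assumption: Latin-1 stands in for "an 8-bit code with ASCII as a subset".
# Assumption: cp037 stands in for EBCDIC. Neither is named in the post.

def byte_order(words, codec):
    """Sort words by their raw code values, as a naive byte compare would."""
    return sorted(words, key=lambda w: w.encode(codec))

# Pure ASCII: sorting by code value is also alphabetical order.
ascii_sorted = byte_order(["zebra", "apple", "echo"], "ascii")

# 8-bit superset: the accented letter has a code above 'z' (0xE9 > 0x7A),
# so it lands after every unaccented word instead of next to "echo".
mixed_sorted = byte_order(["zebra", "apple", "étude", "echo"], "latin-1")

# EBCDIC fragmentation: 'i' and 'j' are adjacent letters but not
# adjacent codes, so "is this a letter?" is not a single range test.
gap = "j".encode("cp037")[0] - "i".encode("cp037")[0]

print(ascii_sorted)
print(mixed_sorted)
print(gap)
```

With these assumptions, the accented word sorts after "zebra", and the
step from 'i' to 'j' in the EBCDIC code page is larger than one. Both
are the kind of surprise that code written for plain ASCII never had to
handle.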

As parochial as this sounds, I think that fooling with the existing
UNIX software will require far too much effort, and will cost too much
for a marginal return on investment. UNIX has ASCII ingrained into it
quite deeply (emphasis on the `A' in ASCII, for those who have
terminals that transliterate characters like {|} into whatever their
national characters are).
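
For readers who have not seen such a terminal, the sketch below
simulates one in Python. The post names no specific national variant;
the German 7-bit variant (DIN 66003) is assumed here purely as an
example, and the name GERMAN_7BIT is made up for the illustration.

```python
# Assumption: the German national variant of the 7-bit code (DIN 66003)
# is used as the example; the post does not say which variant is meant.
# The same code positions that ASCII uses for brackets and braces carry
# national letters on such a terminal.
GERMAN_7BIT = str.maketrans("[\\]{|}~@", "ÄÖÜäöüß§")

c_source = "if (a || b) { x[0] = 1; }"

# What the user of such a terminal would see for this line of C.
displayed = c_source.translate(GERMAN_7BIT)

print(displayed)
```

The bytes on the wire are unchanged; only the glyphs differ. That is why
software whose syntax leans on characters such as {|} reads so badly on
these terminals.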

I think that the proper approach is to start over completely. New
character set, new hardware for the proper transmission and
representation of said character set, and new software for the
manipulation of said character set. Better a clean solution all
through, than a hack that everyone will waste time trying in vain to
maintain.

Software-wise, there's no reason why the good ideas in UNIX can't be
used in the new international operating system. AT&T might even attempt
to keep the same interfaces (I doubt it can be done; character set is a
pretty fundamental assumption). Really, AT&T and its partners in crime
should view this as an opportunity to fix the things that they screwed
up in the first place.

Me? I'll continue to use UNIX in its current form (4 BSD, when I have
the option) until the new system is available, and you can bet that it
will support whatever I'm doing at that time better than `old' UNIX
did, or I won't buy it.

	Erik E. Fair	ucbvax!fair	fair at ucbarpa.BERKELEY.EDU


