[TUHS] Bell Foreign-Language UNIX Efforts

Steffen Nurpmeso steffen at sdaoden.eu
Tue Mar 21 01:44:30 AEST 2023


arnold at skeeve.com wrote in
 <202303200755.32K7tIeW023352 at freefriends.org>:
 |Rob Pike <robpike at gmail.com> wrote:
 |> (Speaking of design by committee, the multibyte stuff in C89 was
 |> atrocious, and I heard was done in committee to get someone, perhaps
 |> the Japanese, to sign off.)
 |
 |It's not lovely, but I wouldn't call it atrocious. It gets the job
 |done; code using it can handle multibyte encodings while being totally

No, it does not.

 |character-set agnostic.  I speak from experience, gawk does this.

However, note that even something like "uppercase this string"
cannot be done the right way, because a truly Unicode-aware
operation needs to look at the entire string (sentence): there
may be interdependencies that modify the result.  Therefore
the entire isw*() and tow*() series is simply wrong.  And
therefore gawk does this wrong, too.  (But the GNU environment
does have a solution, I think.)

 |(I use the "restartable" routines - mbrlen() and so on.)

Yes.

 |I understand that Unicode + UTF-8 solve the issue completely. But I'd

In fact, to do it right you need something like ICU.
There are special number systems that do not fit ISO C.
There are special grammatical rules to obey, which especially
hurts for everything that is truly collation-aware.

(And then my brain simply runs away from the thinking that
invented strcoll(3) for anything beyond all-American ten inch.)

 |like to ask, in all seriousness and so that I can learn, given the world
 |as it was in 1989, how would you solve the problem? If you had designed
 |the C level routines, what would they have looked like?

P.S.: no, no, and one more no.
If you want to have a nice Monday, please have a look at the NetBSD
current source code, lib/libc/gen/vis.c.  There you see how well
this interface "gets the job done".  And I saw it evolve as the
commits of Christos Zoulas flew by, ten years or so ago.
No.

 |Thanks,

Then again, it all does not matter, since the IETF and others simply
throw one more thing upon the other, so that you need a JSON library
for a key=value list, and an HTTP, HTTP/2 and HTTP/3 library to
download it over TLS (I think the entire world now proxies all
protocols over :443, which makes it safer, and administration
easier! .. I have heard).  Why did you invent 16-bit ports back
then?  What were you thinking?  One is enough, and much safer!
That makes me wonder how OpenBSD could introduce two remote holes
for only one port, .. but that likely is a different story.

Hysterical on a Monday, and that on Equinox.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

