[TUHS] Aleph Null in Software Practice & Experience.

Ralph Corderoy ralph at inputplus.co.uk
Mon May 2 19:55:14 AEST 2022


Hi Rob,

> The output of  "unicode 5d0-5e7" (robpike.io/cmd/unicode has the
> command) is fun.
>
> 05d0 א 05d1 ב 05d2 ג 05d3 ד
> 05d4 ה 05d5 ו 05d6 ז 05d7 ח
> 05d8 ט 05d9 י 05da ך 05db כ
> 05dc ל 05dd ם 05de מ 05df ן
> 05e0 נ 05e1 ס 05e2 ע 05e3 ף
> 05e4 פ 05e5 ץ 05e6 צ 05e7 ק
>
> For comparison, here is "unicode 3d0-3e7". It will be fun to watch how
> it's rendered.
>
> 03d0 ϐ 03d1 ϑ 03d2 ϒ 03d3 ϓ
> 03d4 ϔ 03d5 ϕ 03d6 ϖ 03d7 ϗ
> 03d8 Ϙ 03d9 ϙ 03da Ϛ 03db ϛ
> 03dc Ϝ 03dd ϝ 03de Ϟ 03df ϟ
> 03e0 Ϡ 03e1 ϡ 03e2 Ϣ 03e3 ϣ
> 03e4 Ϥ 03e5 ϥ 03e6 Ϧ 03e7 ϧ

In the terminal where I read and write email, every pair is drawn left to
right, as if it were ‘0041 A’.
But save the email's text/plain to foo.txt and foo.html, add a little HTML
to foo.html, and the browser, here Firefox, presents the Hebrew in both as

    05d0 05 אd1 05 בd2 05 גd3 ד
    05d4 05 הd5 05 וd6 05 זd7 ח
    05d8 05 טd9 05 יda 05 ךdb כ
    05dc 05 לdd 05 םde 05 מdf ן
    05e0 05 נe1 05 סe2 05 עe3 ף
    05e4 05 פe5 05 ץe6 05 צe7 ק

due to the mix of Unicode's strong, weak, and neutral bi-directional
character types.
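
A rough way to see those types, assuming Python and its standard
unicodedata module are to hand (the name `sample` is only for
illustration): list the bidirectional class of each character around the
first Hebrew letter.

```python
import unicodedata

# The start of the first row; the Hebrew letter is written as an escape
# so this source text is not itself reordered by a bidi-aware viewer.
sample = "05d0 \u05d0 05d1"

# L and R are strong types, EN (European number) is weak,
# and WS (whitespace) is neutral.
classes = [unicodedata.bidirectional(ch) for ch in sample]

for ch, cls in zip(sample, classes):
    print(f"U+{ord(ch):04X} {cls}")
```

The digits come out as weak, the spaces as neutral, and only the hex
letters and the Hebrew letter as strong, which is, as far as I understand
the bidirectional algorithm, why the ‘05’ following each Hebrew letter is
drawn as part of its right-to-left run.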

To see what I intend above needs a ‘broken’ renderer, like a terminal.
For those with more intelligent renderers, it's as if runes normally
drawn as

    00c0 À 00c1 Á 00c2 Â 00c3 Ã

became

    00c0 00 Àc1 00 Ác2 00 Âc3 Ã

Wrapping each of the Hebrew characters in the text and HTML files in
LRI...PDI,

    LRI  U+2066  Left-to-right isolate 
    PDI  U+2069  Pop directional isolate

so the first row becomes

    0030 0035 0064 0030  0020  2066 05d0 2069  0020
    0030 0035 0064 0031  0020  2066 05d1 2069  0020
    0030 0035 0064 0032  0020  2066 05d2 2069  0020
    0030 0035 0064 0033  0020  2066 05d3 2069  000a

has Firefox display the tables as intended.  Perhaps the unicode command
should do this to ensure correct display, especially if some terminals
ever start to improve?
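
The wrapping described above can be sketched as follows. This is only an
illustration in Python, not how the unicode command works; the helper
name `isolate_rtl` is made up, and treating the bidi classes R and AL as
‘needs wrapping’ is my assumption.

```python
import unicodedata

LRI = "\u2066"  # Left-to-right isolate
PDI = "\u2069"  # Pop directional isolate

def isolate_rtl(text):
    """Wrap each strongly right-to-left character in LRI...PDI."""
    out = []
    for ch in text:
        # R covers Hebrew letters; AL covers Arabic letters.
        if unicodedata.bidirectional(ch) in ("R", "AL"):
            out.append(LRI + ch + PDI)
        else:
            out.append(ch)
    return "".join(out)

# First row of the Hebrew table, with the letters written as escapes.
row = "05d0 \u05d0 05d1 \u05d1 05d2 \u05d2 05d3 \u05d3\n"
wrapped = isolate_rtl(row)

# Dump the code points to compare with the listing above.
codes = [f"{ord(c):04x}" for c in wrapped]
print(" ".join(codes))
```

The dump should show each Hebrew code point sitting between 2066 and
2069, with the digits, hex letters, spaces, and newline left alone.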

I note that vim(1) here doesn't realise LRI and PDI are zero width
so the cursor position drifts past the end of the visible line.
ed(1) copes without a murmur.

-- 
Cheers, Ralph.

