[TUHS] Aleph Null in Software Practice & Experience.
Rob Pike
robpike at gmail.com
Mon May 2 20:03:08 AEST 2022
Under an option, maybe. I'm not a fan of putting invisible characters
into a program designed to translate numbers into cut-and-pasteable
text. Plus, as you said, it just makes other things break, although
perhaps they should be encouraged not to.
-rob
On Mon, May 2, 2022 at 7:56 PM Ralph Corderoy <ralph at inputplus.co.uk> wrote:
>
> Hi Rob,
>
> > The output of "unicode 5d0-5e7" (robpike.io/cmd/unicode has the
> > command) is fun.
> >
> > 05d0 א 05d1 ב 05d2 ג 05d3 ד
> > 05d4 ה 05d5 ו 05d6 ז 05d7 ח
> > 05d8 ט 05d9 י 05da ך 05db כ
> > 05dc ל 05dd ם 05de מ 05df ן
> > 05e0 נ 05e1 ס 05e2 ע 05e3 ף
> > 05e4 פ 05e5 ץ 05e6 צ 05e7 ק
> >
> > For comparison, here is "unicode 3d0-3e7". It will be fun to watch how
> > it's rendered.
> >
> > 03d0 ϐ 03d1 ϑ 03d2 ϒ 03d3 ϓ
> > 03d4 ϔ 03d5 ϕ 03d6 ϖ 03d7 ϗ
> > 03d8 Ϙ 03d9 ϙ 03da Ϛ 03db ϛ
> > 03dc Ϝ 03dd ϝ 03de Ϟ 03df ϟ
> > 03e0 Ϡ 03e1 ϡ 03e2 Ϣ 03e3 ϣ
> > 03e4 Ϥ 03e5 ϥ 03e6 Ϧ 03e7 ϧ
>
> In the terminal where I read and write email, they're all as if ‘0041 A’.
> But save the email's text/plain to foo.txt and foo.html, add a little HTML
> to foo.html, and the browser, here Firefox, presents the Hebrew in both as
>
> 05d0 05 אd1 05 בd2 05 גd3 ד
> 05d4 05 הd5 05 וd6 05 זd7 ח
> 05d8 05 טd9 05 יda 05 ךdb כ
> 05dc 05 לdd 05 םde 05 מdf ן
> 05e0 05 נe1 05 סe2 05 עe3 ף
> 05e4 05 פe5 05 ץe6 05 צe7 ק
>
> due to the mix of Unicode's strong, weak, and neutral bi-directional
> character types.
>
> To see what I intend above needs a ‘broken’ renderer, like a terminal.
> For those with more intelligent renderers, it's as if runes normally
> drawn as
>
> 00c0 À 00c1 Á 00c2 Â 00c3 Ã
>
> became
>
> 00c0 00 Àc1 00 Ác2 00 Âc3 Ã
>
> Wrapping each of the Hebrew characters in the text and HTML files in
> LRI...PDI,
>
> LRI U+2066 Left-to-right isolate
> PDI U+2069 Pop directional isolate
>
> so the first row becomes
>
> 0030 0035 0064 0030 0020 2066 05d0 2069 0020
> 0030 0035 0064 0031 0020 2066 05d1 2069 0020
> 0030 0035 0064 0032 0020 2066 05d2 2069 0020
> 0030 0035 0064 0033 0020 2066 05d3 2069 000a
>
> has Firefox display the tables as intended. Perhaps the unicode command
> should do this to ensure correct display, especially if some terminals
> ever start to improve?
>
> I note that vim(1) here doesn't realise LRI and PDI are zero width
> so the cursor position drifts past the end of the visible line.
> ed(1) copes without a murmur.
>
> --
> Cheers, Ralph.