UTF2(4)                     BSD Programmer's Manual                    UTF2(4)

NAME
     UTF2 - Universal character set Transformation Format encoding of runes

SYNOPSIS
     ENCODING "UTF2"

DESCRIPTION
     The UTF2 encoding is based on a proposed X-Open multibyte FSS-UCS-TF
     (File System Safe Universal Character Set Transformation Format) encoding
     as used in Plan 9 from Bell Labs.  Although it is capable of representing
     more than 16 bits, the current implementation is limited to 16 bits as
     defined by the Unicode Standard.

     UTF2 representation is backwards compatible with ASCII, so 0x00-0x7f
     refer to the ASCII character set.  The multibyte encoding of runes
     between 0x0080 and 0xffff consists entirely of bytes whose high order
     bit is set.  The actual encoding is represented by the following table:

     [0x0000 - 0x007f] [00000000.0bbbbbbb] -> 0bbbbbbb
     [0x0080 - 0x07ff] [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb
     [0x0800 - 0xffff] [bbbbbbbb.bbbbbbbb] -> 1110bbbb, 10bbbbbb, 10bbbbbb
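
     The table can be read as an encoding routine.  The following is a
     minimal sketch only, not the 4.4BSD library code; the name utf2_encode
     and its signature are assumptions made for illustration.  Because the
     two-byte form 110bbbbb, 10bbbbbb carries 11 payload bits, the sketch
     uses two bytes for values up to 0x07ff and three bytes from 0x0800.

```c
#include <assert.h>
#include <stddef.h>

/*
 * utf2_encode (hypothetical name): store the shortest UTF2 form of the
 * 16-bit rune r in buf, which must have room for 3 bytes.
 * Returns the number of bytes written (1, 2 or 3).
 */
size_t
utf2_encode(unsigned int r, unsigned char *buf)
{
	assert(buf != NULL);
	r &= 0xffff;			/* this sketch handles 16-bit runes only */

	if (r < 0x80) {			/* 0bbbbbbb */
		buf[0] = (unsigned char)r;
		return 1;
	}
	if (r < 0x800) {		/* 110bbbbb, 10bbbbbb */
		buf[0] = (unsigned char)(0xc0 | (r >> 6));
		buf[1] = (unsigned char)(0x80 | (r & 0x3f));
		return 2;
	}
	/* 1110bbbb, 10bbbbbb, 10bbbbbb */
	buf[0] = (unsigned char)(0xe0 | (r >> 12));
	buf[1] = (unsigned char)(0x80 | ((r >> 6) & 0x3f));
	buf[2] = (unsigned char)(0x80 | (r & 0x3f));
	return 3;
}
```

     Every byte written by the two- and three-byte branches has its high
     order bit set, so such output never collides with an ASCII byte.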

     If more than a single representation of a value exists (for example,
     0x00; 0xC0 0x80; 0xE0 0x80 0x80) the shortest representation is always
     used (but the longer ones will be correctly decoded).
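
     A decoder that behaves this way might look like the sketch below.  It
     is an illustration under stated assumptions, not the library
     implementation: the name utf2_decode, its signature, and the choice to
     return 0 for truncated or malformed input are all hypothetical.  Note
     that it makes no attempt to reject longer-than-necessary forms.

```c
#include <assert.h>
#include <stddef.h>

/*
 * utf2_decode (hypothetical name): decode one rune from the n bytes at s.
 * Returns the number of bytes consumed, or 0 if the sequence is truncated
 * or malformed.  Non-shortest forms are accepted, not rejected.
 */
size_t
utf2_decode(const unsigned char *s, size_t n, unsigned int *rune)
{
	assert(s != NULL && rune != NULL);

	if (n >= 1 && (s[0] & 0x80) == 0x00) {		/* 0bbbbbbb */
		*rune = s[0];
		return 1;
	}
	if (n >= 2 && (s[0] & 0xe0) == 0xc0 &&		/* 110bbbbb */
	    (s[1] & 0xc0) == 0x80) {
		*rune = ((s[0] & 0x1fu) << 6) | (s[1] & 0x3fu);
		return 2;
	}
	if (n >= 3 && (s[0] & 0xf0) == 0xe0 &&		/* 1110bbbb */
	    (s[1] & 0xc0) == 0x80 && (s[2] & 0xc0) == 0x80) {
		*rune = ((s[0] & 0x0fu) << 12) | ((s[1] & 0x3fu) << 6) |
		    (s[2] & 0x3fu);
		return 3;
	}
	return 0;	/* truncated, stray 10bbbbbb byte, or longer form */
}
```

     With this sketch the byte sequences 0x00, 0xC0 0x80, and 0xE0 0x80 0x80
     all decode to the rune 0x0000, consuming 1, 2, and 3 bytes respectively.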

     The final three encodings provided by X-Open:

     [00000000.000bbbbb.bbbbbbbb.bbbbbbbb] ->
             11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

     [000000bb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
             111110bb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

     [0bbbbbbb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
             1111110b, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

     which provide for the entire proposed ISO-10646 31-bit standard, are
     currently not implemented.
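
     Although the longer forms are not implemented, the lead byte of every
     form identifies the length of its sequence.  The sketch below only
     classifies a lead byte according to the six bit patterns given in this
     page; the name utf_seqlen is hypothetical and nothing here is taken
     from the 4.4BSD sources.

```c
#include <assert.h>

/*
 * utf_seqlen (hypothetical name): number of bytes in the sequence that
 * starts with lead byte c (0x00-0xff), or 0 if c is a continuation byte
 * (10bbbbbb) or matches none of the six lead-byte patterns.
 */
int
utf_seqlen(int c)
{
	assert(c >= 0 && c <= 0xff);

	if ((c & 0x80) == 0x00) return 1;	/* 0bbbbbbb */
	if ((c & 0xe0) == 0xc0) return 2;	/* 110bbbbb */
	if ((c & 0xf0) == 0xe0) return 3;	/* 1110bbbb */
	if ((c & 0xf8) == 0xf0) return 4;	/* 11110bbb */
	if ((c & 0xfc) == 0xf8) return 5;	/* 111110bb */
	if ((c & 0xfe) == 0xfc) return 6;	/* 1111110b */
	return 0;
}
```

     A caller limited to 16-bit runes could treat any result greater than 3
     as unsupported input.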

SEE ALSO
     mklocale(1),  setlocale(3)

4.4BSD                           June 4, 1993                                1