[TUHS] Discuss of style and design of computer programs from a user stand point

Michael Kjörling michael at kjorling.se
Sun May 7 23:54:07 AEST 2017


On 7 May 2017 11:42 +1000, from noel.hunt at gmail.com (Noel Hunt):
> I don't imagine it would be hard to re-write [uniq] to
> handle utf-8.

It does look like at least GNU coreutils 8.13 uniq is broken in that
regard, which frankly surprised me. That version isn't _that_ old.

    $ uniq --version
    uniq (GNU coreutils) 8.13
    Copyright (C) 2011 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    <http://gnu.org/licenses/gpl.html>.
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.

    Written by Richard M. Stallman and David MacKenzie.
    $ ( echo $'\u1234' ; echo $'\u2345' ; echo $'\u1234' )
    ሴ
    ⍅
    ሴ
    $ ( echo $'\u1234' ; echo $'\u2345' ; echo $'\u1234' ) | uniq
    ሴ
    $
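
A possible check, assuming the collapse comes from locale-aware line
comparison rather than from how uniq handles the bytes themselves (the
transcript above does not show which locale was active): rerun the
pipeline with LC_ALL=C set for uniq only, so that lines are compared
byte by byte. The helper name `sample` is made up for this sketch, and
the octal escapes are simply the UTF-8 encodings of U+1234 and U+2345.

```shell
# Emit three lines as raw UTF-8 bytes: U+1234, U+2345, U+1234.
# Octal escapes avoid relying on the shell's $'\u....' expansion,
# which depends on the shell and its locale.
sample() {
    printf '\341\210\264\n\342\215\205\n\341\210\264\n'
}

# With LC_ALL=C, uniq compares lines as plain byte strings, so no
# two adjacent lines here should be treated as duplicates.
sample | LC_ALL=C uniq
```

If all three lines come back here but not under the usual locale, that
points at the locale's collation data rather than at UTF-8 decoding.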

-- 
Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se
                 “People who think they know everything really annoy
                 those of us who know we don’t.” (Bjarne Stroustrup)


