    loplugin:salunicodeliteral · 5a4d0313
    Stephan Bergmann authored
    For the c-char in the u'...' literal, the preceding commits consistently use:
    
    * a simple-escape-sequence if the original code already used one
    * \0 for U+0000
    * the (\ escaped, for ' and \) source character matching U+0020..7E (even if it
      is not a basic source character)
    * a consistently four-digit hexadecimal-escape-sequence otherwise, \xNNNN
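
The rules above can be sketched as a small function. This is only an illustration, not code from the plugin: the name unicodeLiteral is hypothetical, upper-case hex digits are an assumption, and the first rule (keeping a simple-escape-sequence that the original code already used) depends on the original source text and is not modelled.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of loplugin:salunicodeliteral): spell a
// UTF-16 code unit as a u'...' literal following the listed rules.
std::string unicodeLiteral(char16_t c) {
    // U+0000 is written as \0.
    if (c == 0) {
        return "u'\\0'";
    }
    // U+0020..7E use the matching source character, with ' and \ escaped.
    if (c >= 0x20 && c <= 0x7E) {
        std::string s = "u'";
        if (c == u'\'' || c == u'\\') {
            s += '\\';
        }
        s += static_cast<char>(c);
        s += '\'';
        return s;
    }
    // Everything else gets a four-digit hexadecimal-escape-sequence.
    // Upper-case digits are an assumption of this sketch.
    char buf[16];
    std::snprintf(buf, sizeof buf, "u'\\x%04X'", static_cast<unsigned>(c));
    return buf;
}
```

Under these assumptions, unicodeLiteral(u'A') yields u'A' and unicodeLiteral(0x00E4) yields u'\x00E4'.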
    
    For non-surrogate code points, the last case could probably also use \uNNNN
    universal-character-names.  However, for one, it isn't quite clear to me whether
    conversion of such to members of the execution character set in character
    literals (in translation phase 5) is implementation-specific.  And for another,
    the current C++ standard references the dated (no pun intended) ISO/IEC
    10646-1:1993 specification, rather than the current ISO/IEC 10646:2014, and
    requires that a universal-character-name designate a character with a specific
    "character short name in ISO/IEC 10646", but I do not find a specification of a
    "short name" in ISO/IEC 10646:2014 and don't have access to ISO/IEC
    10646-1:1993, so am not sure whether that would e.g. cover noncharacters like
    U+FFFF.
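
As a hedged illustration of the two spellings discussed above (the constant names are hypothetical, and the check assumes a C++11 or later compiler in which a char16_t literal takes its code point value): for a non-surrogate code point such as U+00E4 both forms are expected to produce the same value, while a surrogate code unit can only be written with the hexadecimal escape.

```cpp
// Hypothetical constants, for illustration only.
constexpr char16_t hexSpelling = u'\x00E4'; // hexadecimal-escape-sequence
constexpr char16_t ucnSpelling = u'\u00E4'; // universal-character-name

// Expected to hold for a non-surrogate code point in the BMP.
static_assert(hexSpelling == ucnSpelling,
              "both spellings should denote the same code unit");

// A universal-character-name must not name a surrogate code point
// (u'\uD800' is ill-formed), so only the \x form is available here.
constexpr char16_t loneSurrogate = u'\xD800';
static_assert(loneSurrogate == 0xD800, "unexpected surrogate value");
```

This sketch does not settle the question about noncharacters such as U+FFFF, which it deliberately leaves out.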
    
    (The only exception is one occurrence of u'\x6C' in bestFitOpenSymbolToMSFont,
    filter/source/msfilter/util.cxx, where it is clear from the context that the
    value denotes neither a Unicode code point nor a UTF-16 code unit, but rather an
    index into the Wingdings font glyph table.)
    
    Change-Id: If36b94168428ba1e05977c370aceaa7e90131e90