Please avoid UTF-7

UTF-7 inherently has some of the security issues that concern people about encodings.  For example, by shifting in and out of base64 mode, one can create multiple representations of the same string, which enables spoofing and other problems.
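To make the multiple-representation problem concrete, here is a minimal sketch using Python's built-in `utf-7` codec.  The sample string `<b>` and the byte-level check are my own illustrative assumptions, not something taken from a real filter, but they show how text that looks harmless as raw bytes can decode to markup.

```python
# Three different UTF-7 byte sequences for the same text "<b>".
variants = [
    b"<b>",          # every character written directly as ASCII
    b"+ADw-b+AD4-",  # '<' and '>' shifted into base64, 'b' written directly
    b"+ADwAYgA+-",   # the whole string in a single base64 run
]

# All three decode to the identical string.
decoded = [v.decode("utf-7") for v in variants]

# A hypothetical filter that scans the raw bytes for "<" only sees the first form.
flagged = [b"<" in v for v in variants]

for raw, text, hit in zip(variants, decoded, flagged):
    print(raw, "->", repr(text), "| byte-level filter matched:", hit)
```

Any check that runs before decoding, or that assumes one canonical byte form per string, can be bypassed this way, which is one reason to prefer UTF-8 or UTF-16.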

UTF-7 is primarily interesting for legacy mail and NNTP applications that don't properly handle native or MIME-encoded UTF-8.  The need for new content to be encoded in UTF-7 is very low.  In particular, UTF-7 should be avoided on any modern system that is natively 8-bit.  For example, XML files have no inherent limitation that would force the use of UTF-7, so there should be no need for UTF-7 in XML files.

Of course, as with any general rule, there may be some exceptions, but I'd encourage you to support UTF-8 or UTF-16 and only use UTF-7 if you run into a system that can't support an 8-bit encoding.  If you run into such 7-bit limitations, you should probably take it as a warning that some redesign might be necessary.  For mail, this is being considered by the IETF's EAI working group at https://www.ietf.org/html.charters/eai-charter.html

Comments

  • Anonymous
    May 04, 2007
    In some cases MLang (on which MSXML6 depends) can add extra ? to decoded UTF-7 data, which can cause