UTF-7

UTF-7 (7-bit Unicode Transformation Format) is a variable-length character encoding that was proposed for representing Unicode-encoded text using a stream of ASCII characters, for example for use in MIME messages.

MIME requires that e-mail be sent as ASCII, so any message that directly uses an 8-bit or 16-bit Unicode encoding such as UTF-16 is invalid. Text encoded in UTF-7 can be sent in e-mail without a separate transfer encoding, but its character set must still be identified explicitly. In addition, when UTF-7 is used within e-mail headers such as "Subject:", it must be contained in MIME encoded-words that identify the character set. For these and other reasons, UTF-7 has been largely deprecated for e-mail in favor of UTF-8.

A modified form of UTF-7 is currently used in the IMAP e-mail retrieval protocol.

Table of contents
1 Description
2 Examples
3 External links

Description

UTF-7 was first specified in RFC 1642, A Mail-Safe Transformation Format of Unicode, which has since been obsoleted by RFC 2152.

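Most ASCII characters (those below 0x80 in hexadecimal notation) are encoded as-is; the exceptions are the + character and a few others, such as \ and ~, that RFC 2152 requires to be escaped. Any character at or above 0x80 is encoded as an escape sequence: a + byte, followed by the character's big-endian UTF-16 representation encoded in Modified Base64 (Base64 without the trailing = padding), and terminated either by a - byte (which is consumed) or by any other byte outside the Base64 alphabet, such as a carriage return or line feed (which is not consumed). A literal + character is encoded as +-.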
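
The rules above can be illustrated with a short Python sketch. The function name utf7_encode is hypothetical, and the code makes two simplifying assumptions: every ASCII character other than + is passed through unchanged (RFC 2152 is stricter about a few of them), and every escape sequence is closed with an explicit - byte, which the format permits but does not always require. It is not a full RFC 2152 implementation.

```python
import base64


def utf7_encode(text: str) -> str:
    """Encode text with the simplified UTF-7 rules described above (a sketch)."""
    out = []      # finished pieces of ASCII output
    pending = []  # run of non-ASCII characters waiting to be escaped

    def flush() -> None:
        # Emit the pending run as: '+', Modified Base64 of its big-endian
        # UTF-16 bytes (no '=' padding), then an explicit '-' terminator.
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii").rstrip("=")
            out.append("+" + b64 + "-")
            pending.clear()

    for ch in text:
        if ch == "+":
            flush()
            out.append("+-")    # a literal plus sign is written as "+-"
        elif ord(ch) < 0x80:
            flush()
            out.append(ch)      # assumption: all other ASCII is copied as-is
        else:
            pending.append(ch)  # collect consecutive non-ASCII characters
    flush()
    return "".join(out)


if __name__ == "__main__":
    print(utf7_encode("Hi Mom -\u263a-!"))  # U+263A is a smiley face
    print(utf7_encode("1 + 1"))
```

Consecutive non-ASCII characters are grouped into a single escape sequence so that the Base64 bit stream is shared between them. Python's built-in "utf-7" codec can decode the result, for example utf7_encode(s).encode("ascii").decode("utf-7").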

Examples

External links