Character encoding, at its simplest, is the method that maps binary data to its proper character equivalents. For example, in a standard U.S. English document, character code 65 is matched to a capital A. Most English fonts follow the American Standard Code for Information Interchange (ASCII) coding, so when a Web designer inserts a capital A, the designer can be reasonably sure that the user will see an A. There are, of course, plenty of caveats to that statement. The document must be encoded as English, the specified font must also be encoded as English, and the user agent must not interfere with either encoding.
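The mapping between code 65 and a capital A can be checked directly. The following is a minimal Python sketch, not part of the original text; the variable names are chosen here for illustration only.

```python
# Look up the numeric code for a character, and the character for a code.
code = ord("A")   # numeric code assigned to capital A
char = chr(65)    # character assigned to code 65

# Encoding turns text into bytes; decoding turns bytes back into text.
raw = "A".encode("ascii")            # one byte whose value is 65
text = bytes([65]).decode("ascii")   # the same byte read back as text

print(code, char, list(raw), text)
```

Because ASCII assigns each character one fixed code, the round trip from character to byte and back is unambiguous as long as both sides agree on the encoding.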

Note: Document encoding is typically passed to the user agent in the Content-Type HTTP header, such as the following:

Content-Type: text/html; charset=US-ASCII

Note that the charset value names a character encoding (for example, US-ASCII, ISO-8859-1, or UTF-8), not a language. Some user agents, however, don’t correctly handle encoding in the HTTP header. If you need to explicitly declare a document’s encoding, you should use an appropriate meta tag in your document, similar to the following:

<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
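To illustrate how a program on the receiving end might read the declared encoding, here is a small Python sketch that pulls the charset parameter out of a Content-Type value using the standard library's email.message.Message class. The helper name charset_from_content_type and the sample header values are assumptions made for this example; real user agents apply more elaborate detection rules.

```python
from email.message import Message


def charset_from_content_type(value, default=None):
    """Return the charset parameter of a Content-Type value, lowercased.

    Hypothetical helper for illustration; returns `default` when the
    header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset(default)


declared = charset_from_content_type("text/html; charset=US-ASCII")
missing = charset_from_content_type("text/html")

print(declared)  # the declared encoding name, in lowercase
print(missing)   # None, because nothing was declared
```

When the function returns None, the caller has no declared encoding to rely on and must fall back to a guess or a default.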

So what happens when any of the necessary pieces is different from what was intended? For example, what if your document is viewed in Japan, where the user agent's default font is Japanese instead of English? In those cases, the document encoding helps ensure that the right characters are used. Most fonts have international characters encoded in them as well as their native character set.

When a non-native encoding is specified, the user agent tries to use the appropriate characters in the appropriate font. If the appropriate characters cannot be found in the current font, alternate fonts can be used. However, none of this can be accomplished if the document does not declare its encoding. Without knowing the document encoding, the user agent simply uses the character that corresponds to the character position arriving in the data stream. For example, a capital A gets translated to whatever character is 65th in the font the user agent is using.
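The effect of a missing or mismatched encoding can be sketched in Python by decoding the same byte under several assumed encodings. This example models the problem at the level of character encodings rather than font glyph positions, and the byte value and the three encodings are chosen here purely for illustration.

```python
# One byte value, three possible readings depending on the assumed encoding.
raw = bytes([0xE9])

as_western = raw.decode("latin-1")     # ISO-8859-1 (Western European)
as_cyrillic = raw.decode("cp1251")     # Windows-1251 (Cyrillic)
as_greek = raw.decode("iso8859-7")     # ISO-8859-7 (Greek)

for name, ch in [("latin-1", as_western),
                 ("cp1251", as_cyrillic),
                 ("iso8859-7", as_greek)]:
    # Print the Unicode code point each interpretation produces.
    print(f"{name}: U+{ord(ch):04X}")

# Byte 65 falls in the ASCII range that all three encodings share,
# so it decodes to the same character under each of them.
shared = {bytes([65]).decode(enc) for enc in ("latin-1", "cp1251", "iso8859-7")}
```

The byte above 127 yields a different letter under each encoding, while byte 65 is a capital A in all three. This is why plain English text often survives a wrong guess, and text in other languages usually does not.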

