Producing web pages

Character set recognition

Character set recognition determines which character set names Internet Explorer recognises in the headers of HTTP responses and in the <meta> tag. It also determines which built-in character set translation each recognised name maps to.

Table of base charsets, display names, and aliases

In the following table, the base charset is the basic translation built into IE3. The Aliases column lists all other character set IDs that are recognised and can be represented without further translation, using the base charset's translation method. This does not always mean that the alias and the base charset are the same character set; the alias can be a subset of the base charset (us-ascii, for example, is a subset of 1252). A base charset ID is not itself a recognised name unless it is repeated in the Aliases column.

| Base charset | Display name | Aliases |
|---|---|---|
| 1252 | Western | us-ascii, iso8859-1, ascii, iso_8859-1, iso-8859-1, ANSI_X3.4-1968, iso-ir-6, ANSI_X3.4-1986, ISO_646.irv:1991, ISO646-US, us, IBM367, cp367, csASCII, latin1, iso_8859-1:1987, iso-ir-100, ibm819, cp819 |
| 28592 | Central European (ISO) | iso8859-2, iso-8859-2, iso_8859-2, latin2, iso_8859-2:1987, iso-ir-101, l2, csISOLatin2 |
| 1250 | Central European (Windows) | windows-1250, x-cp1250 |
| 1251 | Cyrillic (Windows) | windows-1251, x-cp1251 |
| 1253 | Greek (Windows) | windows-1253 |
| 1254 | Turkish (Windows) | windows-1254 |
| 932 | Shift-JIS | shift_jis, x-sjis, ms_Kanji, csShiftJIS |
| EUC-JP | EUC | Extended_UNIX_Code_Packed_Format_for_Japanese, csEUCPkdFmtJapanese, x-euc-jp |
| JIS | JIS | csISO2022JP, iso-2022-jp |
| 1257 | windows-1257 | |
| 950 | Traditional Chinese (BIG5) | big5, csbig5, x-x-big5 |
| 936 | Simplified Chinese | GB_2312-80, iso-ir-58, chinese, csISO58GB231280, csGB2312, gb2312 |
| 20866 | Cyrillic (KOI8-R) | csKOI8R, koi8-r |
| 949 | Korean | ks_c_5601, ks_c_5601-1987, korean, csKSC56011987 |
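The alias table can be read as a lookup from a declared name to a base charset. The following is a minimal sketch of that idea, not Internet Explorer's actual implementation: the names ALIASES, ALIAS_TO_BASE and base_charset are hypothetical, only a handful of rows from the table are included, and case-insensitive matching is an assumption.

```python
# Illustrative subset of the table above. ALIASES, ALIAS_TO_BASE and
# base_charset are hypothetical names, not part of Internet Explorer.
ALIASES = {
    "1252": ["us-ascii", "iso-8859-1", "latin1", "cp819"],
    "28592": ["iso-8859-2", "latin2", "csISOLatin2"],
    "1251": ["windows-1251", "x-cp1251"],
    "932": ["shift_jis", "x-sjis", "csShiftJIS"],
    "20866": ["koi8-r", "csKOI8R"],
}

# Invert the table once: lower-cased alias -> base charset.
# Assumption: names are matched without regard to case.
ALIAS_TO_BASE = {
    alias.lower(): base
    for base, aliases in ALIASES.items()
    for alias in aliases
}


def base_charset(name):
    """Return the base charset for a recognised alias, or None."""
    return ALIAS_TO_BASE.get(name.strip().lower())
```

In this sketch, base_charset("Windows-1251") gives "1251", while base_charset("1252") gives None, because a base charset ID is only a recognised name when it is also listed as an alias.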

Correct Usage

The correct usage is as specified in RFC 1341. For example:

<meta http-equiv="Content-Type" content="text/html; charset=Windows-1251">

Place this tag in the <head> (or earlier), and always before <body>.

Priority

Internet Explorer chooses between character set declarations in the following order of priority, highest first:

1. Any charset parameter passed in the HTTP Content-Type header

2. The charset given in the <meta> tag

3. The user's preference for default document encoding
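The priority order above amounts to "first declaration found wins". A minimal sketch, assuming the three sources have already been extracted as strings (or None when absent); the function name effective_charset and its parameters are hypothetical:

```python
def effective_charset(http_charset, meta_charset, user_default):
    """Pick a charset using the priority order described above.

    http_charset: charset parameter from the HTTP Content-Type header, or None
    meta_charset: charset from the <meta> tag, or None
    user_default: the user's default document encoding (always present)
    """
    for candidate in (http_charset, meta_charset):
        if candidate:  # skip None and empty strings
            return candidate
    return user_default
```

For example, effective_charset(None, "windows-1251", "iso-8859-1") returns "windows-1251", because no charset was sent in the HTTP header.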

A frameset can have different character sets per frame.

Position of <meta ... charset=...> in the page

The <meta ... charset...> sequence can appear anywhere in the document before the <body> tag. Wherever it appears, it affects the whole document, including content such as <title> that comes before the <meta ... charset...> tag.
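The positional rule can be illustrated by scanning a page for the first Content-Type <meta> tag that occurs before <body>. This is only a sketch using Python's standard html.parser module, not a description of how Internet Explorer scans a page: the names MetaCharsetFinder and find_meta_charset are hypothetical, and the input is assumed to be already-decoded text.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Record the first <meta http-equiv="Content-Type"> charset seen before <body>."""

    def __init__(self):
        super().__init__()
        self.charset = None
        self.seen_body = False

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.seen_body = True
            return
        if tag != "meta" or self.seen_body or self.charset is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "content-type":
            return
        # The content attribute looks like: text/html; charset=Windows-1251
        for part in (attrs.get("content") or "").split(";")[1:]:
            key, _, value = part.partition("=")
            if key.strip().lower() == "charset" and value.strip():
                self.charset = value.strip()
                break


def find_meta_charset(html_text):
    """Return the declared charset, or None if none appears before <body>."""
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset
```

In this sketch a declaration that follows <title> is still found, which mirrors the point that the declared charset applies to the whole document, while a declaration inside <body> is ignored.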