on the Curl Web Content Markup Language

on the Curl Web Content Markup and Programming Language from www.curl.com and www.curlap.com

Friday, August 5, 2011

Character encodings

I thought it would be useful to have this Curl 7.0 information in one place. On my Windows PC, the procedure returns over 100 CharEncoding objects of various types (103, to be exact).

The information is in two dumps: one by type from the debugger and one as terminal output. They are separated by a Curl comment below.

I have reordered the first dump slightly to group the types.

{get-all-character-encodings}  || 103 items in {Array-of CharEncoding}

type                            name                    Windows code page

NoneCharEncoding                "none-specified"
ShiftJISCharEncoding            "shift-jis"
EUCJPCharEncoding               "euc-jp"
UTF8CharEncoding                "utf8"
UTF8CharEncoding                "utf8-with-byte-marker"
UTF16CharEncoding               "ucs2-big-endian"
UTF16CharEncoding               "ucs2-little-endian"
UTF16UnknownEndianCharEncoding  "ucs2-unknown-endian"
SingleByteCharEncoding          "ascii"
SingleByteCharEncoding          "iso-latin-1"
MappedSingleByteCharEncoding    "windows-latin-1"
MappedSingleByteCharEncoding    "iso-latin-2"
MappedSingleByteCharEncoding    "iso-latin-3"
MappedSingleByteCharEncoding    "iso-latin-4"
MappedSingleByteCharEncoding    "iso-cyrillic"
MappedSingleByteCharEncoding    "iso-greek"
MappedSingleByteCharEncoding    "iso-latin-5"
MappedSingleByteCharEncoding    "iso-latin-6"
MappedSingleByteCharEncoding    "iso-latin-7"
MappedSingleByteCharEncoding    "iso-latin-8"
MappedSingleByteCharEncoding    "iso-latin-9"
MappedSingleByteCharEncoding    "windows-latin-2"
MappedSingleByteCharEncoding    "windows-cyrillic"
MappedSingleByteCharEncoding    "windows-greek"
MappedSingleByteCharEncoding    "windows-turkish"
MappedSingleByteCharEncoding    "windows-baltic"
MappedSingleByteCharEncoding    "koi8-r"
MappedSingleByteCharEncoding    "koi8-u"
MappedSingleByteCharEncoding    "dos-cyrillic"
HostEncoding                    "win32:21027"           21027
HostEncoding                    "win32:52936"           52936
HostEncoding                    "win32:50227"           50227
HostEncoding                    "iso-8859-11"           874
HostEncoding                    "win32:50220"           50220
HostEncoding                    "win32:10003"           10003
HostEncoding                    "win32:10007"           10007
HostEncoding                    "win32:10079"           10079
HostEncoding                    "win32:21866"           21866
HostEncoding                    "win32:1254"            1254
HostEncoding                    "win32:28605"           28605
HostEncoding                    "win32:28597"           28597
HostEncoding                    "win32:50229"           50229
HostEncoding                    "win32:28592"           28592
HostEncoding                    "win32:855"             855
HostEncoding                    "win32:10000"           10000
HostEncoding                    "win32:10008"           10008
HostEncoding                    "win32:51949"           51949
HostEncoding                    "win32:65000"           65000
HostEncoding                    "win32:10082"           10082
HostEncoding                    "win32:65001"           65001
HostEncoding                    "win32:28603"           28603
HostEncoding                    "gb2312"                936
HostEncoding                    "win32:1257"            1257
HostEncoding                    "win32:860"             860
HostEncoding                    "win32:20000"           20000
HostEncoding                    "win32:10006"           10006
HostEncoding                    "gb2312-80"             20936
HostEncoding                    "win32:10017"           10017
HostEncoding                    "win32:866"             866
HostEncoding                    "gb18030"               54936
HostEncoding                    "win32:37"              37
HostEncoding                    "win32:1253"            1253
HostEncoding                    "win32:10029"           10029
HostEncoding                    "win32:20949"           20949
HostEncoding                    "win32:1026"            1026
HostEncoding                    "win32:20127"           20127
HostEncoding                    "johab"                 1361
HostEncoding                    "win32:28599"           28599
HostEncoding                    "euc-kr"                949
HostEncoding                    "win32:863"             863
HostEncoding                    "win32:20932"           20932
HostEncoding                    "win32:1252"            1252
HostEncoding                    "win32:737"             737
HostEncoding                    "win32:28594"           28594
HostEncoding                    "win32:28591"           28591
HostEncoding                    "win32:20866"           20866
HostEncoding                    "win32:28595"           28595
HostEncoding                    "win32:875"             875
HostEncoding                    "win32:500"             500
HostEncoding                    "win32:20290"           20290
HostEncoding                    "win32:50225"           50225
HostEncoding                    "win32:10010"           10010
HostEncoding                    "win32:50222"           50222
HostEncoding                    "win32:20261"           20261
HostEncoding                    "win32:1251"            1251
HostEncoding                    "win32:861"             861
HostEncoding                    "win32:437"             437
HostEncoding                    "win32:869"             869
HostEncoding                    "windows-1255"          1255
HostEncoding                    "win32:10002"           10002
HostEncoding                    "win32:10081"           10081
HostEncoding                    "windows-1258"          1258
HostEncoding                    "win32:857"             857
HostEncoding                    "win32:50221"           50221
HostEncoding                    "win32:10001"           10001
HostEncoding                    "win32:775"             775
HostEncoding                    "win32:865"             865
HostEncoding                    "win32:932"             932
HostEncoding                    "win32:852"             852
HostEncoding                    "win32:1250"            1250
HostEncoding                    "windows-1256"          1256
HostEncoding                    "win32:850"             850
HostEncoding                    "big5"                  950

|# ------------------------------------------------------------------------ #|

name                    display-name

none-specified          None
ascii                   US ASCII
iso-latin-1             Western European (ISO)
windows-latin-1         Western European (Windows)
utf8                    Unicode (UTF-8)
utf8-with-byte-marker   utf8-with-byte-marker
ucs2-big-endian         Unicode (UTF-16BE)
ucs2-little-endian      Unicode (UTF-16LE)
ucs2-unknown-endian     Unicode (UTF-16)
iso-latin-2             Central European (ISO)
iso-latin-3             Southern European (ISO)
iso-latin-4             Northern European (ISO)
iso-cyrillic            Cyrillic (ISO)
iso-greek               Greek (ISO)
iso-latin-5             Turkish (ISO)
iso-latin-6             Nordic (ISO)
iso-latin-7             Baltic (ISO)
iso-latin-8             Celtic (ISO)
iso-latin-9             Latin 9 (ISO)
windows-latin-2         Central European (Windows)
windows-cyrillic        Cyrillic (Windows)
windows-greek           Greek (Windows)
windows-turkish         Turkish (Windows)
windows-baltic          Baltic (Windows)
koi8-r                  Cyrillic (KOI8-R)
koi8-u                  Cyrillic (KOI8-U)
dos-cyrillic            Cyrillic (DOS)
shift-jis               Japanese (Shift-JIS)
euc-jp                  Japanese (EUC)
win32:21027             21027 (Ext Alpha Lowercase)
win32:52936             52936 (HZ-GB2312 Simplified Chinese)
win32:50227             50227 (ISO-2022 Simplified Chinese)
iso-8859-11             Thai (ISO)
win32:50220             50220 (ISO-2022 Japanese with no halfwidth Katakana)
win32:10003             10003 (MAC - Korean)
win32:10007             10007 (MAC - Cyrillic)
win32:10079             10079 (MAC - Icelandic)
win32:21866             21866 (Ukrainian - KOI8-U)
win32:1254              1254  (ANSI - Turkish)
win32:28605             28605 (ISO 8859-15 Latin 9)
win32:28597             28597 (ISO 8859-7 Greek)
win32:50229             50229 (ISO-2022 Traditional Chinese)
win32:28592             28592 (ISO 8859-2 Central Europe)
win32:855               855   (OEM - Cyrillic)
win32:10000             10000 (MAC - Roman)
win32:10008             10008 (MAC - Simplified Chinese GB 2312)
win32:51949             51949 (EUC-Korean)
win32:65000             65000 (UTF-7)
win32:10082             10082 (MAC - Croatia)
win32:65001             65001 (UTF-8)
win32:28603             win32:28603
gb2312                  Chinese Simplified (GB2312)
win32:1257              1257  (ANSI - Baltic)
win32:860               860   (OEM - Portuguese)
win32:20000             20000 (CNS - Taiwan)
win32:10006             10006 (MAC - Greek I)
gb2312-80               Chinese Simplified (GB2312-80)
win32:10017             10017 (MAC - Ukraine)
win32:866               866   (OEM - Russian)
gb18030                 Chinese Simplified (GB18030)
win32:37                37    (IBM EBCDIC - U.S./Canada)
win32:1253              1253  (ANSI - Greek)
win32:10029             10029 (MAC - Latin II)
win32:20949             win32:20949
win32:1026              1026  (IBM EBCDIC - Turkish (Latin-5))
win32:20127             20127 (US-ASCII)
johab                   Korean (Johab)
win32:28599             28599 (ISO 8859-9 Latin 5)
euc-kr                  Korean (EUC)
win32:863               863   (OEM - Canadian French)
win32:20932             20932 (JIS X 0208-1990 & 0212-1990)
win32:1252              1252  (ANSI - Latin I)
win32:737               737   (OEM - Greek 437G)
win32:28594             28594 (ISO 8859-4 Baltic)
win32:28591             28591 (ISO 8859-1 Latin I)
win32:20866             20866 (Russian - KOI8)
win32:28595             28595 (ISO 8859-5 Cyrillic)
win32:875               875   (IBM EBCDIC - Modern Greek)
win32:500               500   (IBM EBCDIC - International)
win32:20290             20290 (IBM EBCDIC - Japanese Katakana Extended)
win32:50225             50225 (ISO-2022 Korean)
win32:10010             10010 (MAC - Romania)
win32:50222             50222 (ISO-2022 Japanese JIS X 0201-1989)
win32:20261             20261 (T.61)
win32:1251              1251  (ANSI - Cyrillic)
win32:861               861   (OEM - Icelandic)
win32:437               437   (OEM - United States)
win32:869               869   (OEM - Modern Greek)
windows-1255            Hebrew (Windows)
win32:10002             10002 (MAC - Traditional Chinese Big5)
win32:10081             10081 (MAC - Turkish)
windows-1258            Vietnamese (Windows)
win32:857               857   (OEM - Turkish)
win32:50221             50221 (ISO-2022 Japanese with halfwidth Katakana)
win32:10001             10001 (MAC - Japanese)
win32:775               775   (OEM - Baltic)
win32:865               865   (OEM - Nordic)
win32:932               932   (ANSI/OEM - Japanese Shift-JIS)
win32:852               852   (OEM - Latin II)
win32:1250              1250  (ANSI - Central Europe)
windows-1256            Arabic (Windows)
win32:850               850   (OEM - Multilingual Latin I)
big5                    Chinese Traditional (Big5)
|| -------------------------------------------------------------
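
For anyone who wants to reproduce a listing like the terminal output above on their own machine, a short loop over the result of {get-all-character-encodings} is probably enough. The following is only a sketch, not the exact code used for this post. It assumes that each CharEncoding exposes `name` and `display-name` accessors (the column headings suggest this, but check the API documentation), and that the procedure is visible in an applet without an extra import.

```curl
{curl 7.0 applet}

|| Sketch only. Assumptions: CharEncoding has `name` and `display-name`
|| accessors, and get-all-character-encodings needs no extra import.
{for enc in {get-all-character-encodings} do
    || One line per encoding: internal name, then the human-readable name.
    {output enc.name & "    " & enc.display-name}
}
```

The set of HostEncoding entries comes from the host operating system, so the output on another machine will likely differ from the list above.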