iconv_unicode (5)

>> iconv_unicode (5) ( Solaris man: File formats )

NAME

iconv_unicode - code set conversion tables for Unicode

DESCRIPTION

The following code set conversions are supported:

                    CODE SET CONVERSIONS SUPPORTED
                   ------------------------------
FROM Code Set           FROM Filename     TO Code Set         TO Filename
                        Element                               Element
    
ISO 8859-1 (Latin 1)    8859-1            UTF-8               UTF-8
ISO 8859-2 (Latin 2)    8859-2            UTF-8               UTF-8
ISO 8859-3 (Latin 3)    8859-3            UTF-8               UTF-8
ISO 8859-4 (Latin 4)    8859-4            UTF-8               UTF-8
ISO 8859-5 (Cyrillic)   8859-5            UTF-8               UTF-8
ISO 8859-6 (Arabic)     8859-6            UTF-8               UTF-8
ISO 8859-7 (Greek)      8859-7            UTF-8               UTF-8
ISO 8859-8 (Hebrew)     8859-8            UTF-8               UTF-8
ISO 8859-9 (Latin 5)    8859-9            UTF-8               UTF-8
ISO 8859-10 (Latin 6)   8859-10           UTF-8               UTF-8
Japanese EUC            eucJP             UTF-8               UTF-8
Chinese/PRC EUC         gb2312            UTF-8               UTF-8
(GB 2312-1980)
ISO-2022                iso2022           UTF-8               UTF-8
Korean EUC              ko_KR-euc         Korean UTF-8        ko_KR-UTF-8
ISO-2022-KR             ko_KR-iso2022-7   Korean UTF-8        ko_KR-UTF-8
Korean Johap            ko_KR-johap       Korean UTF-8        ko_KR-UTF-8
(KS C 5601-1987)
Korean Johap            ko_KR-johap92     Korean UTF-8        ko_KR-UTF-8
(KS C 5601-1992)
Korean UTF-8            ko_KR-UTF-8       Korean EUC          ko_KR-euc
Korean UTF-8            ko_KR-UTF-8       Korean Johap        ko_KR-johap
                                         (KS C 5601-1987)        
Korean UTF-8            ko_KR-UTF-8       Korean Johap        ko_KR-johap92
                                         (KS C 5601-1992)
KOI8-R (Cyrillic)       KOI8-R            UCS-2               UCS-2
KOI8-R (Cyrillic)       KOI8-R            UTF-8               UTF-8
PC Kanji (SJIS)         PCK               UTF-8               UTF-8
PC Kanji (SJIS)         SJIS              UTF-8               UTF-8
UCS-2                   UCS-2             KOI8-R (Cyrillic)   KOI8-R
UCS-2                   UCS-2             UCS-4               UCS-4

UCS-2              UCS-2           UTF-7                   UTF-7
UCS-2              UCS-2           UTF-8                   UTF-8
UCS-4              UCS-4           UCS-2                   UCS-2
UCS-4              UCS-4           UTF-16                  UTF-16
UCS-4              UCS-4           UTF-7                   UTF-7
UCS-4              UCS-4           UTF-8                   UTF-8
UTF-16             UTF-16          UCS-4                   UCS-4
UTF-16             UTF-16          UTF-8                   UTF-8
UTF-7              UTF-7           UCS-2                   UCS-2
UTF-7              UTF-7           UCS-4                   UCS-4
UTF-7              UTF-7           UTF-8                   UTF-8
UTF-8              UTF-8           ISO 8859-1 (Latin 1)    8859-1
UTF-8              UTF-8           ISO 8859-2 (Latin 2)    8859-2
UTF-8              UTF-8           ISO 8859-3 (Latin 3)    8859-3
UTF-8              UTF-8           ISO 8859-4 (Latin 4)    8859-4
UTF-8              UTF-8           ISO 8859-5 (Cyrillic)   8859-5
UTF-8              UTF-8           ISO 8859-6 (Arabic)     8859-6
UTF-8              UTF-8           ISO 8859-7 (Greek)      8859-7
UTF-8              UTF-8           ISO 8859-8 (Hebrew)     8859-8
UTF-8              UTF-8           ISO 8859-9 (Latin 5)    8859-9
UTF-8              UTF-8           ISO 8859-10 (Latin 6)   8859-10
UTF-8              UTF-8           Japanese EUC            eucJP
UTF-8              UTF-8           Chinese/PRC EUC         gb2312
                                  (GB 2312-1980)
UTF-8              UTF-8           ISO-2022                iso2022
UTF-8              UTF-8           KOI8-R (Cyrillic)       KOI8-R
UTF-8              UTF-8           PC Kanji (SJIS)         PCK
UTF-8              UTF-8           PC Kanji (SJIS)         SJIS
UTF-8              UTF-8           UCS-2                   UCS-2
UTF-8              UTF-8           UCS-4                   UCS-4
UTF-8              UTF-8           UTF-16                  UTF-16
UTF-8              UTF-8           UTF-7                   UTF-7
UTF-8              UTF-8           Chinese/PRC EUC         zh_CN.euc
                                  (GB 2312-1980)

UTF-8                 UTF-8             ISO 2022-CN           zh_CN.iso2022-7
UTF-8                 UTF-8             Chinese/Taiwan Big5   zh_TW-big5
UTF-8                 UTF-8             Chinese/Taiwan EUC    zh_TW-euc
                                       (CNS 11643-1992)
UTF-8                 UTF-8             ISO 2022-TW           zh_TW-iso2022-7
Chinese/PRC EUC       zh_CN.euc         UTF-8                 UTF-8
(GB 2312-1980)
ISO 2022-CN           zh_CN.iso2022-7   UTF-8                 UTF-8
Chinese/Taiwan Big5   zh_TW-big5        UTF-8                 UTF-8
Chinese/Taiwan EUC    zh_TW-euc         UTF-8                 UTF-8
(CNS 11643-1992)
ISO 2022-TW           zh_TW-iso2022-7   UTF-8                 UTF-8
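
The tables list direct conversions only. Two non-Unicode code sets, such as 8859-5 and KOI8-R, have no entry of their own, but each has an entry to and from UTF-8. The sketch below illustrates how a caller might look for a direct conversion and otherwise fall back to one intermediate code set. It is a hedged illustration, not part of the iconv library: the `SUPPORTED` set is a small hand-copied subset of the tables, and `find_route` is a hypothetical helper name.

```python
# Illustrative subset of the (FROM, TO) filename-element pairs listed in the
# tables above. A complete set would contain every row.
SUPPORTED = {
    ("8859-5", "UTF-8"),
    ("KOI8-R", "UTF-8"),
    ("KOI8-R", "UCS-2"),
    ("UTF-8", "8859-5"),
    ("UTF-8", "KOI8-R"),
    ("UTF-8", "UCS-2"),
    ("UCS-2", "KOI8-R"),
    ("UCS-2", "UTF-8"),
}


def find_route(src, dst, supported=SUPPORTED):
    """Return a list of (FROM, TO) steps leading from src to dst, or None.

    A direct conversion is preferred; otherwise a single intermediate
    code set is tried. (Hypothetical helper, not an iconv API.)
    """
    if (src, dst) in supported:
        return [(src, dst)]
    # Try every code set reachable from src in one step as an intermediate.
    for frm, mid in sorted(supported):
        if frm == src and (mid, dst) in supported:
            return [(src, mid), (mid, dst)]
    return None


if __name__ == "__main__":
    print(find_route("KOI8-R", "UTF-8"))   # direct conversion
    print(find_route("8859-5", "KOI8-R"))  # chained through UTF-8
```

Each step of a returned route corresponds to one row of the tables, so a two-step route means two separate conversions.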

EXAMPLES

Example 1 The library module filename

In the conversion library, /usr/lib/iconv (see iconv(3C)), the library module filename is composed of two symbolic elements separated by the percent sign (%). The first symbol specifies the code set that is being converted; the second symbol specifies the target code, that is, the code set to which the first one is being converted.

In the conversion table above, the first symbol is termed the "FROM Filename Element". The second symbol, representing the target code set, is the "TO Filename Element".

For example, the library module filename to convert from the Korean EUC code set to the Korean UTF-8 code set is:

ko_KR-euc%ko_KR-UTF-8
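
The naming rule above can be sketched in a few lines of Python. This is only an illustration of the FROM%TO convention: the helper names are hypothetical, and the `.so` suffix is an assumption inferred from the FILES entry `/usr/lib/iconv/*.so` rather than something the example states.

```python
import posixpath

ICONV_DIR = "/usr/lib/iconv"  # conversion library directory named in this page


def module_filename(from_element, to_element):
    """Join the FROM and TO filename elements with a percent sign."""
    return f"{from_element}%{to_element}"


def module_path(from_element, to_element, suffix=".so"):
    """Build a full module path.

    The ".so" suffix is an assumption based on the FILES section.
    """
    name = module_filename(from_element, to_element) + suffix
    return posixpath.join(ICONV_DIR, name)


def split_module_filename(name):
    """Split a FROM%TO module filename back into its two elements."""
    from_element, sep, to_element = name.partition("%")
    if not sep or not from_element or not to_element:
        raise ValueError(f"not a FROM%TO module filename: {name!r}")
    return from_element, to_element


if __name__ == "__main__":
    print(module_filename("ko_KR-euc", "ko_KR-UTF-8"))
    print(module_path("ko_KR-euc", "ko_KR-UTF-8"))
```

The first call reproduces the filename shown above; `split_module_filename` recovers the two elements from such a name.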

FILES

/usr/lib/iconv/*.so

: conversion modules

NOTES

ISO 8859 character sets using Latin alphabetic characters are distinguished as follows:

ISO 8859-1 (Latin 1)

For most West European languages, including:

Albanian    Finnish    Italian
Catalan     French     Norwegian
Danish
Dutch
English
Faeroese

ISO 8859-2 (Latin 2)

For most Latin-written Slavic and Central European languages:

Czech        Polish      Slovak
German       Rumanian    Slovene
Hungarian

ISO 8859-3 (Latin 3)

: Popularly used for Esperanto, Galician, Maltese, and Turkish.

ISO 8859-4 (Latin 4)

: Introduces letters for Estonian, Latvian, and Lithuanian. It is an incomplete predecessor of ISO 8859-10 (Latin 6).

ISO 8859-9 (Latin 5)

: Replaces the rarely needed Icelandic letters in ISO 8859-1 (Latin 1) with the Turkish ones.

ISO 8859-10 (Latin 6)

: Adds the remaining Inuit (Greenlandic) and Sami (Lappish) letters that were missing from ISO 8859-4 (Latin 4), completing coverage of the Nordic area.
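
The note on ISO 8859-9 (Latin 5) can be checked by decoding every byte value under both code sets and listing the positions that differ. The sketch below uses Python's built-in codecs, whose names (`iso8859_1`, `iso8859_9`) are Python's own and not the Solaris filename elements; `differing_bytes` is a hypothetical helper.

```python
def differing_bytes(codec_a, codec_b):
    """Map each byte value that decodes differently to its two characters."""
    diffs = {}
    for value in range(0x100):
        raw = bytes([value])
        # errors="replace" keeps the loop going if a codec leaves a byte undefined.
        char_a = raw.decode(codec_a, errors="replace")
        char_b = raw.decode(codec_b, errors="replace")
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


if __name__ == "__main__":
    for value, (latin1, latin5) in sorted(differing_bytes("iso8859_1", "iso8859_9").items()):
        print(f"0x{value:02X}: Latin 1 {latin1!r}  Latin 5 {latin5!r}")
```

With Python's codecs, the positions reported are the Icelandic letters of Latin 1 (such as eth and thorn) and the Turkish letters that take their place in Latin 5.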
