The OpenNET Project / Index page

[ новости /+++ | форум | теги | ]

Интерактивная система просмотра системных руководств (man-ов)

 ТемаНаборКатегория 
 
 [Cписок руководств | Печать]

iconv_unicode (5)
  • >> iconv_unicode (5) ( Solaris man: Форматы файлов )
  •  

    NAME

    iconv_unicode - code set conversion tables for Unicode
     
    

    DESCRIPTION

    The following code set conversions are supported:

                        CODE SET CONVERSIONS SUPPORTED
                       ------------------------------
     FROM Code Set                               TO Code Set
         Code              FROM          Target Code            TO
                           Filename                             Filename
                           Element                              Element
        
    ISO 8859-1 (Latin 1)    8859-1            UTF-8               UTF-8
    ISO 8859-2 (Latin 2)    8859-2            UTF-8               UTF-8
    ISO 8859-3 (Latin 3)    8859-3            UTF-8               UTF-8
    ISO 8859-4 (Latin 4)    8859-4            UTF-8               UTF-8
    ISO 8859-5 (Cyrillic)   8859-5            UTF-8               UTF-8
    ISO 8859-6 (Arabic)     8859-6            UTF-8               UTF-8
    ISO 8859-7 (Greek)      8859-7            UTF-8               UTF-8
    ISO 8859-8 (Hebrew)     8859-8            UTF-8               UTF-8
    ISO 8859-9 (Latin 5)    8859-9            UTF-8               UTF-8
    ISO 8859-10 (Latin 6)   8859-10           UTF-8               UTF-8
    Japanese EUC            eucJP             UTF-8               UTF-8
    Chinese/PRC EUC
    (GB 2312-1980)          gb2312            UTF-8               UTF-8
    ISO-2022                iso2022           UTF-8               UTF-8
    Korean EUC              ko_KR-euc         Korean UTF-8        ko_KR-UTF-8
    ISO-2022-KR             ko_KR-iso2022-7   Korean UTF-8        ko_KR_UTF-8
    Korean Johap
    (KS C 5601-1987)        ko_KR-johap       Korean UTF-8        ko_KR-UTF-8
    Korean Johap
    (KS C 5601-1992)        ko_KR-johap92     Korean UTF-8        ko_KR-UTF-8
    Korean UTF-8            ko_KR-UTF-8       Korean EUC          ko_KR-euc
    Korean UTF-8            ko_KR-UTF-8       Korean Johap        ko_KR-johap
                                             (KS C 5601-1987)        
    Korean UTF-8            ko_KR-UTF-8       Korean Johap        ko_KR-johap92
                                             (KS C 5601-1992)
    KOI8-R (Cyrillic)       KOI8-R            UCS-2               UCS-2
    KOI8-R (Cyrillic)       KOI8-R            UTF-8               UTF-8
    PC Kanji (SJIS)         PCK               UTF-8               UTF-8
    PC Kanji (SJIS)         SJIS              UTF-8               UTF-8
    UCS-2                   UCS-2             KOI8-R (Cyrillic)   KOI8-R
    UCS-2                   UCS-2             UCS-4               UCS-4
    

                        CODE SET CONVERSIONS SUPPORTED
                       ------------------------------
     FROM Code Set                               TO Code Set
         Code              FROM          Target Code            TO
                           Filename                             Filename
                           Element                              Element
       
    UCS-2              UCS-2           UTF-7                   UTF-7
    UCS-2              UCS-2           UTF-8                   UTF-8
    UCS-4              UCS-4           UCS-2                   UCS-2
    UCS-4              UCS-4           UTF-16                  UTF-16
    UCS-4              UCS-4           UTF-7                   UTF-7
    UCS-4              UCS-4           UTF-8                   UTF-8
    UTF-16             UTF-16          UCS-4                   UCS-4
    UTF-16             UTF-16          UTF-8                   UTF-8
    UTF-7              UTF-7           UCS-2                   UCS-2
    UTF-7              UTF-7           UCS-4                   UCS-4
    UTF-7              UTF-7           UTF-8                   UTF-8
    UTF-8              UTF-8           ISO 8859-1 (Latin 1)    8859-1
    UTF-8              UTF-8           ISO 8859-2 (Latin 2)    8859-2
    UTF-8              UTF-8           ISO 8859-3 (Latin 3)    8859-3
    UTF-8              UTF-8           ISO 8859-4 (Latin 4)    8859-4
    UTF-8              UTF-8           ISO 8859-5 (Cyrillic)   8859-5
    UTF-8              UTF-8           ISO 8859-6 (Arabic)     8859-6
    UTF-8              UTF-8           ISO 8859-7 (Greek)      8859-7
    UTF-8              UTF-8           ISO 8859-8 (Hebrew)     8859-8
    UTF-8              UTF-8           ISO 8859-9 (Latin 5)    8859-9
    UTF-8              UTF-8           ISO 8859-10 (Latin 6)   8859-10
    UTF-8              UTF-8           Japanese EUC            eucJP
    UTF-8              UTF-8           Chinese/PRC EUC         gb2312
                                      (GB 2312-1980)
    UTF-8              UTF-8           ISO-2022                iso2022
    UTF-8              UTF-8           KOI8-R (Cyrillic)       KOI8-R
    UTF-8              UTF-8           PC Kanji (SJIS)         PCK
    UTF-8              UTF-8           PC Kanji (SJIS)         SJIS
    UTF-8              UTF-8           UCS-2                   UCS-2
    UTF-8              UTF-8           UCS-4                   UCS-4
    UTF-8              UTF-8           UTF-16                  UTF-16
    UTF-8              UTF-8           UTF-7                   UTF-7
    UTF-8              UTF-8           Chinese/PRC EUC         zh_CN.euc
                                      (GB 2312-1980)
    

                        CODE SET CONVERSIONS SUPPORTED
                       ------------------------------
     FROM Code Set                               TO Code Set
         Code              FROM          Target Code            TO
                           Filename                             Filename
                           Element                              Element
       
    UTF-8                 UTF-8             ISO 2022-CN           zh_CN.iso2022-7
    UTF-8                 UTF-8             Chinese/Taiwan Big5   zh_TW-big5
    UTF-8                 UTF-8             Chinese/Taiwan  EUC   zh_TW-euc
                                           (CNS 11643-1992)
    UTF-8                 UTF-8             ISO 2022-TW           zh_TW-iso2022-7
    Chinese/PRC EUC       zh_CN.euc         UTF-8                 UTF-8
    (GB 2312-1980)
    ISO 2022-CN           zh_CN.iso2022-7   UTF-8                 UTF-8
    Chinese/Taiwan Big5   zh_TW-big5        UTF-8                 UTF-8
    Chinese/Taiwan  EUC   zh_TW-euc         UTF-8                 UTF-8
    (CNS 11643-1992)
    ISO 2022-TW           zh_TW-iso2022-7   UTF-8                 UTF-8
    

     

    EXAMPLES

    Example 1 The library module filename

    In the conversion library, /usr/lib/iconv (see iconv(3C)), the library module filename is composed of two symbolic elements separated by the percent sign (%). The first symbol specifies the code set that is being converted; the second symbol specifies the target code, that is, the code set to which the first one is being converted.

    In the conversion table above, the first symbol is termed the "FROM Filename Element". The second symbol, representing the target code set, is the "TO Filename Element".

    For example, the library module filename to convert from the Korean EUC code set to the Korean UTF-8 code set is

    ko_KR-euc%ko_KR-UTF-8

     

    FILES

    /usr/lib/iconv/*.so

    conversion modules

     

    SEE ALSO

    iconv(1), iconv(3C), iconv(5)

    Chernov, A., Registration of a Cyrillic Character Set, RFC 1489, RELCOM Development Team, July 1993.

    Chon, K., H. Je Park, and U. Choi, Korean Character Encoding for Internet Messages, RFC 1557, Solvit Chosun Media, December 1993.

    Goldsmith, D., and M. Davis, UTF-7 - A Mail-Safe Transformation Format of Unicode, RFC 1642, Taligent, Inc., July 1994.

    Lee, F., HZ - A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and ASCII characters, RFC 1843, Stanford University, August 1995.

    Murai, J., M. Crispin, and E. van der Poel, Japanese Character Encoding for Internet Messages, RFC 1468, Keio University, Panda Programming, June 1993.

    Nussbacher, H., and Y. Bourvine, Hebrew Character Encoding for Internet Messages, RFC 1555, Israeli Inter-University, Hebrew University, December 1993.

    Ohta, M., Character Sets ISO-10646 and ISO-10646-J-1, RFC 1815, Tokyo Institute of Technology, July 1995.

    Ohta, M., and K. Handa, ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP, RFC 1554, Tokyo Institute of Technology, December 1993.

    Reynolds, J., and J. Postel, ASSIGNED NUMBERS, RFC 1700, University of Southern California/Information Sciences Institute, October 1994.

    Simonson, K., Character Mnemonics & Character Sets, RFC 1345, Rationel Almen Planlaegning, June 1992.

    Spinellis, D., Greek Character Encoding for Electronic Mail Messages, RFC 1947, SENA S.A., May 1996.

    The Unicode Consortium, The Unicode Standard, Version 2.0, Addison Wesley Developers Press, July 1996.

    Wei, Y., Y. Zhang, J. Li, J. Ding, and Y. Jiang, ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages, RFC 1842, AsiaInfo Services Inc., Harvard University, Rice University, University of Maryland, August 1995.

    Yergeau, F., UTF-8, a transformation format of Unicode and ISO 10646, RFC 2044, Alis Technologies, October 1996.

    Zhu, H., D. Hu, Z. Wang, T. Kao, W. Chang, and M. Crispin, Chinese Character Encoding for Internet Messages, RFC 1922, Tsinghua University, China Information Technology Standardization Technical Committee (CITS), Institute for Information Industry (III), University of Washington, March 1996.  

    NOTES

    ISO 8859 character sets using Latin alphabetic characters are distinguished as follows:

    ISO 8859-1 (Latin 1)

    For most West European languages, including:

    AlbanianFinnishItalian
    CatalanFrenchNorwegian
    Danish
    Dutch
    English
    Faeroese

    ISO 8859-2 (Latin 2)

    For most Latin-written Slavic and Central European languages:

    CzechPolishSlovak
    GermanRumanianSlovene
    Hungarian

    ISO 8859-3 (Latin 3)

    Popularly used for Esperanto, Galician, Maltese, and Turkish.

    ISO 8859-4 (Latin 4)

    Introduces letters for Estonian, Latvian, and Lithuanian. It is an incomplete predecessor of ISO 8859-10 (Latin 6).

    ISO 8859-9 (Latin 5)

    Replaces the rarely needed Icelandic letters in ISO 8859-1 (Latin 1) with the Turkish ones.

    ISO 8859-10 (Latin 6)

    Adds the last Inuit (Greenlandic) and Sami (Lappish) letters that were not included in ISO 8859-4 (Latin 4) to complete coverage of the Nordic area.


     

    Index

    NAME
    DESCRIPTION
    EXAMPLES
    FILES
    SEE ALSO
    NOTES


    Поиск по тексту MAN-ов: 




    Партнёры:
    PostgresPro
    Inferno Solutions
    Hosting by Hoster.ru
    Хостинг:

    Закладки на сайте
    Проследить за страницей
    Created 1996-2024 by Maxim Chirkov
    Добавить, Поддержать, Вебмастеру