eucTW man page on DigitalUNIX

eucTW(5)							      eucTW(5)

NAME
       eucTW - A character encoding system (codeset) for Traditional Chinese

DESCRIPTION
       The Taiwanese EUC (Extended UNIX Code), or eucTW, codeset consists of
       the following character sets:

              ASCII

              CNS 11643 (Plane 1 to Plane 16)

       Taiwanese EUC uses a combination of single-byte data and 2-byte data to
       represent ASCII characters, symbols, and ideographic characters.
       Because CNS 11643 defines many character planes, Taiwanese EUC uses
       different leading codes to designate the different planes.

       ASCII characters are represented as single-byte, 7-bit data in
       Taiwanese EUC; that is, the most significant bit (MSB) of a byte that
       represents an ASCII character is always off (0). For more information,
       refer to ascii(5).

       Although the standard Taiwanese EUC codeset includes all characters
       defined by the CNS 11643-1992 standard, the operating system's eucTW
       implementation currently supports the following:

              Characters defined in the first and second planes of CNS 11643

              The EDPC Recommended Character Set (refer to dechanyu(5) for
              more information)

              CNS 11643-1986 and DTSCS characters that have been remapped into
              the third and fourth character planes by the CNS 11643-1992
              standard

       Characters that were added to CNS  11643-1986  by  the  CNS  11643-1992
       standard are not supported.

       The  characters	that  are  defined  in	plane  1  and  plane  2 of CNS
       11643-1992 and that are the same as those defined in CNS 11643-1986 are
       as follows:

       ─────────────────────────────────────────────────────────────────────────
       Character Plane   Character Type                     Number of Characters
       ─────────────────────────────────────────────────────────────────────────
       1                 Special characters                 651
                         Control characters                 33
                         Frequently-used characters         5401
       2                 Less frequently-used characters    7650
       ─────────────────────────────────────────────────────────────────────────

       The characters defined in plane 3 and plane 4 of CNS 11643-1992 are  as
       follows:

       ──────────────────────────────────────────────────────────────────────────────
       Character Plane   Character Type                          Number of Characters
       ──────────────────────────────────────────────────────────────────────────────
       3                 Rarely-used characters (EDPC Part I)    6148
       4                 Used for residency system, ISO 2nd      7298
                         edition DIS 10646 Han characters,
                         171 EDPC Part II Characters
       ──────────────────────────────────────────────────────────────────────────────

       The characters that have been remapped into the third and fourth	 char‐
       acter planes of CNS 11643-1992 as specified by the EDPC are as follows:

       ─────────────────────────────────────────────────────────
       EDPC Characters	 Character Plane   Number of Characters
       ─────────────────────────────────────────────────────────
       Part I		 Plane 3	   6148
       Part II		 Plane 4	   171
       ─────────────────────────────────────────────────────────

   Taiwanese EUC Encoding
       Except for characters in the first plane of CNS 11643-1986, Taiwanese
       EUC uses a leading code (the 8-bit Single-Shift 2 (SS2) control
       character followed by an additional byte) to assign each character to
       a character plane.

       The position of a character on a plane is specified by two bytes. The
       first byte determines the character's row number and the second byte
       determines the character's column number. The MSB of both bytes is
       on (1).
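
       The row and column mapping can be sketched in a few lines of Python.
       This is only an illustration, not the operating system's
       implementation. It assumes that rows and columns are numbered 1 to 94
       and that each number is offset by hexadecimal A0, which is consistent
       with the A1A1 - FEFE code range shown in the encoding table. The
       function names are hypothetical.

```python
def position_to_bytes(row: int, column: int) -> bytes:
    """Encode a 1-based (row, column) position as two bytes with the MSB on."""
    for name, value in (("row", row), ("column", column)):
        if not 1 <= value <= 94:
            raise ValueError(f"{name} must be in 1..94, got {value}")
    # Assumption: offset 0xA0, so 1..94 maps onto 0xA1..0xFE (MSB always on).
    return bytes([0xA0 + row, 0xA0 + column])


def bytes_to_position(pair: bytes) -> tuple[int, int]:
    """Decode two MSB-on bytes back into a 1-based (row, column) position."""
    if len(pair) != 2 or not all(0xA1 <= b <= 0xFE for b in pair):
        raise ValueError("expected two bytes in the range 0xA1..0xFE")
    return pair[0] - 0xA0, pair[1] - 0xA0
```

       Under these assumptions, row 1, column 1 maps to the bytes A1 A1 and
       row 94, column 94 maps to FE FE.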

       The following table shows the encoding of Taiwanese EUC characters:

       ───────────────────────────────────────────────────────
       CNS 11643-1986 Code Plane   Leading Code	  Code Range
       ───────────────────────────────────────────────────────
       1			   [nil]	  A1A1 - FEFE
       2			   SS2 A2	  A1A1 - FEFE
       3			   SS2 A3	  A1A1 - FEFE
       4			   SS2 A4	  A1A1 - FEFE
       5			   SS2 A5	  A1A1 - FEFE
       6			   SS2 A6	  A1A1 - FEFE
       7			   SS2 A7	  A1A1 - FEFE
       8			   SS2 A8	  A1A1 - FEFE
       9			   SS2 A9	  A1A1 - FEFE
       10			   SS2 AA	  A1A1 - FEFE
       11			   SS2 AB	  A1A1 - FEFE
       12			   SS2 AC	  A1A1 - FEFE
       13			   SS2 AD	  A1A1 - FEFE
       14			   SS2 AE	  A1A1 - FEFE
       15			   SS2 AF	  A1A1 - FEFE
       16			   SS2 B0	  A1A1 - FEFE
       ───────────────────────────────────────────────────────
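
       The table can be read as a small encoding rule: plane 1 characters
       are two bytes, and characters in planes 2 to 16 are four bytes (SS2,
       a plane byte, and the two position bytes). The following Python
       sketch illustrates that rule under stated assumptions. This page does
       not give the value of SS2; hexadecimal 8E is its standard 8-bit
       value. The plane byte is taken to be A0 plus the plane number, as the
       table suggests, and the function names are hypothetical.

```python
SS2 = 0x8E  # standard 8-bit value of the Single-Shift 2 control character


def encode_char(plane: int, row: int, column: int) -> bytes:
    """Encode a (plane, row, column) triple following the table above."""
    if not 1 <= plane <= 16:
        raise ValueError("plane must be in 1..16")
    if not (1 <= row <= 94 and 1 <= column <= 94):
        raise ValueError("row and column must be in 1..94")
    position = bytes([0xA0 + row, 0xA0 + column])
    if plane == 1:
        return position  # plane 1 has no leading code
    # Planes 2..16: leading code is SS2 followed by 0xA2..0xB0.
    return bytes([SS2, 0xA0 + plane]) + position


def split_chars(data: bytes):
    """Yield the byte sequence of each character in an eucTW byte string."""
    i = 0
    while i < len(data):
        first = data[i]
        if first < 0x80:
            width = 1  # ASCII: MSB off
        elif first == SS2:
            width = 4  # leading code (2 bytes) plus position (2 bytes)
        elif 0xA1 <= first <= 0xFE:
            width = 2  # plane 1: position bytes only
        else:
            raise ValueError(f"unexpected byte 0x{first:02X} at offset {i}")
        if i + width > len(data):
            raise ValueError(f"truncated character at offset {i}")
        yield data[i:i + width]
        i += width
```

       For example, encode_char(2, 1, 1) produces the four bytes
       8E A2 A1 A1. The splitter checks only the first byte of each
       character; it does not validate the plane byte or the position bytes
       that follow.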

   Codeset Conversion
       The following codeset converter pairs are available for converting
       Traditional Chinese characters between eucTW and other encoding
       formats. Refer to iconv_intro(5) for an introduction to codeset
       conversion. For more information about the other codeset for which
       eucTW is the input or output, see the reference page specified in the
       list item.

       big5_eucTW, eucTW_big5
              Converting from and to the Big-5 codeset: big5(5).

              Note that Big-5 encoding is equivalent to the Microsoft
              code-page format used on PCs for Traditional Chinese. You can
              therefore use this set of converters to convert Traditional
              Chinese text between the eucTW and PC code-page formats. For
              information about how the operating system supports PC code
              pages, see code_page(5).

       dechanyu_eucTW, eucTW_dechanyu
              Converting from and to the DEC Hanyu codeset: dechanyu(5).

       dechanzi_eucTW, eucTW_dechanzi
              Converting from and to the DEC Hanzi codeset: dechanzi(5).

       sbig5_eucTW, eucTW_sbig5
              Converting from and to the Shift Big-5 codeset: sbig5(5).

       telecode_eucTW, eucTW_telecode
              Converting from and to the Telecode codeset: telecode(5).

       UTF-16_eucTW, eucTW_UTF-16
              Converting from and to UTF-16 format: Unicode(5).

       UCS-4_eucTW, eucTW_UCS-4
              Converting from and to UCS-4 format: Unicode(5).

       UTF-8_eucTW, eucTW_UTF-8
              Converting from and to UTF-8 format: Unicode(5).
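
       The converter names appear to follow a source_target pattern, so a
       conversion from eucTW to Big-5 would use the eucTW_big5 converter.
       The following Python sketch builds such a name and a matching
       iconv(1) command line. The -f and -t options are the standard POSIX
       iconv options. Whether the iconv on a given system accepts these
       exact codeset names is an assumption, and the function names are
       hypothetical.

```python
import subprocess


def converter_name(source: str, target: str) -> str:
    """Build a converter name in the source_target form used in the list."""
    return f"{source}_{target}"


def iconv_command(source: str, target: str, path: str) -> list[str]:
    """Build an iconv(1) command line using the POSIX -f and -t options."""
    return ["iconv", "-f", source, "-t", target, path]


def convert_file(source: str, target: str, path: str) -> bytes:
    """Run iconv on a file and return the converted bytes.

    Raises subprocess.CalledProcessError if iconv exits with an error,
    for example when the codeset names are not known to the local iconv.
    """
    result = subprocess.run(
        iconv_command(source, target, path), check=True, capture_output=True
    )
    return result.stdout
```

       For example, iconv_command("eucTW", "big5", "report.txt") yields the
       argument list for "iconv -f eucTW -t big5 report.txt", where
       report.txt is a placeholder file name.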

   Fonts for Taiwanese EUC
       For  both  display  devices and printers, the operating system supports
       Taiwanese EUC through internal conversion to DEC Hanyu code and use  of
       DEC Hanyu fonts (see dechanyu(5)).

       For   general  information  on  printing	 non-English  text,  refer  to
       i18n_printing(5).

SEE ALSO
       Commands: locale(1)

       Others:	ascii(5),  big5(5),  Chinese(5),  code_page(5),	  dechanzi(5),
       GBK(5), iconv_intro(5), i18n_intro(5), i18n_printing(5), l10n_intro(5),
       sbig5(5), telecode(5), Unicode(5)

								      eucTW(5)