dechanzi man page on DigitalUNIX

dechanzi(5)							   dechanzi(5)

NAME
       dechanzi - A character encoding system (codeset) for Simplified Chinese

DESCRIPTION
       The DEC Hanzi (dechanzi) codeset consists of the following character
       sets: ASCII, GB2312-80, and Extended GB.

       DEC Hanzi uses a 2-byte data representation for symbols and ideographic
       characters that are defined in GB2312-80.

   ASCII Characters
       All ASCII characters are represented as single-byte, 7-bit data in the
       DEC Hanzi codeset; that is, the most significant bit (MSB) of the byte
       that represents an ASCII character is always off (0). For more
       information on ASCII characters, refer to ascii(5).

   GB2312-80 Characters
       The code table for GB2312-80 characters is divided into 94 rows (Qu),
       numbered from 1 to 94. Each row has 94 columns (Wei), also numbered
       from 1 to 94. The code table defines a total of 7445 characters, of
       which 6763 are Chinese characters. The characters are grouped as
       follows:

       Graphic symbols
              There are 682 graphic symbols, which occupy rows 1 to 9 in the
              code table.

       Frequently used (Level 1) characters
              There are 3755 frequently used characters, which occupy rows 16
              to 55 in the code table.

       Less frequently used (Level 2) characters
              There are 3008 less frequently used characters, which occupy
              rows 56 to 87 in the code table.
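
       The row ranges above can be expressed as a small lookup. The following
       Python sketch is illustrative only; the names GB2312_GROUPS and
       gb2312_group are hypothetical and are not part of the operating
       system. Rows that the list above does not mention return None.

```python
# Row (Qu) ranges copied from the list above; the names are illustrative.
GB2312_GROUPS = [
    (range(1, 10), "graphic symbols"),
    (range(16, 56), "Level 1 (frequently used)"),
    (range(56, 88), "Level 2 (less frequently used)"),
]


def gb2312_group(row):
    """Return the group for a GB2312-80 row, or None if the row is not listed."""
    if not 1 <= row <= 94:
        raise ValueError("row must be in the range 1..94")
    for rows, name in GB2312_GROUPS:
        if row in rows:
            return name
    return None


# The per-group counts stated above add up to the stated totals.
assert 682 + 3755 + 3008 == 7445
assert 3755 + 3008 == 6763

print(gb2312_group(16))
```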

       To differentiate GB2312-80 character codes from ASCII and Extended GB
       character codes, the most significant bit (MSB) is set on in both the
       first byte and the second byte. The following formulas show how to
       calculate the value of a GB2312-80 character from its row and column
       numbers:

       1st byte = A0(hex) + Row number
       2nd byte = A0(hex) + Column number

       For example, if a GB2312-80 character is in the	first  column  of  the
       16th  row,  the	character's value is B0A1, which is calculated as fol‐
       lows:

       1st byte = A0(hex) + 16 = B0(hex)
       2nd byte = A0(hex) + 01 = A1(hex)
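
       The same calculation can be sketched in Python. The function names
       gb2312_bytes and gb2312_position are hypothetical helpers written for
       this illustration; they apply only the two formulas above and do not
       check whether a character is actually defined at the given position.

```python
def gb2312_bytes(row, col):
    """Encode a GB2312-80 row/column pair (each 1..94) as two bytes."""
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError("row and column must be in the range 1..94")
    # Adding A0(hex) sets the most significant bit of both bytes.
    return bytes([0xA0 + row, 0xA0 + col])


def gb2312_position(code):
    """Inverse of gb2312_bytes: return (row, column) for a 2-byte code."""
    if len(code) != 2 or not (code[0] & 0x80 and code[1] & 0x80):
        raise ValueError("expected two bytes with the MSB set in both")
    return code[0] - 0xA0, code[1] - 0xA0


code = gb2312_bytes(16, 1)
print(code.hex().upper())     # B0A1
print(gb2312_position(code))  # (16, 1)
```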

   Extended GB Characters
       The Extended GB code table is similar to the GB2312-80 code table and
       is divided into 94 rows and 94 columns (8836 code points). However,
       the Extended GB code table provides code points for user-defined
       characters (UDC). The 8836 code points in this table are divided into
       two areas:

       User-defined area
              This area spans rows 1 to 87 and provides 8178 code points.

       User-defined (reserved) area
              This area spans rows 88 to 94 and provides 658 code points. This
              area is where users can define special and long-lasting user-
              defined characters.

       To  differentiate  Extended  GB	codes  from  ASCII codes and GB2312-80
       codes, the most significant bit (MSB) of the first byte is set on while
       that of the second byte is set off. The following formulas show how the
       code value of an Extended GB character is calculated from its  row  and
       column numbers:

       1st byte = A0(hex) + Row number
       2nd byte = 20(hex) + Column number

       For example, if a character is positioned at the first column of the
       16th row in the Extended GB code table, the character's value is B021,
       which is calculated as follows:

       1st byte = A0(hex) + 16 = B0(hex)
       2nd byte = 20(hex) + 01 = 21(hex)
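
       Because the three character sets differ only in their MSB patterns, a
       byte stream can be split into characters by inspecting those bits.
       The following Python sketch illustrates this under the rules stated in
       this reference page; extended_gb_bytes and classify are hypothetical
       names, and the sketch checks only the MSBs, not whether each byte
       falls within a valid row or column range.

```python
def extended_gb_bytes(row, col):
    """Encode an Extended GB row/column pair (each 1..94) as two bytes."""
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError("row and column must be in the range 1..94")
    # First byte has the MSB on; second byte (21..7E hex) has the MSB off.
    return bytes([0xA0 + row, 0x20 + col])


def classify(data):
    """Split DEC Hanzi bytes into (character set, bytes) pairs using the MSBs."""
    result = []
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            # MSB off in the first byte: single-byte ASCII.
            result.append(("ASCII", data[i:i + 1]))
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated 2-byte character")
        # MSB on in the first byte: the second byte's MSB selects the set.
        kind = "GB2312-80" if data[i + 1] & 0x80 else "Extended GB"
        result.append((kind, data[i:i + 2]))
        i += 2
    return result


print(extended_gb_bytes(16, 1).hex().upper())  # B021
print(classify(b"A\xb0\xa1\xb0\x21"))
```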

   Codeset Conversion
       The following codeset converter pairs are available for converting
       Simplified Chinese characters between dechanzi and other encoding
       formats. Refer to iconv_intro(5) for an introduction to codeset
       conversion. For more information about the other codeset for which
       dechanzi is the input or output, see the reference page specified in
       the list item.

       big5_dechanzi, dechanzi_big5
              Converting from and to the Big-5 codeset: big5(5)

       dechanyu_dechanzi, dechanzi_dechanyu
              Converting from and to the DEC Hanyu codeset: dechanyu(5)

       eucTW_dechanzi, dechanzi_eucTW
              Converting from and to Taiwanese Extended UNIX Code: eucTW(5)

       UTF-16_dechanzi, dechanzi_UTF-16
              Converting from and to UTF-16 format: Unicode(5)

       UCS-4_dechanzi, dechanzi_UCS-4
              Converting from and to UCS-4 format: Unicode(5)

       UTF-8_dechanzi, dechanzi_UTF-8
              Converting from and to UTF-8 format: Unicode(5)

       DEC Hanzi encoding is identical to the Microsoft code-page format
       (cp936) used for Simplified Chinese characters on PC systems. However,
       DEC Hanzi supports fewer characters than the code page does.
       Therefore, using converters with dechanzi in the converter name to
       convert between cp936 and other formats can result in some data loss.
       Refer to code_page(5) for more information about PC code pages.
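
       The kind of data loss described above can be illustrated with a short
       Python sketch. This is an approximation, not the operating system's
       converters: Python has no dechanzi codec, so its built-in gb2312 codec
       stands in for the GB2312-80 portion of DEC Hanzi and its cp936 codec
       stands in for the PC code page.

```python
# Stand-in codecs (assumption): "gb2312" approximates the GB2312-80 portion
# of DEC Hanzi; "cp936" is Python's codec for the PC code page.
# The middle character (U+9555) is in cp936 but not in GB2312-80.
text = "\u6731\u9555\u57fa"

# All three characters can be encoded in cp936 (2 bytes each).
as_cp936 = text.encode("cp936")

# Encoding to the smaller character set replaces the missing character
# with "?", which is the data loss the text above warns about.
lossy = text.encode("gb2312", errors="replace")

print(len(as_cp936), lossy.count(b"?"))
```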

   DEC Hanzi Fonts
       The operating system provides both screen and  printer  fonts  for  DEC
       Hanzi  characters.  The operating system also provides bit map fonts in
       addition to the TrueType fonts described in this section.  For  a  com‐
       plete  description of DEC Hanzi fonts, see the document, Technical Ref‐
       erence for Using Chinese Features.

       The following set of Simplified Chinese TrueType fonts are installed as
       the operating system default fonts for DEC Hanzi:

       -css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
       -css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
       -css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-iso8859-1

       -css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
       -css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
       -css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-iso8859-1

       -css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
       -css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
       -css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-iso8859-1

       -css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
       -css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
       -css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-iso8859-1

       The following set of Simplified Chinese TrueType fonts are available as
       an installation option:

       -huatian-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
       -huatian-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
       -huatian-fangsong-medium-r-normal--0-0-0-0-m-0-iso8859-1

       -huatian-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
       -huatian-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
       -huatian-heiti-medium-r-normal--0-0-0-0-m-0-iso8859-1

       -huatian-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
       -huatian-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
       -huatian-kaiti-medium-r-normal--0-0-0-0-m-0-iso8859-1

       -huatian-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
       -huatian-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
       -huatian-songti-medium-r-normal--0-0-0-0-m-0-iso8859-1

       With either the default or optional font	 sets  installed,  the	SongTi
       fonts are the default screen fonts for the DEC Hanzi codeset.

       The operating system provides the following PostScript printer fonts
       for DEC Hanzi characters:

       Hei-GB2312-80
       XiSong-GB2312-80

       For general information on  printing  Asian  language  text,  refer  to
       i18n_printing(5).

SEE ALSO
       Commands: locale(1)

       Others:	 ascii(5),  big5(5),  Chinese(5),  code_page(5),  dechanyu(5),
       eucTW(5),   GB18030(5),	 GBK(5),   i18n_intro(5),    i18n_printing(5),
       iconv_intro(5), l10n_intro(5), sbig5(5), telecode(5), Unicode(5)

								   dechanzi(5)