
NLSTRANS - NLS Translation Utility



Starting the Translation Utility
--------------------------------

nlstrans [-v] <inputfile>

    -v turns on the verbose mode.  This switch is optional.

    <inputfile> is the name of the input file containing variations
                of the commands listed below.




Command Legend
--------------

<cpnum>       - The code page number (in decimal).
<langstr>     - The language string identifying the language.
<lcid>        - The locale id identifying the locale information.
<num entries> - The number of entries to follow (in decimal).
<mbchar>      - The multibyte character (in hexadecimal).
<wchar>       - The wide character (in hexadecimal).
<lowrange>    - The low end of the DBCS range (in hexadecimal).
<highrange>   - The high end of the DBCS range (in hexadecimal).
<maxcharlen>  - The maximum length, in bytes, of a character (in decimal).
<defaultchar> - The default character (in hexadecimal).
<dc_unitrans> - The Unicode translation of the default character (in hex).
<ctype1>      - The character type 1 information (in hexadecimal).
<ctype2>      - The character type 2 information (in hexadecimal).
<ctype3>      - The character type 3 information (in hexadecimal).
<upper>       - The upper case wide character (in hexadecimal).
<lower>       - The lower case wide character (in hexadecimal).
<digit>       - The digit to translate to ASCII (in hexadecimal).
<ascii>       - The ASCII translation (in hexadecimal).
<czone>       - The compatibility zone character to translate (in hex).
<katakana>    - The katakana character to translate (in hex).
<hiragana>    - The hiragana character to translate (in hex).
<half width>  - The half width character to translate (in hex).
<full width>  - The full width character to translate (in hex).
<precomp>     - The precomposed character (in hexadecimal).
<base>        - The base character for the given precomposed form (in hex).
<nonspace>    - The nonspace character for the given precomposed form (in hex).
<code pt>     - The Unicode code point (in hexadecimal).
<SM>          - The script member (in hex).
<AW>          - The alphanumeric weight (in hex).
<DW>          - The diacritic weight (in hex).
<CW>          - The case weight (in hex).
<COMP>        - The compression value - 0, 1, 2, or 3 (in hex).




Commands
--------

(1) Code Page Specific Translation Tables

    - A semicolon may be used to denote a comment.  The comment will be
      read until the end of the current line.  So, once a semicolon is
      used, the rest of the current line is ignored.
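
      As a rough illustration of the comment rule above, the Python sketch
      below drops everything from the first semicolon to the end of the
      line.  The function name strip_comment is hypothetical and is not
      part of nlstrans; the real parser may behave differently.  The rule
      does not apply to the locale and calendar sections, where comments
      are not accepted.

```python
def strip_comment(line: str) -> str:
    """Return the line with any ';' comment and trailing white space removed."""
    # Everything from the first semicolon to the end of the line is ignored.
    code, _separator, _comment = line.partition(";")
    return code.rstrip()


print(strip_comment("0xff41  0xff41            ; placeholder for exception"))
```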


    CODEPAGE <cpnum>

      - Starts the code page specific section.

      - Use the ENDCODEPAGE keyword to end the code page specific section.

      - Only the following keywords may be used between this keyword and
        the ENDCODEPAGE keyword:

          - CPINFO
          - MBTABLE
          - GLYPHTABLE
          - DBCSRANGE
          - WCTABLE


    ENDCODEPAGE

      - Ends the code page specific section.

      - Only used following the CODEPAGE keyword.


    CPINFO <maxcharlen> <defaultchar> <dc_unitrans>

      - The code page information.

      - This table MUST appear FIRST in the data file.


    MBTABLE <num entries>

      - The multibyte translation table.

      - The table to follow should be in the format:

          <mbchar> <wchar>

      - The maximum <num entries> should be 256.
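
      A minimal Python sketch of reading an MBTABLE block and checking it
      against the rules above.  The function parse_mbtable is hypothetical
      (it is not how nlstrans itself is implemented); it assumes comments
      and blank lines may appear between rows and that every row holds
      exactly two hexadecimal values.

```python
def parse_mbtable(lines):
    """Parse 'MBTABLE <num entries>' followed by '<mbchar> <wchar>' rows."""
    # Drop ';' comments, then drop lines that are left empty.
    rows = [line.split(";", 1)[0].split() for line in lines]
    rows = [row for row in rows if row]
    keyword, count = rows[0][0], int(rows[0][1])  # <num entries> is decimal
    if keyword != "MBTABLE":
        raise ValueError(f"expected MBTABLE, got {keyword}")
    if count > 256:
        raise ValueError("MBTABLE allows at most 256 entries")
    # int(text, 16) accepts values written with a 0x prefix.
    pairs = [(int(mb, 16), int(wc, 16)) for mb, wc in rows[1:count + 1]]
    if len(pairs) != count:
        raise ValueError(f"expected {count} entries, found {len(pairs)}")
    return pairs


table = parse_mbtable(["MBTABLE 2", "", "  0x00  0x0000", "  0x7F  0x2302 ; house"])
print(table)
```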


    GLYPHTABLE <num entries>

      - The glyph character multibyte translation table.

      - The table to follow should be in the format:

          <mbchar> <wchar>

      - The maximum <num entries> should be 256.

      - This table MUST appear AFTER the MBTABLE in the data file.


    DBCSRANGE <num entries>

      - The DBCS ranges.

      - The table to follow should be in the format:

          <lowrange> <highrange>


        DBCSTABLE <num entries>

          - The DBCS translation table.

          - The table to follow should be in the format:

              <mbchar> <wchar>

          - The maximum <num entries> should be 256.

      - The DBCS tables MUST immediately follow their ranges and must
        include the DBCSTABLE keyword.  The tables MUST also be in the
        order in which they appear in the range (lowest first, highest last).
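
      The ordering rule can be checked mechanically.  The Python sketch
      below assumes one DBCSTABLE per lead byte in each range, which
      matches the sample code page file (range 0x80-0x81 is followed by
      two tables) but is not stated outright in this document.  The
      function name expected_dbcs_tables is hypothetical.

```python
def expected_dbcs_tables(ranges):
    """Return how many DBCSTABLE blocks should follow each (low, high) range."""
    counts = []
    for low, high in ranges:
        if low > high:
            raise ValueError(f"low end 0x{low:02x} exceeds high end 0x{high:02x}")
        # Assumption: one table per lead byte, lowest lead byte first.
        counts.append(high - low + 1)
    return counts


# Ranges taken from the sample code page file: 0x51-0x51 and 0x80-0x81.
print(expected_dbcs_tables([(0x51, 0x51), (0x80, 0x81)]))
```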


    WCTABLE <num entries>

      - The wide character translation table.

      - The table to follow should be in the format:

          <wchar> <mbchar>
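
      In the sample code page file, the WCTABLE rows are the MBTABLE rows
      reversed and sorted by wide character.  Assuming that relationship
      is intended, a WCTABLE could be drafted from an MBTABLE as sketched
      below.  The function invert_mbtable is hypothetical, and keeping
      the first mapping when two multibyte characters share a wide
      character is an arbitrary choice made for this example.

```python
def invert_mbtable(mbtable):
    """Build (wchar, mbchar) pairs from (mbchar, wchar) pairs, sorted by wchar."""
    wctable = {}
    for mbchar, wchar in mbtable:
        # Assumption: the first multibyte character seen for a wchar wins.
        wctable.setdefault(wchar, mbchar)
    return sorted(wctable.items())


sample = [(0x7F, 0x2302), (0xB0, 0x2591), (0xB3, 0x2502)]
for wchar, mbchar in invert_mbtable(sample):
    print(f"0x{wchar:04X}  0x{mbchar:02X}")
```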



(2) Language Specific Translation Tables

    - A semicolon may be used to denote a comment.  The comment will be
      read until the end of the current line.  So, once a semicolon is
      used, the rest of the current line is ignored.


    LANGUAGE <langstr>

      - Starts the language specific section.

      - Use the ENDLANGUAGE keyword to end the language specific section.

      - Only the following keywords may be used between this keyword and
        the ENDLANGUAGE keyword:

          - UPPERCASE
          - LOWERCASE


    ENDLANGUAGE

      - Ends the language specific section.

      - Only used following the LANGUAGE keyword.


    UPPERCASE <num entries>

      - The upper case translation table.

      - The table to follow should be in the format:

          <lower> <upper>


    LOWERCASE <num entries>

      - The lower case translation table.

      - The table to follow should be in the format:

          <upper> <lower>


    EXCEPTION <num entries>

      - The exception table for linguistic casing.

      - This table contains all exceptions to the default table on
        a per locale id basis in order to get proper linguistic
        casing.

      - The 0x00000000 locale id is used to make changes to the default
        table for *all* locales.  These exceptions will become part of the
        default linguistic casing table.

      - All entries in the exception table must exist in some form
        in the default table.  If there is no translation desired in
        the default table, then enter the code point as upper/lower
        casing to itself.

      - The table to follow should be in the format (for each lcid):

          LCID <lcid> <num upcase entries> <num locase entries>

            UPPERCASE

              <lower> <upper>

            LOWERCASE

              <upper> <lower>
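
      One way to picture the exception rules is as an overlay on the
      default table.  The Python sketch below is illustrative only: the
      function apply_exceptions and the dictionary representation are
      assumptions, not the layout nlstrans writes.  It rejects exceptions
      whose code points are missing from the default table, mirroring
      the rule above.

```python
def apply_exceptions(default_table, exceptions):
    """Return a copy of default_table with the exception pairs applied.

    Both arguments map a source code point to its cased code point.
    """
    missing = sorted(cp for cp in exceptions if cp not in default_table)
    if missing:
        names = ", ".join(f"0x{cp:04x}" for cp in missing)
        raise ValueError(f"not in default table: {names}")
    merged = dict(default_table)
    merged.update(exceptions)
    return merged


# Illustrative default upper case table; 0x0131 maps to itself as a placeholder.
default_upper = {0x0069: 0x0049, 0x0131: 0x0131}
# Turkish (lcid 0x0000041f) upper case exceptions from the sample language file.
turkish_upper = apply_exceptions(default_upper, {0x0069: 0x0130, 0x0131: 0x0049})
print(hex(turkish_upper[0x0069]))
```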



(3) Locale Specific Translation Tables

    - NO COMMENTS will be accepted at any time between the LOCALE and
      ENDLOCALE keywords and the CALENDAR and ENDCALENDAR keywords.
      A semicolon on a line will be used as part of the locale or
      calendar information, as well as any characters after the
      semicolon on the same line.


    LOCALE <num entries>

      - Starts the locale specific section.

      - Use the ENDLOCALE keyword to end the entire locale specific section.

      - Each set of locale information to follow should be in the format:


        BEGINLOCALE <lcid>

          - The locale information.  The order of the information is
            given below.

          - The table to follow should be in the format:

              <keyword> <info>

            or in some cases:

              <keyword> <num> <info>
                              <info>
                              ...

            where

            <keyword> is the keyword for the given information.
                      This string is ignored.

            <num>     is the number of entries for the keyword.  This means
                      there will be 'num' number of entries, where each
                      entry MUST BE on a separate line.  The keywords that
                      require the 'num' field are noted in the list of items
                      below.

            <info>    is the information to store in the data file.  All
                      information will be stored as a Unicode string.

                      The escape sequence "\x" may be used to designate hex
                      values above 0x00ff, but ALL 4 digits of the Unicode
                      character MUST exist for this to work properly.

                      If the backslash character is to appear in the given
                      string (it's not part of an escape sequence), then
                      two backslashes must be used in succession.

                      White space (space and tab) is stripped from both the
                      front and the back of the string unless specifically
                      noted with the escape sequence.  All other white space
                      is preserved.

                      To include TWO separate null-terminated strings for
                      one LCTYPE, the strings must be separated by \xffff.
                      This will be changed to 0x0000 in the binary file.
                      Currently, the second string will only be used by
                      the SMONTHNAME LCType information in the
                      GetDateFormatW API (Russian month names have different
                      grammar).
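
        The <info> rules above can be sketched in Python as follows.
        The function decode_info is hypothetical and only approximates
        what nlstrans does: it assumes white space is stripped before
        escapes are decoded (so an escaped space survives) and that a
        backslash which starts no recognized escape is left unchanged.

```python
import re

# A backslash followed by another backslash, or by 'x' and four hex digits.
_ESCAPE = re.compile(r"\\(?:\\|x([0-9A-Fa-f]{4}))")


def decode_info(raw: str) -> str:
    """Decode one <info> field into the string that would be stored."""
    # Strip unescaped spaces and tabs first so escaped ones are preserved.
    text = raw.strip(" \t")

    def replace(match):
        digits = match.group(1)
        if digits is None:
            return "\\"  # two backslashes stand for one literal backslash
        code = int(digits, 16)
        # The separator 0xffff is written as 0x0000 in the binary file.
        return "\x00" if code == 0xFFFF else chr(code)

    return _ESCAPE.sub(replace, text)


print(repr(decode_info(r"1\xffffGregorian Calendar")))
```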

        This section must have the following information (IN THE GIVEN
        ORDER) following the BEGINLOCALE keyword.


           ILANGUAGE
           SENGLANGUAGE
           SABBREVLANGNAME
           SISO639LANGNAME
           SNATIVELANGNAME

           ICOUNTRY
           SENGCOUNTRY
           SABBREVCTRYNAME
           SISO3166CTRYNAME
           SNATIVECTRYNAME

           IDEFAULTLANGUAGE
           IDEFAULTCOUNTRY
           IDEFAULTANSICODEPAGE
           IDEFAULTOEMCODEPAGE

           SLIST
           IMEASURE

           SDECIMAL
           STHOUSAND
           SGROUPING
           IDIGITS
           ILZERO
           INEGNUMBER
           SNATIVEDIGITS

           SCURRENCY
           SINTLSYMBOL
           SMONDECIMALSEP
           SMONTHOUSANDSEP
           SMONGROUPING
           ICURRDIGITS
           IINTLCURRDIGITS
           ICURRENCY
           INEGCURR
           SPOSITIVESIGN
           SNEGATIVESIGN

           STIMEFORMAT         <num>
           STIME
           ITIME
           ITLZERO
           ITIMEMARKPOSN
           S1159
           S2359

           SSHORTDATE          <num>
           SDATE
           IDATE
           ICENTURY
           IDAYLZERO
           IMONLZERO

           SLONGDATE           <num>
           ILDATE

           ICALENDARTYPE
           IOPTIONALCALENDAR   <num>  (use \xffff for localized calendar name)

           IFIRSTDAYOFWEEK
           IFIRSTWEEKOFYEAR

           SDAYNAME1
           SDAYNAME2
           SDAYNAME3
           SDAYNAME4
           SDAYNAME5
           SDAYNAME6
           SDAYNAME7

           SABBREVDAYNAME1
           SABBREVDAYNAME2
           SABBREVDAYNAME3
           SABBREVDAYNAME4
           SABBREVDAYNAME5
           SABBREVDAYNAME6
           SABBREVDAYNAME7

           SMONTHNAME1
           SMONTHNAME2
           SMONTHNAME3
           SMONTHNAME4
           SMONTHNAME5
           SMONTHNAME6
           SMONTHNAME7
           SMONTHNAME8
           SMONTHNAME9
           SMONTHNAME10
           SMONTHNAME11
           SMONTHNAME12
           SMONTHNAME13

           SABBREVMONTHNAME1
           SABBREVMONTHNAME2
           SABBREVMONTHNAME3
           SABBREVMONTHNAME4
           SABBREVMONTHNAME5
           SABBREVMONTHNAME6
           SABBREVMONTHNAME7
           SABBREVMONTHNAME8
           SABBREVMONTHNAME9
           SABBREVMONTHNAME10
           SABBREVMONTHNAME11
           SABBREVMONTHNAME12
           SABBREVMONTHNAME13

           FONTSIGNATURE



    ENDLOCALE

      - Ends the locale specific section.

      - Only used following the LOCALE keyword.



    CALENDAR <num entries>

      - Starts the calendar specific section.

      - Use the ENDCALENDAR keyword to end the entire calendar specific section.

      - Each set of calendar information to follow should be in the format:


        BEGINCALENDAR <calendarid>

          - The calendar information.  The order of the information is
            given below.

          - The table to follow should be in the format:

              <keyword> <info>

            or in some cases:

              <keyword> <num> <info>
                              <info>
                              ...

            where

            <keyword> is the keyword for the given information.
                      This string is ignored.

            <num>     is the number of entries for the keyword.  This means
                      there will be 'num' number of entries, where each
                      entry MUST BE on a separate line.  The keywords that
                      require the 'num' field are noted in the list of items
                      below.

            <info>    is the information to store in the data file.  All
                      information will be stored as a Unicode string.

                      The escape sequence "\x" may be used to designate hex
                      values above 0x00ff, but ALL 4 digits of the Unicode
                      character MUST exist for this to work properly.

                      If the backslash character is to appear in the given
                      string (it's not part of an escape sequence), then
                      two backslashes must be used in succession.

                      White space (space and tab) is stripped from both the
                      front and the back of the string unless specifically
                      noted with the escape sequence.  All other white space
                      is preserved.

                      To include TWO separate null-terminated strings for
                      one LCTYPE, the strings must be separated by \xffff.
                      This will be changed to 0x0000 in the binary file.
                      Currently, the second string will only be used by
                      the SMONTHNAME LCType information in the
                      GetDateFormatW API (Russian month names have different
                      grammar).
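
        The two line shapes above (with and without <num>) can be
        grouped as sketched below.  This is a hypothetical helper, not
        the nlstrans parser: nlstrans ignores the keyword string and
        relies on the order of entries, whereas this sketch uses a
        caller-supplied set of keywords (counted_keywords) to decide
        which lines carry a <num> field.  It also assumes <num> is
        written in decimal and that continuation lines are not
        separated by blank lines.

```python
def parse_keyword_lines(lines, counted_keywords):
    """Group raw lines into (keyword, [info, ...]) entries."""
    entries = []
    pending = iter(lines)
    for line in pending:
        if not line.strip():
            continue  # blank lines only separate groups
        parts = line.split(None, 1)
        keyword = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        if keyword not in counted_keywords:
            entries.append((keyword, [rest.strip(" \t")]))
            continue
        num_parts = rest.split(None, 1)
        count = int(num_parts[0])
        # The first <info>, if any, shares the line with the keyword.
        infos = [num_parts[1].strip(" \t")] if len(num_parts) > 1 and count else []
        while len(infos) < count:
            following = next(pending, None)
            if following is None:
                raise ValueError(f"{keyword}: expected {count} entries")
            infos.append(following.strip(" \t"))
        entries.append((keyword, infos))
    return entries


sample = [
    "SCALENDAR              3",
    "",
    r"SERARANGES          2  1911\xffffA.D.",
    r"                       0\xffffB.C.",
    "",
    "IF_NAMES               0",
]
for keyword, infos in parse_keyword_lines(sample, {"SERARANGES"}):
    print(keyword, infos)
```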

        This section must have the following information (IN THE GIVEN
        ORDER) following the BEGINCALENDAR keyword.

           SCALENDAR

           SERARANGES        <num>   (use \xffff for era string)

           SSHORTDATE
           SLONGDATE

           IF_NAMES

           SDAYNAME1
           SDAYNAME2
           SDAYNAME3
           SDAYNAME4
           SDAYNAME5
           SDAYNAME6
           SDAYNAME7

           SABBREVDAYNAME1
           SABBREVDAYNAME2
           SABBREVDAYNAME3
           SABBREVDAYNAME4
           SABBREVDAYNAME5
           SABBREVDAYNAME6
           SABBREVDAYNAME7

           SMONTHNAME1
           SMONTHNAME2
           SMONTHNAME3
           SMONTHNAME4
           SMONTHNAME5
           SMONTHNAME6
           SMONTHNAME7
           SMONTHNAME8
           SMONTHNAME9
           SMONTHNAME10
           SMONTHNAME11
           SMONTHNAME12
           SMONTHNAME13

           SABBREVMONTHNAME1
           SABBREVMONTHNAME2
           SABBREVMONTHNAME3
           SABBREVMONTHNAME4
           SABBREVMONTHNAME5
           SABBREVMONTHNAME6
           SABBREVMONTHNAME7
           SABBREVMONTHNAME8
           SABBREVMONTHNAME9
           SABBREVMONTHNAME10
           SABBREVMONTHNAME11
           SABBREVMONTHNAME12
           SABBREVMONTHNAME13



    ENDCALENDAR

      - Ends the calendar specific section.

      - Only used following the CALENDAR keyword.



(4) Locale Independent (Unicode) Translation Tables

    - A semicolon may be used to denote a comment.  The comment will be
      read until the end of the current line.  So, once a semicolon is
      used, the rest of the current line is ignored.


    UNICODE

      - Starts the Unicode section.

      - Use the ENDUNICODE keyword to end the Unicode section.

      - Only the following keywords may be used between this keyword and
        the ENDUNICODE keyword:

          - ASCIIDIGITS
          - FOLDCZONE
          - COMP
          - HIRAGANA
          - KATAKANA
          - HALFWIDTH
          - FULLWIDTH


    ENDUNICODE

      - Ends the Unicode section.

      - Only used following the UNICODE keyword.


    ASCIIDIGITS <num entries>

      - The ASCII digits translation table.

      - The table to follow should be in the format:

          <digit> <ascii>


    FOLDCZONE <num entries>

      - The fold compatibility zone translation table.

      - The table to follow should be in the format:

          <czone> <ascii>


    HIRAGANA <num entries>

      - The Katakana to Hiragana translation table.

      - The table to follow should be in the format:

          <katakana> <hiragana>


    KATAKANA <num entries>

      - The Hiragana to Katakana translation table.

      - The table to follow should be in the format:

          <hiragana> <katakana>


    HALFWIDTH <num entries>

      - The Full Width to Half Width translation table.

      - The table to follow should be in the format:

          <full width> <half width>


    FULLWIDTH <num entries>

      - The Half Width to Full Width translation table.

      - The table to follow should be in the format:

          <half width> <full width>


    COMP <num entries>

      - The precomposed and composite translation tables.  Both versions
        of the table will be built from this data.

      - The table to follow should be in the format:

          <precomp> <base> <nonspace>
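
      To illustrate how one list of rows can yield both tables, the
      Python sketch below builds a precomposed-to-composite lookup and
      its reverse from the same data.  The function build_comp_tables
      and the use of dictionaries are assumptions made for this
      example; the binary layout nlstrans produces is not described
      here.

```python
def build_comp_tables(entries):
    """Return (decompose, compose) lookups from (precomp, base, nonspace) rows."""
    decompose = {}
    compose = {}
    for precomp, base, nonspace in entries:
        decompose[precomp] = (base, nonspace)   # precomposed -> composite
        compose[(base, nonspace)] = precomp     # composite -> precomposed
    return decompose, compose


# Two rows from the sample Unicode file.
rows = [(0x00C0, 0x0041, 0x0300), (0x00D1, 0x004E, 0x0303)]
decompose, compose = build_comp_tables(rows)
print(hex(compose[(0x004E, 0x0303)]))
```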



(5) Character Type Translation Tables

    - A semicolon may be used to denote a comment.  The comment will be
      read until the end of the current line.  So, once a semicolon is
      used, the rest of the current line is ignored.


    CTYPE <num entries>

      - The character type translation table.

      - The table to follow should be in the format:

          <wchar> <ctype1> <ctype2> <ctype3>



(6) SortKey Translation Tables

    - A semicolon may be used to denote a comment.  The comment will be
      read until the end of the current line.  So, once a semicolon is
      used, the rest of the current line is ignored.


    SORTKEY

      - Starts the sortkey section.  This is the default sortkey table.


    ENDSORTKEY

      - Ends the sortkey section.

      - Only used following the SORTKEY keyword.


    DEFAULT <num entries>

      - The default sortkey translation table.

      - Contains the weights on a per code point basis.

      - The table to follow should be in the format:

          <code pt>  <SM> <AW> <DW> <CW> <COMP>
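
      A minimal Python sketch of reading one row of this table.  The
      names SortKeyEntry and parse_sortkey_line are hypothetical.  Note
      one loud assumption: the legend describes the weights as
      hexadecimal, but the sample files write them without a 0x prefix,
      so this sketch uses int(text, 0), which reads "0x..." as
      hexadecimal and unprefixed digits as decimal.  Check that choice
      against the real tool before relying on it.

```python
from typing import NamedTuple


class SortKeyEntry(NamedTuple):
    code_point: int
    script_member: int
    alpha_weight: int
    diacritic_weight: int
    case_weight: int
    compression: int


def parse_sortkey_line(line: str) -> SortKeyEntry:
    """Parse one '<code pt> <SM> <AW> <DW> <CW> <COMP>' row."""
    fields = line.split(";", 1)[0].split()  # drop any ';' comment first
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    # Assumption: 0x-prefixed values are hex, unprefixed values are decimal.
    entry = SortKeyEntry(*(int(field, 0) for field in fields))
    if entry.compression not in (0, 1, 2, 3):
        raise ValueError(f"compression must be 0-3, got {entry.compression}")
    return entry


print(parse_sortkey_line("0x0065  2  7  2  3  2"))
```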



(7) Sort Tables Translation Tables

    - A semicolon may be used to denote a comment.  The comment will be
      read until the end of the current line.  So, once a semicolon is
      used, the rest of the current line is ignored.


    SORTTABLES

      - Starts the sorttables section.  This section contains all
        sorting tables except the default sortkey table.

      - Use the ENDSORTTABLES keyword to end the sort tables section.

      - Only the following keywords may be used between this keyword and
        the ENDSORTTABLES keyword:

          - REVERSEDIACRITICS
          - DOUBLECOMPRESSION
          - IDEOGRAPH_LCID_EXCEPTION
          - MULTIPLEWEIGHTS
          - EXPANSION
          - EXCEPTION
          - COMPRESSION


    ENDSORTTABLES

      - Ends the sorttables section.

      - Only used following the SORTTABLES keyword.


    REVERSEDIACRITICS <num entries>

      - The reverse diacritics table.

      - This table contains all locale ids that require diacritics
        to be sorted from right to left (instead of left to right).

      - The table to follow should be in the format:

          <lcid>


    DOUBLECOMPRESSION <num entries>

      - The double compression table.

      - This table contains all locale ids that require special handling
        of the compression characters (e.g., Hungarian).

      - The table to follow should be in the format:

          <lcid>


    IDEOGRAPH_LCID_EXCEPTION <num entries>

      - The ideograph lcid exception table.

      - This table contains all locale ids that require ideographs to be
        sorted other than in their Unicode ordering.  The name of the file
        containing the ideograph exceptions is also given here.

      - The file name may be no more than 8 characters in length.  The
        extension ".nls" will be added to the file name.

      - The table to follow should be in the format:

          <lcid>  <file name>
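
      The file name rule can be expressed as a small check.  The
      function nls_file_name below is a hypothetical helper written in
      Python for illustration; it only applies the length limit and
      the ".nls" extension described above.

```python
def nls_file_name(base_name: str) -> str:
    """Return the data file name for an IDEOGRAPH_LCID_EXCEPTION row."""
    if not 1 <= len(base_name) <= 8:
        raise ValueError(f"file name must be 1 to 8 characters: {base_name!r}")
    return base_name + ".nls"


print(nls_file_name("big5"))
```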


    MULTIPLEWEIGHTS <num entries>

      - The multiple weights table.

      - This table contains a list of all scripts that need multiple
        script members to represent the entire script (256 alphanumeric
        weights are not enough).

      - The table to follow should be in the format:

          <first script member> <number of script members in range>
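
      Each row can be read as a run of script members.  The Python
      sketch below expands one row, assuming the members in a range are
      consecutive starting at the first one; the function
      script_members is hypothetical.

```python
def script_members(first, count):
    """List the script members covered by one MULTIPLEWEIGHTS row."""
    if count < 1:
        raise ValueError("a range must cover at least one script member")
    # Assumption: the range is consecutive, starting at the first member.
    return list(range(first, first + count))


# The sample row "36  10", read here as decimal values.
print(script_members(36, 10))
```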


    EXPANSION <num entries>

      - The expansion (ligature) table.

      - This table contains all possible expansion options for every
        locale, so there is no need to distinguish between the
        different locales.

      - The sortkey table will contain the index into this table in
        the AW field.  For that reason, this table MUST be in the
        correct order used by the sortkey default table and the
        exception table.

      - The maximum number of entries allowed in this table is 256.

      - The table to follow should be in the format:

          <expansion code pt> <code pt 1> <code pt 2>


    EXCEPTION <num entries>

      - The exception table.

      - This table contains all exceptions to the default table on
        a per locale id basis.

      - The table to follow should be in the format:

          LCID <lcid> <num entries>

            <code pt>  <SM> <AW> <DW> <CW> <COMP>


    COMPRESSION <num entries>

      - The compression table.

      - This table contains all compressions, both three to one and
        two to one, on a per locale id basis.

      - The table to follow should be in the format:

          LCID <lcid>

            TWO <num entries>

              <code pt 1>  <code pt 2>  <SM> <AW> <DW> <CW>

            THREE <num entries>

              <code pt 1>  <code pt 2>  <code pt 3> <SM> <AW> <DW> <CW>



(8) Ideograph Exception Tables

    - A semicolon may be used to denote a comment.  The comment will be
      read until the end of the current line.  So, once a semicolon is
      used, the rest of the current line is ignored.


    IDEOGRAPH_EXCEPTION  <num entries>  <file name>

      - The ideograph exception table.

      - The table to follow should be in the format:

          <code pt>  <SM> <AW>





Sample Files
------------

None of the sample files shown below are real files.  They are only meant
to show the syntax of the different data files.


(1) Sample Code Page File


    CODEPAGE 12

      CPINFO  1  0x7F  0x2302

      MBTABLE 11

        0x00    0x0000
        0x01    0x0001
        0x02    0x0002
        0x7F    0x2302
        0xB0    0x2591
        0xB1    0x2592
        0xB2    0x2593
        0xB3    0x2502
        0xB4    0x2524
        0xB5    0x2561
        0xB6    0x2562

      GLYPHTABLE 2

        0x01    0x263A
        0x02    0x263B

      DBCSRANGE 2

        0x51  0x51

          DBCSTABLE 1

            0x71  0x0025

        0x80  0x81

          DBCSTABLE 1

            0x3e  0x003e

          DBCSTABLE 2

            0x3f  0x003f
            0x40  0x0040

      WCTABLE 11

        0x0000  0x00
        0x0001  0x01
        0x0002  0x02
        0x2302  0x7F
        0x2502  0xB3
        0x2524  0xB4
        0x2561  0xB5
        0x2562  0xB6
        0x2591  0xB0
        0x2592  0xB1
        0x2593  0xB2

    ENDCODEPAGE



(2) Sample Language File


    LANGUAGE INTL

      UPPERCASE 11

        0x0061  0x0041
        0x0062  0x0042
        0x0063  0x0043
        0x0064  0x0044
        0x0065  0x0045
        0x0066  0x0046
        0x0067  0x0047
        0x0068  0x0048
        0x0069  0x0049
        0xff41  0xff41            ; placeholder for exception
        0xff42  0xff22            ; placeholder for exception

      LOWERCASE 10

        0x0041  0x0061
        0x0042  0x0062
        0x0043  0x0063
        0x0044  0x0064
        0x0045  0x0065
        0x0046  0x0066
        0x0047  0x0067
        0x0048  0x0068
        0x0049  0x0069
        0xff21  0xff21            ; placeholder for exception

    ENDLANGUAGE


    EXCEPTION 2

      LCID 0x00000000 2 1         ; default linguistic table

        UPPERCASE

          0xff41  0xff21
          0xff42  0xff22

        LOWERCASE

          0xff21  0xff41

      LCID 0x0000041f 2 2         ; Turkish

        UPPERCASE

          0x0069  0x0130
          0x0131  0x0049

        LOWERCASE

          0x0049  0x0131
          0x0130  0x0069



(3) Sample Locale File


    LOCALE 1

      BEGINLOCALE 0409           ; English - United States

        ILANGUAGE              0409
        SENGLANGUAGE           English
        SABBREVLANGNAME        ENU
        SISO639LANGNAME        EN
        SNATIVELANGNAME        English

        ICOUNTRY               1
        SENGCOUNTRY            United States
        SABBREVCTRYNAME        USA
        SISO3166CTRYNAME       US
        SNATIVECTRYNAME        United States

        IDEFAULTLANGUAGE       0409
        IDEFAULTCOUNTRY        1
        IDEFAULTANSICODEPAGE   1252
        IDEFAULTOEMCODEPAGE    437

        SLIST                  ,
        IMEASURE               1

        SDECIMAL               .
        STHOUSAND              ,
        SGROUPING              3;0
        IDIGITS                2
        ILZERO                 1
        INEGNUMBER             1
        SNATIVEDIGITS          0123456789

        SCURRENCY              $
        SINTLSYMBOL            USD
        SMONDECIMALSEP         .
        SMONTHOUSANDSEP        ,
        SMONGROUPING           3;0
        ICURRDIGITS            2
        IINTLCURRDIGITS        2
        ICURRENCY              0
        INEGCURR               0
        SPOSITIVESIGN          \x0000
        SNEGATIVESIGN          -

        STIMEFORMAT        4   h:mm:ss tt
                               hh:mm:ss tt
                               H:mm:ss
                               HH:mm:ss
        STIME                  :
        ITIME                  0
        ITLZERO                0
        ITIMEMARKPOSN          0
        S1159                  AM
        S2359                  PM

        SSHORTDATE         6   M/d/yy
                               M/d/yyyy
                               MM/dd/yy
                               MM/dd/yyyy
                               yy/MM/dd
                               dd-MMM-yy
        SDATE                  /
        IDATE                  0
        ICENTURY               0
        IDAYLZERO              0
        IMONLZERO              0

        SLONGDATE          4   dddd, MMMM dd, yyyy
                               MMMM dd, yyyy
                               dddd, dd MMMM, yyyy
                               dd MMMM, yyyy
        ILDATE                 0

        ICALENDARTYPE          1
        IOPTIONALCALENDAR  2   0\xffff
                               1\xffffGregorian Calendar

        IFIRSTDAYOFWEEK        6
        IFIRSTWEEKOFYEAR       0

        SDAYNAME1              Monday
        SDAYNAME2              Tuesday
        SDAYNAME3              Wednesday
        SDAYNAME4              Thursday
        SDAYNAME5              Friday
        SDAYNAME6              Saturday
        SDAYNAME7              Sunday

        SABBREVDAYNAME1        Mon
        SABBREVDAYNAME2        Tue
        SABBREVDAYNAME3        Wed
        SABBREVDAYNAME4        Thu
        SABBREVDAYNAME5        Fri
        SABBREVDAYNAME6        Sat
        SABBREVDAYNAME7        Sun

        SMONTHNAME1            January
        SMONTHNAME2            February
        SMONTHNAME3            March
        SMONTHNAME4            April
        SMONTHNAME5            May
        SMONTHNAME6            June
        SMONTHNAME7            July
        SMONTHNAME8            August
        SMONTHNAME9            September
        SMONTHNAME10           October
        SMONTHNAME11           November
        SMONTHNAME12           December
        SMONTHNAME13           \x0000

        SABBREVMONTHNAME1      Jan
        SABBREVMONTHNAME2      Feb
        SABBREVMONTHNAME3      Mar
        SABBREVMONTHNAME4      Apr
        SABBREVMONTHNAME5      May
        SABBREVMONTHNAME6      Jun
        SABBREVMONTHNAME7      Jul
        SABBREVMONTHNAME8      Aug
        SABBREVMONTHNAME9      Sep
        SABBREVMONTHNAME10     Oct
        SABBREVMONTHNAME11     Nov
        SABBREVMONTHNAME12     Dec
        SABBREVMONTHNAME13     \x0000

        FONTSIGNATURE          \x00af\x8000\x38cb\x0000\x0000\x0000\x0000\x0000\x0001\x0000\x0000\x8000\x00ff\x003f\x0000\xffff

    ENDLOCALE


    CALENDAR   5


      BEGINCALENDAR  0

        SCALENDAR              0

        SERARANGES             0

        SSHORTDATE             \x0000
        SLONGDATE              \x0000

        IF_NAMES               0


      BEGINCALENDAR  1

        SCALENDAR              1

        SERARANGES             0

        SSHORTDATE             MM/dd/yy
        SLONGDATE              dddd, MMMM dd, yyyy

        IF_NAMES               1

        SDAYNAME1              Monday
        SDAYNAME2              Tuesday
        SDAYNAME3              Wednesday
        SDAYNAME4              Thursday
        SDAYNAME5              Friday
        SDAYNAME6              Saturday
        SDAYNAME7              Sunday

        SABBREVDAYNAME1        Mon
        SABBREVDAYNAME2        Tue
        SABBREVDAYNAME3        Wed
        SABBREVDAYNAME4        Thu
        SABBREVDAYNAME5        Fri
        SABBREVDAYNAME6        Sat
        SABBREVDAYNAME7        Sun

        SMONTHNAME1            January
        SMONTHNAME2            February
        SMONTHNAME3            March
        SMONTHNAME4            April
        SMONTHNAME5            May
        SMONTHNAME6            June
        SMONTHNAME7            July
        SMONTHNAME8            August
        SMONTHNAME9            September
        SMONTHNAME10           October
        SMONTHNAME11           November
        SMONTHNAME12           December
        SMONTHNAME13           \x0000

        SABBREVMONTHNAME1      Jan
        SABBREVMONTHNAME2      Feb
        SABBREVMONTHNAME3      Mar
        SABBREVMONTHNAME4      Apr
        SABBREVMONTHNAME5      May
        SABBREVMONTHNAME6      Jun
        SABBREVMONTHNAME7      Jul
        SABBREVMONTHNAME8      Aug
        SABBREVMONTHNAME9      Sep
        SABBREVMONTHNAME10     Oct
        SABBREVMONTHNAME11     Nov
        SABBREVMONTHNAME12     Dec
        SABBREVMONTHNAME13     \x0000


      BEGINCALENDAR  2

        SCALENDAR              2

        SERARANGES          4  1989\xffff\x337b
                               1926\xffff\x337c
                               1912\xffff\x337d
                               1868\xffff\x337e

        SSHORTDATE             yy/MM/dd
        SLONGDATE              gg yyyy'\x5e74'M'\x6708'd'\x65e5'

        IF_NAMES               0


      BEGINCALENDAR  3

        SCALENDAR              3

        SERARANGES          2  1911\xffffA.D.
                               0\xffffB.C.

        SSHORTDATE             yy/MM/dd
        SLONGDATE              gg yyyy'\x5e74'M'\x6708'd'\x65e5'

        IF_NAMES               0


      BEGINCALENDAR  4

        SCALENDAR              4

        SERARANGES          2  1911\xffffA.D.
                               0\xffffB.C.

        SSHORTDATE             yy/MM/dd
        SLONGDATE              gg yyyy'\x5e74'M'\x6708'd'\x65e5'

        IF_NAMES               0


    ENDCALENDAR



(4) Sample Unicode File


    UNICODE

      ASCIIDIGITS 3

        0x00B2  0x0032
        0x00B3  0x0033
        0x00B9  0x0031

      FOLDCZONE 4

        0xff01  0x0021
        0xff02  0x0022
        0xff03  0x0023
        0xff04  0x0024

      COMP 5

        0x00C0  0x0041  0x0300
        0x00C8  0x0045  0x0300
        0x00CC  0x0049  0x0300
        0x00D1  0x004E  0x0303
        0x00D2  0x004F  0x0300

      HIRAGANA 3

        0x30a1  0x3041
        0xff67  0x3041
        0x30a2  0x3042

      KATAKANA 4

        0x3041  0x30a1
        0x3042  0x30a2
        0x3043  0x30a3
        0x3044  0x30a4

      HALFWIDTH 3

        0x30d2  0xff8b
        0x30d5  0xff8c
        0x30d8  0xff8d

      FULLWIDTH 4

        0xff61  0x3002
        0xff62  0x300c
        0xff63  0x300d
        0xff64  0x3001

    ENDUNICODE



(5) Sample Character Type File


    CTYPE 12

      0x0000  0x0020  0x0000  0x0000
      0x0009  0x0068  0x0009  0x0000
      0x0020  0x0048  0x000A  0x0000
      0x0021  0x0010  0x000B  0x0008
      0x002F  0x0010  0x0003  0x0008
      0x0030  0x0084  0x0003  0x0000
      0x0041  0x0181  0x0001  0x0000
      0x0048  0x0101  0x0001  0x0000
      0x0061  0x0182  0x0001  0x0000
      0x0067  0x0102  0x0001  0x0000
      0x00BF  0x0010  0x000B  0x0008
      0x00C0  0x0101  0x0001  0x0003



(6) Sample Sortkey File


    SORTKEY

      DEFAULT 4

        0x0030  2  4  2  2  0
        0x0031  2  5  2  2  0
        0x0065  2  7  2  3  2
        0x0066  2  8  2  3  3

    ENDSORTKEY



(7) Sample Sort Tables File


    SORTTABLES

      REVERSEDIACRITICS  4

        0x0000040c
        0x0000080c
        0x00000c0c
        0x0000100c


      DOUBLECOMPRESSION  1

        0x0000040e


      IDEOGRAPH_LCID_EXCEPTION  4

        0x00010404  big5
        0x00010804  big5
        0x00010411  xjis
        0x00010412  ksc


      MULTIPLEWEIGHTS  1

        36  10


      EXPANSION  2

        0x00c6  0x0041  0x0045
        0x00e6  0x0061  0x0065


      EXCEPTION  2

        LCID  0x0000040a  2

          0x0065  2  7  2  3  2
          0x0066  2  8  2  3  3


        LCID  0x0000040c  2
        LCID  0x0000080c

          0x0030  2  4  2  2  0
          0x0031  2  5  2  2  0


      COMPRESSION  2

        LCID 0x0000040a
        LCID 0x0000080a

          TWO  2

            0x0043  0x0048  2  4  2  3
            0x0063  0x0068  2  4  2  2

          THREE  1

            0x0043  0x0048  0x0049  2  4  2  3


        LCID 0x0000080c

          TWO  1

            0x0063  0x0068  2  4  2  2

          THREE  0


    ENDSORTTABLES



(8) Sample Ideograph Exceptions File


    IDEOGRAPH_EXCEPTION  4  xjis

      0xfa22  185  243
      0xfa23  185  244
      0xfa24  185  245
      0xfa25  185  246

