Taishan OFFICE Tech Talk: A Study of JDK Font Encoding Support, Part 1

Posted by 柳鲲鹏


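  How does the JDK's font code support encodings, and how complete is that support? I dug into the source to find out.

  This study covers only the TrueType font format. The relevant source file is sun.font.TrueTypeFont.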

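  • Field defaultCodePage

Presumably the JDK's default encoding?

  • Fields ulUnicodeRange1/ulUnicodeRange2/ulUnicodeRange3/ulUnicodeRange4

Presumably the Unicode ranges the font supports? The names match the four Unicode-range bitfields of the OS/2 table in the TrueType/OpenType specification.

  • Fields ulCodePageRange1/ulCodePageRange2

The code pages (character sets) the font supports? These names likewise match the two code-page bitfields of the OS/2 table.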
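To make these bitfields concrete, here is a minimal sketch of pulling them out of the raw bytes of an OS/2 table. It is not the JDK's code: the class name `Os2Ranges` and the method `parse` are hypothetical, and the byte offsets (42 for ulUnicodeRange1, 78 for ulCodePageRange1) are taken from the OS/2 table layout in the OpenType specification. Locating the OS/2 table inside a font file is left out.

```java
import java.nio.ByteBuffer;

/** Sketch only: holds the range bitfields of one OS/2 table (hypothetical helper). */
final class Os2Ranges {
    // Byte offsets within the OS/2 table, per the OpenType OS/2 layout.
    static final int UNICODE_RANGE_OFFSET = 42;   // ulUnicodeRange1..4, 4 bytes each
    static final int CODE_PAGE_RANGE_OFFSET = 78; // ulCodePageRange1..2, 4 bytes each

    final int[] unicodeRange = new int[4];
    final int[] codePageRange = new int[2];

    /** Reads the bitfields from the raw bytes of an OS/2 table. */
    static Os2Ranges parse(byte[] os2Table) {
        // Font data is big-endian, which is also ByteBuffer's default order.
        ByteBuffer buf = ByteBuffer.wrap(os2Table);
        Os2Ranges r = new Os2Ranges();
        if (buf.capacity() >= UNICODE_RANGE_OFFSET + 16) {
            for (int i = 0; i < 4; i++) {
                r.unicodeRange[i] = buf.getInt(UNICODE_RANGE_OFFSET + 4 * i);
            }
        }
        // Version 0 tables stop at byte 78 and carry no code-page ranges;
        // in that case the two fields simply stay 0.
        if (buf.capacity() >= CODE_PAGE_RANGE_OFFSET + 8) {
            for (int i = 0; i < 2; i++) {
                r.codePageRange[i] = buf.getInt(CODE_PAGE_RANGE_OFFSET + 4 * i);
            }
        }
        return r;
    }
}
```

Each set bit in a returned `int` marks one supported Unicode block or code page; which bit means what is defined by the specification, not by this sketch.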

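  • Field supportsCJK

CJK is the abbreviation for Chinese, Japanese, and Korean.

  • Field supportsJA

JA presumably stands for Japanese ("ja" is the ISO 639-1 language code for Japanese).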

  • Field encoding_mapping
    static final String encoding_mapping[] = {
        "cp1252",    /*  0:Latin 1  */
        "cp1250",    /*  1:Latin 2  */
        "cp1251",    /*  2:Cyrillic */
        "cp1253",    /*  3:Greek    */
        "cp1254",    /*  4:Turkish/Latin 5  */
        "cp1255",    /*  5:Hebrew   */
        "cp1256",    /*  6:Arabic   */
        "cp1257",    /*  7:Windows Baltic   */
        "",          /*  8:reserved for alternate ANSI */
        "",          /*  9:reserved for alternate ANSI */
        "",          /* 10:reserved for alternate ANSI */
        "",          /* 11:reserved for alternate ANSI */
        "",          /* 12:reserved for alternate ANSI */
        "",          /* 13:reserved for alternate ANSI */
        "",          /* 14:reserved for alternate ANSI */
        "",          /* 15:reserved for alternate ANSI */
        "ms874",     /* 16:Thai     */
        "ms932",     /* 17:JIS/Japanese */
        "gbk",       /* 18:PRC GBK Cp950  */
        "ms949",     /* 19:Korean Extended Wansung */
        "ms950",     /* 20:Chinese (Taiwan, Hongkong, Macau) */
        "ms1361",    /* 21:Korean Johab */
        "",          /* 22 */
        "",          /* 23 */
        "",          /* 24 */
        "",          /* 25 */
        "",          /* 26 */
        "",          /* 27 */
        "",          /* 28 */
        "",          /* 29 */
        "",          /* 30 */
        "",          /* 31 */
    };

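East Asian fonts: Japanese maps to ms932, Chinese to gbk/ms950, and Korean to ms949/ms1361. An encoding of GB18030 is renamed to gbk, which simplifies the check.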
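The numbered comments in the table suggest that each index is a bit position in ulCodePageRange1. Under that assumption, an encoding check could be sketched as follows. The class `CodePageBits` and its method are hypothetical names for illustration, not the code in sun.font.TrueTypeFont; the gb18030-to-gbk rename is the one described above.

```java
import java.util.Locale;

/** Sketch only: tests a code-page bitfield against an encoding name. */
final class CodePageBits {
    // Same order as the encoding_mapping table: array index == bit number.
    static final String[] ENCODING_MAPPING = {
        "cp1252", "cp1250", "cp1251", "cp1253",   // bits 0-3
        "cp1254", "cp1255", "cp1256", "cp1257",   // bits 4-7
        "", "", "", "", "", "", "", "",           // bits 8-15: reserved
        "ms874", "ms932", "gbk", "ms949",         // bits 16-19
        "ms950", "ms1361",                        // bits 20-21
        "", "", "", "", "", "", "", "", "", "",   // bits 22-31: unused here
    };

    /** Returns true if the bit assigned to the encoding is set in codePageRange1. */
    static boolean supportsEncoding(int codePageRange1, String encoding) {
        if (encoding == null || encoding.isEmpty()) {
            return false; // also keeps "" from matching a reserved slot
        }
        String enc = encoding.toLowerCase(Locale.ROOT);
        if (enc.equals("gb18030")) {
            enc = "gbk"; // the rename described in the text
        }
        for (int bit = 0; bit < ENCODING_MAPPING.length; bit++) {
            if (ENCODING_MAPPING[bit].equals(enc)) {
                return (codePageRange1 & (1 << bit)) != 0;
            }
        }
        return false; // unknown encoding name
    }
}
```

With this sketch, a font whose ulCodePageRange1 has only bit 18 set would report support for "gbk" and "GB18030" but not for "ms932".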

  • Field languages
    private static final String languages[][] = {

        /* cp1252/Latin 1 */
        { "en", "ca", "da", "de", "es", "fi", "fr", "is", "it",
          "nl", "no", "pt", "sq", "sv", },

         /* cp1250/Latin2 */
        { "cs", "cz", "et", "hr", "hu", "nr", "pl", "ro", "sk",
          "sl", "sq", "sr", },

        /* cp1251/Cyrillic */
        { "bg", "mk", "ru", "sh", "uk" },

        /* cp1253/Greek*/
        { "el" },

         /* cp1254/Turkish,Latin 5 */
        { "tr" },

         /* cp1255/Hebrew */
        { "he" },

        /* cp1256/Arabic */
        { "ar" },

         /* cp1257/Windows Baltic */
        { "et", "lt", "lv" },

        /* ms874/Thai */
        { "th" },

         /* ms932/Japanese */
        { "ja" },

        /* gbk/Chinese (PRC GBK Cp950) */
        { "zh", "zh_CN", },

        /* ms949/Korean Extended Wansung */
        { "ko" },

        /* ms950/Chinese (Taiwan, Hongkong, Macau) */
        { "zh_HK", "zh_TW", },

        /* ms1361/Korean Johab */
        { "ko" },
    };

The entries relevant to East Asia:

         /* ms932/Japanese */
        { "ja" },

        /* gbk/Chinese (PRC GBK Cp950) */
        { "zh", "zh_CN", },

        /* ms949/Korean Extended Wansung */
        { "ko" },

        /* ms950/Chinese (Taiwan, Hongkong, Macau) */
        { "zh_HK", "zh_TW", },

        /* ms1361/Korean Johab */
        { "ko" },

  • Field codePages
    private static final String codePages[] = {
        "cp1252",
        "cp1250",
        "cp1251",
        "cp1253",
        "cp1254",
        "cp1255",
        "cp1256",
        "cp1257",
        "ms874",
        "ms932",
        "gbk",
        "ms949",
        "ms950",
        "ms1361",
    };
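The languages and codePages arrays have the same number of rows in the same order, which suggests they are meant to be read in parallel: row i of languages lists the locales served by codePages[i]. The following sketch shows one way such a lookup could work. `CodePageLookup` and `codePageFor` are hypothetical names, and appending the country only for "zh" is an assumption made here so that the "zh_CN", "zh_HK", and "zh_TW" entries can match.

```java
import java.util.Locale;

/** Sketch only: maps a locale to a code page through two parallel tables. */
final class CodePageLookup {
    // Copied from the languages table; row i pairs with CODE_PAGES[i].
    static final String[][] LANGUAGES = {
        { "en", "ca", "da", "de", "es", "fi", "fr", "is", "it",
          "nl", "no", "pt", "sq", "sv" },
        { "cs", "cz", "et", "hr", "hu", "nr", "pl", "ro", "sk",
          "sl", "sq", "sr" },
        { "bg", "mk", "ru", "sh", "uk" },
        { "el" }, { "tr" }, { "he" }, { "ar" },
        { "et", "lt", "lv" },
        { "th" }, { "ja" },
        { "zh", "zh_CN" },
        { "ko" },
        { "zh_HK", "zh_TW" },
        { "ko" },
    };

    static final String[] CODE_PAGES = {
        "cp1252", "cp1250", "cp1251", "cp1253", "cp1254", "cp1255", "cp1256",
        "cp1257", "ms874", "ms932", "gbk", "ms949", "ms950", "ms1361",
    };

    /** Returns the code page of the first row that lists the locale, or null. */
    static String codePageFor(Locale locale) {
        String lang = locale.getLanguage();
        // Assumption: only Chinese needs the country to pick between gbk and ms950.
        if (lang.equals("zh") && !locale.getCountry().isEmpty()) {
            lang = lang + "_" + locale.getCountry();
        }
        for (int i = 0; i < LANGUAGES.length; i++) {
            for (String candidate : LANGUAGES[i]) {
                if (candidate.equals(lang)) {
                    return CODE_PAGES[i];
                }
            }
        }
        return null; // no row lists this language
    }
}
```

Because the first matching row wins in this sketch, the duplicated entries are shadowed: "ko" always resolves to ms949 rather than ms1361, and "et" and "sq" resolve to cp1250 and cp1252 respectively.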

Next, let's look at how this encoding information is handled when a font is created.
