Converting a String that contains Unicode escapes to UTF-8

Posted by lemon-flm


1. Plain Unicode conversion

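// The compiler already decodes the escapes in this literal, so a holds "可以注册".
String a = "\u53ef\u4ee5\u6ce8\u518c";
// "Unicode" is a JDK alias for UTF-16, so this round trip leaves the text unchanged.
a = new String(a.getBytes("UTF-16"), "Unicode");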
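The snippet above goes through UTF-16 on both sides, so it never actually produces UTF-8. If the goal is UTF-8 bytes, and assuming the literal is meant to carry backslash-u escapes that the compiler decodes, a minimal sketch could look like the following. The class name Utf8Demo and its two helper methods are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Demo {
    // Encode a Java String (UTF-16 internally) as UTF-8 bytes.
    public static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes back into a Java String.
    public static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Four CJK characters written as compile-time escapes.
        String a = "\u53ef\u4ee5\u6ce8\u518c";
        byte[] utf8 = toUtf8(a);
        System.out.println(a.length());               // 4 chars
        System.out.println(utf8.length);              // 12 bytes, 3 per character
        System.out.println(fromUtf8(utf8).equals(a)); // true
    }
}
```

Using StandardCharsets.UTF_8 instead of a charset name string also avoids the checked UnsupportedEncodingException that getBytes(String) declares.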

 

 

2. Converting a String that contains Unicode escapes to UTF-8

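/**
 * Decodes backslash-u escapes (a backslash, the letter u, then four hex digits)
 * and the escapes for tab, carriage return, newline and form feed.
 * The result is an ordinary Java String; call getBytes(StandardCharsets.UTF_8)
 * on it if UTF-8 bytes are needed.
 */
public static String decodeUnicode(String theString) {
    char aChar;
    int len = theString.length();
    StringBuilder outBuffer = new StringBuilder(len);
    for (int x = 0; x < len;) {
        aChar = theString.charAt(x++);
        if (aChar == '\\') {
            if (x == len) {
                // A trailing backslash has nothing to escape; keep it as is.
                outBuffer.append(aChar);
                break;
            }
            aChar = theString.charAt(x++);
            if (aChar == 'u') {
                if (x + 4 > len) {
                    throw new IllegalArgumentException(
                            "Malformed \\uxxxx encoding.");
                }
                // Read the four hex digits xxxx.
                int value = 0;
                for (int i = 0; i < 4; i++) {
                    aChar = theString.charAt(x++);
                    switch (aChar) {
                    case '0':
                    case '1':
                    case '2':
                    case '3':
                    case '4':
                    case '5':
                    case '6':
                    case '7':
                    case '8':
                    case '9':
                        value = (value << 4) + aChar - '0';
                        break;
                    case 'a':
                    case 'b':
                    case 'c':
                    case 'd':
                    case 'e':
                    case 'f':
                        value = (value << 4) + 10 + aChar - 'a';
                        break;
                    case 'A':
                    case 'B':
                    case 'C':
                    case 'D':
                    case 'E':
                    case 'F':
                        value = (value << 4) + 10 + aChar - 'A';
                        break;
                    default:
                        throw new IllegalArgumentException(
                                "Malformed \\uxxxx encoding.");
                    }
                }
                outBuffer.append((char) value);
            } else {
                // Map the single-letter escapes to their control characters.
                if (aChar == 't')
                    aChar = '\t';
                else if (aChar == 'r')
                    aChar = '\r';
                else if (aChar == 'n')
                    aChar = '\n';
                else if (aChar == 'f')
                    aChar = '\f';
                outBuffer.append(aChar);
            }
        } else {
            outBuffer.append(aChar);
        }
    }
    return outBuffer.toString();
}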
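For input that only contains backslash-u escapes, the same idea can be sketched more compactly with a regular expression. This is not the method above but a simplified alternative: the class name UnicodeEscapes and the method decode are hypothetical, it does not handle the tab, carriage return, newline or form feed escapes, and it leaves malformed escapes untouched instead of throwing. The main method also shows the final step to UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeEscapes {
    // A literal backslash, the letter u, then exactly four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replace every well-formed escape with the character it names.
    public static String decode(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder(input.length());
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous match and this one.
            out.append(input, last, m.start());
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        // Copy whatever follows the last match.
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslashes keep the escapes as plain text at run time.
        String raw = "name=\\u53ef\\u4ee5";
        String decoded = decode(raw);
        byte[] utf8 = decoded.getBytes(StandardCharsets.UTF_8);
        System.out.println(decoded);      // name=可以
        System.out.println(utf8.length);  // 11 bytes: 5 ASCII + 2 * 3
    }
}
```

Code points above U+FFFF arrive as two consecutive escapes (a surrogate pair); both this sketch and the method above append them one char at a time, which yields the correct pair in the resulting String.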

 
