Converting GBK to UTF-8

Posted 小明快点跑


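1. File transcoding: use a script
 
Script for converting GBK files to UTF-8 (it uses the BSD/macOS forms file -I and sed -i ""; on GNU/Linux these would be file -i and sed -i):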
#!/bin/bash

FILE_SUFFIX="java xml html vm js"
# FILE_SUFFIX="vm"
file_names=""
for x in $FILE_SUFFIX
do
	file_names=$(find . -name "*.$x" | xargs file -I | grep -v utf-8 | awk -F " |:" '{print $1}')
	for file_name in $file_names
	do
		# echo $file_name
		iconv -f cp936 -t UTF-8 "$file_name" > "$file_name.new" &&
		mv -f "$file_name.new" "$file_name"
	done
	echo "$x ok"

done


find . -name "*.xml" | xargs sed -i "" "/<?xml/s/GBK/UTF-8/g"
find . -name "*.xml" | xargs sed -i "" "/<?xml/s/GB2312/UTF-8/g"

echo "xml head is ok!"

find . -name "pom.xml" | xargs sed -i "" "/<encoding>/s/GBK/UTF-8/g"
find . -name "pom.xml" | xargs sed -i "" "/<encoding>/s/GB2312/UTF-8/g"
find . -name "pom.xml" | xargs sed -i "" "/project.build.sourceEncoding/s/GBK/UTF-8/g"
find . -name "pom.xml" | xargs sed -i "" "/project.reporting.outputEncoding/s/GBK/UTF-8/g"
find . -name "pom.xml" | xargs sed -i "" "s/pop-vender-common-pageframe/pop-vender-common-pageframe-utf8/g"

echo "pom.xml is ok!"

find . -name "*.properties" | xargs sed -i "" "/input.encoding/s/GBK/UTF-8/g"
find . -name "*.properties" | xargs sed -i "" "/output.encoding/s/GBK/UTF-8/g"

echo "velocity properties is OK!"

find . -name "strut*.xml" | xargs sed -i "" '/struts.i18n.encoding/s/GBK/UTF-8/g'

echo "struts xml is ok!"

find . -name "*.vm" | xargs sed -i "" "s/\/common\/js\/jdmsg\/jd-msg.js/\/common\/js\/jdmsg\/jd-msg-utf8.js/g"
find . -name "*.vm" | xargs sed -i "" "/\/ui.datepicker.js/s/<script t/<script charset=\"GBK\" t/g"
find . -name "*.vm" | xargs sed -i "" "/\/jquery-calendar.js/s/<script t/<script charset=\"GBK\" t/g"
echo "vm is ok"

echo "finished"
# echo $file_names
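2. After transcoding, switch the local environment to UTF-8. Some files may still contain garbled text; fix those by hand.
3. Add charset="gbk" to script references for JS files that contain Chinese text,
   e.g. dependencies on static.360buying.com or shop.jd.com.
4. Build and packaging encoding: change it to UTF-8.
5. XML encoding declarations: these may previously have been gbk or gb2312; change them to utf-8.
6. Convert web.xml to UTF-8 and set the character encoding in the request filter.
   For example, when configuring it with Spring:
    <!-- Character encoding filter (intercepts requests and sets the charset) -->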
    <filter>
        <filter-name>charsetFilter</filter-name>
        <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
        <init-param>
            <param-name>forceEncoding</param-name>
            <param-value>true</param-value>
        </init-param>
    </filter>
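7. GBK used in code
   This mainly concerns places where GBK is hard-coded in the source,
   e.g. calls such as string.getBytes("GBK"), which need to be replaced.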
 
8. jdurl encoding configuration
Add <property name="charsetName" value="utf-8"/>
to avoid garbled Chinese text in pagination.
  For example, the jdurl encoding setting:
   <bean class="com.jd.pop.component.url.PopJdUrl">
         <property name="url" value="${pop-vender.login.address}"/>
         <property name="charsetName" value="utf-8"/>
   </bean>
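
To make step 7 a bit more concrete, here is a minimal sketch of what goes wrong when bytes written as UTF-8 are read back with a hard-coded "GBK". The class and method names (HardCodedCharsetDemo, encodeUtf8, decode) are made up for illustration and are not from the project; it also assumes the JDK ships the GBK charset, which mainstream JDK builds do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class HardCodedCharsetDemo {
    // Encode with an explicit UTF-8 constant instead of a hard-coded "GBK" literal.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode bytes with the named charset (hypothetical helper, used to show the mismatch).
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character string for "Chinese", written with
        // escapes so the example does not depend on the source file's encoding.
        String text = "\u4e2d\u6587";
        byte[] utf8 = encodeUtf8(text);
        byte[] gbk = text.getBytes(Charset.forName("GBK"));

        System.out.println(utf8.length); // 6 (three bytes per character)
        System.out.println(gbk.length);  // 4 (two bytes per character)

        // Same charset on both sides: the text survives.
        System.out.println(decode(utf8, "UTF-8").equals(text)); // true
        // UTF-8 bytes read as GBK: the text comes back garbled.
        System.out.println(decode(utf8, "GBK").equals(text));   // false
    }
}
```

After the migration, any leftover getBytes("GBK") or new String(bytes, "GBK") paired with UTF-8 data on the other side would behave like the last line, so a search for the literal "GBK" in the source tree is probably a reasonable first pass.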
 
Those are roughly the 8 steps.
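
For the manual repair in step 2, it can help to first list which files are still not well-formed UTF-8. The following is only a sketch under that assumption; Utf8Check and isValidUtf8 are hypothetical names, not part of the original script. It uses a strict CharsetDecoder that reports malformed input instead of silently replacing it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class Utf8Check {
    // Returns true only if the bytes decode as well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Usage: java Utf8Check file1 file2 ...
    public static void main(String[] args) throws Exception {
        for (String path : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(path));
            if (!isValidUtf8(bytes)) {
                System.out.println("not valid UTF-8: " + path);
            }
        }
    }
}
```

This is a heuristic: it catches files that were never converted, but a file that was converted twice or from the wrong source encoding can still be valid UTF-8 while showing garbled text, and short GBK sequences can occasionally pass as UTF-8 by coincidence.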
 
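The important part comes next:
At this point you will find that GET requests from the page to the server still produce garbled text. Don't panic: that is because you have not set Tomcat's encoding yet.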
 
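Calling request.setCharacterEncoding("UTF-8") to set the encoding Tomcat uses for incoming requests only affects data submitted via POST; it has no effect on data submitted via GET!
To set the encoding for GET, edit server.xml and add the attribute URIEncoding="UTF-8" to the Connector for the relevant port. Only then will data submitted via GET be decoded correctly.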
  <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" URIEncoding="UTF-8" />
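
To see why the URI charset matters, the sketch below decodes the same percent-encoded query value with two different charsets. It uses java.net.URLDecoder as a stand-in for the container's own URI decoding, so it only illustrates the effect and is not Tomcat's actual code path. QueryDecodingDemo and decode are hypothetical names. As far as I know, Tomcat 7 and earlier default to ISO-8859-1 for the URI, while Tomcat 8 and later default to UTF-8, so the attribute matters most on older versions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class QueryDecodingDemo {
    // Percent-decode a query value using the named charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // What a UTF-8 page sends for the two characters U+4E2D U+6587 in a GET query.
        String encoded = "%E4%B8%AD%E6%96%87";

        // Decoded as UTF-8: the original two characters come back.
        System.out.println(decode(encoded, "UTF-8").equals("\u4e2d\u6587")); // true

        // Decoded as ISO-8859-1: every byte becomes its own character, giving
        // six garbled characters instead of two.
        System.out.println(decode(encoded, "ISO-8859-1").length()); // 6
    }
}
```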
 
And that's it!
