如何设置数据库的LC_COLLATE, LC_CTYPE, ENCODING, TEMPLATE

Posted 2020-09-14

tags:

篇首语：本文由小常识网(cha138.com)小编为大家整理，主要介绍了如何设置数据库的LC_COLLATE, LC_CTYPE, ENCODING, TEMPLATE相关的知识，希望对你有一定的参考价值。

背景

PostgreSQL实例支持创建多个数据库，创建数据库时，可以指定模板库，并为每个数据库设置不同的字符集、本地化collate, 货币等属性。

接下来举例讲述CREATE DATABASE的具体使用方法。

一、CREATE DATABASE语法

Command:     CREATE DATABASE  
Description: create a new database  
Syntax:  
CREATE DATABASE name  
    [ [ WITH ] [ OWNER [=] user_name ]  
           [ TEMPLATE [=] template ]  
           [ ENCODING [=] encoding ]  
           [ LC_COLLATE [=] lc_collate ]  
           [ LC_CTYPE [=] lc_ctype ]  
           [ TABLESPACE [=] tablespace_name ]  
           [ CONNECTION LIMIT [=] connlimit ] ]

二、克隆模板库(从指定模板创建数据库)

例子

以test数据库为模板，克隆一个名为test01的数据库。

test=> create database test01 with template test;  
CREATE DATABASE

如果不指定模板，则默认的模板为template1。

注意

从指定模板库创建数据库，除了当前连接，不能有其他用户连在对应的模板库上面。

如果有其他用户连在test库，可能会报类似这样的错误

test=> create database test01 with template test;  
ERROR:  source database "test" is being accessed by other users  
DETAIL:  There is 1 other session using the database.

三、创建数据库时，如何指定encoding

PostgreSQL支持哪些encoding

用户可以参考PostgreSQL的官方文档，有对应的字符集支持列表

https://www.postgresql.org/docs/9.6/static/multibyte.html

Server=Yes表示该字符集支持用于create database。否则只支持作为客户端字符集。

Name	Description	Language	Server?	Bytes/Char	Aliases
BIG5	Big Five	Traditional Chinese	No	1-2	WIN950, Windows950
EUC_CN	Extended UNIX Code-CN	Simplified Chinese	Yes	1-3	-
EUC_JP	Extended UNIX Code-JP	Japanese	Yes	1-3	-
EUC_JIS_2004	Extended UNIX Code-JP, JIS X 0213	Japanese	Yes	1-3	-
EUC_KR	Extended UNIX Code-KR	Korean	Yes	1-3	-
EUC_TW	Extended UNIX Code-TW	Traditional Chinese, Taiwanese	Yes	1-3	-
GB18030	National Standard	Chinese	No	1-4	-
GBK	Extended National Standard	Simplified Chinese	No	1-2	WIN936, Windows936
ISO_8859_5	ISO 8859-5, ECMA 113	Latin/Cyrillic	Yes	1	-
ISO_8859_6	ISO 8859-6, ECMA 114	Latin/Arabic	Yes	1	-
ISO_8859_7	ISO 8859-7, ECMA 118	Latin/Greek	Yes	1	-
ISO_8859_8	ISO 8859-8, ECMA 121	Latin/Hebrew	Yes	1	-
JOHAB	JOHAB	Korean (Hangul)	No	1-3	-
KOI8R	KOI8-R	Cyrillic (Russian)	Yes	1	KOI8
KOI8U	KOI8-U	Cyrillic (Ukrainian)	Yes	1	-
LATIN1	ISO 8859-1, ECMA 94	Western European	Yes	1	ISO88591
LATIN2	ISO 8859-2, ECMA 94	Central European	Yes	1	ISO88592
LATIN3	ISO 8859-3, ECMA 94	South European	Yes	1	ISO88593
LATIN4	ISO 8859-4, ECMA 94	North European	Yes	1	ISO88594
LATIN5	ISO 8859-9, ECMA 128	Turkish	Yes	1	ISO88599
LATIN6	ISO 8859-10, ECMA 144	Nordic	Yes	1	ISO885910
LATIN7	ISO 8859-13	Baltic	Yes	1	ISO885913
LATIN8	ISO 8859-14	Celtic	Yes	1	ISO885914
LATIN9	ISO 8859-15	LATIN1 with Euro and accents	Yes	1	ISO885915
LATIN10	ISO 8859-16, ASRO SR 14111	Romanian	Yes	1	ISO885916
MULE_INTERNAL	Mule internal code	Multilingual Emacs	Yes	1-4	-
SJIS	Shift JIS	Japanese	No	1-2	Mskanji, ShiftJIS, WIN932, Windows932
SHIFT_JIS_2004	Shift JIS, JIS X 0213	Japanese	No	1-2	-
SQL_ASCII	unspecified (see text)	any	Yes	1	-
UHC	Unified Hangul Code	Korean	No	1-2	WIN949, Windows949
UTF8	Unicode, 8-bit	all	Yes	1-4	Unicode
WIN866	Windows CP866	Cyrillic	Yes	1	ALT
WIN874	Windows CP874	Thai	Yes	1	-
WIN1250	Windows CP1250	Central European	Yes	1	-
WIN1251	Windows CP1251	Cyrillic	Yes	1	WIN
WIN1252	Windows CP1252	Western European	Yes	1	-
WIN1253	Windows CP1253	Greek	Yes	1	-
WIN1254	Windows CP1254	Turkish	Yes	1	-
WIN1255	Windows CP1255	Hebrew	Yes	1	-
WIN1256	Windows CP1256	Arabic	Yes	1	-
WIN1257	Windows CP1257	Baltic	Yes	1	-
WIN1258	Windows CP1258	Vietnamese	Yes	1	ABC, TCVN, TCVN5712, VSCII

创建指定encoding的数据库

例子

创建一个UTF-8字符集的数据库

test=> create database test02 with encoding ‘UTF-8‘;  
CREATE DATABASE

注意

1. 指定的字符集必须是模板库字符集的超集，否则会报错。

2. 指定的lc_ctype和lc_collate必须与目标字符集兼容。

例子，template1是默认模板库，它的字符集为UTF8。

test=> \l template1  
                                  List of databases  
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges     
-----------+----------+----------+-------------+-------------+-----------------------  
 template1 | xxxxxxxx | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/xxxxxxxx          +  
           |          |          |             |             | xxxxxxxx=CTc/xxxxxxxx  
(1 row)

创建一个EUC_CN字符集的数据库

报错1，EUC_CN字符集与模板库的lc_collate,lc_ctype不兼容。

test=> create database test03 with encoding ‘EUC_CN‘;  
ERROR:  encoding "EUC_CN" does not match locale "zh_CN.UTF-8"  
DETAIL:  The chosen LC_CTYPE setting requires encoding "UTF8".

报错2，EUC_CN字符集与模板库的字符集UTF-8不兼容。

test=> create database test03 with encoding ‘EUC_CN‘ lc_collate=‘C‘ lc_ctype=‘C‘;  
ERROR:  new encoding (EUC_CN) is incompatible with the encoding of the template database (UTF8)  
HINT:  Use the same encoding as in the template database, or use template0 as template.

四、如何获取字符集支持的LC_COLLATE, LC_CTYPE信息

使用如下SQL可以查询系统表pg_collation得到字符集支持的lc_collate和lc_ctype。

其中encoding为空时，表示这个collation支持所有的字符集。

test=> select pg_encoding_to_char(collencoding) as encoding,collname,collcollate,collctype from pg_collation ;  
  encoding  |       collname        |      collcollate      |       collctype         
------------+-----------------------+-----------------------+-----------------------  
            | default               |                       |   
            | C                     | C                     | C  
            | POSIX                 | POSIX                 | POSIX  
 UTF8       | aa_DJ                 | aa_DJ.utf8            | aa_DJ.utf8  
 LATIN1     | aa_DJ                 | aa_DJ                 | aa_DJ  
 LATIN1     | aa_DJ.iso88591        | aa_DJ.iso88591        | aa_DJ.iso88591  
 UTF8       | aa_DJ.utf8            | aa_DJ.utf8            | aa_DJ.utf8  
 UTF8       | aa_ER                 | aa_ER                 | aa_ER  
 UTF8       | aa_ER.utf8            | aa_ER.utf8            | aa_ER.utf8  
.......  
 EUC_CN     | zh_CN                 | zh_CN                 | zh_CN  
 UTF8       | zh_CN                 | zh_CN.utf8            | zh_CN.utf8  
 EUC_CN     | zh_CN.gb2312          | zh_CN.gb2312          | zh_CN.gb2312  
 UTF8       | zh_CN.utf8            | zh_CN.utf8            | zh_CN.utf8  
 UTF8       | zh_HK                 | zh_HK.utf8            | zh_HK.utf8  
 UTF8       | zh_HK.utf8            | zh_HK.utf8            | zh_HK.utf8  
 EUC_CN     | zh_SG                 | zh_SG                 | zh_SG  
 UTF8       | zh_SG                 | zh_SG.utf8            | zh_SG.utf8  
 EUC_CN     | zh_SG.gb2312          | zh_SG.gb2312          | zh_SG.gb2312  
 UTF8       | zh_SG.utf8            | zh_SG.utf8            | zh_SG.utf8  
 EUC_TW     | zh_TW                 | zh_TW.euctw           | zh_TW.euctw  
 UTF8       | zh_TW                 | zh_TW.utf8            | zh_TW.utf8  
 EUC_TW     | zh_TW.euctw           | zh_TW.euctw           | zh_TW.euctw  
 UTF8       | zh_TW.utf8            | zh_TW.utf8            | zh_TW.utf8  
 UTF8       | zu_ZA                 | zu_ZA.utf8            | zu_ZA.utf8  
 LATIN1     | zu_ZA                 | zu_ZA                 | zu_ZA  
 LATIN1     | zu_ZA.iso88591        | zu_ZA.iso88591        | zu_ZA.iso88591  
 UTF8       | zu_ZA.utf8            | zu_ZA.utf8            | zu_ZA.utf8  
(869 rows)

五、创建数据库时指定LC_COLLATE, LC_CTYPE

例子

创建一个数据库，lc_collate, lc_ctype分别为zh_CN.utf8

test=> create database test05 with encoding ‘UTF-8‘ template template0 lc_collate=‘zh_CN.utf8‘ lc_ctype=‘zh_CN.utf8‘;  
CREATE DATABASE

注意

如果指定的lc_collate, lc_ctype与模板库的collate,ctype不兼容，会报错。

test=> create database test04 with encoding ‘UTF-8‘ lc_collate=‘zh_CN.utf8‘ lc_ctype=‘zh_CN.utf8‘;  
ERROR:  new collation (zh_CN.utf8) is incompatible with the collation of the template database (zh_CN.UTF-8)  
HINT:  Use the same collation as in the template database, or use template0 as template.

解决办法1，使用兼容的collate和ctype。

test=> create database test04 with encoding ‘UTF-8‘ lc_collate=‘zh_CN.UTF-8‘ lc_ctype=‘zh_CN.UTF-8‘;  
CREATE DATABASE

解决办法2，使用template0作为模板库。

test=> create database test05 with encoding ‘UTF-8‘ template template0 lc_collate=‘zh_CN.utf8‘ lc_ctype=‘zh_CN.utf8‘;  
CREATE DATABASE

六、如何修改已有数据库的collate,ctype

目前无法直接通过alter database的语法进行修改，用户可以使用创建新的数据库，导出，再导入的方式。

例子

1. 创建新数据库，指定目标collate和ctype

2. 使用pg_dump或其他客户端工具逻辑导出源数据库的数据

3. 使用pg_restore或其他客户端工具，将第二步导出数据导入新数据库

以上是关于如何设置数据库的LC_COLLATE, LC_CTYPE, ENCODING, TEMPLATE的主要内容，如果未能解决你的问题，请参考以下文章

在 OS X Lion 中，LANG 未设置为 UTF-8，如何解决？

PostgreSQL 数据库操作

PG基础篇--逻辑结构管理（库模式表约束）

Ubuntu里字符编码设置

PostgreSQL 库相关的操作