Alter table default character set modifies the rows in MySQL 5.6

[Posted]: 2019-07-05 07:20:55

[Question]:

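I am using MySQL 5.6 and I want to change the default encoding of a table (from latin1 to utf8) without modifying the existing columns and rows. Based on the documentation, I tried the following command: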

ALTER TABLE mytable DEFAULT CHARACTER SET utf8;

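It changed the default character set of my table and, as expected, did not change the collation of the columns, but I was really surprised to see: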

Query OK, 32141 rows affected (6.31 sec)
Records: 32141 Duplicates: 0  Warnings: 0

Apart from the "32141 rows affected", the result is what you would expect, as shown below:

MySQL> select count(*) from mytable;
+----------+
| count(*) |
+----------+
|    32141 |
+----------+
1 row in set (0.01 sec)
MySQL> show table status like 'mytable';
+-----------------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-----------------+----------+----------------+---------+
| Name                  | Engine | Version | Row_format | Rows  | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation       | Checksum | Create_options | Comment |
+-----------------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-----------------+----------+----------------+---------+
| mytable               | InnoDB |      10 | Compact    | 16723 |          20798 |   347815936 |               0 |     21561344 |  15728640 |           NULL | NULL        | NULL        | NULL       | utf8_general_ci |     NULL | partitioned    |         |
+-----------------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-----------------+----------+----------------+---------+

MySQL> show create table mytable;
CREATE TABLE `mytable` (
  `ID` varchar(255) NOT NULL,
  `COL1` double DEFAULT NULL,
  `COL2` longtext CHARACTER SET latin1,
  `COL3` datetime DEFAULT NULL,
  `COL4` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
  `COL5` int(11) DEFAULT NULL,
  `COL6` datetime DEFAULT NULL,
  `COL7` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
  `COL8` datetime(3) NOT NULL,
  `COL9` int(11) NOT NULL DEFAULT '-1',
  `COL10` int(11) DEFAULT '0',
  `COL11` double DEFAULT '0',
  PRIMARY KEY (`ID`,`COL9`),
  KEY `idx1` (`COL7`,`COL3`,`COL6`),
  KEY `idx2` (`COL1`,`COL4`,`COL3`,`COL6`),
  KEY `idx3` (`ID`,`COL3`,`COL6`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (`COL9`)
(PARTITION p0 VALUES LESS THAN (1) ENGINE = InnoDB,
 PARTITION p1 VALUES LESS THAN (2) ENGINE = InnoDB,
 PARTITION p2 VALUES LESS THAN (3) ENGINE = InnoDB,
 PARTITION p3 VALUES LESS THAN (4) ENGINE = InnoDB,
 PARTITION p4 VALUES LESS THAN (5) ENGINE = InnoDB,
 PARTITION p5 VALUES LESS THAN (6) ENGINE = InnoDB,
 PARTITION p6 VALUES LESS THAN (7) ENGINE = InnoDB,
 PARTITION p7 VALUES LESS THAN (8) ENGINE = InnoDB,
 PARTITION p8 VALUES LESS THAN (9) ENGINE = InnoDB,
 PARTITION p9 VALUES LESS THAN (10) ENGINE = InnoDB,
 PARTITION p10 VALUES LESS THAN (11) ENGINE = InnoDB,
 PARTITION p11 VALUES LESS THAN (100) ENGINE = InnoDB,
 PARTITION p12 VALUES LESS THAN (101) ENGINE = InnoDB,
 PARTITION p13 VALUES LESS THAN (102) ENGINE = InnoDB,
 PARTITION p14 VALUES LESS THAN (103) ENGINE = InnoDB,
 PARTITION p15 VALUES LESS THAN (104) ENGINE = InnoDB,
 PARTITION p16 VALUES LESS THAN (105) ENGINE = InnoDB,
 PARTITION p17 VALUES LESS THAN (106) ENGINE = InnoDB,
 PARTITION p18 VALUES LESS THAN (107) ENGINE = InnoDB,
 PARTITION p19 VALUES LESS THAN (108) ENGINE = InnoDB,
 PARTITION p20 VALUES LESS THAN (109) ENGINE = InnoDB,
 PARTITION p21 VALUES LESS THAN (110) ENGINE = InnoDB,
 PARTITION p22 VALUES LESS THAN (111) ENGINE = InnoDB,
 PARTITION p23 VALUES LESS THAN (200) ENGINE = InnoDB,
 PARTITION p24 VALUES LESS THAN (201) ENGINE = InnoDB,
 PARTITION p25 VALUES LESS THAN (202) ENGINE = InnoDB,
 PARTITION p26 VALUES LESS THAN (203) ENGINE = InnoDB,
 PARTITION p27 VALUES LESS THAN (204) ENGINE = InnoDB,
 PARTITION p28 VALUES LESS THAN (205) ENGINE = InnoDB,
 PARTITION p29 VALUES LESS THAN (206) ENGINE = InnoDB,
 PARTITION p30 VALUES LESS THAN (207) ENGINE = InnoDB,
 PARTITION p31 VALUES LESS THAN (208) ENGINE = InnoDB,
 PARTITION p32 VALUES LESS THAN (209) ENGINE = InnoDB,
 PARTITION p33 VALUES LESS THAN (210) ENGINE = InnoDB,
 PARTITION p34 VALUES LESS THAN (211) ENGINE = InnoDB,
 PARTITION p35 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
MySQL> show full columns from mytable;
+--------------------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| Field                    | Type         | Collation         | Null | Key | Default | Extra | Privileges                      | Comment |
+--------------------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| ID                       | varchar(255) | latin1_swedish_ci | NO   | PRI | NULL    |       | select,insert,update,references |         |
| COL1                     | double       | NULL              | YES  | MUL | NULL    |       | select,insert,update,references |         |
| COL2                     | longtext     | latin1_swedish_ci | YES  |     | NULL    |       | select,insert,update,references |         |
| COL3                     | datetime     | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| COL4                     | varchar(255) | latin1_swedish_ci | YES  |     | NULL    |       | select,insert,update,references |         |
| COL5                     | int(11)      | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| COL6                     | datetime     | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| COL7                     | varchar(255) | latin1_swedish_ci | YES  | MUL | NULL    |       | select,insert,update,references |         |
| COL8                     | datetime(3)  | NULL              | NO   |     | NULL    |       | select,insert,update,references |         |
| COL9                     | int(11)      | NULL              | NO   | PRI | -1      |       | select,insert,update,references |         |
| COL10                    | int(11)      | NULL              | YES  |     | 0       |       | select,insert,update,references |         |
| COL11                    | double       | NULL              | YES  |     | 0       |       | select,insert,update,references |         |
+--------------------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+

My connection parameters are as follows:

MySQL> show variables where variable_name like '%char%' or variable_name like '%collation%';
+--------------------------+--------------------------------------------------+
| Variable_name            | Value                                            |
+--------------------------+--------------------------------------------------+
| character_set_client     | utf8mb4                                          |
| character_set_connection | utf8mb4                                          |
| character_set_database   | utf8mb4                                          |
| character_set_filesystem | binary                                           |
| character_set_results    | utf8mb4                                          |
| character_set_server     | utf8mb4                                          |
| character_set_system     | utf8                                             |
| collation_connection     | utf8mb4_general_ci                               |
| collation_database       | utf8mb4_general_ci                               |
| collation_server         | utf8mb4_general_ci                               |
+--------------------------+--------------------------------------------------+

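Notes:

    The data was created from a Java application.
    When the data was created, the connection parameters were set to utf8.
    No FK is linked to this table.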

When I tried to reproduce this with some newly created tables, the rows did not appear to be modified. See "0 rows affected" below:

MySQL> select count(*) from mytesttable;
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.10 sec)
MySQL> alter table mytesttable character set utf8;
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

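I tried switching the connection parameters back to latin1 while creating the data, but the result did not change: still "0 rows affected".

So my questions:

    Is my understanding of the command correct? (It should not modify rows.)
    What could explain rows being affected in the first case?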

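Edit

I just found out that the problem does not occur if I remove the partitions.

    With partitions I get "XXX rows affected".
    Without partitions I get "0 rows affected".

Is this expected?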

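Edit 2, with a summary

Initially:

    The table uses latin1 as its default encoding (same for the columns).
    The connection is declared as utf8.

What works:

    Before any ALTER TABLE command, characters like "é" appear to be latin1-encoded (E9).
    Running ALTER TABLE mytable CHARACTER SET utf8mb4; does not modify the data (the HEX command still shows E9, and the column is still declared as latin1).
    Running ALTER TABLE mytable MODIFY COL2 LONGTEXT CHARACTER SET utf8mb4 converts the data to utf8mb4 (C3A9).

So far so good.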

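Remaining questions:

    How can I make sure that all the data in the table is latin1? I tried SELECT COL2 FROM mytable WHERE LENGTH(COL2) != CHAR_LENGTH(COL2) LIMIT 1 and got 0 results. Is that enough?
    Why does the command ALTER TABLE mytable CHARACTER SET utf8mb4; show "32141 rows affected" when the data does not seem to have been modified? (This happens when the table has a partition and an index on the same column.)
    Given the previous point, is it safe (or even necessary?) to change the table's default encoding? Or should I stick to modifying the columns?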
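
A note on the first question: latin1 is a single-byte character set, so for a latin1 column LENGTH and CHAR_LENGTH always agree, whatever bytes are stored. That check therefore cannot distinguish real latin1 data from UTF-8 bytes sitting in a latin1 column. Looking at the HEX() output is more telling. The Python sketch below shows one way to classify a hex string copied from HEX(); the function name classify_hex and its three labels are made up for illustration, and the result is a heuristic rather than proof (a valid UTF-8 byte sequence can also be legitimate latin1 text).

```python
def classify_hex(hex_value: str) -> str:
    """Guess how the bytes behind a HEX() value are encoded (heuristic only)."""
    raw = bytes.fromhex(hex_value)
    # Pure 7-bit data looks the same in latin1 and UTF-8.
    if all(b < 0x80 for b in raw):
        return "ascii"
    try:
        # High bytes that form well-formed UTF-8 sequences suggest UTF-8.
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # A lone high byte such as E9 is not valid UTF-8.
        return "probably latin1"
    return "valid utf-8"


# The two hex values quoted later in this thread for the word with accents.
print(classify_hex("E96CE96D656E7473"))
print(classify_hex("C3A96CC3A96D656E7473"))
```

Running this over a sample of HEX(COL2) values would flag rows whose bytes do not match the declared column character set.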

Thanks a lot for your help.

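[Comments]:

Does the data actually appear to have changed?

Hi @deceze, thanks for taking the time. It does not seem to have changed, but I'm not sure I tried the right commands. I ran select * from mytable limit 1; and select hex(COL1), hex(COL2)... from mytable limit 1; on both the backup table and the modified table, and got the same results. I also tried the command mentioned here, but it gave me no results.

In mytesttable, are there any CHAR or TEXT columns? What did SHOW CREATE TABLE mytesttable say before the ALTER?

Do you see any benefit from PARTITIONing? (I doubt it.)

Let's see the HEX() of a cell containing an accented letter. You may have "double encoding" without realizing it.

[Answer 1]: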

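You goofed, and the ALTER made things worse.

First, the table columns were declared latin1 and the connection declared that the client was using latin1 (via SET NAMES latin1). That would have been fine if é in the client were actually hex E9. But the data in the client was UTF-8, so é was the two bytes C3A9, which were sent to the database as 2 latin1 characters. The corruption was not obvious because it was reversed when you SELECTed.

A later step treated each of those bytes as latin1 and converted it to utf8, which made a mess of things; hence "double" encoding.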
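
The chain of events just described can be reproduced outside MySQL. The following is a minimal Python sketch, not a model of the server itself: double_encode is a hypothetical helper name, and Python's "latin-1" codec (ISO-8859-1) is assumed to stand in for MySQL's latin1, which is accurate for the bytes C3 and A9 used here.

```python
def double_encode(text: str):
    """Return (bytes sent by a UTF-8 client, bytes after a latin1 -> utf8 conversion)."""
    # The client really speaks UTF-8, so e-acute leaves it as C3 A9.
    client_bytes = text.encode("utf-8")
    # The connection claims latin1, so each byte is taken as one character.
    misread = client_bytes.decode("latin-1")
    # Converting those characters to utf8 encodes each of them again.
    converted = misread.encode("utf-8")
    return client_bytes, converted


sent, stored = double_encode("\u00e9")  # U+00E9 is e-acute
print(sent.hex().upper())    # C3A9
print(stored.hex().upper())  # C383C2A9
```

Seeing C383C2A9 in HEX() output for a single é is the usual fingerprint of double encoding.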

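See "Mojibake" and "double encoding" in Trouble with UTF-8 characters; what I see is not what I stored. If you want to try to recover the data, see the appropriate case in http://mysql.rjweb.org/doc.php/charcoll#fixes_for_various_cases

Hmmm, apparently ALTER TABLE mytable DEFAULT CHARACTER SET utf8; did more than change the default: it copied the table and, in doing so, introduced the double encoding.

I have been working on MySQL character set problems for more than a decade. This is a new wrinkle that I had not observed before.

I'm pretty sure character_set_system is irrelevant to your problem. (But I could be wrong!)

Wrong SET NAMES

Test case: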

CREATE TABLE mytest ( MYDATA longtext ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
SET NAMES latin1;
INSERT INTO mytest VALUES ( "é" );
SELECT MYDATA, HEX(MYDATA) FROM mytest;

Running that test case:

mysql> SET NAMES latin1;

mysql> SHOW CREATE TABLE mytest\G
*************************** 1. row ***************************
       Table: mytest
Create Table: CREATE TABLE `mytest` (
  `MYDATA` longtext
) ENGINE=InnoDB DEFAULT CHARSET=latin1

mysql> INSERT INTO mytest VALUES ( "é" );

mysql> SELECT MYDATA, HEX(MYDATA), LENGTH(MYDATA),
              CHAR_LENGTH(MYDATA) FROM mytest;
+--------+-------------+----------------+---------------------+
| MYDATA | HEX(MYDATA) | LENGTH(MYDATA) | CHAR_LENGTH(MYDATA) |
+--------+-------------+----------------+---------------------+
| é      | C3A9        |              2 |                   2 |
+--------+-------------+----------------+---------------------+

The character looks fine. But the HEX looks like UTF-8, not latin1. And CHAR_LENGTH is "wrong".
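
To see why a CHAR_LENGTH of 2 is suspicious, it helps to compare the byte count and the character count for the three possible situations. This is a rough Python analogue, assuming len() on bytes corresponds to LENGTH and len() on the decoded string corresponds to CHAR_LENGTH; the helper name lengths is invented for the example.

```python
def lengths(stored: bytes, column_charset: str) -> tuple:
    """Return (byte length, character length) for bytes read under a given charset."""
    return len(stored), len(stored.decode(column_charset))


# Correct latin1 data in a latin1 column: one byte, one character.
print(lengths(b"\xe9", "latin-1"))      # (1, 1)
# The test case above: UTF-8 bytes in a latin1 column.
print(lengths(b"\xc3\xa9", "latin-1"))  # (2, 2)
# The same bytes in a column declared utf8: two bytes, one character.
print(lengths(b"\xc3\xa9", "utf-8"))    # (2, 1)
```

Only the middle case reports two characters for what the user typed as one.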

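The case is: CHARACTER SET latin1, but with utf8 bytes in it. The fix is to correct the character set while leaving the bytes alone.

To convert the column without changing the bytes: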

ALTER TABLE tbl MODIFY COLUMN MYDATA LONGBLOB;
ALTER TABLE tbl MODIFY COLUMN MYDATA LONGTEXT CHARACTER SET utf8mb4;

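(Be sure to keep all the attributes you originally had, such as NOT NULL.)

(This is the "2-step ALTER", as described in http://mysql.rjweb.org/doc.php/charcoll. Be sure the rest of the column definition stays the same: VARCHAR, NOT NULL, etc.)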
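
The idea behind the 2-step ALTER is that the stored bytes never change; only the label that says how to read them does. Below is a small Python illustration of that idea, not of MySQL internals; the function name two_step_alter is hypothetical.

```python
def two_step_alter(stored: bytes):
    """Relabel bytes as UTF-8 without converting them; return (bytes, text)."""
    # Step 1 analogue (LONGBLOB): drop the charset label, keep the raw bytes.
    raw = bytes(stored)
    # Step 2 analogue (CHARACTER SET utf8mb4): read the same bytes as UTF-8.
    text = raw.decode("utf-8")
    return raw, text


raw, text = two_step_alter(bytes.fromhex("C3A9"))
print(raw.hex().upper())   # C3A9, unchanged
print(text == "\u00e9")    # True: the bytes now read as a single e-acute
```

In this sketch, bytes that are not valid UTF-8 raise UnicodeDecodeError, which is a hint that the column held something other than UTF-8 and that this particular fix does not apply.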

Partitioned test case:

DROP TABLE IF EXISTS ptest;
CREATE TABLE ptest (
        nn INT NOT NULL,
        ee LONGTEXT
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    PARTITION BY RANGE (nn)
        (PARTITION p0 VALUES LESS THAN (1),
         PARTITION p1 VALUES LESS THAN MAXVALUE);
SET NAMES latin1;
INSERT INTO ptest (nn, ee)  VALUES ( 0, "é" ), ( 1, "ü" );
SELECT nn, ee, HEX(ee), LENGTH(ee), CHAR_LENGTH(ee) FROM ptest;
ALTER TABLE ptest
    DEFAULT CHARSET utf8;
SELECT nn, ee, HEX(ee), LENGTH(ee), CHAR_LENGTH(ee) FROM ptest;
SELECT @@version;
SHOW CREATE TABLE ptest\G

Partitioned results:

mysql>     DROP TABLE IF EXISTS ptest;
Query OK, 0 rows affected (0.02 sec)

mysql>     CREATE TABLE ptest (
    ->             nn INT NOT NULL,
    ->             ee LONGTEXT
    ->         ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    ->         PARTITION BY RANGE (nn)
    ->             (PARTITION p0 VALUES LESS THAN (1),
    ->              PARTITION p1 VALUES LESS THAN MAXVALUE);
Query OK, 0 rows affected (0.03 sec)

mysql>     SET NAMES latin1;
Query OK, 0 rows affected (0.00 sec)

mysql>     INSERT INTO ptest (nn, ee)  VALUES ( 0, "é" ), ( 1, "ü" );
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql>     SELECT nn, ee, HEX(ee), LENGTH(ee), CHAR_LENGTH(ee) FROM ptest;
+----+------+---------+------------+-----------------+
| nn | ee   | HEX(ee) | LENGTH(ee) | CHAR_LENGTH(ee) |
+----+------+---------+------------+-----------------+
|  0 | é    | C3A9    |          2 |               2 |
|  1 | ü    | C3BC    |          2 |               2 |
+----+------+---------+------------+-----------------+
2 rows in set (0.00 sec)

mysql>     ALTER TABLE ptest
    ->         DEFAULT CHARSET utf8;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql>     SELECT nn, ee, HEX(ee), LENGTH(ee), CHAR_LENGTH(ee) FROM ptest;
+----+------+---------+------------+-----------------+
| nn | ee   | HEX(ee) | LENGTH(ee) | CHAR_LENGTH(ee) |
+----+------+---------+------------+-----------------+
|  0 | é    | C3A9    |          2 |               2 |
|  1 | ü    | C3BC    |          2 |               2 |
+----+------+---------+------------+-----------------+
2 rows in set (0.00 sec)

mysql>     SELECT @@version;
+-----------------+
| @@version       |
+-----------------+
| 5.6.22-71.0-log |
+-----------------+
1 row in set (0.00 sec)

mysql>     SHOW CREATE TABLE ptest\G
*************************** 1. row ***************************
       Table: ptest
Create Table: CREATE TABLE `ptest` (
  `nn` int(11) NOT NULL,
  `ee` longtext CHARACTER SET latin1
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (nn)
(PARTITION p0 VALUES LESS THAN (1) ENGINE = InnoDB,
 PARTITION p1 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)

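Hmmm... I don't see a problem with the ALTER. What version are you using? Do you see the problem with this test case?

[Discussion]:

ALTER TABLE mytable DEFAULT CHARACTER SET utf8; does not change the hex values. It is ALTER TABLE ... MODIFY COL2 ... that changes them. I tried again, starting with set NAMES utf8. In that case I get: E96CE96D656E7473 (latin1) before any ALTER command; still E96CE96D656E7473 (latin1) after ALTER TABLE mytable DEFAULT CHARACTER SET utf8;; and C3A96CC3A96D656E7473 (utf8) after ALTER TABLE mytable MODIFY COL2 LONGTEXT CHARACTER SET utf8;

I have edited the main post to summarize the current situation and list the remaining questions.

@D3nsk - It is important to check whether the encoding (E9 vs. C3A9) is consistent with the column's character set (latin1 vs. utf8/utf8mb4). I sensed that they were out of sync, but your comment implies they are not.

@D3nsk - Could you build a 1-row table, with all the create table, set, select hex, alter, etc., that demonstrates the problem? That would make it easier for me to test theories. And, assuming a real bug is lurking somewhere, it would provide evidence for filing a bug report.

The encoding seems to be consistent with the columns. They get out of sync when I do the following: set NAMES latin1; then INSERT INTO ... "éléments"... then ALTER TABLE mytable MODIFY COL2 LONGTEXT CHARACTER SET utf8;. But in my initial use case, the connection encoding was declared as utf8, not latin1 (sorry, I confused things in my previous post).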
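
The two hex strings quoted in this discussion can be checked independently of the database. The Python sketch below decodes each under the character set it is claimed to be in and confirms that both spell the same word, and that a plain latin1-to-utf8 conversion of the first yields the second. Python's "latin-1" codec is assumed to match MySQL's latin1 for these bytes; the variable names are for illustration only.

```python
latin1_hex = "E96CE96D656E7473"      # before MODIFY, column declared latin1
utf8_hex = "C3A96CC3A96D656E7473"    # after MODIFY ... CHARACTER SET utf8

# Read each value under the charset its column was declared with.
before = bytes.fromhex(latin1_hex).decode("latin-1")
after = bytes.fromhex(utf8_hex).decode("utf-8")

# A correct conversion re-encodes the same characters in the new charset.
converted_hex = before.encode("utf-8").hex().upper()

print(before == after)            # True: same text under both declarations
print(converted_hex == utf8_hex)  # True: MODIFY performed a clean conversion
```

Both checks passing is consistent with the last comment: in that scenario the bytes matched the declared column character set before and after the MODIFY, so there was no double encoding.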
