How do I convert an entire MySQL database's character set and collation to UTF-8?
Current answer
Inspired by @sdfor's comment, here is a bash script that does the job:
#!/bin/bash
printf "### Converting MySQL character set ###\n\n"
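# NOTE: this value is used as the COLLATE name below; the character set itself is hardcoded to utf8
printf "Enter the collation you want to set (e.g. utf8_general_ci): "
read -r CHARSET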
# Get the MySQL username
printf "Enter mysql username: "
read -r USERNAME
# Get the MySQL password
printf "Enter mysql password for user %s:" "$USERNAME"
read -rs PASSWORD
DBLIST=( mydatabase1 mydatabase2 )
printf "\n"
for DB in "${DBLIST[@]}"
do
(
echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE `'"$CHARSET"'`;'
mysql "$DB" -u"$USERNAME" -p"$PASSWORD" -e "SHOW TABLES" --batch --skip-column-names \
| xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE `'"$CHARSET"'`;'
) \
| mysql "$DB" -u"$USERNAME" -p"$PASSWORD"
echo "$DB database done..."
done
echo "### DONE ###"
exit
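The script above passes the password with -p on the command line, which MySQL itself warns is insecure because other users on the machine may be able to see it. A minimal sketch of one alternative, assuming a temporary client option file handed to mysql through --defaults-extra-file (which must be the first option on the command line). The function name write_mysql_optfile is hypothetical, and the sketch assumes the password contains no double quotes or backslashes:

```shell
#!/bin/bash
# Hypothetical helper: write a private client option file and print its path.
# Assumption: the password contains no double quotes or backslashes.
write_mysql_optfile() {
    local user="$1" password="$2" optfile
    optfile="$(mktemp)" || return 1
    # mktemp normally creates the file with mode 0600; chmod makes that explicit.
    chmod 600 "$optfile"
    printf '[client]\nuser=%s\npassword="%s"\n' "$user" "$password" > "$optfile"
    printf '%s\n' "$optfile"
}

# Usage sketch (not executed here):
#   OPTFILE="$(write_mysql_optfile "$USERNAME" "$PASSWORD")"
#   trap 'rm -f "$OPTFILE"' EXIT
#   mysql --defaults-extra-file="$OPTFILE" "$DB" -e "SHOW TABLES"
```

With this in place, the -u and -p arguments in the loop could be replaced by the single --defaults-extra-file option.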
Other answers
Use the ALTER DATABASE and ALTER TABLE commands.
ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Or, if you are still on MySQL 5.5.2 or older, which does not support 4-byte UTF-8, use utf8 instead of utf8mb4:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
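The safest way is to first modify the columns to a binary type and then modify them back to their original type with the desired character set.
Each column type has its corresponding binary type, as follows:
CHAR => BINARY
TEXT => BLOB
TINYTEXT => TINYBLOB
MEDIUMTEXT => MEDIUMBLOB
LONGTEXT => LONGBLOB
VARCHAR => VARBINARY
E.g. (VARBINARY needs an explicit length in bytes; 140 matches the VARCHAR(140) below when the source character set is single-byte):
ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] MODIFY [COLUMN_NAME] VARBINARY(140);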
ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] MODIFY [COLUMN_NAME] VARCHAR(140) CHARACTER SET utf8mb4;
I tried this on several latin1 tables and it preserved all the diacritics.
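The type mapping above can be sketched as a small shell function. This is only an illustration: the name binary_type_for is hypothetical, and it reuses the declared length unchanged, which assumes a single-byte source character set such as latin1 (binary lengths are in bytes, not characters).

```shell
#!/bin/bash
# Hypothetical helper: map a MySQL text column type (as shown in COLUMN_TYPE)
# to its binary counterpart. Assumes a single-byte source character set, so
# the declared length can be reused as a byte length.
binary_type_for() {
    local coltype
    coltype="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
    case "$coltype" in
        varchar\(*\))  printf 'varbinary%s\n' "${coltype#varchar}" ;;
        char\(*\))     printf 'binary%s\n' "${coltype#char}" ;;
        tinytext)      echo 'tinyblob' ;;
        text)          echo 'blob' ;;
        mediumtext)    echo 'mediumblob' ;;
        longtext)      echo 'longblob' ;;
        *)             echo "unsupported type: $1" >&2; return 1 ;;
    esac
}

# Usage sketch:
#   binary_type_for 'VARCHAR(140)'
```

Returning a non-zero status for unknown types lets a calling script skip columns, such as integers or dates, that need no conversion.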
You can generate these statements for all columns with this query:
SELECT
CONCAT('ALTER TABLE ', TABLE_SCHEMA,'.', TABLE_NAME,' MODIFY ', COLUMN_NAME,' ', REPLACE(COLUMN_TYPE, 'varchar', 'varbinary'), ';'),
CONCAT('ALTER TABLE ', TABLE_SCHEMA,'.', TABLE_NAME,' MODIFY ', COLUMN_NAME,' ', COLUMN_TYPE,' CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;')
FROM information_schema.columns
WHERE TABLE_SCHEMA IN ('[TABLE_SCHEMA]')
AND COLUMN_TYPE LIKE 'varchar%'
AND (COLLATION_NAME IS NOT NULL AND COLLATION_NAME NOT LIKE 'utf%');
After doing this on all columns, do the same on all tables:
ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
To generate this statement for all your tables, use the following query:
SELECT
CONCAT('ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME, ' CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;')
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_COLLATION NOT LIKE 'utf8%'
and TABLE_SCHEMA in ('[TABLE_SCHEMA]');
Now that you have modified all the columns and tables, do the same on the database:
ALTER DATABASE [DATA_BASE_NAME] CHARSET = utf8mb4 COLLATE = utf8mb4_general_ci;
On the command-line shell
If you are a command-line person, you can do this very quickly. Just fill in "dbname" :D
DB="dbname"
(
echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'
mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names \
| xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;'
) \
| mysql "$DB"
One-liner for simple copy/paste
DB="dbname"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql "$DB"
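The pipeline above sends the generated statements straight into mysql, and xargs treats quotes and backslashes in table names specially. A minimal sketch of a dry-run variant, assuming you would rather write the statements to a file and review them first. The function name gen_convert_sql and the file name convert.sql are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: read table names on stdin (one per line) and print the
# ALTER statements instead of executing them.
gen_convert_sql() {
    local db="$1" charset="$2" collation="$3" table
    local bt='`'
    printf 'ALTER DATABASE `%s` CHARACTER SET %s COLLATE %s;\n' \
        "$db" "$charset" "$collation"
    while IFS= read -r table; do
        [ -n "$table" ] || continue              # skip blank lines
        table="${table//$bt/$bt$bt}"             # double any backtick inside the name
        printf 'ALTER TABLE `%s` CONVERT TO CHARACTER SET %s COLLATE %s;\n' \
            "$table" "$charset" "$collation"
    done
}

# Usage sketch (not executed here):
#   mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names \
#     | gen_convert_sql "$DB" utf8 utf8_general_ci > convert.sql
#   # review convert.sql, then:
#   mysql "$DB" < convert.sql
```

Reading names line by line with `read -r` keeps spaces and quotes in table names intact, and the review step gives you a chance to remove views or tables you do not want converted.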
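If the data is not in the same character set, you might consider this snippet from http://dev.mysql.com/doc/refman/5.0/en/charset-conversion.html
If the column has a nonbinary data type (CHAR, VARCHAR, TEXT), its contents should be encoded in the column character set, not some other character set. If the contents are encoded in a different character set, you can convert the column to use a binary data type first, and then to a nonbinary column with the desired character set.
Here is an example: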
ALTER TABLE t1 CHANGE c1 c1 BLOB;
ALTER TABLE t1 CHANGE c1 c1 VARCHAR(100) CHARACTER SET utf8;
Make sure to choose the right collation, or you might get unique key conflicts. E.g. Éleanore and Eleanore might be considered the same in some collations.
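One way to look for such conflicts before converting is to group a column's values under the target collation and list any that collapse together. The sketch below only prints the query. The function name gen_collision_check is hypothetical, and it assumes the column's data is correctly encoded in its current character set, so that CONVERT ... USING yields sensible text:

```shell
#!/bin/bash
# Hypothetical helper: print a query that lists values which compare as equal
# under the target collation (candidates for unique key conflicts).
gen_collision_check() {
    local table="$1" column="$2" charset="$3" collation="$4"
    printf 'SELECT CONVERT(`%s` USING %s) COLLATE %s AS v, COUNT(*) AS n FROM `%s` GROUP BY v HAVING n > 1;\n' \
        "$column" "$charset" "$collation" "$table"
}

# Usage sketch (not executed here):
#   gen_collision_check users name utf8mb4 utf8mb4_unicode_ci | mysql "$DB"
```

Any row this query returns is a group of values the new collation treats as identical. Those need to be resolved before converting a column that has a unique index.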
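Aside:
I ran into a situation where certain characters came out "broken" in emails even though they were stored as UTF-8 in the database. If you send emails using utf8 data, you may also need to send the email itself as UTF-8.
In PHPMailer, just update this line: public $CharSet = 'utf-8';
Before proceeding, make sure you have made a full database backup!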
Step 1: Database-level changes
Identifying the collation and character set of your database
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA S
WHERE schema_name = 'your_database_name'
AND (DEFAULT_CHARACTER_SET_NAME != 'utf8' OR DEFAULT_COLLATION_NAME NOT LIKE 'utf8%');
Fixing the collation for the database
ALTER DATABASE your_database_name CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Step 2: Table-level changes
Identifying database tables with the incorrect character set or collation
SELECT CONCAT(
'ALTER TABLE ', table_name, ' CHARACTER SET utf8 COLLATE utf8_general_ci; ',
'ALTER TABLE ', table_name, ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; ')
FROM information_schema.TABLES AS T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C
WHERE C.collation_name = T.table_collation
AND T.table_schema = 'your_database_name'
AND (C.CHARACTER_SET_NAME != 'utf8' OR C.COLLATION_NAME NOT LIKE 'utf8%');
Adjusting table columns' collation and character set
Capture the output of the SQL above and run it, for example:
ALTER TABLE rma CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_history CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_products CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_products CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_report_period CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_report_period CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_reservation CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_reservation CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_supplier_return CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_supplier_return_history CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_supplier_return_product CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return_product CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
Reference: https://confluence.atlassian.com/display/CONFKB/How+to+Fix+the+Collation+and+Character+Set+of+a+MySQL+Database