MySQL charset and collation

Author: loinliao | Published: 2017-12-26 16:28


From the MySQL documentation:

A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set. Let's make the distinction clear with an example of an imaginary character set.

Suppose that we have an alphabet with four letters: 'A', 'B', 'a', 'b'. We give each letter a number: 'A' = 0, 'B' = 1, 'a' = 2, 'b' = 3. The letter 'A' is a symbol, the number 0 is the encoding for 'A', and the combination of all four letters and their encodings is a character set.

Now, suppose that we want to compare two string values, 'A' and 'B'. The simplest way to do this is to look at the encodings: 0 for 'A' and 1 for 'B'. Because 0 is less than 1, we say 'A' is less than 'B'. Now, what we've just done is apply a collation to our character set. The collation is a set of rules (only one rule in this case): "compare the encodings." We call this simplest of all possible collations a binary collation.

But what if we want to say that the lowercase and uppercase letters are equivalent? Then we would have at least two rules: (1) treat the lowercase letters 'a' and 'b' as equivalent to 'A' and 'B'; (2) then compare the encodings. We call this a case-insensitive collation. It's a little more complex than a binary collation.
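Stepping outside the quoted text for a moment: the four-letter example can be sketched in Python. This is only an illustration of the idea, not how MySQL implements collations; the names `CHARSET`, `encode`, `binary_collation` and `case_insensitive_collation` are made up for this sketch.

```python
# Toy character set from the example: each symbol paired with its encoding.
CHARSET = {"A": 0, "B": 1, "a": 2, "b": 3}


def encode(s):
    """Translate a string into the list of its encodings."""
    return [CHARSET[ch] for ch in s]


def binary_collation(x, y):
    """Single rule: compare the encodings. Returns -1, 0 or 1."""
    ex, ey = encode(x), encode(y)
    return (ex > ey) - (ex < ey)


def case_insensitive_collation(x, y):
    """Rule 1: treat 'a' and 'b' as 'A' and 'B'. Rule 2: compare the encodings."""
    return binary_collation(x.upper(), y.upper())


if __name__ == "__main__":
    print(binary_collation("a", "B"))            # 'a' is 2, 'B' is 1
    print(case_insensitive_collation("a", "B"))  # compared as 'A' vs 'B'
    print(case_insensitive_collation("a", "A"))  # same letter, case ignored
```

With the binary collation 'a' sorts after 'B' because 2 is greater than 1, while the case-insensitive collation places 'a' before 'B' and treats it as equal to 'A'. The character set is the same in both cases; only the comparison rules differ.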

In real life, most character sets have many characters: not just 'A' and 'B' but whole alphabets, sometimes multiple alphabets or eastern writing systems with thousands of characters, along with many special symbols and punctuation marks. Also in real life, most collations have many rules: not just case insensitivity but also accent insensitivity (an "accent" is a mark attached to a character as in German 'ö') and multiple-character mappings (such as the rule that 'ö' = 'OE' in one of the two German collations).
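As a rough illustration of those two kinds of rules (again outside the quoted text), the Python sketch below builds comparison keys by hand. It is a simplification under stated assumptions: the `EXPANSIONS` table and the function names are hypothetical, and real MySQL collations rely on their own weight tables rather than on anything like this.

```python
import unicodedata


def strip_accents(s):
    """Accent insensitivity: drop combining marks, so 'ö' is treated like 'o'."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical table for multiple-character mappings such as 'ö' = 'OE'.
EXPANSIONS = {"ä": "ae", "ö": "oe", "ü": "ue"}


def expand(s):
    """Replace each mapped character by its multi-character equivalent."""
    composed = unicodedata.normalize("NFC", s)
    return "".join(EXPANSIONS.get(ch, ch) for ch in composed)


def accent_insensitive_key(s):
    """Case-insensitive and accent-insensitive comparison key."""
    return strip_accents(s.casefold())


def expansion_key(s):
    """Case-insensitive key that applies the expansions before stripping accents."""
    return strip_accents(expand(s.casefold()))


if __name__ == "__main__":
    print(accent_insensitive_key("Köln") == accent_insensitive_key("KOLN"))
    print(expansion_key("Köln") == expansion_key("KOELN"))
```

Two strings compare as equal under a given rule set when their keys are equal, so "Köln" matches "KOLN" under the first key and "KOELN" under the second. This mirrors how two collations over the same character set can disagree about the same pair of strings.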

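Summary

A character set is a set of characters together with their encodings.
A collation is a set of rules for comparing two strings; each collation behaves like a different comparison function: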

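#include <string>
#include <strings.h>  // POSIX strcasecmp; the bodies below are illustrative only
int collation_1(std::string a, std::string b) {  // binary: compare the encodings
    return a.compare(b);
}
int collation_2(std::string a, std::string b) {  // case-insensitive, ASCII only
    return strcasecmp(a.c_str(), b.c_str());
}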
