1. Character Sets and Character Encodings

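ISO-8859-1 (Latin-1): a single-byte encoding that extends ASCII with the characters of Western European languages. Greek, Arabic, Hebrew and Thai are not part of it; they belong to other parts of the ISO-8859 family (ISO-8859-7, -6, -8 and -11).
UTF-8: a variable-length character encoding for the Unicode code table
GB2312: Simplified Chinese
GBK: an extension of GB2312 for Simplified Chinese
BIG5: Traditional Chinese, used in Taiwan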

1.1 Bytes Used per Character in Each Encoding

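Encoding	English (bytes)	Chinese (bytes)
ISO-8859-1	1	1 (not representable; Java substitutes a single '?' byte)
UTF-8	1	3
UTF-16	2	2
GB2312	1	2
GBK	1	2
BIG5	1	2

Note: "Chinese" here means common characters in the Basic Multilingual Plane. In Java, getBytes("UTF-16") also prepends a 2-byte byte order mark, so a single character measures 4 bytes when encoded that way.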
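
The byte counts above can be checked directly in Java. The following is a minimal sketch, not a definitive benchmark: the class name EncodedLengthDemo and the helper encodedLength are made up for illustration, and it assumes the running JDK ships the GB2312, GBK and Big5 charsets (standard full JDK builds do). It measures both "UTF-16BE" (no byte order mark) and "UTF-16" (with byte order mark) to show where a figure of 4 bytes can come from.

```java
import java.nio.charset.Charset;

class EncodedLengthDemo {

    // Hypothetical helper: number of bytes produced when text is encoded
    // with the named charset.
    static int encodedLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String english = "a";
        String chinese = "\u4e2d"; // the character 中 (U+4E2D)

        // "UTF-16BE" writes no byte order mark; "UTF-16" prepends a 2-byte one.
        String[] charsets = {
            "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16", "GB2312", "GBK", "Big5"
        };

        System.out.println("Encoding\tEnglish\tChinese");
        for (String name : charsets) {
            System.out.println(name + "\t"
                    + encodedLength(english, name) + "\t"
                    + encodedLength(chinese, name));
        }
    }
}
```

ISO-8859-1 reports 1 byte for the Chinese character only because the encoder replaces the unmappable character with '?'; the original character is lost.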

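1.2 The ASCII Character Set

ASCII stands for American Standard Code for Information Interchange.
The ASCII character set contains 128 characters (codes 0 to 127), divided into four groups of 32 characters each.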

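Group 1 (codes 0 to 31): control characters
Group 2 (codes 32 to 63): space, punctuation, special characters and digits
Group 3 (codes 64 to 95): uppercase letters and special symbols
Group 4 (codes 96 to 127): lowercase letters, special symbols, and the DEL control character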
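
Because each group holds exactly 32 consecutive codes, the group of a character follows from integer division of its code by 32. A small sketch of that arithmetic, where the class AsciiGroupDemo and the method asciiGroup are illustrative names rather than anything from a library:

```java
class AsciiGroupDemo {

    // Hypothetical helper: returns the group number (1 to 4) of a 7-bit
    // ASCII character, using the fact that each group spans 32 codes.
    static int asciiGroup(char c) {
        if (c > 127) {
            throw new IllegalArgumentException("not an ASCII character: " + (int) c);
        }
        return c / 32 + 1;
    }

    public static void main(String[] args) {
        // One sample per group: line feed, a digit, an uppercase and a lowercase letter.
        char[] samples = {'\n', '7', 'A', 'a'};
        for (char c : samples) {
            System.out.println("code " + (int) c + " -> group " + asciiGroup(c));
        }
    }
}
```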

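2. The Java Character Type (char)

A char always occupies 2 bytes (16 bits), whether it holds an English letter or a Chinese character.
It stores one UTF-16 code unit with a value in the range 0 to 65535, so a Unicode character outside the Basic Multilingual Plane needs two char values (a surrogate pair).

A char literal is enclosed in single quotes.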

@Test
public void charTest() {
    // A Chinese character in the Basic Multilingual Plane fits in a single char.
    char word = '中';
    System.out.println(word);
}
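
The 16-bit size of char has a visible consequence for strings: String.length() counts char values, not characters. The sketch below illustrates this with one character inside the 0 to 65535 range and one outside it. The class CharCodeUnitDemo and its two helper methods are illustrative names; U+1F600 is used only as a convenient example of a code point above 65535.

```java
class CharCodeUnitDemo {

    // Number of char values (UTF-16 code units) used to store the text.
    static int codeUnits(String text) {
        return text.length();
    }

    // Number of Unicode code points in the text.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String bmp = "\u4e2d"; // 中 (U+4E2D), fits in one char
        String supplementary = new String(Character.toChars(0x1F600)); // above 65535

        System.out.println("bytes per char: " + Character.BYTES);
        System.out.println("largest char value: " + (int) Character.MAX_VALUE);
        System.out.println("U+4E2D: " + codeUnits(bmp) + " char(s), "
                + codePoints(bmp) + " code point(s)");
        System.out.println("U+1F600: " + codeUnits(supplementary) + " char(s), "
                + codePoints(supplementary) + " code point(s)");
    }
}
```

When text may contain such characters, iterating by code point (for example with String.codePoints()) avoids splitting a surrogate pair.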

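3. Handling Garbled Text

Garbled text appears when bytes are decoded with a different charset from the one used to encode them.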

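@Test
public void encoding() throws UnsupportedEncodingException {
    String str = "中国";

    // Encode explicitly as UTF-8; getBytes() with no argument uses the
    // platform default charset, which makes the result machine-dependent.
    byte[] bytes = str.getBytes("UTF-8");

    // gbk: UTF-8 bytes decoded as GBK (mismatch, garbled).
    // gbk2: GBK bytes decoded as GBK (match, correct).
    String gbk = new String(bytes, "GBK");
    String gbk2 = new String(str.getBytes("GBK"), "GBK");
    System.out.println(gbk + ", " + gbk2);

    // Both strings are correct: UTF-8 bytes decoded as UTF-8.
    String utf = new String(bytes, "UTF-8");
    String utf2 = new String(str.getBytes("UTF-8"), "UTF-8");
    System.out.println(utf + ", " + utf2);
}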
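
In one common case the damage can be undone: when UTF-8 bytes were decoded as ISO-8859-1. Java's ISO-8859-1 decoder maps every byte value to exactly one character, so re-encoding the garbled string with ISO-8859-1 returns the original bytes, which can then be decoded correctly. The following is a minimal sketch of that round trip; the class MojibakeRepairDemo and the methods garble and repair are illustrative names. It should not be read as a general repair: with a wrong charset such as GBK, byte sequences that are invalid in that charset are replaced during decoding and cannot be recovered.

```java
import java.nio.charset.StandardCharsets;

class MojibakeRepairDemo {

    // Simulates the mistake: UTF-8 bytes decoded as ISO-8859-1.
    static String garble(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: re-encode with ISO-8859-1 to get the original
    // bytes back, then decode them with the charset that was really used.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u56fd"; // 中国
        String garbled = garble(original);
        String repaired = repair(garbled);

        System.out.println("garbled length: " + garbled.length());
        System.out.println("repaired equals original: " + repaired.equals(original));
    }
}
```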