Starting from an emoji error in MySQL...

Author: ZenCabin | Published 2018-07-22 20:13 | Read 12 times


# Error when storing emoji characters in MySQL #

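Scenario:
A user enters text in the app. The client sends the input to the server over the network, encoded as UTF-8. The server receives the data, runs its business logic, and then tries to save the result to the database. The save fails.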

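The server throws an exception when it stores the string:
java.sql.SQLException: Incorrect string value: '\xF0\x9F\x92\x94' for column 'name' at row 1
The message says the string value is invalid and names the offending bytes, '\xF0\x9F\x92\x94'. These are a single character encoded as 4 bytes in UTF-8; its Unicode code point is U+1F494, which the Unicode 11 EmojiSources data lists as an emoji. So why does MySQL fail when saving a 4-byte character?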
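
The link between the code point and the bytes in the error message can be checked directly. The sketch below is only an illustration; the class name `EmojiBytes` and the helper `utf8Hex` are made up for this example and are not part of the original project.

```java
import java.nio.charset.StandardCharsets;

class EmojiBytes {

    /** Returns the UTF-8 bytes of a single code point as hex, e.g. "F0 9F 92 94". */
    static String utf8Hex(int codePoint) {
        byte[] bytes = new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex(0x1F494)); // prints "F0 9F 92 94" (4 bytes)
        System.out.println(utf8Hex(0x4E2D));  // prints "E4 B8 AD" (3 bytes)
    }
}
```

Code points above U+FFFF always need 4 bytes in UTF-8, while everything in the Basic Multilingual Plane, including common Chinese characters, fits in 3 bytes or fewer.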

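Cause:
MySQL's utf8 character set is not a complete UTF-8 implementation: it stores at most 3 bytes per character, so it cannot hold 4-byte characters such as emoji. Versions before 5.5.3 offer no alternative. MySQL 5.5.3 introduced the utf8mb4 character set, which supports the full 4-byte range and resolves the incompatibility.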

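That leaves two ways to solve the problem:
1. On the server, upgrade the database to 5.5.3 or later and change the database character set to utf8mb4.
2. On the client, manually filter out characters whose UTF-8 encoding takes 4 bytes or more before sending the string.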

Changing the MySQL database character set:

[Image: 数据库属性.png (database properties)]
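
The same change can be expressed in SQL and run over JDBC. This is a minimal sketch under assumptions: the database name `app_db`, the table name `user`, the collation `utf8mb4_unicode_ci`, and the class `Utf8mb4Migration` are all placeholders, not taken from the original project. Existing tables keep their old character set until they are converted, which is why the table statement is included.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

class Utf8mb4Migration {

    /**
     * Builds the statements that switch a database and one table to utf8mb4.
     * The names are inserted as-is, so they must come from trusted configuration,
     * never from user input.
     */
    static List<String> statements(String database, String table) {
        return Arrays.asList(
                "ALTER DATABASE `" + database + "` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci",
                "ALTER TABLE `" + table + "` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci");
    }

    /** Runs the statements on an already opened connection to a MySQL 5.5.3+ server. */
    static void apply(Connection conn, String database, String table) throws SQLException {
        try (Statement st = conn.createStatement()) {
            for (String sql : statements(database, table)) {
                st.execute(sql);
            }
        }
    }

    public static void main(String[] args) {
        // Print the statements for the placeholder names instead of connecting anywhere.
        for (String sql : statements("app_db", "user")) {
            System.out.println(sql + ";");
        }
    }
}
```

The connection itself must also use utf8mb4, otherwise 4-byte characters can still be rejected on the way in; how to configure that depends on the server settings and the driver version.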

Filtering client input characters
Idea: extend EditText, then set a custom filter on the widget to intercept user input and filter its characters.

# Filtering client input characters #

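Requirement: remove from the EditText input every character whose UTF-8 encoding takes 4 bytes or more.

Approach: MyEditText extends EditText and adds a filter rule.

EditText extends TextView, and the TextView constructor contains this code: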

if (maxlength >= 0) {
    setFilters(new InputFilter[] { new InputFilter.LengthFilter(maxlength) });
} else {
    setFilters(NO_FILTERS);
}

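This shows that the widget's mFilters array may already contain other filters. So before calling setFilters, fetch the existing filters and put them into the new array together with the filter being added; then call setFilters so that all of them take effect.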

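import android.content.Context;
import android.text.InputFilter;
import android.util.AttributeSet;
import android.widget.EditText;

public class MyEditText extends EditText {

    public MyEditText(Context context) {
        this(context, null);
    }

    public MyEditText(Context context, AttributeSet attrs) {
        // Pass the default EditText style; a defStyleAttr of 0 would drop the standard appearance
        this(context, attrs, android.R.attr.editTextStyle);
    }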

    public MyEditText(Context context, AttributeSet attrs, int defStyleAttr) {
        super(context, attrs, defStyleAttr);
        // Make the view focusable in touch mode
        setFocusableInTouchMode(true);

        // Filter out 4-byte UTF-8 characters
        addInputFilter(new ExceptEmojiInputFilter());
    }

    public void addInputFilter(InputFilter... inputFilter){
        if(inputFilter == null || inputFilter.length < 1) {
            return;
        }
        InputFilter[] filters = getFilters();
        if (filters != null && filters.length > 0) {
            InputFilter[] newFilters = new InputFilter[filters.length + inputFilter.length];
            System.arraycopy(filters, 0, newFilters, 0, filters.length);
            System.arraycopy(inputFilter, 0, newFilters, filters.length, inputFilter.length);
            setFilters(newFilters);
        } else {
            setFilters(inputFilter);
        }
    }
}

Implementing the filter

import android.text.InputFilter;
import android.text.Spanned;
import java.io.UnsupportedEncodingException;

public class ExceptEmojiInputFilter implements InputFilter {

    @Override
    public CharSequence filter(CharSequence source, int start, int end, Spanned dest, int dstart, int dend) {
        try {
            // Only the range [start, end) of source is about to be inserted
            String original = source.subSequence(start, end).toString();
            String filtered = StringUtil.parseUtf8Over4Byte(original);
            if (filtered.equals(original)) {
                return null;    // nothing was removed: accept the input unchanged, spans included
            }
            // Characters were removed, so the span offsets of source no longer match;
            // return plain text rather than copying spans onto a shorter string
            return filtered;
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }

        return null;    // null tells the framework to keep the original input
    }
}

StringUtil.parseUtf8Over4Byte(String s); // performs the actual filtering

// Member of StringUtil; requires java.io.UnsupportedEncodingException and java.nio.ByteBuffer
public static String parseUtf8Over4Byte(String str) throws UnsupportedEncodingException {
    byte[] bArr = str.getBytes("utf-8");
    ByteBuffer bb = ByteBuffer.allocate(bArr.length);
    for (int i = 0, len = bArr.length; i < len; i++) {
        // The loop's own i++ steps over the lead byte, so each branch
        // adds only the number of continuation bytes that follow it
        if ((bArr[i] & 0xF8) == 0xF0) {         // F8=1111 1000, F0=1111 0000: lead byte 11110xxx, 4-byte sequence
            i += 3;
        } else if ((bArr[i] & 0xFC) == 0xF8) {  // FC=1111 1100, F8=1111 1000: lead byte 111110xx, 5-byte sequence
            i += 4;
        } else if ((bArr[i] & 0xFE) == 0xFC) {  // FE=1111 1110, FC=1111 1100: lead byte 1111110x, 6-byte sequence
            i += 5;
        } else {
            bb.put(bArr[i]);    // keep the bytes of 1- to 3-byte characters
        }
    }
    // bb.position() is the number of bytes kept; zero bytes yields an empty string
    return new String(bb.array(), 0, bb.position(), "utf-8");
}
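
The same filtering can also be done on code points instead of raw bytes, which avoids the byte-offset arithmetic. The following is an alternative sketch, not the original method; the class `SupplementaryFilter` and the method `stripSupplementary` are hypothetical names chosen for this example. It relies on the fact that a character needs 4 bytes in UTF-8 exactly when its code point is above U+FFFF.

```java
class SupplementaryFilter {

    /** Removes every code point above U+FFFF, i.e. every character that takes 4 bytes in UTF-8. */
    static String stripSupplementary(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isSupplementaryCodePoint(cp)) {
                sb.appendCodePoint(cp);     // keep characters of the Basic Multilingual Plane
            }
            i += Character.charCount(cp);   // advance 1 or 2 chars (2 for a surrogate pair)
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \uD83D\uDC94 is the surrogate pair for U+1F494
        System.out.println(stripSupplementary("a\uD83D\uDC94b")); // prints "ab"
    }
}
```

Because it never touches the encoded bytes, this variant needs no checked exception and leaves the character that follows an emoji untouched.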