Python: Checking a File's Encoding with chardet

Author: _简姑娘_ | Published: 2020-10-19 23:31


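Overview
Python has two data types for text and binary data: str, which holds Unicode text, and bytes, which holds raw bytes. You can convert between them with encode() and decode(), provided you know which encoding is in use. If you do not know the encoding, you can use chardet to detect it.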
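To make the str/bytes round trip concrete, here is a minimal sketch using only the standard library. The sample text and variable names are chosen for illustration; the point is that the same text yields different bytes under different encodings, and decoding only works reliably with the matching one.

```python
# Round trip between str (Unicode text) and bytes using a known encoding.
text = "离离原上草"

utf8_bytes = text.encode("utf-8")   # str -> bytes
gbk_bytes = text.encode("gbk")      # same text, different byte sequence

# Decoding with the matching encoding recovers the original text.
assert utf8_bytes.decode("utf-8") == text
assert gbk_bytes.decode("gbk") == text

# Decoding with the wrong encoding either raises or yields garbage;
# for this sample it raises UnicodeDecodeError.
try:
    gbk_bytes.decode("utf-8")
    wrong_decode_failed = False
except UnicodeDecodeError:
    wrong_decode_failed = True

print(len(utf8_bytes), len(gbk_bytes), wrong_decode_failed)
```

This is the situation chardet is meant for: when only the bytes are available, the encoding has to be guessed before decode() can be called safely.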
Installation

pip install chardet

Usage

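import chardet

file_path = "E:\\test.txt"
# Read the file as raw bytes; chardet works on bytes, not str.
with open(file_path, "rb") as obj:
    data = obj.read()

# detect() returns a dict describing its best guess.
file_encoding = chardet.detect(data)
print(file_encoding)  # e.g. {'encoding': 'utf-8', 'confidence': 0.99, 'language': ''}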

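Here, encoding is the detected encoding, confidence is how confident chardet is in that guess (between 0 and 1), and language is the detected language.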
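One way to act on these fields is sketched below. The helper decode_with_detection, the min_confidence threshold, and the fallback encoding are all hypothetical names and values introduced for illustration, not part of chardet. The helper takes a result dict of the shape printed above and decodes the bytes, falling back to a default when the guess is missing or weak.

```python
def decode_with_detection(data, detection, min_confidence=0.5, fallback="utf-8"):
    """Decode bytes using a chardet-style result dict.

    `detection` is expected to look like
    {'encoding': ..., 'confidence': ..., 'language': ...}.
    `min_confidence` and `fallback` are illustrative choices, not chardet settings.
    """
    encoding = detection.get("encoding")
    confidence = detection.get("confidence") or 0.0
    # The encoding can be None when no guess could be made.
    if encoding is None or confidence < min_confidence:
        encoding = fallback
    # errors="replace" keeps the call from raising on undecodable bytes.
    return data.decode(encoding, errors="replace")


# Hand-written result in the same shape as the chardet.detect() output.
sample = {"encoding": "utf-8", "confidence": 0.99, "language": ""}
print(decode_with_detection("编码".encode("utf-8"), sample))
```

In real use you would pass chardet.detect(data) as the detection argument. The 0.5 threshold is arbitrary; a suitable value depends on how costly a wrong guess is for your data.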
Another example:

>>> data = '离离原上草，一岁一枯荣'.encode('gbk')
>>> chardet.detect(data)
{'encoding': 'GB2312', 'confidence': 0.7407407407407407, 'language': 'Chinese'}

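The detected encoding is GB2312. GBK is a superset of GB2312, so the two are closely related rather than identical: text that uses only GB2312 characters produces the same bytes in both, which means the detection is effectively correct here. The confidence is about 0.74 (74%), and the language field reports 'Chinese'.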
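Because GBK is the superset, a common workaround (an assumption here, not something chardet itself prescribes) is to decode with gbk whenever GB2312 is reported. The sketch below shows why: bytes that contain a GBK-only character cannot be decoded with the narrower gb2312 codec. The helper normalize_encoding is a hypothetical name used only for this illustration.

```python
text = "朱镕基"  # "镕" is commonly cited as being in GBK but not in GB2312

gbk_bytes = text.encode("gbk")

# Decoding with the superset codec works for GBK-encoded data.
assert gbk_bytes.decode("gbk") == text

# The narrower gb2312 codec rejects bytes outside its repertoire.
try:
    gbk_bytes.decode("gb2312")
    gb2312_ok = True
except UnicodeDecodeError:
    gb2312_ok = False


def normalize_encoding(name):
    """Map a detected 'GB2312' to 'gbk' (illustrative helper, not part of chardet)."""
    if name and name.lower() == "gb2312":
        return "gbk"
    return name


print(gb2312_ok, gbk_bytes.decode(normalize_encoding("GB2312")))
```

Mapping to gbk loses nothing for pure GB2312 text, since those byte sequences decode identically under both codecs.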

Note: for the list of encodings that chardet can detect, see "Supported encodings" in the official documentation.


Article link: https://www.haomeiwen.com/subject/vnixmktx.html
