Python encoding problems:
Almost everyone who uses Python runs into the following error sooner or later:
Traceback (most recent call last):
  File "amazon_test.py", line 30, in <module>
    print(s)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)
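The traceback above can be reproduced without print. On many setups, printing a non-ASCII string while standard output is bound to the ASCII codec (for example, when no UTF-8 locale is set) amounts to encoding the string as ASCII. A minimal sketch, where the sample text is an arbitrary assumption and not taken from amazon_test.py:

```python
# Four Chinese characters, written as escapes so the snippet itself is pure ASCII.
text = u"\u4e2d\u6587\u7f16\u7801"

try:
    # This is what an ASCII-bound output stream has to do with the text.
    text.encode("ascii")
    message = None
except UnicodeEncodeError as exc:
    message = str(exc)

# The same text encodes without trouble once the codec can represent it.
utf8_bytes = text.encode("utf-8")

print(message)
print(utf8_bytes.decode("utf-8") == text)
```

The fixes that follow all come down to the same thing: make sure every stream uses a codec that can represent the text.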
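Solution
First, you need a consistent environment:
- Make sure the locale is UTF-8; the output of the locale command should look like this: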
LANG=zh_CN.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
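To set this up:
# add to ~/.bashrc (export is needed so that child processes such as python see the values)
export LANG=zh_CN.UTF-8
export LANGUAGE=zh_CN:zh:en_US:en
export LC_ALL=en_US.UTF-8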
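- The first line of a .py file is usually
    #!/usr/bin/env python
  and the second line is
    # -*- coding: utf-8 -*-
  or
    # coding=utf-8
  Make sure the file itself is saved as UTF-8 (some people set their vim environment to gbk or chinese, in which case the file may be saved as GBK; watch out for this).
  P.S. Recommended vimrc settings: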
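set encoding=utf-8 " Vim's internal encoding; new files are created as utf-8
set termencoding=utf-8 " terminal encoding: text is encoded as utf-8 when drawn on the terminal screen
set fileencodings=utf-8,gb18030,gbk,cp936,gb2312 " encodings tried in order when opening a file, so files in several encodings can be viewed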
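Both environment checks above (the locale, and the encoding a file was saved in) can also be done from Python. The following is a minimal sketch, not a definitive tool; the helper names report_encodings and is_utf8_file are hypothetical and chosen only for illustration:

```python
import locale
import os
import sys
import tempfile


def report_encodings():
    """Return the encodings Python derived from the current environment."""
    return {
        "preferred": locale.getpreferredencoding(False),
        # Some replacement stdout objects have no usable encoding attribute.
        "stdout": getattr(sys.stdout, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
    }


def is_utf8_file(path):
    """Return True if the whole file decodes as UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Demo: the same two characters saved once as UTF-8 and once as GBK.
sample = u"\u4e2d\u6587"
with tempfile.TemporaryDirectory() as tmp:
    utf8_path = os.path.join(tmp, "utf8.txt")
    gbk_path = os.path.join(tmp, "gbk.txt")
    with open(utf8_path, "wb") as f:
        f.write(sample.encode("utf-8"))
    with open(gbk_path, "wb") as f:
        f.write(sample.encode("gbk"))
    utf8_ok = is_utf8_file(utf8_path)
    gbk_ok = is_utf8_file(gbk_path)

print(report_encodings())
print(utf8_ok, gbk_ok)
```

Note that a file containing only ASCII bytes also passes this check, whichever encoding the editor was set to, because ASCII is a subset of UTF-8.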
Python 2
- Decoding input streams
  - Reading a file
      with open(file_path, 'r') as f:
          for line in f:
              # line is a byte string; decode it into unicode
              line = line.decode('your_file_encoding', errors='ignore').strip()
  - Standard input
      import sys
      for line in sys.stdin:
          line = line.decode('your_file_encoding', errors='ignore').strip()
- Writing output in a specific encoding
    print >> sys.stdout, line.encode('gb18030', 'ignore')
    # or, preferably, use the following
    sys.stdout.write(line.encode('gb18030', 'ignore') + '\n')
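As an alternative to calling decode and encode by hand, io.open takes an encoding argument and hands back unicode text on Python 2.7, much as the built-in open does on Python 3. This is a swapped-in technique, not the method used above. A minimal sketch, assuming a hypothetical helper named convert_file and gb18030 input:

```python
import io
import os
import shutil
import tempfile


def convert_file(src_path, dst_path, src_encoding, dst_encoding):
    """Re-encode a text file line by line, dropping undecodable bytes."""
    with io.open(src_path, "r", encoding=src_encoding, errors="ignore") as src:
        with io.open(dst_path, "w", encoding=dst_encoding, errors="ignore") as dst:
            for line in src:
                # line is unicode text here on both Python 2.7 and Python 3
                dst.write(line)


# Demo: write a small gb18030 file, convert it, and read back the raw bytes.
tmp = tempfile.mkdtemp()
try:
    src_path = os.path.join(tmp, "in_gb18030.txt")
    dst_path = os.path.join(tmp, "out_utf8.txt")
    with io.open(src_path, "w", encoding="gb18030") as f:
        f.write(u"\u4e2d\u6587\n")
    convert_file(src_path, dst_path, "gb18030", "utf-8")
    with io.open(dst_path, "rb") as f:
        result = f.read()
finally:
    shutil.rmtree(tmp)

print(result.decode("utf-8").strip() == u"\u4e2d\u6587")
```

Keeping the decoding at the point where the file is opened means the rest of the program only ever handles unicode text.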
Python 3
- Decoding input streams
  - Reading a file
      with open(file_path, mode='r', encoding='gb18030', errors='ignore') as f:
          for line in f:
              # line is already a unicode str
              pass
  - Standard input
      import io
      import sys
      sys.stdin = io.TextIOWrapper(sys.stdin.buffer, encoding='utf-8')
      for line in sys.stdin:
          pass
    or, on Python 3.7 and later:
      import sys
      sys.stdin.reconfigure(encoding='utf-8')
      for line in sys.stdin:
          pass
- Encoding output
  - Writing a file
      with open(file_output, encoding='your_dest_encoding', mode='w') as f:
          f.write(line)
  - Output stream
      import sys
      import io
      sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
      sys.stdout.write(line + '\n')
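The Python 3 input and output steps above can be combined into one small filter that reads bytes in one encoding and writes them in another. This is a sketch under stated assumptions: the function name transcode is hypothetical, and the in-memory streams stand in for sys.stdin.buffer and sys.stdout.buffer.

```python
import io


def transcode(src, dst, src_encoding, dst_encoding, errors="strict"):
    """Copy text from binary stream src to binary stream dst, re-encoding it."""
    # newline="" keeps the original line endings untouched on both sides.
    reader = io.TextIOWrapper(src, encoding=src_encoding, errors=errors, newline="")
    writer = io.TextIOWrapper(dst, encoding=dst_encoding, errors=errors, newline="")
    for line in reader:
        writer.write(line)
    writer.flush()
    # Detach so the wrappers do not close the caller's underlying streams.
    reader.detach()
    writer.detach()


# Demo with in-memory byte streams instead of the real stdin/stdout buffers.
source = io.BytesIO(u"\u4e2d\u6587\nabc\n".encode("gb18030"))
target = io.BytesIO()
transcode(source, target, "gb18030", "utf-8")
print(target.getvalue() == u"\u4e2d\u6587\nabc\n".encode("utf-8"))
```

In a real script the call would be along the lines of transcode(sys.stdin.buffer, sys.stdout.buffer, 'gb18030', 'utf-8'). Passing errors='ignore', as the snippets above do, silently drops bytes that cannot be converted, whereas the default 'strict' raises an exception.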