Python Strings - PyTips 0x07

Author: 蛙声一爿 | Published: 2016-03-16 00:53


Project repository: https://git.io/pytips

Anyone who has used Python (2 or 3) has probably seen the following two error messages:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 0-1: invalid continuation byte

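These are the Python world's "锟斤拷" (the garbled text that appears when UTF-8 replacement characters are misread as GBK)!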
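Both errors are easy to reproduce. The following is a minimal sketch rather than anything from the original post; the sample text "雨天" and the choice of GBK as the "wrong" codec are assumptions made purely for illustration.

```python
text = "雨天"

# Encoding non-ASCII text with the ASCII codec raises UnicodeEncodeError.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    encode_error = exc
    print(type(exc).__name__, exc)

# Decoding bytes with a codec other than the one that produced them
# can raise UnicodeDecodeError (these GBK bytes are not valid UTF-8).
try:
    text.encode("gbk").decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print(type(exc).__name__, exc)

# The classic mojibake: two U+FFFD replacement characters, encoded as
# UTF-8 and then misread as GBK.
mojibake = "\ufffd\ufffd".encode("utf-8").decode("gbk")
print(mojibake)
```

The exact wording of the messages depends on the input, but the exception types are the same as in the two lines quoted above.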

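This installment and the next few focus on strings (str) and bytes (bytes) in Python and on converting between the two (encode/decode). That may not solve every garbled-text problem overnight, but it should help you quickly locate where a problem comes from.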

Definition

The Python documentation defines strings as follows:

Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points.

In Python 3.5, a string is an immutable sequence of Unicode code points:

('S' 'T' 'R' 'I' 'N' 'G')

'STRING'
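To make "a sequence of code points" concrete, the sketch below takes the same literal apart with ord() and rebuilds it with chr(). The variable names are illustrative only.

```python
# Adjacent string literals are concatenated into a single str object.
s = ('S' 'T' 'R' 'I' 'N' 'G')

chars = list(s)                      # one element per code point
code_points = [ord(c) for c in s]    # integer value of each code point
rebuilt = "".join(chr(n) for n in code_points)

print(chars)
print(code_points)
print(rebuilt)
```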

Immutable means that the string itself cannot be modified in place:

s = 'Hello'
print(s[3])
s[3] = 'o'

l

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-ce8cf24852f9> in <module>()
      1 s = 'Hello'
      2 print(s[3])
----> 3 s[3] = 'o'

TypeError: 'str' object does not support item assignment
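Because a string cannot be changed in place, "modifying" one always means building a new string. A minimal sketch of two common ways to do that (slicing, and going through a list) follows; neither is from the original post.

```python
s = "Hello"

# Build a new string from slices around the position to replace.
by_slicing = s[:3] + "o" + s[4:]

# Or convert to a mutable list, edit it, and join it back together.
chars = list(s)
chars[3] = "o"
by_list = "".join(chars)

print(by_slicing)
print(by_list)
print(s)  # the original string is untouched
```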

Sequence means that strings support the common operations shared by the sequence types (list/tuple/range), such as iteration:

[i.upper() for i in "hello"]

['H', 'E', 'L', 'L', 'O']
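Iteration is only one of the shared sequence operations. The sketch below lists a few others (indexing, slicing, membership, repetition, counting) as an illustration; the isinstance check relies on str being registered as a collections.abc.Sequence.

```python
from collections.abc import Sequence

s = "hello"

first, last = s[0], s[-1]   # indexing, including negative indices
middle = s[1:4]             # slicing
reversed_s = s[::-1]        # slicing with a negative step
has_ell = "ell" in s        # membership (substring test for str)
repeated = s * 2            # repetition
count_l = s.count("l")      # counting occurrences
is_sequence = isinstance(s, Sequence)

print(first, last, middle, reversed_s, has_ell, repeated, count_l, is_sequence)
```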

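For now, Unicode can be thought of as a very large map that aims to record the symbols of every writing system in the world, and a code point is the coordinate of one symbol on that map (the details will be covered in later installments).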

s = '雨'
print(s)
print(len(s))
print(s.encode())

雨
1
b'\xe9\x9b\xa8'
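The output above shows the difference between a code point and its encoded bytes: the string has length 1, yet its UTF-8 form has three bytes. A small sketch that makes the "coordinate" visible follows; GBK is included only as an assumed second encoding for comparison.

```python
s = "雨"

code_point = ord(s)          # the integer "coordinate" of the character
utf8 = s.encode("utf-8")     # encode() uses UTF-8 by default in Python 3
gbk = s.encode("gbk")        # a different encoding of the same code point

print(hex(code_point))
print(len(s), len(utf8), len(gbk))
print(chr(code_point) == s, utf8.decode("utf-8") == s)
```

The code point stays the same regardless of encoding; only the byte representation changes.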

Common operations

len: length of the string
split & join
find & index
strip
upper & lower & swapcase & title & capitalize
endswith & startswith & is*
zfill

# split & join
s = "Hello world!"
print(",".join(s.split())) # the usual split-and-rejoin idiom

"https://github.com/rainyear/pytips".split("/", 2) # limit the number of splits

Hello,world!

['https:', '', 'github.com/rainyear/pytips']
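A few details of split and join are easy to trip over. The sketch below is an illustration with made-up sample strings: split() with no argument collapses runs of whitespace, split(" ") does not, rsplit counts splits from the right, and join only accepts strings.

```python
s = "a  b c"  # note the two spaces between "a" and "b"

by_whitespace = s.split()     # runs of whitespace act as one separator
by_single_space = s.split(" ")  # every single space is a separator

url = "https://github.com/rainyear/pytips"
head, tail = url.rsplit("/", 1)  # split once, starting from the right

# join() requires str items, so convert other types first.
joined = "-".join(str(n) for n in [1, 2, 3])

print(by_whitespace)
print(by_single_space)
print(head, tail)
print(joined)
```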

s = "coffee"
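print(s.find('f'))    # search left to right, return the index of the first match
print(s.rfind('f'))   # search right to left, return the index of the first match found (the last occurrence)

print(s.find('a'))    # returns -1 if not found
print(s.index('a'))   # raises ValueError if not found; otherwise same as find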

2
3
-1

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-59556fd9319f> in <module>()
      4 
      5 print(s.find('a'))    # returns -1 if not found
----> 6 print(s.index('a'))   # raises ValueError if not found; otherwise same as find

ValueError: substring not found
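Which of the two to use depends on how a missing substring should be handled. The sketch below shows one possible pattern; position_or_none is a hypothetical helper written for this example, not a built-in.

```python
def position_or_none(text, sub):
    """Hypothetical helper: like str.index, but returns None when sub is missing."""
    try:
        return text.index(sub)
    except ValueError:
        return None

s = "coffee"

found = position_or_none(s, "f")
missing = position_or_none(s, "a")

# When only presence matters, the `in` operator is the simplest test.
contains_ff = "ff" in s

# find() also accepts a start position for repeated searches.
second_e = s.find("e", s.find("e") + 1)

print(found, missing, contains_ff, second_e)
```

Returning None avoids confusing the -1 from find with a valid (negative) index.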

print(" hello world    ".strip())
print("helloworld".strip("heo"))
print("["+"          i         ".lstrip() +"]")
print("["+"          i         ".rstrip() +"]")

hello world
lloworld
[i         ]
[          i]
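Note that the argument to strip is treated as a set of characters, not as a prefix or suffix, which is why "helloworld".strip("heo") removes only "he". The sketch below illustrates this; strip_prefix is a hypothetical helper for removing an exact prefix (Python 3.9 and later also offer str.removeprefix for this).

```python
def strip_prefix(text, prefix):
    """Hypothetical helper: remove prefix only if text really starts with it."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

# The order of characters in the argument does not matter.
same_result = "helloworld".strip("heo") == "helloworld".strip("oeh")

# Stripping stops at the first character that is not in the set.
domain = "www.example.com".lstrip("w.")

exact = strip_prefix("helloworld", "hello")
unchanged = strip_prefix("helloworld", "world")

print(same_result, domain, exact, unchanged)
```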

print("{}\n{}\n{}\n{}\n{}".format(
    "hello, WORLD".upper(),
    "hello, WORLD".lower(),
    "hello, WORLD".swapcase(),
    "hello, WORLD".capitalize(),
    "hello, WORLD".title()))

HELLO, WORLD
hello, world
HELLO, world
Hello, world
Hello, World
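Two caveats are worth knowing about the case methods. The sketch below, with sample strings chosen for illustration, shows that title() treats an apostrophe as a word boundary, and that casefold() is more aggressive than lower() and is therefore better suited to case-insensitive comparison.

```python
# title() capitalizes the letter after any non-letter, including apostrophes.
titled = "they're here".title()

# lower() keeps the German sharp s, while casefold() expands it.
lowered = "Straße".lower()
folded = "Straße".casefold()

# Case-insensitive comparison using casefold().
same = "Straße".casefold() == "STRASSE".casefold()

print(titled, lowered, folded, same)
```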

print("""
{}|{}
{}|{}
{}|{}
{}|{}
{}|{}
{}|{}
""".format(
    "Python".startswith("P"),"Python".startswith("y"),
    "Python".endswith("n"),"Python".endswith("o"),
    "i23o6".isalnum(),"1 2 3 0 6".isalnum(),
    "isalpha".isalpha(),"isa1pha".isalpha(),
    "python".islower(),"Python".islower(),
    "PYTHON".isupper(),"Python".isupper(),
))

True|False
True|False
True|False
True|False
True|False
True|False
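The predicates above have a few less obvious behaviors. The sketch below is an illustration only: startswith and endswith accept a tuple of candidates, the is* methods return False for an empty string, and they work on Unicode categories rather than on ASCII alone.

```python
# A tuple means "starts with any of these".
any_prefix = "Python".startswith(("J", "P"))

# An empty string has no characters to satisfy the predicate.
empty_alpha = "".isalpha()

# Non-ASCII letters count as alphabetic.
cjk_alpha = "雨".isalpha()

# isdigit() accepts superscript digits, isdecimal() does not.
superscript_digit = "²".isdigit()
superscript_decimal = "²".isdecimal()

only_spaces = "   ".isspace()

print(any_prefix, empty_alpha, cjk_alpha,
      superscript_digit, superscript_decimal, only_spaces)
```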

"101".zfill(8)

'00000101'
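zfill pads on the left with zeros up to the given width. The sketch below shows how it handles a sign and an already long string, and compares it with two alternatives (rjust and format) that give the same result for this input; the sample values are arbitrary.

```python
padded = "101".zfill(8)

# A leading sign stays in front of the padding.
signed = "-42".zfill(5)

# Strings that are already long enough are returned unchanged.
too_long = "12345".zfill(3)

# Equivalent results with other tools.
with_rjust = "101".rjust(8, "0")
with_format = format(5, "08b")   # 5 in binary, zero-padded to width 8

print(padded, signed, too_long, with_rjust, with_format)
```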

format / encode

Formatted output with format is a very useful tool and will be covered in its own installment; encode will be covered in detail in bytes-decode-Unicode-encode-bytes.
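As a brief preview, and only as a sketch with arbitrary sample values, here is what basic use of the two methods looks like:

```python
s = "雨"

# Positional placeholders, alignment, and number formatting.
sentence = "{} has {} character(s)".format(s, len(s))
right_aligned = "{:>8}".format("hi")
fixed_point = "{:08.3f}".format(3.14159)

# encode() turns a str into bytes; decode() reverses it with the same codec.
round_trip = s.encode("utf-8").decode("utf-8")

print(sentence)
print(repr(right_aligned))
print(fixed_point)
print(round_trip == s)
```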

Follow the PyHub WeChat official account!
