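Strings support all the common sequence operations and also implement many additional methods.
Under the title "String Methods", I will go through these methods one by one across several notes.
I keep these notes updated in this repository: https://github.com/orca-j35/python_notes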
join
🔨 str.join(iterable)
Return a string which is the concatenation of the strings in iterable. A TypeError
will be raised if there are any non-string values in iterable, including bytes
objects. The separator between elements is the string providing this method.
# Concatenate the strings in iterable, using the string the method is called on as the separator
>>> '-'.join(['ab','cd','ef'])
'ab-cd-ef'
>>> '-'.join(['ab'])
'ab'
>>> '-'.join([])
''
>>> '/'.join(dict(name='joy',age=3))
'name/age'
# If iterable contains any non-string object, a TypeError is raised
# bytes objects raise a TypeError as well
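The TypeError case described above can be sketched as follows. This is a minimal illustration: the lists `values` and `raw` are hypothetical inputs, and converting with `str()` or `bytes.decode()` is just one common workaround.

```python
# Hypothetical mixed input: join() accepts only str elements.
values = ['ab', 1, 2.5]

try:
    '-'.join(values)
    join_failed = False
except TypeError:
    # Raised because 1 (int) and 2.5 (float) are not str.
    join_failed = True

# Converting every element with str() first lets the join succeed.
joined = '-'.join(str(v) for v in values)

# bytes must be decoded explicitly; ASCII is assumed here for illustration.
raw = [b'ab', b'cd']
decoded = '-'.join(b.decode('ascii') for b in raw)

print(join_failed)  # True
print(joined)       # ab-1-2.5
print(decoded)      # ab-cd
```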
partition&rpartition
🔨 str.partition(sep)
Split the string at the first occurrence of sep, and return a 3-tuple containing the part before the separator, the separator itself, and the part after the separator. If the separator is not found, return a 3-tuple containing the string itself, followed by two empty strings.
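# This method splits the string into three parts
# Scanning from the lowest index, it splits at the first occurrence of sep and yields 3 strings:
# the characters before sep, sep itself, and the characters after sep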
>>> 'abcdabcd'.partition('cd')
('ab', 'cd', 'abcd')
>>> 'abcdabcd'.partition('ab')
('', 'ab', 'cdabcd')
>>> 'abcd'.partition('cd')
('ab', 'cd', '')
# If sep does not occur in the string, a 3-tuple is still returned:
# the original string comes first, followed by two empty strings
>>> 'abcdabcd'.partition('ef')
('abcdabcd', '', '')
🔨 str.rpartition(sep)
Split the string at the last occurrence of sep, and return a 3-tuple containing the part before the separator, the separator itself, and the part after the separator. If the separator is not found, return a 3-tuple containing two empty strings, followed by the string itself.
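# This method splits the string into three parts
# Scanning from the highest index, it splits at the last occurrence of sep and yields 3 strings:
# the characters before sep, sep itself, and the characters after sep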
>>> 'abcdabcd'.rpartition('cd')
('abcdab', 'cd', '')
>>> 'abcdabcd'.rpartition('ab')
('abcd', 'ab', 'cd')
>>> 'abcd'.rpartition('ab')
('', 'ab', 'cd')
# If sep does not occur in the string, a 3-tuple is still returned:
# two empty strings come first, followed by the original string
>>> 'abcdabcd'.rpartition('ef')
('', '', 'abcdabcd')
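Because both methods always return a 3-tuple, the result can be unpacked directly and the middle element tells whether sep was found. The sketch below illustrates this; `parse_setting` and `split_extension` are hypothetical helpers written for this note, not part of the standard library.

```python
def parse_setting(line):
    """Split 'key=value' at the first '=' (hypothetical helper)."""
    key, sep, value = line.partition('=')
    if not sep:
        # No '=' found: partition returned (line, '', '').
        return line.strip(), None
    return key.strip(), value.strip()


def split_extension(filename):
    """Split at the last '.' (hypothetical helper, not os.path.splitext)."""
    stem, sep, ext = filename.rpartition('.')
    if not sep:
        # No '.' found: rpartition returned ('', '', filename).
        return filename, ''
    return stem, ext


print(parse_setting('url = http://host/?a=b'))  # ('url', 'http://host/?a=b')
print(split_extension('archive.tar.gz'))        # ('archive.tar', 'gz')
print(split_extension('README'))                # ('README', '')
```

Only the first '=' is consumed by partition, so any later '=' stays inside the value.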
split&rsplit
🔨 str.split(sep=None, maxsplit=-1)
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).
If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, '1,,2'.split(',') returns ['1', '', '2']). The sep argument may consist of multiple characters (for example, '1<>2<>3'.split('<>') returns ['1', '2', '3']). Splitting an empty string with a specified separator returns [''].
# This method splits the string using sep as the delimiter and returns the pieces as a list
# Splitting starts from the left side of the string
>>> '1,2,3'.split(',')
['1', '2', '3']
# maxsplit sets the maximum number of splits; the default -1 means split as many times as possible
>>> '1,2,3'.split(',', maxsplit=1)
['1', '2,3']
>>> ''.split('-')
['']
>>> 'bcd'.split('a')
['bcd']
# Consecutive delimiters and a trailing delimiter both produce empty strings
>>> '1,2,,,3,'.split(',')
['1', '2', '', '', '3', '']
# sep may consist of multiple characters
>>> '1<>2<>3'.split('<>')
['1', '2', '3']
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns [].
# If sep is None, each run of consecutive whitespace is treated as a single delimiter
>>> '1 2 3'.split()
['1', '2', '3']
>>> '1\t2\n3'.split()
['1', '2', '3']
>>> '1,2,3'.split()
['1,2,3']
>>> '1 2 3'.split(maxsplit=1)
['1', '2 3']
# Leading and trailing whitespace does not produce empty strings
>>> ' 1 2 3 '.split()
['1', '2', '3']
# Splitting an empty string or a string of only whitespace returns an empty list
>>> ' '.split()
[]
>>> ''.split()
[]
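The two algorithms are easy to confuse when the separator is a single space. The short comparison below is a sketch using a made-up sample string `text`; it shows that `split()` and `split(' ')` give different results on the same input.

```python
# Sample string with leading, doubled, and trailing spaces (illustrative only).
text = '  a  b '

# sep=None: runs of whitespace collapse and no empty strings appear.
by_whitespace = text.split()

# sep=' ': every single space is a delimiter, so empty strings appear.
by_space = text.split(' ')

print(by_whitespace)  # ['a', 'b']
print(by_space)       # ['', '', 'a', '', 'b', '']
```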
🔨 str.rsplit(sep=None, maxsplit=-1)
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done, the rightmost ones. If sep is not specified or None, any whitespace string is a separator. Except for splitting from the right, rsplit() behaves like split() which is described in detail above.
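# This method splits the string using sep as the delimiter and returns the pieces as a list
# Splitting starts from the right side of the string; otherwise it behaves like split()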
>>> '1,2,3'.rsplit(',', maxsplit=1)
['1,2', '3']
>>> ',1,2,,3,'.rsplit(',')
['', '1', '2', '', '3', '']
>>> '1 2 3'.rsplit(maxsplit=1)
['1 2', '3']
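The direction only matters when maxsplit limits the number of splits. A minimal sketch follows; `split_host_port` is a hypothetical helper written for illustration, and it assumes the port, if present, follows the last ':'.

```python
def split_host_port(address):
    """Split 'host:port' at the last ':' (hypothetical helper)."""
    parts = address.rsplit(':', maxsplit=1)
    if len(parts) == 1:
        # No ':' in the address, so there is no port part.
        return parts[0], None
    host, port = parts
    return host, port


# With maxsplit=1, split() cuts at the first ':' and rsplit() at the last.
left_first = 'a:b:c'.split(':', maxsplit=1)
right_first = 'a:b:c'.rsplit(':', maxsplit=1)

print(left_first)                            # ['a', 'b:c']
print(right_first)                           # ['a:b', 'c']
print(split_host_port('example.com:8080'))   # ('example.com', '8080')
print(split_host_port('localhost'))          # ('localhost', None)
```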
splitlines
🔨 str.splitlines([keepends])
Return a list of the lines in the string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true.
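This method splits the string at line boundaries and returns the resulting lines as a list of strings. When keepends is true, the line boundaries are kept in the result.
The line boundaries used as split points are listed below. Note that they are a superset of the universal newlines ('\n', '\r\n', '\r').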
Representation | Description |
---|---|
\n | Line Feed |
\r | Carriage Return |
\r\n | Carriage Return + Line Feed |
\v or \x0b | Line Tabulation |
\f or \x0c | Form Feed |
\x1c | File Separator |
\x1d | Group Separator |
\x1e | Record Separator |
\x85 | Next Line (C1 Control Code) |
\u2028 | Line Separator |
\u2029 | Paragraph Separator |
Changed in version 3.2: \v and \f added to list of line boundaries.
# \r\n is treated as a single line boundary
>>> 'ab c\n\nde fg\rkl\r\n'.splitlines()
['ab c', '', 'de fg', 'kl']
>>> 'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
['ab c\n', '\n', 'de fg\r', 'kl\r\n']
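Since the line boundaries are a superset of the universal newlines, splitlines() can break a string at characters that split('\n') ignores. The sketch below uses a made-up string `text` containing a form feed and a line separator to show the difference.

```python
# Illustrative string: form feed (\x0c), line separator (\u2028), then '\n'.
text = 'page one\x0cpage two\u2028end\n'

# splitlines() also breaks at \x0c and \u2028, and drops the final '\n'.
lines = text.splitlines()

# split('\n') only recognizes '\n' and leaves the other boundaries in place.
parts = text.split('\n')

print(lines)  # ['page one', 'page two', 'end']
print(parts)  # ['page one\x0cpage two\u2028end', '']
```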
Unlike split() when a delimiter string sep is given, this method returns an empty list for the empty string, and a terminal line break does not result in an extra line:
# For an empty string, splitlines() returns an empty list
>>> "".splitlines()
[]
>>> "One line\n".splitlines()
['One line']
Compare with split('\n'):
# With sep given, split() returns a list containing one empty string for an empty string
>>> ''.split('\n')
['']
>>> 'Two lines\n'.split('\n')
['Two lines', '']