Python Strings and Text

Author: fcharming | Published 2018-04-10 18:06

    Summary notes from the Python Cookbook

    Matching strings with shell wildcards

    You want to match text strings using the wildcard patterns commonly
    used in Unix shells (such as *.py or Dat[0-9]*.csv).

    >>> from fnmatch import fnmatch, fnmatchcase
    >>> fnmatch('foo.txt', '*.txt')
    True
    >>> fnmatch('foo.txt', '?oo.txt')
    True
    >>> fnmatch('Dat45.csv', 'Dat[0-9]*')
    True
    >>> names = ['Dat1.csv', 'Dat2.csv', 'config.ini', 'foo.py']
    >>> [name for name in names if fnmatch(name, 'Dat*.csv')]
    ['Dat1.csv', 'Dat2.csv']
    

    fnmatch() matches patterns using the case-sensitivity rules of the
    underlying operating system, which differ from system to system. For example:

    >>> # On OS X (Mac)
    >>> fnmatch('foo.txt', '*.TXT')
    False
    >>> # On Windows
    >>> fnmatch('foo.txt', '*.TXT')
    True
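
    When matching has to behave the same on every platform, fnmatchcase() (imported above but not used so far) compares the name and the pattern exactly as written, with no case normalization. The sketch below also applies it to plain data strings rather than filenames; the address list is made-up sample data, used only for illustration.

```python
from fnmatch import fnmatchcase

# fnmatchcase() compares pattern and name exactly as written on every OS
upper_pattern = fnmatchcase('foo.txt', '*.TXT')
lower_pattern = fnmatchcase('foo.txt', '*.txt')
print(upper_pattern, lower_pattern)

# Made-up sample data, for illustration only
addresses = [
    '5412 N CLARK ST',
    '1060 W ADDISON ST',
    '1039 W GRANVILLE AVE',
    '2122 N CLARK ST',
    '4802 N BROADWAY',
]

# Wildcards work on any string, not just filenames
street_addresses = [a for a in addresses if fnmatchcase(a, '* ST')]
clark_5400_block = [a for a in addresses
                    if fnmatchcase(a, '54[0-9][0-9] *CLARK*')]
print(street_addresses)
print(clark_5400_block)
```

    Shell-style matching sits between plain string methods and full regular expressions, so it tends to fit best when the patterns are simple.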
    

    String matching and searching

    If you only need to match a literal string, the basic string methods are usually enough:

    >>> text = 'yeah, but no, but yeah, but no, but yeah'
    >>> # Exact match
    >>> text == 'yeah'
    False
    >>> # Match at start or end
    >>> text.startswith('yeah')
    True
    >>> text.endswith('no')
    False
    >>> # Search for the location of the first occurrence
    >>> text.find('no')
    10
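
    A few related string methods cover the remaining literal-search cases. The following is a minimal sketch that reuses the same text value as the session above; the variable names are chosen here for illustration.

```python
text = 'yeah, but no, but yeah, but no, but yeah'

# Membership test: use `in` when the position is not needed
has_no = 'no' in text

# find() returns -1 (it does not raise) when the substring is absent
missing = text.find('maybe')

# rfind() searches from the right; count() tallies non-overlapping matches
last_no = text.rfind('no')
but_count = text.count('but')

# startswith() and endswith() also accept a tuple of alternatives
starts_ok = text.startswith(('yeah', 'no'))

print(has_no, missing, last_no, but_count, starts_ok)
```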
    

    For more complex matching, use the regular expression module re:

    >>> text1 = '11/27/2012'
    >>> import re
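    >>> # match() looks for a match only at the start of the string
    >>> re.match(r'\d+/\d+/\d+', text1) is not None
    True
    >>> # Precompile the pattern if it will be used repeatedly
    >>> datepat = re.compile(r'\d+/\d+/\d+')
    >>> datepat.match(text1) is not None
    True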
    >>> text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
    >>> datepat.findall(text)
    ['11/27/2012', '3/13/2013']
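
    Adding capture groups (parentheses) to the pattern makes the individual date fields available. The sketch below is one way to use them with match(), findall(), and finditer(); the variable names are illustrative.

```python
import re

# Parentheses create capture groups: month, day, year
datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'

# groups() returns all captured fields as a tuple of strings
m = datepat.match('11/27/2012')
month, day, year = m.groups()

# With groups in the pattern, findall() returns a list of tuples
dates = datepat.findall(text)
iso_like = ['{}-{}-{}'.format(y, mo, d) for mo, d, y in dates]

# finditer() yields match objects one at a time, including positions
matched_text = [text[mt.start():mt.end()] for mt in datepat.finditer(text)]

print(month, day, year)
print(dates)
print(iso_like)
print(matched_text)
```

    Note that match() only anchors at the start of the string, so a pattern can still match when extra text follows; appending $ to the pattern is one way to require a full match.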
    

    Searching and replacing text

    For simple literal patterns, use the str.replace() method directly. For example:

    >>> text = 'yeah, but no, but yeah, but no, but yeah'
    >>> text.replace('yeah', 'yep')
    'yep, but no, but yep, but no, but yep'
    

    For more complex patterns, use the sub() function in the re module. To
    illustrate, suppose you want to rewrite dates of the form 11/27/2012 as 2012-11-27:

    >>> text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
    >>> import re
    >>> re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', text)
    'Today is 2012-11-27. PyCon starts 2013-3-13.'
    >>> # If you will perform many substitutions with the same pattern, consider compiling it first
    >>> datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
    >>> datepat.sub(r'\3-\1-\2', text)
    'Today is 2012-11-27. PyCon starts 2013-3-13.'
    >>> # For more complex substitutions, pass a callback function instead
    >>> from calendar import month_abbr
    >>> def change_date(m):
    ...     mon_name = month_abbr[int(m.group(1))]
    ...     return '{} {} {}'.format(m.group(2), mon_name, m.group(3))
    ...
    >>> datepat.sub(change_date, text)
    'Today is 27 Nov 2012. PyCon starts 13 Mar 2013.'
    

    If you also want to know how many substitutions were made, use re.subn() instead. For example:

    >>> newtext, n = datepat.subn(r'\3-\1-\2', text)
    >>> newtext
    'Today is 2012-11-27. PyCon starts 2013-3-13.'
    >>> n
    2
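
    The backreference form \3-\1-\2 copies the digits unchanged, so 3/13/2013 becomes 2013-3-13 without zero padding. A callback can combine with subn() to pad the fields; the following is a sketch, and the helper name to_iso is hypothetical.

```python
import re

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

def to_iso(m):
    """Hypothetical helper: reformat a month/day/year match as zero-padded YYYY-MM-DD."""
    month, day, year = (int(g) for g in m.groups())
    return '{:04d}-{:02d}-{:02d}'.format(year, month, day)

text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'

# subn() accepts a callback just like sub() and also returns the count
iso_text, n = datepat.subn(to_iso, text)
print(iso_text)
print(n)
```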
    

    To search for and replace text case-insensitively, pass the re.IGNORECASE flag:

    >>> text = 'UPPER PYTHON, lower python, Mixed Python'
    >>> re.findall('python', text, flags=re.IGNORECASE)
    ['PYTHON', 'python', 'Python']
    >>> re.sub('python', 'snake', text, flags=re.IGNORECASE)
    'UPPER snake, lower snake, Mixed snake'
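
    The replacement above always inserts lowercase 'snake', whatever the case of the matched text. If the replacement should follow the case of each match, one option is a callback factory. This is a sketch under simple assumptions (all-upper, all-lower, or capitalized matches); the helper name matchcase is illustrative.

```python
import re

def matchcase(word):
    """Return a callback that adapts `word` to the case of each match."""
    def replace(m):
        matched = m.group()
        if matched.isupper():
            return word.upper()
        elif matched.islower():
            return word.lower()
        elif matched[0].isupper():
            return word.capitalize()
        else:
            # Mixed case with a lowercase first letter: leave the word as given
            return word
    return replace

text = 'UPPER PYTHON, lower python, Mixed Python'
result = re.sub('python', matchcase('snake'), text, flags=re.IGNORECASE)
print(result)
```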
    

    Shortest-match (non-greedy) patterns

    >>> # The * operator is greedy: it matches as much text as possible
    >>> str_pat = re.compile(r'\"(.*)\"')
    >>> text1 = 'Computer says "no."'
    >>> str_pat.findall(text1)
    ['no.']
    >>> text2 = 'Computer says "no." Phone says "yes."'
    >>> str_pat.findall(text2)
    ['no." Phone says "yes.']
    >>> # Adding ? after * makes it non-greedy: it matches as little text as possible
    >>> str_pat = re.compile(r'\"(.*?)\"')
    >>> str_pat.findall(text2)
    ['no.', 'yes.']
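
    Two details are worth keeping in mind with quoted-text patterns like these. A negated character class gives the same result as the non-greedy form for this input, and the dot does not match a newline unless re.DOTALL is set. The sketch below shows both; the multi-line sample string is made up for illustration.

```python
import re

text2 = 'Computer says "no." Phone says "yes."'

lazy = re.compile(r'"(.*?)"')        # non-greedy: shortest possible match
negated = re.compile(r'"([^"]*)"')   # anything except a double quote

same_result = lazy.findall(text2) == negated.findall(text2)

# Made-up sample with a newline inside the quotes
multiline = 'He said "line one\nline two" and left.'

without_dotall = lazy.findall(multiline)   # '.' stops at the newline
with_dotall = re.compile(r'"(.*?)"', re.DOTALL).findall(multiline)
with_negated = negated.findall(multiline)  # [^"] also matches newlines

print(same_result, without_dotall, with_dotall, with_negated)
```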
    

    Aligning strings

    >>> # Use the string methods ljust(), rjust(), and center()
    >>> text = 'Hello World'
    >>> text.ljust(20)
    'Hello World         '
    >>> text.rjust(20)
    '         Hello World'
    >>> text.center(20)
    '    Hello World     '
    >>> text.rjust(20,'=')
    '=========Hello World'
    >>> text.center(20,'*')
    '****Hello World*****'
    >>> # The format() function can also align strings easily:
    >>> # use <, > or ^ followed by the desired width.
    >>> format(text, '>20')
    '         Hello World'
    >>> format(text, '<20')
    'Hello World         '
    >>> format(text, '^20')
    '    Hello World     '
    >>> format(text, '=>20s')
    '=========Hello World'
    >>> format(text, '*^20s')
    '****Hello World*****'
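
    The same alignment codes work inside str.format() fields and on values that are not strings, which is handy for lining up columns. A minimal sketch follows; the rows are made-up sample data.

```python
# Made-up sample data, for illustration only
rows = [('apple', 1.5), ('kiwi', 12.25), ('watermelon', 3.0)]

# Left-align names in 12 columns, right-align prices in 8 columns
header = '{:<12}{:>8}'.format('item', 'price')
lines = ['{:<12}{:>8.2f}'.format(name, price) for name, price in rows]

print(header)
print('-' * 20)
for line in lines:
    print(line)

# format() aligns non-string values directly
padded_number = format(1.2345, '>10.2f')
centered_number = format(42, '^10d')
print(repr(padded_number), repr(centered_number))
```

    Unlike ljust(), rjust(), and center(), which are string methods, format() accepts any value, so numbers need not be converted to strings first.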
    

    Reformatting text to a fixed column width

    >>> s = "Look into my eyes, look into my eyes, the eyes, the eyes, \
    ... the eyes, not around the eyes, don't look around the eyes, \
    ... look into my eyes, you're under."
    >>> import textwrap
    >>> print(textwrap.fill(s, 70))
    Look into my eyes, look into my eyes, the eyes, the eyes, the eyes,
    not around the eyes, don't look around the eyes, look into my eyes,
    you're under.
    >>> print(textwrap.fill(s, 40))
    Look into my eyes, look into my eyes,
    the eyes, the eyes, the eyes, not around
    the eyes, don't look around the eyes,
    look into my eyes, you're under.
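    >>> print(textwrap.fill(s, 40, initial_indent=' '))
     Look into my eyes, look into my eyes,
    the eyes, the eyes, the eyes, not around
    the eyes, don't look around the eyes,
    look into my eyes, you're under.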
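
    Two common variations are wrapping to the current terminal width and using a hanging indent. The sketch below assumes shutil.get_terminal_size() as the width source (it falls back to 80 columns when no terminal is attached) and uses the subsequent_indent option of textwrap.fill().

```python
import shutil
import textwrap

s = ("Look into my eyes, look into my eyes, the eyes, the eyes, "
     "the eyes, not around the eyes, don't look around the eyes, "
     "look into my eyes, you're under.")

# Wrap to the terminal width; the fallback is 80 columns without a terminal
columns = shutil.get_terminal_size().columns
print(textwrap.fill(s, width=columns))

# subsequent_indent indents every line except the first (a hanging indent)
hanging = textwrap.fill(s, 40, subsequent_indent='    ')
print(hanging)
```

    The indent counts toward the width, so each wrapped line still fits within 40 columns.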
    

    Article link: https://www.haomeiwen.com/subject/bxmnhftx.html