Python Regular Expressions

Author: 断尾壁虎V | Published: 2017-12-15 14:32

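Commonly used matching patterns in regular expressions:

[image.png: table of common matching patterns]

Python regular expression syntax:

[image.png: table of Python regular expression syntax]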

To use regular expressions, import the re module:

import re
print(re.findall(r'\w', 'hello_ | @ 12123'))   # letters, digits, and underscore
print(re.findall(r'\W', 'hello_ | @ 12123'))   # anything except letters, digits, and underscore
print(re.findall(r'\s', 'hello_ | @ 12123'))   # whitespace characters
print(re.findall(r'\S', 'hello_ | @ 12123'))   # non-whitespace characters
print(re.findall(r'\d', 'hello_ | @ 12123'))   # digits
print(re.findall(r'\D', 'hello_ | @ 12123'))   # non-digits
print(re.findall(r'\Ah', 'hello_ | @ 12123'))  # h at the start of the string
print(re.findall(r'^h', 'hello_ | @ 12123'))   # h at the start of the string
print(re.findall(r'123\Z', 'hello_ | @ 12123'))  # 123 at the end of the string
print(re.findall(r'123$', 'hello_ | @ 12123'))   # 123 at the end of the string
print(re.findall('\n', 'hello_ | @ 12123 \n df \t'))  # newline
print(re.findall('\t', 'hello_ | @ 12123 \n df \t'))  # tab character

Output:
['h', 'e', 'l', 'l', 'o', '_', '1', '2', '1', '2', '3']
[' ', '|', ' ', '@', ' ']
[' ', ' ', ' ']
['h', 'e', 'l', 'l', 'o', '_', '|', '@', '1', '2', '1', '2', '3']
['1', '2', '1', '2', '3']
['h', 'e', 'l', 'l', 'o', '_', ' ', '|', ' ', '@', ' ']
['h']
['h']
['123']
['123']
['\n']
['\t']
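One detail the output above does not show: for str patterns in Python 3, \w is Unicode-aware, so it also matches characters such as Chinese ideographs. The sketch below uses a made-up sample string and the re.ASCII flag (not used in the original article) to compare the two behaviours.

```python
import re

text = '中文_a1'  # sample string chosen for illustration

# str patterns are Unicode-aware by default, so \w also matches CJK characters.
unicode_words = re.findall(r'\w', text)

# re.ASCII restricts \w to letters a-z, A-Z, digits 0-9, and underscore.
ascii_words = re.findall(r'\w', text, flags=re.ASCII)

print(unicode_words)  # ['中', '文', '_', 'a', '1']
print(ascii_words)    # ['_', 'a', '1']
```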

Wildcards

. matches any single character except a newline:

import re
print(re.findall('a.b', 'a\nb a1b'))  # ['a1b']
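The result above contains only 'a1b' because . skips newlines by default. As a minimal sketch, the re.DOTALL flag (an addition, not part of the original article) makes . match newlines as well:

```python
import re

text = 'a\nb a1b'

# Default behaviour: . does not match '\n', so only 'a1b' is found.
default_matches = re.findall(r'a.b', text)

# With re.DOTALL, . also matches '\n', so 'a\nb' is found too.
dotall_matches = re.findall(r'a.b', text, flags=re.DOTALL)

print(default_matches)  # ['a1b']
print(dotall_matches)   # ['a\nb', 'a1b']
```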

* matches the preceding character zero or more times:

import re
print(re.findall('ab*', 'ab  a  abbb vd'))  # ['ab', 'a', 'abbb']

? matches the preceding character zero or one time:

import re
print(re.findall('ab?', 'a  abbbbb'))  # ['a', 'ab']

+ matches the preceding character one or more times:

import re
print(re.findall('ab+', 'abbb a abc'))  # ['abbb', 'ab']

Match every number, including decimals. The backslash before . escapes it so that it matches a literal dot:

print(re.findall(r'\d+\.?\d*', "asdfasdf123as1.13dfa12adsf1asdf3"))  # ['123', '1.13', '12', '1', '3']
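One caveat with this pattern: because the dot and the trailing digits are each optional on their own, a number followed by a sentence-ending dot is matched together with that dot. The sketch below, using a made-up input string, compares the original pattern with a stricter variant that consumes a dot only when at least one digit follows it. The stricter pattern is a suggestion, not the article's.

```python
import re

text = 'price 12. total 3.50 qty 7'  # sample input chosen for illustration

# Original pattern: '12.' is matched with its trailing dot.
loose = re.findall(r'\d+\.?\d*', text)

# Stricter variant: the non-capturing group (?:\.\d+)? requires digits after the dot.
strict = re.findall(r'\d+(?:\.\d+)?', text)

print(loose)   # ['12.', '3.50', '7']
print(strict)  # ['12', '3.50', '7']
```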

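.* is greedy by default and matches as much as possible:

print(re.findall('a.*b', 'a1b22222222b'))  # ['a1b22222222b']

.*? is non-greedy and matches as little as possible. It is the recommended form when you want the shortest match:

print(re.findall('a.*?b', 'a1b22222222b'))  # ['a1b']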
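The difference matters most when a delimiter occurs several times in the input. A small sketch with a made-up HTML-like string:

```python
import re

html = '<b>bold</b> and <i>italic</i>'  # sample input chosen for illustration

# Greedy: runs from the first '<' to the last '>' in the string.
greedy = re.findall(r'<.*>', html)

# Non-greedy: stops at the nearest '>' each time, yielding one tag per match.
lazy = re.findall(r'<.*?>', html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```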

{m,n} matches the preceding character at least m and at most n times:

import re
print(re.findall('ab{2,4}', 'abbbbb'))  # ['abbbb']
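The same syntax has shorter forms: {m} for exactly m repetitions, {m,} for m or more, and {,n} for at most n. The sketch below tests a made-up word list with re.fullmatch, which requires the pattern to cover the whole string, so that each form's effect is easy to see.

```python
import re

words = ['ab', 'abb', 'abbb', 'abbbb']  # sample inputs chosen for illustration

# {2}: exactly two 'b' characters.
exactly_two = [w for w in words if re.fullmatch(r'ab{2}', w)]

# {2,}: two or more 'b' characters.
two_or_more = [w for w in words if re.fullmatch(r'ab{2,}', w)]

# {,2}: zero to two 'b' characters.
at_most_two = [w for w in words if re.fullmatch(r'ab{,2}', w)]

print(exactly_two)  # ['abb']
print(two_or_more)  # ['abb', 'abbb', 'abbbb']
print(at_most_two)  # ['ab', 'abb']
```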

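\ is the escape character. The following call raises an exception:

print(re.findall('a\c', 'a\c'))

Python does not recognize \c as a string escape, so it passes a\c to re unchanged. The regex engine then reads \c as an escape sequence of its own, finds it invalid, and raises re.error. To match a literal backslash, the pattern that reaches re must contain two backslashes:

print(re.findall(r'a\\c', r'a\c'))  # the r prefix makes a raw string: Python leaves the backslashes alone, so re receives a\\c
print(re.findall('a\\\\c', r'a\c'))  # the same pattern without a raw string; both calls return ['a\\c']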
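When the text to match is a literal string rather than a hand-written pattern, re.escape can add the backslashes automatically. This is a minimal sketch. The path string is made up for illustration, and re.escape is not used in the original article.

```python
import re

path = r'C:\new\table.txt'  # sample input chosen for illustration

# re.escape backslash-escapes regex metacharacters, including '\' itself.
pattern = re.escape(r'\new')

found = re.findall(pattern, path)

print(pattern)  # \\new
print(found)    # ['\\new']
```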

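Other re functions

search stops at the first match and returns a match object, or None if nothing matches. Call group() on the match object to get the matched text.

import re
print(re.search('ab', 'dddd  abbbbbabab'))  # <re.Match object; span=(6, 8), match='ab'> (shown as _sre.SRE_Match before Python 3.7)
print(re.search('ab', 'dddd  abbbbbabab').group())  # ab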
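Because search returns None when there is no match, calling group() directly on the result can raise AttributeError. The sketch below wraps the check in a small helper. The name first_match is hypothetical and exists only for this illustration.

```python
import re

def first_match(pattern, text):
    """Return the first matched substring, or None when nothing matches.

    first_match is a hypothetical helper, not part of the re module.
    """
    m = re.search(pattern, text)
    return m.group() if m is not None else None

print(first_match('ab', 'dddd  abbbbbabab'))   # ab
print(first_match('xyz', 'dddd  abbbbbabab'))  # None
```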

match matches only at the beginning of the string, which is equivalent to search with ^:

print(re.match('e', 'alex make love'))    # None
print(re.search('^e', 'alex make love'))  # None
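For comparison, here is a short sketch that runs match, search, and re.fullmatch on the same string. re.fullmatch is an addition here; it succeeds only when the pattern covers the entire string.

```python
import re

text = 'alex make love'

at_start = re.match('alex', text)      # matches: 'alex' begins at position 0
not_at_start = re.match('make', text)  # None: 'make' is not at the start
anywhere = re.search('make', text)     # matches: search scans the whole string
whole = re.fullmatch('alex', text)     # None: the pattern must cover the whole string

print(at_start.group())  # alex
print(not_at_start)      # None
print(anywhere.span())   # (5, 9)
print(whole)             # None
```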

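split splits a string wherever the pattern matches:

print(re.split('[ab]', 'abcd'))  # ['', '', 'cd']: 'a' and 'b' are both delimiters, so there is an empty string before 'a' and another between 'a' and 'b'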
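Two options of re.split are worth knowing: maxsplit limits the number of splits, and a capturing group in the pattern keeps the delimiters in the result. A sketch with made-up input strings:

```python
import re

text = 'a, b;c,  d'  # sample input chosen for illustration

# Split on a comma or semicolon followed by optional whitespace.
parts = re.split(r'[,;]\s*', text)

# maxsplit=1 stops after the first split and leaves the rest intact.
first_only = re.split(r'[,;]\s*', text, maxsplit=1)

# A capturing group around the delimiter keeps it in the output list.
with_delims = re.split(r'([,;])\s*', 'a, b;c')

print(parts)        # ['a', 'b', 'c', 'd']
print(first_only)   # ['a', 'b;c,  d']
print(with_delims)  # ['a', ',', 'b', ';', 'c']
```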

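sub replaces every match unless a count is given:

print(re.sub('a', 'A', 'i have apple'))  # i hAve Apple
print(re.sub('a', 'A', 'I have a dog', count=1))  # I hAve a dog (one replacement)
print(re.sub('a', 'A', 'I have a dog', count=2))  # I hAve A dog (two replacements)

Captured groups can be reordered in the replacement string with \1, \2, and so on:

print(re.sub(r'(\w)(\W+)(\w+)(\W+)(\w+)(\W+)(\w+)', r'\7\2\3\4\5\6\1', 'I have a dog'))  # dog have a I
print(re.sub(r'(\w)( .* )(\w+)', r'\3\2\1', 'I have a dog'))  # dog have a I

subn works like sub but also returns the number of replacements made:

print(re.subn('a', 'A', 'I have a dog'))  # ('I hAve A dog', 2)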
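Numbered backreferences become hard to read as the group count grows. The sketch below shows two alternatives that the original article does not cover: named groups referenced with \g<name>, and a function as the replacement. The helper name double and the input strings are made up for illustration.

```python
import re

# Named groups: (?P<name>...) in the pattern, \g<name> in the replacement.
date = re.sub(
    r'(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})',
    r'\g<d>/\g<m>/\g<y>',
    'published 2017-12-15',
)

def double(match):
    """Hypothetical helper: return the matched number doubled, as a string."""
    return str(int(match.group()) * 2)

# A callable replacement receives each match object and returns the new text.
doubled = re.sub(r'\d+', double, 'a1 b22')

print(date)     # published 15/12/2017
print(doubled)  # a2 b44
```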

compile builds a reusable pattern object ahead of time:

obj = re.compile(r'\d{2}')

print(obj.search('abc123eeee').group())  # 12
print(obj.findall('abc123eeee'))  # ['12'], reusing obj
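A compiled pattern is most useful when the same expression is applied to many inputs, and flags can be fixed once at compile time. A minimal sketch with made-up input lines; the variable names are illustrative.

```python
import re

two_digits = re.compile(r'\d{2}')
lines = ['abc123eeee', 'no digits', 'x9 y42']  # sample inputs chosen for illustration

# The same compiled object is reused for every line.
results = [two_digits.findall(line) for line in lines]

# Flags passed to compile apply to every later call on the object.
python_word = re.compile(r'python', re.IGNORECASE)
hits = python_word.findall('Python PYTHON python')

print(results)  # [['12'], [], ['42']]
print(hits)     # ['Python', 'PYTHON', 'python']
```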

Article link: https://www.haomeiwen.com/subject/bvpkwxtx.html
