Regular Expressions: the re Package (2018-10-02)

Author: 默写年华Antifragile | Published 2018-10-02 20:17


re: regular expression, commonly abbreviated as regex

Regular expression rules reference: version v2.3.5 (2017-6-12), by deerchao; http://deerchao.net/tutorials/regex/regex.htm
-------------------------------------------------------------------------------------
What regular expressions do: a regular expression searches a string for the content that matches a specific pattern.
-------------------------------------------------------------------------------------

Commonly used `re` functions:

re.compile(pattern, flags=0)
Converts a regular expression pattern into a regular expression object.
Compile a regular expression pattern into a regular expression object, which can be used for matching using its match(), search() and other methods, described below.

prog = re.compile(pattern)
result = prog.match(string)

is equivalent to

result = re.match(pattern, string)
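
As a small sketch of that equivalence, the snippet below compiles a pattern once and reuses it on several strings, then repeats the same check with the module-level function. The pattern and the sample strings are made up for illustration.

```python
import re

# Hypothetical pattern and inputs, chosen only for illustration.
word_at_start = re.compile(r"[a-z]+", flags=re.IGNORECASE)
lines = ["Hello world", "42 is a number", "regex"]

# Reuse the compiled object instead of passing the pattern on every call.
compiled_results = [bool(word_at_start.match(line)) for line in lines]

# The module-level function gives the same answers.
direct_results = [
    bool(re.match(r"[a-z]+", line, flags=re.IGNORECASE)) for line in lines
]

print(compiled_results)  # [True, False, True]
print(compiled_results == direct_results)  # True
```

Compiling first is mostly a convenience when the same pattern is applied many times; the results do not change.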

-------------------------------------------------------------------------------------

re.search(pattern, string, flags=0)
Finds the first place in string where pattern matches.
Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding match object. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
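
The difference between "no match" and "a zero-length match" is easy to miss, so here is a short sketch. The sample strings are invented for the example.

```python
import re

text = "order 66 shipped, order 77 pending"

# search() returns a match object for the first match only.
first = re.search(r"\d+", text)
print(first.group(), first.span())  # 66 (6, 8)

# No digits anywhere: search() returns None.
missing = re.search(r"\d+", "no digits here")
print(missing)  # None

# \d* can match zero characters, so this is a zero-length match, not None.
empty = re.search(r"\d*", "no digits here")
print(repr(empty.group()), empty.span())  # '' (0, 0)
```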
-------------------------------------------------------------------------------------
re.match(pattern, string, flags=0)
Matches pattern only at the beginning of string; unlike search(), it does not match at arbitrary positions.
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.
Note that even in MULTILINE mode, re.match() will only match at the beginning of the string and not at the beginning of each line.
If you want to locate a match anywhere in string, use search() instead (see also search() vs. match()).
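
A quick sketch of the match() versus search() behaviour described above, including the MULTILINE case. The two-line string is just a sample.

```python
import re

text = "first line\nsecond line"

# match() only looks at the very beginning of the string.
at_start = re.match(r"first", text)
not_at_start = re.match(r"second", text)
print(at_start is not None, not_at_start)  # True None

# Even with MULTILINE, match() does not try the start of the second line.
multiline_match = re.match(r"^second", text, flags=re.MULTILINE)
print(multiline_match)  # None

# search() with MULTILINE does find it, because ^ matches after the newline.
multiline_search = re.search(r"^second", text, flags=re.MULTILINE)
print(multiline_search.span())  # (11, 17)
```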
-------------------------------------------------------------------------------------
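re.split(pattern, string, maxsplit=0, flags=0)
Splits string at every match of pattern. If pattern contains capturing parentheses, the matched separators are returned in the result list as well. If maxsplit is nonzero, at most maxsplit splits occur and the rest of the string is returned as the final element.
In the examples below, '\W+' matches one or more characters that are not letters, digits, or underscores, so it matches each comma together with the space after it, as well as the final period, and the string is split at those points; the trailing '' in the first result comes from the match on the final period. The second example adds parentheses, so the matched separators are returned too.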

>>> re.split(r'\W+', 'Words, words, words.')
['Words', 'words', 'words', '']
>>> re.split(r'(\W+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
>>> re.split(r'\W+', 'Words, words, words.', 1)
['Words', 'words, words.']
>>> re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)
['0', '3', '9']

If there are capturing groups in the separator and it matches at the start of the string, the result will start with an empty string. The same holds for the end of the string:

>>> re.split(r'(\W+)', '...words, words...')
['', '...', 'words', ', ', 'words', '...', '']
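
If the empty strings at the ends are not wanted, one simple option is to filter them out afterwards. The sketch below also passes maxsplit by keyword, which reads more clearly than a bare positional 1. The variable names are only for illustration.

```python
import re

text = "...words, words..."

# Without a capturing group, only the pieces between separators are returned,
# but the matches at both ends still produce empty strings.
parts = re.split(r"\W+", text)
print(parts)  # ['', 'words', 'words', '']

# Drop the empty strings with a list comprehension.
non_empty = [part for part in parts if part]
print(non_empty)  # ['words', 'words']

# Limit the number of splits with the maxsplit keyword.
limited = re.split(r"\W+", "Words, words, words.", maxsplit=1)
print(limited)  # ['Words', 'words, words.']
```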

-------------------------------------------------------------------------------------

re.sub(pattern, repl, string, count=0, flags=0)
Replaces the non-overlapping matches of pattern in string with repl:
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn't found, string is returned unchanged. repl can be a string or a function; if it is a string, any backslash escapes in it are processed. That is, \n is converted to a single newline character, \r is converted to a carriage return, and so forth. Unknown escapes such as \& are left alone (since Python 3.7, an unknown escape made of a backslash and an ASCII letter is an error). Backreferences, such as \6, are replaced with the substring matched by group 6 in the pattern. For example:

>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
...        r'static PyObject*\npy_\1(void)\n{',
...        'def myfunc():')
'static PyObject*\npy_myfunc(void)\n{'

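Here the whole of def myfunc(): is matched. Because ([a-zA-Z_][a-zA-Z_0-9]*) is wrapped in parentheses, the myfunc it matches becomes group 1. repl then replaces the matched text without overlap; since the entire string was matched, all of it is replaced, and group 1 is substituted wherever \1 appears in repl.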
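
A smaller sketch of the same backreference idea, this time with several groups and with the count argument. The date strings are sample data, not part of the original example.

```python
import re

text = "2018-10-02 and 2017-06-12"
date_pattern = r"(\d{4})-(\d{2})-(\d{2})"

# \1, \2, \3 refer to the year, month, and day groups of each match.
swapped = re.sub(date_pattern, r"\3/\2/\1", text)
print(swapped)  # 02/10/2018 and 12/06/2017

# count=1 replaces only the leftmost match and leaves the rest untouched.
first_only = re.sub(date_pattern, r"\3/\2/\1", text, count=1)
print(first_only)  # 02/10/2018 and 2017-06-12
```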

When repl is a function:
If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string. For example:

>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
>>> re.sub(r'\sAND\s', ' & ', 'Baked Beans And Spam', flags=re.IGNORECASE)
'Baked Beans & Spam'
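
A function repl is handy when the replacement has to be computed from the matched text. As a rough sketch, the hypothetical helper double_number below doubles every number it finds.

```python
import re

def double_number(match):
    # match.group(0) is the full text of the current match.
    return str(int(match.group(0)) * 2)

# The function is called once per non-overlapping match of \d+.
doubled = re.sub(r"\d+", double_number, "3 apples and 10 pears")
print(doubled)  # 6 apples and 20 pears
```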
