Regular Expression

Author: 变与不变 | Published: 2015-01-27 09:02

I spent 3 hours learning about regular expressions. The lesson comes from The Linux Command Line, an open source e-book about the Linux command line and bash. To demonstrate regular expressions, the book uses the grep command:

The name grep is actually derived from the phrase "global regular expression print".

Metacharacters

Regular expression metacharacters consist of the following:
^ $ . [] {} - ? * + () | \
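
To make "metacharacter" concrete, here is a minimal sketch in Python. Python's re module is not the POSIX engine that grep uses, so treat this as an analogy only; the sample strings are invented for illustration.

```python
import re

text = "price: 3.50 (approx)"  # made-up sample text

# "." is a metacharacter: it matches any single character, not only a dot.
dot_matches_x = re.search(r"3.50", "3x50") is not None

# Escaped with a backslash, it matches only a literal dot.
escaped_matches_x = re.search(r"3\.50", "3x50") is not None
escaped_matches_dot = re.search(r"3\.50", text) is not None

# re.escape() backslash-escapes the metacharacters in a string for us,
# so the parentheses below are matched literally.
literal = re.escape("(approx)")
found = re.search(literal, text).group(0)

print(dot_matches_x, escaped_matches_x, escaped_matches_dot, found)
```

All other characters are literals: they simply match themselves.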

POSIX Character Classes

First, we need to understand a little of the history of character encoding:

Back when Unix was first developed, it only knew about ASCII characters, and this feature reflects that fact. In ASCII, the first 32 characters (numbers 0-31) are control codes (things like tabs, backspaces, and carriage returns). The next 32 (32-63) contain printable characters, including most punctuation characters and the numerals zero through nine. The next 32 (numbers 64-95) contain the uppercase letters and a few more punctuation symbols. The final 31 (numbers 96-127) contain the lowercase letters and yet more punctuation symbols. Based on this arrangement, systems using ASCII used a collation order that looked like this:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
This differs from proper dictionary order, which is like this:
aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ
As the popularity of Unix spread beyond the United States, there grew a need to support characters not found in U.S. English. The ASCII table was expanded to use a full eight bits, adding characters numbers 128-255, which accommodated many more languages. To support this ability, the POSIX standards introduced a concept called a locale, which could be adjusted to select the character set needed for a particular location.
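
The practical consequence of the two orderings is what a range such as A-Z covers. The sketch below is a simplified model, not how any real locale is implemented: the `in_range` helper and the two order strings are hypothetical, built only from the two orderings quoted above.

```python
import string

# ASCII order: all uppercase letters, then all lowercase letters.
ascii_order = string.ascii_uppercase + string.ascii_lowercase

# Idealized dictionary order: aAbBcC...zZ
dictionary_order = "".join(
    lower + upper
    for lower, upper in zip(string.ascii_lowercase, string.ascii_uppercase)
)

def in_range(ch, first, last, order):
    """Return True if ch lies between first and last in the given order."""
    return order.index(first) <= order.index(ch) <= order.index(last)

# In ASCII order, the range A-Z contains only uppercase letters.
b_in_ascii_range = in_range("b", "A", "Z", ascii_order)

# In dictionary order, lowercase "b" sits between "A" and "Z",
# while lowercase "a" sorts before "A" and falls outside the range.
b_in_dict_range = in_range("b", "A", "Z", dictionary_order)
a_in_dict_range = in_range("a", "A", "Z", dictionary_order)

print(b_in_ascii_range, b_in_dict_range, a_in_dict_range)
```

This is why POSIX character classes such as [:upper:] are a safer way to say "uppercase letters" than a range whose meaning depends on the collation order.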

We also need to keep pathname expansion (globbing) and regular expressions apart: they are different pattern languages, although POSIX character classes can be used in both.
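
A rough Python illustration of that difference, assuming the standard fnmatch module as a stand-in for shell-style wildcards (it matches strings rather than expanding real pathnames, and it does not support POSIX character classes). The file names are invented.

```python
import fnmatch
import re

names = ["notes.txt", "txt", "notes.md"]  # made-up file names

# Wildcard pattern: "*" on its own stands for any run of characters,
# and "." is just a literal dot.
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "*.txt")]

# Regular expression: "*" repeats the preceding item and "." matches any
# character, so the equivalent pattern has to be spelled ".*\.txt".
regex_hits = [n for n in names if re.fullmatch(r".*\.txt", n)]

# Reusing the wildcard pattern as a regex does not work. In Python's re
# module a leading "*" has nothing to repeat and is rejected outright.
try:
    re.compile("*.txt")
    glob_is_valid_regex = True
except re.error:
    glob_is_valid_regex = False

print(glob_hits, regex_hits, glob_is_valid_regex)
```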

BRE and ERE

BRE (basic regular expressions): the following metacharacters are recognized:
^ $ . [] *
ERE (extended regular expressions): besides the BRE metacharacters, the following metacharacters (and their associated functions) are added:
() {} ? + |

The “(”, “)”, “{”, and “}” characters are treated as metacharacters in BRE if they are escaped with a backslash, whereas with ERE, preceding any metacharacter with a backslash causes it to be treated as a literal.
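
Python's re module is neither BRE nor ERE, but in this one respect it behaves like ERE, so it can serve as a small sketch of the ERE half of the rule. The sample text is invented, and the BRE half (where the backslash switches the special meaning on) cannot be shown this way.

```python
import re

text = "abab (ab)"  # made-up sample text

# Unescaped parentheses and "+" are metacharacters, as in ERE:
# "(ab)+" means one or more repetitions of the group "ab".
grouped = re.search(r"(ab)+", text).group(0)

# Preceded by a backslash, the parentheses become literals, as in ERE.
literal = re.search(r"\(ab\)", text).group(0)

print(repr(grouped), repr(literal))
```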

Finally, what does POSIX stand for? POSIX is the Portable Operating System Interface (with the "X" added to the end for extra snappiness).
