Java Regular Expressions

Author: 盼旺 | Published 2019-09-25 22:04

A regular expression example

Check whether the string http:// occurs in a text.

import java.util.regex.Pattern;
public class RegexText {
    public static void main(String[] args) {
        String text    =
                "This is the text to be searched " +
                        "for occurrences of the http:// pattern.";
        String pattern = ".*http://.*";
        boolean matches = Pattern.matches(pattern, text);
        System.out.println("matches = " + matches);
        //matches = true
    }
}

The regular expression API

Pattern (java.util.regex.Pattern)

Pattern.matches()

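The most direct way to check whether a regular expression matches a piece of text is to call the static method Pattern.matches(). It returns true only when the pattern matches the whole text.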
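
Because Pattern.matches() has to cover the entire text, the opening example wraps http:// in .* on both sides. The sketch below illustrates the difference; the class MatchesWholeInput and the helper wholeMatch are made-up names for illustration only.

```java
import java.util.regex.Pattern;

class MatchesWholeInput {
    // Hypothetical helper: true only if the regex matches the entire text.
    static boolean wholeMatch(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        String text = "for occurrences of the http:// pattern.";
        // The bare pattern covers only part of the text, so this prints false.
        System.out.println(wholeMatch("http://", text));
        // Padding with .* lets the pattern cover the whole text, so this prints true.
        System.out.println(wholeMatch(".*http://.*", text));
    }
}
```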

Pattern.compile()
To find multiple occurrences, or to apply the same expression more than once, obtain a Pattern instance from Pattern.compile().

String text =
        "This is the text to be searched " +
        "for occurrences of the http:// pattern.";
String patternString = ".*http://.*";
// Without flags: Pattern pattern = Pattern.compile(patternString);
// compile() also accepts special flags; the one below ignores case.
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
// Declared in Pattern as: public static final int CASE_INSENSITIVE = 0x02;
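
A compiled Pattern is immutable, so a common arrangement is to compile it once and reuse it for many texts. The following is a minimal sketch of that idea; the class CompileOnceDemo, the constant HTTP and the helper mentionsHttp are hypothetical names, not part of the original article.

```java
import java.util.regex.Pattern;

class CompileOnceDemo {
    // Compiled once and reused for every call; the flag makes matching ignore case.
    static final Pattern HTTP = Pattern.compile(".*http://.*", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: does the whole text match the precompiled pattern?
    static boolean mentionsHttp(String text) {
        return HTTP.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(mentionsHttp("visit HTTP://example.com")); // true, case is ignored
        System.out.println(mentionsHttp("no link here"));             // false
    }
}
```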

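Pattern.matcher() (note the spelling: matcher() ends in r, matches() ends in s)

Once you have a Pattern object, you can obtain a Matcher from it. Matcher has many methods; one of them, matches(), checks whether the text matches the pattern. For example: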

String text    =
        "This is the text to be searched " +
        "for occurrences of the http:// pattern.";
String patternString = ".*http://.*";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(text);
boolean matches = matcher.matches();
System.out.println("matches = " + matches);

Pattern.split()
Pattern's split() method uses a regular expression as the delimiter and splits the text into an array of String.

import java.util.regex.Pattern;
public class RegexText {
    public static void main(String[] args) {
        String text = "A sep Text sep With sep Many sep Separators";
        String patternString = "sep";
        Pattern pattern = Pattern.compile(patternString);
        String[] split = pattern.split(text);
        System.out.println("split.length = " + split.length);
        for(String element : split){
            System.out.println("element = " + element);
        }
        /*
        split.length = 5
        element = A 
        element =  Text 
        element =  With 
        element =  Many 
        element =  Separators
         */
    }
}
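
In the output above every element keeps the spaces that surrounded the separator. One possible way to avoid that, sketched below, is to make the whitespace part of the delimiter; the second overload, split(input, limit), caps the number of elements. The class SplitDemo and the helper parts are hypothetical names.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Delimiter: the word "sep" together with any whitespace around it.
    static final Pattern SEP = Pattern.compile("\\s*sep\\s*");

    static String[] parts(String text) {
        return SEP.split(text);
    }

    // With a positive limit the array has at most `limit` elements;
    // the last element holds the unsplit remainder.
    static String[] parts(String text, int limit) {
        return SEP.split(text, limit);
    }

    public static void main(String[] args) {
        String text = "A sep Text sep With sep Many sep Separators";
        System.out.println(Arrays.toString(parts(text)));
        // [A, Text, With, Many, Separators]
        System.out.println(Arrays.toString(parts(text, 3)));
        // [A, Text, With sep Many sep Separators]
    }
}
```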

Pattern.pattern()
Pattern's pattern() method returns the regular expression from which the Pattern object was compiled. For example:

String patternString = "sep";
Pattern pattern = Pattern.compile(patternString);
String pattern2 = pattern.pattern();
System.out.println(pattern2); // sep
// pattern2 holds "sep", the same value as patternString.

Matcher (java.util.regex.Matcher)

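A Matcher is used to find multiple occurrences of a regular expression in one text, or to apply the same regular expression to several texts.
Creating a Matcher
Create a Matcher by calling Pattern's matcher() method.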

matcher.matches()
Returns a boolean: true only if the entire text matches the pattern.

matcher.lookingAt()
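Checks whether the beginning of the text matches the pattern; in other words, whether the text starts with the regular expression. Unlike matches(), it does not require the rest of the text to match.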

String text    =
                "This is the text to be searched " +
                        "for occurrences of the http:// pattern.";
String patternString = ".his is the";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(text);
System.out.println("lookingAt = " + matcher.lookingAt());
System.out.println("matches   = " + matcher.matches());
//        lookingAt = true
//        matches   = false  (matches() allows no extra characters; here text remains after the pattern)
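
The contrast between the two methods can be put side by side in a small sketch. The class MatchModesDemo and its two helpers are hypothetical names chosen for this illustration.

```java
import java.util.regex.Pattern;

class MatchModesDemo {
    // True if the text begins with something the regex matches.
    static boolean startsWith(String regex, String text) {
        return Pattern.compile(regex).matcher(text).lookingAt();
    }

    // True only if the regex matches the text from first to last character.
    static boolean isWhole(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "This is the text";
        System.out.println(startsWith("This is", text));  // true
        System.out.println(isWhole("This is", text));     // false, " the text" is left over
        System.out.println(startsWith("the text", text)); // false, not at the beginning
    }
}
```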

matcher.find() + matcher.start() + matcher.end()
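find() searches the text for occurrences of the regular expression.
The first call to find() locates the first match; each later call locates the next one, and find() returns false once no matches remain.
start() and end() return where the current match begins and ends within the whole text. end() actually returns the index just past the last matched character, so the two values can be passed directly to String.substring().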

String text    =
        "This is the text which is to be searched " +
                "for occurrences of the word 'is'.";
String patternString = "is";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
int count = 0;
while(matcher.find()) {
    count++;
    System.out.println("found: " + count + " : "  + matcher.start() + " - " + matcher.end());
}
//        found: 1 : 2 - 4
//        found: 2 : 5 - 7
//        found: 3 : 23 - 25
//        found: 4 : 70 - 72
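
To make the point about String.substring() concrete, here is a minimal sketch that feeds start() and end() straight into substring(). The class FindSpansDemo and the helper spans are hypothetical names, and the short input text is chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindSpansDemo {
    // Collects "start-end:matchedText" for every match of regex in text.
    static List<String> spans(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            // end() is exclusive, so substring(start, end) is exactly the matched text.
            out.add(m.start() + "-" + m.end() + ":" + text.substring(m.start(), m.end()));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(spans("is", "This is it")); // [2-4:is, 5-7:is]
    }
}
```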

matcher.reset()
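reset() resets the Matcher's internal matching state. As find() advances, the Matcher keeps track of how far into the text it has searched; calling reset() makes the next search start again from the beginning of the text.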
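
A small sketch of the effect: after one find() call only the later matches remain, while after reset() all of them are found again. The class ResetDemo and both helpers are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetDemo {
    // Counts the matches that remain from the matcher's current position.
    static int countRemaining(Matcher m) {
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Returns {matches left after one find(), matches found after reset()}.
    static int[] countsAroundReset(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        m.find();                        // consume the first match
        int remaining = countRemaining(m);
        m.reset();                       // start over from the beginning of the text
        int all = countRemaining(m);
        return new int[] {remaining, all};
    }

    public static void main(String[] args) {
        int[] r = countsAroundReset("is", "This is the text which is to be searched");
        System.out.println(r[0] + " then " + r[1]); // 2 then 3
    }
}
```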
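
matcher.group() with multiple groups
(John)
This regular expression matches John. The parentheses are not part of the text to match; they define a group. Once the expression has matched, the text captured by the group can be retrieved.
(John) (.+?) 
This expression matches the text "John", followed by a space, then one or more characters, then a final space. The trailing space is easy to miss; as a Java string the pattern is "(John) (.+?) ".
The dot . matches any character (by default, any character except a line terminator). The + means one or more occurrences, so .+ matches any character one or more times. The ? after + makes the quantifier reluctant, so it matches as little text as possible.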

String text    =
        "John writes about this, and John Doe writes about that," +
                " and John Wayne writes about everything."
        ;
String patternString1 = "(John) (.+?) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
while(matcher.find()) {
    System.out.println("found: " + matcher.group(1) +
            " "       + matcher.group(2));
}
//        found: John writes
//        found: John Doe
//        found: John Wayne
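
To see what the ? changes, the sketch below runs the same text through the reluctant form .+? and the greedy form .+ and returns group 2 of the first match. The class GreedyVsReluctantDemo and the helper secondGroup are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctantDemo {
    // Returns group 2 of the first match, or null if the regex does not match.
    static String secondGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String text = "John Doe writes about that";
        // Reluctant: stops at the first space it can.
        System.out.println(secondGroup("(John) (.+?) ", text)); // Doe
        // Greedy: runs on to the last space it can.
        System.out.println(secondGroup("(John) (.+) ", text));  // Doe writes about
    }
}
```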

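matcher.group() with nested groups
((John) (.+?))
When groups are nested, they are numbered by the order of their opening parentheses. In the expression above, group 1 is the outer group, group 2 is the group containing John, and group 3 is the group containing .+?.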

String text    =
        "John writes about this, and John Doe writes about that," +
                " and John Wayne writes about everything."
        ;
String patternString1 = "((John) (.+?)) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
while(matcher.find()) {
    System.out.println("found:   "+matcher.group(1));
}
//        found:   John writes
//        found:   John Doe
//        found:   John Wayne
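
The numbering can be checked by listing every group of a single match. In the sketch below, group 0 is the whole match (including the trailing space of the pattern) and groupCount() reports the number of capturing groups; the class NestedGroupsDemo and the helper groups are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NestedGroupsDemo {
    // Returns group 0 through group groupCount() of the first match.
    static List<String> groups(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        if (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> g = groups("((John) (.+?)) ", "John Doe writes");
        for (int i = 0; i < g.size(); i++) {
            // Brackets make the trailing space of group 0 visible.
            System.out.println("group " + i + " = [" + g.get(i) + "]");
        }
        // group 0 = [John Doe ]
        // group 1 = [John Doe]
        // group 2 = [John]
        // group 3 = [Doe]
    }
}
```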

matcher.replaceAll() + matcher.replaceFirst()
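replaceAll() and replaceFirst() replace parts of the text that the Matcher searches. replaceAll() replaces every match of the regular expression; replaceFirst() replaces only the first one. Both return a new string and leave the original text unchanged.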

String text    =
        "John writes about this, and John Doe writes about that," +
                " and John Wayne writes about everything."
        ;
String patternString1 = "((John) (.+?)) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);

String replaceAll = matcher.replaceAll("Joe Blocks ");
System.out.println("replaceAll   = " + replaceAll);
//        replaceAll   = Joe Blocks about this, and Joe Blocks writes about that, and Joe Blocks writes about everything.
String replaceFirst = matcher.replaceFirst("Joe Blocks ");
System.out.println("replaceFirst = " + replaceFirst);
//        replaceFirst = Joe Blocks about this, and John Doe writes about that, and John Wayne writes about everything.
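
The replacement string may also refer to captured groups as $1, $2 and so on. The following sketch reuses the nested-group pattern to swap the two words of each match; the class ReplaceWithGroupsDemo, the helper swap and the sample text are assumptions made for this illustration. If a replacement must contain a literal $ or \, Matcher.quoteReplacement() can be used to escape it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceWithGroupsDemo {
    static final Pattern JOHN = Pattern.compile("((John) (.+?)) ");

    // Rewrites "John X " as "X (John) " using group references in the replacement.
    static String swap(String text) {
        Matcher matcher = JOHN.matcher(text);
        // $3 is the word after John, $2 is "John" itself.
        return matcher.replaceAll("$3 ($2) ");
    }

    public static void main(String[] args) {
        System.out.println(swap("John Doe writes, John Wayne rides"));
        // Doe (John) writes, Wayne (John) rides
    }
}
```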

matcher.appendReplacement() + matcher.appendTail()
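appendReplacement() and appendTail() replace matched phrases in the input text while appending the result to a StringBuffer.

After find() locates a match, call appendReplacement(). It copies the input text from the end of the previous match up to the start of the current match into the StringBuffer, then appends the replacement in place of the matched text.

appendReplacement() keeps track of how much of the input has already been copied to the StringBuffer, so you can keep calling find() until no matches remain.

After the last match, the rest of the input, from the end of that match to the end of the text, has not yet been copied. Calling appendTail() copies this remaining part into the StringBuffer.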

String text    =
        "John writes about this, and John Doe writes about that," +
                " and John Wayne writes about everything."
        ;

String patternString1 = "((John) (.+?)) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher  matcher = pattern.matcher(text);
StringBuffer stringBuffer = new StringBuffer();

while(matcher.find()){
    matcher.appendReplacement(stringBuffer, "Joe Blocks ");
    System.out.println(stringBuffer.toString());
}
matcher.appendTail(stringBuffer);
System.out.println(stringBuffer.toString());
//Joe Blocks
//Joe Blocks about this, and Joe Blocks
//Joe Blocks about this, and Joe Blocks writes about that, and Joe Blocks
//Joe Blocks about this, and Joe Blocks writes about that, and Joe Blocks writes about everything.
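
The find() / appendReplacement() loop is most useful when the replacement has to be computed per match, which replaceAll() with a fixed string cannot do. Below is a minimal sketch under that assumption: it upper-cases the word that follows John. The class AppendReplacementDemo, the helper upperCaseNextWord and the sample text are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementDemo {
    // Upper-cases the word after "John" and copies everything else unchanged.
    static String upperCaseNextWord(String text) {
        Matcher matcher = Pattern.compile("(John) (.+?) ").matcher(text);
        StringBuffer stringBuffer = new StringBuffer();
        while (matcher.find()) {
            String replacement =
                    matcher.group(1) + " " + matcher.group(2).toUpperCase(Locale.ROOT) + " ";
            // quoteReplacement() keeps any '$' or '\' in the computed text literal.
            matcher.appendReplacement(stringBuffer, Matcher.quoteReplacement(replacement));
        }
        // Copy the text that follows the last match.
        matcher.appendTail(stringBuffer);
        return stringBuffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseNextWord("John Doe writes, and John Wayne rides."));
        // John DOE writes, and John WAYNE rides.
    }
}
```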