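Regular expressions are the Swiss Army knife of text processing, and they come up constantly in elisp programming. The example below walks through the most common ways to use them for searching and replacing text.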
;; A collection of ways to search and replace with regexps, in buffers and in strings
(with-temp-buffer
(insert "abc123de4567xxx")
(setq regex "[a-z]+\\([0-9]+\\)[a-z]+\\([0-9]+\\)")
(goto-char (point-min))
(setq cur-line (buffer-substring-no-properties (line-beginning-position) (line-end-position)))
;; 1. Regexp search in a buffer, reading the capture groups
(goto-char (point-min))
(while (re-search-forward regex nil t)
(message "Full match: %s" (match-string 0))
(message "Capture group 1: %s" (match-string 1))
(message "Capture group 2: %s" (match-string 2)))
;; Output:
;; Full match: abc123de4567
;; Capture group 1: 123
;; Capture group 2: 4567
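;; 2. Regexp search in a string, reading the capture groups
;; After `string-match', pass the searched string to `match-string' as its second argument.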
(string-match regex cur-line)
(message "Full match: %s" (match-string 0 cur-line))
(message "Capture group 1: %s" (match-string 1 cur-line))
(message "Capture group 2: %s" (match-string 2 cur-line))
;; Output:
;; Full match: abc123de4567
;; Capture group 1: 123
;; Capture group 2: 4567
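;; 3. Regexp search and replace in a string (with capture groups)
(message "String before replacement: %s" cur-line)
;; Wrap capture group 1 in "%" and capture group 2 in "#"
(message "String after replacement: %s" (replace-regexp-in-string regex "%\\1%#\\2#" cur-line))
;; Output:
;; String before replacement: abc123de4567xxx
;; String after replacement: %123%#4567#xxx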
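;; 4. Regexp search and replace in a string, using a lambda to process the capture groups
(message "String before replacement: %s" cur-line)
;; The lambda receives the matched text as S; the match data refers to S, so pass S to `match-string'.
;; As in item 3, wrap capture group 1 in "%" and capture group 2 in "#".
(message "String after replacement: %s" (replace-regexp-in-string regex (lambda (s) (concat "%" (match-string 1 s) "%#" (match-string 2 s) "#")) cur-line))
;; Output:
;; String before replacement: abc123de4567xxx
;; String after replacement: %123%#4567#xxx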
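;; 5. Regexp search and replace in a buffer
(message "Buffer before replacement: %s" (buffer-substring-no-properties (point-min) (point-max)))
(goto-char (point-min))
;; Replace each run of one or more digits with "#".
;; `replace-regexp' is intended for interactive use; in Lisp code, loop with `re-search-forward' and `replace-match'.
(while (re-search-forward "[0-9]+" nil t) (replace-match "#" t t))
(message "Buffer after replacement: %s" (buffer-substring-no-properties (point-min) (point-max)))
;; Output:
;; Buffer before replacement: abc123de4567xxx
;; Buffer after replacement: abc#de#xxx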
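;; 6. Regexp search and replace in a buffer, with capture groups
(goto-char (point-min))
(message "Buffer before replacement: %s" (buffer-substring-no-properties (point-min) (point-max)))
;; Wrap the leading run of letters in square brackets "[]".
;; Passing t as FIXEDCASE keeps `replace-match' from adjusting the case of the replacement; "\\1" is still expanded.
(while (re-search-forward "^\\([a-z]+\\)" nil t) (replace-match "[\\1]" t))
(message "Buffer after replacement: %s" (buffer-substring-no-properties (point-min) (point-max)))
;; Output:
;; Buffer before replacement: abc#de#xxx
;; Buffer after replacement: [abc]#de#xxx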
)
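The string search in item 2 only reports the first match, because `string-match` stops at the first hit. Below is a minimal sketch of how one might walk over every match in a string by passing the optional START argument of `string-match`. The helper name `my/collect-matches` is hypothetical and not part of Emacs; only `string-match`, `match-string`, `match-beginning`, `match-end`, and `save-match-data` are built in.

```elisp
;; Hypothetical helper (not part of Emacs): collect every match of REGEX in STR.
(defun my/collect-matches (regex str)
  "Return a list of all non-overlapping matches of REGEX in STR."
  (save-match-data                      ; leave the caller's match data untouched
    (let ((start 0)
          (matches '()))
      ;; `string-match' returns the index of the next match at or after START, or nil.
      (while (and (< start (length str))
                  (string-match regex str start))
        (push (match-string 0 str) matches)
        ;; Resume after this match; step one character past an empty match
        ;; so the loop always makes progress.
        (setq start (if (= (match-end 0) (match-beginning 0))
                        (1+ (match-end 0))
                      (match-end 0))))
      (nreverse matches))))

(my/collect-matches "[0-9]+" "abc123de4567xxx")
;; => ("123" "4567")
```

Wrapping the loop in `save-match-data` is a design choice here: it lets the helper be called in the middle of other search code without clobbering the match data that code relies on.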