Reference - What does this regex mean?
Posted: 2021-05-20 00:34:24
What is this?
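This is a collection of common Q&A. It is also a Community Wiki, so everyone is invited to help maintain it.
Why is this?
regex is suffering from "give me ze code" type questions and poor answers with no explanation. This reference is meant to provide links to quality Q&A.
What's the scope?
This reference is meant for the following languages: php, perl, javascript, python, ruby, java, .net.
This might be too broad, but these languages share the same syntax. For specific features, the tag of the language follows the link, for example:
What are regular expression balancing groups? .net
Comments:
I created a meta discussion, everyone is invited >>>
Answer 1: The Stack Overflow Regular Expressions FAQ
See also a lot of general hints and useful links at the regex tag details page.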
Online tutorials
RegexOne ↪
Regular Expressions Info ↪
Quantifiers
Zero-or-more: *: greedy, *?: reluctant, *+: possessive
One-or-more: +: greedy, +?: reluctant, ++: possessive
?: optional (zero-or-one)
Min/max ranges (all inclusive): {n,m}: between n & m, {n,}: n-or-more, {n}: exactly n
Differences between greedy, reluctant (a.k.a. "lazy", "ungreedy") and possessive quantifiers:
Greedy vs. Reluctant vs. Possessive Quantifiers
In-depth discussion on the differences between greedy versus non-greedy
What's the difference between {n} and {n}?
Can someone explain Possessive Quantifiers to me? php, perl, java, ruby
Emulating possessive quantifiers .net
Non-Stack Overflow references: from Oracle, regular-expressions.info
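The difference between greedy and reluctant quantifiers is easiest to see on a small input. The following is a minimal sketch using Python's standard `re` module; the sample strings are made up for illustration. Possessive quantifiers are left out because, as far as I know, `re` only accepts them from Python 3.11 onward.

```python
import re

text = "<b>one</b> and <b>two</b>"

# Greedy: .* takes as much as it can, then backtracks to the LAST "</b>".
greedy = re.search(r"<b>.*</b>", text).group()

# Reluctant (lazy): .*? takes as little as it can, stopping at the FIRST "</b>".
reluctant = re.search(r"<b>.*?</b>", text).group()

# Range quantifiers follow the same rule: {2,4} is greedy, {2,4}? is reluctant.
digits_greedy = re.match(r"\d{2,4}", "123456").group()
digits_reluctant = re.match(r"\d{2,4}?", "123456").group()

print(greedy)
print(reluctant)
print(digits_greedy, digits_reluctant)
```

The greedy search returns the whole sample string, while the reluctant one returns only the first tag pair.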
Character Classes
What is the difference between square brackets and parentheses?
[...]: any one character, [^...]: negated/any character but
[^] matches any one character including newlines javascript
[\w-[\d]] / [a-z-[qz]]: set subtraction .net, xml-schema, xpath, JGSoft
[\w&&[^\d]]: set intersection java, ruby 1.9+
[[:alpha:]]: POSIX character classes
[[:<:]] and [[:>:]]: word boundaries
Why do [^\\D2], [^[^0-9]2], [^2[^0-9]] get different results in Java? java
Shorthand:
Digit: \d: digit, \D: non-digit
Word character (letter, digit, underscore): \w: word character, \W: non-word character
Whitespace: \s: whitespace, \S: non-whitespace
Unicode categories (\pL, \PL, etc.)
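As a rough illustration of the classes and shorthands listed above, here is a small Python sketch. Python's `re` has no set-subtraction or set-intersection syntax, so the last line uses the double-negation idiom `[^\W\d]` as a stand-in for "word characters minus digits"; that substitution is my own choice, not something taken from the linked answers.

```python
import re

sample = "Order #42: 3 apples, 15 pears_x"

# Shorthand classes
digits = re.findall(r"\d+", sample)
words = re.findall(r"\w+", sample)

# Bracket class and its negation
vowels = re.findall(r"[aeiou]", "regex")
non_vowels = re.findall(r"[^aeiou]", "regex")

# "Not a non-word character and not a digit" == word characters except digits
letters_and_underscore = re.findall(r"[^\W\d]+", "ab12_cd")

print(digits, words, vowels, non_vowels, letters_and_underscore)
```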
Escape Sequences
Horizontal whitespace: \h: space-or-tab, \t: tab
Newlines:
\r, \n: carriage return and line feed
\R: generic newline php java-8
Negated whitespace sequences: \H: Non horizontal whitespace character, \V: Non vertical whitespace character, \N: Non line feed character pcre php5 java-8
Other: \v: vertical tab, \e: the escape character
Anchors
anchor | matches | flavors |
---|---|---|
^ | Start of string | Common* |
^ | Start of line | Common (m) |
$ | End of line | Common (m) |
$ | End of text | Common* |
$ | The very end of string | php (D), javascript |
\A | Start of string | Common except js |
\Z | End of text | Common except js python |
\Z | The very end of string | python |
\z | The very end of string | Common except js python |
\b | Word boundary | Common |
\B | Not a word boundary | Common |
\G | End of previous match | Common except js, python re |
Term | Definition |
---|---|
Start of string | At the very start of the string. |
Start of line | At the very start of the string, and after a non-terminal line terminator. |
End of string | At the very end of the string. |
End of text | At the very end of the string, and at a terminal line terminator. |
End of line | At the very end of the string, and at a line terminator. |
Word boundary | At a word character not preceded by a word character, and at a non-word character not preceded by a non-word character. |
End of previous match | At a previously set position, usually where a previous match ended. At the very start of the string if no position was set. |
"Common" refers to the following: icu, java, js, .net, objective-c, pcre, perl, php, python, swift
* Default | (m) Multi-line mode. | (D) Dollar-end-only mode.
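The anchor table can be checked against one flavor with a short experiment. This sketch uses Python's `re`, where `\Z` means "the very end of string" (per the table above); the sample text is invented for the example.

```python
import re

text = "first line\nsecond line\n"

# Default mode: ^ matches only at the very start of the string.
starts_default = re.findall(r"^\w+", text)
# Multi-line mode: ^ also matches right after each line terminator.
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Default mode: $ is "end of text" (very end, or just before a final newline).
ends_default = re.findall(r"\w+$", text)
# Multi-line mode: $ is "end of line" (before every line terminator).
ends_multiline = re.findall(r"\w+$", text, re.MULTILINE)

# \Z in Python is the very end of the string, so a trailing newline blocks it.
very_end = re.search(r"line\Z", text)
very_end_stripped = re.search(r"line\Z", text.rstrip("\n"))

# \b: word boundary
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")

print(starts_default, starts_multiline, ends_default, ends_multiline)
print(very_end, very_end_stripped is not None, whole_words)
```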
Groups
(...): capture group, (?:): non-capture group
Why is my repeating capturing group only capturing the last match?
\1: backreference and capture-group reference, $1: capture group reference
What's the meaning of a number after a backslash in a regular expression?
\g<1>123: How to follow a numbered capture group, such as \1, with a number? python
What does a subpattern (?i:regex) mean?
What does the 'P' in (?P<group_name>regexp) mean?
(?>): atomic group or independent group, (?|): branch reset
Equivalent of branch reset in .NET/C# .net
Named capture groups:
General named capturing group reference at regular-expressions.info
java: (?<groupname>regex): Overview and naming rules (non-Stack Overflow links)
Other languages: (?P<groupname>regex) python, (?<groupname>regex) .net, (?<groupname>regex) perl, (?P<groupname>regex) and (?<groupname>regex) php
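Several of the group constructs above can be tried out in a few lines. The following Python sketch covers numbered and named capture groups, a backreference, the `\g<1>` replacement syntax, and the "repeated group keeps only the last iteration" behavior; all inputs are illustrative.

```python
import re

# Backreference \1: find a word that is immediately repeated.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")

# Named capture group, Python style: (?P<name>...)
date = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2021-05")

# \g<1> lets a digit follow a group reference in a replacement string;
# r"\1123" would instead be read as a reference to group 112 followed by "3".
suffixed = re.sub(r"(\d+)", r"\g<1>123", "id 7")

# Non-capturing group: grouping without creating a numbered group.
repeats = re.findall(r"(?:ab)+", "ababab ab")

# A repeated capturing group only keeps its last iteration.
last_only = re.match(r"(\w)+", "abc").group(1)

print(doubled.group(1), date.group("year"), date["month"])
print(suffixed, repeats, last_only)
```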
Lookarounds
Lookaheads: (?=...): positive, (?!...): negative
Lookbehinds: (?<=...): positive, (?<!...): negative
Lookbehind limits:
Lookbehinds need to be constant-length php, perl, python, ruby
Lookarounds of limited length {0,n} java
Variable length lookbehinds are allowed .net
Lookbehind alternatives:
Using \K php, perl (Flavors that support \K)
Alternative regex module for Python python
The hacky way
JavaScript negative lookbehind equivalents External link
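Here is a short Python sketch of the four lookaround forms, plus the constant-length lookbehind limit mentioned above as it shows up in the standard `re` module. The sample strings are invented.

```python
import re

code = "print(x) and len(y)"
prices = "cost: $40, tax: $3, total 43"

# Positive lookahead: names directly followed by "(" (the "(" is not consumed).
called = re.findall(r"\w+(?=\()", code)
# Negative lookahead: whole words NOT followed by "(".
not_called = re.findall(r"\b\w+\b(?!\()", code)

# Positive lookbehind: digits preceded by "$".
after_dollar = re.findall(r"(?<=\$)\d+", prices)
# Negative lookbehind: digit runs preceded by neither "$" nor another digit.
not_after_dollar = re.findall(r"(?<![$\d])\d+", prices)

# The standard re module rejects variable-width lookbehinds at compile time.
try:
    re.compile(r"(?<=a+)b")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(called, not_called, after_dollar, not_after_dollar, variable_width_ok)
```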
Modifiers
flag | modifier | flavors |
---|---|---|
a | ASCII | python |
c | current position | perl |
e | expression | php perl |
g | global | most |
i | case-insensitive | most |
m | multiline | php perl python javascript .net java |
m | (non)multiline | ruby |
o | once | perl ruby |
S | study | php |
s | single line | ruby |
U | ungreedy | php r |
u | unicode | most |
x | whitespace-extended | most |
y | sticky ↪ | javascript |
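In Python the single-letter flags from the table are spelled as constants on the `re` module, or inline as `(?i)` and friends at the start of a pattern. A minimal sketch of the most common ones, with a made-up sample:

```python
import re

text = "Hello\nWORLD"

# i: case-insensitive
case_insensitive = re.findall(r"world", text, re.IGNORECASE)

# m: multiline, so ^ matches at the start of every line
line_starts = re.findall(r"^\w", text, re.MULTILINE)

# s: "single line" (DOTALL), so . also matches a newline
dot_default = re.search(r"Hello.WORLD", text)
dot_all = re.search(r"Hello.WORLD", text, re.DOTALL)

# x: whitespace-extended (VERBOSE), allowing layout and comments in the pattern
year_month = re.compile(
    r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    """,
    re.VERBOSE,
)

# Inline form of the i flag
inline = re.search(r"(?i)hello", text)

print(case_insensitive, line_starts, dot_default, dot_all is not None)
print(year_month.match("2021-05").groups(), inline.group())
```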
Others:
|: alternation (OR) operator, .: any character, [.]: literal dot character
What special characters must be escaped?
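To make the difference between `.`, `[.]` and alternation concrete, and to show one way of sidestepping the escaping question entirely, here is a small Python sketch. `re.escape` is a real standard-library helper; the sample strings are illustrative.

```python
import re

sample = "abc a.c a-c"

# . matches any character (except a newline by default)
any_char = re.findall(r"a.c", sample)
# [.] (or \.) matches only a literal dot
literal_dot = re.findall(r"a[.]c", sample)

# | tries the alternatives from left to right
pets = re.findall(r"cat|dog", "cat, dog, bird")

# re.escape builds a pattern that matches the given text literally,
# so metacharacters such as + and ? lose their special meaning.
needle = "1+1=2?"
found = re.search(re.escape(needle), "is 1+1=2? yes")

print(any_char, literal_dot, pets, found.group())
```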
Control verbs (php and perl): (*PRUNE), (*SKIP), (*FAIL) and (*F)
php only: (*BSR_ANYCRLF)
Recursion (php and perl): (?R), (?0) and (?1), (?-1), (?&groupname)
Common Tasks
Get a string between two curly braces: {...}
Match (or replace) a pattern except in situations s1, s2, s3...
How do I find all YouTube video ids in a string using a regex?
Validation:
Internet: email addresses, URLs (host/port: regex and non-regex alternatives), passwords
Numeric: a number, min-max ranges (such as 1-31), phone numbers, date
Parsing HTML with regex: see "General Information > When not to use Regex"
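Two of the tasks above are small enough to sketch directly. The patterns below are one plausible approach written for this example, not necessarily the ones given in the linked answers: the first extracts non-nested text between curly braces, the second validates a day number in the range 1-31 without leading zeros. The helper name `is_day` is hypothetical.

```python
import re

# Text between curly braces (non-nested): capture everything that is not a brace.
between_braces = re.findall(r"\{([^{}]*)\}", "a {first} b {second}")

# Numeric range 1-31: one digit 1-9, or 10-29, or 30-31.
DAY = re.compile(r"[1-9]|[12][0-9]|3[01]")

def is_day(value: str) -> bool:
    """Return True if value is a day number from 1 to 31 (hypothetical helper)."""
    return DAY.fullmatch(value) is not None

print(between_braces)
print([d for d in ["0", "1", "07", "10", "31", "32"] if is_day(d)])
```

Range checks like this grow quickly with the size of the range, so converting to an integer and comparing is often the simpler choice.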
Advanced Regex-Fu
Strings and numbers:
Regular expression to match a line that doesn't contain a word
How does this PCRE pattern detect palindromes?
Match strings whose length is a fourth power
How does this regex find triangular numbers?
How to determine if a number is a prime with regex?
How to match the middle character in a string with regex?
Other:
How can we match a^n b^n?
Match nested brackets
Using a recursive pattern php, perl
Using balancing groups .net
"Vertical" regex matching in an ASCII "image"
List of highly up-voted regex questions on Code Golf
How to make two quantifiers repeat the same number of times?
An impossible-to-match regular expression: (?!a)a
Match/delete/replace this except in contexts A, B and C
Match nested brackets with regex without using recursion or balancing groups?
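Two items from this list can be demonstrated briefly. The sketch below checks that `(?!a)a` never matches, and shows the widely circulated unary-string primality trick; I am assuming that is the kind of pattern the prime-number question refers to, and the helper name `is_prime` is hypothetical.

```python
import re

# (?!a)a asks for a position that is NOT followed by "a" and then requires "a",
# which is a contradiction, so it can never match.
NEVER = re.compile(r"(?!a)a")

# Unary primality trick: write n as "1" * n. The pattern matches NON-primes:
#   ^1?$         -> n is 0 or 1
#   ^(11+?)\1+$  -> n splits into two or more equal blocks of length >= 2
NOT_PRIME = re.compile(r"^1?$|^(11+?)\1+$")

def is_prime(n: int) -> bool:
    """Hypothetical helper: prime test via backreferences (slow, for fun only)."""
    return NOT_PRIME.match("1" * n) is None

print(NEVER.search("banana"))
print([n for n in range(20) if is_prime(n)])
```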
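Flavor-Specific Information
(Except for those marked with *, this section contains non-Stack Overflow links.)
The differences between functions in java.util.regex.Matcher:
matches(): The match must be anchored to both input-start and -end
find(): The match may be anywhere in the input string (substrings)
lookingAt(): The match must be anchored to input-start only
(For anchors in general, see the section "Anchors")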
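Python's `re` module has a three-way split that roughly parallels the one just described, which may help illustrate the distinction: `fullmatch` behaves like matches(), `search` like find(), and `match` like lookingAt(). This is an analogy sketched in Python, not Java code.

```python
import re

digits = re.compile(r"\d+")

# Anchored at both ends (like Matcher.matches())
full_ok = digits.fullmatch("123")
full_fail = digits.fullmatch("123abc")

# Anywhere in the input (like Matcher.find())
anywhere = digits.search("abc123")

# Anchored at the start only (like Matcher.lookingAt())
prefix_ok = digits.match("123abc")
prefix_fail = digits.match("abc123")

print(full_ok is not None, full_fail, anywhere.group(), prefix_ok.group(), prefix_fail)
```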
The only java.lang.String functions that accept regular expressions: matches(s), replaceAll(s,s), replaceFirst(s,s), split(s), split(s,i)
*An (opinionated and) detailed discussion of the disadvantages of and missing features in java.util.regex
.NET
How to read a .NET regex with look-ahead, look-behind, capturing groups and back-references mixed together?
Official documentation:
Boost regex engine: General syntax, Perl syntax (used by TextPad, Sublime Text, UltraEdit, ...???)
JavaScript general info and RegExp object
.NET
MySQL
Oracle
Perl5 version 18.2
PHP: pattern syntax, preg_match
Python: Regular expression operations, search vs match, how-to
Rust: crate regex, struct regex::Regex
Splunk: regex terminology and syntax and regex command
Tcl: regex syntax, manpage, regexp command
Visual Studio Find and Replace
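General information
(Links marked with * are non-Stack Overflow links.)
Examples of regex that can cause the regex engine to fail
Why does this regular expression kill the Java regex engine?
Tools: Testers and Explainers
(This section contains non-Stack Overflow links.)
Online (* includes replacement tester, + includes split tester):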
Debuggex (also has a repository of useful regexes) javascript, python, pcre
*Regular Expressions 101 php, pcre, python, javascript
Regex Pal, regular-expressions.info javascript
Rubular ruby
RegExr
Regex Hero dotnet
*+ regexstorm.net .net
*RegexPlanet: Java, Go, Haskell, JavaScript, .NET, Perl, php PCRE php, Python python, Ruby ruby, XRegExp xregexp
freeformatter.com xregexp
*+ regex.larsolavtorvik.com php PCRE and POSIX, javascript
Refiddle javascript ruby .net
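Offline:
Microsoft Windows: RegexBuddy (analysis), RegexMagic (creation), Expresso (analysis, creation, free)
Comments:
Related: the question for which an answer begins with "You can't parse [X]HTML with regex."
Perl has more (e.g. (?( for conditionals), but you can read the official perl documentation.
The Refiddle link under the "Tools" section now points to some online casino site. It should probably be removed.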