Rails 6: RFC5322-compliant email validation

Posted: 2021-01-13 15:13:25

Question:

Here is a PCRE regular expression that validates email addresses: https://regex101.com/r/gJ7pU0/1

Does Ruby have an RFC5322-compliant regular expression? Ruby has URI::MailTo::EMAIL_REGEXP, but I don't think it is RFC5322 compliant.
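To make that doubt concrete, here is a minimal sketch that runs URI::MailTo::EMAIL_REGEXP against a few addresses. The sample addresses are illustrative choices, not taken from the question. The second and third use a quoted-string local part and a domain literal, both of which the RFC5322 grammar quoted below allows.

```ruby
require 'uri'

# URI::MailTo::EMAIL_REGEXP ships with Ruby's standard library.
samples = [
  'user@example.com',        # plain dot-atom address
  '"john doe"@example.com',  # quoted-string local part
  'user@[192.168.0.1]'       # domain literal
]

# Map each sample address to whether the built-in regexp accepts it.
results = samples.map { |address| [address, URI::MailTo::EMAIL_REGEXP.match?(address)] }.to_h

results.each { |address, accepted| puts "#{address.ljust(26)} #{accepted}" }
```

Only the first address is accepted, which suggests the built-in constant covers a practical subset of the grammar rather than the whole of it.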

Another post mentioned the "mail" gem, but I don't see a way to validate an email address with it.

https://github.com/mikel/mail/tree/6b0ebb142c476bf7c00524effe513a4f151f59ab

PCRE, RFC5322 compliant:

(?(DEFINE)
    (?<addr_spec> (?&local_part) @ (?&domain) )
    (?<local_part> (?&dot_atom) | (?&quoted_string) | (?&obs_local_part) )
    (?<domain> (?&dot_atom) | (?&domain_literal) | (?&obs_domain) )
    (?<domain_literal> (?&CFWS)? \[ (?: (?&FWS)? (?&dtext) )* (?&FWS)? \] (?&CFWS)? )
    (?<dtext> [\x21-\x5a] | [\x5e-\x7e] | (?&obs_dtext) )
    (?<quoted_pair> \\ (?: (?&VCHAR) | (?&WSP) ) | (?&obs_qp) )
    (?<dot_atom> (?&CFWS)? (?&dot_atom_text) (?&CFWS)? )
    (?<dot_atom_text> (?&atext) (?: \. (?&atext) )* )
    (?<atext> [a-zA-Z0-9!#$%&'*+\/=?^_`|~-]+ )
    (?<atom> (?&CFWS)? (?&atext) (?&CFWS)? )
    (?<word> (?&atom) | (?&quoted_string) )
    (?<quoted_string> (?&CFWS)? " (?: (?&FWS)? (?&qcontent) )* (?&FWS)? " (?&CFWS)? )
    (?<qcontent> (?&qtext) | (?&quoted_pair) )
    (?<qtext> \x21 | [\x23-\x5b] | [\x5d-\x7e] | (?&obs_qtext) )
    # comments and whitespace
    (?<FWS> (?: (?&WSP)* \r\n )? (?&WSP)+ | (?&obs_FWS) )
    (?<CFWS> (?: (?&FWS)? (?&comment) )+ (?&FWS)? | (?&FWS) )
    (?<comment> \( (?: (?&FWS)? (?&ccontent) )* (?&FWS)? \) )
    (?<ccontent> (?&ctext) | (?&quoted_pair) | (?&comment) )
    (?<ctext> [\x21-\x27] | [\x2a-\x5b] | [\x5d-\x7e] | (?&obs_ctext) )
    # obsolete tokens
    (?<obs_domain> (?&atom) (?: \. (?&atom) )* )
    (?<obs_local_part> (?&word) (?: \. (?&word) )* )
    (?<obs_dtext> (?&obs_NO_WS_CTL) | (?&quoted_pair) )
    (?<obs_qp> \\ (?: \x00 | (?&obs_NO_WS_CTL) | \n | \r ) )
    (?<obs_FWS> (?&WSP)+ (?: \r\n (?&WSP)+ )* )
    (?<obs_ctext> (?&obs_NO_WS_CTL) )
    (?<obs_qtext> (?&obs_NO_WS_CTL) )
    (?<obs_NO_WS_CTL> [\x01-\x08] | \x0b | \x0c | [\x0e-\x1f] | \x7f )
    # character class definitions
    (?<VCHAR> [\x21-\x7E] )
    (?<WSP> [ \t] )
)
^(?&addr_spec)$

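Comments:

You have truly shown the dark side of RFC5322 and regular expressions. I guess there is no such regex in Ruby, and the regex you posted is not accepted by the Ruby interpreter. Is there a practical reason for your Rails application to be compatible with every possible email address pattern?

What standard/regex would you recommend?

Answer 1: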

Converting a PCRE regex that uses recursion/subroutines into an Onigmo regex is quite straightforward:

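- Remove the unsupported (?(DEFINE)...) construct.
- Put all the named groups that define the consuming patterns at the start of the regex and apply the {0} quantifier to each of them, so that they do not match anything where they are defined.
- Replace the (?&...) syntax with \g<...> (I just replaced \(\?&(\w+)\) with \\g<$1> in Notepad++).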
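The steps above can be sketched in Ruby itself. This is an illustration under stated assumptions, not the answer's actual workflow: `pcre_to_onigmo_calls` is a hypothetical helper name, and `DOTTED_NUMBER` is a toy grammar invented here to show how a group followed by the zero-repetition quantifier {0} acts as a definition that is only reached through \g<name>.

```ruby
# Rewrite PCRE subroutine calls (?&name) as Onigmo calls \g<name>.
# `pcre_to_onigmo_calls` is a hypothetical helper, not a library function.
def pcre_to_onigmo_calls(source)
  source.gsub(/\(\?&(\w+)\)/) { "\\g<#{Regexp.last_match(1)}>" }
end

puts pcre_to_onigmo_calls('(?<domain> (?&dot_atom) | (?&domain_literal) )')

# Definition-only groups: {0} means the group consumes nothing in place,
# but its pattern can still be called through \g<name>.
DOTTED_NUMBER = /
  (?<digit>  [0-9] ){0}
  (?<number> \g<digit>+ ){0}
  \A \g<number> (?: \. \g<number> )* \z
/x

p DOTTED_NUMBER.match?('1.22.333')
p DOTTED_NUMBER.match?('1..2')
```

The block form of gsub is used so that the backslash in the replacement does not have to be escaped twice.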

The final expression that works in Ruby looks like this:

re = /(?<addr_spec> \g<local_part> @ \g<domain> ){0}
(?<local_part> \g<dot_atom> | \g<quoted_string> | \g<obs_local_part> ){0}
(?<domain> \g<dot_atom> | \g<domain_literal> | \g<obs_domain> ){0}
(?<domain_literal> \g<CFWS>? \[ (?: \g<FWS>? \g<dtext> )* \g<FWS>? \] \g<CFWS>? ){0}
(?<dtext> [\x21-\x5a] | [\x5e-\x7e] | \g<obs_dtext> ){0}
(?<quoted_pair> \\ (?: \g<VCHAR> | \g<WSP> ) | \g<obs_qp> ){0}
(?<dot_atom> \g<CFWS>? \g<dot_atom_text> \g<CFWS>? ){0}
(?<dot_atom_text> \g<atext> (?: \. \g<atext> )* ){0}
(?<atext> [a-zA-Z0-9!#$%&'*+\/=?^_`|~-]+ ){0}
(?<atom> \g<CFWS>? \g<atext> \g<CFWS>? ){0}
(?<word> \g<atom> | \g<quoted_string> ){0}
(?<quoted_string> \g<CFWS>? " (?: \g<FWS>? \g<qcontent> )* \g<FWS>? " \g<CFWS>? ){0}
(?<qcontent> \g<qtext> | \g<quoted_pair> ){0}
(?<qtext> \x21 | [\x23-\x5b] | [\x5d-\x7e] | \g<obs_qtext> ){0}
# comments and whitespace
(?<FWS> (?: \g<WSP>* \r\n )? \g<WSP>+ | \g<obs_FWS> ){0}
(?<CFWS> (?: \g<FWS>? \g<comment> )+ \g<FWS>? | \g<FWS> ){0}
(?<comment> \( (?: \g<FWS>? \g<ccontent> )* \g<FWS>? \) ){0}
(?<ccontent> \g<ctext> | \g<quoted_pair> | \g<comment> ){0}
(?<ctext> [\x21-\x27] | [\x2a-\x5b] | [\x5d-\x7e] | \g<obs_ctext> ){0}
# obsolete tokens
(?<obs_domain> \g<atom> (?: \. \g<atom> )* ){0}
(?<obs_local_part> \g<word> (?: \. \g<word> )* ){0}
(?<obs_dtext> \g<obs_NO_WS_CTL> | \g<quoted_pair> ){0}
(?<obs_qp> \\ (?: \x00 | \g<obs_NO_WS_CTL> | \n | \r ) ){0}
(?<obs_FWS> \g<WSP>+ (?: \r\n \g<WSP>+ )* ){0}
(?<obs_ctext> \g<obs_NO_WS_CTL> ){0}
(?<obs_qtext> \g<obs_NO_WS_CTL> ){0}
(?<obs_NO_WS_CTL> [\x01-\x08] | \x0b | \x0c | [\x0e-\x1f] | \x7f ){0}
# character class definitions
(?<VCHAR> [\x21-\x7E] ){0}
(?<WSP> [ \t] ){0}
^\g<addr_spec>$/x

See the Ruby test:

p re.match?('+1~1+@iana.org')           # => true
p re.match?('test@[123.123.123.123')    # => false
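One point worth checking before using a pattern like this in a Rails validation: in Ruby, ^ and $ match at line boundaries, not only at the start and end of the string, so a multi-line input can pass as long as one of its lines matches. As far as I know, Rails' format validator also rejects regexes containing ^ or $ unless multiline: true is given. The sketch below uses a deliberately tiny stand-in pattern (an assumption for illustration, not the RFC5322 expression) to show the difference between line anchors and the string anchors \A and \z.

```ruby
# Toy pattern only, standing in for the full RFC5322 expression.
line_anchored   = /^[a-z]+@[a-z.]+$/
string_anchored = /\A[a-z]+@[a-z.]+\z/

# The first line looks like an address; the second line is arbitrary text.
input = "user@example.com\nsomething else entirely"

p line_anchored.match?(input)
p string_anchored.match?(input)
p string_anchored.match?('user@example.com')
```

If whole-string matching is what the validation needs, swapping the final ^\g<addr_spec>$ for \A\g<addr_spec>\z would be the corresponding change.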
