python regex - match two strings

Original title: python regex - match two string [closed]
Posted: 2014-08-05 22:13:51

Question:

I want to check whether two strings are similar to each other. For example:

string1 = "Select a valid choice. **aaaa** is not one of the available choices."
string2 = "Select a valid choice. **bbbb** is not one of the available choices."

string3 = "Ensure this value has at most 30 characters (it has 40 chars)."
string4 = "Ensure this value has at most 60 characters (it has 110 chars)."

Comparing string1 with string2 should return True; comparing string1 with string3 should return False.

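Comments:

Show us the regex you have tried.

It would not make string1 and string2 evaluate to true.

The values cannot be compared directly: string1 contains aaaa and string2 contains bbbb, so an equality check returns False rather than True. You have to extract the relevant parts you want to compare. Please give it a try and include the result in your question.

Could you spell out a formal definition of "equal" and "not equal" strings for your case? It is clear to a human, but a computer is not a human.

Answer 1: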

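You can use the Levenshtein distance (the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other):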

def lev(s1, s2):
    if len(s1) < len(s2):
        return lev(s2, s1)

    # len(s1) >= len(s2)
    if len(s2) == 0:
        return len(s1)

    previous_row = list(range(len(s2) + 1))  # distances from the empty prefix of s1 (range replaces Python 2's xrange)
    for i, c1 in enumerate(s1):
        current_row = [i + 1]
        for j, c2 in enumerate(s2):
            insertions = previous_row[j + 1] + 1 # j+1 instead of j since previous_row and current_row are one character longer
            deletions = current_row[j] + 1       # than s2
            substitutions = previous_row[j] + (c1 != c2)
            current_row.append(min(insertions, deletions, substitutions))
        previous_row = current_row

    return previous_row[-1]

string1 = "Select a valid choice. aaaa is not one of the available choices."
string2 = "Select a valid choice. bbbb is not one of the available choices."
string3 = "Ensure this value has at most 30 characters (it has 40 chars)."
string4 = "Ensure this value has at most 60 characters (it has 110 chars)."

print(lev(string1, string2)) # => 4
print(lev(string3, string4)) # => 3
print(lev(string1, string3)) # => 49

Code copied from here.
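
The distances still have to be turned into the True/False answer the question asks for. A minimal sketch, assuming that "similar" means an edit distance of at most 20% of the longer string's length. The `is_similar` helper, its `max_ratio` parameter, and the 0.2 cutoff are illustrative assumptions, not part of the original answer, and the cutoff would likely need tuning for other messages. Written for Python 3:

```python
def lev(s1, s2):
    """Levenshtein distance, using the same row-by-row scheme as the answer."""
    if len(s1) < len(s2):
        s1, s2 = s2, s1  # keep s1 as the longer string
    previous_row = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1):
        current_row = [i + 1]
        for j, c2 in enumerate(s2):
            current_row.append(min(
                previous_row[j + 1] + 1,       # insertion
                current_row[j] + 1,            # deletion
                previous_row[j] + (c1 != c2),  # substitution (free if equal)
            ))
        previous_row = current_row
    return previous_row[-1]


def is_similar(s1, s2, max_ratio=0.2):
    """Hypothetical helper: True when the distance is small relative to length.

    The 0.2 default is an assumed cutoff, not a value from the answer.
    """
    longest = max(len(s1), len(s2))
    if longest == 0:
        return True  # two empty strings are trivially similar
    return lev(s1, s2) / longest <= max_ratio


string1 = "Select a valid choice. aaaa is not one of the available choices."
string2 = "Select a valid choice. bbbb is not one of the available choices."
string3 = "Ensure this value has at most 30 characters (it has 40 chars)."
string4 = "Ensure this value has at most 60 characters (it has 110 chars)."

print(is_similar(string1, string2))  # True
print(is_similar(string3, string4))  # True
print(is_similar(string1, string3))  # False
```

Dividing by the longer length keeps the cutoff independent of message size; a fixed absolute cutoff (for example, at most 5 edits) would be a simpler alternative when all messages are roughly the same length.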

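Since the question asks for a regex, another option, in line with the comment about extracting the relevant parts, is to describe each message shape as a pattern and treat two strings as similar when they match the same pattern. A sketch under the assumption that the set of message shapes is known in advance; `TEMPLATES`, `template_index`, and `same_template` are hypothetical names, and the two patterns are written by hand from the example strings:

```python
import re

# Hand-written patterns for the two message shapes in the question.
# Assumption: every message of interest fits one of a known list of shapes.
TEMPLATES = [
    re.compile(r"Select a valid choice\. .+ is not one of the available choices\."),
    re.compile(r"Ensure this value has at most \d+ characters \(it has \d+ chars\)\."),
]


def template_index(s):
    """Return the index of the first pattern that matches all of s, else None."""
    for i, pattern in enumerate(TEMPLATES):
        if pattern.fullmatch(s):
            return i
    return None


def same_template(s1, s2):
    """True when both strings match the same known pattern."""
    index = template_index(s1)
    return index is not None and index == template_index(s2)


string1 = "Select a valid choice. aaaa is not one of the available choices."
string2 = "Select a valid choice. bbbb is not one of the available choices."
string3 = "Ensure this value has at most 30 characters (it has 40 chars)."
string4 = "Ensure this value has at most 60 characters (it has 110 chars)."

print(same_template(string1, string2))  # True
print(same_template(string3, string4))  # True
print(same_template(string1, string3))  # False
```

Unlike the distance-based check, this gives an exact answer but only for shapes listed in `TEMPLATES`; a string that matches none of them is never reported as similar to anything.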
