找到最接近特定字符串的单词?
Posted
技术标签:
【中文标题】找到最接近特定字符串的单词?【英文标题】:Find the word which I closest to the particular string? 【发布时间】:2013-10-08 01:54:51 【问题描述】:我们做用户核对报告,因为我们需要找到分配给特定用户的电子邮件 ID。
前任
客户报告用户名可能如下所示
Sathish K
Sathya A
但在我们的合并报告中,实际用户名将如下所示
Sathish Kothandam
Sathya Arjun
所以我创建了一个宏
Sub test
Dim t as string
t= “Sathish K”
msgbox(getemailId(t))
End sub
Dim rng As Range
Function getemailId(Byval findString As String)
With ActiveWorkbook.Sheets("CONSOLIDATED").Range("B:B")
Set rng = .find(What:=findString, LookIn:=xlValues)
If Not rng Is Nothing Then
‘ B – Column contains username C – Email id of the user
getemailId = rng.offset(0,1).value
Else
find1 = 0
End If
End With
End Function
我的宏在上述场景中完美运行,但有时我可能会收到如下用户名
Satish Kothandam
Sathiya Arjun
但这一次它返回 0 。有什么办法可以实现我的目标吗? 希望我解释清楚?
【问题讨论】:
如果你可以把数据放在ms访问表中,你可以使用SOUNDEX。看看这个link 这个link 用于 Excel Soundex。 嗨桑托什。谢谢你的建议。但是 excel soundex 的链接仅适用于几个单词。不是所有的..我已经从那个网站下载了Example excel工作簿并检查了。 ? 我测试过,它对我有用。你能给我一个它不起作用的例子吗? 作为我上面的例子 Sathish-Satish,sathya-satya 【参考方案1】:请看下面的示例代码。
Sub test()
Dim str1 As String, str2 As String
Dim str1c As String, str2c As String
str1 = "Sathish"
str2 = "Satish"
str1c = SOUNDEX(str1)
str2c = SOUNDEX(str2)
MsgBox str1c = str2c
End Sub
Function SOUNDEX(Surname As String) As String
' Developed by Richard J. Yanco
' This function follows the Soundex rules given at
' http://home.utah-inter.net/kinsearch/Soundex.html
Dim Result As String, c As String * 1
Dim Location As Integer
Surname = UCase(Surname)
' First character must be a letter
If Asc(Left(Surname, 1)) < 65 Or Asc(Left(Surname, 1)) > 90 Then
SOUNDEX = ""
Exit Function
Else
' St. is converted to Saint
If Left(Surname, 3) = "ST." Then
Surname = "SAINT" & Mid(Surname, 4)
End If
' Convert to Soundex: letters to their appropriate digit,
' A,E,I,O,U,Y ("slash letters") to slashes
' H,W, and everything else to zero-length string
Result = Left(Surname, 1)
For Location = 2 To Len(Surname)
Result = Result & Category(Mid(Surname, Location, 1))
Next Location
' Remove double letters
Location = 2
Do While Location < Len(Result)
If Mid(Result, Location, 1) = Mid(Result, Location + 1, 1) Then
Result = Left(Result, Location) & Mid(Result, Location + 2)
Else
Location = Location + 1
End If
Loop
' If category of 1st letter equals 2nd character, remove 2nd character
If Category(Left(Result, 1)) = Mid(Result, 2, 1) Then
Result = Left(Result, 1) & Mid(Result, 3)
End If
' Remove slashes
For Location = 2 To Len(Result)
If Mid(Result, Location, 1) = "/" Then
Result = Left(Result, Location - 1) & Mid(Result, Location + 1)
End If
Next
' Trim or pad with zeroes as necessary
Select Case Len(Result)
Case 4
SOUNDEX = Result
Case Is < 4
SOUNDEX = Result & String(4 - Len(Result), "0")
Case Is > 4
SOUNDEX = Left(Result, 4)
End Select
End If
End Function
Private Function Category(c) As String
' Returns a Soundex code for a letter
Select Case True
Case c Like "[AEIOUY]"
Category = "/"
Case c Like "[BPFV]"
Category = "1"
Case c Like "[CSKGJQXZ]"
Category = "2"
Case c Like "[DT]"
Category = "3"
Case c = "L"
Category = "4"
Case c Like "[MN]"
Category = "5"
Case c = "R"
Category = "6"
Case Else 'This includes H and W, spaces, punctuation, etc.
Category = ""
End Select
End Function
【讨论】:
感谢这个 santosh .. 这就是我想要的。抱歉最初误解了这个功能:)【参考方案2】:您可以使用 levenshtein 算法。它计算两个字符串之间的距离。
来源维基媒体
Function levenshtein(a As String, b As String) As Integer
Dim i As Integer
Dim j As Integer
Dim cost As Integer
Dim d() As Integer
Dim min1 As Integer
Dim min2 As Integer
Dim min3 As Integer
If Len(a) = 0 Then
levenshtein = Len(b)
Exit Function
End If
If Len(b) = 0 Then
levenshtein = Len(a)
Exit Function
End If
ReDim d(Len(a), Len(b))
For i = 0 To Len(a)
d(i, 0) = i
Next
For j = 0 To Len(b)
d(0, j) = j
Next
For i = 1 To Len(a)
For j = 1 To Len(b)
If Mid(a, i, 1) = Mid(b, j, 1) Then
cost = 0
Else
cost = 1
End If
' Since Min() function is not a part of VBA, we'll "emulate" it below
min1 = (d(i - 1, j) + 1)
min2 = (d(i, j - 1) + 1)
min3 = (d(i - 1, j - 1) + cost)
' If min1 <= min2 And min1 <= min3 Then
' d(i, j) = min1
' ElseIf min2 <= min1 And min2 <= min3 Then
' d(i, j) = min2
' Else
' d(i, j) = min3
' End If
' In Excel we can use Min() function that is included
' as a method of WorksheetFunction object
d(i, j) = Application.WorksheetFunction.Min(min1, min2, min3)
Next
Next
levenshtein = d(Len(a), Len(b))
End Function
【讨论】:
以上是关于找到最接近特定字符串的单词?的主要内容,如果未能解决你的问题,请参考以下文章