How to implement Sublime Text-like fuzzy search?
【Posted】: 2013-05-30 06:11:55
【Question】: How can I implement a Sublime-like fuzzy search on select2?
For example, typing "sta jav sub" would match "*** javascript sublime like".
【Question comments】:
【Answer 1】: Here is an alternative matching function. http://jsfiddle.net/trevordixon/pXzj3/4/
function match(search, text) {
    search = search.toUpperCase();
    text = text.toUpperCase();
    var j = -1; // remembers position of last found character
    // consider each search character one at a time
    for (var i = 0; i < search.length; i++) {
        var l = search[i];
        if (l == ' ') continue;     // ignore spaces
        j = text.indexOf(l, j + 1); // search for character & update position
        if (j == -1) return false;  // if it's not found, exclude this item
    }
    return true;
}
This one is faster (according to this test in Chrome), which may start to matter if you are filtering a lot of items.
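As a rough usage sketch, the matcher can be tried outside select2 by filtering a plain array. The function below repeats the answer's `match` with its braces in place; the helper name `filterItems` and the sample items are assumptions made for illustration only.

```javascript
// Subsequence match: every non-space character of `search` must appear
// in `text`, in the same order, ignoring case.
function match(search, text) {
  search = search.toUpperCase();
  text = text.toUpperCase();
  var j = -1; // position of the last matched character
  for (var i = 0; i < search.length; i++) {
    var l = search[i];
    if (l === ' ') continue;     // ignore spaces
    j = text.indexOf(l, j + 1);  // look strictly after the previous match
    if (j === -1) return false;  // character not found: no match
  }
  return true;
}

// Hypothetical helper: keep only the items that match the query.
function filterItems(query, items) {
  return items.filter(function (item) {
    return match(query, item);
  });
}

var items = ['javascript sublime like', 'Any Bachelor Sunrise', 'ABS Building Co.'];
var jsHits = filterItems('jav sub', items);
var absHits = filterItems('ABS', items);
```

With this sample data, `jsHits` keeps only the first item, while `absHits` keeps both "Any Bachelor Sunrise" and "ABS Building Co.", because the function only checks order and not adjacency.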
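【Comments】:
I like your implementation, but you don't try to match the characters of the search string together, e.g. ABS would match Any Bachelor Sunrise, which may or may not be what you expect :)
Isn't that exactly what is expected of a fuzzy search?
@Kloar He means that A B S should match Any Bachelor Sunrise, but ABS should only match, say, ABS Building Co.
@rvighne I think ABS should match both, but the second match should rank higher than the first.
I found that this never searches the first character, because j is never equal to 0 when the indexOf method runs. It would be better to use j as is and add j++ after the if condition to update the position, rather than doing it inside the indexOf call.
【Answer 2】: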
select2 allows you to implement your own "matcher" function (as seen on their docs). Using that and a bit of regex, you can do something like this:
$("#element").select2({
    matcher: function(term, text, opt) {
        // Uppercase both sides to get a case-insensitive match.
        // Replace every run of whitespace with .+ so that one or more
        // arbitrary characters may sit between the typed words.
        // Note: regex metacharacters in term are not escaped here.
        return text.toUpperCase().match(term.toUpperCase().replace(/\s+/g, '.+'));
    }
});
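The matcher function is called against every select2 list element when the list is filtered or searched, so you can use it to implement any kind of custom search.
【Comments】: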
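【Answer 3】: I wrote something that works very much like Sublime Text's fuzzy match. Achieving this requires a few things.
First, match all characters of the pattern in sequence. Second, score the matches so that certain matched characters are worth more points than others.
I came up with a few factors to check for. "CamelCase" letters, or letters following a separator (space or underscore), are worth a lot of points. Consecutive matches are worth more. Results found near the start are worth more.
A crucially important trick is to find the best matching character, which is not necessarily the first one. Consider fuzzy_match("tk", "The Black Knight"). There are two Ks that could be matched. The second is worth more points because it follows a space.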
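The "best matching character" idea can be sketched in isolation. The snippet below is a simplified illustration, not the answer's single-pass algorithm: it scores each occurrence of one character by position only, ignoring the adjacency bonus and the penalties. The bonus values mirror the constants in the answer's code; the helper names `positionBonus` and `bestOccurrence` are assumptions.

```javascript
// Bonus values mirror the answer's constants; helper names are hypothetical.
var SEPARATOR_BONUS = 10;
var CAMEL_BONUS = 10;

function isLower(c) { return c === c.toLowerCase() && c !== c.toUpperCase(); }
function isUpper(c) { return c === c.toUpperCase() && c !== c.toLowerCase(); }

// Bonus earned by matching the character at index idx of str.
function positionBonus(str, idx) {
  var bonus = 0;
  // The start of the string is treated as if it followed a separator.
  var prev = idx > 0 ? str.charAt(idx - 1) : ' ';
  if (prev === ' ' || prev === '_') bonus += SEPARATOR_BONUS;
  if (isLower(prev) && isUpper(str.charAt(idx))) bonus += CAMEL_BONUS;
  return bonus;
}

// Among all occurrences of ch at or after `from`, return the one with the
// highest bonus (the earliest one wins ties), or null if there is none.
function bestOccurrence(str, ch, from) {
  var best = null;
  var lower = str.toLowerCase();
  var target = ch.toLowerCase();
  for (var i = lower.indexOf(target, from); i !== -1; i = lower.indexOf(target, i + 1)) {
    var bonus = positionBonus(str, i);
    if (best === null || bonus > best.bonus) best = { index: i, bonus: bonus };
  }
  return best;
}

var str = 'The Black Knight';
var k = bestOccurrence(str, 'k', 1); // look for "k" after the matched "T"
```

Here the "k" in "Black" earns no bonus, while the "K" in "Knight" follows a space and earns the separator bonus, so `k.index` points at the second K.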
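The JavaScript code is below. Some nuances are described in more detail in the blog post. There is also an interactive demo, and the full source (including the demo and a C++ implementation) is on GitHub.
Blog Post | Interactive Demo | GitHub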
// Returns [bool, score, formattedStr]
// bool: true if each character in pattern is found sequentially within str
// score: integer; higher is better match. Value has no intrinsic meaning. Range varies with pattern.
//        Can only compare scores with same search pattern.
// formattedStr: input str with matched characters marked in <b> tags. Delete if unwanted.
function fuzzy_match(pattern, str) {
    // Score consts
    var adjacency_bonus = 5;             // bonus for adjacent matches
    var separator_bonus = 10;            // bonus if match occurs after a separator
    var camel_bonus = 10;                // bonus if match is uppercase and prev is lower
    var leading_letter_penalty = -3;     // penalty applied for every letter in str before the first match
    var max_leading_letter_penalty = -9; // maximum penalty for leading letters
    var unmatched_letter_penalty = -1;   // penalty for every letter that doesn't match

    // Loop variables
    var score = 0;
    var patternIdx = 0;
    var patternLength = pattern.length;
    var strIdx = 0;
    var strLength = str.length;
    var prevMatched = false;
    var prevLower = false;
    var prevSeparator = true; // true so that a match on the first letter gets the separator bonus

    // Use "best" matched letter if multiple string letters match the pattern
    var bestLetter = null;
    var bestLower = null;
    var bestLetterIdx = null;
    var bestLetterScore = 0;
    var matchedIndices = [];

    // Loop over str
    while (strIdx != strLength) {
        var patternChar = patternIdx != patternLength ? pattern.charAt(patternIdx) : null;
        var strChar = str.charAt(strIdx);
        var patternLower = patternChar != null ? patternChar.toLowerCase() : null;
        var strLower = strChar.toLowerCase();
        var strUpper = strChar.toUpperCase();
        var nextMatch = patternChar && patternLower == strLower;
        var rematch = bestLetter && bestLower == strLower;
        var advanced = nextMatch && bestLetter;
        var patternRepeat = bestLetter && patternChar && bestLower == patternLower;

        // Commit the pending best letter before moving on to the next pattern letter
        if (advanced || patternRepeat) {
            score += bestLetterScore;
            matchedIndices.push(bestLetterIdx);
            bestLetter = null;
            bestLower = null;
            bestLetterIdx = null;
            bestLetterScore = 0;
        }

        if (nextMatch || rematch) {
            var newScore = 0;

            // Apply penalty for each letter before the first pattern match
            // Note: Math.max because penalties are negative values. So max is smallest penalty.
            if (patternIdx == 0) {
                var penalty = Math.max(strIdx * leading_letter_penalty, max_leading_letter_penalty);
                score += penalty;
            }

            // Apply bonus for consecutive matches
            if (prevMatched)
                newScore += adjacency_bonus;

            // Apply bonus for matches after a separator
            if (prevSeparator)
                newScore += separator_bonus;

            // Apply bonus across camel case boundaries. Includes "clever" isLetter check.
            if (prevLower && strChar == strUpper && strLower != strUpper)
                newScore += camel_bonus;

            // Update pattern index IFF the next pattern letter was matched
            if (nextMatch)
                ++patternIdx;

            // Update best letter in str which may be for a "next" letter or a "rematch"
            if (newScore >= bestLetterScore) {
                // Apply penalty for now skipped letter
                if (bestLetter != null)
                    score += unmatched_letter_penalty;

                bestLetter = strChar;
                bestLower = bestLetter.toLowerCase();
                bestLetterIdx = strIdx;
                bestLetterScore = newScore;
            }

            prevMatched = true;
        } else {
            // Unmatched character (formattedStr is built after the loop)
            score += unmatched_letter_penalty;
            prevMatched = false;
        }

        // Includes "clever" isLetter check.
        prevLower = strChar == strLower && strLower != strUpper;
        prevSeparator = strChar == '_' || strChar == ' ';

        ++strIdx;
    }

    // Apply score for last match
    if (bestLetter) {
        score += bestLetterScore;
        matchedIndices.push(bestLetterIdx);
    }

    // Build formatted string based on matched letters
    var formattedStr = "";
    var lastIdx = 0;
    for (var i = 0; i < matchedIndices.length; ++i) {
        var idx = matchedIndices[i];
        formattedStr += str.slice(lastIdx, idx) + "<b>" + str.charAt(idx) + "</b>";
        lastIdx = idx + 1;
    }
    // Finish out formatted string after last pattern matched
    formattedStr += str.slice(lastIdx);

    var matched = patternIdx == patternLength;
    return [matched, score, formattedStr];
}
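Because the scores are only comparable for the same pattern, a typical use is to score every candidate against one pattern, drop the non-matches, and sort best-first. The sketch below assumes a matcher that returns `[matched, score, formattedStr]` as documented for `fuzzy_match`. The helper name `rankMatches` is hypothetical, and `toyMatch` is a deliberately trivial stand-in scorer so the snippet runs on its own; it is not the scoring scheme from the answer.

```javascript
// Hypothetical helper: run matchFn(pattern, item) on every item, keep the
// matches, and order them by descending score.
// matchFn must return [matched, score, formattedStr].
function rankMatches(matchFn, pattern, items) {
  return items
    .map(function (item) {
      var r = matchFn(pattern, item);
      return { item: item, matched: r[0], score: r[1], formatted: r[2] };
    })
    .filter(function (r) { return r.matched; })
    .sort(function (a, b) { return b.score - a.score; });
}

// Stand-in scorer for demonstration only (NOT the scheme above):
// plain substring containment, with shorter candidates scoring higher.
function toyMatch(pattern, str) {
  var found = str.toLowerCase().indexOf(pattern.toLowerCase()) !== -1;
  return [found, -str.length, str];
}

var ranked = rankMatches(toyMatch, 'kn', ['The Black Knight', 'Knight', 'Bishop']);
```

In real use, `fuzzy_match` would be passed in place of `toyMatch`, for example `rankMatches(fuzzy_match, 'tk', candidates)`.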
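【Comments】:
【Answer 4】: albertein's answer doesn't match the same things as Trevor's version, because the original function matches on a character basis rather than a word basis. Here is a simpler matcher that works on characters: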
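$("#element").select2({
    matcher: function(term, text, opts) {
        // Drop whitespace, escape regex metacharacters, then allow anything between characters
        var pattern = term.replace(/\s+/g, '').split('').map(function (c) {
            return c.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
        }).join('.*');
        return new RegExp(pattern, 'i').test(text); // the original omitted the return
    }
});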
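The difference between the word-based and character-based approaches is easiest to see side by side. The sketch below builds both kinds of pattern as plain functions, outside select2. The names `escapeRegExp`, `wordPattern` and `charPattern` are assumptions for illustration, and the escaping step is an addition that the answers above do not all include.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Word-based: whitespace-separated chunks must appear in order,
// each chunk contiguous, with at least one character between chunks.
function wordPattern(term) {
  var words = term.trim().split(/\s+/).map(escapeRegExp);
  return new RegExp(words.join('.+'), 'i');
}

// Character-based: every non-space character must appear in order,
// with anything (or nothing) in between.
function charPattern(term) {
  var chars = term.replace(/\s+/g, '').split('').map(escapeRegExp);
  return new RegExp(chars.join('.*'), 'i');
}

var text = 'Any Bachelor Sunrise';
var wordHit = wordPattern('ABS').test(text); // needs the literal substring "abs"
var charHit = charPattern('ABS').test(text); // needs a, then b, then s, anywhere
```

For this text the word-based pattern does not match, while the character-based one does, which is the behaviour discussed in the comments on the first answer.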
【Comments】:
【Answer 5】:
var fuzzysearch = function (querystrings, values) {
    // True when every query string is a substring of at least one value
    return !querystrings.some(function (q) {
        return !values.some(function (v) {
            return v.toLocaleLowerCase().indexOf(q) !== -1;
        });
    });
};
An example that searches titles and authors in a collection of books: http://jsfiddle.net/runjep/r887etnh/2/
For a 9kb alternative that also ranks the search results: http://kiro.me/projects/fuse.html
You may need a polyfill for the 'some' function: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/some
var books = [{
    id: 1,
    title: 'The Great Gatsby',
    author: 'F. Scott Fitzgerald'
}, {
    id: 2,
    title: 'The DaVinci Code',
    author: 'Dan Brown'
}, {
    id: 3,
    title: 'Angels & Demons',
    author: 'Dan Brown'
}];
// search and fs are assigned without var, so they are globals reachable from onclick
search = function () {
    // Split the lowercased input into query words
    var queryarray = document.getElementById('inp').value.trim().toLowerCase().split(' ');
    var res = books.filter(function (b) {
        return fs(queryarray, [b.title, b.author]);
    });
    document.getElementById('res').innerHTML = res.map(function (b) {
        return b.title + ' <i> ' + b.author + '</i>';
    }).join('<br/> ');
};
// True when every query word is a substring of at least one value
fs = function (qs, vals) {
    return !qs.some(function (q) {
        return !vals.some(function (v) {
            return v.toLocaleLowerCase().indexOf(q) !== -1;
        });
    });
};
<input id="inp" />
<button id="but" onclick="search()">Search</button>
<div id="res"></div>
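The double negation in `fs` ("no query word fails to appear in some value") can be read more directly with `every`. The sketch below is an equivalent formulation under that reading; the name `matchesAllWords` is made up for illustration, and the query words are assumed to be lowercased already, as they are in the `search` function above.

```javascript
// Each query word must be a substring of at least one of the values.
// Equivalent to !qs.some(q => !vals.some(...)) by De Morgan's law.
function matchesAllWords(queryWords, values) {
  return queryWords.every(function (q) {
    return values.some(function (v) {
      return v.toLocaleLowerCase().indexOf(q) !== -1;
    });
  });
}

var book = { title: 'The DaVinci Code', author: 'Dan Brown' };
var hit = matchesAllWords(['dan', 'code'], [book.title, book.author]);
var miss = matchesAllWords(['dan', 'gatsby'], [book.title, book.author]);
```

Note that this is word-level substring matching across several fields rather than the character-by-character subsequence matching used in the other answers.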
【Comments】:
This is vanilla JavaScript, not select2 - sorry
【Answer 6】:
function fuzzyMe(term, query) {
    var score = 0;
    var termLength = term.length;
    var queryLength = query.length;
    var highlighting = '';
    var ti = 0;
    // -1 would not work as this would break the calculations of bonus
    // points for subsequent character matches. Something like
    // Number.MIN_VALUE would be more appropriate, but unfortunately
    // Number.MIN_VALUE + 1 equals 1...
    var previousMatchingCharacter = -2;
    for (var qi = 0; qi < queryLength && ti < termLength; qi++) {
        var qc = query.charAt(qi);
        var lowerQc = qc.toLowerCase();
        // Scan term forward until the current query character is found
        for (; ti < termLength; ti++) {
            var tc = term.charAt(ti);
            if (lowerQc === tc.toLowerCase()) {
                score++;
                // Bonus when this match directly follows the previous one
                if ((previousMatchingCharacter + 1) === ti) {
                    score += 2;
                }
                highlighting += "<em>" + tc + "</em>";
                previousMatchingCharacter = ti;
                ti++;
                break;
            } else {
                highlighting += tc;
            }
        }
    }
    // Append whatever is left of term after the last match
    highlighting += term.substring(ti, term.length);
    return {
        score: score,
        term: term,
        query: query,
        highlightedTerm: highlighting
    };
}
The above takes care of the fuzziness. You can then apply it to every select2 element:
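$("#element").select2({
    matcher: function(term, text, opt) {
        // fuzzyMe takes the candidate text first and the typed query second.
        // highlightedTerm is always a non-empty string, so test the score instead;
        // score > 0 only means at least one character matched, so tighten as needed.
        return term === '' || fuzzyMe(text, term).score > 0;
    }
});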
Credit for the fuzzy code: https://github.com/bripkens/fuzzy.js
【Comments】:
【Answer 7】: I had a hard time with the new select2; here is what worked:
$("#foo").select2({
    matcher: matcher
});
function matcher(params, data) {
    // return all options if the search box is empty
    if (!params.term) {
        return data;
    } else if (data) {
        var term = params.term.toUpperCase();
        var option = data.text.toUpperCase();
        var j = -1; // remembers position of last found character
        // consider each search character one at a time
        for (var i = 0; i < term.length; i++) {
            var l = term[i];
            if (l == ' ') continue;       // ignore spaces
            j = option.indexOf(l, j + 1); // search for character & update position
            if (j == -1) return null;     // not found: exclude this item (select2 4.x expects null)
        }
        return data; // return option
    }
    return null;
}
【Comments】: