Finite State Machines and Lucene (Part 1: Introduction)
Posted by 跳跳爸的Abc
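A deterministic finite automaton (DFA) is a mathematical model of computation defined by a 5-tuple:
a finite set of states Q
a finite set of input symbols S, called the alphabet (a generalization of the term: it need not be the English alphabet we are used to)
a state transition function F, F: Q × S -> Q
an initial state s0, s0 ∈ Q
a set of accepting states Z, Z ⊆ Q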
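Suppose there is a string (a sequence of symbols) w = a(1) a(2) a(3) … a(n), with every a(i) ∈ S, and a sequence of states r = r(0), r(1), r(2), r(3), …, r(n). The state machine M accepts w when w and r satisfy the following conditions:
r(0) = s0: the run starts in the machine's initial state
r(i+1) = F(r(i), a(i+1)), for i = 0, …, n-1: reading input a(i+1) in the current state r(i) yields the next state r(i+1), which by the definition of F always belongs to Q
r(n) ∈ Z: the state reached after the last symbol is one of the accepting states in Z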
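The difference between a deterministic finite automaton and a nondeterministic finite automaton (NFA) is that in a DFA a source state and an input symbol together determine exactly one transition, and every transition consumes an input symbol, whereas an NFA imposes neither restriction. For example, in place of the third component of the DFA above, an NFA's transition function can be F: Q × S -> P(Q), so one state may transition to several other states. A DFA is a special case of an NFA.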
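To make the set-valued transition function concrete, here is a minimal sketch of NFA simulation in plain Java. It is not Lucene code. The machine (which accepts strings over {A, B} that end in "AB"), the class name EndsWithAbNfa, and the integer state numbering are all made up for illustration, and transitions that consume no input are left out to keep the sketch short.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical NFA sketch: accepts strings over {A, B} that end in "AB". */
class EndsWithAbNfa {
    // States are 0, 1, 2; state 0 is the initial state and state 2 is accepting.
    // F: Q x S -> P(Q): a (state, symbol) pair maps to a SET of next states.
    static Set<Integer> step(int q, char a) {
        Set<Integer> next = new HashSet<>();
        if (q == 0 && a == 'A') {
            next.add(0); // stay in 0, or...
            next.add(1); // ...guess that this A starts the final "AB"
        }
        if (q == 0 && a == 'B') {
            next.add(0);
        }
        if (q == 1 && a == 'B') {
            next.add(2);
        }
        return next; // an empty set means this branch of the run dies
    }

    static boolean accepts(String w) {
        // Track every state the NFA could be in after each symbol.
        Set<Integer> current = new HashSet<>();
        current.add(0);
        for (char a : w.toCharArray()) {
            Set<Integer> next = new HashSet<>();
            for (int q : current) {
                next.addAll(step(q, a));
            }
            current = next;
        }
        // Accept if at least one possible run ends in the accepting state.
        return current.contains(2);
    }
}
```

Tracking the whole set of possible states at once is the same idea that underlies converting an NFA into an equivalent DFA: each distinct set of NFA states becomes one DFA state.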
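As shown in the figure, this machine matches the regular expression ^A(B*)A$: on input A it moves to state S2; S2 accepts input A or B; on B the state stays unchanged, and on A it jumps to state S1, which is the accepting state.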
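The machine described above can be sketched in a few lines of Java, following the 5-tuple and the three acceptance conditions from the definition. This is an illustrative sketch, not Lucene code: the class name AbaDfa and the method names are assumptions, S0 is assumed to be the initial state, and the DEAD state is added here so that the transition function is defined for every state and symbol.

```java
/** Hypothetical table-free DFA sketch for the regular expression ^A(B*)A$. */
class AbaDfa {
    // Q = {S0, S2, S1, DEAD}; S0, S2, S1 follow the description,
    // DEAD is an assumed sink state for inputs that can no longer match.
    enum State { S0, S2, S1, DEAD }

    // F: Q x S -> Q, written as a switch instead of a lookup table.
    static State step(State r, char a) {
        switch (r) {
            case S0:
                return a == 'A' ? State.S2 : State.DEAD;
            case S2:
                if (a == 'B') return State.S2; // B: state unchanged
                if (a == 'A') return State.S1; // A: jump to the accepting state
                return State.DEAD;
            default:
                return State.DEAD;             // nothing may follow the closing A
        }
    }

    static boolean accepts(String w) {
        State r = State.S0;                    // r(0) = s0
        for (char a : w.toCharArray()) {
            r = step(r, a);                    // r(i+1) = F(r(i), a(i+1))
        }
        return r == State.S1;                  // r(n) must be in Z = {S1}
    }

    public static void main(String[] args) {
        for (String w : new String[] {"AA", "ABBA", "AB", "ABAB"}) {
            System.out.println(w + " -> " + accepts(w));
        }
    }
}
```

Running the machine costs one transition per input character, regardless of how complicated the pattern is, which is what makes automata attractive for matching terms.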
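Finite state machines are very well suited to text matching. Before Lucene 4.0, fuzzy queries used a brute-force scan: iterate over every term in the index and compute its edit distance to the input term, one term at a time. This works, but the algorithmic complexity is high. From 4.0 onward Lucene uses finite state machines to speed up fuzzy queries, reportedly by about a factor of 100 (http://blog.mikemccandless.com/2011/03/lucenes-fuzzyquery-is-100-times-faster.html).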
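For a sense of why the brute-force scan is expensive, here is a rough sketch of it in plain Java. It is not the code Lucene used: the class BruteForceFuzzy, the in-memory term list, and the maxEdits parameter are assumptions for illustration, and the edit distance is the textbook Levenshtein dynamic program.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a brute-force fuzzy match over a list of terms. */
class BruteForceFuzzy {
    // Levenshtein distance, keeping only two rows of the DP table.
    static int editDistance(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j; // turning "" into t[0..j) takes j insertions
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i; // turning s[0..i) into "" takes i deletions
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                    prev[j - 1] + cost);                    // substitute or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    // Visits EVERY term, so the cost grows with the size of the term dictionary.
    static List<String> fuzzyMatch(List<String> terms, String query, int maxEdits) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (editDistance(term, query) <= maxEdits) {
                hits.add(term);
            }
        }
        return hits;
    }
}
```

Each query costs roughly (number of terms) × (term length) × (query length) steps, because no term can be skipped without computing its distance first.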
In Lucene, the automaton-related code lives mainly in the org.apache.lucene.util.automaton package, plus the Automata/Automaton-related classes in the org.apache.lucene.search package.
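Take a prefix query as an example. The search term is first built into a finite state machine. For instance, matching the field category with the prefix "/Computer" can be built into a finite state machine with 11 states and 11 transitions, with the state reached at the final character r marked as accepting (accept=true):
As shown in the figure. It is worth mentioning that Lucene's Automaton provides a toDot() method that outputs the state machine as a description Graphviz can read; the Graphviz command-line tool dot can then render it as a PNG image.
The last transition means that, once the accepting state has been reached, any further input still satisfies the acceptance condition (which is exactly what a prefix query expresses).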
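The shape of such a prefix automaton (a chain of states plus a self-loop on the accepting state) can be sketched without Lucene. The class PrefixAutomatonSketch, the Transition type, and the ANY label below are hypothetical and only mirror the structure described above. The sketch also works on Java chars, which is a simplification and not a claim about how Lucene represents transitions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a prefix automaton: a chain plus a final self-loop. */
class PrefixAutomatonSketch {
    static final int ANY = -1; // assumed label meaning "any input character"

    static final class Transition {
        final int from;
        final int to;
        final int label;

        Transition(int from, int to, int label) {
            this.from = from;
            this.to = to;
            this.label = label;
        }
    }

    final List<Transition> transitions = new ArrayList<>();
    final int numStates;
    final int acceptState;

    PrefixAutomatonSketch(String prefix) {
        // One transition per prefix character: state i --c--> state i+1.
        for (int i = 0; i < prefix.length(); i++) {
            transitions.add(new Transition(i, i + 1, prefix.charAt(i)));
        }
        acceptState = prefix.length();
        // The last transition: once accepted, any further input stays accepted.
        transitions.add(new Transition(acceptState, acceptState, ANY));
        numStates = prefix.length() + 1; // initial state + one per character
    }

    boolean accepts(String input) {
        int state = 0;
        for (int i = 0; i < input.length(); i++) {
            int next = -1;
            for (Transition t : transitions) {
                if (t.from == state && (t.label == ANY || t.label == input.charAt(i))) {
                    next = t.to;
                    break;
                }
            }
            if (next < 0) {
                return false; // no matching transition: input does not share the prefix
            }
            state = next;
        }
        return state == acceptState;
    }
}
```

For the 10-character prefix "/Computers", the value used in the test later in this post, the sketch ends up with 11 states and 11 transitions: 10 chain transitions plus the self-loop.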
Here is a pitfall I ran into. Before version 7.0, Lucene had a bug (https://issues.apache.org/jira/browse/LUCENE-7914) affecting regex, prefix and similar queries, which are implemented on top of Automata/AutomatonQuery. While building the automaton, Lucene calls the isFinite() method of Operations to check whether the finished automaton accepts a finite language. The catch is that isFinite() is recursive: a regex query like the one below, or a very long prefix query, can recurse deeply enough to throw a StackOverflowError.
POST /test/_search
{
  "query": {
    "regexp": {
      "test": "t{1,9500}"
    }
  }
}
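Note that this operation makes the main ES process exit abnormally on every node that executes the query. If the index happens to be spread across all nodes in the cluster, a query like this is the equivalent of detonating a nuclear bomb.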
Lucene 7.0 and later fix this problem, so ES 6.0 and later are no longer affected. The main fix was to cap the maximum recursion depth inside isFinite().
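The idea of capping recursion depth can be sketched as follows. This is not Lucene's implementation: the class DepthLimitedFiniteCheck, the MAX_DEPTH value, the adjacency-array representation, and the exception type are assumptions for illustration. The sketch also assumes that state 0 is the initial state and that every state can still reach an accepting state, so that any cycle implies an infinite language.

```java
/** Hypothetical sketch: a recursive finiteness check with a depth cap. */
class DepthLimitedFiniteCheck {
    static final int MAX_DEPTH = 1000; // assumed limit, chosen for illustration

    /** next[s] lists the states reachable from s in one transition. */
    static boolean isFinite(int[][] next) {
        return isFinite(next, 0, new boolean[next.length], new boolean[next.length], 0);
    }

    private static boolean isFinite(int[][] next, int s, boolean[] onPath,
                                    boolean[] done, int depth) {
        if (depth > MAX_DEPTH) {
            // Fail fast with an ordinary exception instead of exhausting the stack.
            throw new IllegalArgumentException("automaton too large: depth " + depth);
        }
        onPath[s] = true;
        for (int t : next[s]) {
            if (onPath[t]) {
                return false; // a cycle: infinitely many strings are accepted
            }
            if (!done[t] && !isFinite(next, t, onPath, done, depth + 1)) {
                return false;
            }
        }
        onPath[s] = false;
        done[s] = true; // everything below s is known to be cycle-free
        return true;
    }
}
```

A long chain of states, like the one a pattern such as t{1,9500} expands into, makes the depth-first search recurse once per state. With the cap, such an input is rejected with an exception the caller can handle, which is far less destructive than a StackOverflowError taking down the process.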
With all that said, let's write some code: a simple prefix query example.
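import java.util.Arrays;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public void testPrefixQuery() throws Exception {
    String[] categories = new String[] {"/Computers",
                                        "/Computers/Mac",
                                        "/Computers/Windows"};
    IndexWriterConfig config = new IndexWriterConfig();
    try (Directory directory = new RAMDirectory();
         IndexWriter writer = new IndexWriter(directory, config)) {
        for (int i = 0; i < categories.length; i++) {
            Document doc = new Document();
            // StringField indexes the whole value as a single, untokenized term
            doc.add(new StringField("category", categories[i], Field.Store.YES));
            writer.addDocument(doc);
        }
        // Open a near-real-time reader on the writer so the new documents are visible
        try (IndexReader reader = DirectoryReader.open(writer)) {
            PrefixQuery query = new PrefixQuery(new Term("category", "/Computers"));
            IndexSearcher searcher = new IndexSearcher(reader);
            ScoreDoc[] hits = searcher.search(query, 1000).scoreDocs;
            System.out.println(Arrays.toString(hits));
        }
    }
}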
Output:
[doc=0 score=1.0 shardIndex=0, doc=1 score=1.0 shardIndex=0, doc=2 score=1.0 shardIndex=0]
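One last note: if you are interested in Lucene's code-level APIs, study the unit tests in the Lucene source. The Lucene/Solr source tree contains a very rich set of unit tests with very thorough feature coverage, and they are well worth learning from. Software testing and quality assurance ought to be the software developer's own responsibility as owner.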