Avoid Lucene QueryParser Parse exception?
【Title】: Avoid Lucene QueryParser Parse exception? 【Posted】: 2011-05-30 05:46:20 【Question】: On line 3 I get exceptions such as "IOException: read past eof" and "LookaheadSuccess: Error in the application." Is there a way to avoid this? I hate having the debugger break and having to press Continue twice every time I run a search.
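Note that I only notice this when I tell Visual Studio to show me exceptions as they are thrown, even if they are caught. The exceptions never reach my code; I just see them being thrown, so the debugger breaks two (or three) times on every search. The application itself runs fine.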
var searcher = new IndexSearcher(directory, true);
var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, "all", analyzer);
var query = parser.Parse(text); // here
【Question comments】:
What does the value of text look like? It is impossible to say what might help here, but it sounds like a malformed query.
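【Answer 1】:
These are first-chance exceptions that are thrown and caught inside Lucene. You have configured Visual Studio to break on all exceptions, not only unhandled ones. Open the Exceptions dialog (Ctrl+Alt+E, IIRC) and change your settings.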
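To make "first-chance" concrete, here is a minimal sketch that is independent of Lucene. It assumes .NET Framework 4.0 or later, where the AppDomain.FirstChanceException event is available. The class and method names (FirstChanceDemo, TryReadPastEnd) are hypothetical stand-ins for library code that throws and catches internally; they are not Lucene.Net APIs.

```csharp
using System;
using System.IO;

public static class FirstChanceDemo
{
    // Number of exceptions raised in this AppDomain, whether or not they were handled later.
    public static int FirstChanceCount;

    // Hypothetical stand-in for library code that uses an exception internally.
    public static bool TryReadPastEnd()
    {
        try
        {
            throw new IOException("read past eof");
        }
        catch (IOException)
        {
            // Handled here, so the caller never sees the exception.
            return false;
        }
    }

    public static void Main()
    {
        // The runtime raises this event before any catch block runs. A debugger set to
        // "break when thrown" stops at the same moment.
        AppDomain.CurrentDomain.FirstChanceException += (sender, e) => FirstChanceCount++;

        bool result = TryReadPastEnd();
        Console.WriteLine("result: " + result + ", first-chance notifications: " + FirstChanceCount);
    }
}
```

The program completes normally even though an exception was raised, which matches what the question describes: the debugger breaks, yet the application runs fine.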
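【Comments】:
【Answer 2】: Lucene.NET (as of version 3.0.3) uses IOExceptions to manage several parts of the parser flow. This hurts performance (up to 90 ms on my development machine).
The good news is that the version currently in its source repository (http://lucenenet.apache.org/community.html) appears to have removed the specific exception that caused this. For me, at least, this improved performance considerably. Hope this helps.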
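The cost of using exceptions for control flow can be observed with a rough timing sketch like the one below. It does not use Lucene.NET at all; the names (EofSignalBenchmark, ReadOrThrow, TryRead) are hypothetical, and the numbers it prints depend heavily on the machine and on whether a debugger is attached, so treat it as an illustration and not as a measurement of Lucene.NET itself.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class EofSignalBenchmark
{
    // Signals end of input by throwing.
    private static char ReadOrThrow(string s, int i)
    {
        if (i >= s.Length) throw new IOException("read past eof");
        return s[i];
    }

    // Signals end of input through the return value instead.
    private static bool TryRead(string s, int i, out char c)
    {
        if (i >= s.Length) { c = '\0'; return false; }
        c = s[i];
        return true;
    }

    public static int CountEofWithExceptions(int iterations)
    {
        int eofs = 0;
        for (int n = 0; n < iterations; n++)
        {
            try { ReadOrThrow("", 0); }
            catch (IOException) { eofs++; }
        }
        return eofs;
    }

    public static int CountEofWithFlags(int iterations)
    {
        int eofs = 0;
        char c;
        for (int n = 0; n < iterations; n++)
        {
            if (!TryRead("", 0, out c)) eofs++;
        }
        return eofs;
    }

    public static void Main()
    {
        const int iterations = 10000;

        Stopwatch sw = Stopwatch.StartNew();
        CountEofWithExceptions(iterations);
        sw.Stop();
        Console.WriteLine("exceptions: " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        CountEofWithFlags(iterations);
        sw.Stop();
        Console.WriteLine("flags: " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Both loops detect the same number of end-of-input events; only the signaling mechanism differs.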
【Comments】:
【Answer 3】: A patch to QueryParser in Lucene 3.0.3 that avoids throwing the LookaheadSuccess exception:
--- a/src/core/QueryParser/QueryParser.cs
+++ b/src/core/QueryParser/QueryParser.cs
@@ -1708,16 +1708,13 @@ namespace Lucene.Net.QueryParsers
private bool Jj_2_1(int xla)
-
+
+ bool lookaheadSuccess = false;
jj_la = xla;
jj_lastpos = jj_scanpos = token;
try
- return !Jj_3_1();
-
- catch (LookaheadSuccess)
-
- return true;
+ return !Jj_3_1(out lookaheadSuccess);
finally
@@ -1725,29 +1722,31 @@ namespace Lucene.Net.QueryParsers
- private bool Jj_3R_2()
-
- if (jj_scan_token(TermToken)) return true;
- if (jj_scan_token(ColonToken)) return true;
+ private bool Jj_3R_2(out bool lookaheadSuccess)
+
+ if (jj_scan_token(TermToken, out lookaheadSuccess)) return true;
+ if (lookaheadSuccess) return false;
+ if (jj_scan_token(ColonToken, out lookaheadSuccess)) return true;
return false;
- private bool Jj_3_1()
+ private bool Jj_3_1(out bool lookaheadSuccess)
Token xsp;
xsp = jj_scanpos;
- if (Jj_3R_2())
+ if (Jj_3R_2(out lookaheadSuccess))
jj_scanpos = xsp;
- if (Jj_3R_3()) return true;
+ if (Jj_3R_3(out lookaheadSuccess)) return true;
return false;
- private bool Jj_3R_3()
-
- if (jj_scan_token(StarToken)) return true;
- if (jj_scan_token(ColonToken)) return true;
+ private bool Jj_3R_3(out bool lookaheadSuccess)
+
+ if (jj_scan_token(StarToken, out lookaheadSuccess)) return true;
+ if (lookaheadSuccess) return false;
+ if (jj_scan_token(ColonToken, out lookaheadSuccess)) return true;
return false;
@@ -1861,14 +1860,9 @@ namespace Lucene.Net.QueryParsers
throw GenerateParseException();
- [Serializable]
- private sealed class LookaheadSuccess : System.Exception
-
-
-
- private LookaheadSuccess jj_ls = new LookaheadSuccess();
- private bool jj_scan_token(int kind)
-
+ private bool jj_scan_token(int kind, out bool lookaheadSuccess)
+
+ lookaheadSuccess = false;
if (jj_scanpos == jj_lastpos)
jj_la--;
@@ -1896,8 +1890,8 @@ namespace Lucene.Net.QueryParsers
if (tok != null) Jj_add_error_token(kind, i);
- if (jj_scanpos.kind != kind) return true;
- if (jj_la == 0 && jj_scanpos == jj_lastpos) throw jj_ls;
+ if (jj_scanpos.kind != kind) return true;
+ if (jj_la == 0 && jj_scanpos == jj_lastpos) lookaheadSuccess = true;
return false;
@@ -2029,32 +2023,34 @@ namespace Lucene.Net.QueryParsers
private void Jj_rescan_token()
-
+
+ bool lookaheadSuccess = false;
jj_rescan = true;
for (int i = 0; i < 1; i++)
- try
+ JJCalls p = jj_2_rtns[i];
+ do
- JJCalls p = jj_2_rtns[i];
- do
+ if (p.gen > jj_gen)
- if (p.gen > jj_gen)
+ jj_la = p.arg;
+ jj_lastpos = jj_scanpos = p.first;
+ switch (i)
- jj_la = p.arg;
- jj_lastpos = jj_scanpos = p.first;
- switch (i)
-
- case 0:
- Jj_3_1();
- break;
-
+ case 0:
+ Jj_3_1(out lookaheadSuccess);
+ if (lookaheadSuccess)
+
+ goto Jj_rescan_token_after_while_label;
+
+ break;
- p = p.next;
- while (p != null);
-
- catch (LookaheadSuccess)
-
-
+
+ p = p.next;
+ while (p != null);
+
+ Jj_rescan_token_after_while_label:
+ lookaheadSuccess = false;
jj_rescan = false;
--
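The idea behind the patch above, replacing a sentinel exception with an out flag that each caller checks, can be sketched on a toy scanner. This is a simplified illustration and not the generated parser: LookaheadScanner, ScanToken, and LooksLikePair are hypothetical names, and the lookahead budget is modeled as a plain counter.

```csharp
using System;

public sealed class LookaheadScanner
{
    private readonly int[] kinds; // token kinds in input order
    private int pos;              // index of the next token to examine
    private int remaining;        // lookahead budget still available

    public LookaheadScanner(int[] kinds)
    {
        this.kinds = kinds;
    }

    // Returns true on a mismatch (the same convention as jj_scan_token in the patch).
    // lookaheadSuccess is set to true when the budget runs out without a mismatch,
    // which is the situation the original code reported by throwing LookaheadSuccess.
    private bool ScanToken(int kind, out bool lookaheadSuccess)
    {
        lookaheadSuccess = false;
        if (pos >= kinds.Length || kinds[pos] != kind) return true;
        pos++;
        remaining--;
        if (remaining == 0) lookaheadSuccess = true;
        return false;
    }

    // True when the input starts with `first` then `second`, examining at most `limit` tokens.
    public bool LooksLikePair(int first, int second, int limit)
    {
        pos = 0;
        remaining = limit;
        bool lookaheadSuccess;
        if (ScanToken(first, out lookaheadSuccess)) return false;
        if (lookaheadSuccess) return true; // budget exhausted: accept early, no exception needed
        if (ScanToken(second, out lookaheadSuccess)) return false;
        return true;
    }

    public static void Main()
    {
        LookaheadScanner scanner = new LookaheadScanner(new[] { 1, 2 });
        Console.WriteLine(scanner.LooksLikePair(1, 2, 2));
    }
}
```

The trade-off is the one visible in the patch: every call site has to test the flag right after the call, which is more verbose than a single catch block but never raises an exception on the normal path.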
A patch to QueryParser in Lucene 3.0.3 that avoids throwing a large number of System.IO.IOException exceptions:
CharStream.cs:
--- CharStream.cs
+++ CharStream.cs
@@ -44,6 +44,7 @@
/// implementing this interface. Can throw any java.io.IOException.
/// </summary>
char ReadChar();
+ char ReadChar(ref bool? systemIoException);
/// <summary> Returns the column position of the character last read.</summary>
/// <deprecated>
@@ -93,6 +94,7 @@
/// to this method to implement backup correctly.
/// </summary>
char BeginToken();
+ char BeginToken(ref bool? systemIoException);
/// <summary> Returns a string made up of characters from the marked token beginning
/// to the current buffer position. Implementations have the choice of returning
FastCharStream.cs:
--- FastCharStream.cs
+++ FastCharStream.cs
@@ -48,12 +48,35 @@
public char ReadChar()
+ bool? systemIoException = null;
if (bufferPosition >= bufferLength)
- Refill();
+
+ Refill(ref systemIoException);
+
+ return buffer[bufferPosition++];
+
+
+ public char ReadChar(ref bool? systemIoException)
+
+ if (bufferPosition >= bufferLength)
+
+ Refill(ref systemIoException);
+ // If using this Nullable as System.IO.IOException signal and is signaled.
+ if (systemIoException.HasValue && systemIoException.Value == true)
+
+ return '\0';
+
+
return buffer[bufferPosition++];
- private void Refill()
+ // You may ask to be signaled of a System.IO.IOException through the systemIoException parameter.
+ // Set it to false if you are interested, it will be set to true to signal a System.IO.IOException.
+ // Set it to null if you are not interested.
+ // This is used to avoid having a lot of System.IO.IOExceptions thrown while running the code under a debugger.
+ // Having a lot of exceptions thrown under a debugger causes the code to execute a lot more slowly.
+ // So use this if you are experimenting a lot of slow parsing at runtime under a debugger.
+ private void Refill(ref bool? systemIoException)
int newPosition = bufferLength - tokenStart;
@@ -86,7 +109,18 @@
int charsRead = input.Read(buffer, newPosition, buffer.Length - newPosition);
if (charsRead <= 0)
- throw new System.IO.IOException("read past eof");
+
+ // If interested in using this Nullable to signal a System.IO.IOException
+ if (systemIoException.HasValue)
+
+ systemIoException = true;
+ return;
+
+ else
+
+ throw new System.IO.IOException("read past eof");
+
+
else
bufferLength += charsRead;
@@ -96,6 +130,12 @@
tokenStart = bufferPosition;
return ReadChar();
+
+ public char BeginToken(ref bool? systemIoException)
+
+ tokenStart = bufferPosition;
+ return ReadChar(ref systemIoException);
+
public void Backup(int amount)
@@ -156,4 +196,4 @@
get return 1;
-
\ No newline at end of file
+
QueryParserTokenManager.cs:
--- QueryParserTokenManager.cs
+++ QueryParserTokenManager.cs
@@ -1341,9 +1341,16 @@
for (; ; )
+ bool? systemIoException = false;
try
- curChar = input_stream.BeginToken();
+ curChar = input_stream.BeginToken(ref systemIoException);
+ if (systemIoException != null && systemIoException.HasValue && systemIoException.Value == true)
+
+ jjmatchedKind = 0;
+ matchedToken = JjFillToken();
+ return matchedToken;
+
catch (System.IO.IOException)
@@ -1459,4 +1466,4 @@
while (start++ != end);
-
\ No newline at end of file
+
You can also use my GitHub version: https://github.com/franckspike/lucenenet.git
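The opt-in signaling used in the CharStream patch (a ref bool? that means "throw as before" when null and "set me to true, do not throw" when it has a value) can be sketched in isolation as follows. SignalingCharReader and CountChars are hypothetical names for illustration; the real patch applies the same idea to FastCharStream.Refill and QueryParserTokenManager.

```csharp
using System;
using System.IO;

public sealed class SignalingCharReader
{
    private readonly string input;
    private int position;

    public SignalingCharReader(string input)
    {
        this.input = input;
    }

    // Pass false to opt in to flag-based signaling: at end of input the flag is set
    // to true and '\0' is returned. Pass null to keep the original throwing behavior.
    public char ReadChar(ref bool? eofSignal)
    {
        if (position >= input.Length)
        {
            if (eofSignal.HasValue)
            {
                eofSignal = true;
                return '\0';
            }
            throw new IOException("read past eof");
        }
        return input[position++];
    }

    // Reads to the end of the text without raising any exception.
    public static int CountChars(string text)
    {
        SignalingCharReader reader = new SignalingCharReader(text);
        int count = 0;
        bool? eof = false; // opt in to the flag
        while (true)
        {
            reader.ReadChar(ref eof);
            if (eof == true) break;
            count++;
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountChars("all:lucene"));
    }
}
```

Keeping the null case means existing callers that rely on the IOException continue to work unchanged, which is presumably why the patch adds overloads instead of replacing ReadChar() and BeginToken().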
【Comments】: