Lucene Query Types

Posted 一颗小蚕豆 期待发芽


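1.3. Searching by term: TermQuery

Query query = null;

// TermQuery looks up the term exactly as given; the string is not analyzed
query=new TermQuery(new Term("name","word1 a and"));

hits=searcher.search(query);

System.out.println("Search "+query.toString()+" found " + hits.length() + " results");// prints: Search name:word1 a and found 0 results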

1.4. Boolean AND/OR search: BooleanQuery

1. AND: MUST with MUST

2. OR: SHOULD with SHOULD

3. A minus B (documents matching A but not B): MUST with MUST_NOT

Query query1=null;

Query query2=null;

BooleanQuery query=null;

query1=new TermQuery(new Term("name","word1"));

query2=new TermQuery(new Term("name","word2"));

query=new BooleanQuery();

query.add(query1,BooleanClause.Occur.MUST);

query.add(query2,BooleanClause.Occur.MUST_NOT);
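The three combinations can be pictured as set operations on the document IDs matched by each clause. The following is a minimal plain-Java sketch of that idea, not Lucene code; the class name `OccurSketch` and the sample document IDs are invented for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

public class OccurSketch {
    // MUST + MUST: documents matched by both clauses (intersection)
    static Set<Integer> mustMust(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // SHOULD + SHOULD: documents matched by either clause (union)
    static Set<Integer> shouldShould(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // MUST + MUST_NOT: documents matched by the first clause but not the second
    static Set<Integer> mustMustNot(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> word1 = Set.of(1, 2, 3); // hypothetical IDs of documents containing "word1"
        Set<Integer> word2 = Set.of(3, 4);    // hypothetical IDs of documents containing "word2"
        System.out.println(mustMust(word1, word2));     // [3]
        System.out.println(shouldShould(word1, word2)); // [1, 2, 3, 4]
        System.out.println(mustMustNot(word1, word2));  // [1, 2]
    }
}
```

The snippet above this sketch corresponds to the third case: word1 is required and word2 is excluded.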

1.5. Searching within a range: RangeQuery

Term beginTime=new Term("time","200001");

Term endTime=new Term("time","200005");

RangeQuery query=null;

query=new RangeQuery(beginTime,endTime,false);// false: the boundary values are excluded
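Because the bounds are terms, the comparison is a string comparison rather than a numeric one. The sketch below shows an exclusive and an inclusive range test in plain Java; `RangeSketch` and `inRange` are hypothetical names, and the code only approximates how term ranges behave.

```java
public class RangeSketch {
    // Lexicographic range test; `inclusive` mirrors the third constructor argument.
    static boolean inRange(String value, String lower, String upper, boolean inclusive) {
        int cmpLower = value.compareTo(lower);
        int cmpUpper = value.compareTo(upper);
        return inclusive
                ? cmpLower >= 0 && cmpUpper <= 0
                : cmpLower > 0 && cmpUpper < 0;
    }

    public static void main(String[] args) {
        System.out.println(inRange("200003", "200001", "200005", false));  // true
        System.out.println(inRange("200001", "200001", "200005", false));  // false: bound excluded
        System.out.println(inRange("200001", "200001", "200005", true));   // true: bound included
        System.out.println(inRange("2000019", "200001", "200005", false)); // true: string order, not numeric
    }
}
```

The last call is the reason to keep values such as dates at a fixed width: a longer string can fall inside the range even though it is not a value of the same kind.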

1.6. Prefix search: PrefixQuery

Term pre1=new Term("name","wor");

PrefixQuery query=null;

query = new PrefixQuery(pre1);

1.7. Phrase search: PhraseQuery

a) The default slop is 0

PhraseQuery query = new PhraseQuery();

query.add(new Term("bookname","钢"));

query.add(new Term("bookname","铁"));

Hits hits=searcher.search(query); // searches for the phrase "钢铁", not for "钢" and "铁" separately

b) Setting the slop (default 0)

PhraseQuery query = new PhraseQuery();

query.add(new Term("bookname","钢"));

query.add(new Term("bookname","铁"));

query.setSlop(1);

Hits hits=searcher.search(query);// matches "钢铁", or "钢*铁" with one character in between
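The effect of the slop can be sketched in plain Java as the number of tokens allowed between the two terms. This is a simplification that covers only the in-order case with two terms (Lucene's slop is defined more generally as a distance between term positions). `SlopSketch` is a hypothetical name, and the single letters stand in for the indexed characters.

```java
import java.util.List;

public class SlopSketch {
    // True if `second` follows `first` with at most `slop` tokens in between.
    static boolean matches(List<String> tokens, String first, String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) {
                continue;
            }
            // j - i - 1 is the number of tokens between position i and position j
            for (int j = i + 1; j < tokens.size() && j - i - 1 <= slop; j++) {
                if (tokens.get(j).equals(second)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches(List.of("a", "b"), "a", "b", 0));      // true
        System.out.println(matches(List.of("a", "x", "b"), "a", "b", 0)); // false
        System.out.println(matches(List.of("a", "x", "b"), "a", "b", 1)); // true
    }
}
```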

1.8. Multi-phrase search: MultiPhraseQuery

a)

MultiPhraseQuery query=new MultiPhraseQuery();

// First add the prefix of the phrases to search for

query.add(new Term("bookname","钢"));

// Build 3 Terms to serve as the phrase suffixes

Term t1=new Term("bookname","铁");

Term t2=new Term("bookname","和");

Term t3=new Term("bookname","要");

// Then add all the suffixes to the query; together with the prefix they form 3 phrases

query.add(new Term[]{t1,t2,t3});

Hits hits=searcher.search(query);

for(int i=0;i<hits.length();i++)

    System.out.println(hits.doc(i));

b)

MultiPhraseQuery query=new MultiPhraseQuery();

Term t1=new Term("bookname","钢");

Term t2 = new Term("bookname","和");

query.add(new Term[]{t1,t2});

query.add(new Term("bookname","铁"));

c)

MultiPhraseQuery query=new MultiPhraseQuery();

Term t1=new Term("bookname","钢");

Term t2 = new Term("bookname","和");

query.add(new Term[]{t1,t2});

query.add(new Term("bookname","铁"));

Term t3=new Term("bookname","是");

Term t4=new Term("bookname","战");

query.add(new Term[]{t3,t4});
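Each add() call contributes one position in the phrase, and an array of Terms offers alternatives for that position. Variant c) therefore covers 2 x 1 x 2 = 4 phrases. The plain-Java sketch below enumerates them; `MultiPhraseSketch` is a hypothetical name, and the romanized strings are stand-ins for the terms used in c).

```java
import java.util.ArrayList;
import java.util.List;

public class MultiPhraseSketch {
    // Expands the alternatives at each position into every concrete phrase they cover.
    static List<String> expand(List<List<String>> positions) {
        List<String> phrases = new ArrayList<>();
        phrases.add("");
        for (List<String> alternatives : positions) {
            List<String> next = new ArrayList<>();
            for (String prefix : phrases) {
                for (String term : alternatives) {
                    next.add(prefix.isEmpty() ? term : prefix + " " + term);
                }
            }
            phrases = next;
        }
        return phrases;
    }

    public static void main(String[] args) {
        List<List<String>> positions = List.of(
                List.of("gang", "he"),   // first add(): two alternatives
                List.of("tie"),          // second add(): a single term
                List.of("shi", "zhan")); // third add(): two alternatives
        System.out.println(expand(positions));
        // [gang tie shi, gang tie zhan, he tie shi, he tie zhan]
    }
}
```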

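1.9. Fuzzy search: FuzzyQuery

The algorithm used is the Levenshtein algorithm. When comparing two strings, it distinguishes three kinds of edit operations:

- insert one letter

- delete one letter

- change one letter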
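The three operations above define the Levenshtein (edit) distance: the smallest number of insertions, deletions and substitutions that turns one string into the other. A minimal plain-Java sketch of the standard dynamic-programming computation follows; `LevenshteinSketch` is a hypothetical name, and this is not the code Lucene itself runs.

```java
public class LevenshteinSketch {
    // Classic two-row dynamic-programming edit distance.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int change = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), change);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("work", "works")); // 1 (insert one letter)
        System.out.println(distance("work", "wok"));   // 1 (delete one letter)
        System.out.println(distance("work", "word"));  // 1 (change one letter)
    }
}
```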

FuzzyQuery query=new FuzzyQuery(new Term("content","work"));

 

public FuzzyQuery(Term term)

public FuzzyQuery(Term term,float minimumSimilarity)throws IllegalArgumentException

public FuzzyQuery(Term term,float minimumSimilarity,int prefixLength)throws IllegalArgumentException

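minimumSimilarity is the minimum similarity: the smaller the value, the more documents match. The default is 0.5, and the value must be less than 1.0.

FuzzyQuery query=new FuzzyQuery(new Term("content","work"),0.1f);

prefixLength is the number of leading characters that must match exactly.

FuzzyQuery query=new FuzzyQuery(new Term("content","work"),0.1f,1);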

1.10. Wildcard search: WildcardQuery

* matches zero or more characters

? matches exactly one character

WildcardQuery query=new WildcardQuery(new Term("content","?qq*"));
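To make the two wildcards concrete, the sketch below translates a wildcard pattern into a java.util.regex pattern and tests whole strings against it. It is a plain-Java illustration under the assumption that the pattern must cover the entire term; `WildcardSketch` is a hypothetical name, not a Lucene class.

```java
import java.util.regex.Pattern;

public class WildcardSketch {
    // '*' becomes ".*", '?' becomes ".", every other character is matched literally.
    static boolean matches(String wildcard, String text) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("?qq*", "aqq"));     // true: '?' = "a", '*' = ""
        System.out.println(matches("?qq*", "aqqzone")); // true: '*' = "zone"
        System.out.println(matches("?qq*", "qq"));      // false: '?' needs one character
    }
}
```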

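1.11. Span search

1.11.1. SpanTermQuery

Has the same effect as TermQuery

SpanTermQuery query=new SpanTermQuery(new Term("content","abc"));

1.11.2. SpanFirstQuery

Searches for the specified term within a fixed width, starting from the beginning of the Field's content

SpanFirstQuery query=new SpanFirstQuery(new SpanTermQuery(new Term("content","abc")),3);// 3 counts words (term positions), not bytes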
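As a rough picture of the width argument, the plain-Java sketch below checks whether a term occurs among the first `end` tokens of a field. It assumes a single-term span and simple whitespace-style tokens; `SpanFirstSketch` and `withinFirst` are hypothetical names.

```java
import java.util.List;

public class SpanFirstSketch {
    // True if `term` occurs among the first `end` tokens of the field.
    static boolean withinFirst(List<String> tokens, String term, int end) {
        int limit = Math.max(0, Math.min(end, tokens.size()));
        return tokens.subList(0, limit).contains(term);
    }

    public static void main(String[] args) {
        System.out.println(withinFirst(List.of("xx", "yy", "abc", "zz"), "abc", 3)); // true: third word
        System.out.println(withinFirst(List.of("xx", "yy", "zz", "abc"), "abc", 3)); // false: fourth word
    }
}
```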

1.11.3. SpanNearQuery

SpanNearQuery is roughly equivalent to PhraseQuery

SpanTermQuery people=new SpanTermQuery(new Term("content","mary"));

SpanTermQuery how=new SpanTermQuery(new Term("content","poor"));

SpanNearQuery query=new SpanNearQuery(new SpanQuery[]{people,how},3,false);

1.11.4. SpanOrQuery

Combines the results of all the given SpanQuery objects

SpanTermQuery s1=new SpanTermQuery(new Term("content","aa"));

SpanTermQuery s2=new SpanTermQuery(new Term("content","cc"));

SpanTermQuery s3=new SpanTermQuery(new Term("content","gg"));

SpanTermQuery s4=new SpanTermQuery(new Term("content","kk"));

SpanNearQuery query1=new SpanNearQuery(new SpanQuery[]{s1,s2},1,false);

SpanNearQuery query2=new SpanNearQuery(new SpanQuery[]{s3,s4},3,false);

SpanOrQuery query=new SpanOrQuery(new SpanQuery[]{query1,query2});

1.11.5. SpanNotQuery

Removes the results of the second SpanQuery from the results of the first SpanQuery

SpanTermQuery s1=new SpanTermQuery(new Term("content","aa"));

SpanFirstQuery query1=new SpanFirstQuery(s1,3);

 

SpanTermQuery s3=new SpanTermQuery(new Term("content","gg"));

SpanTermQuery s4=new SpanTermQuery(new Term("content","kk"));

SpanNearQuery query2=new SpanNearQuery(new SpanQuery[]{s3,s4},4,false);

 

SpanNotQuery query=new SpanNotQuery(query1,query2);

1.12. RegexQuery: regular-expression queries

String regex="http://[a-z]{1,3}\\.abc\\.com/.*";

RegexQuery query=new RegexQuery(new Term("url",regex));
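The same pattern can be tried out with java.util.regex to see which URLs it accepts. This sketch assumes the expression has to match the whole term, which may not hold for every RegexQuery configuration; `RegexSketch` and the sample URLs are made up for illustration.

```java
import java.util.regex.Pattern;

public class RegexSketch {
    // Same expression as above: one to three lowercase letters before ".abc.com/"
    static final Pattern URL = Pattern.compile("http://[a-z]{1,3}\\.abc\\.com/.*");

    static boolean matches(String url) {
        return URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("http://www.abc.com/index.html")); // true
        System.out.println(matches("http://news.abc.com/"));          // false: "news" has four letters
        System.out.println(matches("http://www.abcd.com/"));          // false: host is not abc.com
    }
}
```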

 

Reposted from: http://www.blogjava.net/fanyingjie/archive/2010/06/21/324038.html
