Refactoring with Solr
Posted RuiKing2010
1. About Solr

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr Features: Solr is a standalone enterprise search server with a REST-like API. You put documents in it (called "indexing") via JSON, XML, CSV or binary over HTTP. You query it via HTTP GET and receive JSON, XML, CSV or binary results.

2. Solr Setup

Software Download
Java: You will need the Java Runtime Environment (JRE) version 1.7 or higher.
Tomcat: The server used to deploy the project (another server may also be used).

Solr Setup Steps
Step 1
Step 2: Copy solr\server\webapps\solr.war to tomcat\webapps.
Step 3: Run tomcat startup.bat (Tomcat will automatically unpack solr.war).
Step 4: Delete tomcat\webapps\solr.war (if not, Tomcat will publish Solr every time the server starts up).
Step 5: Add the above code in the <web-app /> node.
Step 6: Copy all files under solr\example\example-DIH\solr to a local path named $solrHome.
Step 7: Copy solr\dist\*.jar to tomcat\webapps\solr\WEB-INF\lib.
Step 8: Start Tomcat and access http://localhost:8080/solr/. If the deployment succeeded, the Solr admin page is displayed.

3. Schema.xml

Schema.xml is usually the first file you configure when setting up a new Solr installation. The schema declares:
- What kinds of fields there are
- Which field should be used as the unique/primary key
- Which fields are required
- How to index and search each field
Field Types
Fields
The documentation provides a list of valid attributes:
- name: mandatory - the name for the field
- type: mandatory - the name of a previously defined type from the <types> section
- indexed: true if this field should be indexed (searchable or sortable)
- stored: true if this field should be retrievable
- compressed: [false] if this field should be stored using gzip compression (this will only apply if the field type is compressible; among the standard field types, only TextField and StrField are)
- multiValued: true if this field may contain multiple values per document
- omitNorms: (expert) set to true to omit the norms associated with this field (this disables length normalization and index-time boosting for the field, and saves some memory). Only full-text fields or fields that need an index-time boost need norms.
- termVectors: [false] set to true to store the term vector for a given field. When using MoreLikeThis, fields used for similarity should be stored for best performance.
- termPositions: store position information with the term vector. This will increase storage costs.
- termOffsets: store offset information with the term vector. This will increase storage costs.
- default: a value that should be used if no value is specified when adding a document.

Misc
- <uniqueKey>: equivalent to the primary key of the document.
- defaultOperator (on <solrQueryParser>): used for determining if multiple terms are ANDed or ORed together by default.

4. Solrconfig.xml

Solrconfig.xml is usually the second file you configure when setting up a new Solr installation, after schema.xml.
The more commonly used elements in solrconfig.xml are:
- data directory location
- cache parameters
- request handlers: Request handlers are responsible for accepting HTTP requests, performing searches, then returning the results.
- search components: Search components extend the abstract class SearchComponent and are responsible for performing the actual searches.

5. SolrJ

Setting up the classpath
From /dist:
- apache-solr-solrj-*.jar
From /dist/solrj-lib:
- commons-codec-1.3.jar
- commons-httpclient-3.1.jar
- commons-io-1.4.jar
- jcl-over-slf4j-1.5.5.jar
- slf4j-api-1.5.5.jar
From /lib:
- slf4j-jdk14-1.5.5.jar
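
SolrJ wraps the same HTTP interface that the request handlers expose, so it can help to see what a search request looks like on the wire. The following is a minimal sketch using only the JDK. It assumes the standard /select request handler and the wt response-format parameter. The class name SelectUrlSketch, its methods, and the core name "db" in the sample URL are hypothetical and not taken from the setup above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SelectUrlSketch {

    // Percent-encodes a single query-string value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every JVM, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Builds the HTTP GET URL for a search.
    // coreUrl: base URL of the Solr core (assumption: depends on your deployment)
    // query:   the query string passed as the q parameter
    // format:  the response format passed as the wt parameter, e.g. "json" or "xml"
    static String selectUrl(String coreUrl, String query, String format) {
        return coreUrl + "/select?q=" + encode(query) + "&wt=" + encode(format);
    }

    public static void main(String[] args) {
        // "db" is a hypothetical core name used only for illustration.
        System.out.println(selectUrl("http://localhost:8080/solr/db", "title:java", "json"));
    }
}
```

Pasting the printed URL into a browser is a quick way to check that a request handler responds before adding SolrJ to the project.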
Boosts

In addition to the scoring factors mentioned above, the primary method of modifying document scores is by boosting. There are 2 kinds of boosts: index-time and query-time boosts.
- Index-time boosts are applied when adding documents, and apply to the entire document or to specific fields.
- Query-time boosts are applied when constructing a search query, and apply to specific fields.

Query boosts are applied by appending the caret character ^ followed by a positive number to query clauses:

title:foo OR (title:foo AND title:bar)^2.0 OR title:"foo bar"^10

Negative boosts

Whilst Lucene allows negative boosts, Solr does not. The only way to meaningfully perform a negative boost is by applying a positive boost to a negative query. For example:

(*:* -title:foo)^2.0

This boosts all documents which don't have "foo" in the title by 2.0, thereby effectively applying a down boost to documents which do.

We mainly apply boosts at index time, when adding documents. SolrJ offers two ways to do this: add a boost to a field in the Solr document, or add a boost to the Solr document itself.
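
The query-time syntax described above can be assembled with plain string handling. This is a minimal sketch, not part of SolrJ: the class BoostQuerySketch and its methods are hypothetical helpers, and the code only builds query strings without sending them anywhere.

```java
class BoostQuerySketch {

    // Wraps a clause in parentheses and appends ^factor.
    // Rejects non-positive factors, since the boost must be a positive number.
    static String boost(String clause, double factor) {
        if (factor <= 0) {
            throw new IllegalArgumentException("boost must be positive: " + factor);
        }
        return "(" + clause + ")^" + factor;
    }

    // Emulates a down boost for documents matching the clause by applying
    // a positive boost to every document that does NOT match it.
    static String downBoost(String clause, double factor) {
        return boost("*:* -" + clause, factor);
    }

    // Joins clauses with OR.
    static String anyOf(String... clauses) {
        StringBuilder joined = new StringBuilder();
        for (int i = 0; i < clauses.length; i++) {
            if (i > 0) {
                joined.append(" OR ");
            }
            joined.append(clauses[i]);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(anyOf(
                "title:foo",
                boost("title:foo AND title:bar", 2.0),
                boost("title:\"foo bar\"", 10.0)));
        System.out.println(downBoost("title:foo", 2.0));
    }
}
```

Validating the factor in one place keeps a negative value from reaching Solr, where it would be rejected.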
For example, there are three people, and all of them have the title "I can play java". We add a boost of 1 to the first person's title field, 2 to the second's, and 3 to the third's. Searching for the keyword "java" then returns the three people ranked by these boosts.
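
The effect of the three boosts can be illustrated with a toy model. This sketch does not use Solr or reproduce Lucene's scoring formula. It assumes that identical titles yield identical base scores, so the index-time boost alone decides the order. The class BoostRankingSketch and the names person1 to person3 are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class BoostRankingSketch {

    // One indexed person with a boost on the title field.
    static final class Person {
        final String name;
        final String title;
        final float titleBoost;

        Person(String name, String title, float titleBoost) {
            this.name = name;
            this.title = title;
            this.titleBoost = titleBoost;
        }
    }

    // The three people from the example: same title, boosts 1, 2 and 3.
    static List<Person> samplePeople() {
        List<Person> people = new ArrayList<Person>();
        people.add(new Person("person1", "I can play java", 1f));
        people.add(new Person("person2", "I can play java", 2f));
        people.add(new Person("person3", "I can play java", 3f));
        return people;
    }

    // Returns the names of people whose title contains the keyword,
    // ordered from the highest title boost to the lowest.
    static List<String> rank(List<Person> people, String keyword) {
        List<Person> hits = new ArrayList<Person>();
        for (Person p : people) {
            if (p.title.contains(keyword)) {
                hits.add(p);
            }
        }
        Collections.sort(hits, new Comparator<Person>() {
            @Override
            public int compare(Person a, Person b) {
                return Float.compare(b.titleBoost, a.titleBoost);
            }
        });
        List<String> names = new ArrayList<String>();
        for (Person p : hits) {
            names.add(p.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(rank(samplePeople(), "java"));
    }
}
```

In a real index the boost is one factor among several in the score, so documents with different text may not order strictly by boost.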