Solr Tutorial, Part 5 (Interview + Work)

Posted by Java帮帮


Appendix 2: solrconfig.xml

<?xml version="1.0" encoding="UTF-8" ?>

<config>

  <luceneMatchVersion>LUCENE_42</luceneMatchVersion>

  <lib dir="../../../lib" regex=".*\.jar" />

  <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />

  <lib dir="../../../dist/" regex="solr-cell-\d.*\.jar" />

  <lib dir="../../../contrib/clustering/lib/" regex=".*\.jar" />

  <lib dir="../../../dist/" regex="solr-clustering-\d.*\.jar" />

  <lib dir="../../../contrib/langid/lib/" regex=".*\.jar" />

  <lib dir="../../../dist/" regex="solr-langid-\d.*\.jar" />

  <lib dir="../../../contrib/velocity/lib" regex=".*\.jar" />

  <lib dir="../../../dist/" regex="solr-velocity-\d.*\.jar" />

  <lib dir="/total/crap/dir/ignored" />

  <dataDir>${solr.data.dir:}</dataDir>

  <directoryFactory name="DirectoryFactory"

                    class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>

  <codecFactory class="solr.SchemaCodecFactory"/>

  <indexConfig>

    <!-- maxFieldLength was removed in 4.0. To get similar behavior, include a

         LimitTokenCountFilterFactory in your fieldType definition. E.g.

     <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>

    -->
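
    <!-- Illustrative sketch, not taken from the original file: the filter shown
         above belongs in schema.xml, inside the analyzer of a fieldType. The
         fieldType name "text_limited" and the tokenizer choice are assumptions
         made for this example.

         <fieldType name="text_limited" class="solr.TextField">
           <analyzer>
             <tokenizer class="solr.StandardTokenizerFactory"/>
             <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
           </analyzer>
         </fieldType>
    -->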

    <!-- Maximum time to wait for a write lock (ms) for an IndexWriter. Default: 1000 -->

    <!-- <writeLockTimeout>1000</writeLockTimeout>  -->

    <!-- The maximum number of simultaneous threads that may be

         indexing documents at once in IndexWriter; if more than this

         many threads arrive they will wait for others to finish.

         Default in Solr/Lucene is 8. -->

    <!-- <maxIndexingThreads>8</maxIndexingThreads>  -->


    <!-- Expert: Enabling compound file will use fewer files for the index,

         using fewer file descriptors at the expense of performance.

         Default in Lucene is "true". Default in Solr is "false" (since 3.6) -->

    <!-- <useCompoundFile>false</useCompoundFile> -->


    <!-- ramBufferSizeMB sets the amount of RAM that may be used by Lucene

         indexing for buffering added documents and deletions before they are

         flushed to the Directory.

         maxBufferedDocs sets a limit on the number of documents buffered

         before flushing.

         If both ramBufferSizeMB and maxBufferedDocs are set, then

         Lucene will flush based on whichever limit is hit first.  -->

   <ramBufferSizeMB>100</ramBufferSizeMB>

    <maxBufferedDocs>1000</maxBufferedDocs>


    <!-- Expert: Merge Policy

         The Merge Policy in Lucene controls how merging of segments is done.

         The default since Solr/Lucene 3.3 is TieredMergePolicy.

         The default since Lucene 2.3 was the LogByteSizeMergePolicy,

         Even older versions of Lucene used LogDocMergePolicy.

       

    

        <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">

          <int name="maxMergeAtOnce">100</int>

          <int name="segmentsPerTier">100</int>

        </mergePolicy>

      -->

    

    <!-- Merge Factor

         The merge factor controls how many segments will get merged at a time.

         For TieredMergePolicy, mergeFactor is a convenience parameter which

         will set both MaxMergeAtOnce and SegmentsPerTier at once.

         For LogByteSizeMergePolicy, mergeFactor decides how many new segments

         will be allowed before they are merged into one.

         Default is 10 for both merge policies.


     -->

    <mergeFactor>50</mergeFactor>
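
    <!-- Illustrative sketch: going by the description above, when
         TieredMergePolicy is in use a mergeFactor of 50 should correspond
         roughly to the explicit settings below. This is an assumed
         equivalent form, not part of the original configuration.

         <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
           <int name="maxMergeAtOnce">50</int>
           <int name="segmentsPerTier">50</int>
         </mergePolicy>
    -->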

 


    <!-- Expert: Merge Scheduler

         The Merge Scheduler in Lucene controls how merges are

         performed.  The ConcurrentMergeScheduler (Lucene 2.3 default)

         can perform merges in the background using separate threads.

         The SerialMergeScheduler (Lucene 2.2 default) does not.

     -->

    <!--

       <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>

       -->


    <!-- LockFactory


         This option specifies which Lucene LockFactory implementation

         to use.

     

         single = SingleInstanceLockFactory - suggested for a

                  read-only index or when there is no possibility of

                  another process trying to modify the index.

         native = NativeFSLockFactory - uses OS native file locking.

                  Do not use when multiple solr webapps in the same

                  JVM are attempting to share a single index.

         simple = SimpleFSLockFactory  - uses a plain file for locking


         Defaults: 'native' is default for Solr 3.6 and later, otherwise

                   'simple' is the default


         More details on the nuances of each LockFactory...

         http://wiki.apache.org/lucene-java/AvailableLockFactories

    -->

    <lockType>${solr.lock.type:native}</lockType>


    <!-- Unlock On Startup


         If true, unlock any held write or commit locks on startup.

         This defeats the locking mechanism that allows multiple

         processes to safely access a lucene index, and should be used

         with care. Default is "false".


         This is not needed if lock type is 'single'

     -->

    <!--

    <unlockOnStartup>false</unlockOnStartup>

      -->

   

    <!-- Expert: Controls how often Lucene loads terms into memory

         Default is 128 and is likely good for most everyone.

      -->

    <!-- <termIndexInterval>128</termIndexInterval> -->


    <!-- If true, IndexReaders will be reopened (often more efficient)

         instead of closed and then opened. Default: true

      -->

    <!--

    <reopenReaders>true</reopenReaders>

      -->


    <!-- Commit Deletion Policy

         Custom deletion policies can be specified here. The class must

         implement org.apache.lucene.index.IndexDeletionPolicy.


         The default Solr IndexDeletionPolicy implementation supports

         deleting index commit points on number of commits, age of

         commit point and optimized status.

        

         The latest commit point should always be preserved regardless

         of the criteria.

    -->

    <!--

    <deletionPolicy class="solr.SolrDeletionPolicy">

    -->

      <!-- The number of commit points to be kept -->

      <!-- <str name="maxCommitsToKeep">1</str> -->

      <!-- The number of optimized commit points to be kept -->

      <!-- <str name="maxOptimizedCommitsToKeep">0</str> -->

      <!--

          Delete all commit points once they have reached the given age.

          Supports DateMathParser syntax e.g.

        -->

      <!--

         <str name="maxCommitAge">30MINUTES</str>

         <str name="maxCommitAge">1DAY</str>

      -->

    <!--

    </deletionPolicy>

    -->
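
    <!-- Illustrative sketch: the commented-out pieces above, assembled into
         one block as they might appear if enabled. The values are the ones
         shown above; choosing 1DAY for maxCommitAge is an arbitrary pick
         between the two examples given.

         <deletionPolicy class="solr.SolrDeletionPolicy">
           <str name="maxCommitsToKeep">1</str>
           <str name="maxOptimizedCommitsToKeep">0</str>
           <str name="maxCommitAge">1DAY</str>
         </deletionPolicy>
    -->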


    <!-- Lucene Infostream

      

         To aid in advanced debugging, Lucene provides an "InfoStream"

         of detailed information when indexing.


         Setting the value to true will instruct the underlying Lucene

         IndexWriter to write its debugging info to the specified file.

      -->

     <!-- <infoStream file="INFOSTREAM.txt">false</infoStream> -->

  </indexConfig>


  <jmx />


  <updateHandler class="solr.DirectUpdateHandler2">


    <updateLog>

      <str name="dir">${solr.ulog.dir:}</str>

    </updateLog>


  

     <autoCommit>

          <maxDocs>1000</maxDocs>

       <maxTime>15000</maxTime>

       <openSearcher>false</openSearcher>

     </autoCommit>
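
     <!-- Illustrative sketch: because openSearcher is false above, these
          hard commits flush changes to disk but do not by themselves open
          a new searcher, so newly added documents are not yet visible to
          queries. A soft commit is one common companion setting; the
          1000 ms interval below is an assumed example value, not a
          recommendation.

          <autoSoftCommit>
            <maxTime>1000</maxTime>
          </autoSoftCommit>
     -->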


   

  </updateHandler>

 

 

  <query>

    <!-- Max Boolean Clauses


         Maximum number of clauses in each BooleanQuery,  an exception

         is thrown if exceeded.


         ** WARNING **

        

         This option actually modifies a global Lucene property that

         will affect all SolrCores.  If multiple solrconfig.xml files

         disagree on this property, the value at any given moment will

         be based on the last SolrCore to be initialized.

        

      -->

    <maxBooleanClauses>1024</maxBooleanClauses>



    <!-- Solr Internal Query Caches


         There are two implementations of cache available for Solr,

         LRUCache, based on a synchronized LinkedHashMap, and

         FastLRUCache, based on a ConcurrentHashMap. 


         FastLRUCache has faster gets and slower puts in single

         threaded operation and thus is generally faster than LRUCache

         when the hit ratio of the cache is high (> 75%), and may be

         faster under other scenarios on multi-cpu systems.

    -->


    <!-- Filter Cache


         Cache used by SolrIndexSearcher for filters (DocSets),

         unordered sets of *all* documents that match a query.  When a

         new searcher is opened, its caches may be prepopulated or

         "autowarmed" using data from caches in the old searcher.

         autowarmCount is the number of items to prepopulate.  For

         LRUCache, the autowarmed items will be the most recently

         accessed items.


         Parameters:

           class - the SolrCache implementation

               (LRUCache or FastLRUCache)

           size - the maximum number of entries in the cache

           initialSize - the initial capacity (number of entries) of

               the cache.  (see java.util.HashMap)

           autowarmCount - the number of entries to prepopulate from

               an old cache.

      -->

    <filterCache class="solr.FastLRUCache"

                 size="512"

                 initialSize="512"

                 autowarmCount="0"/>
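
    <!-- Illustrative sketch: the filterCache above disables autowarming
         (autowarmCount="0"). A variant that prepopulates entries from the
         old searcher's cache could look like this; the value 128 is an
         arbitrary example, not a recommendation.

         <filterCache class="solr.FastLRUCache"
                      size="512"
                      initialSize="512"
                      autowarmCount="128"/>
    -->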


    <!-- Query Result Cache

        

         Caches results of searches - ordered lists of document ids

         (DocList) based on a query, a sort, and the range of documents requested. 

      -->

    <queryResultCache class="solr.LRUCache"

                     size="512"

                     initialSize="512"

                     autowarmCount="0"/>

  

    <!-- Document Cache


         Caches Lucene Document objects (the stored fields for each

         document).  Since Lucene internal document ids are transient,

         this cache will not be autowarmed. 

      -->

    <documentCache class="solr.LRUCache"

                   size="512"

                   initialSize="512"

                   autowarmCount="0"/>

   

    <!-- Field Value Cache

        

         Cache used to hold field values that are quickly accessible

         by document id.  The fieldValueCache is created by default

         even if not configured here.

      -->

    <!--

       <fieldValueCache class="solr.FastLRUCache"

                        size="512"

                        autowarmCount="128"

                        showItems="32" />

      -->


    <!-- Custom Cache


         Example of a generic cache.  These caches may be accessed by

         name through SolrIndexSearcher.getCache(), cacheLookup(), and

         cacheInsert().  The purpose is to enable easy caching of

         user/application level data.  The regenerator argument should

         be specified as an implementation of solr.CacheRegenerator

         if autowarming is desired. 

      -->

    <!--

       <cache name="myUserCache"

              class="solr.LRUCache"

              size="4096"

              initialSize="1024"

              autowarmCount="1024"

              regenerator="com.mycompany.MyRegenerator"

              />

      -->



    <!-- Lazy Field Loading


         If true, stored fields that are not requested will be loaded

         lazily.  This can result in a significant speed improvement

         if the usual case is to not load all stored fields,

         especially if the skipped fields are large compressed text

         fields.

    -->

    <enableLazyFieldLoading>true</enableLazyFieldLoading>


   <!-- Use Filter For Sorted Query


        A possible optimization that attempts to use a filter to

        satisfy a search.  If the requested sort does not include

        score, then the filterCache will be checked for a filter

        matching the query. If found, the filter will be used as the

        source of document ids, and then the sort will be applied to

        that.


        For most situations, this will not be useful unless you

        frequently get the same search repeatedly with different sort

        options, and none of them ever use "score"

     -->

   <!--

      <useFilterForSortedQuery>true</useFilterForSortedQuery>

     -->


   <!-- Result Window Size


        An optimization for use with the queryResultCache.  When a search

        is requested, a superset of the requested number of document ids

        are collected.  For example, if a search for a particular query

        requests matching documents 10 through 19, and queryResultWindowSize is 50,

        then documents 0 through 49 will be collected and cached.  Any further

        requests in that range can be satisfied via the cache. 

     -->

   <queryResultWindowSize>20</queryResultWindowSize>


   <!-- Maximum number of documents to cache for any entry in the

        queryResultCache.

     -->

   <queryResultMaxDocsCached>200</queryResultMaxDocsCached>


   <!-- Query Related Event Listeners


        Various IndexSearcher related events can trigger Listeners to

        take actions.


        newSearcher - fired whenever a new searcher is being prepared

        and there is a current searcher handling requests (aka

        registered).  It can be used to prime certain caches to

        prevent long request times for certain requests.


        firstSearcher - fired whenever a new searcher is being

        prepared but there is no current registered searcher to handle

        requests or to gain autowarming data from.


       

     -->

    <!-- QuerySenderListener takes an array of NamedList and executes a

         local query request for each NamedList in sequence.

      -->

    <listener event="newSearcher" class="solr.QuerySenderListener">

      <arr name="queries">

        <!--

           <lst><str name="q">solr</str><str name="sort">price asc</str></lst>

           <lst><str name="q">rocks</str><str name="sort">weight asc</str></lst>

          -->

      </arr>

    </listener>

    <listener event="firstSearcher" class="solr.QuerySenderListener">

      <arr name="queries">

        <lst>

          <str name="q">static firstSearcher warming in solrconfig.xml</str>

        </lst>

      </arr>

    </listener>


   

       <str name="spellcheck.maxCollations">3</str>          

     </lst>


     <!-- append spellchecking to our list of components -->

     <arr name="last-components">

       <str>spellcheck</str>

     </arr>

  </requestHandler>



  <requestHandler name="/update" class="solr.UpdateRequestHandler">


  </requestHandler>

<requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler">

        <lst name="defaults">

         <str name="stream.contentType">application/json</str>

       </lst>

  </requestHandler>

  <requestHandler name="/update/csv" class="solr.CSVRequestHandler">

        <lst name="defaults">

         <str name="stream.contentType">application/csv</str>

       </lst>

  </requestHandler>


  <requestHandler name="/update/extract"

                  startup="lazy"

    <!-- Use Cold Searcher


         If a search request comes in and there is no current

         registered searcher, then immediately register the still

         warming searcher and use it.  If "false" then all requests

         will block until the first searcher is done warming.

      -->

    <useColdSearcher>false</useColdSearcher>


    To be continued in the next article.


