Solr: commit


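Solr has two kinds of commit: hard commit and soft commit.

Before describing hard commit, the transaction log (tlog) needs to be explained.

The tlog exists to keep data consistent (much like the redo log in Oracle) and to prevent data loss when the application shuts down abnormally.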

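Every update is appended to the tlog as it arrives; a hard commit then flushes the pending changes to the index on disk and starts a new tlog. After an abnormal shutdown, Solr replays on startup the tlog entries that had not yet been committed to the index. If the tlog holds a large number of uncommitted updates, this startup recovery can take a long time.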
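The role of the tlog can be illustrated with a toy model. The sketch below is not Solr's UpdateLog; the class `TlogSketch` and all of its methods are hypothetical names invented for illustration. It only shows why updates that were never hard committed survive a crash, and why recovery time grows with the number of uncommitted entries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a transaction log; hypothetical, not Solr's UpdateLog. */
class TlogSketch {
    // Survives a crash: the on-disk index and the on-disk tlog.
    private final Map<String, String> durableIndex = new LinkedHashMap<>();
    private final List<String[]> tlog = new ArrayList<>();
    // Lost on a crash: updates buffered in memory since the last hard commit.
    private final Map<String, String> pending = new LinkedHashMap<>();

    /** Every update is appended to the tlog before it is buffered. */
    void add(String id, String doc) {
        tlog.add(new String[] {id, doc});
        pending.put(id, doc);
    }

    /** Hard commit: flush buffered updates to the index and start a new tlog. */
    void hardCommit() {
        durableIndex.putAll(pending);
        pending.clear();
        tlog.clear();
    }

    /** Abnormal shutdown: in-memory state vanishes, durable state remains. */
    void crash() {
        pending.clear();
    }

    /** Startup recovery: replay every tlog entry; cost grows with tlog size. */
    int recover() {
        int replayed = 0;
        for (String[] entry : tlog) {
            durableIndex.put(entry[0], entry[1]);
            replayed++;
        }
        tlog.clear();
        return replayed;
    }

    int indexedCount() {
        return durableIndex.size();
    }

    public static void main(String[] args) {
        TlogSketch core = new TlogSketch();
        core.add("1", "a");
        core.hardCommit();
        core.add("2", "b");   // never hard committed
        core.add("3", "c");   // never hard committed
        core.crash();
        System.out.println("replayed=" + core.recover() + " indexed=" + core.indexedCount());
    }
}
```

In this model a frequent hard commit keeps the tlog short, so `recover()` has little to replay.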

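openSearcher: whether to open a new searcher once the commit has completed, so that the new data becomes searchable.

When a new searcher is opened, the caches of the old searcher (filterCache, queryResultCache, and so on) are invalidated and the new searcher is autowarmed.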

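Soft commit was introduced in Solr 4.0 and is the foundation of Solr's near real time (NRT) search.

A soft commit makes data visible to searches, whether or not that data has been persisted to the index on disk.

A soft commit always opens a new searcher: the caches of the old searcher (filterCache, queryResultCache, and so on) are invalidated and the new searcher is autowarmed.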

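With a large data set, autowarming can take a long time. Once autowarming takes longer than the soft commit interval (a newer searcher is requested before the previous one has finished being created), Solr keeps opening new searchers and system resources are eventually exhausted. Applications with large data sets should therefore make the soft commit interval as long as they can.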
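The pile-up described above is simple arithmetic. As a rough sketch, assuming a new searcher is opened at every soft commit interval and each one needs a fixed warming time, the number of searchers warming at once is the warming time divided by the interval, rounded up. The class name and the figures in `main` are illustrative assumptions, not measurements.

```java
/** Back-of-the-envelope estimate; class name and numbers are illustrative assumptions. */
class WarmingOverlap {
    /**
     * Searchers warming simultaneously when a new one is opened every
     * intervalMs and each needs warmMs to autowarm: ceil(warmMs / intervalMs).
     */
    static long concurrentWarming(long warmMs, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        if (warmMs <= 0) {
            return 0;
        }
        return (warmMs + intervalMs - 1) / intervalMs;
    }

    public static void main(String[] args) {
        // 30 s of autowarming with a 1 s soft commit interval: searchers pile up.
        System.out.println(concurrentWarming(30_000, 1_000));
        // The same warming time with a 60 s interval: at most one is warming.
        System.out.println(concurrentWarming(30_000, 60_000));
    }
}
```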

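Lengthen the soft commit interval substantially to avoid opening too many searchers.

To avoid a long startup recovery after an abnormal shutdown, make the hard commit interval as short as practical, for example 15 seconds, and set openSearcher to false.

Used together, the two settings preserve data integrity while keeping indexing and search fast.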
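These recommendations map onto the `autoCommit` and `autoSoftCommit` elements of solrconfig.xml. The sketch below embeds such a fragment in a Java string and reads it back with the JDK's XPath API, using the same paths that appear later in the listing comments (for example `updateHandler/autoCommit/maxTime`). The 15000 ms value comes from the text above; the 60000 ms soft commit interval, the class name `CommitConfigSketch`, and its helper methods are assumptions made for illustration, and this is not how Solr itself loads its configuration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Reads commit settings from an embedded solrconfig.xml-style fragment (hypothetical helper). */
class CommitConfigSketch {
    // 15000 ms comes from the text; 60000 ms is an illustrative assumption.
    static final String CONFIG =
          "<config>"
        + "  <updateHandler class=\"solr.DirectUpdateHandler2\">"
        + "    <autoCommit>"
        + "      <maxTime>15000</maxTime>"
        + "      <openSearcher>false</openSearcher>"
        + "    </autoCommit>"
        + "    <autoSoftCommit>"
        + "      <maxTime>60000</maxTime>"
        + "    </autoSoftCommit>"
        + "  </updateHandler>"
        + "</config>";

    /** Returns the text at /config/<path>, or def when the element is absent. */
    static String get(String path, String def) {
        try {
            String value = XPathFactory.newInstance().newXPath()
                .evaluate("/config/" + path, new InputSource(new StringReader(CONFIG)))
                .trim();
            return value.isEmpty() ? def : value;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("bad path: " + path, e);
        }
    }

    static int getInt(String path, int def) {
        return Integer.parseInt(get(path, String.valueOf(def)));
    }

    static boolean getBool(String path, boolean def) {
        return Boolean.parseBoolean(get(path, String.valueOf(def)));
    }

    public static void main(String[] args) {
        System.out.println("hard maxTime = " + getInt("updateHandler/autoCommit/maxTime", -1));
        System.out.println("hard maxDocs = " + getInt("updateHandler/autoCommit/maxDocs", -1));
        // The default passed here is an assumption of this sketch.
        System.out.println("openSearcher = " + getBool("updateHandler/autoCommit/openSearcher", true));
        System.out.println("soft maxTime = " + getInt("updateHandler/autoSoftCommit/maxTime", -1));
    }
}
```

An absent element such as `maxDocs` falls back to -1, which the listings below treat as "no bound".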

https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

http://opensourceconnections.com/blog/2013/04/25/understanding-solr-soft-commits-and-data-durability/

DirectUpdateHandler2 Solr commit

CommitTracker is the class behind both autoCommit and autoSoftCommit
public final class CommitTracker implements Runnable {

CommitTracker implements the Runnable interface and is initialized from the settings in solrconfig

int docsUpperBound = updateHandlerInfo.autoCommmitMaxDocs; // getInt("updateHandler/autoCommit/maxDocs", -1);
int timeUpperBound = updateHandlerInfo.autoCommmitMaxTime; // getInt("updateHandler/autoCommit/maxTime", -1);
commitTracker = new CommitTracker("Hard", core, docsUpperBound, timeUpperBound, updateHandlerInfo.openSearcher, false); // hard commit

int softCommitDocsUpperBound = updateHandlerInfo.autoSoftCommmitMaxDocs; // getInt("updateHandler/autoSoftCommit/maxDocs", -1);
int softCommitTimeUpperBound = updateHandlerInfo.autoSoftCommmitMaxTime; // getInt("updateHandler/autoSoftCommit/maxTime", -1);
softCommitTracker = new CommitTracker("Soft", core, softCommitDocsUpperBound, softCommitTimeUpperBound, true, true); // soft commit

The CommitTracker constructor

public CommitTracker(String name, SolrCore core, int docsUpperBound, int timeUpperBound, boolean openSearcher, boolean softCommit) {

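As the calls show, the soft commit tracker is always created with openSearcher=true and is flagged as a soft commit,

whereas openSearcher for the hard commit tracker is taken from the configuration.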
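How a tracker with a document bound and a time bound might decide when to commit can be sketched with the JDK's scheduler. This is a simplified model under stated assumptions, not Solr's actual CommitTracker: the class `CommitTrackerSketch` and its methods are hypothetical, it omits the SolrCore argument, and it commits inline when the document bound is reached.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Simplified model of an auto-commit tracker; not Solr's actual CommitTracker. */
class CommitTrackerSketch implements Runnable {
    private final String name;
    private final int docsUpperBound;    // -1 disables the document bound
    private final long timeUpperBound;   // milliseconds; -1 disables the time bound
    private final boolean openSearcher;
    private final boolean softCommit;
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> pending;
    private int docsSinceCommit;
    private int commitCount;

    CommitTrackerSketch(String name, int docsUpperBound, long timeUpperBound,
                        boolean openSearcher, boolean softCommit) {
        this.name = name;
        this.docsUpperBound = docsUpperBound;
        this.timeUpperBound = timeUpperBound;
        this.openSearcher = openSearcher;
        this.softCommit = softCommit;
    }

    /** Called once per added document, as an update handler would after addDoc. */
    synchronized void addedDocument() {
        docsSinceCommit++;
        if (docsUpperBound > 0 && docsSinceCommit >= docsUpperBound) {
            if (pending != null) {
                pending.cancel(false);
                pending = null;
            }
            run(); // simplification: commit inline instead of scheduling it
        } else if (timeUpperBound > 0 && pending == null) {
            pending = scheduler.schedule(this, timeUpperBound, TimeUnit.MILLISECONDS);
        }
    }

    /** Performs the pretend commit; invoked inline or by the scheduler thread. */
    @Override
    public synchronized void run() {
        pending = null;
        if (docsSinceCommit == 0) {
            return; // nothing to commit
        }
        docsSinceCommit = 0;
        commitCount++;
        System.out.println(name + " commit #" + commitCount
            + " softCommit=" + softCommit + " openSearcher=" + openSearcher);
    }

    synchronized int commitCount() { return commitCount; }

    synchronized int pendingDocs() { return docsSinceCommit; }

    void close() { scheduler.shutdownNow(); }

    public static void main(String[] args) {
        // Document bound of 3, no time bound, openSearcher=false as for a hard commit.
        CommitTrackerSketch hard = new CommitTrackerSketch("Hard", 3, -1, false, false);
        for (int i = 0; i < 7; i++) {
            hard.addedDocument();
        }
        System.out.println("commits=" + hard.commitCount() + " pending=" + hard.pendingDocs());
        hard.close();
    }
}
```

With a bound of three documents, seven additions trigger two commits and leave one document pending.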

 

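Following the flow further, DirectUpdateHandler2 checks whether a commit is due each time it executes addDoc, and the call chain eventually reaches the openNewSearcher method of SolrCore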

/** Opens a new searcher and returns a RefCounted<SolrIndexSearcher> with its reference incremented.
   *
   * "realtime" means that we need to open quickly for a realtime view of the index, hence don't do any
   * autowarming and add to the _realtimeSearchers queue rather than the _searchers queue (so it won't
   * be used for autowarming by a future normal searcher).  A "realtime" searcher will currently never
   * become "registered" (since it currently lacks caching).
   *
   * realtimeSearcher is updated to the latest opened searcher, regardless of the value of "realtime".
   *
   * This method acquires openSearcherLock - do not call with searchLock held!
   */
  public RefCounted<SolrIndexSearcher>  openNewSearcher(boolean updateHandlerReopens, boolean realtime) {
    if (isClosed()) { // catch some errors quicker
      throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "openNewSearcher called on closed core");
    }

    SolrIndexSearcher tmp;
    RefCounted<SolrIndexSearcher> newestSearcher = null;

    openSearcherLock.lock();
    try {
      String newIndexDir = getNewIndexDir();
      String indexDirFile = null;
      String newIndexDirFile = null;

      // if it's not a normal near-realtime update, check that paths haven't changed.
      if (!updateHandlerReopens) {
        indexDirFile = getDirectoryFactory().normalize(getIndexDir());
        newIndexDirFile = getDirectoryFactory().normalize(newIndexDir);
      }

      synchronized (searcherLock) {
        newestSearcher = realtimeSearcher;
        if (newestSearcher != null) {
          newestSearcher.incref();      // the matching decref is in the finally block
        }
      }

      if (newestSearcher != null && (updateHandlerReopens || indexDirFile.equals(newIndexDirFile))) {

        DirectoryReader newReader;
        DirectoryReader currentReader = newestSearcher.get().getRawReader(); // the rawReader held by the SolrIndexSearcher (an in-memory reader)

        // SolrCore.verbose("start reopen from",previousSearcher,"writer=",writer);

        RefCounted<IndexWriter> writer = getSolrCoreState().getIndexWriter(null);

        try {
          if (writer != null) {
            // if in NRT mode, open from the writer
            newReader = DirectoryReader.openIfChanged(currentReader, writer.get(), true); // implemented underneath by calling getReader on Lucene's IndexWriter
          } else {
            // verbose("start reopen without writer, reader=", currentReader);
            newReader = DirectoryReader.openIfChanged(currentReader);
            // verbose("reopen result", newReader);
          }
        } finally {
          if (writer != null) {
            writer.decref();
          }
        }

        if (newReader == null) { // the underlying index has not changed at all

          if (realtime) {
            // if this is a request for a realtime searcher, just return the same searcher
            newestSearcher.incref();
            return newestSearcher;

          } else if (newestSearcher.get().isCachingEnabled() && newestSearcher.get().getSchema() == getLatestSchema()) {
            // absolutely nothing has changed, can use the same searcher
            // but log a message about it to minimize confusion

            newestSearcher.incref();
            log.debug("SolrIndexSearcher has not changed - not re-opening: " + newestSearcher.get().getName());
            return newestSearcher;

          } // ELSE: open a new searcher against the old reader...
          currentReader.incRef();
          newReader = currentReader;
        }

        // for now, turn off caches if this is for a realtime reader 
        // (caches take a little while to instantiate)
        final boolean useCaches = !realtime;
        final String newName = realtime ? "realtime" : "main";
        tmp = new SolrIndexSearcher(this, newIndexDir, getLatestSchema(), newName,
                                    newReader, true, useCaches, true, directoryFactory);

      } else {
        // newestSearcher == null at this point

        if (newReaderCreator != null) {
          // this is set in the constructor if there is a currently open index writer
          // so that we pick up any uncommitted changes and so we don't go backwards
          // in time on a core reload
          DirectoryReader newReader = newReaderCreator.call();
          tmp = new SolrIndexSearcher(this, newIndexDir, getLatestSchema(),
              (realtime ? "realtime":"main"), newReader, true, !realtime, true, directoryFactory);
        } else  {
          RefCounted<IndexWriter> writer = getSolrCoreState().getIndexWriter(this);
          DirectoryReader newReader = null;
          try {
            newReader = indexReaderFactory.newReader(writer.get(), this);
          } finally {
            writer.decref();
          }
          tmp = new SolrIndexSearcher(this, newIndexDir, getLatestSchema(),
              (realtime ? "realtime":"main"), newReader, true, !realtime, true, directoryFactory);
        }
      }

      List<RefCounted<SolrIndexSearcher>> searcherList = realtime ? _realtimeSearchers : _searchers;
      RefCounted<SolrIndexSearcher> newSearcher = newHolder(tmp, searcherList);    // refcount now at 1

      // Increment reference again for "realtimeSearcher" variable.  It should be at 2 after.
      // When it's decremented by both the caller of this method, and by realtimeSearcher being replaced,
      // it will be closed.
      newSearcher.incref();

      synchronized (searcherLock) {
        // Check if the core is closed again inside the lock in case this method is racing with a close. If the core is
        // closed, clean up the new searcher and bail.
        if (isClosed()) {
          newSearcher.decref(); // once for caller since we're not returning it
          newSearcher.decref(); // once for ourselves since it won't be "replaced"
          throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "openNewSearcher called on closed core");
        }

        if (realtimeSearcher != null) {
          realtimeSearcher.decref();
        }
        realtimeSearcher = newSearcher;
        searcherList.add(realtimeSearcher);
      }

      return newSearcher;

    } catch (Exception e) {
      throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Error opening new searcher", e);
    }
    finally {
      openSearcherLock.unlock();
      if (newestSearcher != null) {
        newestSearcher.decref();
      }
    }
  }

In short, openNewSearcher reopens an in-memory reader from the current in-memory reader and the writer, which underneath calls getReader on Lucene's IndexWriter.
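The listing relies heavily on reference counting: the new searcher is counted once for the caller and once for the realtimeSearcher field, and it is closed only when both references are released. The following is a minimal sketch of that idea. `RefCountedSketch` is a hypothetical class written for illustration and is not Solr's RefCounted; the close action is passed in as a Runnable, which is an assumption of this sketch.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Minimal reference-counted holder; hypothetical, loosely modeled on the listing's usage. */
class RefCountedSketch<T> {
    private final T resource;
    private final Runnable onClose;
    private final AtomicInteger refcount = new AtomicInteger();

    RefCountedSketch(T resource, Runnable onClose) {
        this.resource = resource;
        this.onClose = onClose;
    }

    /** Registers one more holder of the resource. */
    RefCountedSketch<T> incref() {
        refcount.incrementAndGet();
        return this;
    }

    T get() {
        return resource;
    }

    /** Releases one holder; the resource is closed when the last one lets go. */
    void decref() {
        if (refcount.decrementAndGet() == 0) {
            onClose.run();
        }
    }

    int getRefcount() {
        return refcount.get();
    }

    public static void main(String[] args) {
        boolean[] closed = {false};
        RefCountedSketch<String> searcher =
            new RefCountedSketch<>("main", () -> closed[0] = true);
        searcher.incref();               // reference handed to the caller
        searcher.incref();               // reference held by the realtimeSearcher field
        searcher.decref();               // the caller is done with it
        System.out.println("closed after caller release: " + closed[0]);
        searcher.decref();               // realtimeSearcher is replaced by a newer searcher
        System.out.println("closed after replacement: " + closed[0]);
    }
}
```

This mirrors the comment in the listing that the count "should be at 2" after the second incref and that the searcher closes once both holders have decremented it.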

 
