Solr: commit
Solr has two kinds of commit: hard commit and soft commit. Before explaining hard commit, the transaction log (tlog) needs to be introduced.
The tlog exists to guarantee data consistency (similar to the redo log in Oracle) and to avoid data loss when the application is shut down abnormally.
Updates are written to the tlog as they are received, and a commit then makes those changes durable in the index. If the application is shut down abnormally, the system replays the tlog entries that have not yet been applied to the index when it starts up again. If the tlog contains a large amount of unapplied data, this recovery can make startup very slow.
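The tlog is enabled through the updateLog element in solrconfig.xml. A minimal sketch of the usual configuration (the directory property shown is the common default and is an assumption about your setup):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- enables the transaction log; un-applied entries are replayed on startup after a crash -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>
```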
For a hard commit, the openSearcher option controls whether a new searcher is opened once the commit completes, so that the newly committed data becomes searchable.
When a new searcher is opened, the caches of the old searcher (filterCache, queryResultCache, and so on) are invalidated and the new searcher is autowarmed.
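The caches that are discarded and re-warmed when a searcher is opened are declared in solrconfig.xml. The following is only an illustrative sketch; the cache classes and sizes are assumed example values, not recommendations from this article:

```xml
<query>
  <!-- autowarmCount entries are taken from the old searcher's cache and
       re-executed against the new searcher during autowarming -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>
</query>
```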
Soft commit is a feature introduced in Solr 4.0 and is the foundation of Solr's near real time (NRT) search.
A soft commit makes data visible to searches, regardless of whether it has yet been flushed durably into the index.
After a soft commit a new searcher is opened, the old searcher's caches (filterCache, queryResultCache, and so on) are invalidated, and the new searcher is autowarmed.
If the data volume is large, autowarming can take a long time. Once autowarming takes longer than the soft commit interval (a newer searcher is requested before the previous one has finished opening), searchers are opened faster than they can be retired and system resources are eventually exhausted. Applications with large data volumes should therefore make the soft commit interval as long as possible.
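To make the difference concrete, here is a minimal SolrJ sketch that issues both kinds of commit explicitly from a client. The core URL and field values are assumptions for illustration only:

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class CommitExample {
  public static void main(String[] args) throws Exception {
    // Core URL is a placeholder; adjust to your deployment.
    try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-1");
      client.add(doc); // written to the tlog immediately, but not yet searchable

      // soft commit: opens a new searcher so the document becomes visible,
      // without forcing the index segments to be fsync'ed to disk
      client.commit(true, true, true);   // waitFlush, waitSearcher, softCommit = true

      // hard commit: flushes the index to stable storage and rolls over the tlog
      client.commit(true, true, false);  // softCommit = false -> hard commit
    }
  }
}
```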
The recommended configuration is therefore:
Increase the soft commit interval substantially, to avoid opening too many searchers.
Keep the hard commit interval as short as practical, for example 15 seconds, so that recovery after an abnormal shutdown stays short, and set openSearcher to false.
Used together, the two settings guarantee data integrity without sacrificing speed.
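A sketch of this recommendation in solrconfig.xml terms. The 15-second hard commit interval comes from the text above; the 10-minute soft commit interval is only an assumed example and should be tuned against your autowarming cost:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- durability: hard commit often, but do not open a new searcher -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- visibility: soft commit much less frequently (assumed example value) -->
  <autoSoftCommit>
    <maxTime>600000</maxTime>
  </autoSoftCommit>
</updateHandler>
```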
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
http://opensourceconnections.com/blog/2013/04/25/understanding-solr-soft-commits-and-data-durability/
On the implementation side, DirectUpdateHandler2 is the update handler that performs Solr commits.
It builds one CommitTracker for autoCommit (hard commit) and one for autoSoftCommit.
CommitTracker implements the Runnable interface and is initialized from the solrconfig settings:

```java
public final class CommitTracker implements Runnable {
```

In DirectUpdateHandler2 the two trackers are constructed as follows:

```java
int docsUpperBound = updateHandlerInfo.autoCommmitMaxDocs;   // getInt("updateHandler/autoCommit/maxDocs", -1)
int timeUpperBound = updateHandlerInfo.autoCommmitMaxTime;   // getInt("updateHandler/autoCommit/maxTime", -1)
// hard commit tracker
commitTracker = new CommitTracker("Hard", core, docsUpperBound, timeUpperBound,
    updateHandlerInfo.openSearcher, false);

int softCommitDocsUpperBound = updateHandlerInfo.autoSoftCommmitMaxDocs; // getInt("updateHandler/autoSoftCommit/maxDocs", -1)
int softCommitTimeUpperBound = updateHandlerInfo.autoSoftCommmitMaxTime; // getInt("updateHandler/autoSoftCommit/maxTime", -1)
// soft commit tracker
softCommitTracker = new CommitTracker("Soft", core, softCommitDocsUpperBound,
    softCommitTimeUpperBound, true, true);
```
The CommitTracker constructor:

```java
public CommitTracker(String name, SolrCore core, int docsUpperBound, int timeUpperBound,
    boolean openSearcher, boolean softCommit) {
```

For the soft commit tracker, openSearcher is true and softCommit is true, so a soft commit always opens a new searcher, whereas the hard commit tracker's openSearcher flag is initialized from the configuration.
Following the flow further: when DirectUpdateHandler2 executes addDoc it checks these commit trackers, and the commit path eventually reaches SolrCore's openNewSearcher method:
```java
/** Opens a new searcher and returns a RefCounted<SolrIndexSearcher> with its reference incremented.
 *
 * "realtime" means that we need to open quickly for a realtime view of the index, hence don't do any
 * autowarming and add to the _realtimeSearchers queue rather than the _searchers queue (so it won't
 * be used for autowarming by a future normal searcher). A "realtime" searcher will currently never
 * become "registered" (since it currently lacks caching).
 *
 * realtimeSearcher is updated to the latest opened searcher, regardless of the value of "realtime".
 *
 * This method acquires openSearcherLock - do not call with searchLock held!
 */
public RefCounted<SolrIndexSearcher> openNewSearcher(boolean updateHandlerReopens, boolean realtime) {
  if (isClosed()) { // catch some errors quicker
    throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "openNewSearcher called on closed core");
  }

  SolrIndexSearcher tmp;
  RefCounted<SolrIndexSearcher> newestSearcher = null;

  openSearcherLock.lock();
  try {
    String newIndexDir = getNewIndexDir();
    String indexDirFile = null;
    String newIndexDirFile = null;

    // if it's not a normal near-realtime update, check that paths haven't changed.
    if (!updateHandlerReopens) {
      indexDirFile = getDirectoryFactory().normalize(getIndexDir());
      newIndexDirFile = getDirectoryFactory().normalize(newIndexDir);
    }

    synchronized (searcherLock) {
      newestSearcher = realtimeSearcher;
      if (newestSearcher != null) {
        newestSearcher.incref(); // the matching decref is in the finally block
      }
    }

    if (newestSearcher != null && (updateHandlerReopens || indexDirFile.equals(newIndexDirFile))) {
      DirectoryReader newReader;
      DirectoryReader currentReader = newestSearcher.get().getRawReader(); // the rawReader of the current SolrIndexSearcher (the in-memory/NRT reader)

      // SolrCore.verbose("start reopen from",previousSearcher,"writer=",writer);

      RefCounted<IndexWriter> writer = getSolrCoreState().getIndexWriter(null);

      try {
        if (writer != null) {
          // if in NRT mode, open from the writer
          newReader = DirectoryReader.openIfChanged(currentReader, writer.get(), true); // under the hood this uses Lucene IndexWriter's getReader
        } else {
          // verbose("start reopen without writer, reader=", currentReader);
          newReader = DirectoryReader.openIfChanged(currentReader);
          // verbose("reopen result", newReader);
        }
      } finally {
        if (writer != null) {
          writer.decref();
        }
      }

      if (newReader == null) { // the underlying index has not changed at all
        if (realtime) {
          // if this is a request for a realtime searcher, just return the same searcher
          newestSearcher.incref();
          return newestSearcher;
        } else if (newestSearcher.get().isCachingEnabled() && newestSearcher.get().getSchema() == getLatestSchema()) {
          // absolutely nothing has changed, can use the same searcher
          // but log a message about it to minimize confusion
          newestSearcher.incref();
          log.debug("SolrIndexSearcher has not changed - not re-opening: " + newestSearcher.get().getName());
          return newestSearcher;
        }
        // ELSE: open a new searcher against the old reader...
        currentReader.incRef();
        newReader = currentReader;
      }

      // for now, turn off caches if this is for a realtime reader
      // (caches take a little while to instantiate)
      final boolean useCaches = !realtime;
      final String newName = realtime ? "realtime" : "main";
      tmp = new SolrIndexSearcher(this, newIndexDir, getLatestSchema(), newName, newReader, true, useCaches, true, directoryFactory);

    } else {
      // newestSearcher == null at this point

      if (newReaderCreator != null) {
        // this is set in the constructor if there is a currently open index writer
        // so that we pick up any uncommitted changes and so we don't go backwards
        // in time on a core reload
        DirectoryReader newReader = newReaderCreator.call();
        tmp = new SolrIndexSearcher(this, newIndexDir, getLatestSchema(), (realtime ? "realtime" : "main"), newReader, true, !realtime, true, directoryFactory);
      } else {
        RefCounted<IndexWriter> writer = getSolrCoreState().getIndexWriter(this);
        DirectoryReader newReader = null;
        try {
          newReader = indexReaderFactory.newReader(writer.get(), this);
        } finally {
          writer.decref();
        }
        tmp = new SolrIndexSearcher(this, newIndexDir, getLatestSchema(), (realtime ? "realtime" : "main"), newReader, true, !realtime, true, directoryFactory);
      }
    }

    List<RefCounted<SolrIndexSearcher>> searcherList = realtime ? _realtimeSearchers : _searchers;
    RefCounted<SolrIndexSearcher> newSearcher = newHolder(tmp, searcherList); // refcount now at 1

    // Increment reference again for "realtimeSearcher" variable. It should be at 2 after.
    // When it's decremented by both the caller of this method, and by realtimeSearcher being replaced,
    // it will be closed.
    newSearcher.incref();

    synchronized (searcherLock) {
      // Check if the core is closed again inside the lock in case this method is racing with a close. If the core is
      // closed, clean up the new searcher and bail.
      if (isClosed()) {
        newSearcher.decref(); // once for caller since we're not returning it
        newSearcher.decref(); // once for ourselves since it won't be "replaced"
        throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "openNewSearcher called on closed core");
      }

      if (realtimeSearcher != null) {
        realtimeSearcher.decref();
      }
      realtimeSearcher = newSearcher;
      searcherList.add(realtimeSearcher);
    }

    return newSearcher;

  } catch (Exception e) {
    throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Error opening new searcher", e);
  } finally {
    openSearcherLock.unlock();
    if (newestSearcher != null) {
      newestSearcher.decref();
    }
  }
}
```
In short, a new in-memory reader is reopened from the current reader and the index writer; under the hood this relies on Lucene's IndexWriter getReader mechanism, invoked here through DirectoryReader.openIfChanged.
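For reference, the NRT reopen that openNewSearcher relies on can be reproduced with plain Lucene. This is a minimal standalone sketch, not Solr code; the index path and analyzer are arbitrary assumptions:

```java
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class NrtReopenExample {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(Paths.get("/tmp/nrt-demo")); // assumed path
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
      // open an NRT reader directly from the writer (no commit needed for visibility)
      DirectoryReader reader = DirectoryReader.open(writer);

      Document doc = new Document();
      doc.add(new StringField("id", "doc-1", Field.Store.YES));
      writer.addDocument(doc); // buffered by the writer, not yet committed

      // reopen against the writer: the new reader sees the uncommitted document
      DirectoryReader newer = DirectoryReader.openIfChanged(reader, writer);
      if (newer != null) {
        reader.close();
        reader = newer;
      }
      System.out.println("numDocs visible: " + reader.numDocs()); // 1, before any commit
      reader.close();
    }
  }
}
```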