How HBase chooses store files for compaction
The algorithm is basically as follows:
1. Run over the set of all store files, from oldest to youngest.
2. If there are more than 3 (hbase.hstore.compactionThreshold) store files left, and the current store file is 20% larger than the sum of all younger store files, and it is larger than the memstore flush size, then move on to the next, younger store file and repeat this step.
3. Once one of the conditions in step 2 no longer holds, the store files from the current one down to the youngest are the ones that will be merged together. If fewer than compactionThreshold files qualify, no merge is performed. There is also a limit that prevents more than 10 (hbase.hstore.compaction.max) store files from being merged in one compaction. A small code sketch of this selection scan follows below.
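To make the rule concrete, here is a minimal, self-contained sketch of the same scan over a plain array of file sizes. The class name and the file sizes are made up for illustration; the thresholds mirror the defaults quoted above, and the logic follows the Store.compact() selection code walked through later in this post.

// Minimal, self-contained sketch of the minor-compaction selection scan.
// File sizes are made-up example values; thresholds mirror the defaults above.
public class CompactionSelectionSketch {
  public static void main(String[] args) {
    long[] fileSizes = {2000, 500, 120, 80, 60, 50}; // index 0 = oldest, last = newest
    int compactionThreshold = 3;  // hbase.hstore.compactionThreshold / hbase.hstore.compaction.min
    int maxFilesToCompact = 10;   // hbase.hstore.compaction.max
    long minCompactSize = 64;     // hbase.hstore.compaction.min.size (defaults to memstore flush size)
    double compactRatio = 1.2;    // ratio used by the selection rule

    int n = fileSizes.length;
    long[] sumSize = new long[n];
    // sumSize[i] = sum of fileSizes over the window starting at i, as in Store.compact()
    for (int i = n - 1; i >= 0; --i) {
      int tooFar = i + maxFilesToCompact - 1;
      sumSize[i] = fileSizes[i]
          + ((i + 1 < n) ? sumSize[i + 1] : 0)
          - ((tooFar < n) ? fileSizes[tooFar] : 0);
    }

    // Walk from the oldest file; skip it while it is still "too big" relative to the newer files.
    int start = 0;
    while (n - start >= compactionThreshold
        && fileSizes[start] > Math.max(minCompactSize, (long) (sumSize[start + 1] * compactRatio))) {
      ++start;
    }
    int end = Math.min(n, start + maxFilesToCompact);

    if (end - start < compactionThreshold) {
      System.out.println("Not enough files meet the criteria; skip this compaction.");
    } else if (start == 0 && end == n) {
      System.out.println("All files selected; this would be promoted to a major compaction.");
    } else {
      System.out.println("Minor compaction of files [" + start + ", " + end + ")");
      // With the sizes above this prints: Minor compaction of files [2, 6)
    }
  }
}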
The compaction-related configuration parameters can be viewed or set in hbase-default.xml or hbase-site.xml; the relevant defaults in the HBase code are:
this.minFilesToCompact = Math.max(2, conf.getInt("hbase.hstore.compaction.min",
    /* old name */ conf.getInt("hbase.hstore.compactionThreshold", 3)));
this.majorCompactionTime = getNextMajorCompactTime();
this.maxFilesToCompact = conf.getInt("hbase.hstore.compaction.max", 10);
this.minCompactSize = conf.getLong("hbase.hstore.compaction.min.size",
    this.region.memstoreFlushSize);
this.maxCompactSize = conf.getLong("hbase.hstore.compaction.max.size", Long.MAX_VALUE);
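As a quick illustration (not from the original post), the same keys can also be set programmatically on a Configuration, for example in tests. CompactionConfigExample is a hypothetical helper; the values simply restate the defaults shown above, and the min.size value is an arbitrary example since its real default is the memstore flush size.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Hypothetical helper showing the configuration keys referenced above.
public class CompactionConfigExample {
  public static Configuration compactionDefaults() {
    Configuration conf = HBaseConfiguration.create(); // loads hbase-default.xml / hbase-site.xml
    conf.setInt("hbase.hstore.compaction.min", 3);    // old name: hbase.hstore.compactionThreshold
    conf.setInt("hbase.hstore.compaction.max", 10);
    conf.setLong("hbase.hstore.compaction.min.size", 64L * 1024 * 1024); // example value only
    conf.setLong("hbase.hstore.compaction.max.size", Long.MAX_VALUE);
    return conf;
  }
}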
Update 2011/7/11: annotated code for selecting which store files go into a minor compaction:
//
// Compaction
//
/**
 * Compact the StoreFiles. This method may take some time, so the calling
 * thread must be able to block for long periods.
 *
 * During this time, the Store can work as usual, getting values from
 * StoreFiles and writing new StoreFiles from the memstore.
 *
 * Existing StoreFiles are not destroyed until the new compacted StoreFile is
 * completely written-out to disk.
 *
 * The compactLock prevents multiple simultaneous compactions.
 * The structureLock prevents us from interfering with other write operations.
 *
 * We don't want to hold the structureLock for the whole time, as a compact()
 * can be lengthy and we want to allow cache-flushes during this period.
 *
 * @param forceMajor True to force a major compaction regardless of thresholds
 * @return row to split around if a split is needed, null otherwise
 * @throws IOException
 */
StoreSize compact(final boolean forceMajor) throws IOException {
boolean forceSplit = this.region.shouldForceSplit();
boolean majorcompaction = forceMajor;
synchronized (compactLock) { // only one compaction at a time -- but is the lock per store, per region, or per region server?
this.lastCompactSize = 0;
// filesToCompact are sorted oldest to newest.
List<StoreFile> filesToCompact = this.storefiles;
if (filesToCompact.isEmpty()) {
LOG.debug(this.storeNameStr + ": no store files to compact");
return null;
}
// Check to see if we need to do a major compaction on this region.
// If so, change doMajorCompaction to true to skip the incremental
// compacting below. Only check if doMajorCompaction is not true.
if (!majorcompaction) {
majorcompaction = isMajorCompaction(filesToCompact);
}
boolean references = hasReferences(filesToCompact);
if (!majorcompaction && !references &&
(forceSplit || (filesToCompact.size() < compactionThreshold))) {
return checkSplit(forceSplit);
}/* get store file sizes for incremental compacting selection.
- normal skew:
-
older ----> newer
- _
- | | _
- | | | | _
- –|-|- |-|- |-|—-------------- minCompactSize(参数配置)
- | | | | | | | | _ | |
- | | | | | | | | | | | |
- | | | | | | | | | | | |
/
int countOfFiles = filesToCompact.size();
long [] fileSizes = new long[countOfFiles];
long [] sumSize = new long[countOfFiles];
for (int i = countOfFiles-1; i >= 0; --i) {
StoreFile file = filesToCompact.get(i);
Path path = file.getPath();
if (path == null) {
LOG.error("Path is null for " + file);
return null;
}
StoreFile.Reader r = file.getReader();
if (r == null) {
LOG.error("StoreFile " + file + " has a null Reader");
return null;
}
fileSizes[i] = file.getReader().length();
// calculate the sum of fileSizes[i,i+maxFilesToCompact-1) for algo
int tooFar = i + this.maxFilesToCompact - 1;
/* e.g.: sumSize[i] holds the sum of the sizes of up to maxFilesToCompact - 1
 * adjacent store files starting at i, i.e. fileSizes[i .. i+maxFilesToCompact-2]:
 * index : 0, 1, 2, 3, 4, 5
 * f size: 10, 20, 15, 25, 15, 10
 * tooFar: 2, 3, 4, 5, 6, 7
 * s size: 30, 35, 40, 40, 25, 10 (maxFilesToCompact = 3, countOfFiles = 6)
 */
sumSize[i] = fileSizes[i]
+ ((i+1 < countOfFiles) ? sumSize[i+1] : 0)
- ((tooFar < countOfFiles) ? fileSizes[tooFar] : 0);
}
long totalSize = 0;
if (!majorcompaction && !references) {
// we're doing a minor compaction, let's see what files are applicable
int start = 0;
double r = this.compactRatio;
/* Start at the oldest file and stop when you find the first file that
 * meets compaction criteria:
 *
 * Scan from the oldest store file toward the newest; stop at the first store file
 * whose size is <= minCompactSize, or <= the sum of the following (up to
 * maxFilesToCompact - 1) store files multiplied by compactRatio (default 1.2):
 *
 *   X <= minCompactSize || X <= sum(newer) * compactRatio        ==> stop
 *   X >  minCompactSize && X >  sum(newer) * compactRatio        ==> keep scanning
 *   i.e. X > max(minCompactSize, sum(newer) * compactRatio)      ==> keep scanning
 *
 * (1) a recently-flushed, small file (i.e. <= minCompactSize)
 *     OR
 * (2) within the compactRatio of sum(newer_files)
 * Given normal skew, any newer files will also meet this criteria
 *
 * Additional Note:
 * If fileSizes.size() >> maxFilesToCompact, we will recurse on
 * compact(). Consider the oldest files first to avoid a
 * situation where we always compact [end-threshold,end). Then, the
 * last file becomes an aggregate of the previous compactions.
 */
/* A minor compaction can go ahead only when
 * - at least compactionThreshold store files remain, and
 * - stopping condition (1) or (2) above has been reached.
 */
while (countOfFiles - start >= this.compactionThreshold &&
fileSizes[start] > Math.max(minCompactSize, (long)(sumSize[start+1] * r))) {
++start;
}
// a single minor compaction merges at most maxFilesToCompact store files
int end = Math.min(countOfFiles, start + this.maxFilesToCompact);
// total size of the store files included in this minor compaction
totalSize = fileSizes[start] + ((start+1 < countOfFiles) ? sumSize[start+1] : 0);
// if we don't have enough files to compact, just wait
if (end - start < this.compactionThreshold) {
if (LOG.isDebugEnabled()) {
LOG.debug("Skipped compaction of " + this.storeNameStr +
" because only " + (end - start) + " file(s) of size " +
StringUtils.humanReadableInt(totalSize) +
" meet compaction criteria.");
}
return checkSplit(forceSplit);
}
if (0 == start && end == countOfFiles) {
// we decided all the files were candidates! major compact
majorcompaction = true;
} else {
// carve the qualifying store files out of the candidate list for this minor compaction
filesToCompact = new ArrayList<StoreFile>(filesToCompact.subList(start, end));
}
// we entered this branch with majorcompaction == false; it may leave as false or true
} else {
// all files included in this compaction
for (long i : fileSizes) {
totalSize += i;
}
}
this.lastCompactSize = totalSize;
// Max-sequenceID is the last key in the files we're compacting
long maxId = StoreFile.getMaxSequenceIdInList(filesToCompact);
// Ready to go. Have list of files to compact.
LOG.info("Started compaction of " + filesToCompact.size() + " file(s) in cf=" +
this.storeNameStr +
(references ? ", hasReferences=true," : " ") + " into " +
region.getTmpDir() + ", seqid=" + maxId +
", totalSize=" + StringUtils.humanReadableInt(totalSize));
// now that the store files are chosen, the compact(...) call below does the actual merge
StoreFile.Writer writer = compact(filesToCompact, majorcompaction, maxId);
// Move the compaction into place.
StoreFile sf = completeCompaction(filesToCompact, writer);
if (LOG.isInfoEnabled()) {
LOG.info("Completed" + (majorcompaction ? " major " : " ") +
"compaction of " + filesToCompact.size() +
" file(s), new file=" + (sf == null ? "none" : sf.toString()) +
", size=" + (sf == null ? "none" : StringUtils.humanReadableInt(sf.getReader().length())) +
"; total size for store is " + StringUtils.humanReadableInt(storeSize));
}
}
return checkSplit(forceSplit);
}
Added 2011.12.13:
A few HBase JIRAs related to compaction:
HBASE-3189 Stagger Major Compactions
HBASE-3209 : New Compaction Algorithm
HBASE-1476 Multithreaded Compactions
HBASE-3857 Change the HFile Format