09 - LevelDB Performance Tuning

Posted 2022-06-03 anda0109


1. How should WriteOptions be set for the best write performance?

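When writing data, that is, when calling the Put and Delete interfaces, we have to pass in a parameter of type WriteOptions. For example, the Put interface looks like this:
Status Put(const WriteOptions& options, const Slice& key, const Slice& value);
How should the WriteOptions parameter be set? Let's look at its definition: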

// Options that control write operations
struct LEVELDB_EXPORT WriteOptions {
  WriteOptions() = default;

  // If true, the write will be flushed from the operating system
  // buffer cache (by calling WritableFile::Sync()) before the write
  // is considered complete.  If this flag is true, writes will be
  // slower.
  //
  // If this flag is false, and the machine crashes, some recent
  // writes may be lost.  Note that if it is just the process that
  // crashes (i.e., the machine does not reboot), no writes will be
  // lost even if sync==false.
  //
  // In other words, a DB write with sync==false has similar
  // crash semantics as the "write()" system call.  A DB write
  // with sync==true has similar crash semantics to a "write()"
  // system call followed by "fsync()".
  bool sync = false;
};

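WriteOptions is a struct with a single member variable, sync. What does sync do?
According to the comment, when this value is false, the most recent writes may be lost if the machine or the operating system crashes. Note that this means a machine or OS crash, not a crash of the process alone: if only the process crashes, no data is lost. When this value is true, no completed write is lost whether the system or the process crashes.
In other words, this value decides whether each write is synchronously flushed to disk, i.e. whether fsync() is called. When data is written to the WAL (write-ahead log) with write(), sync=true makes LevelDB follow that write() with fsync() before the write is considered complete.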
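To make the write()/fsync() distinction concrete, here is a minimal POSIX sketch of appending a record to a log file with an optional sync. It is not LevelDB's actual code (LevelDB does this inside its own file abstraction, and the details differ); the function name AppendRecord and its parameters are made up for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>

// Illustrative only, not LevelDB code: appends `record` to the file behind
// `fd`, then optionally forces it to stable storage.
// Returns false on any I/O error.
bool AppendRecord(int fd, const std::string& record, bool sync) {
  assert(fd >= 0);
  const char* p = record.data();
  std::size_t left = record.size();
  while (left > 0) {
    ssize_t n = ::write(fd, p, left);
    if (n < 0) {
      if (errno == EINTR) continue;  // interrupted by a signal, retry
      return false;
    }
    p += n;
    left -= static_cast<std::size_t>(n);
  }
  // Without fsync() the data may live only in the OS buffer cache: it
  // survives a process crash, but not a power loss or kernel crash.
  if (sync && ::fsync(fd) != 0) return false;
  return true;
}
```

With sync == false the function returns as soon as the kernel has accepted the bytes, which is why that path is so much faster.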
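With sync set to true, written data is flushed to disk synchronously, so it is not lost even if the machine goes down. The price is a sharp drop in performance, because forcing a flush on every write is very expensive.
So in real applications, if you are after high performance and can tolerate losing a small amount of data, set sync to false without hesitation. Conversely, if performance matters less but data safety matters a lot, set sync to true.
There is also a middle ground: for example, issue every 100th write with sync set to true, which forces a flush once per 100 writes. Then at most roughly the last 100 writes can be lost.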
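The "sync every N writes" compromise can be sketched as a small counter that decides, per write, what to put into WriteOptions::sync. PeriodicSyncPolicy is a hypothetical helper and not part of LevelDB; it is kept free of LevelDB headers so it compiles on its own.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LevelDB): tells the caller whether the
// next write should be issued with WriteOptions::sync == true, so that
// every `interval`-th write forces a flush.
class PeriodicSyncPolicy {
 public:
  explicit PeriodicSyncPolicy(std::uint64_t interval) : interval_(interval) {
    assert(interval > 0);
  }

  // Call exactly once per write; returns the value to assign to sync.
  bool NextWriteShouldSync() {
    ++writes_since_sync_;
    if (writes_since_sync_ >= interval_) {
      writes_since_sync_ = 0;
      return true;
    }
    return false;
  }

 private:
  std::uint64_t interval_;
  std::uint64_t writes_since_sync_ = 0;
};
```

In application code this would be used roughly as: create a leveldb::WriteOptions, set its sync field to policy.NextWriteShouldSync(), and pass it to Put. The policy object is not thread-safe as written, so concurrent writers would need their own synchronization.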

2. How should ReadOptions be set for the best read performance?

When reading data, we have to pass in a parameter of type ReadOptions. The interface is:
Status Get(const ReadOptions& options, const Slice& key, std::string* value);
ReadOptions is defined as follows:

// Options that control read operations
struct LEVELDB_EXPORT ReadOptions {
  ReadOptions() = default;

  // If true, all data read from underlying storage will be
  // verified against corresponding checksums.
  bool verify_checksums = false;

  // Should the data read for this iteration be cached in memory?
  // Callers may wish to set this field to false for bulk scans.
  bool fill_cache = true;

  // If "snapshot" is non-null, read as of the supplied snapshot
  // (which must belong to the DB that is being read and which must
  // not have been released).  If "snapshot" is null, use an implicit
  // snapshot of the state at the beginning of this read operation.
  const Snapshot* snapshot = nullptr;
};

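The member verify_checksums controls whether data is verified when it is read. In LevelDB every block carries a 4-byte CRC checksum. If this is set to true, every block read from disk is verified. Verification of course costs CPU, so query performance drops somewhat, but the benefit is that corrupted blocks are detected promptly.
The second member, fill_cache, controls whether the blocks touched by this read are placed in the block cache. In principle, once a block is cached, later queries against it are served from memory, which improves query performance. If you have plenty of free memory and block_cache is configured large enough, leaving this enabled (the default) helps query performance. As the comment in the definition notes, callers may want to set it to false for bulk scans, so that a one-off scan does not push frequently used blocks out of the cache.
The third member, snapshot, makes the read go against a specified snapshot. It depends on business requirements and has little to do with performance.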
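The two performance-related fields can be captured in small factory functions. The sketch below is written as templates so that it compiles without the LevelDB headers; ReadOptionsStandIn is a stand-in invented here that only mirrors the field names shown above, and the function names are likewise hypothetical.

```cpp
// Stand-in with the same field names as leveldb::ReadOptions, present only
// so that this sketch compiles without the LevelDB headers.
struct ReadOptionsStandIn {
  bool verify_checksums = false;
  bool fill_cache = true;
  const void* snapshot = nullptr;
};

// Options for a one-off bulk scan: bypass the block cache so the scan does
// not displace blocks that point lookups rely on.
template <typename ReadOptionsT>
ReadOptionsT MakeBulkScanOptions() {
  ReadOptionsT options;
  options.fill_cache = false;
  return options;
}

// Options for reads where catching corruption matters more than CPU cost.
template <typename ReadOptionsT>
ReadOptionsT MakeVerifiedReadOptions() {
  ReadOptionsT options;
  options.verify_checksums = true;
  return options;
}
```

In a real program the template argument would be leveldb::ReadOptions, for example MakeBulkScanOptions<leveldb::ReadOptions>() for the options passed when creating an iterator for a full scan.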

3. How should compaction be tuned to balance reads and writes?

The following constants are defined in dbformat.h in the LevelDB source code:

namespace config {
static const int kNumLevels = 7;

// Level-0 compaction is started when we hit this many files.
static const int kL0_CompactionTrigger = 4;

// Soft limit on number of level-0 files.  We slow down writes at this point.
static const int kL0_SlowdownWritesTrigger = 8;

// Maximum number of level-0 files.  We stop writes at this point.
static const int kL0_StopWritesTrigger = 12;

// Maximum level to which a new compacted memtable is pushed if it
// does not create overlap.  We try to push to level 2 to avoid the
// relatively expensive level 0=>1 compactions and to avoid some
// expensive manifest file operations.  We do not push all the way to
// the largest level since that can generate a lot of wasted disk
// space if the same key space is being repeatedly overwritten.
static const int kMaxMemCompactLevel = 2;

// Approximate gap in bytes between samples of data read during iteration.
static const int kReadBytesPeriod = 1048576;

}  // namespace config

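We mainly care about the following three parameters, which are the ones most closely tied to compaction:
// A level-0 compaction is triggered when the number of level-0 files reaches this value.
static const int kL0_CompactionTrigger = 4;

// Writes are slowed down when the number of level-0 files reaches this value.
static const int kL0_SlowdownWritesTrigger = 8;

// Writes are stopped when the number of level-0 files reaches this value.
static const int kL0_StopWritesTrigger = 12;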

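These three parameters exist to balance the write rate against the compaction rate. When level 0 reaches 4 files, a compaction is triggered. When it reaches 8 files, compaction is falling behind the writes, so writes are slowed down. When it reaches 12 files, writes are far outpacing compaction, so writes are stopped until the backlog of level-0 files has been compacted down, and only then do they resume.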
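The threshold behaviour described above can be modelled in a few lines. This is a simplified sketch and not LevelDB's write path, which checks more conditions (memtable space, for example); the enum WriteState and both function names are invented for illustration, and only the three constant values come from the source.

```cpp
#include <cassert>

// Values copied from the config namespace quoted above.
constexpr int kL0_CompactionTrigger = 4;
constexpr int kL0_SlowdownWritesTrigger = 8;
constexpr int kL0_StopWritesTrigger = 12;

// Hypothetical names for the three write-side states described in the text.
enum class WriteState { kNormal, kSlowedDown, kStopped };

// True once level 0 holds enough files for a compaction to be triggered.
bool Level0NeedsCompaction(int level0_files) {
  assert(level0_files >= 0);
  return level0_files >= kL0_CompactionTrigger;
}

// Maps the current level-0 file count to how writes are treated.
WriteState WriteStateFor(int level0_files) {
  assert(level0_files >= 0);
  if (level0_files >= kL0_StopWritesTrigger) return WriteState::kStopped;
  if (level0_files >= kL0_SlowdownWritesTrigger) return WriteState::kSlowedDown;
  return WriteState::kNormal;
}
```

Between 4 and 7 files a compaction is wanted but writes still run at full speed; the gap between the thresholds is what gives compaction room to catch up before writers are penalized.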
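The key to choosing these values is finding a balance. Keep in mind that they are compile-time constants in dbformat.h rather than fields of Options, so changing them means editing the source and rebuilding LevelDB.
The smaller kL0_CompactionTrigger is, the more often compaction is triggered and the more extra read and write I/O (amplification) compaction generates. That amplification in turn hurts the application's own read and write performance.
The larger kL0_CompactionTrigger is, the less often compaction is triggered and the less amplification compaction causes. However, if the workload is read-heavy, reads then suffer serious read amplification: the key ranges of level-0 files overlap one another, so a lookup may have to check every level-0 file whose range covers the key, and read performance becomes poor.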
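To illustrate why overlapping level-0 files hurt point lookups, the following sketch counts how many files a lookup would have to consult in the worst case, given each file's key range. FileRange and FilesToProbe are hypothetical and much simpler than LevelDB's file metadata; plain string comparison stands in for the database comparator.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one table file's key range.
struct FileRange {
  std::string smallest;
  std::string largest;
};

// Counts the files whose [smallest, largest] range covers `key`. In a
// level with overlapping ranges (level 0) this can be every file; in a
// level with disjoint ranges it is at most one.
int FilesToProbe(const std::vector<FileRange>& files, const std::string& key) {
  int count = 0;
  for (const FileRange& f : files) {
    assert(f.smallest <= f.largest);
    if (key >= f.smallest && key <= f.largest) ++count;
  }
  return count;
}
```

For example, with three level-0 files covering "a".."m", "c".."z" and "b".."k", a lookup for "d" falls inside all three ranges, while the same data compacted into disjoint ranges would need only one probe.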
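So the choice depends on the workload. For a write-heavy, read-light workload, or one that is not very sensitive to read performance, kL0_CompactionTrigger can be set larger to improve write performance. For a read-heavy, write-light workload, it can be set smaller to improve read performance.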
