HBase Source Code Analysis: KeyValue

Posted 2020-07-01 lipeng_bigdata

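Inside HBase, a Cell is implemented by KeyValue, which is the in-memory layout of a single cell of a row. A KeyValue consists of four parts: Key Length, Value Length, Key, and Value. The Key in turn consists of seven parts: Row Length, Row, Column Family Length, Column Family, Column Qualifier, Time Stamp, and Key Type. In HBase 1.0.2, its structure is as follows: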

[Figure: KeyValue structure in HBase 1.0.2]

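From left to right, the fields are:

1. Key Length: the length of the Key, 4B;

2. Value Length: the length of the Value, 4B;

3. Key: consists of Row Length, Row, Column Family Length, Column Family, Column Qualifier, Time Stamp, and Key Type:

3.1. Row Length: the length of the Row, i.e. of the rowkey, 2B;

3.2. Row: the actual Row content, i.e. the rowkey; its size is Row Length;

3.3. Column Family Length: the length of the Column Family, 1B;

3.4. Column Family: the actual Column Family content; its size is Column Family Length;

3.5. Column Qualifier: the Column Qualifier data. Because the total key size and the sizes of all other key fields are known, the qualifier's size follows from them and no separate length is stored;

3.6. Time Stamp: the timestamp, 8B;

3.7. Key Type: the key type, 1B. The type is one of Put, Delete, DeleteColumn, DeleteFamilyVersion, DeleteFamily, and so on, and marks what kind of KeyValue this is;

4. Value: the actual value stored in the Cell.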
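
To make the layout concrete, here is a minimal sketch that serializes one cell by hand with java.nio.ByteBuffer. The class KeyValueLayoutSketch and its methods are hypothetical names used only for illustration, not HBase APIs, and the type byte 4 is assumed here to stand for Put.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: writes one cell in the layout listed above.
class KeyValueLayoutSketch {

    static byte[] encode(byte[] row, byte[] family, byte[] qualifier,
                         long timestamp, byte type, byte[] value) {
        // Key = 2B row length + row + 1B family length + family + qualifier + 8B timestamp + 1B type
        int keyLength = 2 + row.length + 1 + family.length + qualifier.length + 8 + 1;
        // ByteBuffer is big-endian by default
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + keyLength + value.length);
        buf.putInt(keyLength);            // Key Length, 4B
        buf.putInt(value.length);         // Value Length, 4B
        buf.putShort((short) row.length); // Row Length, 2B
        buf.put(row);                     // Row
        buf.put((byte) family.length);    // Column Family Length, 1B
        buf.put(family);                  // Column Family
        buf.put(qualifier);               // Column Qualifier, no length prefix
        buf.putLong(timestamp);           // Time Stamp, 8B
        buf.put(type);                    // Key Type, 1B
        buf.put(value);                   // Value
        return buf.array();
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Type byte 4 is an assumption for Put
        byte[] kv = encode(bytes("r1"), bytes("cf"), bytes("q"), 1000L, (byte) 4, bytes("val"));
        System.out.println("total bytes = " + kv.length); // 4 + 4 + 17 + 3 = 28
    }
}
```

With row "r1", family "cf", qualifier "q", and value "val", the key takes 17 bytes and the whole cell 28 bytes.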

Next, let's look at how KeyValue is implemented in HBase. KeyValue has three very important fields:

  // KeyValue core instance fields.
  protected byte [] bytes = null;  // an immutable byte array that contains the KV
  protected int offset = 0;  // offset into bytes buffer KV starts at
  protected int length = 0;  // length of the KV starting from offset.

The KeyValue content is stored in the byte[] array bytes, which is immutable; offset and length mark where in that array the KeyValue starts and how long it is.
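
One consequence of the bytes/offset/length triple is that several cells can share one backing array. The sketch below illustrates this under simplifying assumptions: CellViewSketch is a hypothetical class that mirrors only these three fields, and the key and value payloads are left as zero bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical class for illustration; it mirrors only the three fields discussed above.
class CellViewSketch {
    final byte[] bytes;  // shared backing array
    final int offset;    // where this cell starts in bytes
    final int length;    // how many bytes belong to this cell

    CellViewSketch(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    // The first int of the cell is its key length
    int keyLength() {
        return ByteBuffer.wrap(bytes).getInt(offset);
    }

    // Builds a record with opaque payloads: 4B key length, 4B value length, key, value
    static byte[] record(int keyLength, int valueLength) {
        ByteBuffer buf = ByteBuffer.allocate(8 + keyLength + valueLength);
        buf.putInt(keyLength).putInt(valueLength);
        return buf.array();
    }

    // Packs two records back to back and returns a view onto the second one
    static CellViewSketch secondOfTwo() {
        byte[] first = record(12, 3);   // 8 + 12 + 3 = 23 bytes
        byte[] second = record(15, 1);  // 8 + 15 + 1 = 24 bytes
        byte[] shared = new byte[first.length + second.length];
        System.arraycopy(first, 0, shared, 0, first.length);
        System.arraycopy(second, 0, shared, first.length, second.length);
        return new CellViewSketch(shared, first.length, second.length);
    }

    public static void main(String[] args) {
        CellViewSketch second = secondOfTwo();
        System.out.println("offset=" + second.offset + " length=" + second.length
            + " keyLength=" + second.keyLength());
    }
}
```

The second view starts at offset 23 and reads its own key length (15) without copying anything out of the shared array.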

Next, we look at the KeyValue methods that return Key Length, Value Length, Row Length, Column Family, Value, and the other fields, to verify the structure listed above.

1. Key Length

  /**
   * @return Length of key portion.
   */
  public int getKeyLength() {
	  
    // Read one int (4B) from the backing byte[] bytes, starting at offset
    return Bytes.toInt(this.bytes, this.offset);
  }

getKeyLength() returns the Key Length of the KeyValue. It reads one int (4B) from the backing byte[] array bytes, starting at position offset. This confirms that the first field of a KeyValue is Key Length, and that it takes 4B.

2. Value Length

  /**
   * @return Value length
   */
  @Override
  public int getValueLength() {
	  
    // Read one int (4B) from the backing byte[] bytes, starting at offset + 4,
    // i.e. the 4B right after key length hold the value length
    int vlength = Bytes.toInt(this.bytes, this.offset + Bytes.SIZEOF_INT);
    return vlength;
  }

getValueLength() returns the Value Length of the KeyValue. It reads one int (4B) from the backing byte[] array bytes, starting at offset + 4. This confirms that the 4B immediately after Key Length hold Value Length.
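
Both length fields are read as big-endian ints. The following sketch shows what such a read amounts to at the byte level; BigEndianIntSketch.toInt is a hypothetical stand-in written for this article, not the HBase Bytes.toInt implementation.

```java
// Hypothetical stand-in for a big-endian int read: four bytes, most significant first.
class BigEndianIntSketch {

    static int toInt(byte[] bytes, int offset) {
        int n = 0;
        for (int i = offset; i < offset + 4; i++) {
            // Mask with 0xFF so a negative byte does not sign-extend
            n = (n << 8) | (bytes[i] & 0xFF);
        }
        return n;
    }

    public static void main(String[] args) {
        // Header of a cell: key length 17 followed by value length 3
        byte[] header = {0, 0, 0, 17, 0, 0, 0, 3};
        System.out.println("key length   = " + toInt(header, 0));
        System.out.println("value length = " + toInt(header, 4));
    }
}
```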

3. Key offset

  /**
   * @return Key offset in backing buffer.
   */
  public int getKeyOffset() {
	  
    // ROW_OFFSET is the position just past key length and value length
    return this.offset + ROW_OFFSET;
  }

ROW_OFFSET is the position just past Key Length and Value Length, relative to the start of the KeyValue. It is defined as:

  // How far into the key the row starts at. First thing to read is the short
  // that says how long the row is.
  public static final int ROW_OFFSET =
    Bytes.SIZEOF_INT /*keylength*/ +
    Bytes.SIZEOF_INT /*valuelength*/;

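getKeyOffset() returns the start position of the Key. Its value is the KeyValue's start position offset plus ROW_OFFSET, and ROW_OFFSET is the combined size of Key Length and Value Length. This confirms that the Key comes right after Key Length and Value Length.

4. Value offset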

  /**
   * @return the value offset
   */
  @Override
  public int getValueOffset() {
	  
    // The key's start position plus the key's length is the value's start position
    int voffset = getKeyOffset() + getKeyLength();
    return voffset;
  }

getValueOffset() returns the start position of the Value: the Key's start position from getKeyOffset() plus the Key's length from getKeyLength(). This confirms that in a KeyValue the Value follows Key Length, Value Length, and Key.
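
The offset arithmetic is easiest to check with a cell that does not start at position 0. The sketch below uses hypothetical static helpers (OffsetArithmeticSketch is not an HBase class) and assumed sample numbers: a cell starting at offset 100 with a 17-byte key and a 3-byte value.

```java
// Hypothetical helpers that repeat the offset arithmetic with plain ints.
class OffsetArithmeticSketch {
    // 4B key length + 4B value length
    static final int ROW_OFFSET = 4 + 4;

    static int keyOffset(int offset) {
        return offset + ROW_OFFSET;
    }

    static int valueOffset(int offset, int keyLength) {
        return keyOffset(offset) + keyLength;
    }

    // First position after the cell (exclusive end)
    static int end(int offset, int keyLength, int valueLength) {
        return valueOffset(offset, keyLength) + valueLength;
    }

    public static void main(String[] args) {
        int offset = 100, keyLength = 17, valueLength = 3;
        System.out.println("keyOffset   = " + keyOffset(offset));                   // 108
        System.out.println("valueOffset = " + valueOffset(offset, keyLength));      // 125
        System.out.println("end         = " + end(offset, keyLength, valueLength)); // 128
    }
}
```

The distance from offset to end is 28, which equals 8 + keyLength + valueLength, so the two length fields are enough to find every boundary at this level.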

5. Row Length

  /**
   * @return Row length
   */
  @Override
  public short getRowLength() {
	  
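    // Read one short (2B) from the backing byte[] bytes, starting at the key's start position.
    // getKeyOffset() is the position just past key length and value length:
    // the 4B after key length are value length, and the key begins right after value length.
    // The first 2B of the key are row length.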
    return Bytes.toShort(this.bytes, getKeyOffset());
  }

getRowLength() returns the Row Length of the KeyValue. It reads one short (2B) from the backing byte[] array bytes, starting at the key's start position. This shows that Row Length is the first field of the Key.

5. Row offset

  /**
   * @return Row offset
   */
  @Override
  public int getRowOffset() {
	  
    // The key's start position plus 2B: the row comes right after row length
    return getKeyOffset() + Bytes.SIZEOF_SHORT;
  }

getRowOffset() returns the start position of the Row: the Key's start position plus 2B. In other words, the Row comes right after Row Length, as described above.

6. Row

  /**
   * Primarily for use client-side.  Returns the row of this KeyValue in a new
   * byte array.<p>
   *
   * If server-side, use {@link #getBuffer()} with appropriate offsets and
   * lengths instead.
   * @return Row in a new byte array.
   */
  @Deprecated // use CellUtil.getRowArray()
  public byte [] getRow() {
    return CellUtil.cloneRow(this);
  }

getRow() returns the Row content. It passes the KeyValue instance itself to CellUtil.cloneRow(), which returns a byte[]. cloneRow() looks like this:

  public static byte[] cloneRow(Cell cell){
	  
    // output is a byte[] sized to the row length
    byte[] output = new byte[cell.getRowLength()];

    // Copy the row from the cell into output
    copyRowTo(cell, output, 0);
    return output;
  }

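As you can see, cloneRow() first allocates a byte[] array output of size Row Length, which means the size of the Row is determined by the value stored in Row Length. It then calls copyRowTo() to copy the Row content of the KeyValue into output and returns it. copyRowTo() copies from the cell's (that is, the KeyValue's) byte[] array bytes, starting at the Row Offset, into the target byte[] array destination (here output), starting at destinationOffset (here 0). The number of bytes copied is Row Length, so the copy fills destination (output) completely. The code is: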

  public static int copyRowTo(Cell cell, byte[] destination, int destinationOffset) {
    
    // Copy row length bytes from the cell's backing array, starting at the row offset,
    // into destination, starting at destinationOffset
    System.arraycopy(cell.getRowArray(), cell.getRowOffset(), destination, destinationOffset,
      cell.getRowLength());

    // Return the position in destination just past the copied data
    return destinationOffset + cell.getRowLength();
  }
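
Because copyRowTo() returns the position just past the copied data, calls can be chained to fill one destination array piece by piece. The sketch below imitates that shape on a raw byte array. RowCopySketch, its method signatures, and the hand-written sample cell are assumptions made for this example, not the CellUtil API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical re-implementation of the row read and copy on a raw byte[].
class RowCopySketch {

    // Sample cell: row "r1", family "cf", qualifier "q", timestamp 1000, value "val"
    static byte[] sample() {
        return new byte[] {
            0, 0, 0, 17,                      // Key Length = 17
            0, 0, 0, 3,                       // Value Length = 3
            0, 2, 'r', '1',                   // Row Length = 2, Row = "r1"
            2, 'c', 'f',                      // Family Length = 2, Family = "cf"
            'q',                              // Qualifier = "q"
            0, 0, 0, 0, 0, 0, 3, (byte) 0xE8, // Time Stamp = 1000
            4,                                // Key Type byte (4 assumed to mean Put)
            'v', 'a', 'l'                     // Value = "val"
        };
    }

    // Row length is the big-endian short at the start of the key
    static int rowLength(byte[] kv, int offset) {
        int keyOffset = offset + 8;
        return ((kv[keyOffset] & 0xFF) << 8) | (kv[keyOffset + 1] & 0xFF);
    }

    // The row starts right after the 2B row length
    static int rowOffset(int offset) {
        return offset + 8 + 2;
    }

    // Copies the row and returns the next free position in destination
    static int copyRowTo(byte[] kv, int offset, byte[] destination, int destinationOffset) {
        int len = rowLength(kv, offset);
        System.arraycopy(kv, rowOffset(offset), destination, destinationOffset, len);
        return destinationOffset + len;
    }

    public static void main(String[] args) {
        byte[] kv = sample();
        byte[] twice = new byte[2 * rowLength(kv, 0)];
        int next = copyRowTo(kv, 0, twice, 0);  // writes positions 0..1
        next = copyRowTo(kv, 0, twice, next);   // writes positions 2..3
        System.out.println(new String(twice, StandardCharsets.UTF_8) + " next=" + next);
    }
}
```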

7. Family offset

  /**
   * @return Family offset
   */
  @Override
  public int getFamilyOffset() {
	  
	
    return getFamilyOffset(getRowLength());
  }

  /**
   * @return Family offset
   */
  private int getFamilyOffset(int rlength) {
	  
    // Family start position: KeyValue start offset + ROW_OFFSET (key length + value length) + 2B (row length) + actual row size rlength + 1B (family length)
    return this.offset + ROW_OFFSET + Bytes.SIZEOF_SHORT + rlength + Bytes.SIZEOF_BYTE;
  }

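getFamilyOffset() returns the start position of the Family. It is the KeyValue's start position offset, plus ROW_OFFSET (the size of Key Length and Value Length), plus the 2B taken by Row Length, plus the actual Row size rlength obtained from getRowLength(), plus 1B for Family Length. This shows that in the Key, Row Length and Row are followed by Family Length and Family, and that Family Length takes 1B.

8. Family Length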

  /**
   * @return Family length
   */
  @Override
  public byte getFamilyLength() {
    return getFamilyLength(getFamilyOffset());
  }

  /**
   * @return Family length
   */
  public byte getFamilyLength(int foffset) {
	  
    // One byte before the family start position: that 1B is the family length
    return this.bytes[foffset-1];
  }

getFamilyLength() returns the Family Length of the KeyValue. It reads the byte at the Family position returned by getFamilyOffset() minus 1, which matches what we verified above: the 1B before the Family is Family Length.
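
The family lookup is a little unusual because the offset is computed first and the length is then read one byte before it. A minimal sketch of that order of operations on a raw byte array follows; FamilySketch and the sample cell are hypothetical and exist only for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helpers: compute the family offset first, then read the length byte before it.
class FamilySketch {

    // Sample cell: row "r1", family "cf", qualifier "q", timestamp 1000, value "val"
    static byte[] sample() {
        return new byte[] {
            0, 0, 0, 17, 0, 0, 0, 3,          // Key Length = 17, Value Length = 3
            0, 2, 'r', '1',                   // Row Length = 2, Row = "r1"
            2, 'c', 'f',                      // Family Length = 2, Family = "cf"
            'q',                              // Qualifier = "q"
            0, 0, 0, 0, 0, 0, 3, (byte) 0xE8, // Time Stamp = 1000
            4,                                // Key Type byte (4 assumed to mean Put)
            'v', 'a', 'l'                     // Value = "val"
        };
    }

    static int rowLength(byte[] kv, int offset) {
        return ((kv[offset + 8] & 0xFF) << 8) | (kv[offset + 9] & 0xFF);
    }

    // offset + 8 (two length ints) + 2 (row length) + row size + 1 (family length)
    static int familyOffset(byte[] kv, int offset) {
        return offset + 8 + 2 + rowLength(kv, offset) + 1;
    }

    // The length byte sits immediately before the family bytes
    static byte familyLength(byte[] kv, int offset) {
        return kv[familyOffset(kv, offset) - 1];
    }

    static String family(byte[] kv, int offset) {
        return new String(kv, familyOffset(kv, offset), familyLength(kv, offset),
            StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] kv = sample();
        System.out.println("familyOffset=" + familyOffset(kv, 0)
            + " familyLength=" + familyLength(kv, 0) + " family=" + family(kv, 0));
    }
}
```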

9. Qualifier offset

  /**
   * @return Qualifier offset
   */
  @Override
  public int getQualifierOffset() {
    return getQualifierOffset(getFamilyOffset());
  }

  /**
   * @return Qualifier offset
   */
  private int getQualifierOffset(int foffset) {
	  
    // Family start position plus the family length
    return foffset + getFamilyLength(foffset);
  }

getQualifierOffset() returns the start position of the Qualifier: the Family's start position plus the Family Length. This shows that the Qualifier comes right after the Family.

10. Qualifier length

  /**
   * @return Qualifier length
   */
  @Override
  public int getQualifierLength() {
    return getQualifierLength(getRowLength(),getFamilyLength());
  }

  /**
   * @return Qualifier length
   */
  private int getQualifierLength(int rlength, int flength) {
    // Key length minus the sum of: row size, family size, and the sizes of the Row Length, Family Length, Time Stamp, and Key Type fields
    return getKeyLength() - (int) getKeyDataStructureSize(rlength, flength, 0);
  }

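getQualifierLength() returns the Qualifier length. KeyValue does not store the Qualifier length directly; it computes it as the total Key length minus the sizes of all other parts of the Key, that is, Key length minus the sum of the Row size, the Family size, and the sizes of the Row Length, Family Length, Time Stamp, and Key Type fields.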
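
The subtraction can be written out with the field sizes given earlier. In the sketch below, QualifierSketch and the constant FIXED_KEY_OVERHEAD are hypothetical names; the value 12 is simply 2B + 1B + 8B + 1B from the layout described in this article.

```java
// Hypothetical helper: derives the qualifier length from the other key fields.
class QualifierSketch {
    // 2B row length + 1B family length + 8B timestamp + 1B key type
    static final int FIXED_KEY_OVERHEAD = 2 + 1 + 8 + 1;

    static int qualifierLength(int keyLength, int rowLength, int familyLength) {
        return keyLength - (FIXED_KEY_OVERHEAD + rowLength + familyLength);
    }

    public static void main(String[] args) {
        // 17-byte key with a 2-byte row and a 2-byte family: 17 - (12 + 2 + 2) = 1
        System.out.println(qualifierLength(17, 2, 2));
        // Same row and family with a 16-byte key: the qualifier is empty
        System.out.println(qualifierLength(16, 2, 2));
    }
}
```

A result of 0 is a legal outcome and corresponds to a cell with an empty qualifier.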

11. Timestamp offset

  /**
   * @return Timestamp offset
   */
  public int getTimestampOffset() {
    return getTimestampOffset(getKeyLength());
  }

  /**
   * @param keylength Pass if you have it to save on a int creation.
   * @return Timestamp offset
   */
  private int getTimestampOffset(final int keylength) {
    // Key start position plus key length, minus the combined size of Time Stamp and Key Type
    return getKeyOffset() + keylength - TIMESTAMP_TYPE_SIZE;
  }

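getTimestampOffset() returns the start position of the Time Stamp: the Key's start position plus the Key's length, minus the combined size of Time Stamp and Key Type. This means that Time Stamp is the second-to-last field of the Key, after the Qualifier and before Key Type, and Key Type is last.

12. Reading the Timestamp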

  /**
   *
   * @return Timestamp
   */
  @Override
  public long getTimestamp() {
    return getTimestamp(getKeyLength());
  }

  /**
   * @param keylength Pass if you have it to save on a int creation.
   * @return Timestamp
   */
  long getTimestamp(final int keylength) {
	  
    // Get the timestamp's start position tsOffset
    int tsOffset = getTimestampOffset(keylength);

    // Read one long (8B) from bytes, starting at tsOffset
    return Bytes.toLong(this.bytes, tsOffset);
  }

getTimestamp() returns the Time Stamp of the KeyValue. It first gets the start position tsOffset, then reads one long (8B) from bytes starting at tsOffset, consistent with the 8B size of Time Stamp stated above.
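
Note that the timestamp is located by counting back from the end of the key, so neither the row length nor the family length is needed. A minimal sketch of that lookup, using java.nio.ByteBuffer for the big-endian reads, follows; TimestampSketch and the sample cell are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical helpers: find the timestamp by counting back from the end of the key.
class TimestampSketch {
    // 8B timestamp + 1B key type
    static final int TIMESTAMP_TYPE_SIZE = 8 + 1;

    // Sample cell: row "r1", family "cf", qualifier "q", timestamp 1000, value "val"
    static byte[] sample() {
        return new byte[] {
            0, 0, 0, 17, 0, 0, 0, 3,          // Key Length = 17, Value Length = 3
            0, 2, 'r', '1',                   // Row Length = 2, Row = "r1"
            2, 'c', 'f',                      // Family Length = 2, Family = "cf"
            'q',                              // Qualifier = "q"
            0, 0, 0, 0, 0, 0, 3, (byte) 0xE8, // Time Stamp = 1000 (0x03E8)
            4,                                // Key Type byte (4 assumed to mean Put)
            'v', 'a', 'l'                     // Value = "val"
        };
    }

    // Key start (offset + 8) + key length - 9
    static int timestampOffset(byte[] kv, int offset) {
        int keyLength = ByteBuffer.wrap(kv).getInt(offset);
        return offset + 8 + keyLength - TIMESTAMP_TYPE_SIZE;
    }

    // Absolute big-endian read of 8 bytes
    static long timestamp(byte[] kv, int offset) {
        return ByteBuffer.wrap(kv).getLong(timestampOffset(kv, offset));
    }

    public static void main(String[] args) {
        byte[] kv = sample();
        System.out.println("tsOffset=" + timestampOffset(kv, 0) + " timestamp=" + timestamp(kv, 0));
    }
}
```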

13. Reading the Key Type

  /**
   * @return Type of this KeyValue.
   */
  @Deprecated
  public byte getType() {
    return getTypeByte();
  }

  /**
   * @return KeyValue.TYPE byte representation
   */
  @Override
  public byte getTypeByte() {
	  
    // KeyValue start offset + key length - 1 + the combined size of the Key Length and Value Length fields,
    // i.e. the Key Type is the last 1B of the whole key
    return this.bytes[this.offset + getKeyLength() - 1 + ROW_OFFSET];
  }

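getType() and getTypeByte() return the Key Type of the KeyValue. They read one byte from bytes at position offset + Key length - 1 + ROW_OFFSET (the combined size of the Key Length and Value Length fields). That is the last 1B of the Key, which is consistent with the layout above.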
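
As a small illustration, the type byte can be read from a raw array and mapped to a readable name. KeyTypeSketch is a hypothetical class, and the numeric codes in typeName() are the ones we understand the KeyValue.Type enum to use; treat them as an assumption and check them against your HBase version.

```java
import java.nio.ByteBuffer;

// Hypothetical helpers: read the last byte of the key and name it.
class KeyTypeSketch {

    // offset + key length - 1 + 8 (size of the two length ints)
    static byte typeByte(byte[] kv, int offset) {
        int keyLength = ByteBuffer.wrap(kv).getInt(offset);
        return kv[offset + keyLength - 1 + 8];
    }

    // Assumed codes; verify against KeyValue.Type in your HBase version
    static String typeName(byte code) {
        switch (code & 0xFF) {
            case 4:  return "Put";
            case 8:  return "Delete";
            case 10: return "DeleteFamilyVersion";
            case 12: return "DeleteColumn";
            case 14: return "DeleteFamily";
            default: return "Unknown(" + (code & 0xFF) + ")";
        }
    }

    public static void main(String[] args) {
        // Minimal cell: empty row, family, qualifier, and value, so the key is 12 bytes
        byte[] kv = {
            0, 0, 0, 12, 0, 0, 0, 0,   // Key Length = 12, Value Length = 0
            0, 0,                      // Row Length = 0
            0,                         // Family Length = 0
            0, 0, 0, 0, 0, 0, 0, 1,    // Time Stamp = 1
            8                          // Key Type byte
        };
        System.out.println(typeName(typeByte(kv, 0)));
    }
}
```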

14. Reading the Value

  /**
   * Returns value in a new byte array.
   * Primarily for use client-side. If server-side, use
   * {@link #getBuffer()} with appropriate offsets and lengths instead to
   * save on allocations.
   * @return Value in a new byte array.
   */
  @Deprecated // use CellUtil.getValueArray()
  public byte [] getValue() {
    return CellUtil.cloneValue(this);
  }

getValue() returns the Value of the KeyValue, which is the content the Cell actually stores. It passes the KeyValue instance itself to CellUtil.cloneValue(). Let's look at cloneValue():

  public static byte[] cloneValue(Cell cell){
	  
    // Create a byte[] output of size value length
    byte[] output = new byte[cell.getValueLength()];

    // Copy the cell's value into output
    copyValueTo(cell, output, 0);
    return output;
  }

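cloneValue() first creates a byte[] array output of size Value Length, then calls copyValueTo() to copy the cell's value into output. copyValueTo() is simple: it copies Value Length bytes from the bytes array, starting at the Value Offset, into destination. The code is: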

  public static int copyValueTo(Cell cell, byte[] destination, int destinationOffset) {
    
    // Copy value length bytes from the backing array, starting at the value offset, into destination
    System.arraycopy(cell.getValueArray(), cell.getValueOffset(), destination, destinationOffset,
        cell.getValueLength());
    return destinationOffset + cell.getValueLength();
  }
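
Cloning allocates a new array on every call, which is why the Javadoc above recommends working with offsets and lengths on the server side. The sketch below shows that the clone is independent of the backing array; ValueCloneSketch and its sample cell are hypothetical and use java.util.Arrays.copyOfRange rather than the CellUtil methods.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: copies the value bytes out of a raw cell.
class ValueCloneSketch {

    // Sample cell: row "r1", family "cf", qualifier "q", timestamp 1000, value "val"
    static byte[] sample() {
        return new byte[] {
            0, 0, 0, 17, 0, 0, 0, 3,          // Key Length = 17, Value Length = 3
            0, 2, 'r', '1',                   // Row Length = 2, Row = "r1"
            2, 'c', 'f',                      // Family Length = 2, Family = "cf"
            'q',                              // Qualifier = "q"
            0, 0, 0, 0, 0, 0, 3, (byte) 0xE8, // Time Stamp = 1000
            4,                                // Key Type byte (4 assumed to mean Put)
            'v', 'a', 'l'                     // Value = "val"
        };
    }

    static byte[] cloneValue(byte[] kv, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(kv);
        int keyLength = buf.getInt(offset);
        int valueLength = buf.getInt(offset + 4);
        int valueOffset = offset + 8 + keyLength;  // the value follows the key
        return Arrays.copyOfRange(kv, valueOffset, valueOffset + valueLength);
    }

    public static void main(String[] args) {
        byte[] kv = sample();
        byte[] copy = cloneValue(kv, 0);
        copy[0] = 'X';  // changes only the copy, not the backing array
        System.out.println(new String(copy, StandardCharsets.UTF_8) + " / " + (char) kv[25]);
    }
}
```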
