HBase MVCC Design Principles and a Scan Problem Caused by MVCC

Posted by 柠檬大数据




  Recently, while running HBase 0.94, the following warning showed up occasionally: HRegionInfo was null or empty in Meta

  java.io.IOException: HRegionInfo was null or empty in Meta for writetest, row=lot_let,9399239430349923234234,99999999999999

  at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:170)


  The client-side MetaScanner.metaScan implementation contains:

    metaTable = new HTable(configuration, HConstants.META_TABLE_NAME);
    Result startRowResult = metaTable.getRowOrBefore(searchRow, HConstants.CATALOG_FAMILY);
    if (startRowResult == null) {
      throw new TableNotFoundException("Cannot find row in .META. for table: "
          + Bytes.toString(tableName) + ", row=" + Bytes.toStringBinary(searchRow));
    }
    byte[] value = startRowResult.getValue(HConstants.CATALOG_FAMILY,
        HConstants.REGIONINFO_QUALIFIER);
    if (value == null || value.length == 0) {
      throw new IOException("HRegionInfo was null or empty in Meta for "
          + Bytes.toString(tableName) + ", row=" + Bytes.toStringBinary(searchRow));
    }

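  In other words, while scanning .META., the client found no region info for the range that contains the rowkey. Following the RPC leads to the server-side implementation.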


  In HRegion:

    public Result getClosestRowBefore(final byte [] row, final byte [] family)
    throws IOException {
      if (coprocessorHost != null) {
        Result result = new Result();
        if (coprocessorHost.preGetClosestRowBefore(row, family, result)) {
          return result;
        }
      }
      // look across all the HStores for this region and determine what the
      // closest key is across all column families, since the data may be sparse
      checkRow(row, "getClosestRowBefore");
      startRegionOperation();
      this.readRequestsCount.increment();
      try {
        Store store = getStore(family);
        // get the closest key. (HStore.getRowKeyAtOrBefore can return null)
        KeyValue key = store.getRowKeyAtOrBefore(row);
        Result result = null;
        if (key != null) {
          Get get = new Get(key.getRow());
          get.addFamily(family);
          result = get(get, null);
        }
        if (coprocessorHost != null) {
          coprocessorHost.postGetClosestRowBefore(row, family, result);
        }
        return result;
      } finally {
        closeRegionOperation();
      }
    }

  The call KeyValue key = store.getRowKeyAtOrBefore(row); does return a rowkey from the .META. table, but in the code that follows:

    if (key != null) {
      Get get = new Get(key.getRow());
      get.addFamily(family);
      result = get(get, null);
    }

  the get returns an empty result, and that is what triggers the exception.

  Why does this happen?


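  First, a look at how MVCC works in HBase.

  MVCC is a mechanism for guaranteeing data consistency. A write in HBase goes through several stages: writing the HLog, writing the memstore, and advancing MVCC.

  A memstore write only counts as successful once MVCC has been advanced. MVCC is what isolates transactions from one another; for example, a read must not see data that another thread has not yet committed.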
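  The write-number and read-point bookkeeping described above can be sketched in a few lines. This is a simplified model rather than the HBase class: MvccSketch, its WriteEntry, and memstoreReadPoint() are hypothetical names chosen to echo the identifiers that appear later in this article, and the sketch assumes the read point only advances over a contiguous run of completed writes.

```java
import java.util.LinkedList;

// Hypothetical, simplified model of a region's MVCC state (not the HBase class).
class MvccSketch {
    static final class WriteEntry {
        final long writeNumber;
        boolean completed;
        WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private long memstoreWrite = 0; // highest write number handed out so far
    private long memstoreRead = 0;  // highest write number visible to readers
    private final LinkedList<WriteEntry> writeQueue = new LinkedList<>();

    // Start a write: take the next write number and queue it as pending.
    synchronized WriteEntry beginMemstoreInsert() {
        WriteEntry e = new WriteEntry(++memstoreWrite);
        writeQueue.add(e);
        return e;
    }

    // Finish a write: mark it completed, then move the read point forward
    // over the contiguous completed prefix of the queue only.
    synchronized void completeMemstoreInsert(WriteEntry e) {
        e.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.getFirst().completed) {
            memstoreRead = writeQueue.removeFirst().writeNumber;
        }
    }

    synchronized long memstoreReadPoint() {
        return memstoreRead;
    }
}

class MvccSketchDemo {
    public static void main(String[] args) {
        MvccSketch mvcc = new MvccSketch();
        MvccSketch.WriteEntry w1 = mvcc.beginMemstoreInsert();
        MvccSketch.WriteEntry w2 = mvcc.beginMemstoreInsert();
        mvcc.completeMemstoreInsert(w2);
        System.out.println(mvcc.memstoreReadPoint()); // prints 0: w1 is still pending
        mvcc.completeMemstoreInsert(w1);
        System.out.println(mvcc.memstoreReadPoint()); // prints 2: both writes visible
    }
}
```

  In this model a later write that finishes first stays invisible until every earlier write has finished, which is the isolation property the paragraph above describes.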

  1. Both put and delete call applyFamilyMapToMemstore.

  In HRegion:

    private long applyFamilyMapToMemstore(Map<byte[], List<KeyValue>> familyMap,
        MultiVersionConsistencyControl.WriteEntry localizedWriteEntry) {
      long size = 0;
      boolean freemvcc = false;

      try {
        if (localizedWriteEntry == null) {
          // Begin a memstore write: mvcc increments memstoreWrite and adds
          // the new entry to the queue of pending writes
          localizedWriteEntry = mvcc.beginMemstoreInsert();
          freemvcc = true;
        }

        for (Map.Entry<byte[], List<KeyValue>> e : familyMap.entrySet()) {
          byte[] family = e.getKey();
          List<KeyValue> edits = e.getValue();

          Store store = getStore(family);
          for (KeyValue kv: edits) {
            kv.setMemstoreTS(localizedWriteEntry.getWriteNumber());
            size += store.add(kv);
          }
        }
      } finally {
        if (freemvcc) {
          mvcc.completeMemstoreInsert(localizedWriteEntry);
        }
      }

      return size;
    }


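  mvcc.completeMemstoreInsert advances mvcc's memstoreRead, the point up to which data is readable, and calls readWaiters.notifyAll() to release any thread blocked in waitForRead, such as the one called from flushcache.

  waitForRead is shown below: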

    public void waitForRead(WriteEntry e) {
      boolean interrupted = false;
      synchronized (readWaiters) {
        // "less than" means some earlier writes are still uncommitted
        while (memstoreRead < e.getWriteNumber()) {
          try {
            readWaiters.wait(0);
          } catch (InterruptedException ie) {
            // We were interrupted... finish the loop -- i.e. cleanup --and then
            // on our way out, reset the interrupt flag.
            interrupted = true;
          }
        }
      }
      if (interrupted) Thread.currentThread().interrupt();
    }


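  2. During flushcache, after the KeyValues in the memstore have been obtained, mvcc.waitForRead(w) is called. The memstore holds every KeyValue, including ones that are not yet truly committed, so the flush has to wait for the other transactions to commit before it continues; this keeps the flushed data transactionally consistent.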

    w = mvcc.beginMemstoreInsert();
    mvcc.advanceMemstore(w);
    mvcc.waitForRead(w);
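
  The blocking behaviour of these three calls can be illustrated with a small wait/notify sketch. ReadPointGate, advanceTo, and FlushWaitDemo are hypothetical names, not HBase APIs, and the sketch assumes that committing a write is the only thing that moves the read point. A "flusher" thread holding write number 3 stays blocked while write 2 is still pending and is released once the read point reaches 3.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the read-point part of the region's MVCC object.
class ReadPointGate {
    private final Object readWaiters = new Object();
    private long memstoreRead = 0;

    // Called when writes commit: move the read point forward and wake waiters.
    void advanceTo(long writeNumber) {
        synchronized (readWaiters) {
            if (writeNumber > memstoreRead) {
                memstoreRead = writeNumber;
            }
            readWaiters.notifyAll();
        }
    }

    // Block until every write up to writeNumber is visible.
    void waitForRead(long writeNumber) throws InterruptedException {
        synchronized (readWaiters) {
            while (memstoreRead < writeNumber) {
                readWaiters.wait();
            }
        }
    }
}

class FlushWaitDemo {
    // Returns true if the "flush" stayed blocked until the read point reached 3.
    static boolean flushWaitsForWriters() {
        ReadPointGate gate = new ReadPointGate();
        CountDownLatch flushed = new CountDownLatch(1);
        Thread flusher = new Thread(() -> {
            try {
                gate.waitForRead(3);  // the flush's own write number is 3
                flushed.countDown();  // stands in for the real flush work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        flusher.start();
        try {
            gate.advanceTo(1);  // write 1 committed, write 2 still pending
            boolean early = flushed.await(100, TimeUnit.MILLISECONDS);
            gate.advanceTo(3);  // write 2 committed, read point passes the flush entry
            boolean done = flushed.await(5, TimeUnit.SECONDS);
            flusher.join();
            return !early && done;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(flushWaitsForWriters());
    }
}
```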


  3. Scanning data

  In the RegionScannerImpl.next implementation:

    public synchronized boolean next(List<KeyValue> outResults, int limit)
    throws IOException {
      if (this.filterClosed) {
        throw new UnknownScannerException("Scanner was closed (timed out?) " +
            "after we renewed it. Could be caused by a very slow scanner " +
            "or a lengthy garbage collection");
      }
      startRegionOperation();
      readRequestsCount.increment();
      try {

        // This could be a new thread from the last time we called next().
        // this.readPt is initialized when the scanner is constructed (to the
        // memstoreRead of the region's mvcc, i.e. the current readable point)
        // and is bound to the current thread here
        MultiVersionConsistencyControl.setThreadReadPoint(this.readPt);


  The MemStore then filters out transactions that are not yet committed (a newly written KeyValue carries the latest write number):


    protected KeyValue getNext(Iterator<KeyValue> it) {
      long readPoint = MultiVersionConsistencyControl.getThreadReadPoint();

      while (it.hasNext()) {
        KeyValue v = it.next();
        // skip any KeyValue newer than the current thread's readPoint
        if (v.getMemstoreTS() <= readPoint) {
          return v;
        }
      }

      return null;
    }
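
  The same visibility rule can be shown in isolation. The following is a minimal sketch, assuming a pared-down cell that carries only a row name and its MVCC write number; VersionedCell and ReadPointFilter are hypothetical names, not HBase types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, pared-down KeyValue: just a row name and its MVCC write number.
class VersionedCell {
    final String row;
    final long memstoreTS;

    VersionedCell(String row, long memstoreTS) {
        this.row = row;
        this.memstoreTS = memstoreTS;
    }
}

class ReadPointFilter {
    // Keep only cells whose write number is at or below the scanner's read point.
    static List<String> visibleRows(List<VersionedCell> cells, long readPoint) {
        List<String> visible = new ArrayList<>();
        for (VersionedCell c : cells) {
            if (c.memstoreTS <= readPoint) {
                visible.add(c.row);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<VersionedCell> cells = Arrays.asList(
            new VersionedCell("row-a", 1),
            new VersionedCell("row-b", 2),
            new VersionedCell("row-c", 5)); // written after the scanner was opened
        System.out.println(visibleRows(cells, 2)); // prints [row-a, row-b]
    }
}
```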


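  With the whole MVCC flow in mind, look again at the getClosestRowBefore implementation in HRegion:

    KeyValue key = store.getRowKeyAtOrBefore(row);

  This call is not subject to MVCC control, so it can see everything in the memstore.

  The get method, on the other hand, is subject to MVCC control. One possible scenario is therefore that, at the moment get is called, the key returned by store.getRowKeyAtOrBefore(row) has not been committed yet.

  All of its KeyValues are then filtered out, and the lookup comes back empty.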
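
  The race can be reproduced in a toy model. This is a sketch under stated assumptions, not HBase code: MetaRaceSketch and its methods are hypothetical, the memstore is reduced to a sorted map from row to write number, and the row names are made up. rowAtOrBefore ignores the read point, as the article says store.getRowKeyAtOrBefore does, while get honours it.

```java
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical toy region: the memstore maps each row to its MVCC write number.
class MetaRaceSketch {
    private final TreeMap<String, Long> memstore = new TreeMap<>();
    private long readPoint = 0;

    // Put a row into the memstore; it stays invisible until committed.
    void add(String row, long writeNumber) {
        memstore.put(row, writeNumber);
    }

    // Commit: advance the read point so writes up to writeNumber become visible.
    void commitUpTo(long writeNumber) {
        readPoint = writeNumber;
    }

    // Unguarded lookup: sees every row in the memstore, committed or not.
    String rowAtOrBefore(String row) {
        return memstore.floorKey(row);
    }

    // MVCC-guarded read: a row newer than the read point is filtered out.
    Optional<String> get(String row) {
        Long ts = memstore.get(row);
        return (ts != null && ts <= readPoint) ? Optional.of(row) : Optional.empty();
    }

    // Same two-step shape as getClosestRowBefore: unguarded lookup, guarded get.
    Optional<String> closestRowBefore(String row) {
        String key = rowAtOrBefore(row);
        return key == null ? Optional.empty() : get(key);
    }
}

class MetaRaceDemo {
    public static void main(String[] args) {
        MetaRaceSketch region = new MetaRaceSketch();
        region.add("t,a", 1);
        region.commitUpTo(1);
        region.add("t,m", 2); // written to the memstore but not yet committed
        System.out.println(region.closestRowBefore("t,z")); // prints Optional.empty
        region.commitUpTo(2);
        System.out.println(region.closestRowBefore("t,z")); // prints Optional[t,m]
    }
}
```

  In this model the empty result is transient: the same lookup succeeds once the pending write commits, which matches the occasional nature of the warning.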
