LevelDB Data Compaction Source Code Analysis

Posted by 叶长风


LevelDB Data Compaction Source Code Analysis (1)

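This section covers LevelDB's data compaction process. The previous section covered how LevelDB looks up data. At the end of that section, when describing lookups in level n, some readers were probably left puzzled: a binary search locates the sstable, the table is loaded through the cache, and the key is found. Reading the source alone may not make it obvious why that works, but it follows from how LevelDB compacts data, which is the subject of this section.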

Compaction


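LevelDB has two compaction modes: minor compaction and major compaction. A minor compaction writes the contents of the memtable out to an sstable. A major compaction merges different sstables; that process is more involved and is explained with the later source code. We start with minor compaction.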
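To make the minor path concrete before reading the real source, here is a toy model. It is not LevelDB's code: the class ToyMinorCompaction, the flushThreshold parameter (counted in entries, whereas the real trigger is a size in bytes), and the in-memory list standing in for level-0 files are all names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of a minor compaction; every name here is hypothetical.
final class ToyMinorCompaction {
    private final int flushThreshold;   // max entries before a flush (assumed policy)
    private TreeMap<String, String> memTable = new TreeMap<>();
    // Each flushed memtable becomes one immutable sorted run ("level-0 file").
    private final List<TreeMap<String, String>> level0 = new ArrayList<>();

    ToyMinorCompaction(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String key, String value) {
        memTable.put(key, value);
        if (memTable.size() >= flushThreshold) {
            minorCompaction();
        }
    }

    // Dump the memtable as-is into a new run; nothing is merged with older runs.
    private void minorCompaction() {
        level0.add(memTable);
        memTable = new TreeMap<>();
    }

    int level0FileCount() {
        return level0.size();
    }

    int memTableSize() {
        return memTable.size();
    }

    public static void main(String[] args) {
        ToyMinorCompaction db = new ToyMinorCompaction(2);
        db.put("a", "1");
        db.put("b", "2");   // reaches the threshold and triggers a flush
        db.put("a", "3");   // "a" now exists in the memtable and in an older run
        System.out.println(db.level0FileCount() + " run(s) flushed, "
                + db.memTableSize() + " entry in the memtable");
    }
}
```

Note that after the last put the key "a" exists twice, once in the flushed run and once in the new memtable. Cleaning up such duplicates is the job of a major compaction, not a minor one.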

minor Compaction


A minor compaction happens when the in-memory memtable reaches a certain size: its data is saved to an sstable file. First, the source that schedules compaction:

private void maybeScheduleCompaction()
{
    checkState(mutex.isHeldByCurrentThread());

    if (backgroundCompaction != null) {
        // Already scheduled
    }
    else if (shuttingDown.get()) {
        // DB is being shutdown; no more background compactions
    }
    else if (immutableMemTable == null &&
            manualCompaction == null &&
            !versions.needsCompaction()) {
        // No work to be done
    }
    else {
        backgroundCompaction = compactionExecutor.submit(new Callable<Void>()
        {
            @Override
            public Void call()
                    throws Exception
            {
                try {
                    backgroundCall();
                }
                catch (DatabaseShutdownException ignored) {
                }
                catch (Throwable e) {
                    backgroundException = e;
                }
                return null;
            }
        });
    }
}

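This does not spawn a new thread on every call. It submits one task to compactionExecutor, and only when no task is already pending, the database is not shutting down, and there is work to do. The task calls backgroundCall(), whose body is: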

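    mutex.lock();
    try {
        if (backgroundCompaction == null) {
            return;
        }

        try {
            if (!shuttingDown.get()) {
                backgroundCompaction();
            }
        }
        finally {
            // Clear the handle so that a new compaction can be scheduled
            backgroundCompaction = null;
        }
        // ... (the rest of the method is not shown in this excerpt)
    }
    finally {
        mutex.unlock();
    }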

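The scheduling guard in these two methods can be sketched in isolation. This is a simplified model under stated assumptions, not the library's code: SingleFlightScheduler, pendingWork and awaitIdle are hypothetical names, and the re-schedule at the end of backgroundCall is an assumption of the sketch that the excerpt above does not show.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of "at most one background task in flight"; names are hypothetical.
final class SingleFlightScheduler {
    private final ReentrantLock mutex = new ReentrantLock();
    private final Condition idle = mutex.newCondition();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final AtomicInteger runs = new AtomicInteger();
    private Future<?> background;   // non-null while a task is scheduled or running
    private int pendingWork;        // guarded by mutex

    void addWork(int units) {
        mutex.lock();
        try {
            pendingWork += units;
            maybeSchedule();
        } finally {
            mutex.unlock();
        }
    }

    // Caller must hold the mutex, mirroring the checkState(...) in the source.
    private void maybeSchedule() {
        if (background != null || pendingWork == 0) {
            return;                 // already scheduled, or nothing to do
        }
        Runnable task = this::backgroundCall;
        background = executor.submit(task);
    }

    private void backgroundCall() {
        mutex.lock();
        try {
            try {
                pendingWork--;      // stand-in for one round of compaction
                runs.incrementAndGet();
            } finally {
                background = null;  // allow the next schedule attempt
            }
            maybeSchedule();        // assumed: pick up work that is still pending
        } finally {
            idle.signalAll();       // wake anyone waiting for the scheduler to drain
            mutex.unlock();
        }
    }

    // Blocks until no background task is scheduled or running.
    void awaitIdle() {
        mutex.lock();
        try {
            while (background != null) {
                idle.awaitUninterruptibly();
            }
        } finally {
            mutex.unlock();
        }
    }

    int completedRuns() {
        return runs.get();
    }

    void close() {
        executor.shutdown();
    }

    public static void main(String[] args) {
        SingleFlightScheduler scheduler = new SingleFlightScheduler();
        scheduler.addWork(3);
        scheduler.awaitIdle();
        System.out.println("background runs: " + scheduler.completedRuns());
        scheduler.close();
    }
}
```

The key point is that the handle is only read and written while the mutex is held, so two callers can never both see it as null and submit two tasks.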
Next, backgroundCompaction():

private void backgroundCompaction()
        throws IOException
{
    checkState(mutex.isHeldByCurrentThread());

    // Minor compaction: flush the immutable memtable first, if there is one
    compactMemTableInternal();

    Compaction compaction;
    if (manualCompaction != null) {
        compaction = versions.compactRange(manualCompaction.level,
                new InternalKey(manualCompaction.begin, MAX_SEQUENCE_NUMBER, VALUE),
                new InternalKey(manualCompaction.end, 0, DELETION));
    }
    else {
        compaction = versions.pickCompaction();
    }
    // ... (major compaction handling is not shown in this excerpt)
}

Look at compactMemTableInternal() first; this is essentially the minor compaction.

private void compactMemTableInternal()
        throws IOException
{
    checkState(mutex.isHeldByCurrentThread());
    if (immutableMemTable == null) {
        return;
    }

    try {
        // Save the contents of the memtable as a new Table
        VersionEdit edit = new VersionEdit();
        Version base = versions.getCurrent();
        writeLevel0Table(immutableMemTable, edit, base);

        if (shuttingDown.get()) {
            throw new DatabaseShutdownException("Database shutdown during memtable compaction");
        }

        // Replace immutable memtable with the generated Table
        edit.setPreviousLogNumber(0);
        edit.setLogNumber(log.getFileNumber());  // Earlier logs no longer needed
        versions.logAndApply(edit);

        immutableMemTable = null;

        deleteObsoleteFiles();
    }
    finally {
        backgroundCondition.signalAll();
    }
}

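The method first checks whether immutableMemTable is null and returns immediately if it is. This is typically the case right after LevelDB has been instantiated, when nothing has been written to immutableMemTable yet. Otherwise it calls writeLevel0Table(), whose source is: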

private void writeLevel0Table(MemTable mem, VersionEdit edit, Version base)
        throws IOException
{
    checkState(mutex.isHeldByCurrentThread());

    // skip empty mem table
    if (mem.isEmpty()) {
        return;
    }

    // write the memtable to a new sstable
    long fileNumber = versions.getNextFileNumber();
    pendingOutputs.add(fileNumber);
    mutex.unlock();
    FileMetaData meta;
    try {
        meta = buildTable(mem, fileNumber);
    }
    finally {
        mutex.lock();
    }
    pendingOutputs.remove(fileNumber);

    // Note that if file size is zero, the file has been deleted and
    // should not be added to the manifest.
    int level = 0;
    if (meta != null && meta.getFileSize() > 0) {
        Slice minUserKey = meta.getSmallest().getUserKey();
        Slice maxUserKey = meta.getLargest().getUserKey();
        if (base != null) {
            level = base.pickLevelForMemTableOutput(minUserKey, maxUserKey);
        }
        edit.addFile(level, meta);
    }
}

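The method takes the next file number, writes the data in mem to that file, and gets back a FileMetaData object (meta). The metadata is added to the VersionEdit, and through it to the version, so that later lookups can find the file. The mutex is released while the table is being built and re-acquired afterwards. Next, buildTable: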

private FileMetaData buildTable(SeekingIterable<InternalKey, Slice> data, long fileNumber)
        throws IOException
{
    File file = new File(databaseDir, Filename.tableFileName(fileNumber));
    try {
        InternalKey smallest = null;
        InternalKey largest = null;
        FileChannel channel = new FileOutputStream(file).getChannel();
        try {
            TableBuilder tableBuilder = new TableBuilder(options, channel, new InternalUserComparator(internalKeyComparator));

            for (Entry<InternalKey, Slice> entry : data) {
                // update keys
                InternalKey key = entry.getKey();
                if (smallest == null) {
                    smallest = key;
                }
                largest = key;

                tableBuilder.add(key.encode(), entry.getValue());
            }

            tableBuilder.finish();
        }
        finally {
            try {
                channel.force(true);
            }
            finally {
                channel.close();
            }
        }

        if (smallest == null) {
            return null;
        }
        FileMetaData fileMetaData = new FileMetaData(fileNumber, file.length(), smallest, largest);

        // verify table can be opened
        tableCache.newIterator(fileMetaData);

        pendingOutputs.remove(fileNumber);

        return fileMetaData;
    }
    catch (IOException e) {
        file.delete();
        throw e;
    }
}

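This method writes the contents of the memtable to a file without merging it with any other file or deduplicating against existing data, and it assembles and returns the metadata for the new file.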
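The key-range bookkeeping in buildTable works because the input is iterated in sorted order: the first key seen is the smallest and the last key seen is the largest, so one pass is enough. A minimal sketch of just that part follows. KeyRangeSketch and Range are hypothetical stand-ins, and plain strings replace InternalKey.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of the key-range bookkeeping in buildTable; names are hypothetical.
final class KeyRangeSketch {
    // Minimal stand-in for FileMetaData: only the key range and entry count.
    static final class Range {
        final String smallest;
        final String largest;
        final int entries;

        Range(String smallest, String largest, int entries) {
            this.smallest = smallest;
            this.largest = largest;
            this.entries = entries;
        }
    }

    // Returns null for empty input, as buildTable does when smallest stays null.
    static Range scan(TreeMap<String, String> sorted) {
        String smallest = null;
        String largest = null;
        int entries = 0;
        for (Map.Entry<String, String> entry : sorted.entrySet()) {
            if (smallest == null) {
                smallest = entry.getKey();   // first key in sorted order
            }
            largest = entry.getKey();        // overwritten until the last key
            entries++;                       // the real code appends to the table here
        }
        return smallest == null ? null : new Range(smallest, largest, entries);
    }

    public static void main(String[] args) {
        TreeMap<String, String> memTable = new TreeMap<>();
        memTable.put("m", "1");
        memTable.put("a", "2");
        memTable.put("z", "3");
        Range range = scan(memTable);
        System.out.println(range.smallest + " .. " + range.largest
                + " (" + range.entries + " entries)");
    }
}
```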

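This explains why, as seen earlier, a lookup in level 0 has to consider every file and order them so that the most recently written one is searched first. Minor compactions run every so often and do no deduplication, so the same key is likely to appear in several level-0 files, and the lookup has to find the most recently updated value.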
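A newest-first lookup over overlapping level-0 files can be sketched as follows. This is an illustration under assumptions rather than the library's code: Level0LookupSketch and ToyFile are hypothetical, and using a larger file number to mean "written later" is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Sketch of a newest-first lookup across overlapping level-0 files.
final class Level0LookupSketch {
    // Hypothetical stand-in for an sstable plus its metadata.
    static final class ToyFile {
        final long number;                    // assumed: larger means written later
        final TreeMap<String, String> data = new TreeMap<>();

        ToyFile(long number) {
            this.number = number;
        }

        // True if the key falls inside this file's [smallest, largest] range.
        boolean mayContain(String key) {
            return !data.isEmpty()
                    && data.firstKey().compareTo(key) <= 0
                    && data.lastKey().compareTo(key) >= 0;
        }
    }

    // Builds a file from alternating key/value arguments.
    static ToyFile file(long number, String... keysAndValues) {
        ToyFile f = new ToyFile(number);
        for (int i = 0; i + 1 < keysAndValues.length; i += 2) {
            f.data.put(keysAndValues[i], keysAndValues[i + 1]);
        }
        return f;
    }

    static String get(List<ToyFile> level0, String key) {
        // Level-0 ranges may overlap, so every file is a candidate.
        List<ToyFile> candidates = new ArrayList<>();
        for (ToyFile f : level0) {
            if (f.mayContain(key)) {
                candidates.add(f);
            }
        }
        // Search the newest file first so that the latest write wins.
        candidates.sort(Comparator.comparingLong((ToyFile f) -> f.number).reversed());
        for (ToyFile f : candidates) {
            String value = f.data.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<ToyFile> level0 = new ArrayList<>();
        level0.add(file(7, "a", "old", "c", "x"));
        level0.add(file(9, "a", "new", "b", "y"));
        System.out.println("a = " + get(level0, "a"));
    }
}
```

Both files contain the key "a", and the lookup returns the value from file 9 because that file is searched before file 7.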


Major compaction is a longer process, so it is left for the next section.
