C++ Source Code Analysis: deque

Posted 2023-04-03 落樱弥城

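Preface: I read Hou Jie's 《STL源码剖析》 (The Annotated STL Sources) many years ago. At work, tracking down bugs and crashes now often requires knowing how the STL we actually use is implemented, so I plan to go through the STL source code once more.
Abstract: This article describes the implementation of deque in LLVM's libcxx.
Keywords: deque
Other: the reference code is LLVM-libcxx
Note: the reference code is LLVM's implementation, which differs from both the GNU and the MSVC implementations.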

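deque is the double-ended queue sequence container of the STL: elements can be inserted or popped at both the front and the back. Because these operations only touch the two ends, they run in essentially constant time. The implementation of deque is more complex than that of vector, since it uses several memory segments instead of one contiguous block. You therefore cannot assume that the memory of a deque is fully contiguous.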

1 `deque_map`

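The queue in the C++ standard library is built from individual blocks rather than one contiguous chunk of memory. The memory inside each block is contiguous, and the block addresses are stored in `__map_`. `__map_` is a sequence container similar to the internals of vector. It stores one pointer per block, and each pointer refers to a contiguous chunk of `__block_size * sizeof(value_type)` bytes.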

template <class _ValueType, class _DiffType>
struct __deque_block_size {
  static const _DiffType value = sizeof(_ValueType) < 256 ? 4096 / sizeof(_ValueType) : 16;
};

// data members of deque (excerpt)
static const difference_type __block_size;
__map __map_;
size_type __start_;
__compressed_pair<size_type, allocator_type> __size_;
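To get a feel for the numbers, the block-size rule quoted above can be reproduced in a few lines. This is a minimal sketch rather than libc++ code: `block_size_for` and `Big` are hypothetical names introduced here, and only the formula itself comes from the excerpt.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper that repeats the formula from __deque_block_size:
// elements smaller than 256 bytes share a 4096-byte block,
// larger elements are stored 16 per block.
template <class T>
constexpr std::ptrdiff_t block_size_for() {
    return sizeof(T) < 256 ? static_cast<std::ptrdiff_t>(4096 / sizeof(T)) : 16;
}

// Hypothetical 512-byte element type, used only for illustration.
struct Big {
    char bytes[512];
};

static_assert(block_size_for<char>() == 4096, "1-byte elements: 4096 per block");
static_assert(block_size_for<Big>() == 16, "elements of 256 bytes or more: 16 per block");
```

Under this rule a block of small elements always spans roughly 4096 bytes, while large elements trade that fixed footprint for a fixed count of 16 per block.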

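So let's first take a quick look at the structure of `__map_`. As shown below, `__map` is simply a `__split_buffer`. The vector source uses this structure as well, but only as a temporary buffer while its storage is being reallocated, so it plays no major role there.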

using __map = __split_buffer<pointer, __pointer_allocator>;

split_buffer

template <class _Tp, class _Allocator = allocator<_Tp> >
struct __split_buffer
{
    static const difference_type __block_size;
    pointer __first_;
    pointer __begin_;
    pointer __end_;
    __compressed_pair<pointer, allocator_type> __end_cap_;
};

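`__split_buffer` is essentially the same as vector, as the implementation of `__construct_at_end` shows. The main difference is that the elements do not have to start at the beginning of the allocation (`__first_`): they start at `__begin_`, so `distance(__first_, __begin_)` may be nonzero, and `__split_buffer` shifts its elements around frequently.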

template <class _Tp, class _Allocator>
template <class _InputIter>
_LIBCPP_CONSTEXPR_SINCE_CXX20 __enable_if_t<__is_exactly_cpp17_input_iterator<_InputIter>::value>
__split_buffer<_Tp, _Allocator>::__construct_at_end(_InputIter __first, _InputIter __last)
{
    __alloc_rr& __a = this->__alloc();
    for (; __first != __last; ++__first)
    {
        if (__end_ == __end_cap())
        {
            size_type __old_cap = __end_cap() - __first_;
            size_type __new_cap = _VSTD::max<size_type>(2 * __old_cap, 8);
            __split_buffer __buf(__new_cap, 0, __a);
            for (pointer __p = __begin_; __p != __end_; ++__p, (void) ++__buf.__end_)
                __alloc_traits::construct(__buf.__alloc(),
                        _VSTD::__to_address(__buf.__end_), _VSTD::move(*__p));
            swap(__buf);
        }
        __alloc_traits::construct(__a, _VSTD::__to_address(this->__end_), *__first);
        ++this->__end_;
    }
}

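When elements are inserted, `__split_buffer` adjusts the position of the existing elements so that they stay near the middle of the allocation. This keeps spare room at both the head and the tail, so insertions at either end do not have to allocate memory frequently.

When there is not enough room and a new allocation is triggered, the strategy also differs from vector: after reallocating, the used range is placed near the middle of the new buffer.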

void __split_buffer<_Tp, _Allocator>::push_back(const_reference __x)
{
    if (__end_ == __end_cap())
    {
        if (__begin_ > __first_)
        {
            // If the front of the buffer has spare room, slide the elements toward the front;
            // push_front does the opposite and slides them toward the back.
            difference_type __d = __begin_ - __first_;
            __d = (__d + 1) / 2;
            __end_ = _VSTD::move(__begin_, __end_, __begin_ - __d);
            __begin_ -= __d;
        }
        else
        {
            size_type __c = std::max<size_type>(2 * static_cast<size_t>(__end_cap() - __first_), 1);
            // push_front differs slightly here:
            // push_front   __split_buffer<value_type, __alloc_rr&> __t(__c, (__c + 3) / 4, __alloc());
            __split_buffer<value_type, __alloc_rr&> __t(__c, __c / 4, __alloc());   // the start offset places the used range near the middle
            __t.__construct_at_end(move_iterator<pointer>(__begin_),
                                   move_iterator<pointer>(__end_));
            _VSTD::swap(__first_, __t.__first_);
            _VSTD::swap(__begin_, __t.__begin_);
            _VSTD::swap(__end_, __t.__end_);
            _VSTD::swap(__end_cap(), __t.__end_cap());
        }
    }
    __alloc_traits::construct(__alloc(), _VSTD::__to_address(__end_), __x);
    ++__end_;
}
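The sliding step in the first branch is easier to follow with plain indices. The following is a minimal sketch under stated assumptions: `ToyBuffer` and its members are hypothetical names, a `std::vector<int>` stands in for the raw allocation, and only the `(d + 1) / 2` rule is taken from the excerpt above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: indices into `storage` stand in for the
// __first_ / __begin_ / __end_ / __end_cap() pointers of __split_buffer.
struct ToyBuffer {
    std::vector<int> storage;  // the whole allocation, [__first_, __end_cap())
    std::size_t begin = 0;     // index of the first live element (__begin_)
    std::size_t end = 0;       // one past the last live element (__end_)

    std::size_t front_spare() const { return begin; }
    std::size_t back_spare() const { return storage.size() - end; }

    // Mirrors the first branch of push_back: slide the live range toward
    // the front by half of the front spare, rounded up.
    void slide_toward_front() {
        const std::size_t d = (front_spare() + 1) / 2;
        if (d == 0) return;  // no front spare, nothing to slide
        std::move(storage.begin() + static_cast<std::ptrdiff_t>(begin),
                  storage.begin() + static_cast<std::ptrdiff_t>(end),
                  storage.begin() + static_cast<std::ptrdiff_t>(begin - d));
        begin -= d;
        end -= d;
    }
};
```

For example, with a capacity of 8 and the live range at indices [4, 8), one slide moves it to [2, 6), which leaves two free slots on each side.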

2 `push_back`

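Let's look directly at the implementation of push_back to see how an element is inserted. This also gives a complete picture of the memory structure of deque. The flow is simple: check whether there is enough room, allocate more if there is not, then construct the element at the tail.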

template <class _Tp, class _Allocator>
void deque<_Tp, _Allocator>::push_back(const value_type& __v)
{
    allocator_type& __a = __alloc();
    if (__back_spare() == 0)
        __add_back_capacity();
    // __back_spare() >= 1
    __alloc_traits::construct(__a, _VSTD::addressof(*end()), __v);
    ++__size();
}

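From the size-related functions of deque we can roughly infer the memory structure. `__start_` marks the first element of the queue, expressed as an offset relative to all slots of the allocated blocks, so its range is [0, `__map_.size() * __block_size - 1`]. From `__start_`, `__block_size`, `__map_.size()` and `__size_` we can compute where the queue ends and how many elements it holds.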

    size_type __capacity() const {
        return __map_.size() == 0 ? 0 : __map_.size() * __block_size - 1;
    }
    size_type __block_count() const {
        return __map_.size();
    }
    size_type __front_spare() const {
        return __start_;
    }
    size_type __front_spare_blocks() const {
      return __front_spare() / __block_size;
    }
    size_type __back_spare() const {
        return __capacity() - (__start_ + size());
    }
    size_type __back_spare_blocks() const {
      return __back_spare() / __block_size;
    }
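The bookkeeping above can be tried out with plain numbers. This is only a sketch: `ToyLayout` is a hypothetical struct whose fields stand in for `__map_.size()`, `__block_size`, `__start_` and `size()`, and the formulas are copied from the excerpt.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical plain-number model of the size helpers shown above.
struct ToyLayout {
    std::size_t block_count;  // stands in for __map_.size()
    std::size_t block_size;   // stands in for __block_size
    std::size_t start;        // stands in for __start_
    std::size_t size;         // stands in for size()

    std::size_t capacity() const {
        return block_count == 0 ? 0 : block_count * block_size - 1;
    }
    std::size_t front_spare() const { return start; }
    std::size_t back_spare() const { return capacity() - (start + size); }
    std::size_t front_spare_blocks() const { return front_spare() / block_size; }
    std::size_t back_spare_blocks() const { return back_spare() / block_size; }
};
```

With 3 blocks of 4 slots, a start offset of 5 and 6 elements, the capacity is 11 and the back spare is 0, so the next push_back would have to call `__add_back_capacity()` even though one whole block at the front is unused.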

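From the above we can roughly see the memory structure of deque: `__map_` holds the pointers to the blocks, `__start_` is the offset of the first element counted from the start of the first block, and the elements occupy the offsets [`__start_`, `__start_ + size()`).

The code below handles three cases differently:

When there is a whole spare block at the front of the queue, that block is moved to the back, so no memory has to be allocated;
When the map still has an unused slot, a new block is allocated and appended;
1. If the split_buffer has room at its tail, the new block is allocated and pushed at the back directly;
2. If the split_buffer has no room at its tail, the new block is first pushed at the front, then popped and pushed at the back. This moves the used range toward the middle of the split_buffer without extra cost;
When the map has no free slot at all, the whole map is reallocated.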

template <class _Tp, class _Allocator>
void
deque<_Tp, _Allocator>::__add_back_capacity()
{
    allocator_type& __a = __alloc();
    if (__front_spare() >= __block_size)   // a whole spare block at the front of the queue
    {
        __start_ -= __block_size;
        pointer __pt = __map_.front();
        __map_.pop_front();
        __map_.push_back(__pt);
    }
    // Else if __nb <= __map_.capacity() - __map_.size() then we need to allocate __nb buffers
    else if (__map_.size() < __map_.capacity())
    {   // we can put the new buffer into the map, but don't shift things around
        // until it is allocated.  If we throw, we don't need to fix
        // anything up (any added buffers are undetectible)
        if (__map_.__back_spare() != 0)         // __map_ has room at its tail, so push_back inserts without reallocating
            __map_.push_back(__alloc_traits::allocate(__a, __block_size));
        else
        {
            // done this way so that the used range ends up in the middle of the split_buffer
            __map_.push_front(__alloc_traits::allocate(__a, __block_size));
            // Done allocating, reorder capacity
            pointer __pt = __map_.front();
            __map_.pop_front();
            __map_.push_back(__pt);
        }
    }
    // Else need to allocate 1 buffer, *and* we need to reallocate __map_.
    else // reallocate the whole map
    {
        __split_buffer<pointer, __pointer_allocator&>
            __buf(std::max<size_type>(2* __map_.capacity(), 1),
                  __map_.size(),
                  __map_.__alloc());

        typedef __allocator_destructor<_Allocator> _Dp;
        unique_ptr<pointer, _Dp> __hold(
            __alloc_traits::allocate(__a, __block_size),
                _Dp(__a, __block_size));
        __buf.push_back(__hold.get());
        __hold.release();

        for (__map_pointer __i = __map_.end();
                __i != __map_.begin();)
            __buf.push_front(*--__i);
        _VSTD::swap(__map_.__first_, __buf.__first_);
        _VSTD::swap(__map_.__begin_, __buf.__begin_);
        _VSTD::swap(__map_.__end_, __buf.__end_);
        _VSTD::swap(__map_.__end_cap(), __buf.__end_cap());
    }
}
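The order in which `__add_back_capacity` tests its three cases can be summarised as a small decision function. This is a sketch only: `BackGrowth` and `choose_back_growth` are hypothetical names, and the parameters are plain numbers standing in for `__front_spare()`, `__block_size`, `__map_.size()` and `__map_.capacity()`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical labels for the three branches of __add_back_capacity.
enum class BackGrowth { RecycleFrontBlock, AllocateBlock, ReallocateMap };

// Returns the branch that would be taken for the given bookkeeping values.
BackGrowth choose_back_growth(std::size_t front_spare, std::size_t block_size,
                              std::size_t map_size, std::size_t map_capacity) {
    if (front_spare >= block_size)   // a whole unused block sits at the front
        return BackGrowth::RecycleFrontBlock;
    if (map_size < map_capacity)     // the map still has a free slot
        return BackGrowth::AllocateBlock;
    return BackGrowth::ReallocateMap;  // the map itself is full
}
```

The cheapest option comes first: recycling a block allocates nothing, allocating a block leaves the map untouched, and only the last case pays for both a new block and a new map.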

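push_front follows essentially the same logic in the opposite direction. emplace_back is logically the same as push_back; the only difference is when the object is constructed, which is the same difference as between the two functions in vector.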
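The difference in construction timing can be observed by counting copies. A minimal sketch, assuming a hypothetical element type `Tracked` that was introduced for this illustration; it uses only the public `std::deque` interface.

```cpp
#include <cassert>
#include <deque>

// Hypothetical element type that counts copy constructions.
// Declaring the copy constructor suppresses the implicit move constructor,
// so copying or moving a Tracked always goes through it.
struct Tracked {
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

int copies_made_by_push_back() {
    Tracked::copies = 0;
    std::deque<Tracked> d;
    d.push_back(Tracked(1));  // the temporary is built first, then copied into the deque
    return Tracked::copies;
}

int copies_made_by_emplace_back() {
    Tracked::copies = 0;
    std::deque<Tracked> d;
    d.emplace_back(2);        // the element is constructed in place from the argument
    return Tracked::copies;
}
```

push_back needs an already constructed object and copies (or moves) it into the block, while emplace_back forwards its arguments and constructs the element directly in the block.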

3 `pop_front`

The implementation of pop_front is simple: it destroys the object at the front of the queue.

template <class _Tp, class _Allocator>
void deque<_Tp, _Allocator>::pop_front()
{
    allocator_type& __a = __alloc();
    __alloc_traits::destroy(__a, _VSTD::__to_address(*(__map_.begin() +
                                                    __start_ / __block_size) +
                                                    __start_ % __block_size));
    --__size();
    ++__start_;
    __maybe_remove_front_spare();
}

If too much spare room accumulates at the front of the queue, deque tries to reclaim it. The threshold is 2 blocks.

bool __maybe_remove_front_spare(bool __keep_one = true) {
    if (__front_spare_blocks() >= 2 || (!__keep_one && __front_spare_blocks())) {
        __alloc_traits::deallocate(__alloc(), __map_.front(),
                                   __block_size);
        __map_.pop_front();
        __start_ -= __block_size;
        return true;
    }
    return false;
}
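The effect of the 2-block threshold can be traced with a counter-only model. This is a sketch under stated assumptions: `ToyFront` is a hypothetical struct, element destruction and size bookkeeping are left out, and decrementing `block_count` stands in for deallocating the block and popping it from the map.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counter-only model of the front of a deque.
struct ToyFront {
    std::size_t block_size;   // stands in for __block_size
    std::size_t block_count;  // stands in for __map_.size()
    std::size_t start;        // stands in for __start_

    std::size_t front_spare_blocks() const { return start / block_size; }

    // Same condition as __maybe_remove_front_spare above.
    bool maybe_remove_front_spare(bool keep_one = true) {
        if (front_spare_blocks() >= 2 || (!keep_one && front_spare_blocks() > 0)) {
            --block_count;        // stands in for deallocate + __map_.pop_front()
            start -= block_size;
            return true;
        }
        return false;
    }

    void pop_front() {
        ++start;  // destroying the element and updating the size are omitted
        maybe_remove_front_spare();
    }
};
```

With blocks of 4 slots, the first 7 pops release nothing. The 8th pop leaves two whole blocks unused at the front, so one of them is released and one is kept for a later push_front.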

4 Iterators

The iterator of deque is `__deque_iterator`.

using iterator = __deque_iterator<value_type, pointer, reference, __map_pointer, difference_type>;

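Because the memory of deque is not contiguous, `__deque_iterator` has to do extra work when it crosses a block boundary. It contains `__m_iter_`, a plain pointer into the map of the queue that identifies the map entry (block) the iterator currently points to, and `__ptr_`, which gives the position of the current element inside that block.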

class _LIBCPP_TEMPLATE_VIS __deque_iterator
{
    typedef _MapPointer __map_iterator;
public:
    typedef _Pointer  pointer;
    typedef _DiffType difference_type;
private:
    __map_iterator __m_iter_;
    pointer        __ptr_;

    static const difference_type __block_size;
};

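Let's look briefly at how the iterator is incremented; decrement mirrors it. As the implementation below shows, it simply advances a cursor, and when the pointer runs past the end of the current block, the map iterator moves on to the next block. deque itself works the same way: it first computes the block index, then uses the offset inside the block to find the exact position.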

    //deque_iterator
    _LIBCPP_HIDE_FROM_ABI __deque_iterator& operator++()
    {
        if (++__ptr_ - *__m_iter_ == __block_size)
        {
            ++__m_iter_;
            __ptr_ = *__m_iter_;
        }
        return *this;
    }
    //deque
    _LIBCPP_HIDE_FROM_ABI iterator begin() _NOEXCEPT {
      __map_pointer __mp = __map_.begin() + __start_ / __block_size;
      return iterator(__mp, __map_.empty() ? 0 : *__mp + __start_ % __block_size);
    }
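The division and remainder used in `begin()` generalise to any logical index. A minimal sketch: `locate` is a hypothetical helper, and its parameters are plain numbers standing in for `__start_` and `__block_size`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper: maps logical index i to {block index, offset inside the block}.
std::pair<std::size_t, std::size_t> locate(std::size_t start, std::size_t i,
                                           std::size_t block_size) {
    assert(block_size > 0);
    const std::size_t pos = start + i;          // offset counted from the first block
    return {pos / block_size, pos % block_size};
}
```

For example, with a start offset of 5 and blocks of 4 slots, element 0 sits in block 1 at offset 1, and element 6 sits in block 2 at offset 3.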

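The code above shows the memory layout of a deque: the element with logical index i lives in block `(__start_ + i) / __block_size`, at offset `(__start_ + i) % __block_size` inside that block.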

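STL: Using deque and Source Code Analysis

deque overview

The deque container stores elements of a given type in a linear sequence. Like a vector, it offers fast random access to any element and efficient insertion and removal at the back. Unlike vector, deque also supports efficient insertion and removal at the front, which is why it is called a double-ended queue.
A deque can insert and remove at both ends. Its underlying storage is piecewise contiguous, which gives the user the illusion of contiguous storage.

Using deque

Commonly used functions of the deque class: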
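A short usage sketch of growth at both ends together with random access. It uses only the standard `std::deque` interface; `build_both_ends` is a hypothetical helper name.

```cpp
#include <cassert>
#include <deque>

// Hypothetical helper: builds {1, 2, 3, 4} by growing at both ends.
std::deque<int> build_both_ends() {
    std::deque<int> d{2, 3};
    d.push_front(1);  // efficient at the front, unlike std::vector
    d.push_back(4);   // efficient at the back
    return d;
}
```

The result can still be indexed like an array, for example `build_both_ends()[0]` is 1 and `build_both_ends()[3]` is 4, even though the elements may live in different blocks.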

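(1) Constructors

deque(): creates an empty deque

deque(int nSize): creates a deque with nSize elements

deque(int nSize, const T& t): creates a deque with nSize elements, each equal to t

deque(const deque&): copy constructor

(2) Adding elements

void push_front(const T& x): adds element x at the front of the deque

void push_back(const T& x): adds element x at the back of the deque

iterator insert(iterator it, const T& x): inserts element x before the element at it

void insert(iterator it, int n, const T& x): inserts n copies of x before the element at it

void insert(iterator it, const_iterator first, const_iterator last): inserts the elements in [first, last) of another container with the same element type before the element at it

(3) Removing elements

iterator erase(iterator it): erases one element of the deque

iterator erase(iterator first, iterator last): erases the elements in [first, last)

void pop_front(): removes the first element of the deque

void pop_back(): removes the last element of the deque

void clear(): removes all elements of the deque

(4) Element access and iteration

reference at(int pos): returns a reference to the element at position pos

reference front(): returns a reference to the first element

reference back(): returns a reference to the last element

iterator begin(): returns an iterator to the first element

iterator end(): returns an iterator to the position after the last element (not part of the deque)

reverse_iterator rbegin(): reverse iterator that points to the last element

reverse_iterator rend(): reverse iterator that points to the position before the first element

(5) Emptiness check

bool empty() const: whether the deque is empty; true means it has no elements

(6) Size

size_type size() const: returns the number of elements in the deque

size_type max_size() const: returns the maximum number of elements the deque can hold

(7) Other functions

void swap(deque&): swaps the contents of two deques of the same type

void assign(int n, const T& x): replaces the contents of the deque with n copies of x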

#include <deque>
#include <algorithm>
#include <iostream>
using namespace std;

int main() {
	// initialization
	deque<int> d;
	deque<int> d2(5);	// deque with 5 elements
	deque<int> d3(5, 1);// deque with 5 elements, each equal to 1
	deque<int> d4(d3);	// initialize d4 from d3
	deque<int> d5(d4.begin(), d4.end());

	// algorithms
	reverse(d5.begin(), d5.end());	// reverse
	sort(d5.begin(), d5.end());		// sort

	// iteration
	for (auto i : d) {
		cout << i << endl;
	}
	for (auto i = d.begin(); i != d.end(); i++) {
		cout << *i << endl;
	}

	// member functions
	d.assign(d5.begin(), d5.end());
	d.assign(5, 6);
	d.back();
	d.front();
	d.empty();
	d.erase(d.begin() + 1);	// erase the 2nd element (begin()++ would erase the 1st)
	d.insert(d.begin(), 5);	// insert 5 at the front
	d.insert(d.begin(), 3, 0);	// insert three 0s at the front
	d.max_size();
	d.pop_back();
	d.pop_front();
	d.push_back(5);
	d.push_front(5);
	d.swap(d2);
	d.resize(2);	// resize to 2 elements
}

deque source code analysis

Internally a deque is piecewise contiguous, but to the user it appears contiguous.

template<class T, class Alloc = alloc, size_t BufSiz = 0>
class deque {
public:
    typedef T value_type;
    typedef __deque_iterator<T, T &, T *, BufSiz> iterator;
protected:
    typedef pointer *map_pointer;   // T**
protected:
    iterator start;
    iterator finish;
    map_pointer map;		// control center: each array element points to a buffer
    size_type map_size;
public:
    iterator begin() { return start; }
    iterator end() { return finish; }
    size_type size() const { return finish - start; }
    // ...
};

Control center

`deque::map` has the type `T**` (pointer to pointer) and is called the control center. Each of its elements points to a buffer.

Iterator

template<class T, class Ref, class Ptr, size_t BufSiz>
struct __deque_iterator {
    // the 5 associated types
    typedef random_access_iterator_tag  iterator_category;  // associated type 1
    typedef T                           value_type;         // associated type 2
    typedef ptrdiff_t                   difference_type;    // associated type 3
    typedef Ptr                         pointer;            // associated type 4
    typedef Ref                         reference;          // associated type 5

    typedef size_t size_type;
    typedef T **map_pointer;
    typedef __deque_iterator self;

    // core fields of the iterator: 4 pointers
    T *cur;             // points to the current element
    T *first;           // points to the start of the current buffer
    T *last;            // points to the end of the current buffer
    map_pointer node;   // points into the control center
    // ...
};

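The core fields of `deque::iterator` are 4 pointers: cur points to the current element, first and last point to the start and the end of the current buffer, and node points to the entry of the current buffer in the control center.

`deque::iterator` simulates contiguous storage as follows: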

template<class T, class Ref, class Ptr, size_t BufSiz>
struct __deque_iterator {
    // core fields of the iterator: 4 pointers
    T *cur;             // points to the current element
    T *first;           // points to the start of the current buffer
    T *last;            // points to the end of the current buffer
    map_pointer node;   // points into the control center
    // ...

    difference_type operator-(const self &x) const {
        return
            difference_type(buffer_size()) * (node - x.node - 1) +  // full buffers strictly between the two iterators
               (cur - first) +                                      // elements from the start of this buffer up to cur
               (x.last - x.cur);                                    // elements from x.cur to the end of x's buffer
    }

    self &operator++() {
        ++cur;              // advance to the next element
        if (cur == last) {  // at the end of the buffer: jump to the start of the next buffer
            set_node(node + 1);
            cur = first;
        }
        return *this;
    }

    void set_node(map_pointer new_node) {   // make new_node the buffer of the current element
        node = new_node;
        first = *new_node;
        last = first + difference_type(buffer_size());
    }

    self operator++(int) {
        self tmp = *this;
        ++*this;
        return tmp;
    }

    self &operator+=(difference_type n) {
        difference_type offset = n + (cur - first);
        if (offset >= 0 && offset < difference_type(buffer_size())) {
            // target is inside the same buffer: move directly
            cur += n;
        } else {
            // target is in another buffer: switch buffers first, then index inside the buffer
            difference_type node_offset =
                offset > 0 ? offset / difference_type(buffer_size())
                               : -difference_type((-offset - 1) / buffer_size()) - 1;
            set_node(node + node_offset);
            cur = first + (offset - node_offset * difference_type(buffer_size()));
        }
        return *this;
    }

    self operator+(difference_type n) const {
        self tmp = *this;
        return tmp += n;
    }

    self &operator-=(difference_type n) { return *this += -n; }

    self operator-(difference_type n) const {
        self tmp = *this;
        return tmp -= n;
    }

    reference operator[](difference_type n) const { return *(*this + n); }
};
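The least obvious part of `operator+=` is the buffer arithmetic for negative offsets, which has to round toward negative infinity. The following sketch isolates it with plain integers; `node_offset` and `position_in_buffer` are hypothetical helper names, and the same-buffer case is handled up front so that the helpers can be called with any offset.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper. `offset` is the target position relative to the
// start of the current buffer; the result is how many buffers to move.
std::ptrdiff_t node_offset(std::ptrdiff_t offset, std::ptrdiff_t buffer_size) {
    assert(buffer_size > 0);
    if (offset >= 0 && offset < buffer_size) return 0;  // stays in the same buffer
    return offset > 0 ? offset / buffer_size
                      : -((-offset - 1) / buffer_size) - 1;  // floor division for negatives
}

// Hypothetical helper: position of the target inside its buffer.
std::ptrdiff_t position_in_buffer(std::ptrdiff_t offset, std::ptrdiff_t buffer_size) {
    return offset - node_offset(offset, buffer_size) * buffer_size;
}
```

With buffers of 8 elements, an offset of -9 means two buffers back at position 7, whereas plain integer division would give -1 and a negative position.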

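The insert method

`deque::insert` first determines whether the insertion point lies in the front half or the back half of the container, then shifts the elements of the shorter half.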

iterator insert(iterator position, const value_type &x) {
    if (position.cur == start.cur) {            // inserting at the front: just push_front
        push_front(x);
        return start;
    } else if (position.cur == finish.cur) {    // inserting at the back: just push_back
        push_back(x);
        iterator tmp = finish;
        --tmp;
        return tmp;
    } else {
        return insert_aux(position, x);
    }
}


template<class T, class Alloc, size_t BufSize>
typename deque<T, Alloc, BufSize>::iterator deque<T, Alloc, BufSize>::insert_aux(iterator pos, const value_type &x) {
    difference_type index = pos - start;    // number of elements before the insertion point
    value_type x_copy = x;
    if (index < size() / 2) {               // 1. fewer elements before the insertion point: push the front part forward
        push_front(front());                // 1.1. create an element at the front of the container
        // ...
        copy(front2, pos1, front1);         // 1.2. shift the front part to the left
    } else {                                // 2. fewer elements after the insertion point: push the back part backward
        push_back(back());                  // 2.1. create an element at the back of the container
        copy_backward(pos, back2, back1);   // 2.2. shift the back part to the right
    }
    *pos = x_copy;      // 3. place the element at the insertion position
    return pos;
}
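The same strategy can be tried out on top of `std::deque`. This is a sketch under stated assumptions, not the library's implementation: `insert_shorter_side` is a hypothetical helper that works with an index instead of an iterator, and the iterator positions passed to `std::copy` and `std::copy_backward` are this sketch's own choice for the details elided above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper: inserts x before position `index` by growing the
// shorter side of the deque and shifting only that side.
void insert_shorter_side(std::deque<int>& d, std::size_t index, int x) {
    assert(index <= d.size());
    if (index == 0) { d.push_front(x); return; }
    if (index == d.size()) { d.push_back(x); return; }
    const auto i = static_cast<std::ptrdiff_t>(index);
    if (index < d.size() / 2) {
        const int first = d.front();
        d.push_front(first);  // duplicate the first element
        // shift the old elements [0, index) one slot to the left
        std::copy(d.begin() + 2, d.begin() + i + 1, d.begin() + 1);
    } else {
        const int last = d.back();
        d.push_back(last);    // duplicate the last element
        // shift the old elements [index, size - 1) one slot to the right
        std::copy_backward(d.begin() + i, d.end() - 2, d.end() - 1);
    }
    d[index] = x;             // overwrite the freed slot
}
```

Inserting 9 at index 1 of {1, 2, 3, 4, 5, 6} takes the front path and shifts nothing beyond the duplicated first element, so the cost depends on the distance to the nearer end rather than on the size of the container.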