redis源码解析——ziplist

Posted A_BCDE_

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了redis源码解析——ziplist相关的知识,希望对你有一定的参考价值。

版本:redis - 5.0.4
参考资料:redis设计与实现
文件:src下的ziplist.c ziplist.h

一、基础知识

压缩列表是Redis为了节约内存而开发的,是由一系列特殊编码的连续内存块组成的顺序性数据结构。一个压缩列表可以包含任意多个节点,每个节点可以保存一个字节数组或者一个整数值(短字符串小的整数)。
约等于连续内存的双向链表,使用p指针+长度寻址,数据过多时查询性能差

1、压缩列表的各个组成部分及详细说明

ziplist是由以下几部分组成:
| zlbytes | zltail | zllen | entry1 | entry2 | … | entryN | zlend|

 *  [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
 *        |             |          |       |       |     |
 *     zlbytes        zltail     zllen    "2"     "5"   end
属性类型长度用途
zlbytesuint32_t4字节记录整个压缩列表占用的内存字节数:在对压缩列表进行内存重分配,或者计算zlend的位置时使用
zltailuint32_t4字节记录压缩列表表尾节点距离压缩列表的起始地址有多少个字节:通过这个偏移量,程序无需遍历整个压缩列表就可以确定表尾节点的地址
zllenuint16_t2字节记录了压缩列表包含的节点数量
entryN列表节点不定压缩列表包含的各个节点,节点的长度由节点保存的内存决定
zlenduint8_t1字节特殊值oxFF(十进制255),用于标记压缩列表的末尾

2、列表节点

typedef struct zlentry 
    unsigned int prevrawlensize;//前一节点encoding的长度
    unsigned int prevrawlen;//前一个节点的长度,小于254 为一个字节,大于则为五个字节
    unsigned int lensize;//当前节点encoding的长度(byte)
    unsigned int len;//当前节点长度,使用了多少byte
    unsigned int headersize;//prevrawlensize + lensize
    unsigned char encoding; /* 数据类型ZIP_STR_* 或ZIP_INT_* 编码格式
    							However for 4 bits immediate integers this can assume a range of values and must be range-checked. */
    unsigned char *p;//指向每一个节点的开始,也就是prevrawlen的位置
 zlentry;
  • 存储长度的数值采用小端字节序
  • 每个节点大小不同
  • 由于存储了每个节点的字节数,无需遍历,用指针p加减字节数,就能让p指向自己需要的地址空间,获得需要的值。

3、encoding

ziplist节约内存很重要的方式就是encoding,我们要弄清楚encoding是什么:

encoding:记录了节点所保存数据的类型以及长度。

  • 一字节、两字节或者五字节长,值最高位为00、01、或者10是字节数组编码:这种编码表示节点保存着字节数组,数组长度由编码去掉最高两位的其他位记录。
  • 一字节长,值的最高位为11的是整数编码:表示保存的是整数值,整数值类型和长度由编码去除最高两位的其他位记录。

字节数组编码

编码编码长度保存的值
00bbbbbb1 字节长度小于等于 63 字节的字节数组
01bbbbbb xxxxxxxx2 字节长度小于等于 16383 字节的字节数组
10______ aaaaaaaa bbbbbbbb cccccccc dddddddd5 字节长度小于等于 4294967295 的字节数组

整数编码

编码编码长度保存的值
110000001 字节int16_t 类型的整数
110100001 字节int32_t 类型的整数
111000001 字节int64_t 类型的整数
111100001 字节24 位有符号整数
111111101 字节8 位有符号整数
1111xxxx1 字节无, 因为编码本身的 xxxx 四个位已经保存了一个介于 0 和12 之间的值

二、连锁更新

前提
每个节点都存储了前一个节点的长度;
根据节点长度不同选择不同字节大小的空来存储长度信息(节省空间);

情景:假设有一个ziplist,存储了几个长度在250到253字节(next的prelen只需要1字节)的节点(只有这几个节点)。现在插入一个大于等于254字节(next的prelen需要5字节)大小的新节点在ziplist头部。

结果
node1的prelen只有一个字节大小,不够,需要重新申请空间,变成五个字节。申请之后,node1大小超过254,也需要它的下一个节点也需要五个字节来存储它的长度,即node2。
同理,node2申请空间,扩大,则node3也需要,之后node4,node5都需要。
我们只是插入了一个新节点,但是整个list都改变了,每个节点都要重新申请空间。这种连续的多次空间扩展称之为连锁更新

连锁更新出现的要求
连续的,多个,长度介于250到253字节的节点;
概率很低,所以连锁更新对ziplist的影响不大

/*
  插入节点时, 我们需要将下一个节点的前一节点长度字段设置值为插入节点的长度。
  可能会发生这种情况:下一个节点的prelen字段需要增长, 1字节->5字节。
  这只发生在有节点插入的情况下 (会导致 realloc 和 memmove)。
  当有连续节点大小接近zip _ big _ prvlen时, 这种效果可能会在ziplist中级联,

  注意:这种效果也可能发生在反向, 其中prevlen字段所需的字节可能会缩小。
  链式的节点 先增长,再缩小会导致频繁的空间resize,因此prevlen字段缩小的情况被故意忽略,
  字段长度允许保持大于必要的字节,。

  指针p 指向不需要插入后的第一个节点。
 */
unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) 
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), rawlen, rawlensize;
    size_t offset, noffset, extra;
    unsigned char *np;
    zlentry cur, next;

    //链表不为空
    while (p[0] != ZIP_END) 
        //获取p所指向的节点的全部信息,存在cur指向的空间中
        zipEntry(p, &cur);
        rawlen = cur.headersize + cur.len;
        rawlensize = zipStorePrevEntryLength(NULL,rawlen);

        //没有下一个节点,结束
        if (p[rawlen] == ZIP_END) break;
        //有下一个节点,取得信息,放在next中
        zipEntry(p+rawlen, &next);

        //判断next的prevrawlen是否被更改:没有(不发生更新),结束
        if (next.prevrawlen == rawlen) break;

        //prevrawlen字段需要扩展
        if (next.prevrawlensize < rawlensize) 
            /* The "prevlen" field of "next" needs more bytes to hold
             * the raw length of "cur". */
            offset = p-zl;
            extra = rawlensize-next.prevrawlensize;
            zl = ziplistResize(zl,curlen+extra);
            p = zl+offset;

            /* Current pointer and offset for next element. */
            np = p+rawlen;
            noffset = np-zl;

            /* Update tail offset when next element is not the tail element. */
            if ((zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) != np) 
                ZIPLIST_TAIL_OFFSET(zl) =
                    intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
            

            /* Move the tail to the back. */
            memmove(np+rawlensize,
                np+next.prevrawlensize,
                curlen-noffset-next.prevrawlensize-1);
            zipStorePrevEntryLength(np,rawlen);

            /* Advance the cursor */
            p += rawlen;
            curlen += extra;
         else //需要紧缩
            if (next.prevrawlensize > rawlensize) 
                /* This would result in shrinking, which we want to avoid.
                 * So, set "rawlen" in the available bytes. */
                zipStorePrevEntryLengthLarge(p+rawlen,rawlen);
             else 
                zipStorePrevEntryLength(p+rawlen,rawlen);
            

            /* Stop here, as the raw length of "next" has not changed. */
            break;
        
    
    return zl;


 /*
 指针p指向前一个节点,len为当前长度
 如果前一个节点更改大小, 此函数返回编码前一节点长度所需的字节数的差异。
 如果需要更多的空间, 该函数返回正数;如果需要较少的空间, 则返回负数; 如果需要相同的空间, 则返回0。
 */
int zipPrevLenByteDiff(unsigned char *p, unsigned int len) 
    unsigned int prevlensize;
	
	//编码前一节点长度所需的字节数,存在prevlensize中
    ZIP_DECODE_PREVLENSIZE(p, prevlensize);
	//zipStorePrevEntryLength(NULL, len):获取编码len长度所需的字节数
    return zipStorePrevEntryLength(NULL, len) - prevlensize;

三、ziplist.h

//生成
unsigned char *ziplistNew(void);
//把 second 追加到 first, 合并这两个
unsigned char *ziplistMerge(unsigned char **first, unsigned char **second);
//从头或者从尾插入s
unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where);
//获取指定下标的
unsigned char *ziplistIndex(unsigned char *zl, int index);
//下一个节点
unsigned char *ziplistNext(unsigned char *zl, unsigned char *p);
//前一个节点
unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p);
//获取
unsigned int ziplistGet(unsigned char *p, unsigned char **sval, unsigned int *slen, long long *lval);
//插入
unsigned char *ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen);
//删除
unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p);
//从index开始删除num个
unsigned char *ziplistDeleteRange(unsigned char *zl, int index, unsigned int num);
//比较
unsigned int ziplistCompare(unsigned char *p, unsigned char *s, unsigned int slen);
//查找
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip);
//长度
unsigned int ziplistLen(unsigned char *zl);
//原始长度
size_t ziplistBlobLen(unsigned char *zl);
//打印第一个
void ziplistRepr(unsigned char *zl);

quickList

  • ziplist需要连续内存,申请内存效率低
    限制ziplist长度和entry大小
  • ziplist如何存储大数据
    数据分片,多个ziplist
  • 管理多个ziplist
    使用quickList
/* quicklistNode is a 32 byte struct describing a listpack for a quicklist.
 * We use bit fields keep the quicklistNode at 32 bytes.
 * count: 16 bits, max 65536 (max lp bytes is 65k, so max count actually < 32k).
 * encoding: 2 bits, RAW=1, LZF=2.
 * container: 2 bits, PLAIN=1 (a single item as char array), PACKED=2 (listpack with multiple items).
 * recompress: 1 bit, bool, true if node is temporary decompressed for usage.
 * attempted_compress: 1 bit, boolean, used for verifying during testing.
 * extra: 10 bits, free for future use; pads out the remainder of 32 bits */
typedef struct quicklistNode 
    struct quicklistNode *prev;
    struct quicklistNode *next;
    unsigned char *entry;
    size_t sz; //当前ziplist字节数
    unsigned int count : 16;//当前ziplist节点数
    unsigned int encoding : 2;   /* RAW==1 or LZF==2 */
    unsigned int container : 2;  /* 数据容器类型 PLAIN==1 or PACKED==2 */
    unsigned int recompress : 1; //是否被解压
    unsigned int attempted_compress : 1; /* node can't compress; too small */
    unsigned int dont_compress : 1; /* prevent compression of entry that will be used later */
    unsigned int extra : 9; //预留字段
 quicklistNode;


/* quicklist is a 40 byte struct (on 64-bit systems) describing a quicklist.
 * 'count' is the number of total entries.
 * 'len' is the number of quicklist nodes.
 * 'compress' is: 0 if compression disabled, otherwise it's the number
 *                of quicklistNodes to leave uncompressed at ends of quicklist.
 * 'fill' is the user-requested (or default) fill factor.
 * 'bookmarks are an optional feature that is used by realloc this struct,
 *      so that they don't consume memory when not used. */
typedef struct quicklist 
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;//所有ziplist的节点数量
    unsigned long len;//ziplist的数量
    signed int fill : QL_FILL_BITS;//ziplist的entry上限
    unsigned int compress : QL_COMP_BITS; //首尾不压缩的节点数量
    unsigned int bookmark_count: QL_BM_BITS;
    quicklistBookmark bookmarks[];
 quicklist;
  • 中间节点压缩 节省内存

redis源码学习redis 专属“链表”:ziplist

问题抛出

用过 Python 的列表吗?就是那种可以存储任意类型数据的,支持随机读取的数据结构。
没有用过的话那就没办法了。

本质上这种列表可以使用数组、链表作为其底层结构,不知道Python中的列表是以什么作为底层结构的。
但是redis的列表既不是用链表,也不是用数组作为其底层实现的,原因也显而易见:数组不方便,弄个二维的?柔性的?怎么写?链表可以实现,通用链表嘛,数据域放 void* 就可以实现列表功能。但是,链表的缺点也很明显,容易造成内存碎片。

在这个大环境下,秉承着“能省就省”的指导思想,请你设计一款数据结构。


结构设计

这个图里要注意,右侧是没有记录“当前元素的大小”的

这个图挺详细哈,都省得我对每一个字段释义了,整挺好。

其他话,文件开头的注释也讲的很清楚了。(ziplist.c)

/* The ziplist is a specially encoded dually linked list that is designed
 * to be very memory efficient. It stores both strings and integer values,
 * where integers are encoded as actual integers instead of a series of
 * characters. It allows push and pop operations on either side of the list
 * in O(1) time. However, because every operation requires a reallocation of
 * the memory used by the ziplist, the actual complexity is related to the
 * amount of memory used by the ziplist.
 *
 * ----------------------------------------------------------------------------
 *
 * ZIPLIST OVERALL LAYOUT
 * ======================
 *
 * The general layout of the ziplist is as follows:
 *
 * <zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
 *
 * NOTE: all fields are stored in little endian, if not specified otherwise.
 *
 * <uint32_t zlbytes> is an unsigned integer to hold the number of bytes that
 * the ziplist occupies, including the four bytes of the zlbytes field itself.
 * This value needs to be stored to be able to resize the entire structure
 * without the need to traverse it first.
 *
 * <uint32_t zltail> is the offset to the last entry in the list. This allows
 * a pop operation on the far side of the list without the need for full
 * traversal.
 *
 * <uint16_t zllen> is the number of entries. When there are more than
 * 2^16-2 entries, this value is set to 2^16-1 and we need to traverse the
 * entire list to know how many items it holds.
 *
 * <uint8_t zlend> is a special entry representing the end of the ziplist.
 * Is encoded as a single byte equal to 255. No other normal entry starts
 * with a byte set to the value of 255.
 *
 * ZIPLIST ENTRIES
 * ===============
 *
 * Every entry in the ziplist is prefixed by metadata that contains two pieces
 * of information. First, the length of the previous entry is stored to be
 * able to traverse the list from back to front. Second, the entry encoding is
 * provided. It represents the entry type, integer or string, and in the case
 * of strings it also represents the length of the string payload.
 * So a complete entry is stored like this:
 *
 * <prevlen> <encoding> <entry-data>
 *
 * Sometimes the encoding represents the entry itself, like for small integers
 * as we'll see later. In such a case the <entry-data> part is missing, and we
 * could have just:
 *
 * <prevlen> <encoding>
 *
 * The length of the previous entry, <prevlen>, is encoded in the following way:
 * If this length is smaller than 254 bytes, it will only consume a single
 * byte representing the length as an unsinged 8 bit integer. When the length
 * is greater than or equal to 254, it will consume 5 bytes. The first byte is
 * set to 254 (FE) to indicate a larger value is following. The remaining 4
 * bytes take the length of the previous entry as value.
 *
 * So practically an entry is encoded in the following way:
 *
 * <prevlen from 0 to 253> <encoding> <entry>
 *
 * Or alternatively if the previous entry length is greater than 253 bytes
 * the following encoding is used:
 *
 * 0xFE <4 bytes unsigned little endian prevlen> <encoding> <entry>
 *
 * The encoding field of the entry depends on the content of the
 * entry. When the entry is a string, the first 2 bits of the encoding first
 * byte will hold the type of encoding used to store the length of the string,
 * followed by the actual length of the string. When the entry is an integer
 * the first 2 bits are both set to 1. The following 2 bits are used to specify
 * what kind of integer will be stored after this header. An overview of the
 * different types and encodings is as follows. The first byte is always enough
 * to determine the kind of entry.
 *
 * |00pppppp| - 1 byte
 *      String value with length less than or equal to 63 bytes (6 bits).
 *      "pppppp" represents the unsigned 6 bit length.
 * |01pppppp|qqqqqqqq| - 2 bytes
 *      String value with length less than or equal to 16383 bytes (14 bits).
 *      IMPORTANT: The 14 bit number is stored in big endian.
 * |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
 *      String value with length greater than or equal to 16384 bytes.
 *      Only the 4 bytes following the first byte represents the length
 *      up to 2^32-1. The 6 lower bits of the first byte are not used and
 *      are set to zero.
 *      IMPORTANT: The 32 bit number is stored in big endian.
 * |11000000| - 3 bytes
 *      Integer encoded as int16_t (2 bytes).
 * |11010000| - 5 bytes
 *      Integer encoded as int32_t (4 bytes).
 * |11100000| - 9 bytes
 *      Integer encoded as int64_t (8 bytes).
 * |11110000| - 4 bytes
 *      Integer encoded as 24 bit signed (3 bytes).
 * |11111110| - 2 bytes
 *      Integer encoded as 8 bit signed (1 byte).
 * |1111xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer.
 *      Unsigned integer from 0 to 12. The encoded value is actually from
 *      1 to 13 because 0000 and 1111 can not be used, so 1 should be
 *      subtracted from the encoded 4 bit value to obtain the right value.
 * |11111111| - End of ziplist special entry.
 *
 * Like for the ziplist header, all the integers are represented in little
 * endian byte order, even when this code is compiled in big endian systems.
 *
 * EXAMPLES OF ACTUAL ZIPLISTS
 * ===========================
 *
 * The following is a ziplist containing the two elements representing
 * the strings "2" and "5". It is composed of 15 bytes, that we visually
 * split into sections:
 *
 *  [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
 *        |             |          |       |       |     |
 *     zlbytes        zltail    entries   "2"     "5"   end
 *
 * The first 4 bytes represent the number 15, that is the number of bytes
 * the whole ziplist is composed of. The second 4 bytes are the offset
 * at which the last ziplist entry is found, that is 12, in fact the
 * last entry, that is "5", is at offset 12 inside the ziplist.
 * The next 16 bit integer represents the number of elements inside the
 * ziplist, its value is 2 since there are just two elements inside.
 * Finally "00 f3" is the first entry representing the number 2. It is
 * composed of the previous entry length, which is zero because this is
 * our first entry, and the byte F3 which corresponds to the encoding
 * |1111xxxx| with xxxx between 0001 and 1101. We need to remove the "F"
 * higher order bits 1111, and subtract 1 from the "3", so the entry value
 * is "2". The next entry has a prevlen of 02, since the first entry is
 * composed of exactly two bytes. The entry itself, F6, is encoded exactly
 * like the first entry, and 6-1 = 5, so the value of the entry is 5.
 * Finally the special entry FF signals the end of the ziplist.
 *
 * Adding another element to the above string with the value "Hello World"
 * allows us to show how the ziplist encodes small strings. We'll just show
 * the hex dump of the entry itself. Imagine the bytes as following the
 * entry that stores "5" in the ziplist above:
 *
 * [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]
 *
 * The first byte, 02, is the length of the previous entry. The next
 * byte represents the encoding in the pattern |00pppppp| that means
 * that the entry is a string of length <pppppp>, so 0B means that
 * an 11 bytes string follows. From the third byte (48) to the last (64)
 * there are just the ASCII characters for "Hello World".
 *
 * ----------------------------------------------------------------------------
 *
 * Copyright (c) 2009-2012, Pieter Noordhuis <pcnoordhuis at gmail dot com>
 * Copyright (c) 2009-2017, Salvatore Sanfilippo <antirez at gmail dot com>
 * All rights reserved.
 */

看完了么?接下来就是基操阶段了,对于任何一种数据结构,基操无非增删查改。

实际节点

typedef struct zlentry 
    unsigned int prevrawlensize; /* Bytes used to encode the previous entry len*/
    unsigned int prevrawlen;     /* Previous entry len. */
    unsigned int lensize;        /* Bytes used to encode this entry type/len.
                                    For example strings have a 1, 2 or 5 bytes
                                    header. Integers always use a single byte.*/
    unsigned int len;            /* Bytes used to represent the actual entry.
                                    For strings this is just the string length
                                    while for integers it is 1, 2, 3, 4, 8 or
                                    0 (for 4 bit immediate) depending on the
                                    number range. */
    unsigned int headersize;     /* prevrawlensize + lensize. */
    unsigned char encoding;      /* Set to ZIP_STR_* or ZIP_INT_* depending on
                                    the entry encoding. However for 4 bits
                                    immediate integers this can assume a range
                                    of values and must be range-checked. */
    unsigned char *p;            /* Pointer to the very start of the entry, that
                                    is, this points to prev-entry-len field. */
 zlentry;

基本操作

我觉得这张图还是要再摆一下:

这个图里要注意,右侧是没有记录“当前元素的大小”的

真实插入的是这个函数:

讲真,头皮有点发麻。那么我们等下还是用老套路,按步骤拆开来看。

/* Insert item at "p". */
unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) 
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
    unsigned int prevlensize, prevlen = 0;
    size_t offset;
    int nextdiff = 0;
    unsigned char encoding = 0;
    long long value = 123456789; /* initialized to avoid warning. Using a value
                                    that is easy to see if for some reason
                                    we use it uninitialized. */
    zlentry tail;

    /* Find out prevlen for the entry that is inserted. */
    if (p[0] != ZIP_END) 
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
     else 
        unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
        if (ptail[0] != ZIP_END) 
            prevlen = zipRawEntryLength(ptail);
        
    

    /* See if the entry can be encoded */
    if (zipTryEncoding(s,slen,&value,&encoding)) 
        /* 'encoding' is set to the appropriate integer encoding */
        reqlen = zipIntSize(encoding);
     else 
        /* 'encoding' is untouched, however zipStoreEntryEncoding will use the
         * string length to figure out how to encode it. */
        reqlen = slen;
    
    /* We need space for both the length of the previous entry and
     * the length of the payload. */
    reqlen += zipStorePrevEntryLength(NULL,prevlen);
    reqlen += zipStoreEntryEncoding(NULL,encoding,slen);

    /* When the insert position is not equal to the tail, we need to
     * make sure that the next entry can hold this entry's length in
     * its prevlen field. */
    int forcelarge = 0;
    nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
    if (nextdiff == -4 && reqlen < 4) 
        nextdiff = 0;
        forcelarge = 1;
    

    /* Store offset because a realloc may change the address of zl. */
    offset = p-zl;
    zl = ziplistResize(zl,curlen+reqlen+nextdiff);
    p = zl+offset;

    /* Apply memory move when necessary and update tail offset. */
    if (p[0] != ZIP_END) 
        /* Subtract one because of the ZIP_END bytes */
        memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

        /* Encode this entry's raw length in the next entry. */
        if (forcelarge)
            zipStorePrevEntryLengthLarge(p+reqlen,reqlen);
        else
            zipStorePrevEntryLength(p+reqlen,reqlen);

        /* Update offset for tail */
        ZIPLIST_TAIL_OFFSET(zl) =
            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);

        /* When the tail contains more than one entry, we need to take
         * "nextdiff" in account as well. Otherwise, a change in the
         * size of prevlen doesn't have an effect on the *tail* offset. */
        zipEntry(p+reqlen, &tail);
        if (p[reqlen+tail.headersize+tail.len] != ZIP_END) 
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
        
     else 
        /* This element will be the new tail. */
        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
    

    /* When nextdiff != 0, the raw length of the next entry has changed, so
     * we need to cascade the update throughout the ziplist */
    if (nextdiff != 0) 
        offset = p-zl;
        zl = __ziplistCascadeUpdate(zl,p+reqlen);
        p = zl+offset;
    

    /* Write the entry */
    p += zipStorePrevEntryLength(p,prevlen);
    p += zipStoreEntryEncoding(p,encoding,slen);
    if (ZIP_IS_STR(encoding)) 
        memcpy(p,s,slen);
     else 
        zipSaveInteger(p,value,encoding);
    
    ZIPLIST_INCR_LENGTH(zl,1);
    return zl;

对“链表”插入数据有几个步骤?
1、偏移
2、插进去
3、缝合

那这个“列表”,比较特殊一点,特殊在哪里?特殊在它比较紧凑,而且数据类型,其实也就两种,要么integer,要么string。所以它的步骤是?
1、数据重新编码
2、解析数据并分配空间
3、接入数据


重新编码

什么是重新编码?插入一个元素,是不是需要对:“前一个元素的大小、本身大小、当前元素编码” 这些数据进行一个统计,然后一并插入。就编这个。

插入位置无非三个,头中尾。
头:前一个元素大小为0,因为前面没有元素。
中:待插入位置后一个元素记录的“前一个元素大小”,当然,之后本身大小就成为了后一个元素眼中的“前一个元素大小”。
尾:那就要把三个字段加起来了。

具体怎么重新编码就不看了吧,这篇本来就已经很长了。


解析数据

再往下就是解析数据了。
首先尝试将数据解析为整数,如果可以解析,就按照压缩列表整数类型编码存储;如果解析失败,就按照压缩列表字节数组类型编码存储。

解析之后,数值存储在 value 中,编码格式存储在 encoding中。如果解析成功,还要计算整数所占字节数。变量 reqlen 存储当前元素所需空间大小,再累加其他两个字段的空间大小,就是本节点所需空间大小了。


重新分配空间

看注释这架势,咋滴,还存在没地方给它塞?

来我们看看。

这里的分配空间不是简单的就新插进来的数据多少空间就分配多少,如果没有仔细阅读上面那段英文的话,嗯,可以选择绕回去仔细阅读一下那个节点组成。特别是那个:

/*
* The length of the previous entry, <prevlen>, is encoded in the following way:
* If this length is smaller than 254 bytes, it will only consume a single
* byte representing the length as an unsinged 8 bit integer. When the length
* is greater than or equal to 254, it will consume 5 bytes. The first byte is
* set to 254 (FE) to indicate a larger value is following. The remaining 4
* bytes take the length of the previous entry as value.
*/

所以这个 previous 就是个不确定因素。有可能人家本来是 1 1 排列的,中间插进来一个之后变成 1 1 5 排列了;也有可能人家是1 5 排列的、5 1 排列的,总之就是不确定。

所以,在 entryX 的位置插入一个数据之后,entryX+1 的 previous 可能不变,可能加四,也可能减四,谁也说不准。说不准那不就得测一下嘛。所以就测一下,仅此而已。


接入数据

数据怎么接入?鉴于这里真心不是链表,是列表。
所以,按数组那一套来。对。

很麻烦吧。其实不麻烦,你在redis里见过它给你中间插入的机会了吗?更不要说头插了,你见过它给你头插的机会了吗?

插个题外话:大数据插入时,数组不一定输给链表。在尾插的时候,数组的优势是远超链表的(当然,仅限于尾插)。在我两个月前的博客里有做过这一系列的实验。


删就不写了吧,增的逆操作,从系列开始就没写过删。不过这里删就不可避免的大量数据进行复制了(如果不真删,只是做个删除标志呢?这样会省时间,但是时候会造成内存碎片化。不过可以设计一个定期调整内存的函数,比方说重用三分之一的块之后紧凑一下?内存不够用的时候紧凑一下?STL就是这么干的)。


查也没啥好讲的了吧,这个数据结构的应用场景一般就是对键进行检索,这里就是个值,不一样的是这个值是一串的。
所以除了提供原有的前后向遍历之外,还提供了 range 查询,不难的。

以上是关于redis源码解析——ziplist的主要内容,如果未能解决你的问题,请参考以下文章

Redis 内存压缩实战

Redis 内存压缩实战

redis hashtag一文搞懂,源码解析

Redis源码解析-全览

Redis源码解析:13Redis中的事件驱动机制

曹工说Redis源码-- redis server 主循环大体流程解析