Redis Storage Structures

Posted 2023-02-22 haozi_ncepu

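Based on Redis 2.8.7.
Redis is one large dictionary of key-value pairs. The values it supports are rich: strings, hashes, lists, sets, and sorted sets. On top of these value types Redis offers powerful commands such as HMSET, LPUSH, and SADD.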

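Dictionary

The dictionary is the most basic data structure in Redis. One dictionary is one DB, and Redis supports multiple DBs.

Redis dictionaries are implemented as hash tables. Collisions are resolved by separate chaining: entries whose hashes fall into the same slot are linked together in a list.
The weakness of separate chaining is that performance degrades badly under heavy collisions. For example, with n entries and m slots, if m = 1 the whole hash table degenerates into a linked list and a lookup costs O(n).
To guard against hash-collision attacks, Redis randomizes the hash function seed.
To keep chains short, Redis uses a "double buffer" scheme. Normal operation uses one buffer. When collisions become severe (judged by comparing the current number of slots with the number of keys), Redis allocates a larger buffer and then gradually migrates the data from the old buffer to the new one.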

The Redis dictionary structures are as follows:

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];   // the two buffers
    int rehashidx;  // -1 when no rehash is in progress
    int iterators;
} dict;

typedef struct dictht {
    dictEntry **table;  // array of hash chains
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

// data node <K,V>
typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    struct dictEntry *next;
} dictEntry;
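
The two-table migration described above can be sketched in miniature. This is an illustrative sketch, not the Redis source: the names `minidict`, `table`, `entry`, `rehash_step`, `minidict_find`, and `minidict_set` are hypothetical, keys are plain integers used directly as their own hash (no randomized seed), and the trigger "keys >= slots" is an assumed simplification of the slot-versus-key comparison mentioned in the text.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry {
    unsigned long key;
    long val;
    struct entry *next;      /* chain of entries that share a slot */
} entry;

typedef struct table {
    entry **slots;
    unsigned long size;      /* slot count, always a power of two */
    unsigned long used;      /* entries stored in this table */
} table;

typedef struct minidict {
    table ht[2];             /* ht[0] is live, ht[1] is the migration target */
    long rehashidx;          /* next ht[0] slot to migrate, -1 when idle */
} minidict;

static void table_init(table *t, unsigned long size) {
    t->slots = size ? calloc(size, sizeof(entry *)) : NULL;
    assert(size == 0 || t->slots != NULL);
    t->size = size;
    t->used = 0;
}

minidict *minidict_create(void) {
    minidict *d = malloc(sizeof(*d));
    assert(d != NULL);
    table_init(&d->ht[0], 4);
    table_init(&d->ht[1], 0);
    d->rehashidx = -1;
    return d;
}

/* Migrate one non-empty slot from ht[0] to ht[1]; swap tables when done. */
static void rehash_step(minidict *d) {
    if (d->rehashidx < 0) return;
    table *src = &d->ht[0], *dst = &d->ht[1];
    while ((unsigned long)d->rehashidx < src->size &&
           src->slots[d->rehashidx] == NULL)
        d->rehashidx++;
    if ((unsigned long)d->rehashidx < src->size) {
        entry *e = src->slots[d->rehashidx];
        while (e != NULL) {
            entry *next = e->next;
            unsigned long i = e->key & (dst->size - 1);
            e->next = dst->slots[i];
            dst->slots[i] = e;
            src->used--;
            dst->used++;
            e = next;
        }
        src->slots[d->rehashidx++] = NULL;
    }
    if (src->used == 0) {    /* old table drained: promote the new one */
        free(src->slots);
        d->ht[0] = d->ht[1];
        table_init(&d->ht[1], 0);
        d->rehashidx = -1;
    }
}

/* Lookups do one migration step, then search both tables. */
long *minidict_find(minidict *d, unsigned long key) {
    rehash_step(d);
    for (int t = 0; t < 2; t++) {
        table *ht = &d->ht[t];
        if (ht->size == 0) continue;
        for (entry *e = ht->slots[key & (ht->size - 1)]; e; e = e->next)
            if (e->key == key) return &e->val;
    }
    return NULL;
}

void minidict_set(minidict *d, unsigned long key, long val) {
    long *existing = minidict_find(d, key);
    if (existing != NULL) { *existing = val; return; }
    /* Assumed trigger: start migrating once keys are as many as slots. */
    if (d->rehashidx < 0 && d->ht[0].used >= d->ht[0].size) {
        table_init(&d->ht[1], d->ht[0].size * 2);
        d->rehashidx = 0;
    }
    /* While a migration is running, new keys go to the new table. */
    table *ht = d->rehashidx >= 0 ? &d->ht[1] : &d->ht[0];
    entry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = key;
    e->val = val;
    unsigned long i = key & (ht->size - 1);
    e->next = ht->slots[i];
    ht->slots[i] = e;
    ht->used++;
}
```

Because every lookup and insert moves at most one slot, no single operation pays for copying the whole table, which is the point of migrating gradually.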

redisObject is the structure that actually stores each of the Redis value types. It is defined as follows:

typedef struct redisObject {
    unsigned type:4;      // logical type
    unsigned notused:2;   /* Not used */
    unsigned encoding:4;  // physical storage type
    unsigned lru:22;      /* lru time (relative to server.lruclock) */
    int refcount;
    void *ptr;            // the actual data
} robj;

The type field holds the logical type, which is one of:

#define REDIS_STRING 0
#define REDIS_LIST 1
#define REDIS_SET 2
#define REDIS_ZSET 3
#define REDIS_HASH 4

The encoding field holds the physical storage format. One logical type can use several storage formats, including:

#define REDIS_ENCODING_RAW 0 /* Raw representation */
#define REDIS_ENCODING_INT 1 /* Encoded as integer */
#define REDIS_ENCODING_HT 2 /* Encoded as hash table */
#define REDIS_ENCODING_ZIPMAP 3 /* Encoded as zipmap */
#define REDIS_ENCODING_LINKEDLIST 4 /* Encoded as regular linked list */
#define REDIS_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define REDIS_ENCODING_INTSET 6 /* Encoded as intset */
#define REDIS_ENCODING_SKIPLIST 7 /* Encoded as skiplist */
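
To make the split between logical type and physical encoding concrete, here is a small sketch that checks whether a type and encoding pair is one of the combinations this article describes. The function `encoding_allowed` is a hypothetical helper, not part of Redis. The sorted-set pairing (ziplist or skiplist) is not covered in the sections below and is included here as background on Redis 2.8.

```c
#include <assert.h>
#include <stdbool.h>

#define REDIS_STRING 0
#define REDIS_LIST 1
#define REDIS_SET 2
#define REDIS_ZSET 3
#define REDIS_HASH 4

#define REDIS_ENCODING_RAW 0
#define REDIS_ENCODING_INT 1
#define REDIS_ENCODING_HT 2
#define REDIS_ENCODING_LINKEDLIST 4
#define REDIS_ENCODING_ZIPLIST 5
#define REDIS_ENCODING_INTSET 6
#define REDIS_ENCODING_SKIPLIST 7

/* Hypothetical helper: true if the logical type may be stored
 * with the given physical encoding. */
bool encoding_allowed(int type, int encoding) {
    switch (type) {
    case REDIS_STRING:
        return encoding == REDIS_ENCODING_RAW ||
               encoding == REDIS_ENCODING_INT;
    case REDIS_LIST:
        return encoding == REDIS_ENCODING_ZIPLIST ||
               encoding == REDIS_ENCODING_LINKEDLIST;
    case REDIS_SET:
        return encoding == REDIS_ENCODING_INTSET ||
               encoding == REDIS_ENCODING_HT;
    case REDIS_ZSET:
        return encoding == REDIS_ENCODING_ZIPLIST ||
               encoding == REDIS_ENCODING_SKIPLIST;
    case REDIS_HASH:
        return encoding == REDIS_ENCODING_ZIPLIST ||
               encoding == REDIS_ENCODING_HT;
    default:
        return false;
    }
}
```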

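Strings

All Redis keys are stored as strings, and Redis also supports string values.
The string type is the REDIS_STRING seen earlier. Its physical implementation (encoding) is either REDIS_ENCODING_INT or REDIS_ENCODING_RAW.
REDIS_ENCODING_INT stores the value as a long: Redis tries to convert the string to a long and, if that succeeds, stores it as REDIS_ENCODING_INT.
Otherwise Redis stores the REDIS_STRING as a character string, that is, REDIS_ENCODING_RAW.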
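
The "try to convert to a long" check can be sketched as follows. This is a minimal sketch under assumptions, not the Redis implementation: `string_fits_long` is a hypothetical name, and it uses the standard `strtol` plus a round-trip comparison so that only strings that can be reproduced exactly from the number are accepted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true when s is the canonical decimal form of a
 * long, in which case the value is stored in *out. */
bool string_fits_long(const char *s, long *out) {
    char *end;
    char roundtrip[32];

    if (*s == '\0') return false;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno == ERANGE || *end != '\0') return false;

    /* Reject forms such as " 7", "+7" or "007": printing the number
     * must give back the input, or the original string would be lost. */
    snprintf(roundtrip, sizeof(roundtrip), "%ld", v);
    if (strcmp(roundtrip, s) != 0) return false;

    *out = v;
    return true;
}
```

A value such as "12345" would qualify for the integer encoding under this check, while "007" or "12abc" would stay a raw string.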
Redis wraps strings in sds, mainly to make length queries and appends efficient. It is defined as follows:

typedef char *sds;

struct sdshdr {
    int len;     // bytes of buf in use
    int free;    // bytes of buf still available
    char buf[];  // flexible array member that holds the string data
};

static inline size_t sdslen(const sds s) {
    struct sdshdr *sh = (void*)(s-(sizeof(struct sdshdr)));
    return sh->len;
}

static inline size_t sdsavail(const sds s) {
    struct sdshdr *sh = (void*)(s-(sizeof(struct sdshdr)));
    return sh->free;
}

If you have time, sds.h and sds.c are worth reading in detail; they are quite interesting.
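
To show why the len and free fields help with length queries and appends, here is a minimal sketch of creating and appending to such a string. The `mini_` functions are hypothetical stand-ins, not the real sds API, and the "grow to twice the needed length" policy is an assumed simplification.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef char *sds;

struct sdshdr {
    int len;     /* bytes of buf in use */
    int free;    /* bytes of buf still available */
    char buf[];  /* string data, always '\0'-terminated here */
};

static struct sdshdr *mini_hdr(const sds s) {
    return (struct sdshdr *)(s - sizeof(struct sdshdr));
}

size_t mini_sdslen(const sds s)   { return (size_t)mini_hdr(s)->len; }
size_t mini_sdsavail(const sds s) { return (size_t)mini_hdr(s)->free; }

sds mini_sdsnew(const char *init) {
    size_t len = strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + len + 1);
    assert(sh != NULL);
    sh->len = (int)len;
    sh->free = 0;
    memcpy(sh->buf, init, len + 1);   /* copies the trailing '\0' too */
    return sh->buf;
}

sds mini_sdscat(sds s, const char *t) {
    struct sdshdr *sh = mini_hdr(s);
    size_t add = strlen(t);
    if ((size_t)sh->free < add) {
        /* Not enough slack: allocate twice the needed length so that
         * later appends can often skip the reallocation. */
        size_t newlen = (size_t)sh->len + add;
        sh = realloc(sh, sizeof(*sh) + newlen * 2 + 1);
        assert(sh != NULL);
        sh->free = (int)(newlen * 2) - sh->len;
    }
    memcpy(sh->buf + sh->len, t, add + 1);
    sh->len += (int)add;
    sh->free -= (int)add;
    return sh->buf;   /* may differ from s after realloc */
}

void mini_sdsfree(sds s) {
    if (s != NULL) free(mini_hdr(s));
}
```

The length is read from the header in O(1) instead of scanning for the terminator, and an append that fits in the free space needs no allocation at all.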

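Hashes

Redis supports hash tables as values. The logical type is REDIS_HASH, which has two encodings: REDIS_ENCODING_ZIPLIST and REDIS_ENCODING_HT.
REDIS_ENCODING_HT is the dictionary implementation described above.
REDIS_ENCODING_ZIPLIST is the ziplist, a list that can be traversed from either end. Its special storage format compresses memory usage, trading time for space. A ziplist suits small, read-mostly data and is a poor fit for large data sets with many writes or deletes.
A hash starts out as REDIS_ENCODING_ZIPLIST. When an insert command arrives from a client:
1. hashTypeTryConversion checks whether the length of the key or value exceeds the configured hash_max_ziplist_value (default 64).
2. hashTypeSet checks whether the number of entries exceeds the configured hash_max_ziplist_entries (default 512).

If either condition holds, the hash is converted from REDIS_ENCODING_ZIPLIST to REDIS_ENCODING_HT.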
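
The two checks above boil down to a simple predicate. The sketch below is only an illustration of that decision: `hash_needs_hashtable` is a hypothetical function, and the two constants hard-code the default values quoted in the text rather than reading any configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Defaults quoted in the text; real servers read them from configuration. */
#define HASH_MAX_ZIPLIST_VALUE 64
#define HASH_MAX_ZIPLIST_ENTRIES 512

/* Hypothetical helper: should a ziplist-encoded hash switch to a hash
 * table, given the field and value being inserted and the entry count
 * after the insert? */
bool hash_needs_hashtable(size_t field_len, size_t value_len,
                          size_t entries_after_insert) {
    if (field_len > HASH_MAX_ZIPLIST_VALUE) return true;
    if (value_len > HASH_MAX_ZIPLIST_VALUE) return true;
    return entries_after_insert > HASH_MAX_ZIPLIST_ENTRIES;
}
```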

Lists

Redis supports lists as values. The logical type is REDIS_LIST, which has two encodings: REDIS_ENCODING_ZIPLIST and REDIS_ENCODING_LINKEDLIST.
REDIS_ENCODING_ZIPLIST is as described above.
REDIS_ENCODING_LINKEDLIST is a conventional doubly linked list:

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;
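
As a sketch of how such nodes support cheap pushes at both ends, the code below adds a small container around listNode. The `minilist` struct and its two push functions are hypothetical names for illustration; the actual Redis list container carries more fields than shown here.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;

/* Hypothetical container that tracks both ends of the list. */
typedef struct minilist {
    listNode *head;
    listNode *tail;
    unsigned long len;
} minilist;

static listNode *node_new(void *value) {
    listNode *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->prev = n->next = NULL;
    n->value = value;
    return n;
}

/* O(1) insert at the front. */
void minilist_push_head(minilist *l, void *value) {
    listNode *n = node_new(value);
    n->next = l->head;
    if (l->head != NULL) l->head->prev = n;
    else l->tail = n;             /* first node is also the tail */
    l->head = n;
    l->len++;
}

/* O(1) insert at the back. */
void minilist_push_tail(minilist *l, void *value) {
    listNode *n = node_new(value);
    n->prev = l->tail;
    if (l->tail != NULL) l->tail->next = n;
    else l->head = n;             /* first node is also the head */
    l->tail = n;
    l->len++;
}
```

Keeping both a head and a tail pointer is what lets pushes at either end avoid walking the list.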

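A list starts out as REDIS_ENCODING_ZIPLIST and is converted to REDIS_ENCODING_LINKEDLIST when either of the following holds:
1. The size of an element exceeds list-max-ziplist-value (default 64).
2. The number of elements exceeds the configured list-max-ziplist-entries (default 512).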

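Sets

Redis supports sets as values. The logical type is REDIS_SET, which has two encodings: REDIS_ENCODING_INTSET and REDIS_ENCODING_HT (described above).
The element type and the element count decide the encoding. A set starts out as REDIS_ENCODING_INTSET and is converted to REDIS_ENCODING_HT when either of the following holds:
1. An element is not an integer.
2. The number of elements exceeds the configured set-max-intset-entries (default 512).
REDIS_ENCODING_INTSET is a sorted array with the following structure: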

typedef struct intset {
    uint32_t encoding;  // one of three: INTSET_ENC_INT16, INTSET_ENC_INT32, INTSET_ENC_INT64
    uint32_t length;    // number of elements
    int8_t contents[];  // where the elements are stored, in sorted order
} intset;
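
Two ideas behind the intset can be sketched briefly: choosing the narrowest of the three encodings for a value, and finding a value in the sorted contents by binary search. The helpers `required_encoding` and `sorted_search` are hypothetical names, and for simplicity the search works on a plain `int64_t` array instead of the packed `contents` bytes. The encoding constants are defined here as element widths in bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Element width in bytes for each encoding. */
#define INTSET_ENC_INT16 ((int)sizeof(int16_t))
#define INTSET_ENC_INT32 ((int)sizeof(int32_t))
#define INTSET_ENC_INT64 ((int)sizeof(int64_t))

/* Narrowest of the three encodings that can hold v. */
int required_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return INTSET_ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX) return INTSET_ENC_INT32;
    return INTSET_ENC_INT16;
}

/* Binary search in a sorted array. Returns true and the index in *pos
 * when v is present; otherwise *pos is the insertion point that keeps
 * the array sorted. */
bool sorted_search(const int64_t *a, size_t n, int64_t v, size_t *pos) {
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < v) lo = mid + 1;
        else hi = mid;
    }
    *pos = lo;
    return lo < n && a[lo] == v;
}
```

Keeping the array sorted makes membership tests O(log n), at the price of shifting elements on insert, which is acceptable while the set stays below the configured entry limit.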