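With the rise of NoSQL, Redis has become one of its brightest stars and is getting more and more attention and use. I use Redis extensively at work, both as a cache and as a key-value DB. But as the data keeps growing, people start to worry: can Redis hold up? The official benchmarks look good, and you can always run your own stress tests, but reading the source is one of the most direct ways to settle a question. The question we want to answer here is: how does Redis delete expired data?
Using an IDE that supports "find references", let's follow the setex command (Set the value and expiration of a key) and see what happens: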
void setexCommand(redisClient *c) {
    c->argv[3] = tryObjectEncoding(c->argv[3]);
    setGenericCommand(c,0,c->argv[1],c->argv[3],c->argv[2]);
}
setGenericCommand is a generic function that implements set, setnx, and setex; only the arguments differ.
void setCommand(redisClient *c) {
    c->argv[2] = tryObjectEncoding(c->argv[2]);
    setGenericCommand(c,0,c->argv[1],c->argv[2],NULL);
}
void setnxCommand(redisClient *c) {
    c->argv[2] = tryObjectEncoding(c->argv[2]);
    setGenericCommand(c,1,c->argv[1],c->argv[2],NULL);
}
void setexCommand(redisClient *c) {
    c->argv[3] = tryObjectEncoding(c->argv[3]);
    setGenericCommand(c,0,c->argv[1],c->argv[3],c->argv[2]);
}
Now look at setGenericCommand:
 1 void setGenericCommand(redisClient *c, int nx, robj *key, robj *val, robj *expire) {
 2     long seconds = 0; /* initialized to avoid an harmness warning */
 3
 4     if (expire) {
 5         if (getLongFromObjectOrReply(c, expire, &seconds, NULL) != REDIS_OK)
 6             return;
 7         if (seconds <= 0) {
 8             addReplyError(c,"invalid expire time in SETEX");
 9             return;
10         }
11     }
12
13     if (lookupKeyWrite(c->db,key) != NULL && nx) {
14         addReply(c,shared.czero);
15         return;
16     }
17     setKey(c->db,key,val);
18     server.dirty++;
19     if (expire) setExpire(c->db,key,time(NULL)+seconds);
20     addReply(c, nx ? shared.cone : shared.ok);
21 }
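Line 13 handles the "Set the value of a key, only if the key does not exist" case, line 17 inserts the key, and line 19 sets its timeout. Note that the stored timestamp is already the absolute expiry time (time(NULL)+seconds), not the number of seconds remaining. Now let's look at the definition of redisDb (that is, c->db):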
typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB */
    dict *expires;              /* Timeout of keys with a timeout set */
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP) */
    dict *io_keys;              /* Keys with clients waiting for VM I/O */
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
    int id;
} redisDb;
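We only care about dict and expires, which store the key-value pairs and their timeouts respectively. In other words, a key-value pair that has a timeout is stored in dict and also in expires, roughly in this form: dict[key]:value, expires[key]:timeout.
Naturally, if a key-value pair has no timeout, the key does not appear in expires. The two remaining functions, setKey and setExpire, simply insert data into these two dictionaries, so I won't go into them here.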
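To make the two-dictionary layout concrete, here is a minimal sketch in C. It is not Redis code: the names toy_db, toy_set, toy_set_expire, toy_get_expire and toy_setex, the fixed-size linear tables, and the 31-character key limit are all assumptions made for illustration. It only shows the idea that a value lives in one table and its absolute expiry time lives in a second table keyed by the same name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define TOY_MAX 16  /* hypothetical capacity of each table */

typedef struct { char key[32]; char val[32]; int used; } toy_kv;
typedef struct { char key[32]; time_t when;  int used; } toy_ttl;

/* Stand-in for the two redisDb fields discussed above. */
typedef struct {
    toy_kv  dict[TOY_MAX];     /* key -> value */
    toy_ttl expires[TOY_MAX];  /* key -> absolute expiry time */
} toy_db;

/* Store key -> val in dict; returns 0 on success, -1 if dict is full. */
int toy_set(toy_db *db, const char *key, const char *val) {
    int slot = -1;
    for (int i = 0; i < TOY_MAX; i++) {
        if (db->dict[i].used && strcmp(db->dict[i].key, key) == 0) { slot = i; break; }
        if (!db->dict[i].used && slot < 0) slot = i;  /* remember first free slot */
    }
    if (slot < 0) return -1;
    snprintf(db->dict[slot].key, sizeof db->dict[slot].key, "%s", key);
    snprintf(db->dict[slot].val, sizeof db->dict[slot].val, "%s", val);
    db->dict[slot].used = 1;
    return 0;
}

/* Record an absolute expiry time in expires; 0 on success, -1 if full. */
int toy_set_expire(toy_db *db, const char *key, time_t when) {
    int slot = -1;
    for (int i = 0; i < TOY_MAX; i++) {
        if (db->expires[i].used && strcmp(db->expires[i].key, key) == 0) { slot = i; break; }
        if (!db->expires[i].used && slot < 0) slot = i;
    }
    if (slot < 0) return -1;
    snprintf(db->expires[slot].key, sizeof db->expires[slot].key, "%s", key);
    db->expires[slot].when = when;
    db->expires[slot].used = 1;
    return 0;
}

/* Returns the absolute expiry time, or -1 when the key has no timeout. */
time_t toy_get_expire(const toy_db *db, const char *key) {
    for (int i = 0; i < TOY_MAX; i++)
        if (db->expires[i].used && strcmp(db->expires[i].key, key) == 0)
            return db->expires[i].when;
    return (time_t)-1;
}

/* SETEX-like helper: the value goes to dict, now + seconds goes to expires. */
int toy_setex(toy_db *db, const char *key, const char *val, long seconds, time_t now) {
    assert(seconds > 0);
    if (toy_set(db, key, val) != 0) return -1;
    return toy_set_expire(db, key, now + seconds);
}
```

A zero-initialized toy_db is an empty database. After toy_set the key exists only in dict, so toy_get_expire reports -1; after toy_setex the same key also has an entry in expires holding the absolute deadline.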
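So how does Redis delete expired keys?
Looking through the callers of dbDelete, the first function that stands out is this one, which deletes a key if it has expired.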
 1 int expireIfNeeded(redisDb *db, robj *key) {
 2     time_t when = getExpire(db,key);
 3
 4     if (when < 0) return 0; /* No expire for this key */
 5
 6     /* Don't expire anything while loading. It will be done later. */
 7     if (server.loading) return 0;
 8
 9     /* If we are running in the context of a slave, return ASAP:
10      * the slave key expiration is controlled by the master that will
11      * send us synthesized DEL operations for expired keys.
12      *
13      * Still we try to return the right information to the caller,
14      * that is, 0 if we think the key should be still valid, 1 if
15      * we think the key is expired at this time. */
16     if (server.masterhost != NULL) {
17         return time(NULL) > when;
18     }
19
20     /* Return when this key has not expired */
21     if (time(NULL) <= when) return 0;
22
23     /* Delete the key */
24     server.stat_expiredkeys++;
25     propagateExpire(db,key);
26     return dbDelete(db,key);
27 }
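The "IfNeeded" suffix means "delete only when it should be deleted": line 4 skips keys that have no timeout, line 7 skips deletion while "loading", line 16 skips deletion when the server is not a master (it only reports whether the key looks expired), and line 21 skips keys that have not expired yet. Line 25 propagates the expiry to the slaves and to the append-only file.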
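The decision order described above can be sketched on its own, outside Redis. This is an illustrative model only: lazy_entry, lazy_server, lazy_expire_if_needed and lazy_lookup are hypothetical names, the current time is passed in as a parameter so the behaviour is easy to check, and propagation to slaves and the append-only file is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical single-key model: val == NULL means the key is gone. */
typedef struct {
    const char *val;
    time_t when;            /* absolute expiry time, -1 for "no expire" */
} lazy_entry;

typedef struct {
    int  loading;           /* stand-in for server.loading */
    int  is_slave;          /* stand-in for server.masterhost != NULL */
    long expired_count;     /* stand-in for server.stat_expiredkeys */
} lazy_server;

/* Same order of checks as the function above; returns 1 if the key is
 * considered expired, 0 otherwise. Only a master actually deletes. */
int lazy_expire_if_needed(lazy_server *srv, lazy_entry *e, time_t now) {
    if (e->when < 0) return 0;                 /* no timeout set */
    if (srv->loading) return 0;                /* never expire while loading */
    if (srv->is_slave) return now > e->when;   /* report only, keep the key */
    if (now <= e->when) return 0;              /* not expired yet */

    /* Delete the key and count it. */
    e->val = NULL;
    e->when = (time_t)-1;
    srv->expired_count++;
    return 1;
}

/* Hypothetical read path: check for expiry first, then return the value. */
const char *lazy_lookup(lazy_server *srv, lazy_entry *e, time_t now) {
    lazy_expire_if_needed(srv, e, now);
    return e->val;
}
```

Note the boundary case: a key whose deadline equals the current time is still returned, because the comparison is a strict "now > when". In this sketch a slave reports the key as expired but leaves the value in place, which mirrors the comment in the source about waiting for the master's DEL.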
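Next, let's see which functions call expireIfNeeded: lookupKeyRead, lookupKeyWrite, dbRandomKey, existsCommand, and keysCommand. As the names suggest, whenever a key is accessed, Redis also checks whether it has expired and deletes it if so, which ensures that a client talking to a master never gets an expired key back. But if a large number of keys expire and are never accessed again, a lot of memory is wasted. How does Redis deal with that?
Among the callers of dbDelete there is also this function: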
 1 /* Try to expire a few timed out keys. The algorithm used is adaptive and
 2  * will use few CPU cycles if there are few expiring keys, otherwise
 3  * it will get more aggressive to avoid that too much memory is used by
 4  * keys that can be removed from the keyspace. */
 5 void activeExpireCycle(void) {
 6     int j;
 7
 8     for (j = 0; j < server.dbnum; j++) {
 9         int expired;
10         redisDb *db = server.db+j;
11
12         /* Continue to expire if at the end of the cycle more than 25%
13          * of the keys were expired. */
14         do {
15             long num = dictSize(db->expires);
16             time_t now = time(NULL);
17
18             expired = 0;
19             if (num > REDIS_EXPIRELOOKUPS_PER_CRON)
20                 num = REDIS_EXPIRELOOKUPS_PER_CRON;
21             while (num--) {
22                 dictEntry *de;
23                 time_t t;
24
25                 if ((de = dictGetRandomKey(db->expires)) == NULL) break;
26                 t = (time_t) dictGetEntryVal(de);
27                 if (now > t) {
28                     sds key = dictGetEntryKey(de);
29                     robj *keyobj = createStringObject(key,sdslen(key));
30
31                     propagateExpire(db,keyobj);
32                     dbDelete(db,keyobj);
33                     decrRefCount(keyobj);
34                     expired++;
35                     server.stat_expiredkeys++;
36                 }
37             }
38         } while (expired > REDIS_EXPIRELOOKUPS_PER_CRON/4);
39     }
40 }
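The comment already states the intent: expire a few timed-out keys, and spend only a few CPU cycles when few keys are expiring. Line 25 picks a random key from expires, and line 38 leaves the loop once the number of sampled keys that turned out to be expired is low (no more than a quarter of REDIS_EXPIRELOOKUPS_PER_CRON). The function is called periodically from the server cron (serverCron), which in this generation of the source runs roughly every 100 milliseconds. The algorithm makes sure that each run removes a certain proportion of expired keys. If the total number of keys is large and this proportion is set too high, more loop iterations are needed and CPU is wasted; if it is set too low, expired keys pile up and memory is wasted. This is the classic time-space trade-off.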
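The sampling loop is easier to reason about when stripped down to an array of deadlines. The following is a sketch under stated assumptions, not the Redis implementation: expire_set and active_expire_cycle are hypothetical names, SKETCH_LOOKUPS_PER_CRON is an assumed value standing in for REDIS_EXPIRELOOKUPS_PER_CRON, rand() replaces dictGetRandomKey, and a single database is modelled.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define SKETCH_LOOKUPS_PER_CRON 10  /* assumed value, for illustration only */

/* Only the keys that carry a timeout, reduced to their deadlines. */
typedef struct {
    time_t *when;
    size_t  n;
} expire_set;

/* One cron invocation: sample up to SKETCH_LOOKUPS_PER_CRON random keys,
 * delete the expired ones, and repeat while more than a quarter of the
 * sample budget turned out to be expired. Returns the number deleted. */
size_t active_expire_cycle(expire_set *s, time_t now) {
    size_t total = 0;
    int expired;

    do {
        size_t num = s->n;
        expired = 0;
        if (num > SKETCH_LOOKUPS_PER_CRON) num = SKETCH_LOOKUPS_PER_CRON;

        for (size_t i = 0; i < num && s->n > 0; i++) {
            size_t idx = (size_t)rand() % s->n;   /* random key with a timeout */
            if (now > s->when[idx]) {
                /* Delete by moving the last element into the hole. */
                s->when[idx] = s->when[s->n - 1];
                s->n--;
                expired++;
            }
        }
        total += (size_t)expired;
    } while (expired > SKETCH_LOOKUPS_PER_CRON / 4);

    return total;
}
```

When nothing has expired, one round of at most ten probes is all the work done. When most sampled keys are expired, the loop keeps going within the same invocation, which is the "more aggressive" behaviour the source comment mentions. The threshold in the while condition is the knob behind the CPU versus memory trade-off discussed above.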
Finally, among the callers of dbDelete there is this function as well:
/* This function gets called when 'maxmemory' is set on the config file to limit
* the max memory used by the server, and we are out of memory.
* This function will try to, in order:
*
* - Free objects from the free list
* - Try to remove keys with an EXPIRE set
*
* It is not possible to free enough memory to reach used-memory < maxmemory
* the server will start refusing commands that will enlarge even more the
* memory usage.
*/
void freeMemoryIfNeeded(void)
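This function is too long to walk through here. Its comment explains that it is only called when a maximum memory limit (maxmemory) is set in the config file, and setting that parameter means you are treating Redis as an in-memory cache rather than as a key-value database.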
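Since the body of the function is not reproduced here, the following is only a guess at the general shape of "try to remove keys with an EXPIRE set", not a transcription of it. Everything in it is an assumption made for illustration: the names volatile_key, mem_state and free_memory_if_needed, the per-key size field, the sample size SKETCH_SAMPLES, and the policy of evicting the sampled key with the nearest deadline.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define SKETCH_SAMPLES 3  /* assumed sample size, not taken from the source */

typedef struct {
    time_t when;          /* absolute expiry time */
    size_t size;          /* bytes this key is assumed to occupy */
} volatile_key;

typedef struct {
    volatile_key *keys;   /* only keys that carry an expire */
    size_t n;
    size_t used_memory;   /* hypothetical running total in bytes */
} mem_state;

/* Evict keys that have an expire until used_memory <= maxmemory or no such
 * keys are left. Each step samples a few keys at random and evicts the one
 * closest to its deadline. Returns the number of evicted keys. */
size_t free_memory_if_needed(mem_state *m, size_t maxmemory) {
    size_t evicted = 0;

    while (m->used_memory > maxmemory && m->n > 0) {
        size_t best = (size_t)rand() % m->n;
        for (int k = 1; k < SKETCH_SAMPLES; k++) {
            size_t idx = (size_t)rand() % m->n;
            if (m->keys[idx].when < m->keys[best].when) best = idx;
        }
        m->used_memory -= m->keys[best].size;
        m->keys[best] = m->keys[m->n - 1];   /* fill the hole with the last key */
        m->n--;
        evicted++;
    }
    return evicted;
}
```

If the loop ends with used_memory still above the limit, no keys with an expire are left to sacrifice, which corresponds to the situation in which, according to the source comment, the server starts refusing commands that would grow memory further.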
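Of these three ways of deleting expired keys, the second, periodically deleting a proportion of the expired keys, is the main one; the first, "delete on read", guarantees that expired keys are never returned; and the third is a brute-force measure for when memory exceeds the configured limit. Together they show how neatly Redis is designed.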