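1) Data structure
Based on the HyperLogLog algorithm; counts distinct elements using a very small amount of space.
Under the hood it is still a string.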
127.0.0.1:6379> type hyperLogLog_key
string
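2) Three commands
pfadd key element [element ...] adds elements to a hyperloglog
pfcount key [key ...] returns the number of distinct elements in the hyperloglog(s)
pfmerge destkey sourcekey [sourcekey ...] merges multiple hyperloglogs
Two further commands exist, but they are internal and meant for testing and debugging:
pfselftest
pfdebug arg arg ...options...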
127.0.0.1:6379> pfadd userIds "uuid-1" "uuid-2" "uuid-3" "uuid-4"
(integer) 1
127.0.0.1:6379> pfcount userIds
(integer) 4
127.0.0.1:6379> pfadd userIds "uuid-1" "uuid-2" "uuid-3" "uuid-90"
(integer) 1
127.0.0.1:6379> pfcount userIds
(integer) 5
127.0.0.1:6379> pfadd userIds2 "uuid-4" "uuid-5" "uuid-6" "uuid-7"
(integer) 1
127.0.0.1:6379> pfcount userIds2
(integer) 4
127.0.0.1:6379> pfcount userIds1 userIds2
(integer) 4
127.0.0.1:6379> pfmerge userIdMerge userIds1 userIds2
OK
127.0.0.1:6379> pfcount userIdMerge
(integer) 4
127.0.0.1:6379> pfmerge userIdMerge userIds userIds2
OK
127.0.0.1:6379> pfcount userIdMerge
(integer) 8
127.0.0.1:6379> PFSELFTEST
OK
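As a sanity check on the session above, here is a minimal sketch that computes the exact union size of the two element lists with plain shell tools (no Redis involved). The variable names are just for illustration. It shows why merging userIds and userIds2 gives 8 rather than 5 + 4 = 9: "uuid-4" appears in both and is counted once.

```shell
# Elements added to each key in the session above
userIds="uuid-1 uuid-2 uuid-3 uuid-4 uuid-90"
userIds2="uuid-4 uuid-5 uuid-6 uuid-7"

# Variables are left unquoted on purpose so the shell splits them into words.
count_a=$(printf '%s\n' $userIds | wc -l | tr -d ' ')
count_b=$(printf '%s\n' $userIds2 | wc -l | tr -d ' ')

# Exact distinct count of the union: one element per line, deduplicate, count
exact_union=$(printf '%s\n' $userIds $userIds2 | sort -u | wc -l | tr -d ' ')

echo "sum of counts: $((count_a + count_b))"
echo "exact union:   ${exact_union}"
```

For sets this small, pfcount on the merged key matches the exact value; the approximation only becomes noticeable at much larger cardinalities.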
3) Memory usage: one million unique users
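elements=""
key="2019_05_01:unique:ids"
for i in $(seq 1 1000000)
do
    elements="${elements} uuid-${i}"
    # send a batch to Redis every 1000 elements, then start a new batch
    if [[ $((i % 1000)) == 0 ]]
    then
        # ${elements} is left unquoted so each id becomes a separate argument
        redis-cli pfadd ${key} ${elements}
        elements=""
    fi
done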
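Memory use
Compared with a bitmap or a set, HyperLogLog saves a striking amount of memory: a Redis HyperLogLog key takes at most about 12 KB, no matter how many elements are added.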
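To put rough numbers on this comparison, here is a back-of-the-envelope sketch. It rests on assumptions that are not in the notes above: Redis's dense HyperLogLog layout of 16384 registers at 6 bits each, a bitmap with one bit per user (which requires integer user ids), and a set estimated only by its raw payload of one 8-byte integer id per member, ignoring all per-entry overhead. Real usage for the set would be higher.

```shell
users=1000000

# HyperLogLog (dense encoding): 16384 registers * 6 bits, converted to bytes
hll_bytes=$(( 16384 * 6 / 8 ))

# Bitmap: one bit per user id, assuming ids are integers in [0, users)
bitmap_bytes=$(( users / 8 ))

# Set: lower bound, assuming an 8-byte integer id per member and no overhead
set_bytes=$(( users * 8 ))

echo "hyperloglog: ${hll_bytes} bytes"
echo "bitmap:      ${bitmap_bytes} bytes"
echo "set (min):   ${set_bytes} bytes"
```

Under these assumptions the HyperLogLog is about a tenth the size of the bitmap and several hundred times smaller than the set's bare payload, and it stays the same size as the user count grows.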
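4) Usage notes
- Can you tolerate some error? (standard error: 0.81%)
127.0.0.1:6379> pfcount 2019_05_01:unique:ids
(integer) 1009886
- Do you need to read back individual elements? (They cannot be retrieved.)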
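The error figures can be checked with a little arithmetic. The sketch below computes the relative error of the count above against the 1,000,000 ids that were actually inserted, and compares it with the theoretical standard error. The formula 1.04 / sqrt(m) and the register count m = 16384 are assumptions taken from the HyperLogLog algorithm and Redis's implementation, not from the notes themselves.

```shell
observed=1009886   # value returned by pfcount
actual=1000000     # number of distinct ids actually added

# Relative error of this particular run, as a percentage
rel_err=$(LC_ALL=C awk -v o="$observed" -v a="$actual" \
    'BEGIN { printf "%.2f", (o - a) / a * 100 }')

# Theoretical standard error: 1.04 / sqrt(m), with m = 16384 registers
std_err=$(LC_ALL=C awk 'BEGIN { printf "%.2f", 1.04 / sqrt(16384) * 100 }')

echo "observed relative error: ${rel_err}%"
echo "standard error:          ${std_err}%"
```

The 0.81% is a standard error rather than a hard bound, so a single run landing slightly above it, as this one does, is expected from time to time.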