美文网首页
一致性 hash 数据分布测试

一致性 hash 数据分布测试

作者: 董泽润 | 来源:发表于2019-05-11 15:36 被阅读0次

    关于什么是 Consistent Hashing 可以参考这篇文章。一般用于对于数据状态有一定要求的缓存场景,比如 web session 数据。

    session consistent hash
    节点根据算法,映射到环中。用户请求时,根据 id 求出 hash 值,按顺(逆)时针找到的第一个节点,就是存储数据节点。一般为了更加均匀些,会将节点按 replica(weight) 复制多个虚拟节点映射到环中。

    数据不均匀问题

    了解 LB 的都知道,lvs 这些负载均衡设备能做到请求均匀,但是经典实现的一致性 hash 是做不到的。具体场景,可能差别更大。测试脚本放到了 consistenthash balance test github,感兴趣的可以看下,测试场景如下:

    节点数量:10
    虚拟系数:3, 10, 50, 100, 200, 400, 600, 800, 1000
    测试算法:MurMurHash, Crc32, Fnv1, CityHash
    请求数量:50w
    
    方差

    方差可以很好的衡量数据分布程度,可以看到,Crc32 是最差的,Fnv1 在数量很少时也差,MurMurHash 和 CityHash 在虚拟节点超过 100 个后讯速收敛。

    diff ratio

    diff ratio 指数据分布最多的与最小的差值,除以平均值。从数据分布可以看到, MurMurHash, CityHash 首选的 hash 算法。replica 虚节点数量可以调大一些,毕竟动态增减节点不是常态,当然算法性能也是一种衡量标准。

    另一种方案

    冒泡哥提到了一种实现,提前预计算好节点顺序,可以减少分布 diff ratio. 他在生产环境,只用 20 replica 就降低到了 5% 当然了前提是,增加减节点,必须符合一致性 hash 的思想,数据尽可能少的移动。

    测试数据

    MurMur  replica:3   var:43930   max:72370   min:35411   diff:36959  ratio:73.918000
    MurMur  replica:10  var:69441   max:85238   min:19257   diff:65981  ratio:131.962000
    MurMur  replica:50  var:20520   max:59708   min:40669   diff:19039  ratio:38.078000
    MurMur  replica:100 var:12556   max:57515   min:42045   diff:15470  ratio:30.940000
    MurMur  replica:200 var:11539   max:55858   min:41998   diff:13860  ratio:27.720000
    MurMur  replica:400 var:7161    max:53385   min:45323   diff:8062   ratio:16.124000
    MurMur  replica:600 var:5586    max:51697   min:45913   diff:5784   ratio:11.568000
    MurMur  replica:800 var:4101    max:52118   min:48041   diff:4077   ratio:8.154000
    MurMur  replica:1000    var:4369    max:52888   min:48356   diff:4532   ratio:9.064000
    Crc32   replica:3   var:76353   max:97097   min:23872   diff:73225  ratio:146.450000
    Crc32   replica:10  var:30219   max:61334   min:31891   diff:29443  ratio:58.886000
    Crc32   replica:50  var:12231   max:55475   min:43926   diff:11549  ratio:23.098000
    Crc32   replica:100 var:22105   max:58828   min:37776   diff:21052  ratio:42.104000
    Crc32   replica:200 var:28465   max:59481   min:35807   diff:23674  ratio:47.348000
    Crc32   replica:400 var:25440   max:58781   min:37059   diff:21722  ratio:43.444000
    Crc32   replica:600 var:33926   max:61694   min:35887   diff:25807  ratio:51.614000
    Crc32   replica:800 var:33978   max:61327   min:36914   diff:24413  ratio:48.826000
    Crc32   replica:1000    var:27789   max:60515   min:37709   diff:22806  ratio:45.612000
    Fnv1    replica:3   var:348461  max:377986  min:5997    diff:371989 ratio:743.978000
    Fnv1    replica:10  var:200673  max:231641  min:14451   diff:217190 ratio:434.380000
    Fnv1    replica:50  var:41072   max:84924   min:38135   diff:46789  ratio:93.578000
    Fnv1    replica:100 var:11567   max:59547   min:47254   diff:12293  ratio:24.586000
    Fnv1    replica:200 var:6465    max:53451   min:46127   diff:7324   ratio:14.648000
    Fnv1    replica:400 var:5126    max:52454   min:47757   diff:4697   ratio:9.394000
    Fnv1    replica:600 var:3748    max:52520   min:48394   diff:4126   ratio:8.252000
    Fnv1    replica:800 var:6089    max:52003   min:46211   diff:5792   ratio:11.584000
    Fnv1    replica:1000    var:5881    max:53028   min:46902   diff:6126   ratio:12.252000
    Cityhash    replica:3   var:77979   max:99378   min:11576   diff:87802  ratio:175.604000
    Cityhash    replica:10  var:60194   max:84045   min:25002   diff:59043  ratio:118.086000
    Cityhash    replica:50  var:26496   max:67141   min:38121   diff:29020  ratio:58.040000
    Cityhash    replica:100 var:11862   max:56008   min:44075   diff:11933  ratio:23.866000
    Cityhash    replica:200 var:9921    max:54711   min:44592   diff:10119  ratio:20.238000
    Cityhash    replica:400 var:5542    max:53629   min:47391   diff:6238   ratio:12.476000
    Cityhash    replica:600 var:6823    max:53417   min:46261   diff:7156   ratio:14.312000
    Cityhash    replica:800 var:5100    max:51615   min:46577   diff:5038   ratio:10.076000
    Cityhash    replica:1000    var:4884    max:52151   min:47166   diff:4985   ratio:9.970000
    

    相关文章

      网友评论

          本文标题:一致性 hash 数据分布测试

          本文链接:https://www.haomeiwen.com/subject/sgexaqtx.html