Redis Distributed Locks


Author: 爱恨_交加 | Published: 2020-08-26 17:56

    Distributed locks with Redis

    Distributed locks are a very useful primitive in many environments where different processes must operate on shared resources in a mutually exclusive way.

    There are a number of libraries and blog posts describing how to implement a DLM (Distributed Lock Manager) with Redis, but every library uses a different approach, and many use a simple approach with weaker guarantees than what can be achieved with slightly more complex designs.

    This page is an attempt to provide a more canonical algorithm to implement distributed locks with Redis. We propose an algorithm, called Redlock, which implements a DLM that we believe to be safer than the vanilla single instance approach. We hope that the community will analyze it, provide feedback, and use it as a starting point for implementations or for more complex or alternative designs.

    Implementations

    Before describing the algorithm, here are a few links to existing implementations of Redlock in various languages that can be used for reference.

    Safety and Liveness guarantees

    We are going to model our design with just three properties that, from our point of view, are the minimum guarantees needed to use distributed locks in an effective way.

    • Safety property: Mutual exclusion. At any given moment, only one client can hold a lock.
    • Liveness property A: Deadlock free. Eventually it is always possible to acquire a lock, even if the client that locked a resource crashes or gets partitioned.
    • Liveness property B: Fault tolerance. As long as the majority of Redis nodes are up, clients are able to acquire and release locks.

    Why failover-based implementations are not enough

    To understand what we want to improve, let’s analyze the current state of affairs with most Redis-based distributed lock libraries.

    The simplest way to use Redis to lock a resource is to create a key in an instance. The key is usually created with a limited time to live, using the Redis expires feature, so that eventually it will get released (liveness property A in our list, the deadlock-free guarantee). When the client needs to release the resource, it deletes the key.

    Superficially this works well, but there is a problem: this is a single point of failure in our architecture. What happens if the Redis master goes down? Well, let’s add a slave, and use it if the master is unavailable! This is unfortunately not viable. By doing so we can’t implement our safety property of mutual exclusion, because Redis replication is asynchronous.

    There is an obvious race condition with this model:

    1. Client A acquires the lock in the master.
    2. The master crashes before the write to the key is transmitted to the slave.
    3. The slave gets promoted to master.
    4. Client B acquires the lock to the same resource A already holds a lock for. SAFETY VIOLATION!
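
    The four steps above can be replayed in a toy model. This is only a sketch, not real Redis code: the `Node` class, the `pending` list that stands in for the replication stream, and the key and client names are all made up for illustration.

```python
# Toy model of asynchronous replication; not real Redis code.
class Node:
    def __init__(self):
        self.data = {}

    def set_nx(self, key, value):
        """Set key only if absent; return True on success."""
        if key in self.data:
            return False
        self.data[key] = value
        return True


master, replica = Node(), Node()
pending = []  # writes the master accepted but has not yet sent to the replica

# 1. Client A acquires the lock on the master.
a_acquired = master.set_nx("resource", "client-A")
pending.append(("resource", "client-A"))

# 2. The master crashes before the pending write reaches the replica.
pending.clear()

# 3. The replica is promoted to master; it never saw A's key.
master = replica

# 4. Client B acquires the same lock, so A and B both believe they hold it.
b_acquired = master.set_nx("resource", "client-B")

print(a_acquired, b_acquired)
```

    Both acquisitions succeed, which is exactly the mutual exclusion violation described in step 4.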

    Sometimes it is perfectly fine that under special circumstances, like during a failure, multiple clients can hold the lock at the same time. If this is the case, you can use your replication-based solution. Otherwise we suggest implementing the solution described in this document.

    Correct implementation with a single instance

    Before trying to overcome the limitation of the single instance setup described above, let’s check how to do it correctly in this simple case. This is actually a viable solution in applications where an occasional race condition is acceptable, and locking on a single instance is the foundation we’ll use for the distributed algorithm described here.

    To acquire the lock, the way to go is the following:

    SET resource_name my_random_value NX PX 30000
    

    The command will set the key only if it does not already exist (NX option), with an expiry of 30000 milliseconds (PX option). The key is set to the value “my_random_value”. This value must be unique across all clients and all lock requests.
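
    A minimal client-side sketch of this acquisition step follows. The `acquire_lock` helper and the `InMemoryStore` stand-in are hypothetical names introduced here so the example runs without a server; the stand-in only mimics the NX and PX behaviour described above. As far as I know, redis-py's `Redis.set` accepts the same `nx` and `px` keyword arguments, so a real client could be passed in its place, but treat that as an assumption to verify.

```python
import secrets
import time


class InMemoryStore:
    """Hypothetical stand-in that mimics SET with NX and PX on one instance."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry as monotonic seconds or None)

    def set(self, name, value, nx=False, px=None):
        now = time.monotonic()
        current = self._data.get(name)
        if current is not None and current[1] is not None and current[1] <= now:
            del self._data[name]  # an expired key behaves as if absent
            current = None
        if nx and current is not None:
            return None  # key exists, so NX refuses to overwrite it
        expires_at = now + px / 1000 if px is not None else None
        self._data[name] = (value, expires_at)
        return True


def acquire_lock(client, resource, ttl_ms=30000):
    """Return the lock token on success, or None if the lock is held."""
    token = secrets.token_hex(20)  # fresh random value for every request
    if client.set(resource, token, nx=True, px=ttl_ms):
        return token
    return None


store = InMemoryStore()
first = acquire_lock(store, "resource_name")
second = acquire_lock(store, "resource_name")
print(first is not None, second)
```

    The first call gets a token back; the second returns `None` because the key already exists and has not expired.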

    Basically the random value is used in order to release the lock in a safe way, with a script that tells Redis: remove the key only if it exists and the value stored at the key is exactly the one I expect it to be. This is accomplished by the following Lua script:

    if redis.call("get",KEYS[1]) == ARGV[1] then
        return redis.call("del",KEYS[1])
    else
        return 0
    end
    

    This is important in order to avoid removing a lock that was created by another client. For example, a client may acquire the lock, get blocked in some operation for longer than the lock validity time (the time after which the key expires), and later remove a lock that has since been acquired by some other client. Using just DEL is not safe, as a client may remove the lock of another client. With the above script, every lock is instead “signed” with a random string, so the lock will be removed only if it is still the one that was set by the client trying to remove it.
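
    The difference between a plain DEL and the token check can be seen in a small simulation. This is a sketch under stated assumptions: `LockStore`, its manual `now_ms` clock, and the token strings are invented for the example, and `safe_release` reproduces the script's comparison in Python rather than running Lua.

```python
class LockStore:
    """Toy single-instance store; expiry is driven by a manual clock."""

    def __init__(self):
        self.now_ms = 0
        self._data = {}  # key -> (token, expiry in ms)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self.now_ms:
            del self._data[key]  # drop the key once its TTL has passed
            entry = None
        return entry

    def acquire(self, key, token, ttl_ms):
        if self._live(key) is not None:
            return False
        self._data[key] = (token, self.now_ms + ttl_ms)
        return True

    def unsafe_release(self, key):
        """Plain DEL: removes whatever lock is stored under the key."""
        return 1 if self._data.pop(key, None) is not None else 0

    def safe_release(self, key, token):
        """Same check as the Lua script: delete only if the token matches."""
        entry = self._live(key)
        if entry is not None and entry[0] == token:
            del self._data[key]
            return 1
        return 0

    def holder(self, key):
        entry = self._live(key)
        return entry[0] if entry else None


def scenario(release):
    store = LockStore()
    store.acquire("resource", "token-A", ttl_ms=30000)
    store.now_ms = 30001  # A stalls past the lock validity time
    store.acquire("resource", "token-B", ttl_ms=30000)  # B now holds the lock
    release(store)  # A wakes up and tries to release
    return store.holder("resource")


after_del = scenario(lambda s: s.unsafe_release("resource"))
after_script = scenario(lambda s: s.safe_release("resource", "token-A"))
print(after_del, after_script)
```

    With the plain delete, B's lock disappears; with the token check, A's late release is a no-op and B remains the holder.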

    What should this random string be? I assume it’s 20 bytes from /dev/urandom, but you can find cheaper ways to make it unique enough for your tasks. For example, a safe pick is to seed RC4 with /dev/urandom and generate a pseudo-random stream from that. A simpler solution is to combine the Unix time at microsecond resolution with a client ID. It is not as safe, but probably up to the task in most environments.
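
    Two of these options might look as follows in Python. This is an illustrative sketch: the function names and the `client-42` ID are assumptions, and `secrets.token_hex` is used here as a convenient way to draw 20 bytes from the operating system's random source rather than reading /dev/urandom directly.

```python
import secrets
import time


def random_token():
    """20 bytes from the OS random source, hex-encoded (40 characters)."""
    return secrets.token_hex(20)


def timestamp_token(client_id):
    """Cheaper, weaker variant: microsecond Unix time plus a client ID."""
    micros = time.time_ns() // 1000
    return f"{micros}:{client_id}"


strong = random_token()
cheap = timestamp_token("client-42")  # "client-42" is a placeholder ID
print(strong, cheap)
```

    The timestamp variant is only as unique as the client IDs are, and two requests from the same client within the same microsecond would collide.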

    The time we use as the key’s time to live is called the “lock validity time”. It is both the auto-release time and the time the client has to perform its operation before another client may be able to acquire the lock again without technically violating the mutual exclusion guarantee, which is limited to a given window of time from the moment the lock is acquired.
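
    One way a client could respect this window is to track how much of the validity time is left before starting work. The helpers below are a hypothetical sketch, not part of the original text; they assume the client records a monotonic timestamp just before sending the SET command, so that its estimate errs on the short side.

```python
import time


def remaining_validity_ms(acquired_at, ttl_ms, now=None):
    """Milliseconds left before the key expires, clamped at zero."""
    if now is None:
        now = time.monotonic()
    elapsed_ms = (now - acquired_at) * 1000
    return max(0.0, ttl_ms - elapsed_ms)


def may_proceed(acquired_at, ttl_ms, needed_ms, now=None):
    """True if the planned work fits in what is left of the validity time."""
    return remaining_validity_ms(acquired_at, ttl_ms, now) > needed_ms


# Lock taken at t=100.0 s with a 30000 ms TTL; it is now t=125.0 s.
left = remaining_validity_ms(100.0, 30000, now=125.0)
print(left, may_proceed(100.0, 30000, needed_ms=2000, now=125.0))
```

    A check like this narrows the risk but does not remove it: a client can still stall after the check, which is why release must go through the token comparison.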

    So now we have a good way to acquire and release the lock. Reasoning about a non-distributed system composed of a single, always available instance, this approach is safe. Let’s extend the concept to a distributed system where we don’t have such guarantees.


Article link: https://www.haomeiwen.com/subject/jtltsktx.html