美文网首页程序员Java技术
改进版Snowflake全局ID生成器-uid-generato

改进版Snowflake全局ID生成器-uid-generato

作者: shiy4n | 来源:发表于2019-06-19 21:15 被阅读51次

    本文主要介绍 uid-generator (一种全局ID服务实现)

    uid-generator介绍

    全局ID服务是分布式服务中的基础服务,需要保持全局唯一,高效,高可靠性。有些时候还可能要求保持单调,但也并非一定要严格递增或者递减

    全局ID也可以通过数据库的自增主键来获取,但是如果要求QPS很高显然是不现实的

    uid-generator是对Snowflake算法的改进,也引入了高性能队列disruptor中RingBuffer思想,进一步提升了效率

      +------+----------------------+----------------+-----------+
      | sign |     delta seconds    | worker node id | sequence  |
      +------+----------------------+----------------+-----------+
        1bit          28bits              22bits         13bits
    
    • sign 符号位 保证为正数

    • delta seconds 当前时间与约定时间的差值

    • word node id机器码

    • sequence 同一时刻支持并发数

    上图就是snowflake算法生成的64位的长整型构成

    uid-generator的work node id 使用了数据库自增主键的key,每次重启服务都需要刷新,这也保证了集群中全局ID的唯一性

    worker node id字段处理

    uid-generator使用数据库主键作为worker node id

    这样看来这个worker node id其实可以有很丰富的扩展性,只要对表结构稍微修改,就可以记录使得worker node id 有更有意义的含义。

    例如使用uid-generator生成的值作为表的主键ID,可以通过对WORKER_NODE表增加一列表名记录表,这样通过id就反向查找对应的表名

    sequence字段的处理

    uid-generator中实现了原生的snowflake以及缓存版的。这两个版本对于sequence字段的处理有所不同

    DefaultUidGenerator.java

    /**
    * Get UID
    *
    * @return UID
    * @throws UidGenerateException in the case: Clock moved backwards; Exceeds the max timestamp
    */
    protected synchronized long nextId() {
        long currentSecond = getCurrentSecond();
    
        // Clock moved backwards, refuse to generate uid
        if (currentSecond < lastSecond) {
            long refusedSeconds = lastSecond - currentSecond;
            throw new UidGenerateException("Clock moved backwards. Refusing for %d seconds", refusedSeconds);
        }
    
        // At the same second, increase sequence
        if (currentSecond == lastSecond) {
            sequence = (sequence + 1) & bitsAllocator.getMaxSequence();
            // Exceed the max sequence, we wait the next second to generate uid
            if (sequence == 0) {
                currentSecond = getNextSecond(lastSecond);
            }
    
        // At the different second, sequence restart from zero
        } else {
            sequence = 0L;
        }
    
        lastSecond = currentSecond;
    
        // Allocate bits for UID
        return bitsAllocator.allocate(currentSecond - epochSeconds, workerId, sequence);
    }
    
    

    DefaultUidGenerator 并发通过 函数加锁控制;获取seq时通过时间判断是否需要调到下一秒

    CachedUidGenerator.java

     /**
        * Padding buffer fill the slots until to catch the cursor
        */
    public void paddingBuffer() {
        LOGGER.info("Ready to padding buffer lastSecond:{}. {}", lastSecond.get(), ringBuffer);
    
        // is still running
        if (!running.compareAndSet(false, true)) {
            LOGGER.info("Padding buffer is still running. {}", ringBuffer);
            return;
        }
    
        // fill the rest slots until to catch the cursor
        boolean isFullRingBuffer = false;
        while (!isFullRingBuffer) {
            List<Long> uidList = uidProvider.provide(lastSecond.incrementAndGet());
            for (Long uid : uidList) {
                isFullRingBuffer = !ringBuffer.put(uid);
                if (isFullRingBuffer) {
                    break;
                }
            }
        }
    
        // not running now
        running.compareAndSet(true, false);
        LOGGER.info("End to padding buffer lastSecond:{}. {}", lastSecond.get(), ringBuffer);
    }
    

    CachedUidGenerator 加锁通过CAS操作;由于需要一次填充完缓存,所以选取了一次填充一秒内所有的seq,以此保证了seq在一秒内的唯一性。这样带来的一个小弊端是不能通过id看出来这个id生成的时间

    CachedUidGenerator核心RingBuffer实现

    RingBuffer是一个环形数组,通过两个指针,tailcursor来实现复用槽

    在这里需要介绍一下FalseShare陷阱,由于tail和cursor指针在高并发情况下变动频繁,如果tail,cursor处于同一个缓存中,将会频繁导致缓存失效,可以看到uid-generator已经考虑了这个问题

    通过对PaddedAtomicLong进行填充,保证了每一个long值都在不同的缓存行中,解决了这个问题

    RingBuffer基本都用位运算取代了乘除以及取模计算,提高了计算效率

    /**
        * Put an UID in the ring & tail moved<br>
        * We use 'synchronized' to guarantee the UID fill in slot & publish new tail sequence as atomic operations<br>
        * 
        * <b>Note that: </b> It is recommended to put UID in a serialize way, cause we once batch generate a series UIDs and put
        * the one by one into the buffer, so it is unnecessary put in multi-threads
        *
        * @param uid
        * @return false means that the buffer is full, apply {@link RejectedPutBufferHandler}
        */
    public synchronized boolean put(long uid) {
        long currentTail = tail.get();
        long currentCursor = cursor.get();
    
        // tail catches the cursor, means that you can't put any cause of RingBuffer is full
        long distance = currentTail - (currentCursor == START_POINT ? 0 : currentCursor);
        if (distance == bufferSize - 1) {
            rejectedPutHandler.rejectPutBuffer(this, uid);
            return false;
        }
    
        // 1. pre-check whether the flag is CAN_PUT_FLAG
        int nextTailIndex = calSlotIndex(currentTail + 1);
        if (flags[nextTailIndex].get() != CAN_PUT_FLAG) {
            rejectedPutHandler.rejectPutBuffer(this, uid);
            return false;
        }
    
        // 2. put UID in the next slot
        // 3. update next slot' flag to CAN_TAKE_FLAG
        // 4. publish tail with sequence increase by one
        slots[nextTailIndex] = uid;
        flags[nextTailIndex].set(CAN_TAKE_FLAG);
        tail.incrementAndGet();
    
        // The atomicity of operations above, guarantees by 'synchronized'. In another word,
        // the take operation can't consume the UID we just put, until the tail is published(tail.incrementAndGet())
        return true;
    }
    

    RingBuffer的put方法中可以看到整个的流程,首先是函数加锁,加锁的原因在注释部分也解释了,由于是每次批量存入的,多线程put操作是没有必要的,之后第一步计算tail与cursor距离当前数组是否还可以继续填充。这里还有另外一个标识位用来判断当前槽是否可以做PUT以及TAKE操作,更像是双保险,防止上一个判断距离结束了之后tail位置有变动,导致槽位被覆盖

    同样对于take操作

        /**
         * Take an UID of the ring at the next cursor, this is a lock free operation by using atomic cursor<p>
         * 
         * Before getting the UID, we also check whether reach the padding threshold, 
         * the padding buffer operation will be triggered in another thread<br>
         * If there is no more available UID to be taken, the specified {@link RejectedTakeBufferHandler} will be applied<br>
         * 
         * @return UID
         * @throws IllegalStateException if the cursor moved back
         */
        public long take() {
            // spin get next available cursor
            long currentCursor = cursor.get();
            long nextCursor = cursor.updateAndGet(old -> old == tail.get() ? old : old + 1);
    
            // check for safety consideration, it never occurs
            Assert.isTrue(nextCursor >= currentCursor, "Curosr can't move back");
    
            // trigger padding in an async-mode if reach the threshold
            long currentTail = tail.get();
            if (currentTail - nextCursor < paddingThreshold) {
                LOGGER.info("Reach the padding threshold:{}. tail:{}, cursor:{}, rest:{}", paddingThreshold, currentTail,
                        nextCursor, currentTail - nextCursor);
                bufferPaddingExecutor.asyncPadding();
            }
    
            // cursor catch the tail, means that there is no more available UID to take
            if (nextCursor == currentCursor) {
                rejectedTakeHandler.rejectTakeBuffer(this);
            }
    
            // 1. check next slot flag is CAN_TAKE_FLAG
            int nextCursorIndex = calSlotIndex(nextCursor);
            Assert.isTrue(flags[nextCursorIndex].get() == CAN_TAKE_FLAG, "Curosr not in can take status");
    
            // 2. get UID from next slot
            // 3. set next slot flag as CAN_PUT_FLAG.
            long uid = slots[nextCursorIndex];
            flags[nextCursorIndex].set(CAN_PUT_FLAG);
    
            // Note that: Step 2,3 can not swap. If we set flag before get value of slot, the producer may overwrite the
            // slot with a new UID, and this may cause the consumer take the UID twice after walk a round the ring
            return uid;
        }
    

    正如注释中所说,take部分并没有并发限制,在剩余可用槽位小于一个阈值的时候,会触发一次填充操作

    CachedUidGenerator 对于填充有两种处理,一个是低于阈值填充,一种是开启Schedule,定时填充,定时填充可选

    uid-generator可靠性很高,除了workid依赖数据库之外基本不依赖外部中间件,全局ID在分布式服务中扮演关键角色,一旦服务出错,解决起来也很棘手。

    相关文章

      网友评论

        本文标题:改进版Snowflake全局ID生成器-uid-generato

        本文链接:https://www.haomeiwen.com/subject/dwmlqctx.html