OpenZeppelin库EnumerableSet数据结构的分

作者: 渚清与沙白 | 来源:发表于2023-08-17 14:40 被阅读0次

OpenZeppelin库是基于Solidity 0.8.0版本进行分析。(https://docs.openzeppelin.com/contracts/4.x/)
首先EnumerableSet是一种数据结构，用来存储数据，方便增删改查的操作。

在Solidity语言中，key-value形式的mapping数据类型也支持增删改查，但是它的删除只会将value设置成类型对应的默认值。所以，要实现存储的值真正意义上的删除，需要借助数组来实现。使用数组还可以获取数组长度（_length函数），检查元素是否存在(_contains函数)，便于查找(_at函数)，这些mapping都无法单独实现。

于是，其中就定义了一个结构体Set。

struct Set {
        // Storage of set values
        bytes32[] _values;
        // Position of the value in the `values` array, plus 1 because index 0
        // means a value is not in the set.
        mapping(bytes32 => uint256) _indexes;
    }

_values变量是一个字节数组用来存储值，_indexes是一个mapping类型，用来记录值在数组中的位置。
_indexes变量的key，bytes32是我们需要存储的值；_indexes变量的value，uint256就是value在数组中的索引位置。

add函数

// 存储值value在数组中的位置是否为0，为0表示数组中不存在，否则数组中存在该value
function _contains(Set storage set, bytes32 value) private view returns (bool) {
        return set._indexes[value] != 0;
}
// 新增 
function _add(Set storage set, bytes32 value) private returns (bool) {
        if (!_contains(set, value)) {
            set._values.push(value);
            // The value is stored at length-1, but we add 1 to all indexes
            // and use 0 as a sentinel value
            set._indexes[value] = set._values.length;
            return true;
        } else {
            return false;
        }
    }

在_add函数中，传递byte32类型的value参数。先检查数组是否存储过该值，已经存储返回false，否则进行存储，返回true。
这里存储使用了数组的push函数，同时在mapping类型的_indexes中记录当前数组的索引位置，这里给的是数组的长度。存储一个值，索引位置就是1，每新增一个值，索引位置会依次累加。这样就实现了新增功能。

remove

function _remove(Set storage set, bytes32 value) private returns (bool) {
        // We read and store the value's index to prevent multiple reads from the same storage slot
        uint256 valueIndex = set._indexes[value];

        if (valueIndex != 0) {
            // Equivalent to contains(set, value)
            // To delete an element from the _values array in O(1), we swap the element to delete with the last one in
            // the array, and then remove the last element (sometimes called as 'swap and pop').
            // This modifies the order of the array, as noted in {at}.

            uint256 toDeleteIndex = valueIndex - 1;
            uint256 lastIndex = set._values.length - 1;

            if (lastIndex != toDeleteIndex) {
                bytes32 lastValue = set._values[lastIndex];

                // Move the last value to the index where the value to delete is
                set._values[toDeleteIndex] = lastValue;
                // Update the index for the moved value
                set._indexes[lastValue] = valueIndex; // Replace lastValue's index to valueIndex
            }

            // Delete the slot where the moved value was stored
            set._values.pop();

            // Delete the index for the deleted slot
            delete set._indexes[value];

            return true;
        } else {
            return false;
        }
    }

_remove删除函数的逻辑是，先找出要删除value的索引位置，这里的if判断同样是检查数组中是否存在该值。

uint256 toDeleteIndex = valueIndex - 1;
uint256 lastIndex = set._values.length - 1;

toDeleteIndex和lastIndex的值为什么要减去1？

前面新增函数提到过，指定索引位置时数组的长度，由于数组的索引下标是从0开始，所以这里需要减去1。

 bytes32 lastValue = set._values[lastIndex];
 // Move the last value to the index where the value to delete is
 set._values[toDeleteIndex] = lastValue;
 // Update the index for the moved value
 set._indexes[lastValue] = valueIndex; // Replace lastValue's index to valueIndex

// 开始删除
// Delete the slot where the moved value was stored
set._values.pop();
// Delete the index for the deleted slot
delete set._indexes[value];

这里是将要删除的value和数组最后一个元素交换位置，使用数组的pop函数，移除最后一个元素，这样就实现了需要删除指定的value。最后，再使用delete指令把mapping中记录的索引位置删除，会自动设置为0，所以前面的_contains函数要判断不等于0。

通过 _add 函数可以看出，EnumerableSet的value具有唯一性，不可重复。
通过 _remove 函数可以看出，EnumerableSet无法保证数据存储的顺序性。

at函数查找

function _at(Set storage set, uint256 index) private view returns (bytes32) {
        return set._values[index];
}

查找就很简单了，直接通过数组的 [ ] 方式来访问。

length长度

  function _length(Set storage set) private view returns (uint256) {
        return set._values.length;
  }

获取EnumerableSet的长度，其实就是数组的长度。
以上函数都是private修饰，只允许在EnumerableSet内访问。所以，我们在使用的时候并不能访问这些函数，所以定义了三种类型的结构体，Bytes32Set、AddressSet和UintSet共开发者使用；可以看出分别支持bytes32、address和uint三种基本类型。

// bytes32
struct Bytes32Set {
    Set _inner;
}
// address
struct AddressSet {
    Set _inner;
}
// uint
struct UintSet {
    Set _inner;
}

同时分别对这三种结构体做了add、remove、at、contains、length、values函数的重载。这些函数都是用了internal修饰，而不是private。

Bytes32Set

function add(Bytes32Set storage set, bytes32 value) internal returns (bool) {
        return _add(set._inner, value);
}
function remove(Bytes32Set storage set, bytes32 value) internal returns (bool) {
        return _remove(set._inner, value);
}
function contains(Bytes32Set storage set, bytes32 value) internal view returns (bool) {
        return _contains(set._inner, value);
}
function length(Bytes32Set storage set) internal view returns (uint256) {
        return _length(set._inner);
}
function at(Bytes32Set storage set, uint256 index) internal view returns (bytes32) {
        return _at(set._inner, index);
}
function values(Bytes32Set storage set) internal view returns (bytes32[] memory) {
        bytes32[] memory store = _values(set._inner);
        bytes32[] memory result;

        /// @solidity memory-safe-assembly
        assembly {
            result := store
        }

        return result;
}

add、remove函数传递的参数是bytes32类型，Bytes32Set类型的变量set调用了_inner属性；即set._inner。

set._inner其实就是一个Set结构，传递到_add和_remove函数中，并且_add和_remove函数接收的value参数正好是byte32类型。所以，Bytes32Set类型在函数调用处理value时不需要进行类型转换。

然而，AddressSet和UintSet则不一样，他们需要进行类型转换，将address和uint基本类型转换成bytes32类型才能调用_add和_remove函数。

AddressSet

function add(AddressSet storage set, address value) internal returns (bool) {
        return _add(set._inner, bytes32(uint256(uint160(value))));
}
function remove(AddressSet storage set, address value) internal returns (bool) {
        return _remove(set._inner, bytes32(uint256(uint160(value))));
}
 function at(AddressSet storage set, uint256 index) internal view returns (address) {
        return address(uint160(uint256(_at(set._inner, index))));
 }

这里重点关注下类型转换，将address类型转换成byte32类型、将byte32类型转换成address类型。

address -> bytes32

add和remove函数在调用_add和_remove函数时，需要传递bytes32类型，所以这里需要对address类型进行转换。

uint160(value)：由于address类型的长度是20个字节，也就是160位，所以先转成了uint160。
uint256(uint160(value))：由于bytes32是32字节，256位，所以用uint256做了一次转换。
bytes32(uint256(uint160(value)))：最后将uint256类型转换成bytes32类型。

byte32 ->address

观察at函数的返回值 return address(uint160(uint256(_at(set._inner, index)))); 把bytes32转换成address。

_at(set._inner, index) 返回值是bytes32类型，256位，所以需要uint256进行转换。
uint256(_at(set._inner, index)) address类型20字节，160位，需要uint160进行转换。
uint160(uint256(_at(set._inner, index))) uint160直接转换成address类型即可。
address(uint160(uint256(_at(set._inner, index))))。

UintSet

function add(UintSet storage set, uint256 value) internal returns (bool) {
       return _add(set._inner, bytes32(value));
}
function remove(UintSet storage set, uint256 value) internal returns (bool) {
       return _remove(set._inner, bytes32(value));
}
function at(UintSet storage set, uint256 index) internal view returns (uint256) {
       return uint256(_at(set._inner, index));
}

UintSet同AddressSet一样，需要进行类型转换，不过转换步骤要简单很多。

uint256 -> bytes32

add和remove函数接收参数是uint256类型，uint256和bytes32类型都是256位，所以直接转换就行，bytes32(value)

bytes32-> uint256

这里同理，直接转换即可。

总结

EnumerableSet具有元素无序性、唯一性。
类型不同时，需要进行类型转换。
查找元素时，没有使用for循环，效率高。

思考

为什么不实现一个修改的函数？

 // Move the last value to the index where the value to delete is
 set._values[toDeleteIndex] = lastValue;

从_remove函数的实现来看，要实现修改其实也是可以的，通过传参value查找mapping中记录的索引位置，再通过数组访问索引位置，重新赋值即可。
为什么源码库没有实现这个修改操作呢？

Solidity中有很多基础数据类型，uint8、uint120、int16、bytes20等等，而EnumerableSet仅仅支持bytes32、address和uint256三种基础数据类型。这些数据类型应该如何使用EnumerableSet的一些函数呢？
也许通过类型转换，转换成EnumerableSet需要的类型就能满足要求了。
从EnumerableSet名字可以看出，为什么是Set？是否与Java中Set相似？
也许是因为数据存储的无序性，不能保证数据存储顺序的一致性，所以命名为EnumerableSet。

OpenZeppelin库EnumerableSet数据结构的分

add函数

remove

toDeleteIndex和lastIndex的值为什么要减去1？

at函数查找

length长度

Bytes32Set

AddressSet

address -> bytes32

byte32 ->address

UintSet

uint256 -> bytes32

bytes32-> uint256

总结

思考

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

OpenZeppelin库EnumerableSet数据结构的分

add函数

remove

toDeleteIndex和lastIndex的值为什么要减去1？

at函数 查找

length长度

Bytes32Set

AddressSet

address -> bytes32

byte32 ->address

UintSet

uint256 -> bytes32

bytes32-> uint256

总结

思考

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

at函数查找