ZXAlgorithm - C8 Hash and Heap

作者: 左心Chris | 来源:发表于2019-11-21 18:55 被阅读0次

ZXAlgorithm - C8 Hash and Heap
【geekband】系统设计第三周
GeekBand C++系统设计与实践第三周
JS标准库/数组API
代码小工蚁的#《算法图解》#学习笔记-C8贪婪算法
ZXAlgorithm - CAT测试
Happy heap heap
ptmalloc heap_free
(转)Objective-C 拾遗：从Heap and Stac
Heap&HeapSort（MIT算法导论）

Hash
Theorem
HashMap,HashSet,HashTable,
HashFunction 31
Open & close
D: LinkedList
D: null vs available
Rehashing
D: negative number process
Application
LRU
D: HashMap and Node（LinkedList）
Heap
PriorityQueue
TreeMap

0 Outline

这一章的脉络是什么？
介绍Queue和Stack的用处，引申到Hash，然后是介绍hash的基本性质，相关题目如LRU Cache，然后介绍Heap和相关性质，了解TreeMap和PriorityQueue和TreeSet

1 Data structure

Queue, stack, Hash是什么，有哪些操作，复杂度是？
Queue: BFS Push Pop Top都是O(1)
Stack: DFS with no recursion Push Pop Top都是O(1)
Hash: Insert Find Delete都是O(1)

2 Hash

Theorem

什么是Hash？

Hash table/hash map/hash set/cocurrent hash map
java这些数据结构的区别？
https://blog.csdn.net/spicehava1/article/details/80278915
https://blog.csdn.net/weixin_34383618/article/details/85993919
Hashset no key
Hash table: thread safe but slow versus. Hash map
cocurrent hashmap支持线程安全，且比hashtable快
Hash function: magic number 31 and MD5 SHA-1 SHA-2
Sum = 0
Open Hashing vs Closed Hashing
如果键值冲突了怎么处理？
用链表进行处理，
OH: linkedlist
CH: when remove an element, should label available not null
https://blog.csdn.net/yue_hu/article/details/80661438
Rehashing
什么时候Rehash比较好，怎么处理比较好？
饱和度When the size of keys is larger than 1/10 hash table capacity, need to rehash
Rehashing
http://www.lintcode.com/problem/rehashing/ http://www.jiuzhang.com/solutions/rehashing/
Iterate the table, add a while loop inside, use new or dummy to new
D: newIndex = (hashTable[i] % newCap + newCap) % newCap
https://stackoverflow.com/questions/90238/whats-the-syntax-for-mod-in-java
Performing the same operation with the "%" or rem operator maintains the sign of the X value. If X is negative you get a result in the range (-Y, 0]. If X is positive you get a result in the range [0, Y).
长度乘以2，遍历原来的表，添加到新的表上面去
一致性hash是什么？
https://juejin.im/post/5ae1476ef265da0b8d419ef2
分布在0-2^32的圆上，节点通过hash分布在圆上？节点的hash值，所有键hash完了顺时针找对应存储的机器，如果添加节点，只要改新增节点和上个节点的数据，如果减少节点，只要改删除节点和上个节点的数据；如果节点过少，就增加多个虚拟节点，尽可能保证数据分布均匀

Application

LRU cache是什么？
可以实现两个操作，获取键值和设置键值，当获取不到的时候，返回-1，设置键值如果cache满了，要把最不常用的那个键值删了
什么是python defaultdict和orderedDict?
https://docs.python.org/zh-cn/3/library/collections.html
defaultdict提供一个默认值得dict，OrderedDict创建记住键值最后插入顺序的有序字典变体很简单。如果新条目覆盖现有条目，则原始插入位置将更改并移至末尾
Python OrderDict()怎么简单实现？
使用move_to_end和popitem，根据last参数来决定先进先出
https://docs.python.org/zh-cn/3/library/functools.html#functools.lru_cache

LRU Cache
https://leetcode.com/problems/lru-cache/
https://leetcode.com/problemset/all/?search=LRU%20Cache
https://www.jiuzhang.com/solutions/?search=LRU%20Cache

Python使用OrderedDict
https://leetcode.com/problems/lru-cache/discuss/45952/Python-concise-solution-with-comments-(Using-OrderedDict).
https://leetcode.com/problems/lru-cache/discuss/45964/Very-short-solution-using-Python's-OrderedDict
pop(key)会弹出key，popitem(last=False)会FIFO，读的时候有的话就先pop再设置值；写的时候，如果不是增加，就跟读类似的操作，如果是增加要看有没有超过容量，超过了再加
Python使用双向链表
https://leetcode.com/problems/lru-cache/discuss/45926/Python-Dict-%2B-Double-LinkedList
就是用一个双向链表存顺序，dict来存内容
Use LinkedList + Hashmap
- Doubly LinkedList
  Store current node in value in hash map
  Current.prev.next = current.next
  Current.next.prev = current. prev
- //Singly LinkedList
  Store prev node in value in hashmap
  Then do prev.next = prev.next.next then can remove the current
Related questitons
Subarray sum
https://leetcode.com/problemset/all/?search=Subarray%20Sum
https://www.jiuzhang.com/solutions/?search=Subarray%20Sum
Copy list with random pointer
https://leetcode.com/problemset/all/?search=Copy%20List%20with%20Random%20Pointer
https://www.jiuzhang.com/solutions/?search=Copy%20List%20with%20Random%20Pointer
Anagrams
https://leetcode.com/problemset/all/?search=Anagrams
https://www.jiuzhang.com/solutions/?search=Anagrams
Longest consecutive sequence
https://leetcode.com/problemset/all/?search=Longest%20Consecutive%20Sequence
https://www.jiuzhang.com/solutions/?search=Longest%20Consecutive%20Sequence

3 Heap

Heap是什么？
支持操作O(log N) Add / O(log N) Remove / O(1) 查询Min or Max，本质是二叉树，
具体在各个语言的实现？
Java: PriorityQueue
C++: priority_queue
Python: heapq 父节点的只小于或等于子节点，具体使用其实是用列表实现的https://blog.csdn.net/jamfiy/article/details/88185512
Priority Queue是什么？
Priority Queue和Heap有什么区别吗？如果是JAVＡ自带的Priority Queue的话，答案是有的(C++的Priority Queue可能也类似，具体要确认一下）。
具体的区别是在remove操作中，Priority Queue的时间复杂度是O(n)，而Heap是O(logn)。因为Priority Queue需要找到这个数据，需要O(n)的时间，而Heap借助了HashMap，所以只需要O(1)的时间就可以找到。那为什么Priority Queue不用HashMap呢？这个就是JAVA提供的函数里面没有加上这一块而已，如果自己编程就也可以是O(logn)。
https://blog.csdn.net/roufoo/article/details/80638476
构建一个heap的复杂度，和遍历一个heap的复杂度？
构建堆，自下而上，复杂度是O(n)
遍历是O(nlogn)
Python heapq有哪些操作？
heappush(heap, item) heappop(heap) heappushpop(heap, item) heapify(x)
heapreplace(heap, item)

PriorityQueue vs Heap

UglyNumber II
https://leetcode.com/problemset/all/?search=Ugly%20Number
https://www.jiuzhang.com/solutions/?search=Ugly%20Number
简单来说就是来一个导入一个，来一个导入一个，每次取个最小的，通过这个最小的再加入新的
S: p2, p3, p5, uglys
D: Use p2, p3, p5 on same queue O(n)
怎样得到最后一个数，应该是把所有之前的数乘以2，或3或5，找出最小
HashMap + heap(poll min and add 23*5)
Top K largest Number II
https://www.jiuzhang.com/solutions/?search=Top%20k%20Largest%20Number
简单来说就是来一个存一个，如果过量了就heappop出来
Use priorityQueue: add();peek();poll(); clear();
Queue is interface but stack is class
Queue is implemented by PriorityQueue andLinkedList
Merge K sorted Lists
https://leetcode.com/problemset/all/?search=Merge%20K%20Sorted%20Lists
https://www.jiuzhang.com/solutions/?search=Merge%20K%20Sorted%20Lists
Heap & PriorityQueue
Divide and conquer
Merge two by two

TreeMap

https://docs.oracle.com/javase/7/docs/api/java/util/TreeMap.html
https://leetcode.com/problems/merge-k-sorted-lists/discuss/10513/108ms-python-solution-with-heapq-and-avoid-changing-heap-size

TreeMap是什么？
与HashMap相比，TreeMap是一个能比较元素大小的Map集合，会对传入的key进行了大小排序。其中，可以使用元素的自然顺序，也可以使用集合中自定义的比较器来进行排序；基于红黑树实现
https://www.jianshu.com/p/2dcff3634326
Java document: Key value pairs(ascending order by key)
Get();set(key,value); ceilingEntry,ceilingKey

Related questions

The Skyline Problem
https://leetcode.com/problemset/all/?search=The%20Skyline%20Problem
https://www.jiuzhang.com/solutions/?search=building%20outline
Top K Frequent Words
https://leetcode.com/problemset/all/?search=Top%20K%20Frequent%20Words
https://www.jiuzhang.com/solutions/?search=Top%20K%20Frequent%20Words
用heapq实现，先把每一个的第一个载入，每载入一个取出下一个，思路还是比较多路的最小数

Python Queue模块里面的PriorityQueue是什么
A typical pattern for entries is a tuple in the form: (priority_number, data).

from Queue import PriorityQueue

class Solution(object):
    def mergeKLists(self, lists):
        """
        :type lists: List[ListNode]
        :rtype: ListNode
        """
        head = point = ListNode(0)
        q = PriorityQueue()
        for l in lists:
            if l:
                q.put((l.val, l))
        while not q.empty():
            val, node = q.get()
            point.next = node
            point = point.next
            node = node.next
            if node:
                q.put((node.val, node))
        return head.next

ZXAlgorithm - C8 Hash and Heap
HashTheoremHashMap,HashSet,HashTable,HashFunction 31Open ...
【geekband】系统设计第三周
海量数据问题处理方法 Hash Bit-Map位图 Bloom Filter（Bit-Map加强版） Heap 双...
GeekBand C++系统设计与实践第三周
5.海量数据处理方法 1)Hash 2)Bit-Map 3)Bloom Filter 4)堆(Heap) 5)双层...
JS标准库/数组API
JS内存有stack、heap。stack中有global/window对象，其是一个hash。其中有标准库（ob...
代码小工蚁的#《算法图解》#学习笔记-C8贪婪算法
代码小工蚁的#《算法图解》#学习笔记-C8贪婪算法C8 贪婪算法greedy algorithms 一、贪婪算法 ...
ZXAlgorithm - CAT测试
https://www.lintcode.com/cat/ Bronze Invert Binary Treeht...
Happy heap heap
Happy pile a pile, stack the square will not fall to win!...
ptmalloc heap_free
arena.c static int heap_trim(heap_info *heap, size_t pad)...
(转)Objective-C 拾遗：从Heap and Stac
Objective-C 拾遗：从Heap and Stack到Block Stack和Heap heap和stac...
Heap&HeapSort（MIT算法导论）
Heap A heap is an implementation of priority queue 堆可能的操作...