The principle behind this technique is to design a binary code for each character, using a variable number of bits per character, so as to reduce the total length of the document.
To allow easy, unambiguous decoding, we require the code to be a prefix code: no codeword is a prefix of another.
For example, if the codewords are {0, 01, 11, 001}, the decoding of a string such as 001 is ambiguous: it can be interpreted as 0-01 or as 001.
Solution: insist on the prefix-free property, namely that no codeword can be a prefix of another codeword, where a codeword is the string of bits that represents a character in the encoding.
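As a small illustration, a prefix-free code can be decoded greedily from left to right, because the first codeword that matches is always the correct one. The sketch below assumes a hypothetical prefix-free code {a: 0, b: 10, c: 110, d: 111}; the codewords are illustrative, not taken from the example above.

```python
# Minimal sketch of decoding with a prefix-free code (hypothetical codewords).
code = {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
decode_table = {bits: ch for ch, bits in code.items()}

def decode(bitstring):
    """Scan left to right; since no codeword is a prefix of another,
    the first complete codeword read so far is unambiguous."""
    result, buffer = [], ""
    for bit in bitstring:
        buffer += bit
        if buffer in decode_table:      # a complete codeword has been read
            result.append(decode_table[buffer])
            buffer = ""
    return "".join(result)

print(decode("010110111"))  # -> 'abcd'
```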
Details
Given a character set C and the frequency f(c) of each character c ∈ C in the document A, the problem is to find a prefix code that minimizes the total length of the encoded document.
The tree representing such an optimal code is called a Huffman tree T. We aim to minimize

B(T,A) = Σ_{c ∈ C} f(c) · d(c)

- d(c) is the depth of character c in the tree T, i.e., the number of bits used to encode c, or equivalently the length of its codeword.
- B(T,A) is the total number of bits needed to encode document A using the Huffman tree T.
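To make the objective concrete, the sketch below evaluates B(T,A) for a hypothetical frequency table and a hypothetical assignment of depths; the numbers are illustrative only, not taken from the text.

```python
# Minimal sketch: evaluate B(T,A) = sum over c in C of f(c) * d(c)
# for illustrative frequencies f(c) and tree depths d(c).
f = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}   # f(c): frequency of c in A
d = {'a': 1,  'b': 3,  'c': 3,  'd': 3,  'e': 4, 'f': 4}   # d(c): depth of c in T

B = sum(f[c] * d[c] for c in f)   # total number of bits to encode A
print(B)   # 45*1 + 13*3 + 12*3 + 16*3 + 9*4 + 5*4 = 224
```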
The problem is to design a tree T that minimizes B(T,A). Some observations about an optimal tree:
- There are no nodes with only a single child.
- Rearranging characters within the same level of the tree T does not change B(T,A).
- Swapping a more frequent character at a deeper level with a less frequent character at a shallower level reduces B(T,A), where the level of the tree root is the lowest level (a short check appears below).
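The effect of such a swap can be checked directly. The derivation below uses generic characters x and y (symbols of my own choosing): assume f(x) > f(y) and x sits deeper than y, i.e. d(x) > d(y); the swap changes only the two terms of the sum that involve x and y.

```latex
\Delta B = \bigl(f(x)\,d(y) + f(y)\,d(x)\bigr) - \bigl(f(x)\,d(x) + f(y)\,d(y)\bigr)
         = \bigl(f(x) - f(y)\bigr)\bigl(d(y) - d(x)\bigr) < 0
```

so the swap strictly reduces B(T,A).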
We have a greedy strategy for the Huffman coding problem: repeatedly merge the two subtrees with the smallest frequencies into a new subtree whose frequency is the sum of the two, until a single tree remains.

Process

For an efficient implementation of the Huffman tree construction, we can use a priority queue data structure, namely a minimum heap keyed by frequency.
The running time is O(n) + O(n) + O(n log n) = O(n log n): O(n) to create the leaf nodes, O(n) to build the initial heap, and O(n log n) for the n - 1 merge steps, each of which performs a constant number of O(log n) heap operations.
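A minimal sketch of this greedy construction using Python's heapq module as the minimum heap follows; the function name huffman_codes and the tie-breaking counter are my own choices, not from the original text.

```python
import heapq

def huffman_codes(freq):
    """Build a prefix-free code from a {character: frequency} map by
    repeatedly merging the two least frequent subtrees."""
    # Heap entries are (frequency, tie_breaker, tree); a tree is either a
    # single character or a (left, right) pair of subtrees.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)                        # O(n)
    counter = len(heap)
    while len(heap) > 1:                       # n - 1 merges, O(log n) each
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))
        counter += 1

    # Read off the codewords: left edge = '0', right edge = '1'.
    codes = {}
    def assign(tree, prefix):
        if isinstance(tree, tuple):
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"        # single-character corner case
    _, _, root = heap[0]
    assign(root, "")
    return codes

print(huffman_codes({'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}))
```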