直观理解：Trie树（字典树/前缀树/单词查找树）

作者: 老羊_肖恩 | 来源:发表于2021-12-01 13:49 被阅读0次

直观理解：Trie树（字典树/前缀树/单词查找树）
trie树
数据结构与算法（第一季）：Trie
数据结构-Trie
Trie 和哈夫曼树
数据结构基础--前缀树&&后缀树
19_Trie
以太坊源码阅读笔记-重要的数据结构：MPT
Trie
实现字典树

Trie树又称字典树、前缀树、单词查找树，是一种哈希树的变种的树形结构。典型应用是用于统计，排序和保存大量的字符串（但不仅限于字符串），所以经常被搜索引擎系统用于文本词频统计。它的优点是：利用字符串的公共前缀来减少查询时间，最大限度地减少无谓的字符串比较，查询效率比哈希树高。（以上内容参考百度百科）前缀树包含以下几个属性：
1.根节点不包含任何字符内容，除根节点之外的其他节点只能包含一个字符；
2.从根节点到某一节点，路径上经过的字符连接起来，为该节点对应的字符串；
3.每个节点的所有子节点包含的字符都不相同。
4.除根节点外，每个节点都包含一个是否是单词或字符串结尾的标识。
假设我们存在一个如下的Trie树，根据以上性质可知，该Trie树中存在的单词有："app"、"apple"、"active"、"actor"、"bi"、"cat"、"cute"。

Trie树

根据Trie树的属性，我们给出简单的Trie树创建·和查找的代码示例：

package com.quan.trie;

public class Trie {
    // root for current trie
    TreeNode root;
    // define the TreeNode
    class TreeNode{
        // children of current node.
        TreeNode[] children;

        // whether is end of a word.
        boolean isEnd;

        //Assume that there are only lowercase letters in the word
        public TreeNode (){
            children = new TreeNode[26];
        }
    }

    /** Initialize your data structure here. */
    public Trie() {
        // Initialize the root without value.
        root = new TreeNode();
    }

    /** Inserts a word into the trie. */
    public void insert(String word) {
        if (word == null || word.length() < 1) return;

        TreeNode node = root;
        for (int i = 0, pos = 0; i < word.length(); i++, node = node.children[pos]) {
            pos = word.charAt(i) - 'a';
            if (node.children[pos] == null){
                node.children[pos] = new TreeNode();
            }
        }
        node.isEnd = true;
    }

    /** Returns if the word is in the trie. */
    public boolean search(String word) {
        if (word == null || word.length() < 1) return false;

        TreeNode node = root;
        for (int i = 0, pos = 0; i < word.length(); i++, node = node.children[pos]) {
            pos = word.charAt(i) - 'a';
            if (node.children[pos] == null){
                return false;
            }
        }
        return node.isEnd;
    }

    /** Returns if there is any word in the trie that starts with the given prefix. */
    public boolean startsWith(String prefix) {
        if (prefix == null || prefix.length() < 1) return true;

        TreeNode node = root;
        for (int i = 0, pos = 0; i < prefix.length(); i++, node = node.children[pos]) {
            pos = prefix.charAt(i) - 'a';
            if (node.children[pos] == null){
                return false;
            }
        }
        return true;
    }
}

Trie树可以最大限度地减少无谓的字符串比较，在Trie树中对单个长度为 $\small N$ 的单词的查找时间复杂度可以做到 $\small O(N)$ ，且跟语料库中单词的数量没有太大关系。Trie树的使用场景主要有字符串检索、字符串字典排序、和前缀索引等。