算法学习：字典树

作者: alex很累 | 来源:发表于2023-02-27 22:48 被阅读0次

大厂算法面试之leetcode精讲22.字典树
数据结构之字典树Trie
算法导论：字典树
实现字典树
AC自动机专题整理
字典树
AskMe Spring项目提问功能——前缀树敏感词过滤
Swift 算法实战二(二叉树、二分搜索)
Swift 算法实战一(集合,字典,链表,栈,队列等算法)
2019-03-31字典树Tire

一、概念

字典树，即 Trie 树，又称单词查找树或键树，是一种树形结构。
典型应用是用于统计和排序大量的字符串（但不仅限于字符串），所以经常被搜索引擎系统用于文本词频统计。

优点：最大限度地减少无谓的字符串比较，查询效率比哈希表高。

二、基本性质

结点本身不存完整单词；
从根结点到某一结点，路径上经过的字符连接起来，为该结点对应的字符串；
每个结点的所有子结点路径代表的字符都不相同。

三、核心思想

Trie树的核心思想是拿空间换时间；
利用字符串的公共前缀来降低查询时间的开销以达到提高效率的目的。

四、Java实现

字典树需要实现的功能有：

插入字符串
查询字符串
（这里以26位小写字母为例）

public class Trie {
    // 当前结点的后续结点
    private Trie[] childrens;

    // 最大可表示字符种类
    private static int maxChildrenCount = 26;

    // 到当前结点是否有完整的字符串
    private boolean isEnd;

    public Trie() {
        childrens = new Trie[maxChildrenCount];
        isEnd = false;
    }

    public void insert(String word) {
        Trie node = this;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            int index = c - 'a';
            if (node.childrens[index] == null) {
                node.childrens[index] = new Trie();
            }
            node = node.childrens[index];
        }
        node.isEnd = true;
    }

    public boolean search(String word) {
        Trie node = prefixLike(word);
        if (node== null || !node.isEnd) {
            return false;
        } else {
            return true;
        }
    }

    public boolean startsWith(String prefix) {
        Trie node = prefixLike(prefix);
        return node != null;
    }

    public Trie prefixLike(String prefix) {
        Trie node = this;
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            int index = c - 'a';
            if (node.childrens[index] == null) {
                return null;
            }
            node = node.childrens[index];
        }
        return node;
    }
}

五、相关算法题

208. 实现 Trie (前缀树)
可以借助这道题，自己尝试实现字典树。

212. 单词搜索 II

题目描述

给定一个 m x n 二维字符网格 board 和一个单词（字符串）列表 words， 返回所有二维网格上的单词 。
单词必须按照字母顺序，通过 相邻的单元格 内的字母构成，其中“相邻”单元格是那些水平相邻或垂直相邻的单元格。同一个单元格内的字母在一个单词中不允许被重复使用。

示例

解题思路

字典树+回溯

将单词列表words丢到字典树中（可以直接用上题中实现的字典树）；
遍历字符网格board；深度优先搜索所有从当前正在遍历的单元格出发的、由相邻且不重复的单元格组成的路径；如果当前路径是字典树中的单词，则将其添加到结果集中。

Java代码示例

class Solution {
    static int[][] xys = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};

    public List<String> findWords(char[][] board, String[] words) {
        // 单词列表组成字典树
        Trie trie = new Trie();
        for (String word : words) {
            trie.insert(word);
        }

        // 深度优先搜索找出结果
        Set<String> res = new HashSet<>();
        for (int i = 0; i < board.length; i++) {
            for (int j = 0; j < board[0].length; j++) {
                dfs(res, trie, board, i, j, "");
            }
        }
        return new ArrayList<>(res);
    }

    public void dfs(Set<String> res, Trie trie, char[][] board, int x, int y, String preString) {
        preString = preString + board[x][y];
        Trie find = trie.prefixLike(preString);
        if (find == null) {
            return;
        }
        if (find.isEnd) {
            res.add(preString);
        }
        // 同一个单元格内的字母在一个单词中不允许被重复使用，这里暂存一下这个字母
        char tempChar = board[x][y];
        board[x][y] = '#';
        for (int[] xy : xys) {
            int xx = x + xy[0], yy = y + xy[1];
            if (xx >= 0 && xx < board.length && yy >= 0 && yy < board[0].length && board[xx][yy] != '#') {
                dfs(res, trie, board, xx, yy, preString);
            }
        }
        board[x][y] = tempChar;
    }


    public class Trie {
        private Trie[] childrens;
        private int maxChildrenCount = 26;
        private boolean isEnd;

        public Trie() {
            childrens = new Trie[maxChildrenCount];
            isEnd = false;
        }

        public void insert(String word) {
            Trie node = this;
            for (int i = 0; i < word.length(); i++) {
                char c = word.charAt(i);
                int index = c - 'a';
                if (node.childrens[index] == null) {
                    node.childrens[index] = new Trie();
                }
                node = node.childrens[index];
            }
            node.isEnd = true;
        }

        public boolean search(String word) {
            Trie node = prefixLike(word);
            if (node == null || !node.isEnd) {
                return false;
            } else {
                return true;
            }
        }

        public boolean startsWith(String prefix) {
            Trie node = prefixLike(prefix);
            return node != null;
        }

        public Trie prefixLike(String prefix) {
            Trie node = this;
            for (int i = 0; i < prefix.length(); i++) {
                char c = prefix.charAt(i);
                int index = c - 'a';
                if (node.childrens[index] == null) {
                    return null;
                }
                node = node.childrens[index];
            }
            return node;
        }
    }
}

这里对Trie树进行优化，提升效率。

class Solution {
    int[][] dirs = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};

    public List<String> findWords(char[][] board, String[] words) {
        Trie trie = new Trie();
        for (String word : words) {
            trie.insert(word);
        }

        Set<String> ans = new HashSet<String>();
        for (int i = 0; i < board.length; ++i) {
            for (int j = 0; j < board[0].length; ++j) {
                dfs(board, trie, i, j, ans);
            }
        }

        return new ArrayList<String>(ans);
    }

    public void dfs(char[][] board, Trie now, int i1, int j1, Set<String> ans) {
        if (!now.children.containsKey(board[i1][j1])) {
            return;
        }
        char ch = board[i1][j1];
        now = now.children.get(ch);
        if (!"".equals(now.word)) {
            ans.add(now.word);
        }

        board[i1][j1] = '#';
        for (int[] dir : dirs) {
            int i2 = i1 + dir[0], j2 = j1 + dir[1];
            if (i2 >= 0 && i2 < board.length && j2 >= 0 && j2 < board[0].length) {
                dfs(board, now, i2, j2, ans);
            }
        }
        board[i1][j1] = ch;
    }
}

class Trie {
    String word;
    Map<Character, Trie> children;
    boolean isWord;

    public Trie() {
        this.word = "";
        this.children = new HashMap<Character, Trie>();
    }

    public void insert(String word) {
        Trie cur = this;
        for (int i = 0; i < word.length(); ++i) {
            char c = word.charAt(i);
            if (!cur.children.containsKey(c)) {
                cur.children.put(c, new Trie());
            }
            cur = cur.children.get(c);
        }
        cur.word = word;
    }
}

大厂算法面试之leetcode精讲22.字典树
大厂算法面试之leetcode精讲22.字典树视频讲解（高效学习）:点击学习[https://xiaochen1...
数据结构之字典树Trie
字典树Trie 字典树也叫前缀树，是一种在字符串查找，前缀匹配等问题广泛应用的算法，为什么使用字典树呢？我们都知道...
算法导论：字典树
算法导论：字典树如果我们给定字符串集合为{b abc abd bcd abcd efg ...
实现字典树
题目：实现字典树（算法课第七课） Trie树，又称为字典树、单词查找树或者前缀树，是一种用于快速检索的多叉数结构...
AC自动机专题整理
AC自动机学习记录参考资料字典树（讲解+模版）AC自动机算法AC自动机算法详解hdu 2222 ac自动机入门...
字典树
功能字典树是用数组存储大量字符串的一种算法字典树算法开辟空间非常大，但是对于字符串插入和查询有很快的速度用法...
AskMe Spring项目提问功能——前缀树敏感词过滤
问题发布功能的重点在于如何实现敏感词过滤，基本的算法是前缀树算法，前缀树也就是字典树，通过前缀树匹配可以加快敏感词...
Swift 算法实战二(二叉树、二分搜索)
Swift 算法实战一(集合,字典,链表,栈,队列等算法) Swift 算法实战二(二叉树、二分搜索) 前言这是...
Swift 算法实战一(集合,字典,链表,栈,队列等算法)
Swift 算法实战一(集合,字典,链表,栈,队列等算法) Swift 算法实战二(二叉树、二分搜索) 前言用 ...
2019-03-31字典树Tire
字典树图示字典树案例