WordBreak

作者: 瞬铭 | 来源:发表于2019-12-19 11:22 被阅读0次

https://leetcode.com/problems/word-break/
给定一个String,和一个Dict,Dict中包含些许英文单词,求这个String是否能被这个Dict中的所有单词拼接起来,即这个String的子串都在这个Dict中
注意,Dict中的单词没有重复,且Dict中的单词可以被重复利用(即:String="applepenapple",Dict=['apple','pen'] 是符合需求的)
example:
String=“leetcode”,Dict=["leet","code"]; return true

解法一,字符串子集问题,DP大法好啊

  • 明确dp[i]的含义:表示前i个字符是否可以被切割(换一种思路可以解释为从左开始,到第i个元素结尾的子串是否可以被切割,不包含index为i的元素)

  • 显然dp[0]是恒等于True的,前0个字符是空串,

  • 状态转移方程 dp[i] = image.png
  • 状态转移方程解释:
    1.从第0个元素到第i个元素(j)寻找s的子集 sub = s.substring(j,i)
    2.判断sub是在Dict中
    3.判断dp[j]是否为True
    4.如果同时满足条件2和条件3,表明dp[i]=true
    5.汉字解释,s.substring(0,i)被分隔成了两段,第一段是s.substring(0,j),这个是否能分隔用dp[j]表示,第二段是 s.substring(j+1,i),判断是否满足需求,就看是否在Dict中。如果两段都满足,表明dp[i]是满足需求的
    talk is cheap, show me your example

String=“leetcode”,Dict=["leet","code"]; 
dp = [T,F,F,F,F,F,F,F,F];//注意dp的长度为s.length + 1
i = 0 ,  略过,因为dp[0] = True
------------------------------------------------------------------------------------
i = 1 , j = 0 ,sub = "l", 不满足需求 (因为l不在Dict中)
------------------------------------------------------------------------------------
i = 2 , j = 0 ,sub = "le",不满足需求 (因为l不在Dict中)
        j = 1,sub = "e",  不满足需求
------------------------------------------------------------------------------------
i = 3 , j = 0 ,sub = "lee",不满足需求 (因为l不在Dict中)
        j = 1,sub = "ee",  不满足需求
        j = 2,sub = "e",  不满足需求
------------------------------------------------------------------------------------
i = 4 , j = 0 ,sub = "leet", 满足需求,  因为“leet” 在Dict中,且dp[0] == True成立,  置dp[4] = True。 
         因为已经确定了DP[4]的值为True,所以跳出i=4的循环
------------------------------------------------------------------------------------
i = 5 , j = 0 ,sub = "leetc",不满足需求 (因为l不在Dict中)
        j = 1,sub = "eetc",  不满足需求
        j = 2,sub = "etc",  不满足需求
        j = 3,sub = "tc",  不满足需求
        j = 4,sub = "c",  不满足需求
------------------------------------------------------------------------------------
....
------------------------------------------------------------------------------------
i = 8 , j = 0 ,sub = "leetcode", 不满足需求 
          j = 1 ,sub = "eetcode", 不满足需求
          j = 2 ,sub = "etcode", 不满足需求
          j = 3 ,sub = "tcode", 不满足需求
          j = 4 ,sub = "code",   sub = “code”,在Dict中,且dp[4] == True 成立,置dp[8] = True;
------------------------------------------------------------------------------------
循环结束,因为dp[8] == True,所以返回True

代码如下

      public boolean wordBreak2(String s, List<String> wordDict) {
        if (s == null || s.length() == 0) {
            return true;
        }

        int len = s.length();

        boolean[] dp = new boolean[len + 1];
        dp[0] = true;
        for (int i = 1; i <= len; i++) {
            for (int j = 0; j < i; j++) {
                
                //子串的获取,左闭区间, 右开区间
                String tmp = s.substring(j, i);

                if (dp[j] && wordDict.contains(tmp)) {
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp[len];
    }

参考资料:https://blog.csdn.net/mine_song/article/details/72081998

解法二 BFS大法好

既然用到了DP,那表明可以用类似的遍历,既然是BFS大法,所以应该有一个Queue

  • 原理跟DP比较类似,也是找到第i个元素前面的子串是否满足需求
  • HashSet visited 的意义在于,避免重复计算某一个i
String=“leetcode”,Dict=["leet","code"]; 
queue初始化为0
-------------------------------------------
queue poll 0 ,
i = 1   nothing happens ,because s.substring(0,1) = 'l'
i = 2 nothing happens,  s.substring(0,1) = 'le'
...
i=4 s.substring(0,4) = 'leet', oh it's a member of Dict,  queue.offer(4),visited.add(4)
-------------------------------------------
queue poll 4 ,
i = 5   nothing happens ,because s.substring(4,5) = 'c'
i = 6 nothing happens,  s.substring(4,6) = 'co'
i = 7 nothing happens,  s.substring(4,7) = 'cod'
i = 8 s.substring(4,8) = "code", oh  ,it's a member of Dict, 
  oh  i equals 8 ,等于数组的长度,找到的最终结果

解析一下上面这个example,如果不要visited这个Set,过程还是比较好理解的~ 为什么要visited这个Set ,看下面这个example就知道了,如果不要visited这个set,下面这个example会有很多重复操作

String 
 = "aaaa"
Dict = ["a","aa","aaa","aaaa","aaaaa","aaaaaa","aaaaaaa","aaaaaaaa","aaaaaaaaa","aaaaaaaaaa"]
-------------------------------------------
queue.add(1)
queue.add(2)
queue.add(3)
queue.add(4)
-------------------------------------------
queue.poll
此时又会出现queue.add(2). 因为s.substring(1,2) = "aa"在Dict中,但是,但是,在上一轮对curIdx = 0 的循环判断中,2已经被加进队列了,这样就多了很多无谓操作
 // BFS
    public boolean wordBreak3(String s, List<String> wordDict) {
        if (wordDict.contains(s)) {
            return true;
        }

        Queue<Integer> queue = new LinkedList<Integer>();
        queue.offer(0);

        Set<Integer> visited = new HashSet<Integer>();
        visited.add(0);
        while (!queue.isEmpty()) {
            int curIdx = queue.poll();
            for (int i = curIdx + 1; i <= s.length(); i++) {
                if (visited.contains(i)) {
                    continue;
                }
                if (wordDict.contains(s.substring(curIdx, i))) {
                    if (i == s.length()) {
                        return true;
                    }
                    queue.offer(i);
                    visited.add(i);
                }
            }
        }
        return false;
    }

解法三 DFS大法好啊

有了BFS,为什么不能有DFS? BFS我不知道~ 我堂堂DFS哥一般都很DP双双出入算法界

  • 汉语解释DFS大法的Helper function 的意思:字符串s从第i个字符开始,到结尾的子串,都是满足条件的
  • 所以对于一个字符串s,可以分为两部分~ part1 = s.substring(0,i) ,part2 = s.substring(i,s.length()); 如果同时满足part1是Dict的子集 且 part2满足Helper function, 则这个字符串就是满足题目条件的
    if (wordDict.contains(s.substring(start, i)) && wordBreakHelper(s, wordDict, i)) { return true; }

talk is cheap,show me your example

String=“leetcode”,Dict=["leet","code"]; 
-------------------------------------------
从start=0为起点,找每一个子串

i= 1  part1 = "l"  ,oh ,不满足条件,因为它不在Dict中
i= 2 part1 = “le”,  oh ,不满足条件
i= 3 part1 = "lee",oh ,不满足条件
i= 4 part1 = "leet",  oh,满足了一个条件,因为leet是在Dict中的,所以只要  继续找start=4的时候,后面的子串是否满足条件,即可
-------------------------------------------
下一层递归
从start=4为起点,找每一个子串
i = 5 part 1 = "c"
...
i = 8 part1 = "code"
--------------------------------------------

代码如下,显然这段代码是有瑕疵的,瑕疵说法见代码后面

public boolean wordBreakOri(String s, List<String> wordDict) {
        return wordBreakHelper(s, wordDict, 0);
    }

    public boolean wordBreakHelper(String s, List<String> wordDict, int start) {
        if (start >= s.length()) {
            return true;
        }

        for (int i = start; i <= s.length(); i++) {
            if (wordDict.contains(s.substring(start, i))
                    && wordBreakHelper(s, wordDict, i)) {
                return true;
            }
        }
        return false;
    }

就如上面的BFS解释的那样,当一个子串有多重分解方法的时候,会出现重复计算
talk is cheap, show me your example

String 
 = "aaaa"
Dict = ["a","aa","aaa","aaaa","aaaaa","aaaaaa","aaaaaaa","aaaaaaaa","aaaaaaaaa","aaaaaaaaaa"]
---------------------------------
start = 0 
i = 1  part1 = "a" ,在Dict中,需要继续找start=1的时候(blablabla..里面肯定又有一大堆递归,递归代号15320)
i=2   part1 = "aa", 在Dict中,需要继续找start=2的时候(blablabla,里面又有一大堆递归)
显然,在代号为15320的递归里面里面已经走了一遍start=2的这个递归了~  但是在start=0的子递归里面,又算了一遍start=2的递归,重复计算了
---------------------------------
我们细看一下递归代号为15320的递归
start=1
i = 2 part1 = "a" 在Dict中,需要继续找start=2的递归,是不是跟上面的递归一样了,是不是,是不是,是不是?

正确代码,用一个int数组memo带代表index是否已经算过了,-1代表没算过,1 代表算过且满足需求,0 表示算过且不满足需求

 public boolean wordBreak(String s, List<String> wordDict) {
        int[] memo = new int[s.length() + 1];
        Arrays.fill(memo, -1);
        return wordBreakDFS(s, wordDict, 0, memo);

    }

    public boolean wordBreakDFS(String s, List<String> wordDict, int start, int[] memo) {
        if (start >= s.length()) {
            return true;
        }

        if (memo[start] != -1) {
            return memo[start] == 1 ? true : false;
        }

        for (int i = start + 1; i <= s.length(); i++) {
            if (wordDict.contains(s.substring(start, i))
                    && wordBreakDFS(s, wordDict, i, memo)) {
                memo[start] = 1;
                return true;
            }
        }
        memo[start] = 0;
        return false;
    }

参考资料:https://www.cnblogs.com/grandyang/p/4257740.html
参考资料:https://leetcode.com/problems/word-break/discuss/43886/Evolve-from-brute-force-to-optimal-a-review-of-all-solutions

相关文章

  • WordBreak

    https://leetcode.com/problems/word-break/给定一个String,和一个Di...

  • WordBreak2

    https://leetcode.com/problems/word-break-ii/给定一个String,和一...

网友评论

      本文标题:WordBreak

      本文链接:https://www.haomeiwen.com/subject/hfkmnctx.html