https://leetcode.com/problems/word-break/
给定一个String,和一个Dict,Dict中包含些许英文单词,求这个String是否能被这个Dict中的所有单词拼接起来,即这个String的子串都在这个Dict中
注意,Dict中的单词没有重复,且Dict中的单词可以被重复利用(即:String="applepenapple",Dict=['apple','pen'] 是符合需求的)
example:
String=“leetcode”,Dict=["leet","code"]; return true
解法一,字符串子集问题,DP大法好啊
-
明确dp[i]的含义:
表示前i个字符是否可以被切割(换一种思路可以解释为从左开始,到第i个元素结尾的子串是否可以被切割,不包含index为i的元素)
-
显然dp[0]是恒等于True的,前0个字符是空串,
-
状态转移方程 dp[i] =
image.png
-
状态转移方程解释:
1.从第0个元素到第i个元素(j)寻找s的子集 sub = s.substring(j,i)
2.判断sub是在Dict中
3.判断dp[j]是否为True
4.如果同时满足条件2和条件3,表明dp[i]=true
5.汉字解释,s.substring(0,i)被分隔成了两段,第一段是s.substring(0,j),这个是否能分隔用dp[j]表示,第二段是 s.substring(j+1,i),判断是否满足需求,就看是否在Dict中。如果两段都满足,表明dp[i]是满足需求的
talk is cheap, show me your example
String=“leetcode”,Dict=["leet","code"];
dp = [T,F,F,F,F,F,F,F,F];//注意dp的长度为s.length + 1
i = 0 , 略过,因为dp[0] = True
------------------------------------------------------------------------------------
i = 1 , j = 0 ,sub = "l", 不满足需求 (因为l不在Dict中)
------------------------------------------------------------------------------------
i = 2 , j = 0 ,sub = "le",不满足需求 (因为l不在Dict中)
j = 1,sub = "e", 不满足需求
------------------------------------------------------------------------------------
i = 3 , j = 0 ,sub = "lee",不满足需求 (因为l不在Dict中)
j = 1,sub = "ee", 不满足需求
j = 2,sub = "e", 不满足需求
------------------------------------------------------------------------------------
i = 4 , j = 0 ,sub = "leet", 满足需求, 因为“leet” 在Dict中,且dp[0] == True成立, 置dp[4] = True。
因为已经确定了DP[4]的值为True,所以跳出i=4的循环
------------------------------------------------------------------------------------
i = 5 , j = 0 ,sub = "leetc",不满足需求 (因为l不在Dict中)
j = 1,sub = "eetc", 不满足需求
j = 2,sub = "etc", 不满足需求
j = 3,sub = "tc", 不满足需求
j = 4,sub = "c", 不满足需求
------------------------------------------------------------------------------------
....
------------------------------------------------------------------------------------
i = 8 , j = 0 ,sub = "leetcode", 不满足需求
j = 1 ,sub = "eetcode", 不满足需求
j = 2 ,sub = "etcode", 不满足需求
j = 3 ,sub = "tcode", 不满足需求
j = 4 ,sub = "code", sub = “code”,在Dict中,且dp[4] == True 成立,置dp[8] = True;
------------------------------------------------------------------------------------
循环结束,因为dp[8] == True,所以返回True
代码如下
public boolean wordBreak2(String s, List<String> wordDict) {
if (s == null || s.length() == 0) {
return true;
}
int len = s.length();
boolean[] dp = new boolean[len + 1];
dp[0] = true;
for (int i = 1; i <= len; i++) {
for (int j = 0; j < i; j++) {
//子串的获取,左闭区间, 右开区间
String tmp = s.substring(j, i);
if (dp[j] && wordDict.contains(tmp)) {
dp[i] = true;
break;
}
}
}
return dp[len];
}
参考资料:https://blog.csdn.net/mine_song/article/details/72081998
解法二 BFS大法好
既然用到了DP,那表明可以用类似的遍历,既然是BFS大法,所以应该有一个Queue
- 原理跟DP比较类似,也是找到第i个元素前面的子串是否满足需求
- HashSet visited 的意义在于,避免重复计算某一个i
String=“leetcode”,Dict=["leet","code"];
queue初始化为0
-------------------------------------------
queue poll 0 ,
i = 1 nothing happens ,because s.substring(0,1) = 'l'
i = 2 nothing happens, s.substring(0,1) = 'le'
...
i=4 s.substring(0,4) = 'leet', oh it's a member of Dict, queue.offer(4),visited.add(4)
-------------------------------------------
queue poll 4 ,
i = 5 nothing happens ,because s.substring(4,5) = 'c'
i = 6 nothing happens, s.substring(4,6) = 'co'
i = 7 nothing happens, s.substring(4,7) = 'cod'
i = 8 s.substring(4,8) = "code", oh ,it's a member of Dict,
oh i equals 8 ,等于数组的长度,找到的最终结果
解析一下上面这个example,如果不要visited这个Set,过程还是比较好理解的~ 为什么要visited这个Set ,看下面这个example就知道了,如果不要visited这个set,下面这个example会有很多重复操作
String
= "aaaa"
Dict = ["a","aa","aaa","aaaa","aaaaa","aaaaaa","aaaaaaa","aaaaaaaa","aaaaaaaaa","aaaaaaaaaa"]
-------------------------------------------
queue.add(1)
queue.add(2)
queue.add(3)
queue.add(4)
-------------------------------------------
queue.poll
此时又会出现queue.add(2). 因为s.substring(1,2) = "aa"在Dict中,但是,但是,在上一轮对curIdx = 0 的循环判断中,2已经被加进队列了,这样就多了很多无谓操作
// BFS
public boolean wordBreak3(String s, List<String> wordDict) {
if (wordDict.contains(s)) {
return true;
}
Queue<Integer> queue = new LinkedList<Integer>();
queue.offer(0);
Set<Integer> visited = new HashSet<Integer>();
visited.add(0);
while (!queue.isEmpty()) {
int curIdx = queue.poll();
for (int i = curIdx + 1; i <= s.length(); i++) {
if (visited.contains(i)) {
continue;
}
if (wordDict.contains(s.substring(curIdx, i))) {
if (i == s.length()) {
return true;
}
queue.offer(i);
visited.add(i);
}
}
}
return false;
}
解法三 DFS大法好啊
有了BFS,为什么不能有DFS? BFS我不知道~ 我堂堂DFS哥一般都很DP双双出入算法界
- 汉语解释DFS大法的Helper function 的意思:字符串s从第i个字符开始,到结尾的子串,都是满足条件的
- 所以对于一个字符串s,可以分为两部分~ part1 = s.substring(0,i) ,part2 = s.substring(i,s.length()); 如果同时满足part1是Dict的子集 且 part2满足Helper function, 则这个字符串就是满足题目条件的
if (wordDict.contains(s.substring(start, i)) && wordBreakHelper(s, wordDict, i)) { return true; }
talk is cheap,show me your example
String=“leetcode”,Dict=["leet","code"];
-------------------------------------------
从start=0为起点,找每一个子串
i= 1 part1 = "l" ,oh ,不满足条件,因为它不在Dict中
i= 2 part1 = “le”, oh ,不满足条件
i= 3 part1 = "lee",oh ,不满足条件
i= 4 part1 = "leet", oh,满足了一个条件,因为leet是在Dict中的,所以只要 继续找start=4的时候,后面的子串是否满足条件,即可
-------------------------------------------
下一层递归
从start=4为起点,找每一个子串
i = 5 part 1 = "c"
...
i = 8 part1 = "code"
--------------------------------------------
代码如下,显然这段代码是有瑕疵的,瑕疵说法见代码后面
public boolean wordBreakOri(String s, List<String> wordDict) {
return wordBreakHelper(s, wordDict, 0);
}
public boolean wordBreakHelper(String s, List<String> wordDict, int start) {
if (start >= s.length()) {
return true;
}
for (int i = start; i <= s.length(); i++) {
if (wordDict.contains(s.substring(start, i))
&& wordBreakHelper(s, wordDict, i)) {
return true;
}
}
return false;
}
就如上面的BFS解释的那样,当一个子串有多重分解方法的时候,会出现重复计算
talk is cheap, show me your example
String
= "aaaa"
Dict = ["a","aa","aaa","aaaa","aaaaa","aaaaaa","aaaaaaa","aaaaaaaa","aaaaaaaaa","aaaaaaaaaa"]
---------------------------------
start = 0
i = 1 part1 = "a" ,在Dict中,需要继续找start=1的时候(blablabla..里面肯定又有一大堆递归,递归代号15320)
i=2 part1 = "aa", 在Dict中,需要继续找start=2的时候(blablabla,里面又有一大堆递归)
显然,在代号为15320的递归里面里面已经走了一遍start=2的这个递归了~ 但是在start=0的子递归里面,又算了一遍start=2的递归,重复计算了
---------------------------------
我们细看一下递归代号为15320的递归
start=1
i = 2 part1 = "a" 在Dict中,需要继续找start=2的递归,是不是跟上面的递归一样了,是不是,是不是,是不是?
正确代码,用一个int数组memo带代表index是否已经算过了,-1代表没算过,1 代表算过且满足需求,0 表示算过且不满足需求
public boolean wordBreak(String s, List<String> wordDict) {
int[] memo = new int[s.length() + 1];
Arrays.fill(memo, -1);
return wordBreakDFS(s, wordDict, 0, memo);
}
public boolean wordBreakDFS(String s, List<String> wordDict, int start, int[] memo) {
if (start >= s.length()) {
return true;
}
if (memo[start] != -1) {
return memo[start] == 1 ? true : false;
}
for (int i = start + 1; i <= s.length(); i++) {
if (wordDict.contains(s.substring(start, i))
&& wordBreakDFS(s, wordDict, i, memo)) {
memo[start] = 1;
return true;
}
}
memo[start] = 0;
return false;
}
参考资料:https://www.cnblogs.com/grandyang/p/4257740.html
参考资料:https://leetcode.com/problems/word-break/discuss/43886/Evolve-from-brute-force-to-optimal-a-review-of-all-solutions
网友评论