Description

Given a string, find the length of the longest substring without repeating characters.

Example 1:

Input: "abcabcbb"
Output: 3 
Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3:

Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3. 
             Note that the answer must be a substring; "pwke" is a subsequence and not a substring.

Solution 1

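From the problem statement, the first idea is to use a set or a map to cut down on character lookups (in effect, using direct addressing to improve efficiency).

I used a map. The idea is:

Walk through the string in order and check whether each character is already in the map. If it is, remove the earlier occurrence of that character together with every character before it, start measuring the next non-repeating substring from there, and compare against the longest length found so far at every step.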

show code:

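    import java.util.HashMap;
    import java.util.Map;

    public static int lengthOfLongestSubstring(String s) {
        int longest = 0;
        int start = 0; // index of the first character in the current window
        int end = 0;   // one past the last character in the current window
        int len = s.length();
        // Maps each character in the current window to its index in s.
        Map<Character, Integer> map = new HashMap<>();

        for (int i = 0; i < len; i++) {
            char c = s.charAt(i);
            Integer prev = map.get(c);
            if (prev != null) {
                // c repeats: drop its earlier occurrence and everything before it.
                int subEnd = prev + 1;
                for (int j = start; j < subEnd; j++) {
                    map.remove(s.charAt(j));
                }
                start = subEnd;
            }
            map.put(c, i);
            end++;
            if (longest < (end - start)) {
                longest = end - start;
            }
        }
        return longest;
    }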

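Time complexity: $O(2n) = O(n)$. In the best case every character is visited only once; in the worst case every character is visited twice. The worst case occurs when all the non-repeating substrings of the string have the same length, for example "bbbbb".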

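Space complexity: $O(\min(m, n))$, where $m$ is the size of the character set and $n$ is the length of the string. From the code alone it looks like $O(n)$, but the character set is finite: once the string is longer than the character set, a repeat is guaranteed, and at that point elements are removed from the map.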
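
The two complexity claims above can be checked empirically. The following is a minimal sketch, not part of the original solution: `Solution1Stats` and `measure` are hypothetical names, and the loop is a copy of solution 1 with counters added for character visits, map removals, and the peak map size.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the original solution: it re-runs the
// solution 1 loop while counting work and tracking the peak map size.
final class Solution1Stats {
    /** Returns {longest, characterVisits, mapRemovals, peakMapSize}. */
    static int[] measure(String s) {
        Map<Character, Integer> map = new HashMap<>();
        int longest = 0, start = 0, visits = 0, removals = 0, peak = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            visits++;
            Integer prev = map.get(c);
            if (prev != null) {
                // Drop the earlier occurrence of c and everything before it.
                for (int j = start; j <= prev; j++) {
                    map.remove(s.charAt(j));
                    removals++;
                }
                start = prev + 1;
            }
            map.put(c, i);
            peak = Math.max(peak, map.size());
            longest = Math.max(longest, i + 1 - start);
        }
        return new int[] {longest, visits, removals, peak};
    }
}
```

For "abcdef" this reports 6 visits and 0 removals, the best case. For "bbbbb" it reports 5 visits and 4 removals, close to two accesses per character, while the map never holds more than one entry.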

Submitting the code:

(image: LeetCode submission result)

Next, let's try to optimize.

Solution 2

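Since we are using a map, and a hash map is backed by an array with direct addressing under the hood, could we use an array directly?
To do that we need something to take the place of the hash function and compute each character's index in the array. Every character has a corresponding numeric code, for example 'a' is 97, and that code is unique, so we can use the character's code directly as its index in the array.

We assume the string contains only English characters, for which 128 code points are enough; that is, ASCII.

show code: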
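
As a small illustration of the direct-addressing idea and of the ASCII assumption, here is a hedged sketch. `AsciiTable`, `slotOf`, and `fitsAscii` are hypothetical names introduced only for this example; they do not appear in the solutions.

```java
// Illustrative helper (names are hypothetical, not from the original post).
final class AsciiTable {
    static final int ASCII_SIZE = 128;

    /** A char widens to its numeric code, which serves directly as the array index. */
    static int slotOf(char c) {
        return c;
    }

    /** True when every character of s fits in a 128-entry table. */
    static boolean fitsAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) >= ASCII_SIZE) {
                return false;
            }
        }
        return true;
    }
}
```

`AsciiTable.slotOf('a')` gives 97, matching the example above. If the input may contain characters outside ASCII, one option is a table with `Character.MAX_VALUE + 1` entries, which covers every Java `char` value at the cost of more memory.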

    public int lengthOfLongestSubstring(String s) {
        // index[c] holds (position of c in s) + 1; 0 means c is not in the current window.
        // Assumes ASCII input: any character >= 128 would be out of bounds.
        int[] index = new int[128];
        int longest = 0, start = 0, end = 0;
        int len = s.length();
        for (int i = 0; i < len; i++) {
            int c = s.charAt(i);
            if (index[c] != 0) {
                // c repeats: clear its earlier occurrence and everything before it.
                int subEnd = index[c];
                for (int j = start; j < subEnd; j++) {
                    index[s.charAt(j)] = 0;
                }
                start = subEnd;
            }
            index[c] = i + 1;
            end++;
            if (longest < (end - start)) {
                longest = end - start;
            }
        }
        return longest;
    }

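The time and space complexity are the same as in the previous solution; replacing the map with an array simply reduces the work per operation and improves efficiency.

Submitting the code:

(image: LeetCode submission result)

Both running time and memory usage improved, but there is still room for optimization.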

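Solution 3

To optimize further, what I can think of is reducing the running time again, bringing the worst case of 2n array accesses down to n. To get down to n, we need to eliminate the step in solution 2 that clears the repeated elements: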

                for (int j = start; j < subEnd; j++) {
                    index[s.charAt(j)] = 0;
                }

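Following this idea, if we no longer clear entries from the array, we can no longer use

    if (index[c] != 0)

to check whether a character is present in the array, so we need a different approach.

Thinking it over, the earlier approach boils down to two pointers (not pointers in the C sense) that mark the bounds of the current non-repeating substring. The end pointer keeps moving forward; on a repeat we update the start pointer, then check whether the longest length needs updating, then update the array entry for the character.

So there is no need to remove repeated entries from the array at all. If the character has not been seen, index[c] == 0 and start is not updated. If it has, index[c] - 1 is the index of its previous occurrence, so the next substring should begin at index[c], that is, start = index[c]. This only applies when index[c] > start: an occurrence to the left of start is already outside the current substring, so start stays where it is. After that, check whether the longest length needs updating and update the array entry for the character.

show code: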

    public int lengthOfLongestSubstring(String s) {
        // index[c] holds (last position of c in s) + 1; 0 means c has not been seen.
        int[] index = new int[128];
        int longest = 0, start = 0, end = 0;
        int len = s.length();
        while (end < len) {
            int c = s.charAt(end);
            // Move start past the previous occurrence of c, but never backwards.
            start = Math.max(start, index[c]);
            ++end;
            longest = Math.max(longest, end - start);
            index[c] = end;
        }
        return longest;
    }

This algorithm runs in $\Theta(n)$ time, visiting at most n elements. The space complexity is unchanged: it is still $O(\min(m, n))$.
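
To see why solution 3 only ever moves start forward, it helps to trace start on an input such as "abba". The sketch below is a hypothetical tracing helper (`StartTrace` and `startsFor` are names made up for this example); it runs the same loop as solution 3 under the same ASCII assumption and records start after each character.

```java
// Hypothetical tracing helper: same loop as solution 3, but it records
// the value of start after each character is processed.
final class StartTrace {
    static int[] startsFor(String s) {
        int[] index = new int[128];   // assumes ASCII input, as in the post
        int[] starts = new int[s.length()];
        int start = 0;
        for (int end = 0; end < s.length(); end++) {
            int c = s.charAt(end);
            // Only move start forward; a stored position left of start is stale.
            start = Math.max(start, index[c]);
            starts[end] = start;
            index[c] = end + 1;
        }
        return starts;
    }
}
```

For "abba" the recorded values are 0, 0, 2, 2. At the final 'a', index['a'] is 1, which is smaller than the current start of 2. Assigning start = index[c] unconditionally would move start back to 1 and count "bba" as a substring of length 3, even though it repeats 'b'.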