2020-02-10 LeetCode Practice 3 (Strings)

242 Valid Anagram

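Tags: hash table, counter, array
I've run into a lot of problems like this lately, and they were all solved with a hash table. Since this one only involves the 26 lowercase letters, though, a plain array is enough to serve as the counter: increment while scanning s, decrement while scanning t, and return false as soon as any count drops below 0.
Code: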

time:80.03%, memory:5.66%
class Solution {
public:
    bool isAnagram(string s, string t) {
        int cnt_s[26] = {0};
        for(int i = 0; i < s.size(); i++)
            cnt_s[s[i]-'a']++;
        for(int i = 0; i < t.size(); i++){
            cnt_s[t[i]-'a']--;
            if(cnt_s[t[i]-'a'] < 0) return false;
        }
        for(int i = 0; i < 26; i++){
            if(cnt_s[i] > 0) return false;
        }
        return true;
    }
};
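
For comparison, here is a minimal sketch of the hash-table version mentioned above. The name isAnagramHash is made up for illustration, and the early length check is an addition that is not in the array version; it lets the final pass over the counts be dropped.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical hash-table variant of the counter.
// Unlike the 26-slot array, it accepts any char value, not only 'a'..'z'.
bool isAnagramHash(const std::string& s, const std::string& t) {
    if (s.size() != t.size()) return false;  // different lengths can never be anagrams
    std::unordered_map<char, int> cnt;
    for (char c : s) ++cnt[c];               // count up over s
    for (char c : t) {
        if (--cnt[c] < 0) return false;      // t contains more of c than s does
    }
    // Equal lengths and no count below zero means every count is exactly zero.
    return true;
}
```

The array version is likely faster for lowercase-only input; the map only pays off when the alphabet is not known in advance.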

8 String to Integer (atoi)

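Tags: string, overflow
This problem is annoying. On one hand you have to watch for overflow; on the other, the input string can come in all sorts of formats, and the problem statement is not very clear about them. Several submissions failed before I finally got the requirements straight.
The code checks for overflow before it happens. For a positive number, if n * 10 + int(str[i] - '0') would exceed 2147483647, the next step overflows; for a negative number, if n * 10 - int(str[i] - '0') would fall below -2147483648, the next step also overflows. Evaluating those expressions directly would itself overflow, so the code compares n against the rearranged bounds instead: n > (2147483647 - digit) / 10.0 and n < (-2147483648 + digit) / 10.0.
Code: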

time: 56.71%, memory: 10.68%
class Solution {
public:
    int myAtoi(string str) {
        int n = 0;
        bool is_neg = false;
        int i = 0;
        while(i < str.size() && str[i] == ' ')
            i++;
        if(i == str.size()) return 0;  // empty string
        if(str[i] == '-') {
            is_neg = true;  
            i++;
            if(!isdigit(str[i])) return 0;
        }
        else if(str[i] == '+'){
            is_neg = false;
            i++;
            if(!isdigit(str[i])) return 0;
        }
        else if(isdigit(str[i])) is_neg = false;
        else return 0;  // not valid
        
        for(; i < str.size(); i++){
            if(!isdigit(str[i])) break;  // partial invalid
            if(is_neg){
                if(n < (-2147483648 + int(str[i] - '0')) / 10.0)
                    return -2147483648;
                n = n * 10 - int(str[i] - '0');
            }
            else{
                if(n > (2147483647 - int(str[i] - '0')) / 10.0)
                    return 2147483647;
                n = n * 10 + int(str[i] - '0');
            }
        }
        return n;
    }
};
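
The overflow pre-check described above can also be written in pure integer arithmetic. The sketch below is only an illustration: the helper names wouldOverflowPos and wouldOverflowNeg are hypothetical, and it uses INT_MAX and INT_MIN from <climits> in place of the literal bounds, assuming a 32-bit int as on LeetCode.

```cpp
#include <climits>

// Hypothetical helper: true if n * 10 + d would exceed INT_MAX.
// Assumes n >= 0 and 0 <= d <= 9. The product is never computed,
// so the check itself cannot overflow.
bool wouldOverflowPos(int n, int d) {
    return n > (INT_MAX - d) / 10;
}

// Hypothetical helper: true if n * 10 - d would fall below INT_MIN.
// Assumes n <= 0 and 0 <= d <= 9. Integer division truncates toward
// zero, which is the rounding direction this comparison needs.
bool wouldOverflowNeg(int n, int d) {
    return n < (INT_MIN + d) / 10;
}
```

For example, wouldOverflowPos(214748364, 7) is false because 2147483647 still fits, while wouldOverflowPos(214748364, 8) is true.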

28 Implement strStr()

Brute force: time complexity O(mn), space complexity O(1).

time: 10.62%, memory: 64.59%
class Solution {
public:
    int strStr(string haystack, string needle) {
        if(needle.size() == 0) return 0;
        if(haystack.size() < needle.size()) return -1;
        for(int i = 0; i < haystack.size(); i++){
            int p = i, q = 0;
            while(p < haystack.size() && q < needle.size() && haystack[p] == needle[q]){
                p++;
                q++;
            }
            if(q == needle.size()) return i;

        }
        return -1;
    }
};

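Still, this approach is too slow. From the solutions section I learned the Sunday algorithm, which is easy to understand and also fast. Solution link: https://leetcode-cn.com/problems/implement-strstr/solution/python3-sundayjie-fa-9996-by-tes/
First, scan the pattern (the shorter string) once and record, for every character that appears in it, its smallest shift: len - idx, where idx is the index of the character's rightmost occurrence.
For example, with text a b a a b and pattern a a b, the smallest shift for a is 2 and for b is 1.
First attempt
a b a a b
a a b
The match fails. Take the first text character to the right of the pattern's window, that is, the fourth character of the text (index 3), which is a. Its smallest shift is 2, so the pattern moves right by 2 for the next attempt.
Second attempt
a b a a b
    a a b
This time the match succeeds. If the first character to the right of the window does not appear in the pattern at all, the pattern moves right by its length plus 1, skipping past that character.
In the worst case the time complexity is still O(mn), but in practice far more comparisons are skipped.
Code: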

time: 93.79%, memory: 15.78%
class Solution {
public:
    int strStr(string haystack, string needle) {
        if(needle.size() == 0) return 0;
        if(haystack.size() < needle.size()) return -1;
        
        // calculate index_map
        map<char, int> index_map;
        for(int i = 0; i < needle.size(); i++){
            index_map[needle[i]] = needle.size() - i;
        }

        // match
        int idx = 0;
        while(idx + needle.size() <= haystack.size()){
            int p = idx, q = 0;
            while(p < haystack.size() && q < needle.size() && haystack[p] == needle[q]){
                p++;
                q++;
            }
            if(q == needle.size()) return idx;
            if(idx + needle.size() >= haystack.size()) return -1;
            if(index_map.count(haystack[idx + needle.size()])){
                idx += index_map[haystack[idx + needle.size()]];
            }
            else{
                idx = idx + needle.size() + 1;
            }
        }
        return -1;
    }
};
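
To check the worked example above (pattern a a b), the shift table can be built and queried on its own. This is a small sketch with hypothetical helper names (buildShiftTable, shiftFor), separate from the submitted solution.

```cpp
#include <map>
#include <string>

// Hypothetical helper: builds the Sunday shift table.
// For each character of the pattern, shift = length - index of its
// rightmost occurrence; later occurrences overwrite earlier ones.
std::map<char, int> buildShiftTable(const std::string& pattern) {
    std::map<char, int> shift;
    const int len = static_cast<int>(pattern.size());
    for (int i = 0; i < len; ++i) {
        shift[pattern[i]] = len - i;
    }
    return shift;
}

// Hypothetical helper: shift to apply after a failed attempt, given the
// text character just past the current window. A character that never
// occurs in the pattern moves the window past it (length + 1).
int shiftFor(const std::map<char, int>& shift, char next_char, int pattern_len) {
    auto it = shift.find(next_char);
    return it != shift.end() ? it->second : pattern_len + 1;
}
```

For the pattern "aab" the table holds a -> 2 and b -> 1, matching the example, and a character such as z yields a shift of 4.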

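KMP algorithm:
This algorithm follows Prof. Deng Junhui's data structures course.
KMP builds a next array that stores, for each pattern position, where in the pattern to resume matching against the current text character after a mismatch at that position. It is computed as follows: for the pattern prefix P[0, j), next[j] is the length of the longest proper prefix of P[0, j) that is also its suffix. For example, in abcmabc the prefix abc matches the suffix abc, so next[7] = 3. If the x in abcmabcx fails to match the current text character, matching resumes with the m of abcm against that same text character.
An imaginary wildcard is placed at position -1 as a sentinel, and next[0] is -1: if P[0] fails to match the current text character, the wildcard at -1 is aligned with that position next, which amounts to shifting the whole pattern right by one step.
One more point needs attention. Suppose the pattern starts with a run of 0s and its fourth character fails to match a 1 in the text. The pattern should not simply shift right by 1, because everything to the left is also 0 and will fail against that 1 as well. So the next array has to be built with this in mind: whenever P[j] equals the character at its candidate fallback position q, store next[q] instead of q.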

time: 93.85%, memory: 6.43%
class Solution {
public:
vector<int> get_next(string P){
    vector<int> next;
    next.push_back(-1);
    int p = 0, q = -1, p_len = P.size();
    while(p < p_len - 1){
        if(q < 0 || P[p] == P[q]){
            p++;
            q++;
            if(P[p] == P[q]) next.push_back(next[q]);   // same character would fail again, so reuse next[q]
            else next.push_back(q);  
        }
        else{
            q = next[q];
        }
    }
    return next;
}
int strStr(string haystack, string needle) {
    if (haystack.size() < needle.size()) return -1;
    // if (needle.size() == 0) return 0;
    vector<int> next = get_next(needle);
    int h_len = haystack.size(), n_len = needle.size();
    int p = 0, q = 0;
    while(q < n_len && p < h_len){
        if(q < 0 || haystack[p] == needle[q]){
            p++;
            q++;
        }
        else
            q = next[q];
    }
    if(q == needle.size()) return (p - q);
    return -1;
}

};
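
To see what the adjustment for repeated characters changes, the sketch below builds the next array both ways. buildNext and its optimized flag are hypothetical names for illustration; the optimized branch follows the same rule as get_next above, and 00001 is an assumed example pattern in the spirit of the run-of-0s case.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the KMP next array for pattern P.
// optimized = false: next[j] is the plain longest prefix/suffix length.
// optimized = true:  if P[j] equals the character it would fall back to,
//                    reuse that position's next value instead.
std::vector<int> buildNext(const std::string& P, bool optimized) {
    const int len = static_cast<int>(P.size());
    std::vector<int> next(len, -1);  // next[0] = -1 is the sentinel
    int p = 0, q = -1;
    while (p < len - 1) {
        if (q < 0 || P[p] == P[q]) {
            ++p;
            ++q;
            // Falling back to q is pointless when P[p] == P[q]:
            // the same text character would fail again.
            next[p] = (optimized && P[p] == P[q]) ? next[q] : q;
        } else {
            q = next[q];
        }
    }
    return next;
}
```

For 00001 the plain table is -1 0 1 2 3, while the optimized one is -1 -1 -1 -1 3: a mismatch on any of the leading 0s shifts the pattern past the offending text character in one step. For abcmabcx the plain table ends in 3, consistent with the example given earlier.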
