美文网首页
Chapter5——数据结构——字符串

Chapter5——数据结构——字符串

作者: crishawy | 来源:发表于2019-07-25 17:14 被阅读0次

    1. 题目列表

    poj1035,poj3080,poj1936

    2. POJ1035——Spell checker

    2.1 题目描述

    Description:
    You, as a member of a development team for a new spell checking program, are to write a module that will check the correctness of given words using a known dictionary of all correct words in all their forms.
    If the word is absent in the dictionary then it can be replaced by correct words (from the dictionary) that can be obtained by one of the following operations:
    ?deleting of one letter from the word;
    ?replacing of one letter in the word with an arbitrary letter;
    ?inserting of one arbitrary letter into the word.
    Your task is to write the program that will find all possible replacements from the dictionary for every given word.
    Input:
    The first part of the input file contains all words from the dictionary. Each word occupies its own line. This part is finished by the single character '#' on a separate line. All words are different. There will be at most 10000 words in the dictionary.
    The next part of the file contains all words that are to be checked. Each word occupies its own line. This part is also finished by the single character '#' on a separate line. There will be at most 50 words that are to be checked.
    All words in the input file (words from the dictionary and words to be checked) consist only of small alphabetic characters and each one contains 15 characters at most.
    Output:
    Write to the output file exactly one line for every checked word in the order of their appearance in the second part of the input file. If the word is correct (i.e. it exists in the dictionary) write the message: " is correct". If the word is not correct then write this word first, then write the character ':' (colon), and after a single space write all its possible replacements, separated by spaces. The replacements should be written in the order of their appearance in the dictionary (in the first part of the input file). If there are no replacements for this word then the line feed should immediately follow the colon.

    Sample Input:
    i
    is
    has
    have
    be
    my
    more
    contest
    me
    too
    if
    award
    #
    me
    aware
    m
    contest
    hav
    oo
    or
    i
    fi
    mre
    #
    
    Sample Output:
    me is correct
    aware: award
    m: i my me
    contest is correct
    hav: has have
    oo: too
    or:
    i is correct
    fi: i
    mre: more me
    
    2.2 解决思路

    本问题是字符串的模式匹配问题。由于只能删除、修改、插入一个字符,判断是否能还原。我们以两个字符串的长度为标准讨论。

    • 当两个字符串长度相同。那么统计两个字符串对应位的不同字符的个数,若大于1,则不能还原。
    • 当两个字符串的长度差1。那么判断长度小的字符串是否能模式匹配长度大的字符串,若能则可以还原。模式匹配:即判断一个串是否是另外一个串的子串(这里的子串不要求连续)
    • 当两个字符串的长度相差大于1。则不可能还原。

    接下来简单总结下string和map常用的操作函数:
    (1). string:

    • erase(iter):删除单个元素,iter为需要删除的元素的迭代器
    • insert(pos, string):在pos位置插入字符串string
    • insert(iter, string):在iter迭代器位置插入string(注意:insert操作,可以插入的位置有string.length()个)
    • find(string):当参数string是原字符串 的子串时,返回string在其中出现的第一个位置。
    • substr(pos, len):返回从pos号位开始、长度为len的字符串
      (2). map
    • map的迭代器:map<T, T>::iterator it;
    • map的元素插入:map[T1] = T2
    • map查找键:map.find(T)
    • map元素的访问:下标访问或迭代器访问,注意mao的迭代器是一个结构体,包括first和second成员。
    2.3 代码
    #include <cstdio>
    #include <iostream>
    #include <set>
    #include <vector>
    #include <string>
    #include <algorithm>
    #include <map>
    using namespace std;
    
    vector<string> dict;
    
    bool repla(string s1, string s2){
        // 讨论字符串的长度
        int len1 = s1.length();
        int len2 = s2.length();
        int num = 0;
        if (len1 == len2){ // 长度相同 
            for (int i = 0; i < len1; i++)
                if (s1[i] != s2[i])
                    num++;
            if (num == 1) return true;
        }else if(len1 - len2 == 1){ // 长度相差1 
            //短字符串模式匹配长字符串
            for (int i = 0; i < len1; i++){
                if (s1[i] == s2[num])
                    num++;
                if (num == len2) return true;
            } 
        }else if (len2 - len1 == 1){
            // 短字符串模式匹配长字符串
            for (int i = 0; i < len2; i++){
                if (s1[num] == s2[i])
                    num++;
                if (num == len1) return true;
            } 
        }
        return false;
    }
    
    int main(){
        string s;
        int k = 0;
        while (cin>>s && s != "#"){
            dict.push_back(s); //map的插入操作 
        }
        while (cin>>s && s != "#"){
            bool flag = false;
            for (int i = 0; i < dict.size(); i++){
                // 这里也可以使用temp数组保存当前可能结果 
                if (s == dict[i]){
                    cout<<s<<" is correct"<<endl;
                    flag = true;
                    break; 
                }
            }
            if (!flag){
                cout<<s<<":";
                for (int i = 0; i < dict.size(); i++){
                    if (repla(s, dict[i])){
                        cout<<" "<<dict[i];
                    }
                }
                cout<<endl;
            }
        }   
    }
    

    4. POJ1936——All in All

    4.1 问题描述

    Description

    You have devised a new encryption technique which encodes a message by inserting between its characters randomly generated strings in a clever way. Because of pending patent issues we will not discuss in detail how the strings are generated and inserted into the original message. To validate your method, however, it is necessary to write a program that checks if the message is really encoded in the final string.

    Given two strings s and t, you have to decide whether s is a subsequence of t, i.e. if you can remove characters from t such that the concatenation of the remaining characters is s.
    Input

    The input contains several testcases. Each is specified by two strings s, t of alphanumeric ASCII characters separated by whitespace.The length of s and t will no more than 100000.
    Output

    For each test case output "Yes", if s is a subsequence of t,otherwise output "No".

    Sample Input
    
    sequence subsequence
    person compression
    VERDI vivaVittorioEmanueleReDiItalia
    caseDoesMatter CaseDoesMatter
    
    Sample Output
    
    Yes
    No
    Yes
    No
    
    2.2 问题解决

    使用LCS思想,并不需要使用d数组存储,因为存储空间达到10000 * 10000,也可以使用滚动数组来优化,此处不做讨论。

    2.3 代码
    #include <cstdio>
    #include <cstring>
    #include <algorithm>
    #include <vector>
    #include <cmath>
    using namespace std;
    
    const int maxn = 100010;
    char s1[maxn], s2[maxn]; // 注意dp数组和字符串数组的下标对应关系,均从下标1开始 存储字符串 
    
    int main(){
        while (~scanf("%s%s", s1 + 1, s2 + 1)){ // 注意从下标1开始读取字符串 
            int m = strlen(s1 + 1); // 读取字符串长度也应该从下标1开始!! 
            int n = strlen(s2 + 1);
            // lcs做法 
            int i = 1;
            for (int j = 1; j <= n; j++){
                if (s1[i] == s2[j])
                    i += 1;
            } 
            if (i == m + 1) printf("Yes\n");
            else printf("No\n");        
        }
        return 0;   
    } 
    

    5.总结

    字符串系列题目列举有限,应熟悉常见的字符串操作,包括增删改查,以及一些提高的数据结构的使用,包括set、map、string等,可以提高编程效率。熟悉常见的方法,包括暴力模拟、动态规划等。

    对于字符串问题,还应多积累总结。

    相关文章

      网友评论

          本文标题:Chapter5——数据结构——字符串

          本文链接:https://www.haomeiwen.com/subject/qrscrctx.html