使用boost::algorithm库进行字符串擦除和字符串替换

作者: FredricZhu | 来源:发表于2021-04-19 08:11 被阅读0次

使用boost::algorithm库进行字符串擦除和字符串替换
Shell 字符串操作
Android中string.xml应用
Go库
sed中转义字符的替换
boost::algorithm库字符串大小写比较
String类常用方法(字符串比较、查找、替换、截取、拆分)
12. json.loads转换时单引号和u的替换
python拾遗【1】格式化字符串的特殊用法
2018-06-26 python字符串的一些总结

boost::algorithm库相对于std::string的erase和replace方法的好处就在于，可以做擦除第一个，第二个，最后一个，第N个的操作。
代码很简单，直接上代码，
CMakeLists.txt

cmake_minimum_required(VERSION 2.6)
project(lexical_cast)

add_definitions(-std=c++14)

include_directories("/usr/local/include")
link_directories("/usr/local/lib")
file( GLOB APP_SOURCES ${CMAKE_CURRENT_SOURCE_DIR}/*.cpp)
foreach( sourcefile ${APP_SOURCES} )
    file(RELATIVE_PATH filename ${CMAKE_CURRENT_SOURCE_DIR} ${sourcefile})
    string(REPLACE ".cpp" "" file ${filename})
    add_executable(${file} ${sourcefile})
    target_link_libraries(${file} boost_filesystem boost_regex boost_thread boost_system boost_serialization pthread boost_chrono)
endforeach( sourcefile ${APP_SOURCES} )

main.cpp

#include <boost/algorithm/string/erase.hpp>
#include <boost/algorithm/string/replace.hpp>

#include <iostream>
#include <string>

// 名称空间简化，重命名
namespace ba = boost::algorithm;

const std::string str = "Hello, hello, dear reader.";

// 擦除字符串中特定字符的例子
void erasing_examples() {
    std::cout << "\n erase all copy   :" << ba::erase_all_copy(str, ",");
    std::cout << "\n erase first copy :" << ba::erase_first_copy(str, ",");
    std::cout << "\n erase last copy  :" << ba::erase_last_copy(str, ",");
    std::cout << "\n ierase all copy  :" << ba::ierase_all_copy(str, "hello");
    std::cout << "\n ierase nth copy  :" << ba::ierase_nth_copy(str, ",", 1);
}

void replacing_examples() {
    std::cout << "\n replace all copy   :" << ba::replace_all_copy(str, ",", "!");
    std::cout << "\n replace first copy :" << ba::replace_first_copy(str, ",", "!");    
    std::cout << "\n replace head copy  :" << ba::replace_head_copy(str, 6, "Whaaaaaaaaaaa!");
}

int main(int argc, char* argv[]) {
    erasing_examples();
    replacing_examples();
}

程序输出如下，

图片.png

可以简单关注下boost::algorithm::erase_all_copy函数的实现，其他的实现以此类推。

boost::algorithm::erase_all_copy

template<typename SequenceT, typename RangeT>
        inline SequenceT erase_all_copy( 
            const SequenceT& Input,
            const RangeT& Search )
        {
            // 调用 find_format_all_copy
            return ::boost::algorithm::find_format_all_copy( 
                Input, 
                ::boost::algorithm::first_finder(Search),
                ::boost::algorithm::empty_formatter(Input) );
        }

find_format_all_copy调用 detail::find_format_all_copy_impl,
find_format_all_copy_impl调用::boost::algorithm::detail::find_format_all_copy_impl2，

看一下这个函数的实现，重点看我写注释那两句

  template< 
                typename OutputIteratorT,
                typename InputT,
                typename FinderT,
                typename FormatterT,
                typename FindResultT,
                typename FormatResultT >
            inline OutputIteratorT find_format_all_copy_impl2(
                OutputIteratorT Output,
                const InputT& Input,
                FinderT Finder,
                FormatterT Formatter,
                const FindResultT& FindResult,
                const FormatResultT& FormatResult )
            {       
                typedef BOOST_STRING_TYPENAME 
                    range_const_iterator<InputT>::type input_iterator_type; 

                typedef find_format_store<
                        input_iterator_type, 
                        FormatterT,
                        FormatResultT > store_type;

                // Create store for the find result
                store_type M( FindResult, FormatResult, Formatter );

                // Initialize last match
                input_iterator_type LastMatch=::boost::begin(Input);

                // Iterate through all matches
                while( M )
                {
                    // Copy the beginning of the sequence
                    Output = std::copy( LastMatch, M.begin(), Output );
                    // Copy formatted result
                    Output = std::copy( ::boost::begin(M.format_result()), ::boost::end(M.format_result()), Output );

                    // Proceed to the next match
                    // LastMatch=M.end()
                   // 说明LastMatch记录了上一次找到的字符串的结束位置
                   // 例如  在 hello, hello, deal reader这个串中，查找hello
                   // 第一次找到hello后， M.end()所在的位置就是第一个hello之后的 ","所在位置
                   // 这个","的index是  5 
                    LastMatch=M.end();    
                    M=Finder( LastMatch, ::boost::end(Input) );
                }

                // Copy the rest of the sequence
                Output = std::copy( LastMatch, ::boost::end(Input), Output );

                return Output;
            }

再关注一下 Finder函数的实现，
使用的是 ::boost::algorithm::first_finder

 template<typename RangeT>
        inline detail::first_finderF<
            BOOST_STRING_TYPENAME range_const_iterator<RangeT>::type,
            is_equal>
        first_finder( const RangeT& Search )
        {
            // first_find构造了一个仿函数，first_finderF
            return 
                detail::first_finderF<
                    BOOST_STRING_TYPENAME 
                        range_const_iterator<RangeT>::type,
                        is_equal>( ::boost::as_literal(Search), is_equal() ) ;
        }

重点查看一下first_finderF的 operator()(iterater begin, iterator end)操作符重载,
看注释就可以，

  // Operation
                template< typename ForwardIteratorT >
                iterator_range<ForwardIteratorT>
                operator()(
                    ForwardIteratorT Begin,
                    ForwardIteratorT End ) const
                {
                    typedef iterator_range<ForwardIteratorT> result_type;
                    typedef ForwardIteratorT input_iterator_type;
                  
                    // 和上面的分析一致，
                    // 如果这里是第一次hello匹配完的位置，
                   // 那么hello, hello, dear reader变成
                   //  hello, dear reader
                   // begin指向第二个hello前面的空格
                   // end指向字符串末尾，也就是最后reader的r后面

                    for(input_iterator_type OuterIt=Begin;
                        OuterIt!=End;
                        ++OuterIt)
                    {
                        // Sanity check
                        if( boost::empty(m_Search) )
                            return result_type( End, End );
                    
                        input_iterator_type InnerIt=OuterIt;
                        // m_Search是要查找的字符串，
                        // 在父串中查找子串， 
                       search_iterator_type SubstrIt=m_Search.begin();
                        for(;
                            InnerIt!=End && SubstrIt!=m_Search.end();
                            ++InnerIt,++SubstrIt)
                        {
                            // 如果查找过程中遇到不相等，说明未匹配到子串
                            // 退出匹配过程
                            if( !( m_Comp(*InnerIt,*SubstrIt) ) )
                                break;
                        }

                        // Substring matching succeeded
                        // 如果匹配到子串结束
                        // 说明匹配到子串
                       // 返回当前子串的iterator_range<iterator<const char*>, iterator<const char*>>对象
                        if ( SubstrIt==m_Search.end() )
                            return result_type( OuterIt, InnerIt );
                    }

                    return result_type( End, End );
                }

所以boost::algorithm库的子串擦除算法其实也就是逐个子串匹配。
如果父串的长度为m, 子串的长度为n
那么这种匹配的时间复杂度为 m*n。

使用boost::algorithm库进行字符串擦除和字符串替换
boost::algorithm库相对于std::string的erase和replace方法的好处就在于，可以做...
Shell 字符串操作
获取字符串的长度字符串的截取使用匹配模式进行截取字符串替换
Android中string.xml应用
配置字符串替换单个字符串替换使用Context方法,动态替换多个个字符串替换配置字符串中的空格
Go库
标准库字符串处理strings：包含了常规的字符串操作。比如查找、替换、切割等。strconv：包含了字符串和其...
sed中转义字符的替换
sed中转义字符的替换背景在使用sed的时候，经常我们会用来替换字符串，如果被替换的字符串中包含转移字符串怎么...
boost::algorithm库字符串大小写比较
程序代码如下，比较简单，注释也比较详细了。CMakeLists.txt main.cpp 下面的图简单分析了一下b...
String类常用方法(字符串比较、查找、替换、截取、拆分)
字符串比较范例：判断两个字符串字符串查找范例：示例代码替换指使用一个新的字符串替换旧的字符串，支持的方法...
12. json.loads转换时单引号和u的替换
1. 若字符串中有单引号，需替换 2. 若字符串带unicode前缀，也需进行替换若上述情况未进行替换，抛异常『...
python拾遗【1】格式化字符串的特殊用法
使用%格式化字符串时，*可以用后面元组的内容进行替换，例如：使用字典:
2018-06-26 python字符串的一些总结
最近看完了python字符串的内容，总结如下：切片使用替换函数，由于字符串不可变，则替换实际是生成新的字符串 ...