Py3 Notes 21: Using Strings

Author: _百草_ | Published 2022-09-10 20:25

1. String concatenation

Adjacent literals

# Concatenate the values of two strings
s="s1""s2"
print(s) #s1s2
# Note: this only works with string literals

The + operator

s1 = "qwertyuiop ASDFGHJKL !@#$%^&*()"
s2 = "1234567890"
s = s1+s2
print(s)
# Note: both operands must be strings; otherwise Python raises TypeError: can only concatenate str (not "int") to str

Concatenating numbers and strings: str() & repr()

# Concatenate a number with a string
s1 = "This is"
num = 20
s = s1 + " " + repr(num)
print(s) # This is 20

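str(obj): converts a number or other object to a string, in a form meant for people to read
repr(obj): converts a number or other object to a string, in a form meant for the interpreter (useful for development and debugging). For many built-in types, eval(repr(obj)) recreates the object; where no equivalent expression exists, the result is only descriptive, and passing it to eval() may raise SyntaxError

s_s1 = str(s1) # str: convert to a string
r_s1 = repr(s1) # repr: convert to a string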
print(type(s_s1),s_s1) # <class 'str'> This is
print(type(r_s1),r_s1)  # <class 'str'> 'This is'

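str(string) leaves the string as it is; repr(string) wraps it in quotes (that is, the form of a Python string expression).
Also, when you enter an expression (a variable, arithmetic, a logical operation, and so on) in the Python interactive shell, Python displays the result using repr().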
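The difference between the two is easier to see with a value whose readable form and expression form differ. A minimal sketch using the standard-library datetime module; the variable names are illustrative and not part of the original notes:

```python
import datetime

d = datetime.date(2022, 9, 10)

human = str(d)   # readable form
debug = repr(d)  # expression-like form

print(human)  # 2022-09-10
print(debug)  # datetime.date(2022, 9, 10)

# For a plain string, evaluating the repr() text gives back an equal string.
text = "This is"
round_trip = eval(repr(text))
print(round_trip == text)  # True
```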

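2. Extracting substrings

That is, slicing

Get a single character: strname[index]
strname is the name of the string variable; index is the index value
Indexing:

Counting from the left starts at 0
Counting from the right starts at -1

Get several characters: strname[start:end:step]
start: index where the slice begins, inclusive; defaults to 0
end: index where the slice stops, exclusive; defaults to the length of the string
step: the stride, default 1; starting at start, take one character every step positions until end is reached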

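print(s[1:]) #  end and step omitted: the substring from index 1 to the end, output: his is 20
print(s[:4]) #  start and step omitted: the first 4 characters, output: This
print(s[::2])  # start and end omitted: every second character, output: Ti s2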
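Negative indexes and a negative step follow the same rules. A short sketch, reusing the string "This is 20" from the concatenation example (the variable names are only for illustration):

```python
s = "This is 20"

last_char = s[-1]     # index counted from the right
last_two = s[-2:]     # the final two characters
middle = s[5:7]       # start is included, end is excluded
reversed_s = s[::-1]  # a negative step walks the string backwards

print(last_char)   # 0
print(last_two)    # 20
print(middle)      # is
print(reversed_s)  # 02 si sihT
```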

3. len(): getting the length of a string

A built-in function

# Get the length of a string
print(len(s))
s = "测试Python中文"
print(len(s))  # 10
print(len(s.encode())) # 18
# encode() defaults to UTF-8, where a Chinese character takes 3 bytes: 4 (测试中文) * 3 + 6 (Python) = 18
print(len(s.encode("gbk"))) # 14

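In Python, the number of bytes a character occupies depends on the encoding. In the common encodings, digits, English letters, the decimal point, the underscore, and the space each take one byte, while a Chinese character may take 2 to 4 bytes. For example, a Chinese character takes 2 bytes in GBK/GB2312 and usually 3 bytes in UTF-8.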
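To check the byte count character by character, each character can be encoded on its own. The helper byte_lengths below is a hypothetical name introduced only for this sketch:

```python
def byte_lengths(text, encoding="utf-8"):
    """Hypothetical helper: return (character, byte count) pairs for text."""
    return [(ch, len(ch.encode(encoding))) for ch in text]

utf8 = byte_lengths("a中")
gbk = byte_lengths("a中", "gbk")

print(utf8)  # [('a', 1), ('中', 3)]
print(gbk)   # [('a', 1), ('中', 2)]
```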

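4. split(): splitting a string

str.split(sep, maxsplit) # Split str on the separator sep, at most maxsplit times; returns a list [str1, str2, ...]
sep: the separator; defaults to None, which splits on any run of whitespace (spaces, newlines \n, tabs \t, and so on)
maxsplit: optional; the maximum number of splits, so the resulting list has at most maxsplit+1 items. If omitted or -1, there is no limit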

s = "This is 20"
print(s.split(" ", maxsplit=1))  # ['This', 'is 20']
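The default separator and an explicit " " behave differently when whitespace repeats. A small sketch with made-up input strings:

```python
text = "a  b\tc\n"

by_whitespace = text.split()       # sep=None: runs of whitespace act as one separator
by_space = text.split(" ")         # explicit sep: every single space splits
limited = "a,b,c,d".split(",", 2)  # at most 2 splits, so at most 3 items

print(by_whitespace)  # ['a', 'b', 'c']
print(by_space)       # ['a', '', 'b\tc\n']
print(limited)        # ['a', 'b', 'c,d']
```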

5. join(): joining strings

str.join(iterable) # Join the strings in iterable using str as the separator, producing a new string

# li = [1,2,3,4]
# print("".join(li))  # TypeError: sequence item 0: expected str instance, int found
li2 = ["1","2","3","4"]
print("".join(li2)) # Output: 1234
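One common way around the TypeError for non-string items is to convert each item with str() before joining. A minimal sketch:

```python
li = [1, 2, 3, 4]

joined = "".join(str(n) for n in li)  # convert each item, then join
csv_line = ",".join(map(str, li))     # the same idea with map() and a separator

print(joined)    # 1234
print(csv_line)  # 1,2,3,4
```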

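6. count(): counting occurrences

str.count(sub,start,end) # Count the non-overlapping occurrences of the substring sub in str between index start and index end
start: defaults to 0
end: defaults to len(str)
Equivalent to str[start:end].count(sub)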

s = "This is 20"
print(s.count("is"))  # 2
print(s.count("is",3,6)) # 0; returns 0 when there is no match
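Two details are worth checking by hand: the equivalence with slicing, and the fact that matches are counted without overlap. A short sketch:

```python
s = "This is 20"

# count(sub, start, end) gives the same result as counting inside the slice.
same = s.count("is", 3, 6) == s[3:6].count("is")
print(same)  # True

# Matches do not overlap: "aaaa" contains "aa" twice, not three times.
overlap = "aaaa".count("aa")
print(overlap)  # 2
```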

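7. find() & index(): checking whether a string contains a substring

str.find(sub,start,end) # Search str from start (left) to end (right) for sub; return the index of the first occurrence, or -1 if it is absent
str.index(sub,start,end) # Search str from start (left) to end (right) for sub; return the index of the first occurrence, or raise ValueError if it is absent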

print(s.find("is")) # 2
print(s.find("bai")) # -1
print(s.rfind("is")) # 5  rfind searches from the right (end) towards the left (start)
print(s.index("is"))  # 2
print(s.index("bai"))  # ValueError: substring not found
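Because index() raises instead of returning -1, it is usually wrapped in try/except. The helper locate below is a hypothetical name used only for this sketch; when only a yes/no answer is needed, the in operator is simpler:

```python
s = "This is 20"

def locate(text, sub):
    """Hypothetical helper: index of sub in text, or None if it is absent."""
    try:
        return text.index(sub)
    except ValueError:
        return None

found = locate(s, "is")
missing = locate(s, "bai")

print(found)      # 2
print(missing)    # None
print("is" in s)  # True
```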

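8. ljust() & rjust() & center(): alignment methods

s.ljust(width[,fillchar]): pads s on the right with fillchar to reach a length of width; s itself is unchanged
s.rjust(width[,fillchar]): pads s on the left with fillchar to reach a length of width; s itself is unchanged
s.center(width[,fillchar]): pads s on both sides with fillchar to reach a length of width (centering the text); s itself is unchanged
s: the string to pad
width: the total length the result should occupy, including s itself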

s = "原字符串，可以填充字符"
# print(len(s)) # 11
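# ljust() practice
s1 = s.ljust(20)  # pads with spaces by default
print(s1)  # 原字符串，可以填充字符 (followed by 9 spaces)

new_s = s.ljust(20, "h")
print(new_s)  # 原字符串，可以填充字符hhhhhhhhh
print(s)  # the original string itself is unchanged

s2 = s.ljust(10)  # the string is never truncated
print(s2)  # 原字符串，可以填充字符

s3 = s.rjust(20)  # pads with spaces by default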
print(s3)  #         原字符串，可以填充字符
s4 = s.rjust(20, "w")
print(s4)  # wwwwwwwww原字符串，可以填充字符

s5 = s.center(20)
print(s5)  #    原字符串，可以填充字符
s6 = s.center(20, "-")
print(s6) #----原字符串，可以填充字符-----
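A typical use of these methods is lining up columns in plain-text output. A small sketch with made-up data (rows and lines are illustrative names):

```python
rows = [("apple", 3), ("kiwi", 12)]

# Left-align the name in 8 columns, right-align the quantity in 4 columns.
lines = [name.ljust(8, ".") + str(qty).rjust(4) for name, qty in rows]

for line in lines:
    print(line)
# apple...   3
# kiwi....  12
```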

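9. startswith() & endswith(): checking a prefix or suffix

s.startswith(sub[,start[,end]]) checks whether s[start:end] begins with sub; returns True if so, otherwise False
s.endswith(sub[,start[,end]]) checks whether s[start:end] ends with sub; returns True if so, otherwise False
s: the string being searched
sub: the substring to look for
start: where the search begins, default 0
end: where the search ends, default len(s), that is, the end of s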

# endswith
"""
Check whether a string ends with the given suffix
def endswith(self, suffix, start=None, end=None): # real signature unknown; restored from __doc__

    S.endswith(suffix[, start[, end]]) -> bool

    Return True if S ends with the specified suffix, False otherwise.
    With optional start, test S beginning at that position. # start is optional: the index in S where matching begins
    With optional end, stop comparing S at that position. # end is optional: where matching stops; if omitted, matching runs to the last character of S
    suffix can also be a tuple of strings to try.
    return False
"""
list1 = ["12.apk", "23.ipa", "123"]

for item in list1:
    print(item.endswith("apk"))  # True False False

# Using the start parameter
for item in list1:
    if len(item) > 2:
        print(item.endswith("apk", 4))  # False False False
    else:  # not strictly required: a start beyond the end of the string just gives False
        print("length is not greater than 2!")

# Using the end parameter
for item in list1:
    if len(item) > 2:
        # print(item.endswith("apk", end=-1)) # TypeError: endswith() takes no keyword arguments
        print(item.endswith("apk", 0, -1))  # prints False False False
        # print(item.endswith("ap", 0, -1))  # prints True False False
    else:
        print("length is not greater than 2!")
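The docstring above notes that suffix may also be a tuple of strings. A brief sketch of that form, reusing the same sample list (packages is an illustrative name):

```python
list1 = ["12.apk", "23.ipa", "123"]

# A tuple matches if the string ends with any one of its items.
packages = [name for name in list1 if name.endswith((".apk", ".ipa"))]
print(packages)  # ['12.apk', '23.ipa']

# startswith() accepts a tuple in the same way.
starts = "12.apk".startswith(("12", "99"))
print(starts)  # True
```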

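10. Case conversion

s.title() # Capitalize the first letter of every word in s (lowercasing the rest) and return a new string; s itself is unchanged
s.lower() # Convert all uppercase letters in s to lowercase and return a new string; s itself is unchanged
s.upper() # Convert all lowercase letters in s to uppercase and return a new string; s itself is unchanged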

s = "This is 123! Do you know?"
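# Capitalize the first letter of each word
print(s.title())  # This Is 123! Do You Know?
# Convert everything to lowercase
print(s.lower())  # this is 123! do you know?
# Convert everything to uppercase
print(s.upper())  # THIS IS 123! DO YOU KNOW?
# The string itself is unchanged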
print(s)  # This is 123! Do you know?
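Two practical points, sketched with made-up strings: lower() is a simple way to compare text without regard to case, and title() treats an apostrophe as a word boundary, which can surprise.

```python
a = "Python"
b = "PYTHON"

same = a.lower() == b.lower()  # case-insensitive comparison
print(same)  # True

quirk = "they're here".title()  # the letter after the apostrophe is capitalized too
print(quirk)  # They'Re Here
```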

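11. Removing specified characters

s.strip([chars]) # Remove the characters in chars from both ends of s and return the result; s itself is unchanged
s.lstrip([chars]) # Remove the characters in chars from the start of s and return the result; s itself is unchanged
s.rstrip([chars]) # Remove the characters in chars from the end of s and return the result; s itself is unchanged
chars: the set of characters to remove (any combination of them is stripped, not the exact substring); defaults to whitespace: spaces, horizontal tabs, carriage returns, newlines, and so on

# strip() removes the given characters from both ends (whitespace when no argument is passed); lstrip() strips the start; rstrip() strips the end
s.strip([chars])
Return a copy of the string with leading and trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.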

s = "\nthis is 123! \r\t"
# print(s)
print(s.strip())  # this is 123!
print(s.lstrip())  # this is 123! \r\t
print(s.rstrip())  # \nthis is 123!
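The chars argument is treated as a set of characters, not as a prefix or suffix. A short sketch with illustrative strings:

```python
url = "www.example.com"

# Every leading/trailing character found in "cmowz." is removed,
# stopping at the first character that is not in the set.
stripped = url.strip("cmowz.")
print(stripped)  # example

path = "///data///"
left = path.lstrip("/")
right = path.rstrip("/")
print(left)   # data///
print(right)  # ///data
```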

12. format(): formatting

See the separate note on using format

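13. encode() & decode(): converting between encodings

bytes: represents binary data
str: represents Unicode characters

1) encode(): converts str to bytes

str.encode([encoding="utf-8"][,errors="strict"]) converts a str to bytes, that is, encodes it

str: the string to convert
encoding="utf-8": the character encoding to use; defaults to utf-8
errors="strict": how errors are handled

strict: raise an exception on a character that cannot be encoded. The default

ignore: skip characters that cannot be encoded

replace: substitute ? for characters that cannot be encoded

xmlcharrefreplace: substitute an XML character reference.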

s = "这是一个字符串"
b = s.encode(encoding="utf-8")  # s itself is unchanged
print(b)  # b'\xe8\xbf\x99\xe6\x98\xaf\xe4\xb8\x80\xe4\xb8\xaa\xe5\xad\x97\xe7\xac\xa6\xe4\xb8\xb2'
print(type(b))  # <class 'bytes'>

2) decode(): converts binary data of type bytes to str

That is, decoding
bytes.decode([encoding="utf-8"][,errors="strict"])

s2 = b.decode(encoding="utf-8")
print(s2)  # 这是一个字符串
print(type(s2))  # <class 'str'>
# s3 = b.decode(encoding="gbk")  # decoding with a different encoding than the one used to encode may raise an error
# UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 8: illegal multibyte sequence
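The effect of each errors value is easiest to see by encoding a character that the target encoding cannot represent. A minimal sketch using ASCII as the target and a made-up sample string:

```python
text = "café"  # "é" cannot be represented in ASCII

ignored = text.encode("ascii", errors="ignore")
replaced = text.encode("ascii", errors="replace")
xml_ref = text.encode("ascii", errors="xmlcharrefreplace")

print(ignored)   # b'caf'
print(replaced)  # b'caf?'
print(xml_ref)   # b'caf&#233;'

strict_failed = False
try:
    text.encode("ascii")  # errors="strict" is the default
except UnicodeEncodeError:
    strict_failed = True
print(strict_failed)  # True

# When decoding, errors="replace" inserts U+FFFD rather than "?".
decoded = b"caf\xff".decode("utf-8", errors="replace")
print(decoded == "caf\ufffd")  # True
```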

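14. dir() & help(): help functions

dir(obj)
Lists everything a class or module contains, including variables, methods, functions, and classes
Note: obj may be omitted, in which case dir() lists the variables, methods, and defined types in the current scope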

print(dir())
# ['__annotations__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'b', 's', 's2']

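help(obj)
Shows the help documentation for a function or module
Note: obj may be omitted, which starts the interactive help utility
help()
Note: names that begin and end with __ are special ("dunder") methods that Python calls implicitly; for example, len(s) calls s.__len__(). They are not private and can be called from outside the class, although that is rarely necessary.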
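The output of dir() is an ordinary list of names, so it can be filtered. A small sketch that keeps only the str method names without a leading underscore (public is an illustrative variable name):

```python
# Keep the names that do not start with an underscore.
public = [name for name in dir(str) if not name.startswith("_")]

has_split = "split" in public
has_len = "__len__" in dir(str)

print(has_split)  # True
print(has_len)    # True

# help(str.split) prints the documentation for a single method
# without entering the interactive help utility.
```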

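15. replace()

s.replace(old,new[,max]) replaces the substring old with new, at most max times

s: the original string
old: the substring to be replaced
new: the new substring that replaces old
max: optional (called count in the Python documentation); the number of replacements will not exceed max; defaults to -1, meaning no limit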

s = "aabcabdabe"

# Method 1: replace
s1 = s.replace("ab", "")
print(s1)  # acde
# Method 2: re.sub() sub(pattern, repl, string, count=0, flags=0)
import re
s2 = re.sub(r"[a][b]", "", s, flags=re.I)
# print(re.findall(r".*(ab).*", s))  # ['ab']
print(s2)  # acde

# Method 3: split, then join
s3 = ''.join(s.split("ab"))
print(s3)  # acde

# Further:
# Delete a character at a fixed position in a string: slicing
s = "abc:cba"
s1 = s[:3]+s[-3:]
print(s1)  # abccba
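The optional third argument limits how many replacements are made, and slicing generalizes to dropping the character at any index. A short sketch; the index variable i is illustrative:

```python
s = "aabcabdabe"

once = s.replace("ab", "", 1)  # replace only the first occurrence
print(once)  # acabdabe

# Drop the character at index i by joining the slices on either side of it.
t = "abc:cba"
i = t.index(":")
without_colon = t[:i] + t[i + 1:]
print(without_colon)  # abccba
```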
