String Methods 0x08 -- Conditional Tests

Author: import_hello | Published 2018-11-27 07:10

    Reprints must credit the source: Jianshu@Orca_J35 | GitHub@orca-j35

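    Strings support all the common sequence operations and also implement many additional methods.
    I cover these methods one by one in a series of notes titled "String Methods".
    I keep the notes updated in this repository: https://github.com/orca-j35/python_notes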

    endswith

    🔨 str.endswith(suffix[, start[, end]])

    Return True if the string ends with the specified suffix, otherwise return False. suffix can also be a tuple of suffixes to look for. With optional start, test beginning at that position. With optional end, stop comparing at that position.

    # Test whether the string ends with suffix
    text = 'stop comparing at that position'
    assert text.endswith('tion') is True
    assert text.endswith(('tom', 'tion')) is True
    # With start and end, test whether text[start:end] ends with suffix
    assert text.endswith('top', 1, 4) is True
    assert text.endswith('top', 1, 3) is False
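
Because suffix may be a tuple, endswith is handy for matching several file extensions at once. The following is a minimal sketch; the file names and the `IMAGE_SUFFIXES` name are made up for illustration.

```python
# Hypothetical file names, for illustration only.
names = ['orca.png', 'notes.md', 'whale.JPG', 'readme.txt']
IMAGE_SUFFIXES = ('.png', '.jpg')

# endswith is case-sensitive, so normalize the case before testing.
images = [n for n in names if n.lower().endswith(IMAGE_SUFFIXES)]
assert images == ['orca.png', 'whale.JPG']

# The suffix must be a str or a tuple of str; a list raises TypeError.
try:
    'orca.png'.endswith(['.png', '.jpg'])
except TypeError:
    list_rejected = True
else:
    list_rejected = False
assert list_rejected is True
```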
    

    startswith

    🔨 str.startswith(prefix[, start[, end]])

    Return True if string starts with the prefix, otherwise return False. prefix can also be a tuple of prefixes to look for. With optional start, test string beginning at that position. With optional end, stop comparing string at that position.

    # Test whether the string starts with prefix
    # With start and end, test whether text[start:end] starts with prefix
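
The two cases described above can be sketched as follows, reusing the sample string from the endswith section. This is an illustrative example rather than part of the original notes.

```python
text = 'stop comparing at that position'

# Plain prefix test, and a tuple of candidate prefixes.
assert text.startswith('stop') is True
assert text.startswith(('start', 'stop')) is True

# With start/end, the test applies to the slice text[start:end].
assert text.startswith('top', 1) is True       # text[1:] begins with 'top'
assert text[1:3] == 'to'
assert text.startswith('top', 1, 3) is False   # 'to' is too short to match
```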
    

    isascii

    🔨 str.isascii()

    Return true if the string is empty or all characters in the string are ASCII, false otherwise. ASCII characters have code points in the range U+0000-U+007F.

    New in version 3.7.

    # Test whether the string contains only ASCII characters
    from string import printable
    assert r"""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~""".isascii() is True
    assert printable.isascii() is True
    assert '¡'.isascii() is False
    # An empty string also returns True
    assert ''.isascii() is True
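
Since isascii() only exists from Python 3.7 on, the code-point range quoted above can be checked by hand. `is_ascii_sketch` is a hypothetical helper name used only in this sketch.

```python
def is_ascii_sketch(s: str) -> bool:
    """Illustrative equivalent of str.isascii(): every code point <= U+007F."""
    return all(ord(c) <= 0x7F for c in s)

assert is_ascii_sketch('orca_j35') is True
assert is_ascii_sketch('\x7f') is True    # U+007F is the last ASCII code point
assert is_ascii_sketch('\x80') is False   # U+0080 is just outside the range
assert is_ascii_sketch('') is True        # all() of an empty iterable is True
```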
    

    isalnum

    🔨 str.isalnum()

    Return true if all characters in the string are alphanumeric and there is at least one character, false otherwise. A character c is alphanumeric if one of the following returns True: c.isalpha(), c.isdecimal(), c.isdigit(), or c.isnumeric().

    # Test whether the string contains only letters and digits
    assert 'abc123'.isalnum() is True
    assert '逆戟鲸'.isalnum() is True
    assert 'abc_123'.isalnum() is False
    assert 'abc 123'.isalnum() is False
    assert '!'.isalnum() is False
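
The per-character rule quoted above can be restated directly in code. `is_alnum_sketch` is a hypothetical name; the sketch only mirrors the documented rule and compares it with isalnum() on a few samples.

```python
def is_alnum_sketch(s: str) -> bool:
    """Restates the documented rule: non-empty, and each character is
    alphabetic, decimal, digit, or numeric."""
    return len(s) > 0 and all(
        c.isalpha() or c.isdecimal() or c.isdigit() or c.isnumeric()
        for c in s
    )

samples = ['abc123', '逆戟鲸', 'abc_123', '⅕', '⁸', '', ' ']
for s in samples:
    assert is_alnum_sketch(s) == s.isalnum()

# Numeric characters that are not digits still count as alphanumeric.
assert '⅕'.isalnum() is True
```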
    

    isalpha

    🔨 str.isalpha()

    Return true if all characters in the string are alphabetic and there is at least one character, false otherwise. Alphabetic characters are those characters defined in the Unicode character database as “Letter”, i.e., those with general category property being one of “Lm”, “Lt”, “Lu”, “Ll”, or “Lo”. Note that this is different from the “Alphabetic” property defined in the Unicode Standard. For details on “Letter” and on “Lm”, “Lt”, “Lu”, “Ll”, and “Lo”, see the Letter section in the appendix of this article.

    # Test whether the string contains only letters: Lu|Ll|Lt|Lm|Lo
    assert 'abc'.isalpha() is True
    assert '逆戟鲸'.isalpha() is True
    assert 'abc def'.isalpha() is False
    assert '123'.isalpha() is False
    assert '!'.isalpha() is False
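
The general category of a character can be inspected with unicodedata.category(). The sketch below applies the rule quoted above; `LETTER_CATEGORIES` and `is_alpha_sketch` are hypothetical names used only here.

```python
import unicodedata

LETTER_CATEGORIES = {'Lm', 'Lt', 'Lu', 'Ll', 'Lo'}

def is_alpha_sketch(s: str) -> bool:
    """Non-empty, and every character has a Letter general category."""
    return len(s) > 0 and all(
        unicodedata.category(c) in LETTER_CATEGORIES for c in s
    )

assert unicodedata.category('a') == 'Ll'
assert unicodedata.category('逆') == 'Lo'   # Chinese characters are "Letter, other"
assert unicodedata.category('1') == 'Nd'

for s in ['abc', '逆戟鲸', 'abc def', '123', '!', '']:
    assert is_alpha_sketch(s) == s.isalpha()
```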
    

    isdecimal

    🔨 str.isdecimal()

    Return true if all characters in the string are decimal characters and there is at least one character, false otherwise. Decimal characters are those that can be used to form numbers in base 10, e.g. U+0660, ARABIC-INDIC DIGIT ZERO. Formally a decimal character is a character in the Unicode General Category “Nd”.

    # Test whether the string contains only decimal characters: 0,1,2,3,4,5,6,7,8,9
    assert '0123456789'.isdecimal() is True
    assert '0123456789abcdef'.isdecimal() is False
    assert '1+j1'.isdecimal() is False
    assert '6.1'.isdecimal() is False
    # This includes the characters for 0 through 9 in other scripts
    # U+0660~U+0669 are the ARABIC-INDIC digits 0 through 9
    assert ''.join([chr(i) for i in range(0x660, 0x66A)]).isdecimal() is True
    

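    As far as I currently know, having the Nd general category and having Numeric_Type=Decimal are equivalent: each is a necessary and sufficient condition for the other. In other words, whenever a character has Numeric_Type=Decimal, isdecimal() returns True. For details on Nd and Numeric_Type, see the appendix of this article.

    Because Decimal ⊂ Digit ⊂ Numeric, whenever isdecimal() is true, isdigit() and isnumeric() must also be true: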

    # Decimal characters
    assert '0123456789'.isdecimal() is True
    assert '0123456789'.isdigit() is True
    assert '0123456789'.isnumeric() is True
    # Superscript '⁸'
    assert '⁸'.isdecimal() is False
    assert '⁸'.isdigit() is True
    assert '⁸'.isnumeric() is True
    # Vulgar fraction
    assert '⅕'.isdecimal() is False
    assert '⅕'.isdigit() is False
    assert '⅕'.isnumeric() is True
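
One practical consequence: int() accepts decimal characters from any script, but not the wider Digit or Numeric sets, so isdecimal() is the check that matches int(). `parse_int_or_none` is a hypothetical helper written for this sketch.

```python
arabic_indic = '\u0663\u0664'   # ARABIC-INDIC DIGIT THREE and FOUR
assert arabic_indic.isdecimal() is True
assert int(arabic_indic) == 34

def parse_int_or_none(s: str):
    """Parse s with int() only if it consists solely of decimal characters."""
    return int(s) if s.isdecimal() else None

assert parse_int_or_none('2018') == 2018
assert parse_int_or_none('⁸') is None    # isdigit() is True, but int('⁸') raises ValueError
assert parse_int_or_none('-5') is None   # the sign is not a decimal character
assert parse_int_or_none('') is None     # isdecimal() needs at least one character
```

Note that this helper rejects signed input such as '-5', which int() itself would accept; that is a deliberate simplification.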
    

    isdigit

    🔨 str.isdigit()

    Return true if all characters in the string are digits and there is at least one character, false otherwise. Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits. This covers digits which cannot be used to form numbers in base 10, like the Kharosthi (Kharoshthi) numbers. Formally, a digit is a character that has the property value Numeric_Type=Digit or Numeric_Type=Decimal. See the Numeric_Type section in the appendix of this article.

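    # Test whether the string contains only decimal characters and digits that need special handling (such as the compatibility superscript digits)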
    assert '0123456789'.isdigit() is True
    assert '⁸'.isdigit() is True
    # This includes digits that do not form base-10 numbers, such as U+10A40, the Kharoshthi digit one
    # That is, '\U00010A40' -> '𐩀'
    assert '\U00010A40'.isdecimal() is False
    assert '\U00010A40'.isdigit() is True
    assert '\U00010A40'.isnumeric() is True
    

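    The numeral system of the Kharosthi (Kharoshthi) script is not decimal; it only has symbols for 1, 2, 3, 4, 10, 20, 100, and 1000. See the Kharosthi article on Wikipedia for details.

    Because Decimal ⊂ Digit ⊂ Numeric, whenever isdigit() is true, isnumeric() must also be true: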

    # Decimal characters
    assert '0123456789'.isdecimal() is True
    assert '0123456789'.isdigit() is True
    assert '0123456789'.isnumeric() is True
    # Superscript '⁸'
    assert '⁸'.isdecimal() is False
    assert '⁸'.isdigit() is True
    assert '⁸'.isnumeric() is True
    # Vulgar fraction
    assert '⅕'.isdecimal() is False
    assert '⅕'.isdigit() is False
    assert '⅕'.isnumeric() is True
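
The difference between Decimal and Digit also shows up in the unicodedata module: digit() returns a value for the characters below, while decimal() has none. A small sketch, using the default argument to avoid a ValueError:

```python
import unicodedata

superscript_eight = '\u2078'   # '⁸'
assert unicodedata.digit(superscript_eight) == 8
# decimal() has no value for it, so the supplied default is returned.
assert unicodedata.decimal(superscript_eight, None) is None

# An ordinary ASCII digit has both a decimal and a digit value.
assert unicodedata.decimal('8') == 8
assert unicodedata.digit('8') == 8

kharoshthi_one = '\U00010A40'
assert unicodedata.digit(kharoshthi_one) == 1
assert unicodedata.decimal(kharoshthi_one, None) is None
```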
    

    isnumeric

    🔨 str.isnumeric()

    Return true if all characters in the string are numeric characters, and there is at least one character, false otherwise. Numeric characters include digit characters, and all characters that have the Unicode numeric value property, e.g. U+2155, VULGAR FRACTION ONE FIFTH. Formally, numeric characters are those with the property value Numeric_Type=Digit, Numeric_Type=Decimal or Numeric_Type=Numeric. See the Numeric_Type section in the appendix of this article.

    assert '⅕'.isnumeric() is True
    assert 'Ⅵ'.isnumeric() is True
    assert '贰'.isnumeric() is True
    

    Because Decimal ⊂ Digit ⊂ Numeric, when isnumeric() is true, isdecimal() and isdigit() are not necessarily true:

    # Decimal characters
    assert '0123456789'.isdecimal() is True
    assert '0123456789'.isdigit() is True
    assert '0123456789'.isnumeric() is True
    # Superscript '⁸'
    assert '⁸'.isdecimal() is False
    assert '⁸'.isdigit() is True
    assert '⁸'.isnumeric() is True
    # Vulgar fraction
    assert '⅕'.isdecimal() is False
    assert '⅕'.isdigit() is False
    assert '⅕'.isnumeric() is True
    # Roman numeral
    assert 'Ⅵ'.isdecimal() is False
    assert 'Ⅵ'.isdigit() is False
    assert 'Ⅵ'.isnumeric() is True
    # Chinese numeral
    assert '贰'.isdecimal() is False
    assert '贰'.isdigit() is False
    assert '贰'.isnumeric() is True
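
For a single numeric character, unicodedata.numeric() returns the value itself as a float. A short sketch using the characters from the examples above:

```python
import unicodedata

assert unicodedata.numeric('⅕') == 0.2    # VULGAR FRACTION ONE FIFTH
assert unicodedata.numeric('Ⅵ') == 6.0    # ROMAN NUMERAL SIX
assert unicodedata.numeric('贰') == 2.0    # Chinese financial numeral two

# Characters without a numeric value raise ValueError unless a default is given.
assert unicodedata.numeric('a', None) is None
```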
    

    isidentifier

    🔨 str.isidentifier()

    Return true if the string is a valid identifier according to the language definition, section Identifiers and keywords.

    Use keyword.iskeyword() to test for reserved identifiers such as def and class.

    # Test whether the string is a valid identifier
    assert 'if'.isidentifier() is True
    assert '_orca_j35'.isidentifier() is True
    assert '123_abc'.isidentifier() is False
    # keyword.iskeyword() tests whether a string is a reserved identifier
    import keyword
    assert keyword.iskeyword('def') is True
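
Since isidentifier() also returns True for keywords such as 'if', the two checks are usually combined when validating a name. `is_usable_name` is a hypothetical helper for this sketch.

```python
import keyword

def is_usable_name(s: str) -> bool:
    """True if s is a valid identifier and not a reserved keyword."""
    return s.isidentifier() and not keyword.iskeyword(s)

assert is_usable_name('_orca_j35') is True
assert is_usable_name('if') is False        # valid identifier syntax, but reserved
assert is_usable_name('123_abc') is False   # cannot start with a digit
assert is_usable_name('逆戟鲸') is True      # non-ASCII letters are allowed
```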
    

    isprintable

    🔨 str.isprintable()

    Return true if all characters in the string are printable or the string is empty, false otherwise. Nonprintable characters are those characters defined in the Unicode character database as “Other” or “Separator”, excepting the ASCII space (0x20) which is considered printable. (Note that printable characters in this context are those which should not be escaped when repr() is invoked on a string. It has no bearing on the handling of strings written to sys.stdout or sys.stderr.) For details on “Other” and “Separator”, see the Separator&Other section in the appendix of this article.

    In short, “printable” here refers to what repr() leaves unescaped, not to how strings are written to sys.stdout or sys.stderr.

    # Test whether the string contains only printable characters
    # Characters in the Unicode categories Other or Separator are nonprintable, except the ASCII space (0x20)
    assert '\t'.isprintable() is False
    assert ' '.isprintable() is True
    # An empty string also returns True
    assert ''.isprintable() is True
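
The link to repr() can be observed directly: nonprintable characters come back as escape sequences, printable ones are left as they are. A brief illustrative sketch:

```python
# TAB is a control character (Other), so repr() escapes it.
assert '\t'.isprintable() is False
assert repr('\t') == r"'\t'"

# NO-BREAK SPACE is a Separator (Zs) other than the ASCII space.
assert '\xa0'.isprintable() is False
assert repr('\xa0') == r"'\xa0'"

# Printable non-ASCII characters are not escaped.
assert '逆戟鲸'.isprintable() is True
assert repr('逆戟鲸') == "'逆戟鲸'"
```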
    

    isspace

    🔨 str.isspace()

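    Return true if there are only whitespace characters in the string and there is at least one character, false otherwise. Whitespace characters are those characters defined in the Unicode character database as “Other” or “Separator” and those with bidirectional property being one of “WS”, “B”, or “S”. For details on “Other” and “Separator”, see the Separator&Other section in the appendix of this article; for the bidirectional property, see Bidi_Class and Bidirectional Class Values. Note that later versions of the Python documentation define whitespace more narrowly: a character is whitespace if its general category is Zs (“Separator, space”) or its bidirectional class is one of WS, B, or S.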

    # Test whether the string contains only whitespace characters
    # In practice: general category Zs, or bidirectional class WS, B, or S
    assert ' \t\n\r\v\f'.isspace() is True
    assert 'orca_j35'.isspace() is False
    # An empty string returns False
    assert ''.isspace() is False
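
A few less obvious cases, as an illustrative sketch: not every control or format character is whitespace, while some non-ASCII spaces are.

```python
import unicodedata

assert '\u00a0'.isspace() is True    # NO-BREAK SPACE, category Zs
assert '\u3000'.isspace() is True    # IDEOGRAPHIC SPACE, category Zs

# U+001F is a control character whose bidirectional class is S.
assert unicodedata.bidirectional('\x1f') == 'S'
assert '\x1f'.isspace() is True

assert '\x00'.isspace() is False     # control character, but not whitespace
assert '\u200b'.isspace() is False   # ZERO WIDTH SPACE, category Cf
```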
    

    istitle

    🔨 str.istitle()

    Return true if the string is a titlecased string and there is at least one character, for example uppercase characters may only follow uncased characters and lowercase characters only cased ones. Return false otherwise.

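    # Test whether the string is titlecased
    # Uppercase characters may only follow uncased characters; lowercase characters may only follow cased characters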
    assert 'A'.istitle() is True
    assert 'Orca 8@Orca 🐳逆戟鲸Orca'.istitle() is True
    assert 'Orca ORca'.istitle() is False
    assert 'Orca orca'.istitle() is False
    assert 'Orca O#rca'.istitle() is False
    assert '35orca'.istitle() is False
    # The first letter may be a character in Lt; see the Letter section in the appendix of this article.
    assert 'ᾯabc'.istitle() is True
    # Chinese characters are uncased
    assert '逆戟鲸 Orca'.istitle() is True
    assert '逆戟鲸orca'.istitle() is False
    

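    Uncased characters are those that are not cased, that is, whose general category is none of Lu, Ll, or Lt. This includes letters in Lm and Lo, such as Chinese characters. See the Letter section in the appendix of this article.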
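
istitle() pairs naturally with str.title(), and the same rule explains a well-known surprise with apostrophes: an apostrophe is uncased, so the letter after it is expected to be uppercase. A short illustrative sketch:

```python
# title() produces a string that istitle() accepts.
assert 'orca j35'.title() == 'Orca J35'
assert 'orca j35'.title().istitle() is True

# The apostrophe is uncased, so title() capitalizes the letter after it.
assert "they're".title() == "They'Re"
# For the same reason, the natural spelling is not considered titlecased.
assert "They're".istitle() is False
```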

    islower

    🔨 str.islower()

    Return true if all cased characters in the string are lowercase and there is at least one cased character, false otherwise.

    Cased characters are those with general category property being one of “Lu” (Letter, uppercase), “Ll” (Letter, lowercase), or “Lt” (Letter, titlecase). See the Letter section in the appendix of this article.

    # Test whether the string contains only lowercase characters (Ll) and uncased characters
    assert 'a'.islower() is True
    assert 'ƺ'.islower() is True # Latin Small Letter Ezh with Tail
    assert 'orca j35 逆戟鲸 !@\n\t'.islower() is True
    # At least one lowercase character is required
    assert '逆戟鲸'.islower() is False
    

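    Uncased characters are those that are not cased, that is, whose general category is none of Lu, Ll, or Lt. This includes letters in Lm and Lo, such as Chinese characters. See the Letter section in the appendix of this article.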

    isupper

    🔨 str.isupper()

    Return true if all cased characters in the string are uppercase and there is at least one cased character, false otherwise.

    Cased characters are those with general category property being one of “Lu” (Letter, uppercase), “Ll” (Letter, lowercase), or “Lt” (Letter, titlecase). See the Letter section in the appendix of this article.

    # Test whether the string contains only uppercase characters (Lu) and uncased characters
    assert 'A'.isupper() is True
    assert 'Æ'.isupper() is True  # Latin Capital Letter Ae
    assert 'ORCA J35 逆戟鲸 !@\n\t'.isupper() is True
    assert 'Orca'.isupper() is False
    # At least one uppercase character is required
    assert '逆戟鲸'.isupper() is False
    assert '_35'.isupper() is False
    

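    Uncased characters are those that are not cased, that is, whose general category is none of Lu, Ll, or Lt. This includes letters in Lm and Lo, such as Chinese characters. See the Letter section in the appendix of this article.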
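
Because both islower() and isupper() require at least one cased character, a string can fail both tests for two different reasons: mixed case, or no cased characters at all. `describe_case` is a hypothetical helper that separates the two; comparing lower() with upper() is only a rough way to detect cased characters and is an assumption of this sketch.

```python
def describe_case(s: str) -> str:
    """Classify s as 'lower', 'upper', 'mixed', or 'uncased' (illustrative)."""
    if s.islower():
        return 'lower'
    if s.isupper():
        return 'upper'
    # Neither test passed: if case conversion changes nothing,
    # assume the string has no cased characters.
    return 'mixed' if s.lower() != s.upper() else 'uncased'

assert describe_case('orca j35') == 'lower'
assert describe_case('ORCA J35') == 'upper'
assert describe_case('Orca') == 'mixed'
assert describe_case('逆戟鲸') == 'uncased'
assert describe_case('_35') == 'uncased'
```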
