str.extract()

Author: 弦好想断 | Published: 2020-04-30 13:24

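    We start with str.extract(), which uses a regular expression to pull matching data out of string data. Only the first match in each string is returned.
    Note that the regular expression must contain at least one capture group, and only the contents of the groups are returned. If a group is named, that name becomes the column name in the result.
    Series.str.extract(pat, flags=0, expand=None)
    Parameters:
    pat : string, a regular expression pattern with capture groups
    flags : int, flags from the re module (for example re.IGNORECASE); the default 0 means no flags
    expand : bool, whether to return a DataFrame with one column per group (current pandas releases default to True)
    Returns:
    DataFrame, or a Series/Index when expand=False and the pattern has a single group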
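
The behaviour described above can be sketched as follows. This is a minimal illustration, not taken from the original article: the sample Series and the group names letter and digit are made up for the example.

```python
import re
import pandas as pd

# Hypothetical sample data for illustration only.
s = pd.Series(["a1b2", "C3", "xyz"])

# Two unnamed groups: columns are numbered 0 and 1.
# Only the first match in each string is used ("a1", not "b2").
unnamed = s.str.extract(r"([a-z])(\d)")

# Named groups become column names; flags come from the re module.
named = s.str.extract(r"(?P<letter>[a-z])(?P<digit>\d)", flags=re.IGNORECASE)

# A single group with expand=False gives a Series instead of a DataFrame.
digit = s.str.extract(r"(\d)", expand=False)

print(unnamed)
print(named)
print(digit)
```

Strings without a match ("xyz" here) yield missing values rather than raising an error.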

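    Series.str.extractall(pat, flags=0)
    Returns the groups from every match in each string, not just the first.
    Parameters:
    pat : string, a regular expression pattern with capture groups
    flags : int, flags from the re module; the default 0 means no flags
    Returns:
    DataFrame with one row per match and one column per group; the index gains an extra level named match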
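
For comparison with str.extract(), here is a small sketch of extractall on made-up data (the Series, its index labels, and the group names are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data with a labelled index.
s = pd.Series(["a1b2", "c3", "xyz"], index=["x", "y", "z"])

# Every match produces a row; strings with no match produce no rows.
all_matches = s.str.extractall(r"(?P<letter>[a-z])(?P<digit>\d)")

print(all_matches)
```

The result is indexed by the original label plus a match counter, so "a1b2" contributes two rows and "xyz" contributes none.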
    Other methods

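    Series.str.capitalize() Capitalize the first character of each string in the Series/Index and lowercase the rest.
    Series.str.cat([others, sep, na_rep]) Concatenate strings in the Series/Index with the given separator.
    Series.str.center(width[, fillchar]) Pad the left and right sides of strings in the Series/Index with an additional character.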
    Series.str.contains(pat[, case, flags, na, …]) Return boolean Series/array whether given pattern/regex is contained in each string in the Series/Index.
    Series.str.count(pat[, flags]) Count occurrences of pattern in each string of the Series/Index.
    Series.str.decode(encoding[, errors]) Decode character string in the Series/Index using indicated encoding.
    Series.str.encode(encoding[, errors]) Encode character string in the Series/Index using indicated encoding.
    Series.str.endswith(pat[, na]) Return boolean Series indicating whether each string in the Series/Index ends with passed pattern.
    Series.str.extract(pat[, flags, expand]) For each subject string in the Series, extract groups from the first match of regular expression pat.
    Series.str.extractall(pat[, flags]) For each subject string in the Series, extract groups from all matches of regular expression pat.
    Series.str.find(sub[, start, end]) Return the lowest index in each string in the Series/Index where the substring is fully contained between [start:end].
    Series.str.findall(pat[, flags]) Find all occurrences of pattern or regular expression in the Series/Index.
    Series.str.get(i) Extract element from lists, tuples, or strings in each element in the Series/Index.
    Series.str.index(sub[, start, end]) Return the lowest index in each string where the substring is fully contained between [start:end].
    Series.str.join(sep) Join lists contained as elements in the Series/Index with passed delimiter.
    Series.str.len() Compute length of each string in the Series/Index.
    Series.str.ljust(width[, fillchar]) Pad the right side of strings in the Series/Index with an additional character.
    Series.str.lower() Convert strings in the Series/Index to lowercase.
    Series.str.lstrip([to_strip]) Strip whitespace (including newlines) from each string in the Series/Index from left side.
    Series.str.match(pat[, case, flags, na, …]) Determine if each string matches a regular expression.
    Series.str.normalize(form) Return the Unicode normal form for the strings in the Series/Index.
    Series.str.pad(width[, side, fillchar]) Pad strings in the Series/Index with an additional character to specified side.
    Series.str.partition([sep, expand]) Split the string at the first occurrence of sep, and return 3 elements containing the part before the separator, the separator itself, and the part after the separator.
    Series.str.repeat(repeats) Duplicate each string in the Series/Index by indicated number of times.
    Series.str.replace(pat, repl[, n, case, flags]) Replace occurrences of pattern/regex in the Series/Index with some other string.
    Series.str.rfind(sub[, start, end]) Return the highest index in each string in the Series/Index where the substring is fully contained between [start:end].
    Series.str.rindex(sub[, start, end]) Return the highest index in each string where the substring is fully contained between [start:end].
    Series.str.rjust(width[, fillchar]) Pad the left side of strings in the Series/Index with an additional character.
    Series.str.rpartition([sep, expand]) Split the string at the last occurrence of sep, and return 3 elements containing the part before the separator, the separator itself, and the part after the separator.
    Series.str.rstrip([to_strip]) Strip whitespace (including newlines) from each string in the Series/Index from right side.
    Series.str.slice([start, stop, step]) Slice substrings from each element in the Series/Index.
    Series.str.slice_replace([start, stop, repl]) Replace a slice of each string in the Series/Index with another string.
    Series.str.split([pat, n, expand]) Split each string (a la re.split) in the Series/Index by given pattern, propagating NA values.
    Series.str.rsplit([pat, n, expand]) Split each string in the Series/Index by the given delimiter string, starting at the end of the string and working to the front.
    Series.str.startswith(pat[, na]) Return boolean Series/array indicating whether each string in the Series/Index starts with passed pattern.
    Series.str.strip([to_strip]) Strip whitespace (including newlines) from each string in the Series/Index from left and right sides.
    Series.str.swapcase() Swap the case of each character in the strings in the Series/Index.
    Series.str.title() Convert strings in the Series/Index to titlecase.
    Series.str.translate(table[, deletechars]) Map all characters in the string through the given mapping table.
    Series.str.upper() Convert strings in the Series/Index to uppercase.
    Series.str.wrap(width, **kwargs) Wrap long strings in the Series/Index to be formatted in paragraphs with length less than a given width.
    Series.str.zfill(width) Pad the left side of strings in the Series/Index with 0.
    Series.str.isalnum() Check whether all characters in each string in the Series/Index are alphanumeric.
    Series.str.isalpha() Check whether all characters in each string in the Series/Index are alphabetic.
    Series.str.isdigit() Check whether all characters in each string in the Series/Index are digits.
    Series.str.isspace() Check whether all characters in each string in the Series/Index are whitespace.
    Series.str.islower() Check whether all characters in each string in the Series/Index are lowercase.
    Series.str.isupper() Check whether all characters in each string in the Series/Index are uppercase.
    Series.str.istitle() Check whether all characters in each string in the Series/Index are titlecase.
    Series.str.isnumeric() Check whether all characters in each string in the Series/Index are numeric.
    Series.str.isdecimal() Check whether all characters in each string in the Series/Index are decimal.
    Series.str.get_dummies([sep]) Split each string in the Series by sep and return a frame of dummy/indicator variables.
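
A few of the methods listed above, shown on made-up data. This is only a sketch to give a feel for the accessor; the sample values are assumptions and not from the original article.

```python
import pandas as pd

# Hypothetical sample data, including a missing value.
s = pd.Series(["apple,banana", "Cherry", None])

# contains: boolean test per string; na=False treats missing values as False.
has_an = s.str.contains("an", na=False)

# len: length of each string; missing values stay missing.
lengths = s.str.len()

# split with expand=True spreads the pieces across DataFrame columns.
parts = s.str.split(",", expand=True)

# get_dummies: one indicator column per distinct piece.
dummies = s.str.get_dummies(sep=",")

# zfill: left-pad with zeros up to the given width.
padded = pd.Series(["7", "42"]).str.zfill(3)

print(has_an, lengths, parts, dummies, padded, sep="\n\n")
```

Most of these methods propagate missing values instead of failing, which is why contains needs the na argument when a strict boolean mask is wanted.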

    Original link: https://blog.csdn.net/yj1556492839/article/details/79882488

    Article link: https://www.haomeiwen.com/subject/yzgnwhtx.html