Pandas_Select_Data_Boolen

Author: Kaspar433 | Published: 2020-03-31 21:30

    import pandas as pd
    import numpy as np

    # Load the iris dataset (assumes iris.csv is in the working directory).
    iris = pd.read_csv('iris.csv')
    iris.head(2)
    
    out:
       sepal_length  sepal_width  petal_length  petal_width  species
    0           5.1          3.5           1.4          0.2   setosa
    1           4.9          3.0           1.4          0.2   setosa
    

    Another common operation is filtering data with a boolean vector.

    The operators are: | for or, & for and, and ~ for not.

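    Each condition must be wrapped in parentheses, because & and | bind more tightly than comparison operators such as > and <.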
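A minimal sketch of why the parentheses matter. The DataFrame, its column names a and b, and the values are made up for illustration and are not part of the iris data. Without parentheses Python groups the operands of & first, so the expression turns into a chained comparison that needs a single True/False from a whole Series, which pandas refuses to give.

```python
import pandas as pd

# Hypothetical toy data, not the iris dataset.
df = pd.DataFrame({"a": [1, 5, 8], "b": [2, 2, 9]})

# Correct: each comparison is parenthesised before being combined.
both = df[(df.a > 2) & (df.b < 5)]

# Without parentheses Python reads this as `df.a > (2 & df.b) < 5`,
# a chained comparison that asks for the truth value of a Series.
try:
    df[df.a > 2 & df.b < 5]
    unparenthesised_failed = False
except (ValueError, TypeError):
    unparenthesised_failed = True

print(both)
print(unparenthesised_failed)
```

Only the parenthesised form expresses "a is greater than 2 and b is less than 5" row by row.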

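    Indexing a Series with a boolean vector works exactly as it does for a NumPy ndarray, and the same kind of mask selects rows of a DataFrame: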

    iris[iris.sepal_length>7]
    
    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    102           7.1          3.0           5.9          2.1  virginica
    105           7.6          3.0           6.6          2.1  virginica
    107           7.3          2.9           6.3          1.8  virginica
    109           7.2          3.6           6.1          2.5  virginica
    117           7.7          3.8           6.7          2.2  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    125           7.2          3.2           6.0          1.8  virginica
    129           7.2          3.0           5.8          1.6  virginica
    130           7.4          2.8           6.1          1.9  virginica
    131           7.9          3.8           6.4          2.0  virginica
    135           7.7          3.0           6.1          2.3  virginica
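To make the NumPy comparison concrete, here is a small sketch on a handful of made-up values (standing in for a sepal_length column; they are an assumption, not read from iris.csv). The same comparison and the same bracket syntax work on both an ndarray and a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values standing in for a sepal_length column.
values = [5.1, 7.1, 4.9, 7.6]

arr = np.array(values)
ser = pd.Series(values)

# The comparison yields a boolean array / Series of the same length,
# and indexing with it keeps only the positions that are True.
arr_selected = arr[arr > 7]
ser_selected = ser[ser > 7]

print(arr_selected)   # plain values, no labels
print(ser_selected)   # values together with their original index labels
```

The one visible difference is that the pandas result keeps the index labels of the selected rows, which is why the output tables in this article show labels such as 102 and 105 rather than 0, 1, 2.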
    
    iris.loc[iris.sepal_length>7]
    
    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    102           7.1          3.0           5.9          2.1  virginica
    105           7.6          3.0           6.6          2.1  virginica
    107           7.3          2.9           6.3          1.8  virginica
    109           7.2          3.6           6.1          2.5  virginica
    117           7.7          3.8           6.7          2.2  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    125           7.2          3.2           6.0          1.8  virginica
    129           7.2          3.0           5.8          1.6  virginica
    130           7.4          2.8           6.1          1.9  virginica
    131           7.9          3.8           6.4          2.0  virginica
    135           7.7          3.0           6.1          2.3  virginica
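The two spellings above return the same rows. A short sketch on a three-row stand-in table (the values are assumptions chosen for illustration) checks that, and also shows one thing .loc can do that plain brackets cannot: take a column selection alongside the mask.

```python
import pandas as pd

# Hypothetical three-row stand-in for the iris table.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 7.1, 7.6],
        "sepal_width": [3.5, 3.0, 3.0],
        "species": ["setosa", "virginica", "virginica"],
    }
)

mask = df.sepal_length > 7

# Both forms select the same rows.
same_rows = df[mask].equals(df.loc[mask])

# .loc also accepts a column selection as a second argument.
subset = df.loc[mask, ["sepal_length", "species"]]

print(same_rows)
print(subset)
```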
    

    & (and)

    iris[(iris.sepal_length>7) & (iris.sepal_width<3)]
    
    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    107           7.3          2.9           6.3          1.8  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    130           7.4          2.8           6.1          1.9  virginica
    
    iris.loc[(iris.sepal_length>7) & (iris.sepal_width<3)]
    
    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    107           7.3          2.9           6.3          1.8  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    130           7.4          2.8           6.1          1.9  virginica
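When conditions get longer, it can help to name each mask before combining them. The sketch below uses four made-up rows (an assumption, not the real iris rows) chosen so that each condition matches a different subset, which makes the effect of & easy to follow.

```python
import pandas as pd

# Hypothetical rows chosen so that each condition matches a different subset.
df = pd.DataFrame(
    {
        "sepal_length": [7.1, 7.3, 7.7, 5.0],
        "sepal_width": [3.0, 2.9, 2.6, 2.5],
    }
)

long_sepal = df.sepal_length > 7     # True for rows 0, 1, 2
narrow_sepal = df.sepal_width < 3    # True for rows 1, 2, 3

# & keeps a row only when both masks are True at that position.
both = df[long_sepal & narrow_sepal]
print(both)
```

Because each mask is already a boolean Series, no extra parentheses are needed once the comparisons have been assigned to names.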
    

    | (or)

    iris[(iris.sepal_length>7) | (iris.sepal_width>4)]

    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    15            5.7          4.4           1.5          0.4     setosa
    32            5.2          4.1           1.5          0.1     setosa
    33            5.5          4.2           1.4          0.2     setosa
    102           7.1          3.0           5.9          2.1  virginica
    105           7.6          3.0           6.6          2.1  virginica
    107           7.3          2.9           6.3          1.8  virginica
    109           7.2          3.6           6.1          2.5  virginica
    117           7.7          3.8           6.7          2.2  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    125           7.2          3.2           6.0          1.8  virginica
    129           7.2          3.0           5.8          1.6  virginica
    130           7.4          2.8           6.1          1.9  virginica
    131           7.9          3.8           6.4          2.0  virginica
    135           7.7          3.0           6.1          2.3  virginica

    
    iris.loc[(iris.sepal_length>7) | (iris.sepal_width>4)]
    
    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    15            5.7          4.4           1.5          0.4     setosa
    32            5.2          4.1           1.5          0.1     setosa
    33            5.5          4.2           1.4          0.2     setosa
    102           7.1          3.0           5.9          2.1  virginica
    105           7.6          3.0           6.6          2.1  virginica
    107           7.3          2.9           6.3          1.8  virginica
    109           7.2          3.6           6.1          2.5  virginica
    117           7.7          3.8           6.7          2.2  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    125           7.2          3.2           6.0          1.8  virginica
    129           7.2          3.0           5.8          1.6  virginica
    130           7.4          2.8           6.1          1.9  virginica
    131           7.9          3.8           6.4          2.0  virginica
    135           7.7          3.0           6.1          2.3  virginica
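The | result is the union of what each condition selects on its own, which is why the setosa rows 15, 32 and 33 appear next to the virginica rows above. A sketch on four made-up rows (assumed values, not taken from iris.csv) checks this directly.

```python
import pandas as pd

# Hypothetical rows: row 0 is wide, rows 1 and 3 are long, row 2 is neither.
df = pd.DataFrame(
    {
        "sepal_length": [5.7, 7.1, 5.0, 7.7],
        "sepal_width": [4.4, 3.0, 3.4, 3.8],
    }
)

long_sepal = df.sepal_length > 7   # True for rows 1 and 3
wide_sepal = df.sepal_width > 4    # True for row 0

# | keeps a row when at least one mask is True at that position.
either = df[long_sepal | wide_sepal]

# The selected labels are the union of the labels each mask selects alone.
union = sorted(set(df[long_sepal].index) | set(df[wide_sepal].index))

print(either)
print(union)
```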
    

    ~ (not)

    iris[~(iris.sepal_length<=7)]
    
    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    102           7.1          3.0           5.9          2.1  virginica
    105           7.6          3.0           6.6          2.1  virginica
    107           7.3          2.9           6.3          1.8  virginica
    109           7.2          3.6           6.1          2.5  virginica
    117           7.7          3.8           6.7          2.2  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    125           7.2          3.2           6.0          1.8  virginica
    129           7.2          3.0           5.8          1.6  virginica
    130           7.4          2.8           6.1          1.9  virginica
    131           7.9          3.8           6.4          2.0  virginica
    135           7.7          3.0           6.1          2.3  virginica
    

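    Using "-" in place of "~" gave the same result here, but "~" is the safer choice: NumPy raises a TypeError when "-" is applied to a boolean array.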

    iris[-(iris.sepal_length<=7)]
    
    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    102           7.1          3.0           5.9          2.1  virginica
    105           7.6          3.0           6.6          2.1  virginica
    107           7.3          2.9           6.3          1.8  virginica
    109           7.2          3.6           6.1          2.5  virginica
    117           7.7          3.8           6.7          2.2  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    125           7.2          3.2           6.0          1.8  virginica
    129           7.2          3.0           5.8          1.6  virginica
    130           7.4          2.8           6.1          1.9  virginica
    131           7.9          3.8           6.4          2.0  virginica
    135           7.7          3.0           6.1          2.3  virginica
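The snippet above relies on pandas accepting - on a boolean Series. A small sketch, using a made-up three-element mask rather than the iris data, of how plain NumPy treats the same operator and of the two spellings that work in both libraries:

```python
import numpy as np
import pandas as pd

# Hypothetical mask, for illustration only.
mask = np.array([True, False, True])

# Recent NumPy versions raise TypeError for unary minus on a boolean array.
try:
    -mask
    minus_supported = True
except TypeError:
    minus_supported = False

# ~ and np.logical_not both flip every element.
inverted = ~mask
also_inverted = np.logical_not(mask)

# The same ~ operator works on a boolean pandas Series.
ser_inverted = ~pd.Series(mask)

print(minus_supported)
print(inverted, also_inverted, ser_inverted.tolist())
```

Sticking to ~ keeps the same code valid whether the mask is a Series or an ndarray.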
    
    iris.loc[~(iris.sepal_length<=7)]
    
    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    102           7.1          3.0           5.9          2.1  virginica
    105           7.6          3.0           6.6          2.1  virginica
    107           7.3          2.9           6.3          1.8  virginica
    109           7.2          3.6           6.1          2.5  virginica
    117           7.7          3.8           6.7          2.2  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    125           7.2          3.2           6.0          1.8  virginica
    129           7.2          3.0           5.8          1.6  virginica
    130           7.4          2.8           6.1          1.9  virginica
    131           7.9          3.8           6.4          2.0  virginica
    135           7.7          3.0           6.1          2.3  virginica
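~ can also negate a whole compound condition, as long as that condition is wrapped in its own pair of parentheses. The sketch below uses four made-up rows (assumed values, with no missing data) and compares the negated form with the equivalent condition written out by hand.

```python
import pandas as pd

# Hypothetical rows without missing values.
df = pd.DataFrame(
    {
        "sepal_length": [5.7, 7.1, 5.0, 7.7],
        "sepal_width": [4.4, 3.0, 3.4, 3.8],
    }
)

# Negate a whole compound condition by wrapping it in parentheses first.
neither = df[~((df.sepal_length > 7) | (df.sepal_width > 4))]

# For data without missing values this matches the rewritten condition.
rewritten = df[(df.sepal_length <= 7) & (df.sepal_width <= 4)]

print(neither)
print(neither.equals(rewritten))
```

The two forms can differ when a column contains NaN, because every comparison with NaN is False, so the check here is only claimed for complete data.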
    

    Using map()

    criterion = iris.sepal_length.map(lambda s: s>7.5)
    iris[criterion]
    
    out:
         sepal_length  sepal_width  petal_length  petal_width    species
    105           7.6          3.0           6.6          2.1  virginica
    117           7.7          3.8           6.7          2.2  virginica
    118           7.7          2.6           6.9          2.3  virginica
    122           7.7          2.8           6.7          2.0  virginica
    131           7.9          3.8           6.4          2.0  virginica
    135           7.7          3.0           6.1          2.3  virginica
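For a simple threshold like this one, map() and a direct comparison build the same mask. A sketch on a made-up four-value Series (an assumption standing in for iris.sepal_length):

```python
import pandas as pd

# Hypothetical values standing in for the sepal_length column.
sepal_length = pd.Series([7.6, 7.1, 7.9, 5.0])

# map() calls the function once per element and collects the results.
criterion = sepal_length.map(lambda s: s > 7.5)

# The vectorised comparison produces the same boolean Series.
vectorised = sepal_length > 7.5

print(criterion.tolist())
print(criterion.equals(vectorised))
```

map() is mainly useful when the test cannot be written as a vectorised expression, for example when it calls an arbitrary Python function on each value.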
    
