Python practice: pandas exercises - Occupation

Author: 鲸鱼酱375 | Published 2019-06-11 05:43

    Exercise source: GitHub

    1. Read the data from the URL, assign it to a variable called users, and use 'user_id' as the index

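    import pandas as pd
    import numpy as np
    import io
    import requests

    # Download the raw pipe-delimited file as bytes
    link = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/u.user'
    s = requests.get(link).content
    # Decode to text and wrap it in StringIO so read_csv gets a file-like object
    users = pd.read_csv(io.StringIO(s.decode('utf-8')), sep='|', index_col='user_id')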
    

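    pandas.read_csv expects a path, a URL, or a file-like object as its first argument, not the raw file contents.
    Because requests returns the contents as bytes, they are decoded to text and wrapped with io.StringIO to produce a file-like object.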
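
The conversion can be tried offline with a minimal sketch like the one below. It assumes a few invented rows that only mimic the pipe-delimited layout of u.user; the values and the names raw_bytes and sample_users are illustrative and not part of the original exercise.

```python
import io

import pandas as pd

# Hypothetical sample rows in a pipe-delimited layout (invented values,
# not taken from the real dataset).
raw_bytes = (
    b"user_id|age|gender|occupation|zip_code\n"
    b"1|31|F|engineer|10001\n"
    b"2|45|M|writer|20002\n"
    b"3|19|F|student|30003\n"
)

# Decode the bytes to str, then wrap the text in StringIO so that
# read_csv receives a file-like object instead of raw text.
sample_users = pd.read_csv(
    io.StringIO(raw_bytes.decode("utf-8")),
    sep="|",
    index_col="user_id",
)

print(sample_users)
```

read_csv also accepts a URL string directly, so passing link as the first argument (with the same sep and index_col) is a shorter alternative when no custom request handling is needed.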

    2. See the first 25 entries and the last 10 entries

    users.head(25)
    users.tail(10)
    

    3. What is the number of observations in the dataset?

    users.shape[0]  # shape is (rows, columns); the first element is the number of observations
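
Since shape is a (rows, columns) tuple, the observation count is its first element and the column count its second. A small sketch on a hypothetical three-row frame (the values are invented for illustration):

```python
import pandas as pd

# Hypothetical miniature stand-in for the users table.
demo = pd.DataFrame(
    {"age": [31, 45, 19], "occupation": ["engineer", "writer", "student"]},
    index=pd.Index([1, 2, 3], name="user_id"),
)

# Unpack the tuple: rows first, columns second.
n_rows, n_cols = demo.shape

print(n_rows)     # number of observations
print(n_cols)     # number of columns (the index is not counted)
print(len(demo))  # len() of a DataFrame is also its row count
```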
    

    4. What is the number of columns in the dataset?

    users.shape[1]
    

    5. What is the data type of each column?

    users.dtypes  # matches the reference solution
    

    6. Print only the occupation column

    users['occupation']
    

    7. How many different occupations are there in this dataset?

    users['occupation'].nunique()
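
One detail worth knowing: nunique() ignores missing values by default. The sketch below uses an invented Series with one missing entry to show the difference; whether the real dataset has missing occupations is not checked here.

```python
import pandas as pd

# Hypothetical occupation values with one missing entry.
occupations = pd.Series(["student", "writer", "student", None, "other"])

distinct = occupations.nunique()                        # missing value excluded
distinct_with_na = occupations.nunique(dropna=False)    # missing value counted

print(distinct, distinct_with_na)
```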
    

    8. What is the most frequent occupation?

    users['occupation'].value_counts().head(1)  # counts are sorted in descending order, so the first row is the most frequent
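
If only the label is wanted rather than the whole table of counts, idxmax() on the counts returns it directly. A minimal sketch on invented data:

```python
import pandas as pd

# Hypothetical occupation column (invented values).
occupations = pd.Series(
    ["student", "writer", "student", "other", "student", "writer"],
    name="occupation",
)

# value_counts() sorts by count in descending order.
counts = occupations.value_counts()

# idxmax() returns the index label that holds the largest count.
most_common = counts.idxmax()

print(most_common, counts.max())
```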
    

    9. Summarize all the columns

    users.describe(include = "all")
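
By default describe() summarizes only the numeric columns; include="all" adds the non-numeric ones, which are reported with count, unique, top and freq. The sketch below shows the difference on a hypothetical two-column frame with invented values:

```python
import pandas as pd

# Hypothetical frame with one numeric and one text column.
demo = pd.DataFrame(
    {
        "age": [31, 45, 19, 31],
        "occupation": ["student", "other", "student", "writer"],
    }
)

numeric_only = demo.describe()             # numeric columns only
everything = demo.describe(include="all")  # numeric and text columns

print(list(numeric_only.columns))
print(everything.loc[["count", "unique", "top", "freq"], "occupation"])
```

Statistics that do not apply to a column (for example mean for occupation) appear as NaN in the combined table.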
    

    10. Summarize only the occupation column

    users.occupation.describe()
    

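    11. What is the age with least occurrence?

    users['age'].value_counts().sort_values()  # ascending order: the least frequent ages come first

    users.age.value_counts().tail()  # the reference solution's approach: the least frequent ages come last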
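
Several ages can share the lowest count, and tail() only shows the last five rows. One possible way to list every tied age is to filter the counts by their minimum, sketched here on invented data:

```python
import pandas as pd

# Hypothetical age column (invented values) with two ages tied for rarest.
ages = pd.Series([20, 20, 20, 35, 35, 7, 73], name="age")

counts = ages.value_counts()

# Keep every age whose count equals the smallest count.
rarest = counts[counts == counts.min()]
rarest_ages = sorted(rarest.index.tolist())

print(rarest_ages)
```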
    

    The full reference solutions are on GitHub.

    reference:
    http://landcareweb.com/questions/6234/lai-zi-urlde-pandas-read-csv
