Week3_Clean/Filter Data and Make

Author: Li_Tang | Published 2017-01-13 15:19

After retrieving web data and storing it in MongoDB (via pymongo), the next steps are to clean the data into a consistent format, filter it using a "pipeline", and plot the results using the "chart" module. All the code was run in Jupyter Notebook.

  1. Create a new collection, transfer the retrieved data (.json format) into it, and make a copy of that collection using either the mongo shell or cmd:
hw3_1.png hw3_2.png
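The copy step can also be done from Python instead of the shell. The following is only a minimal sketch, not the method shown in the screenshots: `copy_collection` is a hypothetical helper, and it relies solely on the `find()` and `insert_many()` methods of a pymongo collection. The database and collection names in the usage comment are assumptions.

```python
def copy_collection(source, target, batch_size=500):
    """Copy every document from `source` into `target`; return the number copied.

    `source` and `target` are expected to behave like pymongo collections
    (only find() and insert_many() are used).
    """
    copied = 0
    batch = []
    for doc in source.find():
        batch.append(doc)
        if len(batch) >= batch_size:
            target.insert_many(batch)
            copied += len(batch)
            batch = []
    if batch:  # insert_many rejects an empty list, so only flush a non-empty batch
        target.insert_many(batch)
        copied += len(batch)
    return copied


# Usage sketch (needs pymongo and a running MongoDB server; names are hypothetical):
# from pymongo import MongoClient
# db = MongoClient('localhost', 27017)['my_db']
# copy_collection(db['item_info'], db['item_info_copy'])
```

Inserting in batches keeps memory use bounded for large collections. The copy keeps the original `_id` values, so the target collection should be empty before running it.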
  1. The code that shows the top 3 posted categories in one selected zone is linked below:
    https://anaconda.org/tangli666/week3_hw_v2/notebook
hw3_3.png
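For readers who cannot open the notebook, here is a rough sketch of what such a query might look like, assuming "pipeline" refers to MongoDB's aggregation pipeline. It is not the code from the linked notebook. The field names `area` and `category` and the helper `top_categories_pipeline` are hypothetical, since the actual document schema is not shown here.

```python
def top_categories_pipeline(zone, limit=3, zone_field='area', category_field='category'):
    """Build an aggregation pipeline: most frequent categories within one zone.

    The field names are assumptions; adjust them to the real document schema.
    """
    return [
        {'$match': {zone_field: zone}},                                   # keep one zone
        {'$group': {'_id': '$' + category_field, 'count': {'$sum': 1}}},  # count per category
        {'$sort': {'count': -1}},                                         # most frequent first
        {'$limit': limit},                                                # keep the top N
    ]


# Usage sketch (item_info is a pymongo collection; the zone value is hypothetical):
# for row in item_info.aggregate(top_categories_pipeline('some zone')):
#     print(row['_id'], row['count'])
```

Building the pipeline in a function keeps the zone and the limit as parameters, so the same query can be reused for other zones.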
  1. The code that shows the relationship between the item condition and the average price is linked below:
    https://anaconda.org/tangli666/week3_hw_v10/notebook
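    Note: to filter and format the 'price' field, each value was converted to an integer and the result written back to the new collection:
    """
    for i in item_info.find():
        try:
            # keep the text before the first space and convert it to an integer
            price = int(i['price'].split(' ')[0])
        except ValueError:
            # prices that are not numeric are stored as 0
            price = 0
        # update_one replaces the deprecated Collection.update
        item_info.update_one({'_id': i['_id']}, {'$set': {'price': price}})
    """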
hw3_4.png
  1. Finally, the command line used to export the data collection to a CSV file:
hw3_5.png
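The same export can be approximated in Python with the standard `csv` module. This is an alternative to the command-line export in the screenshot, not a reproduction of it. `write_csv` is a hypothetical helper, and the column names in the usage comment are assumptions.

```python
import csv


def write_csv(documents, fields, out):
    """Write the chosen fields of each document to `out` as CSV; return the row count.

    Keys not listed in `fields` (for example '_id') are ignored, and missing
    fields are written as empty cells.
    """
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction='ignore', restval='')
    writer.writeheader()
    count = 0
    for doc in documents:
        writer.writerow(doc)
        count += 1
    return count


# Usage sketch (item_info is a pymongo collection; column names are hypothetical):
# with open('item_info.csv', 'w', newline='', encoding='utf-8') as f:
#     write_csv(item_info.find(), ['title', 'price'], f)
```

Opening the file with `newline=''` is what the `csv` module documentation recommends, so that line endings are not translated twice.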
