Python 强化训练:第九篇

作者: 谢小路 | 来源:发表于2016-11-12 16:29 被阅读258次

主题

数据处理

  1. csv文件
  2. json文件
  3. xml: xpath
  4. excel

1.

  1. CSV: 逗号分隔值,其文件以纯文本形式存储表格数据(数字和文本)。
  2. 模块:csv
  3. 方法:csv.reader(), csv.writer(), csv.Dictreader(), csv.writerow(), csv.writerows()
import csv
headers = ['Symbol', 'Price', 'Date', 'Time', 'Change', 'Volume']
rows = [('AA', 39.48, '6/11/2007', '9:36am', -0.18, 181800),
         ('AIG', 71.38, '6/11/2007', '9:36am', -0.15, 195500),
         ('AXP', 62.58, '6/11/2007', '9:36am', -0.46, 935000),
       ]

with open('name.csv', newline="") as f:
    f_csv = csv.reader(f)
    headers = next(f_csv)
    print(headers)
    print("=====")
    for row in f_csv:
        print(row)
        print("===")


写入文件形式:

1478869402821.png

要求:将name.csv文件中Volume的值大于195500的数据写入name_copy.csv文件中.

import codecs
import csv

with codecs.open("name_copy.csv", 'w') as f_name_copy:
    f_name_one = csv.writer(f_name_copy)
    with codecs.open("name.csv", 'r') as f_name:
        f_name_two = csv.reader(f_name)
        headers = next(f_name_two)
        f_name_one.writerow(headers)
        for one in f_name_two:
            print(one)
            if int(one[5]) > 195500:
                f_name_one.writerow(one)

文件显示:


1478869756196.png

要求:获取雅虎指定股票历史数据,并存入csv文件中.

import requests
import csv
import codecs

response = requests.get('http://table.finance.yahoo.com/table.csv?s=000001.sz')
content = response.text

with codecs.open("pingan.csv", 'w') as f:
    content_all = csv.writer(f)
    for one in content.split('\n'):
        content_all.writerow(one.split(','))

Paste_Image.png

2.

python 如何处理json文件:

  1. json 模块
  2. dumps(),dump(), loads(),load()方法

import json
import codecs
# json.dumps()
# json.loads()
# json.dump()  # 接口是一个文件
# json.load()  # 接口是一个文件

one = {"wuhan": 10, "beijing": 1, "changsha": 6}
two = [1, 2, "apple", 'chuizi', {"a": 1, "b": 2}]
one_json = json.dumps(one, separators=[",  ", ":  "], indent=4)
one_1_json = json.dumps(one, sort_keys=True)
two_json = json.dumps(two, separators=[",", ":"])
print(one_json)
print(one_1_json)
print(two_json)

with codecs.open("one.json", 'w') as f:
     json.dump(one, f)

with codecs.open("one.json", 'r') as f:
    print(json.load(f))


转换对照表:

python json
dict object
list,tuple array
str,unicode string
int,long,float number
True true
False false
None null
print(json.dumps(None))
print(json.dumps(True))
print(json.dumps(False))

print(json.loads("null"))
print(json.loads("true"))
print(json.loads("false"))

# with codecs.open("one.json", 'w') as f:
#     json.dump(one, f)

with codecs.open("one.json", 'r') as f:
    print(json.load(f))


res = requests.get("http://www.weather.com.cn/data/cityinfo/101010100.html")
with codecs.open("weather.json", 'w', encoding="utf8") as f_wea:
    json.dump(res.text, f_wea)

with codecs.open("weather.json", 'r') as f_wea_r:
    A = json.load(f_wea_r)

print(A)


3.

xpath语法:

Syntax Meaning
tag Selects all child elements with the given tag.
* Selects all child elements.
. Selects the current node.
// Selects all subelements, on all levels beneath the current element.
.. Selects the parent element.
[@atrrib] Selects all elements that have the given attribute.
[@atrrib='value'] Selects all elements for which the given attribute has the given value.
[tag] Selects all elements that have a child named tag.
[tag="text"] Selects all elements that have a child named tag whose complete text content, including descendants, equals the given text.
[position] Selects all elements that are located at the given position.
from xml.etree.ElementTree import parse
import requests
import codecs
tree = parse("html.xml")
root = tree.getroot()
print(root.tag)
print(root.attrib)
for child in root:
    print(child.tag, child.attrib)

# tag: 查找给定标签的子节点
print(root.findall('country'))

# *:查找所有子节点
print(root.findall("*"))

# . : 查找当前节点
print(root.findall("."))

# // :所有子孙节点
print(root.findall('.//'))

# .. : 父节点
print(root.findall('.//rank/..'))

# [@atrrib] :带有这个属性值的元素
print(root.findall('.//country[@name]'))

# [@atrrib=“value”]
print(root.findall('.//country[@name="Liechtenstein"]'))

# [tag] : 带有tag子节点的节点
print(root.findall('[country]'))

4.

  1. 模块: xlrd, xlwt
  2. 功能: 负责读写操作

book.xlsx文件内容和结构:

1478938867731.png
import xlrd
import xlwt
name = xlrd.open_workbook('book.xlsx')
sheet = name.sheets()
for one in sheet:
    print(one.name)
result = name.sheet_by_name('result')
print(result.nrows, result.ncols)
one_one = result.cell(0, 0)
one_two = result.cell(0, 1)
one_three = result.cell(0, 2)
one_four = result.cell(0, 3)

# 1: text  2: number
print(one_one.ctype, one_one.value)
print(one_two.ctype, one_two.value)
print(one_three.ctype, one_three.value)
print(one_four.ctype, one_four.value)

print(result.row(1))
print(result.row_values(1))
print(result.row_values(1, 1))
print(result.col(1))
print(result.col_values(1))
print(result.col_values(1, 1))

result.put_cell(0, result.ncols, xlrd.XL_CELL_TEXT, u"总分", None)
for row in range(1, result.nrows):
    t = sum(result.row_values(row, 1))
    print(t)
    result.put_cell(row, result.ncols, xlrd.XL_CELL_NUMBER, t, None)

wbook = xlwt.Workbook()
wsheet = wbook.add_sheet("Sheet1")

style = xlwt.easyxf("align: vertical center, horizontal center")
value = [["名称", "hadoop编程实战", "hbase编程实战", "lucene编程实战"], ["价格", "52.3", "45", "36"], ["出版社", "机械工业出版社", "人民邮电出版社", "华夏人民出版社"], ["中文版式", "中", "英", "英"]]
for i in range(0, 4):
    for j in range(0, len(value[i])):
        wsheet.write(i, j, value[i][j], style)
wbook.save("wbook1.xls")

friend = name.sheet_by_index(1)
friend_copy = name.sheet_by_name("friend")

print(friend.nrows, friend.ncols)
print(friend_copy.nrows, friend_copy.ncols)



相关文章

  • Python 强化训练:第九篇

    主题 数据处理 csv文件 json文件 xml: xpath excel 1. CSV: 逗号分隔值,其文件以纯...

  • Python进阶-简单数据结构

    本文为《爬着学Python》系列第九篇文章。 从现在开始算是要进入“真刀真枪”的Python学习了。之所以这么说,...

  • Python 强化训练:第一篇

    强化训练: 第一篇 目标 0. 校招算是结束了吧!简单回顾几句: 校招python岗位极少,多是初创型公司对pyt...

  • 外行学 Python 第十一篇 数据可视化

    在 外行学 Python 爬虫 第九篇 读取数据库中的数据 中完成了使用 API 从数据库中读取所需要的数据,但是...

  • Python 强化训练:第十一篇

    主题 正常情况下,程序的运行按顺序执行,但是涉及某些操作,等待结果完成却是非常耗时的操作,比如爬虫进行IO操作等,...

  • Python 强化训练:第六篇

    强化训练:第六篇 1. 深浅拷贝:是否是同一个对象,使用id判断是否指向同一个对象, 深浅拷贝,引用区分可变对象和...

  • Python 强化训练:第五篇

    强化训练:第五篇 主题:函数 基本定义方法 任意数量参数 只接受关键字参数 显示数据类型 默认参数 匿名函数 N个...

  • Python 强化训练:第四篇

    强化训练:第四篇 问题来源 面向对象的语言的重要特性是存在类的概念 内容 新式类和旧式类 定义类的属性和“访问权限...

  • Python 强化训练:第三篇

    强化训练:第三篇 问题来源 pythoner面试经常会问到迭代器和生成器的区别 内容 可迭代对象 迭代器:正向迭代...

  • Python 强化训练:第二篇

    强化训练:第二篇 摘要:心好累. 问题来源 爬虫中会经常会遇到字符串的处理 主要内容 拆分字符串 字符串开头结尾 ...

网友评论

本文标题:Python 强化训练:第九篇

本文链接:https://www.haomeiwen.com/subject/nmeqpttx.html