Simple input and output exercise
>>> name = input('your name:')
your name:Jack
>>> gender = input('you are a boy?(y/n)')
you are a boy?(y/n)y
>>> name
'Jack'
>>> gender
'y'
>>> welcome_str = 'Welcome to the matrix {prefix} {name}.'
>>> welcome_dic = {
...     'prefix': 'Mr.' if gender == 'y' else 'Mrs.',
...     'name': name
... }
>>> print('authorizing...')
authorizing...
>>> print(welcome_str.format(**welcome_dic))
Welcome to the matrix Mr. Jack.
The input() function always returns a string, as the following example shows.
>>> a = input()
1
>>> b = input()
2
>>>
>>> print('a + b = {}'.format(a + b))
a + b = 12
>>> print('type of a is {}, type of b is {}'.format(type(a), type(b)))
type of a is <class 'str'>, type of b is <class 'str'>
>>> print('a + b = {}'.format(int(a) + int(b)))
a + b = 3
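Because input() always returns a str, numeric input has to be converted explicitly, and that conversion fails on non-numeric text. Below is a minimal sketch of a tolerant conversion. The helper name `to_int` and its `default` parameter are hypothetical, introduced only for illustration.

```python
def to_int(raw, default=None):
    """Convert text (as returned by input()) to an int.

    Hypothetical helper: returns `default` when the text is not a valid integer.
    """
    try:
        # int() accepts surrounding whitespace, e.g. ' 42 '
        return int(raw)
    except ValueError:
        return default


a, b = '1', '2'                # what input() returns when the user types 1 and 2
print(a + b)                   # string concatenation: 12
print(to_int(a) + to_int(b))   # integer addition: 3
print(to_int('abc'))           # not a number: None
```

Returning a default instead of raising keeps the calling code short; whether that is preferable to letting ValueError propagate depends on how the program should react to bad input.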
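File input and output
Python programs often involve I/O, in particular reading and writing files. Below is a simple example that does the following:
- Read a file
- Remove all punctuation and newlines, and convert uppercase to lowercase
- Merge identical words, count how often each word occurs, and sort by frequency from high to low
- Save the result line by line to the file out.txt
The file to process, in.txt, has the following content: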
I have a dream that my four little children will one day live in a nation where they will not be judged by the color of their skin but by the content of their character. I have a dream today.
I have a dream that one day down in Alabama, with its vicious racists, . . . one day right there in Alabama little black boys and black girls will be able to join hands with little white boys and white girls as sisters and brothers. I have a dream today.
I have a dream that one day every valley shall be exalted, every hill and mountain shall be made low, the rough places will be made plain, and the crooked places will be made straight, and the glory of the Lord shall be revealed, and all flesh shall see it together.
This is our hope. . . With this faith we will be able to hew out of the mountain of despair a stone of hope. With this faith we will be able to transform the jangling discords of our nation into a beautiful symphony of brotherhood. With this faith we will be able to work together, to pray together, to struggle together, to go to jail together, to stand up for freedom together, knowing that we will be free one day. . . .
And when this happens, and when we allow freedom ring, when we let it ring from every village and every hamlet, from every state and every city, we will be able to speed up that day when all of God's children, black men and white men, Jews and Gentiles, Protestants and Catholics, will be able to join hands and sing in the words of the old Negro spiritual: "Free at last! Free at last! Thank God Almighty, we are free at last!"
The processing script is shown below.
# more input_out.py
import re


def parse(text):
    # Replace punctuation and newlines with spaces using a regular expression
    text = re.sub(r'[^\w]', ' ', text)
    # Convert to lowercase
    text = text.lower()
    # Build the list of all words
    word_list = text.split(' ')
    # Drop the empty strings left between consecutive spaces
    word_list = filter(None, word_list)
    # Build a dictionary mapping each word to its frequency
    word_cnt = {}
    for word in word_list:
        if word not in word_cnt:
            word_cnt[word] = 0
        word_cnt[word] += 1
    # Sort by frequency, highest first
    sorted_word_cnt = sorted(word_cnt.items(), key=lambda kv: kv[1], reverse=True)
    return sorted_word_cnt


with open('in.txt', 'r') as fin:
    text = fin.read()

word_and_freq = parse(text)

with open('out.txt', 'w') as fout:
    for word, freq in word_and_freq:
        fout.write('{} {}\n'.format(word, freq))
# more out.txt
and 15
be 13
will 11
to 11
the 10
of 10
a 8
we 8
day 6
able 6
every 6
together 6
i 5
have 5
dream 5
that 5
one 5
with 5
this 5
in 4
shall 4
free 4
when 4
little 3
black 3
white 3
made 3
faith 3
at 3
last 3
children 2
nation 2
by 2
their 2
today 2
alabama 2
boys 2
girls 2
join 2
hands 2
mountain 2
places 2
all 2
it 2
our 2
hope 2
up 2
freedom 2
ring 2
from 2
god 2
men 2
my 1
four 1
live 1
where 1
they 1
not 1
judged 1
color 1
skin 1
but 1
content 1
character 1
down 1
its 1
vicious 1
racists 1
right 1
there 1
as 1
sisters 1
brothers 1
valley 1
exalted 1
hill 1
low 1
rough 1
plain 1
crooked 1
straight 1
glory 1
lord 1
revealed 1
flesh 1
see 1
is 1
hew 1
out 1
despair 1
stone 1
transform 1
jangling 1
discords 1
into 1
beautiful 1
symphony 1
brotherhood 1
work 1
pray 1
struggle 1
go 1
jail 1
stand 1
for 1
knowing 1
happens 1
allow 1
let 1
village 1
hamlet 1
state 1
city 1
speed 1
s 1
jews 1
gentiles 1
protestants 1
catholics 1
sing 1
words 1
old 1
negro 1
spiritual 1
thank 1
almighty 1
are 1
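The counting and sorting steps of the script above can also be written with the standard library's collections.Counter. The following is a minimal sketch rather than a replacement for the original script; the function name `count_words` and the sample string are chosen here for illustration.

```python
import re
from collections import Counter


def count_words(text):
    """Return (word, count) pairs sorted by descending count."""
    # \w+ matches runs of letters, digits and underscores, so punctuation
    # and whitespace act as separators. This yields the same words as
    # replacing [^\w] with spaces and then splitting on spaces.
    words = re.findall(r'\w+', text.lower())
    # most_common() returns the pairs ordered from most to least frequent
    return Counter(words).most_common()


sample = "Free at last! Free at last! Thank God Almighty, we are free at last!"
for word, freq in count_words(sample):
    print('{} {}'.format(word, freq))
```

Note that, as in the original script, an apostrophe is treated as a separator, which is why "God's" contributes a separate `s` entry to out.txt.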
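JSON serialization
JSON serialization is commonly used in the following two scenarios:
- First, take assorted information as input, such as a Python dictionary, and output a string;
- Second, take that string as input and output a Python dictionary containing the original information.
Below is a basic code example.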
# cat json_2.py
import json

params = {
    'symbol': '123456',
    'type': 'limit',
    'price': 123.4,
    'amount': 23
}

with open('params.json', 'w') as fout:
    # json.dump() writes to the file and returns None, so nothing is assigned
    json.dump(params, fout)

with open('params.json', 'r') as fin:
    original_params = json.load(fin)

print('after json deserialization')
print('type of original_params = {}, original_params = {}'.format(type(original_params), original_params))
# python json_2.py
after json deserialization
type of original_params = <class 'dict'>, original_params = {'symbol': '123456', 'type': 'limit', 'price': 123.4, 'amount': 23}
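The script above uses json.dump() and json.load(), which write to and read from a file object. Their string counterparts work the same way in memory: json.dumps() accepts Python's basic data types and serializes them to a string;
json.loads() accepts a valid JSON string and deserializes it back into Python's basic data types.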
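A minimal sketch of the in-memory round trip with json.dumps() and json.loads(), reusing the same `params` dictionary as the script above. The variable name `restored` is chosen here for illustration.

```python
import json

params = {
    'symbol': '123456',
    'type': 'limit',
    'price': 123.4,
    'amount': 23
}

params_str = json.dumps(params)     # dict -> str
restored = json.loads(params_str)   # str -> dict

print(type(params_str))             # <class 'str'>
print(restored == params)           # True

# json.loads() only accepts valid JSON; anything else raises JSONDecodeError
try:
    json.loads("{'symbol': '123456'}")   # single quotes are not valid JSON
except json.JSONDecodeError as err:
    print('invalid JSON:', err)
```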
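Advanced exercises
Can you implement the word count from the NLP example above again? This time in.txt may be extremely large (so it cannot be read into memory in one go),
while the output will not be very large (meaning many words are repeated).
Reference code is shown below.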
# cat input_1.py
import re

# Maximum number of characters to read at a time
CHUNK_SIZE = 100


# Each call receives the last_word left over from the previous call and
# processes it together with the current text.
# After merging, the final word may continue in the next chunk, so it is
# split off and returned.
# This function needs no if statement and is still correct.
def parse_to_word_list(text, last_word, word_list):
    print("text is:", text, "last_word is:", last_word)
    text = re.sub(r'[^\w]', ' ', last_word + text)
    text = text.lower()
    cur_word_list = text.split(' ')
    cur_word_list, last_word = cur_word_list[:-1], cur_word_list[-1]
    word_list += filter(None, cur_word_list)
    print("after processing:", "text is:", text, "last_word is:", last_word)
    print("word_list is:", word_list)
    return last_word


def solve():
    with open('in.txt', 'r') as fin:
        word_list, last_word = [], ''
        while True:
            text = fin.read(CHUNK_SIZE)
            if not text:
                break  # Finished reading; exit the loop
            last_word = parse_to_word_list(text, last_word, word_list)
    # Keep the final word when the file does not end with a separator
    if last_word:
        word_list.append(last_word)

    word_cnt = {}
    for word in word_list:
        if word not in word_cnt:
            word_cnt[word] = 0
        word_cnt[word] += 1

    sorted_word_cnt = sorted(word_cnt.items(), key=lambda kv: kv[1], reverse=True)
    return sorted_word_cnt


print(solve())
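The reference code above still collects every word in word_list before counting, so its memory use grows with the size of the input. Since the exercise states that the number of distinct words is small, one possible variant updates the counts chunk by chunk and keeps only the counter in memory. This is a sketch under that assumption, not the original author's solution; the function name `count_words_streaming` and the io.StringIO demo input are introduced here for illustration.

```python
import io
import re
from collections import Counter

CHUNK_SIZE = 100  # maximum number of characters per read


def count_words_streaming(fin, chunk_size=CHUNK_SIZE):
    """Count words from a text file object, reading chunk_size characters at a time."""
    counts = Counter()
    tail = ''  # possibly incomplete word carried over from the previous chunk
    while True:
        chunk = fin.read(chunk_size)
        if not chunk:
            break
        text = re.sub(r'[^\w]', ' ', tail + chunk).lower()
        pieces = text.split(' ')
        # The last piece may continue in the next chunk, so hold it back
        pieces, tail = pieces[:-1], pieces[-1]
        # Count the complete words now; empty strings are skipped
        counts.update(p for p in pieces if p)
    if tail:
        counts[tail] += 1  # the input ended in the middle of a word
    return counts.most_common()


# Demo with an in-memory file; a real run would pass open('in.txt', 'r')
demo = io.StringIO('Free at last! Free at last! Thank God Almighty, we are free at last')
print(count_words_streaming(demo, chunk_size=7))
```

With this structure, memory use is bounded by the chunk size plus the number of distinct words, regardless of how large the input file is.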
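At home, write no more than 5 GB of data at a time to Baidu Netdisk; at the office, as soon as new data is detected, copy it to the local disk
and then delete it from the network drive. Once the home computer detects that the data has all reached the office computer, it performs the next write, and so on until all the data has been transferred.
Write a server.py at home and a client.py at the office to implement this requirement.
Reference code is shown below.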
# cat server.py
import os
from shutil import copyfile
import time

BASE_DIR = 'server/'
NET_DIR = 'net/'


def main():
    filenames = os.listdir(BASE_DIR)
    for i, filename in enumerate(filenames):
        print('copying {} into net drive... {}/{}'.format(filename, i+1, len(filenames)))
        copyfile(BASE_DIR + filename, NET_DIR + filename)
        print('copied {} into net drive, waiting client complete... {}/{}'.format(filename, i+1, len(filenames)))
        # Wait until the client has downloaded and deleted the file
        while os.path.exists(NET_DIR + filename):
            time.sleep(3)
        print('transferred {} into client. {}/{}'.format(filename, i+1, len(filenames)))


if __name__ == "__main__":
    main()
# cat client.py
import os
from shutil import copyfile
import time

BASE_DIR = 'client/'
NET_DIR = 'net/'


def main():
    while True:
        filenames = os.listdir(NET_DIR)
        for filename in filenames:
            print('downloading {} into local disk...'.format(filename))
            copyfile(NET_DIR + filename, BASE_DIR + filename)
            os.remove(NET_DIR + filename)
            print('downloaded {} into local disk.'.format(filename))
        time.sleep(3)


if __name__ == "__main__":
    main()
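In the reference code, the client may list a file in the network folder while the server is still copying it. One common way to reduce that risk is to copy under a temporary name and rename the file once the copy is complete, with the client skipping temporary names. The sketch below illustrates the idea on local directories only. The `.part` suffix and the helper names `publish` and `ready_files` are hypothetical, and whether a synced cloud folder preserves the rename as a single step is an assumption that the original text does not address.

```python
import os
import shutil
import tempfile

PART_SUFFIX = '.part'  # hypothetical marker for files still being written


def publish(src_path, net_dir):
    """Copy src_path into net_dir under a temporary name, then rename it."""
    final_path = os.path.join(net_dir, os.path.basename(src_path))
    part_path = final_path + PART_SUFFIX
    shutil.copyfile(src_path, part_path)
    # Rename within the same directory once the copy has finished
    os.replace(part_path, final_path)
    return final_path


def ready_files(net_dir):
    """List files in net_dir that are fully written (no .part suffix)."""
    return sorted(f for f in os.listdir(net_dir) if not f.endswith(PART_SUFFIX))


# Local demonstration with temporary directories standing in for server/ and net/
with tempfile.TemporaryDirectory() as src_dir, tempfile.TemporaryDirectory() as net_dir:
    src = os.path.join(src_dir, 'data.bin')
    with open(src, 'wb') as f:
        f.write(b'hello')
    publish(src, net_dir)
    print(ready_files(net_dir))
```

In client.py, the loop over os.listdir(NET_DIR) would then iterate over ready_files(NET_DIR) instead.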