recently Python
[toc]
## 1 pycodestyle
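pycodestyle is a simple Python style checker contained in a single Python file.
- usage
```
pycodestyle --show-source --show-pep8 test.py
# --show-source: show the offending source line for each reported error
# --show-pep8: show the PEP 8 text that explains each error
```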
- example 1
```
import re  # pay attention to the trailing space on this line
# Reason: the line is followed by superfluous whitespace (e.g. a space before \n).
# Either the line ends with whitespace, or a blank line contains an indent.
```
![1.png-10.9kB][1]
- example 2
Don't use spaces around "=" in function arguments.
![2.png-16.1kB][2]
- example 3
Surround operators with a single space on either side.
![3.png-10.3kB][3]
- example 4
Don't leave superfluous whitespace; however, the last line should end with a newline.
![4.png-6.1kB][4]
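
The rules from examples 2 and 3 can be seen together in a few lines. This is a small sketch written for illustration (the function `scale` is hypothetical), not the code from the screenshots.

```python
# Hypothetical function used only to illustrate the style rules above.
def scale(value, factor=2):  # no spaces around "=" in a default argument
    result = value * factor + 1  # a single space on either side of operators
    return result


print(scale(3, factor=4))  # no spaces around "=" in a keyword argument
```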
---
## 2 how to find site-packages from the command line
Type the following command:
```
python -m site --user-site
# output: C:\Users\Administrator\AppData\Roaming\Python\Python35\site-packages
```
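
The same information is also available from inside Python. The sketch below assumes a standard CPython installation where the `site` module is loaded normally; the variable names are only illustrative.

```python
import site
import sys

# Per-user site-packages directory (the value the command above prints).
user_site = site.getusersitepackages()
print(user_site)

# Every entry of the module search path that ends with "site-packages".
site_dirs = [p for p in sys.path if p.endswith("site-packages")]
print(site_dirs)
```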
---
## 3 Set dict default value
Sometimes I need to access a new key in a dict, but a `KeyError` is raised because the key does not exist yet. I'd like the dict to return a default value when I access a new key.
- 1 Using defaultdict

defaultdict is a magic dict, because it accepts a callable as its argument. That callable returns the default value when I try to access a new key.
```
from collections import defaultdict
context = defaultdict(list)
print(context["abc"])
# Output: []
context = defaultdict(dict)
print(context["abc"])
# Output: {}
context = defaultdict(lambda: "default value")
print(context["abc"])
# Output: default value
```
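
Two common uses of this pattern are counting and grouping. The following is a minimal sketch with made-up sample data, not part of the original notes.

```python
from collections import defaultdict

words = ["puppy", "kitten", "puppy", "weasel"]  # sample data for illustration

counts = defaultdict(int)      # int() returns 0 for a missing key
by_length = defaultdict(list)  # list() returns [] for a missing key
for word in words:
    counts[word] += 1
    by_length[len(word)].append(word)

print(dict(counts))     # {'puppy': 2, 'kitten': 1, 'weasel': 1}
print(dict(by_length))  # {5: ['puppy', 'puppy'], 6: ['kitten', 'weasel']}
```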
- 2 Override `__missing__`
```
>>> from collections import defaultdict
>>> print(defaultdict.__missing__.__doc__)
__missing__(key) # Called by __getitem__ for missing key; pseudo-code:
  if self.default_factory is None: raise KeyError(key)
  self[key] = value = self.default_factory()
  return value
```
The docstring of `__missing__()` shows that when `__getitem__()` is used to access a key that does not exist, `__missing__()` is called to obtain the default value, and the key is added to the dict.
I will override `__missing__()` in a dict subclass so that it behaves like defaultdict.
```
>>> class Defaulting(dict):
...     # override __missing__
...     def __missing__(self, key):
...         self[key] = 'default'
...         return 'default'
...
>>> d = Defaulting()
>>> d
{}
>>> d['foo']
'default'
>>> d
{'foo': 'default'}
```
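
One thing `__missing__` can do that a defaultdict factory cannot is use the key itself to build the default. The class below is a hypothetical example of that idea; it also shows that only `d[key]` triggers `__missing__`, while `get()` does not.

```python
class KeyLength(dict):
    """Hypothetical dict subclass: a missing key defaults to len(key)."""

    def __missing__(self, key):
        # Unlike defaultdict's factory, __missing__ receives the key.
        value = len(key)
        self[key] = value
        return value


d = KeyLength()
print(d["foo"])      # 3
print(d.get("bar"))  # None: get() does not call __missing__
print(d)             # {'foo': 3}
```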
- 3 setdefault

`setdefault(key, default)` returns the value of `key` if it is already in the dict; otherwise it inserts `key` with `default` and returns `default`. Here is an example that counts strings.
Note that setdefault is generally slower than defaultdict for this kind of loop.
```
strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for item in strings:
    counts[item] = counts.setdefault(item, 0) + 1
```
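
setdefault is also handy for grouping values in a plain dict. The sketch below uses made-up sample data; note that the `[]` argument is built on every call, even when the key already exists.

```python
strings = ("puppy", "kitten", "puppy", "weasel")  # sample data for illustration

positions = {}
for index, item in enumerate(strings):
    # setdefault returns the existing list, or inserts and returns a new one.
    positions.setdefault(item, []).append(index)

print(positions)  # {'puppy': [0, 2], 'kitten': [1], 'weasel': [3]}
```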
- 4 fromkeys
```
name_list = ['kevin', 'robin']
context = {}.fromkeys(name_list, 9)
# the default value is None when the second argument is omitted
# {'kevin': 9, 'robin': 9}
context = dict.fromkeys([1, 2], True)
# {1: True, 2: True}
```
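
One caveat worth knowing: fromkeys stores the same object under every key, which matters when the default is mutable. A short sketch of the pitfall and a dict comprehension as one possible alternative:

```python
names = ["kevin", "robin"]

shared = dict.fromkeys(names, [])  # one list object shared by all keys
shared["kevin"].append(1)
print(shared)  # {'kevin': [1], 'robin': [1]}

separate = {name: [] for name in names}  # a fresh list per key
separate["kevin"].append(1)
print(separate)  # {'kevin': [1], 'robin': []}
```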
---
## 4 Quick remove duplicate
```
# quick way to remove duplicates
list({}.fromkeys(mylist).keys())
# Traditional: mylist = list(set(mylist))
```
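
The two approaches differ in whether the original order survives. A minimal sketch with sample data (the list contents are made up):

```python
mylist = [3, 1, 3, 2, 1]  # sample data for illustration

# dict keys keep insertion order (guaranteed since Python 3.7),
# so the first occurrence of each item stays in place.
ordered = list(dict.fromkeys(mylist))
print(ordered)  # [3, 1, 2]

# set() also removes duplicates but does not promise any order.
unordered = list(set(mylist))
print(sorted(unordered))  # [1, 2, 3]
```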
---
## 5 shallow copy
+ 1 list
```
a = [3, 2, 1]
b = a[:]
```
+ 2 dict
```
a = {'male':0, 'female': 1}
b = a.copy()
```
---
## 6 Sublime Input
> SublimeREPL is a plugin that supports interactive input. First choose the "python_input" build system (Ctrl + Shift + B); a new window will then appear in place of the command line. It looks like the following:
![repl.gif-90kB][5]
---
## 7 file encode
```
import os
path = "C:\Users\Administrator\Desktop\1"
files = os.listdir(path)
print(files)
# Output:
#     path = "C:\Users\Administrator\Desktop\1"
#           ^
# SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
```
> * The problem is with the string
**"C:\Users\Administrator\Desktop\1"**
Here, `\U` starts an eight-character Unicode escape, such as `'\U00014321'`. In this code, the escape is followed by the character 's', which is invalid.
Prefix the string with **r** (to produce a raw string).
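
A short sketch of the fix, reusing the path from the example above. Besides the raw string, doubling the backslashes or using forward slashes are two other common spellings.

```python
# Three ways to write the Windows path without triggering the \U escape.
raw = r"C:\Users\Administrator\Desktop\1"         # raw string: backslashes kept as-is
escaped = "C:\\Users\\Administrator\\Desktop\\1"  # every backslash doubled
forward = "C:/Users/Administrator/Desktop/1"      # forward slashes also work on Windows

print(raw == escaped)                     # True
print(raw.replace("\\", "/") == forward)  # True
```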
---
## 8 maxint minint
```
>>> import sys
>>> sys.maxsize
9223372036854775807
>>> -sys.maxsize - 1
-9223372036854775808
```
---
## 9 char to ASCII
```
>>> chr(65)
'A'
>>> ord('a')
97
```
---
## 10 copy
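I used to write `list_a = list_b[:]` to make a copy of list_b; now there is a standard way to do it. Shallow and deep copies matter mainly for compound objects such as nested lists and class instances. The difference between them is as follows: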
> * A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
> * A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
+ simple usage
```
import copy
copy.copy(x)
# Return a shallow copy of x.
copy.deepcopy(x)
# Return a deep copy of x.
```
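
The difference only becomes visible with nested objects. A minimal sketch with a made-up nested list:

```python
import copy

original = [[1, 2], [3, 4]]  # nested list used for illustration

shallow = copy.copy(original)   # new outer list, same inner lists
deep = copy.deepcopy(original)  # new outer list and new inner lists

original[0].append(99)

print(shallow[0])  # [1, 2, 99]: the inner list is shared
print(deep[0])     # [1, 2]: the inner list was copied
print(original[:][0] is original[0])  # True: slicing is also a shallow copy
```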
---
## 11 import parent folder module
```
toplevel_package/
├── __init__.py
├── moduleA.py
└── subpackage
    ├── __init__.py
    └── moduleB.py
```
In moduleB:
`from toplevel_package import moduleA`
or `import toplevel_package.moduleA`
If you'd like to run moduleB.py as a script, make sure that the parent directory of toplevel_package is in your sys.path.
As for how to add it, see the next section.
---
## 12 add a directory to PYTHONPATH
> sys.path
A list of strings that specifies the search path for modules. Initialized from the environment variable PYTHONPATH, plus an installation-dependent default.
> As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first.
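> sys.path is a list of search paths whose first item is the directory of the running script, so its value varies. Programs run from the same directory share the same sys.path.
> Now consider a few trickier cases, based on the layout below:
1. When C.py is run directly, its `sys.path[0]` is `~\top\subC`, so `import subD.D` naturally fails.
2. In A.py, `sys.path[0]` is `~\top`, so `import subD.D` succeeds.
3. Add `import subC.C` to A.py. Because A.py is the script being run, `sys.path[0]` is still `~\top`, so C.py, which could not run correctly on its own, is now imported correctly.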
```
top/
├── __init__.py
├── A.py
├── B.py
├── subC
│   ├── __init__.py
│   └── C.py
└── subD
    ├── __init__.py
    └── D.py
```
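
To make case 1 work, the `top` folder can be added to `sys.path` at runtime. The sketch below assumes it sits at the beginning of `top/subC/C.py`; the variable name `top_dir` is only illustrative. Setting the PYTHONPATH environment variable before starting Python is another option.

```python
import sys
from pathlib import Path

# Assumption: this file is top/subC/C.py, so two levels up is the "top" folder.
# In an interactive session __file__ is undefined, so fall back to the cwd.
try:
    top_dir = Path(__file__).resolve().parent.parent
except NameError:
    top_dir = Path.cwd()

if str(top_dir) not in sys.path:
    sys.path.insert(0, str(top_dir))  # searched before the other entries

# From here on, "import subD.D" can be resolved from inside C.py.
print(top_dir)
```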
---
## 13 Get current directory and file name
1. the simplest way
```
import os
cwd = os.getcwd()
os.listdir(cwd)
# it doesn't distinguish between file names and directory names
```
2. distinguish files from directories
```
from os import getcwd, listdir
from os.path import isfile, join
mypath = getcwd()  # directory to inspect
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
# or use os.walk()
from os import walk
f = []
for (dirpath, dirnames, filenames) in walk(mypath):
    f.extend(filenames)
    break  # stop after the top-level directory
```
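
The same split can be written with `pathlib`. This is an alternative sketch, assuming the directory of interest is the current working directory.

```python
from pathlib import Path

mypath = Path.cwd()  # assumption: inspect the current working directory

files = sorted(p.name for p in mypath.iterdir() if p.is_file())
dirs = sorted(p.name for p in mypath.iterdir() if p.is_dir())

print("current directory:", mypath)
print("files:", files)
print("folders:", dirs)
```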
---
## 14 Python sending mail
```
# -*- coding:utf-8 -*-
# Filename: testMail.py
# Author:Adaxry
# Date: 2017-11-08 16:47:42 Wednesday
import smtplib
from email.mime.text import MIMEText
def mail(email_to, email_subject, content):  # arguments: recipient, subject, content
    sender = 'sender@163.com'  # sender address (placeholder)
    msg = MIMEText(content)
    msg['from'] = sender
    msg['to'] = email_to
    msg['subject'] = email_subject
    s = smtplib.SMTP('smtp.163.com')
    s.login(sender, 'your_password')  # mailbox account and password (placeholder)
    s.sendmail(sender, email_to, msg.as_string())
    s.quit()
    print("Sent successfully")
mail("receiver@example.com", "test", "hello mail")
```
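
A possible variation keeps the account details out of the script by reading them from environment variables. The sketch only builds the message and does not send it; the variable names `MAIL_USER` and `MAIL_PASSWORD`, the helper `build_message` and the example addresses are all assumptions made for illustration.

```python
import os
from email.message import EmailMessage


def build_message(sender, email_to, email_subject, content):
    """Build (but do not send) a plain-text mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = email_to
    msg["Subject"] = email_subject
    msg.set_content(content)
    return msg


# Hypothetical variable names; set them in the shell, not in the script.
sender = os.environ.get("MAIL_USER", "sender@example.com")
password = os.environ.get("MAIL_PASSWORD", "")

msg = build_message(sender, "receiver@example.com", "test", "hello mail")
print(msg["Subject"])  # test
```

The resulting `msg` could then be passed to `smtplib.SMTP.send_message()` after logging in.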
---
## 15 Generate folder tree
I'd like to generate a tree that represents the structure of the current folder; it looks like the following.
```
├─.idea
│  ├─inspectionProfiles
│  ├─misc.xml
│  ├─modules.xml
│  └─workspace.xml
├─out
│  └─production
│     ├─TempJava
│     │  ├─A.class
│     │  ├─B.class
│     │  ├─Main$HeapSort.class
│     │  ├─Main$LexicalOrder.class
│     │  ├─Main$ListNode.class
│     │  ├─Main$TreeNode.class
│     │  ├─Main.class
│     │  ├─Solution.class
│     │  ├─Temp.class
│     │  └─Test.class
│     └─.DS_Store
├─src
│  ├─Main.java
│  ├─Solution.java
│  ├─Temp.java
│  └─Test.java
├─.DS_Store
└─TempJava.iml
```
I use recursion to draw the picture. First, `draw(dirpath)` takes a directory path as its argument, walks through that directory and checks each entry. If an entry is a regular file, as tested by `isfile()`, we just print it. If not, the entry is a subfolder, so we go through that subfolder and then return.
I keep a stack named `state`, which is updated before we go into a subfolder: if the subfolder is the last entry of its parent folder, push `True`; otherwise push `False`. `state` is processed before each file name is printed: for each value we print `"|  "` if it is `False` and `"   "` if it is `True`.
For example, if the current file name is "test.py" and the value of `state` is `[False, True, False]`, we should print the following content.
```
|     |  ├─test.py
```
The code is below, but it seems not to work in the Windows command-line window.
```
# -*- coding:utf-8 -*-
# Filename: tree.py
# Author:Adaxry
# Date: 2017-11-07 18:50:35 Tuesday
import os
from os.path import isfile, join
cwd = os.getcwd()
basic = "├─"
last = "└─"
single = "|"
space = " "
state = []
def dealState(_state):
    # print one 3-character column for every ancestor folder
    for i in _state:
        if i:  # True: that ancestor was the last entry, so leave the column blank
            print(space * 3, end="")
        else:  # False: that ancestor has later siblings, so draw "|"
            print(single + space * 2, end="")
def draw(dirpath):
    files = os.listdir(dirpath)
    islast = False
    for f in files:
        if f == files[-1]:
            islast = True
        dealState(state)
        if isfile(join(dirpath, f)):
            if islast:
                print(last + f)
            else:
                print(basic + f)
        else:  # is folder
            state.append(islast)
            if islast:
                print(last + f)
            else:
                print(basic + f)
            draw(join(dirpath, f))
            if len(state) != 0:
                state.pop()
draw(cwd)
```
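
The same idea can be written without the global stack by passing the prefix down the recursion and returning the lines instead of printing them. This is an alternative sketch, not the script above; `tree_lines` and the demo file names are made up for illustration, and entries are sorted so the output is predictable.

```python
import os
import tempfile


def tree_lines(dirpath, prefix=""):
    """Return the tree of dirpath as a list of text lines."""
    lines = []
    entries = sorted(os.listdir(dirpath))
    for index, name in enumerate(entries):
        is_last = index == len(entries) - 1
        lines.append(prefix + ("└─" if is_last else "├─") + name)
        full = os.path.join(dirpath, name)
        if os.path.isdir(full):
            # Below a last entry nothing is left to connect, so pad with spaces.
            child_prefix = prefix + ("   " if is_last else "│  ")
            lines.extend(tree_lines(full, child_prefix))
    return lines


# Demo on a small throwaway folder.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src"))
    for name in ("src/Main.java", "src/Test.java", "TempJava.iml"):
        open(os.path.join(root, name), "w").close()
    demo = tree_lines(root)

print("\n".join(demo))
```

Calling `tree_lines(os.getcwd())` would produce the tree for the current folder.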
---
## 16 Python OCR
This attempt is based on Google's tesseract. Someone maintains an unofficial installer for [Windows][6]. The installer guides us through installing the engine and the pre-trained data used for OCR. I recommend adding the paths of `tesseract.exe` and `tessdata` to PATH. For me, they are:
```
D:\file\tesseractOCR
D:\file\tesseractOCR\tessdata
```
After this, we can begin recognizing text. For example, if there is a `test.jpg` in `G:\`, first open a command-line window there, then type this command:
```
G:\>tesseract test.jpg out -l chi_sim
# "out" means that the output is saved in "out.txt"
# "-l chi_sim" means that we use Simplified Chinese; use "eng" for English
```
+ **Python API**
I use pytesseract as the Python API. This library can be installed easily via `pip`, and we can use it like the following code.
```
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract
output = pytesseract.image_to_string(Image.open('G:\\test.jpg'), lang="chi_sim")
# remove " " from the output
# note: strip() only works at the beginning and end of a str
output = output.replace(" ", "")
```
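
The difference between the clean-up options can be checked without running OCR. In the sketch below the string is a hand-written stand-in, not real tesseract output.

```python
output = "  你 好\n世 界  "  # stand-in for OCR text, written by hand

trimmed = output.strip()             # only the two ends are trimmed
no_spaces = output.replace(" ", "")  # spaces removed, newline kept
compact = "".join(output.split())    # all whitespace removed, newline included

print(repr(trimmed))    # '你 好\n世 界'
print(repr(no_spaces))  # '你好\n世界'
print(repr(compact))    # '你好世界'
```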
---
[1]: http://static.zybuluo.com/Adaxry/8fd45at5tddusbyzgynhm5j5/1.png
[2]: http://static.zybuluo.com/Adaxry/ud8zkmrqepobxmb8go9g8nfs/2.png
[3]: http://static.zybuluo.com/Adaxry/du59l8a07hi59a0x3vr3w0t1/3.png
[4]: http://static.zybuluo.com/Adaxry/w3s32eb5uptjl6ifwa35d8a1/4.png
[5]: http://static.zybuluo.com/Adaxry/bqm4lpff6abuajyjp3010ixh/repl.gif
[6]: https://github.com/UB-Mannheim/tesseract/wiki