1. tuple:
t = ('a', 'b', 'c')
Written with parentheses. Once a tuple is initialized, it cannot be modified.
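A minimal sketch of what "cannot be modified" means in practice. The variable names here are illustrative and not part of the original notes:

```python
# Illustrative names only; not from the original notes.
point = ('a', 'b', 'c')

try:
    point[0] = 'z'      # item assignment is not supported on tuples
    modified = True
except TypeError:
    modified = False

# Building a new tuple is allowed; the original stays untouched.
longer = point + ('d',)

print(modified)   # False
print(longer)     # ('a', 'b', 'c', 'd')
```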
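2. list:
letters = ['a', 'b', 'c']
Written with square brackets. A list is an ordered collection; elements can be added and removed at any time. (The examples use the name letters rather than list, so the built-in list type is not shadowed.)
(1)append()
letters.append('d')
letters
->['a', 'b', 'c', 'd']
(2)extend()
letters.extend(['e', 'f'])
letters
->['a', 'b', 'c', 'd', 'e', 'f']
(3)insert()
letters.insert(6, 'g')
letters
->['a', 'b', 'c', 'd', 'e', 'f', 'g']
(4)remove()
letters.remove('d')
letters
->['a', 'b', 'c', 'e', 'f', 'g']
(5)pop()
letters.pop()
->'g'
letters
->['a', 'b', 'c', 'e', 'f']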
3. dictionary:
Dictionaries are written with curly brackets, and they have keys and values.
thisdict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}
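As a short sketch of how keys and values are used, the dictionary above can be read and updated as follows. The "color" key is a made-up addition for illustration:

```python
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}

model = thisdict["model"]        # look up a value by its key
color = thisdict.get("color")    # get() returns None when the key is missing
thisdict["year"] = 2020          # overwrite the value of an existing key
thisdict["color"] = "red"        # add a new key/value pair (hypothetical key)

print(model)            # Mustang
print(color)            # None
print(len(thisdict))    # 4
```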
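4. array
Elements of a list need not share a type, whereas all elements of an array must have the same type. Creating an array: the argument can be either a list or a tuple.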
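A small sketch of array creation, assuming that "array" here means NumPy's ndarray (the notes do not name the library explicitly). It shows that both a list and a tuple are accepted, and that mixed numeric input is converted to a single common type:

```python
import numpy as np

from_list = np.array([1, 2, 3])          # created from a list
from_tuple = np.array((1.5, 2.5, 3.5))   # created from a tuple
mixed = np.array([1, 2.5])               # the int is upcast so all elements are floats

print(from_list.dtype.kind)   # i  (an integer type; exact width depends on the platform)
print(from_tuple.dtype)       # float64
print(mixed.dtype)            # float64
```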
5. Series:
(1)One-dimensional ndarray with axis labels (including time series).
(2)A column of data with an index; each value has a label of its own.
(3)Under the hood, pandas stores Series values in a typed array using the NumPy library. This offers a significant speed-up when processing data compared with traditional Python lists.
(4)When data is missing (e.g. None), pd.Series picks the dtype according to the other values: object for strings, float64 (with NaN) for numbers.
//
animals = ['Tiger', 'Bear', None]
pd.Series(animals)
->
0 Tiger
1 Bear
2 None
dtype: object
//
numbers = [1, 2, None]
pd.Series(numbers)
->
0 1.0
1 2.0
2 NaN
dtype: float64
//
sports = {
    'Archery': 'Bhutan',
    'Golf': 'Scotland',
    'Sumo': 'Japan',
    'Taekwondo': 'South Korea'
}
s = pd.Series(sports)
s
->
Archery Bhutan
Golf Scotland
Sumo Japan
Taekwondo South Korea
dtype: object
//
s.iloc[3]
->'South Korea'
s.loc['Golf']
->'Scotland'
6. DataFrame
(1)Two-dimensional, size-mutable, potentially heterogeneous tabular data.
Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects.
(2)Two-axis labelled array.
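The "dict-like container for Series objects" view can be sketched as follows. The store names and values are invented sample data, not taken from the notes:

```python
import pandas as pd

# Hypothetical sample data: each column is a Series sharing the same row labels.
stores = ['Store 1', 'Store 2']
df = pd.DataFrame({
    'Item': pd.Series(['Dog Food', 'Kitty Litter'], index=stores),
    'Cost': pd.Series([22.5, 2.5], index=stores),
})

item = df.loc['Store 2', 'Item']   # select by row label and column label
total = df['Cost'].sum()           # a single column is itself a Series

# Arithmetic aligns on labels: 'Store 1' has no discount, so its result is NaN.
discount = pd.Series({'Store 2': 0.5})
net = df['Cost'] - discount

print(item)    # Kitty Litter
print(total)   # 25.0
```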
7. String
(1)Description
A string in Python is a sequence of characters, represented by the built-in str type. Strings are immutable: once defined, they cannot be changed. Methods such as replace(), join(), or split() may look as if they modify a string, but they leave the original untouched. They build a new object (a new string, or a list in the case of split()) and return it to the caller.
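A brief sketch of this behaviour, using made-up example values: each method returns a new object, and the original string is unchanged afterwards.

```python
original = 'Hello Bob'

replaced = original.replace('Bob', 'Jane')   # returns a new string
words = original.split()                     # returns a list of strings
joined = '-'.join(words)                     # returns a new string

print(original)   # Hello Bob  (unchanged)
print(replaced)   # Hello Jane
print(words)      # ['Hello', 'Bob']
print(joined)     # Hello-Bob
```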
(2)Stripping Whitespace
greet = ' Hello Bob '
greet.lstrip()
->'Hello Bob '
greet.rstrip()
->' Hello Bob'
greet.strip()
->'Hello Bob'
(3)Prefixes
line = 'Please have a nice day'
line.startswith('Please')
->True
line.startswith('P')
->True
line.startswith('p')
->False
(4) Parsing and Extracting
data = 'From stephen@uct '
atpos = data.find('@')
print(atpos)
->12
sppos = data.find(' ', atpos)
print(sppos)
->16
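The two positions found above are typically combined with a slice to extract the text between them. A minimal sketch, where the name host is an illustrative choice:

```python
data = 'From stephen@uct '

atpos = data.find('@')           # index of the '@' character
sppos = data.find(' ', atpos)    # first space at or after that index
host = data[atpos + 1:sppos]     # text between '@' and the space

missing = data.find('#')         # find() returns -1 when nothing matches

print(host)      # uct
print(missing)   # -1
```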
Additional notes:
(1) Measuring run time
%%timeit -n 100
summary = 0
for item in s:  # s is assumed here to be a numeric Series
    summary += item
->197 µs ± 40.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
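The %%timeit cell magic is only available in IPython or Jupyter. As a rough sketch, the same comparison can be made in a plain script with the standard timeit module. The Series s below is an assumption (10,000 random integers), since the notes do not show how it was built, and the timings will differ from machine to machine:

```python
import timeit

import numpy as np
import pandas as pd

# Assumed data: the notes do not show how s was created.
s = pd.Series(np.random.randint(0, 1000, 10000))

def loop_sum():
    """Sum the values one by one in a Python loop."""
    total = 0
    for item in s:
        total += item
    return total

def vector_sum():
    """Let pandas/NumPy sum the underlying typed array."""
    return s.sum()

loop_time = timeit.timeit(loop_sum, number=100)
vector_time = timeit.timeit(vector_sum, number=100)
print(f"loop: {loop_time:.4f}s, vectorized: {vector_time:.4f}s for 100 runs each")
```

Both functions return the same total; only the time taken differs.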
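(2) iteritems() function
Returns an iterator over the dictionary's (key, value) pairs. This is the Python 2 method dict.iteritems(); it was removed in Python 3, where dict.items() serves the same purpose.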
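A minimal sketch of iterating over (key, value) pairs, written for Python 3 and therefore using items(). It reuses the sports dictionary from the Series section:

```python
sports = {
    'Archery': 'Bhutan',
    'Golf': 'Scotland',
    'Sumo': 'Japan',
    'Taekwondo': 'South Korea',
}

pairs = []
for sport, country in sports.items():   # each step yields one (key, value) pair
    pairs.append(f"{sport}: {country}")

print(pairs[0])     # Archery: Bhutan
print(len(pairs))   # 4
```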
(3) open() function
file = open(filename, mode)
e.g. file = open('mbox.txt', 'r')
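In practice, open() is often used inside a with block so the file is closed automatically. The sketch below writes a small temporary file first so that it runs on its own; the file name and contents are hypothetical and stand in for 'mbox.txt':

```python
import os
import tempfile

# Hypothetical file, created here so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'example.txt')

with open(path, 'w') as f:      # mode 'w' creates or truncates the file
    f.write('first line\nsecond line\n')

with open(path, 'r') as f:      # mode 'r' opens the file for reading
    lines = [line.rstrip() for line in f]

print(lines)       # ['first line', 'second line']
print(f.closed)    # True: the with block closed the file
```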