1. Merging data with pandas concat, adding a row (https://blog.csdn.net/weixin_47661174/article/details/124698328)
pd.concat([df1,df2])
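A minimal sketch of adding a row with concat. The frames and column names below are made up for illustration; wrapping the new row in a one-row DataFrame and passing ignore_index=True renumbers the result.

```python
import pandas as pd

# Hypothetical example frames; the column names are illustrative only.
df1 = pd.DataFrame({"name": ["apple", "pear"], "price": [3, 4]})
df2 = pd.DataFrame({"name": ["plum"], "price": [5]})  # the single row to add

# ignore_index=True renumbers the rows 0..n-1 instead of keeping both old indexes
merged = pd.concat([df1, df2], ignore_index=True)
print(merged)
```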
2. Converting between Series, DataFrame (pandas) and ndarray (numpy) (https://blog.csdn.net/qq_36743482/article/details/114678409)
ndarray => Series
npa = np.arange(12)
ser = pd.Series(npa)
Series => ndarray
npa_s = np.array(ser)
ndarray => DataFrame
npa2 = npa.reshape(3, -1)
df = pd.DataFrame(npa2)
DataFrame => ndarray
npa_d = np.array(df)
npa_v = df.values # npa_d and npa_v hold the same data
DataFrame -> Series
type(df[0]) # pandas.core.series.Series
Series -> DataFrame
pd.DataFrame(ser)
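The conversions above can be checked as a round trip. This sketch uses to_numpy() and to_frame(), which are alternative spellings of the same conversions rather than the calls used in the linked article.

```python
import numpy as np
import pandas as pd

npa = np.arange(12)
ser = pd.Series(npa)                    # ndarray -> Series
df = pd.DataFrame(npa.reshape(3, -1))   # ndarray -> DataFrame with shape (3, 4)

back_from_ser = ser.to_numpy()          # Series -> ndarray
back_from_df = df.to_numpy()            # DataFrame -> ndarray
col_as_frame = df[0].to_frame()         # Series (one column) -> one-column DataFrame
```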
3. Two ways to convert a Series to a DataFrame in Python (https://zhuanlan.zhihu.com/p/469512251)
pd.DataFrame([j.to_dict()]) # a Series has conversion methods such as to_frame() and to_dict()
4. Reading specific rows with pandas (https://blog.csdn.net/weixin_39025679/article/details/109216669)
https://blog.csdn.net/bianxia123456/article/details/111396760
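df.loc[0:m] # .loc is a DataFrame accessor, not a numpy one; the label slice includes both 0 and m
Collected notes on python.pandas.DataFrame: initialization, writing from a dict, writing to a slice, saving to csv (https://zhuanlan.zhihu.com/p/489099818)
df3.loc[['No.1','No.3'],['name','color']] # lists inside '[]' select specific rows and columns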
name color
No.1 apple red
No.3 watermelon green
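A runnable sketch of the selection above. Only rows No.1 and No.3 and the columns name and color appear in the output shown, so row No.2 and the price column are invented here to make the example non-trivial.

```python
import pandas as pd

# Hypothetical df3: row No.2 and the 'price' column are made up for illustration.
df3 = pd.DataFrame(
    {
        "name": ["apple", "banana", "watermelon"],
        "color": ["red", "yellow", "green"],
        "price": [3, 2, 8],
    },
    index=["No.1", "No.2", "No.3"],
)

# A list of row labels and a list of column labels select exactly that block
subset = df3.loc[["No.1", "No.3"], ["name", "color"]]
print(subset)
```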
5. Locating a row, selecting columns, and summing a column in pandas
for i in range(len(all_data)):
    # print(all_data['飞靶号'][0])
    # print(all_data[i])
    if all_data['飞靶号'].iloc[i] == '退电品':  # compare row i, not always row 0
        print(i)
        print(all_data.iloc[i])
        # all_data = np.delete(all_data,i,axis=0)
        exit(1)
X=all_data[['温度','PH','L','A','B']]
y=all_data[['二次染时']]
X.head()
x_train[['温度']].apply(lambda x:x.sum())
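The row lookup can also be written without a loop by using a boolean mask. This is a sketch on a small made-up frame; the names bar_id and temp and the value "reject" are stand-ins, not the real column names.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for all_data; 'bar_id' and 'temp' are illustrative names.
all_data = pd.DataFrame({"bar_id": ["A1", "reject", "A3"], "temp": [20.5, 21.0, 19.5]})

mask = all_data["bar_id"] == "reject"                  # True where the row matches
positions = np.flatnonzero(mask.to_numpy()).tolist()   # positional row numbers of the matches
matches = all_data[mask]                               # the matching rows themselves
temp_total = all_data["temp"].sum()                    # sum of one column
```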
6. Initializing an empty numpy array with dtype object (https://www.cnpython.com/qa/1341898)
a = np.empty((12,), dtype=object)
7. Python: adding a row or a column to a numpy array; adding, deleting, querying and modifying numpy arrays (https://blog.csdn.net/qq_40765537/article/details/105869910)
import numpy as np
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = np.array([2,5,8])
print(np.r_[a,[b]])
Output:
[[1 2 3]
 [4 5 6]
 [7 8 9]
 [2 5 8]]
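For comparison, a short sketch of the same append done with np.vstack, plus the column-wise counterpart with np.column_stack. Neither call is used in the snippet above; they are shown as alternatives.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([2, 5, 8])

with_row = np.vstack([a, b])         # b becomes a new last row, shape (4, 3)
with_col = np.column_stack((a, b))   # b becomes a new last column, shape (3, 4)
```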
8. [Python data processing] Writing a DataFrame to Excel with pandas (https://blog.csdn.net/chengyikang20/article/details/90139384)
Saving data generated in PyCharm with pandas' to_excel raises the error: numpy.ndarray object has no attribute to_excel (https://blog.csdn.net/m0_67870771/article/details/124603745)
import pandas as pd
file_path = 'E:/data/2.xlsx' # target location, file name and file type
df = pd.DataFrame(data) # wrap the ndarray in a DataFrame first
df.to_excel(file_path) # to_excel is a DataFrame method, so call it on df
9. Summing columns in numpy (https://www.csdn.net/tags/MtTaAgxsMDQ2MjQtYmxvZwO0O0OO0O0O.html)
x.sum(axis=0)
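A small sketch of what the axis argument does, on a made-up 2x3 array: axis=0 adds down each column, axis=1 adds across each row.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])

col_sums = x.sum(axis=0)   # one total per column: [5, 7, 9]
row_sums = x.sum(axis=1)   # one total per row: [6, 15]
```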
10. Extracting a row or a column of a matrix with numpy in Python (https://www.yisu.com/zixun/179241.html)
A row of the matrix
a[1]
Out[32]: array([3, 4, 5])
A column of the matrix
a[:,1]
Out[33]: array([1, 4, 7])
11. Selecting specific rows and columns in numpy (https://blog.csdn.net/goodxin_ie/article/details/109659893)
x[[0,1]][:,[0,3]]
Out[31]:
array([[0, 3],
[4, 7]])
x_test = np.empty((1, 15), dtype=object) # test dataset
test_feibahao = x_test[:, 0] # first column of the test set (飞靶号)
12. Deleting rows in numpy (multiple rows) (https://blog.csdn.net/God_WZH/article/details/122575683)
https://blog.csdn.net/A_JI_97/article/details/116235753
Deleting rows:
x1 = np.delete(x, 0, axis=0)
y1 = np.delete(y, 1, axis=0)
print(x1)
print(y1)
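Since the heading mentions deleting several rows at once, here is a sketch on a made-up array: np.delete accepts a list of row indices and returns a new array, leaving the input unchanged.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)          # rows: [0 1 2], [3 4 5], [6 7 8], [9 10 11]

x_kept = np.delete(x, [0, 2], axis=0)    # drop rows 0 and 2 in one call
```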
13. Swapping rows and columns (transposing) in numpy (https://blog.csdn.net/m0_37294838/article/details/102743533)
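No snippet was recorded for this item. As a minimal sketch, assuming the goal is a plain transpose, the .T attribute swaps rows and columns:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

t = a.T                          # shape (3, 2); element [i, j] of t is a[j, i]
```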
14. How to easily export a numpy array (matrix) from Python to Excel? (https://www.cnpython.com/qa/1678585)
import numpy
numpy.savetxt(r'your\location\yourfile.csv', numpy_array, delimiter=',') # raw string keeps the backslashes literal
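A self-contained sketch of the CSV export, written to a temporary directory so it runs anywhere. The array contents are made up, and reading the file back with np.loadtxt is only there to check the round trip.

```python
import os
import tempfile

import numpy as np

numpy_array = np.array([[1.5, 2.0], [3.0, 4.25]])   # illustrative data

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "yourfile.csv")
    np.savetxt(out_path, numpy_array, delimiter=",")   # one comma-separated line per row
    loaded = np.loadtxt(out_path, delimiter=",")       # read it back to verify
```

The resulting .csv file is plain text, which Excel can open directly.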
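15. Merging two numpy matrices in Python (http://t.zoukankan.com/itdyb-p-5735911.html)
We randomly generate two matrices a and b, then merge them:
hstack() joins the arrays side by side (horizontally, so columns are added)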
np.hstack((a,b))
array([[ 8., 5., 1., 9.],
[ 1., 6., 8., 5.]])
vstack() stacks the arrays on top of each other (vertically, so rows are added)
np.vstack((a,b))
array([[ 8., 5.],
[ 1., 6.],
[ 1., 9.],
[ 8., 5.]])
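The two outputs above can be reproduced with fixed values. In this sketch a and b are typed in by hand from the printed results instead of being generated randomly, and np.concatenate is shown as the equivalent general call.

```python
import numpy as np

# Values copied from the outputs above rather than generated randomly
a = np.array([[8.0, 5.0], [1.0, 6.0]])
b = np.array([[1.0, 9.0], [8.0, 5.0]])

h = np.hstack((a, b))   # side by side, shape (2, 4)
v = np.vstack((a, b))   # one above the other, shape (4, 2)
```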
16. Python tutorial: initializing a numpy array with one repeated value (https://blog.csdn.net/sinat_38682860/article/details/111314885)
import numpy as np
a = np.ones((4,4)) * 10
[[10. 10. 10. 10.]
[10. 10. 10. 10.]
[10. 10. 10. 10.]
[10. 10. 10. 10.]]
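An alternative sketch: np.full builds the filled array in one step, without the multiplication. It is not the call used above, only an equivalent.

```python
import numpy as np

a = np.full((4, 4), 10.0)   # 4x4 float array where every element is 10.0
```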
17. Introduction to data analysis: comparing, filtering and de-duplicating numpy array data (https://blog.csdn.net/ayouleyang/article/details/103757741)
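No snippet was recorded for this item. A minimal sketch of the three operations in the title, on made-up numbers: an elementwise comparison yields a boolean mask, the mask filters the array, and np.unique removes duplicates.

```python
import numpy as np

nums = np.array([3, 1, 3, 7, 1, 9])   # illustrative data

mask = nums > 2                  # elementwise comparison gives a boolean array
filtered = nums[mask]            # keep only the elements where the mask is True
unique_sorted = np.unique(nums)  # distinct values, returned in sorted order
```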
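18. [Numpy] Computing the mean, median and mode with Numpy (https://blog.csdn.net/u013066730/article/details/108844068)
import numpy as np
Mean
np.mean(nums)
Median
np.median(nums)
Mode
from scipy import stats
stats.mode(nums).mode # recent SciPy returns a scalar for 1-D input; older releases return an array and need stats.mode(nums)[0][0]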
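If SciPy is not available, the mode can be sketched with numpy alone by counting the distinct values. The data below is made up; on ties this picks the smallest of the most frequent values.

```python
import numpy as np

nums = np.array([2, 3, 3, 5, 3, 2])   # illustrative data

mean_value = np.mean(nums)
median_value = np.median(nums)

# Count each distinct value, then take the value with the highest count
values, counts = np.unique(nums, return_counts=True)
mode_value = values[np.argmax(counts)]
```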
19. Reading values from Excel columns that share a name in Python (https://blog.csdn.net/qq_41821067/article/details/121798607)
import pandas as pd
df = pd.read_excel('test1.xls',header=0)  # the Excel file sits in the same folder as the script
result = []
for s_li in df.columns:
    # print the column name
    print(s_li)
    if 'I' in str(s_li):
        result.append(df[s_li])
print(result)
pd.DataFrame(result).to_excel(r'F:\python_project\result.xls')  # output path
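The column filter can be tried without an Excel file. This sketch uses a made-up frame with invented column names; selecting with a list of names keeps the matches as columns, whereas building a DataFrame from a list of Series turns each Series into a row.

```python
import pandas as pd

# Hypothetical frame standing in for the Excel sheet
df = pd.DataFrame({"I1": [1, 2], "U1": [3, 4], "I2": [5, 6]})

i_cols = [c for c in df.columns if "I" in str(c)]   # names containing 'I'
result = df[i_cols]                                 # matches stay as columns

as_rows = pd.DataFrame([df[c] for c in i_cols])     # list of Series: each becomes a row
```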
20.
import statistics
import numpy as np
# l = ['温度','PH','DL1','DA1','DB1','DL2','一次染时']  # Chinese column names, overridden by the next line
l = ['wendu','ph','dl1','da1','db1','dl2','yici']
print(pandas_data.shape)
for i in l:
    print(i+' variance: %f' % np.var(pandas_data[i]), i+' standard deviation: %f' % np.std(pandas_data[i]), i+' max: %f' % np.max(pandas_data[i]),
          i+' min: %f' % np.min(pandas_data[i]), i+' mean: %f' % np.mean(pandas_data[i]), i+' median: %f' % np.median(pandas_data[i]))
    print(i+' mode:', statistics.mode(pandas_data[i]))
    print()
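The same per-column statistics can be sketched in one call with DataFrame.agg. The frame below is made up. Note that pandas' var and std default to the sample formulas (ddof=1), while np.var and np.std default to the population formulas (ddof=0), so the two approaches can print different numbers.

```python
import pandas as pd

# Hypothetical data using two of the column names from the loop above
pandas_data = pd.DataFrame({"wendu": [20.0, 22.0, 24.0], "ph": [6.5, 7.0, 7.5]})

# One row per statistic, one column per variable
summary = pandas_data.agg(["var", "std", "max", "min", "mean", "median"])
print(summary)
```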
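21. DataFrame: selecting one row or column, or N rows and N columns (https://blog.csdn.net/qq_42140717/article/details/124350979)
Select one row by a known index label:
df.loc[a]
Select one row when the index label is unknown:
df[1:2]  # the lower bound is inclusive, so the second row is [1:2]
Select N rows when the index labels are unknown:
df[0:10]
Select one column by name:
df['name']
Select a column by position when its name is unknown:
df.iloc[:,2]
Select N columns by name:
df[['name','name2']]
Select N rows of a named column:
df['name'][0:4]
Select N rows and M columns by position:
df.iloc[0:N,0:M]
iloc selects purely by integer position. loc selects by index label and column name. If the index contains duplicate labels, loc returns every matching row, and slicing on a duplicated label in an unsorted index raises a KeyError. Indexing without loc or iloc, as in df['name'][0:4], picks the named column first and then slices its rows by position.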
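The difference between label and position only shows up when the index is not 0, 1, 2, ... in order. A sketch on a made-up frame whose index is deliberately shuffled:

```python
import pandas as pd

# Hypothetical frame; the index labels are deliberately out of order
df = pd.DataFrame({"name": ["a", "b", "c"], "name2": [10, 20, 30]}, index=[2, 0, 1])

by_label = df.loc[0, "name"]    # the row whose index label is 0
by_position = df.iloc[0, 0]     # the first row and first column, whatever their labels
second_row = df[1:2]            # a plain integer slice counts rows by position
```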
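22. How to select specific rows and columns of a numpy array
https://blog.csdn.net/goodxin_ie/article/details/109659893
b = a[c]  # first take the rows you want
b = b[:,d]  # then take the columns you want from those rows
print(b)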
x[[0,1]][:,[0,3]]
Out[31]:
array([[0, 3],
[4, 7]])
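The two-step selection can also be written in one step with np.ix_, which is not used in the linked article. In this sketch x is assumed to be np.arange(12).reshape(3, 4), which is consistent with the output shown. Passing the two lists directly, as x[[0, 1], [0, 3]], pairs them up element by element instead of selecting a block.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)        # assumed input, consistent with the output above

two_step = x[[0, 1]][:, [0, 3]]        # rows first, then columns
one_step = x[np.ix_([0, 1], [0, 3])]   # same block in a single indexing operation
paired = x[[0, 1], [0, 3]]             # elements (0, 0) and (1, 3) only
```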
23. Concatenating and merging numpy arrays in Python (https://blog.csdn.net/qq_39516859/article/details/80666070)
Horizontal stacking
np.hstack((a,b))
array([[ 0,  1,  2,  0,  2,  4],
       [ 3,  4,  5,  6,  8, 10],
       [ 6,  7,  8, 12, 14, 16]])
np.concatenate((a,b),axis=1)
array([[ 0,  1,  2,  0,  2,  4],
       [ 3,  4,  5,  6,  8, 10],
       [ 6,  7,  8, 12, 14, 16]])
data = pd.read_csv(f, low_memory=False)
25. Several ways to read a csv file in Python (with examples) (https://blog.csdn.net/qq_43160348/article/details/124331781)
import pandas as pd
df = pd.read_csv('../data_pro/audito_whole.csv')
print(df)
26. [Python] Filtering rows with or without null values (https://blog.csdn.net/qq_40264559/article/details/124508563)
test = test[test['性别'].notna()] # drop rows where 性别 is null
test
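A sketch of both directions of the filter on a made-up frame: notna() keeps the rows that have a value, isna() keeps the rows that lack one. The English column names are stand-ins for the real ones.

```python
import pandas as pd

# Hypothetical data; 'gender' stands in for the real column
test = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "gender": ["F", None, "M"]})

kept = test[test["gender"].notna()]     # rows where gender is present
missing = test[test["gender"].isna()]   # rows where gender is null
```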
27. Pandas: creating an empty DataFrame and adding rows and columns to it (https://blog.csdn.net/qq_53817374/article/details/123771713)
import pandas as pd
df = pd.DataFrame(data=None,columns=['时间','车牌','北纬','东经'])
df
Concatenation (pandas.concat usage in detail) (https://cloud.tencent.com/developer/news/372041)
pd.concat([df1,df2,df3]) uses axis=0 by default, so the frames are stacked along axis 0 (row-wise).
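An alternative sketch to growing an empty frame with repeated concat calls: collect the rows in a plain list and build the DataFrame once, then add a column by assignment. The column names and values are invented for illustration.

```python
import pandas as pd

columns = ["time", "plate", "lat", "lon"]   # illustrative column names

# Collect rows as dicts first; building the frame once avoids repeated copying
rows = []
rows.append({"time": "08:00", "plate": "A123", "lat": 31.2, "lon": 121.5})
rows.append({"time": "08:05", "plate": "B456", "lat": 31.3, "lon": 121.4})

df = pd.DataFrame(rows, columns=columns)
df["speed"] = [40, 55]                      # adding a column is a plain assignment
```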

28. [Python notes] Reading a DataFrame row by row with pandas
for indexs in data.index:
    print(data.loc[indexs].values[0:-1])  # every value in the row except the last
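A sketch of the same row-by-row read using itertuples, which yields one tuple per row. The frame is made up, and dropping the last field mirrors the [0:-1] slice above.

```python
import pandas as pd

# Hypothetical frame; the last column plays the role of the field being skipped
data = pd.DataFrame({"a": [1, 2], "b": [3, 4], "label": ["x", "y"]})

# index=False leaves the index out of each tuple; [:-1] drops the last column
rows = [tuple(row)[:-1] for row in data.itertuples(index=False)]
```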
29. pandas error handling: A value is trying to be set on a copy of a slice from a DataFrame
quchong = df_all.drop_duplicates(subset='虚拟飞靶号').copy()  # work on an explicit copy
print(quchong.shape)
quchong.insert(loc=6, column='hour', value='')
new_data = quchong.copy()
hour_col = new_data.columns.get_loc('hour')
for i in range(quchong.shape[0]):
    # assign through a single .iloc call instead of chained indexing
    new_data.iloc[i, hour_col] = int(quchong['一次化抛进槽时间'].iloc[i][11:13])
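The hour extraction can also be sketched without a loop, using the .str accessor on the whole column. The frame and its column names are made up; the timestamp strings are assumed to have the form 'YYYY-MM-DD HH:MM:SS', so characters 11 and 12 are the hour.

```python
import pandas as pd

# Hypothetical data; 'bar_id' and 'enter_time' stand in for the real columns
df_all = pd.DataFrame({
    "bar_id": ["A", "A", "B"],
    "enter_time": ["2022-05-01 08:15:00", "2022-05-01 08:15:00", "2022-05-01 13:40:00"],
})

# .copy() makes the result an independent frame, so the assignment below is unambiguous
quchong = df_all.drop_duplicates(subset="bar_id").copy()

# Slice characters 11-12 of every string at once and convert them to integers
quchong["hour"] = quchong["enter_time"].str[11:13].astype(int)
```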