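It has been a while since my last update; I have been busy studying candlestick (K-line) charts. Today a friend asked me to help write a clustering analysis. It took a good while, and the 50 lines of code produced 7 or 8 bugs along the way. Fortunately, analysis code and application code are very different. With data-analysis code, what matters is that you actually get a result and that the intermediate data are clear and correct; a few bugs are tolerable as long as the main logic runs. Business code is another story: besides the main path working, the abnormal paths have to be handled too, otherwise users will run into bugs, because users do not necessarily follow the route you had in mind. Many bugs are found by users who wander off the usual path.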
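Here is a record of the clustering I wrote today, used to analyze spectral data. It is really just a two-dimensional matrix analysis: samples whose values fall in the same ranges are grouped into one class. It uses pandas and a lot of two-dimensional array computation.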
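Before the full script, the idea can be shown on toy data. The following is only a minimal sketch, not the script itself: the `spectra` dictionary, `n_bands`, `lo` and `step` are made-up values, and the level scale is fixed by hand here, whereas the script derives it from the data.

```python
import numpy as np
from collections import defaultdict

# Hypothetical toy data: three samples, four wavelengths each.
spectra = {
    "s1": [0.10, 0.12, 0.80, 0.82],
    "s2": [0.11, 0.13, 0.79, 0.81],
    "s3": [0.80, 0.82, 0.10, 0.12],
}
n_bands = 2          # wavelength bands per spectrum
lo, step = 0.0, 0.5  # assumed level scale: level = int((mean - lo) / step)

clusters = defaultdict(list)
for name, values in spectra.items():
    # 1. Average the spectrum over equal-width wavelength bands.
    means = np.asarray(values).reshape(n_bands, -1).mean(axis=1)
    # 2. Turn each band mean into an integer level and join the levels into a string.
    signature = "".join(str(int((m - lo) / step)) for m in means)
    # 3. Samples with the same signature end up in the same class.
    clusters[signature].append(name)

print(dict(clusters))
```

Here `s1` and `s2` share a signature (low first band, high second band), while `s3` has the opposite shape and forms its own class.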
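# -*- coding: utf-8 -*-
import pandas as pd
import math
import numpy as np
import copy

# Generator version of the same helper, kept for reference:
# def fun_avg(listTemp, n):
#     for i in range(0, len(listTemp), n):
#         yield listTemp[i:i + n]

def fun_avg(items, n):
    # Split items into consecutive chunks of length n (the last chunk may be shorter).
    return [items[i:i+n] for i in range(0, len(items), n)]
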
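print("Put the file to process in the current directory. Only the first sheet is read, it must not contain blanks, and the file name must end with js.xlsx")
print("Enter the minimum wavelength to analyze")
fenleiqujian_min = int(input("Integer: "))
print("Enter the maximum wavelength to analyze")
fenleiqujian_max = int(input("Integer: "))
print("Enter the number of classes to split the wavelength range into")
fenleiqujian_num = int(input("Integer: "))
print("Enter the number of classes to split the absorbance into")
xishoudu_num = int(input("Integer: "))

# Hard-coded values used while testing:
# fenleiqujian_min = 557  # lower bound of the wavelength range
# fenleiqujian_max = 589  # upper bound of the wavelength range
# fenleiqujian_num = 6    # number of wavelength classes
# xishoudu_num = 3        # number of absorbance classes
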
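# The path is hard-coded; read_excel loads the first sheet by default.
df = pd.read_excel("./js.xlsx")
df = df.set_index('No')
# Column headers are assumed to be consecutive integer wavelengths, so a wavelength
# maps to a column position by subtracting the smallest header.
min_biaotou = int(min(df.columns.tolist()))
# As written, the slice covers wavelengths above fenleiqujian_min up to and including fenleiqujian_max.
df1 = df.iloc[:, (fenleiqujian_min - min_biaotou + 1):(fenleiqujian_max - min_biaotou + 1)]
index_a = df1.index.tolist()
column_a = df1.columns.tolist()
data_a = df1.values.tolist()
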
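# Average every sample over wavelength bands of jiange columns each.
# If the column count is not a multiple of fenleiqujian_num, fun_avg yields one extra, shorter band.
data_b = []
jiange = math.floor(len(column_a) / fenleiqujian_num)
for array_a in data_a:
    temp_list = fun_avg(array_a, jiange)
    temp_list2 = []
    for i in temp_list:
        temp_list2.append(np.mean(i))
    data_b.append(temp_list2)
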
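# Find the overall largest and smallest band mean.
# (Named max_val/min_val so the built-in max() and min() are not shadowed.)
max_val = data_b[0][0]
min_val = data_b[0][0]
for m in data_b:
    for n in m:
        if float(n) >= float(max_val):
            max_val = n
        if float(n) <= float(min_val):
            min_val = n

keduzhi = (max_val - min_val) / xishoudu_num  # width of one absorbance level
if max_val - min_val < 0.025:
    keduzhi = 1  # spread is negligible: put everything in level 0

# Build one signature string per sample: the absorbance level of each band, concatenated.
zulei = []
for x in data_b:
    temp_y = ''
    for y in x:
        zu = int((y - min_val) / keduzhi)
        temp_y = temp_y + str(zu)
    # print(temp_y)
    zulei.append(temp_y)
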
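zulei1 = copy.deepcopy(zulei)  # spare copy, not used below
zhonglei = sorted(set(zulei))  # distinct signatures, sorted so the output order is stable
zuizhong = list(zip(zulei, index_a))
zuizhong.sort()

# Append one line per class, listing the sample numbers that share a signature.
with open('./daan.txt', 'a+', encoding='utf-8') as file:
    i_zhong = 1
    for c in zhonglei:
        file.write("Class " + str(i_zhong) + ': ')
        i_zhong += 1
        for ii in zuizhong:
            if ii[0] == c:
                file.write(str(ii[1]) + ';')
        file.write("\n")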
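
Two edge cases in the script are worth noting. When the number of selected columns is not a multiple of the requested band count, chunking by a fixed width produces an extra, shorter band (10 columns split into 3 bands of width 3 gives 4 chunks). And the sample holding the overall maximum can land in level `xishoudu_num` rather than `xishoudu_num - 1`, so one more level than requested may appear. The sketch below is one possible way to handle both, not what the script does: `band_means` and `to_level` are hypothetical helper names, and `np.array_split` is swapped in for `fun_avg`.

```python
import numpy as np

def band_means(row, n_bands):
    """Split one spectrum into exactly n_bands nearly equal bands and average each."""
    bands = np.array_split(np.asarray(row, dtype=float), n_bands)
    return [float(band.mean()) for band in bands]

def to_level(value, lo, hi, n_levels):
    """Map a value in [lo, hi] to an integer level in 0..n_levels-1."""
    if hi <= lo:
        return 0  # no spread: everything is level 0
    level = int((value - lo) / (hi - lo) * n_levels)
    # Clamp so that value == hi falls into the top level instead of a new one.
    return min(level, n_levels - 1)

means = band_means(list(range(10)), 3)   # bands of 4, 3 and 3 columns
top = to_level(1.0, 0.0, 1.0, 3)         # the maximum stays in level 2
```

With these helpers, each sample always gets exactly `n_bands` digits in its signature, and each digit stays below `n_levels`.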