Radiomics Study Notes (36): Python Implementation of the Cluster Dendrogram

Author: 北欧森林 | Published: 2021-01-17 23:33

    These notes are based on the radiomics tutorial video series by the Bilibili uploader 有Li.
    This part (36) covers implementing a cluster dendrogram in Python.

    Watch the SciPy version: with scipy 1.5.0, plotting the dendrogram raises an error, whereas 1.5.2 and 1.2.1 work without problems.
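A quick way to act on the version note above is to check the installed release before plotting. This is only a minimal sketch: the helper name `is_affected_scipy` is hypothetical, and it flags only the 1.5.0 release that the note reports as failing.

```python
import scipy


def is_affected_scipy(version: str) -> bool:
    """Return True for the release reported above as failing (1.5.0).

    Hypothetical helper for illustration; it does not probe the bug itself.
    """
    # Drop any local build suffix such as "+mkl" before comparing
    return version.split("+")[0].strip() == "1.5.0"


print("scipy version:", scipy.__version__)
if is_affected_scipy(scipy.__version__):
    print("This is scipy 1.5.0; consider: pip install scipy==1.5.2")
```

If the check fires, switching to one of the versions named in the note (1.5.2 or 1.2.1) should be enough.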

    # modified from https://www.machinelearningplus.com/plots/top-50-matplotlib-visualizations-the-master-plots-python/
    
    import matplotlib.pyplot as plt
    import pandas as pd
    import scipy.cluster.hierarchy as shc
    
    # import data
    df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/USArrests.csv')
    print(df)
    
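    # Build the hierarchy with Ward linkage on the four numeric columns
    Z = shc.linkage(df[['Murder', 'Assault', 'UrbanPop', 'Rape']], method='ward')
    
    # Plot
    plt.figure(figsize=(16, 10), dpi=80)
    plt.title("USArrests Dendrogram", fontsize=22)
    # Links below a merge height of 100 are colored per cluster;
    # the labels are passed as a plain Python list
    dend = shc.dendrogram(Z, labels=df['State'].tolist(), color_threshold=100)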
    
    plt.xticks(fontsize=12)
    #plt.savefig('USArrests_Dendograms.png')
    plt.show()
    
    [Figure: USArrests_Dendograms.png]

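To make the role of `color_threshold` more concrete, the sketch below cuts a hierarchy at a fixed height with `fcluster(criterion='distance')`, which gives flat cluster labels for the same kind of cut that the dendrogram shows by color. The six points and the threshold of 5.0 are made up for illustration and are not part of the USArrests data.

```python
import numpy as np
import scipy.cluster.hierarchy as shc

# Toy data (invented for illustration): two tight groups that lie far apart
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])

# Each row of Z describes one merge:
# [index of cluster a, index of cluster b, merge height, size of new cluster]
Z = shc.linkage(points, method='ward')
print(Z)

# Cut the tree at a chosen height; merges above it are undone,
# so every remaining subtree becomes one flat cluster
threshold = 5.0
flat = shc.fcluster(Z, t=threshold, criterion='distance')
print(flat)
```

Reading the merge heights in the third column of `Z` is a practical way to pick a threshold: a large jump between consecutive heights suggests a natural place to cut.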
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import scipy.cluster.hierarchy as shc
    from sklearn.utils import shuffle
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LassoCV
    
    xlsx1_filePath = '/Users/Mac/Documents/JianShuNotes/data/aa.xlsx'
    xlsx2_filePath = '/Users/Mac/Documents/JianShuNotes/data/bb.xlsx'
    data_1 = pd.read_excel(xlsx1_filePath)
    data_2 = pd.read_excel(xlsx2_filePath)
    # Label the two groups: 0 for the first sheet, 1 for the second
    data_1.insert(0, 'label', 0)
    data_2.insert(0, 'label', 1)
    data = pd.concat([data_1, data_2], ignore_index=True)
    data = shuffle(data)   # randomize the row order
    data = data.fillna(0)  # replace missing values with 0
    X = data[data.columns[1:]]
    y = data['label']
    colNames = X.columns
    X = X.astype(np.float64)
    # Standardize each feature to zero mean and unit variance
    X = StandardScaler().fit_transform(X)
    X = pd.DataFrame(X, columns=colNames)
    
    # LASSO
    alphas = np.logspace(-3, 1, 50)  # candidate regularization strengths
    model_lassoCV = LassoCV(alphas=alphas, cv=10, max_iter=100000).fit(X, y)  # 10-fold cross-validation
    print(model_lassoCV.alpha_)  # alpha chosen by cross-validation
    
    # Pair each coefficient with its feature name
    coef = pd.Series(model_lassoCV.coef_, index=X.columns)
    # print(coef)
    print("Lasso picked " + str(sum(coef != 0)) + " variables and eliminated the other " + str(sum(coef == 0)) + " variables")
    
    print(coef[coef != 0])
    X = X[coef[coef != 0].index]
    
    print(X.head())
    
    # Plot: cluster the selected features (columns), hence the transpose
    plt.figure(figsize=(5, 5), dpi=80)
    plt.title("Radiomic Dendrogram", fontsize=22)
    # color_threshold needs to be tuned for each dataset
    dend = shc.dendrogram(shc.linkage(X.T, method='ward'), labels=X.columns.tolist(), color_threshold=20)
    plt.xticks(fontsize=12, rotation=60, ha='right')
    plt.show()
    
    [Figure: Radiomic Dendograms.png]
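The Excel files used above are local to the author's machine, so the following sketch runs the same steps (standardize, LASSO selection, Ward clustering of the selected features) on synthetic data. Everything about the data is an assumption for illustration: the feature names `f0` to `f11`, the sample size, and the signal injected into the first four features are invented, and `no_plot=True` is used only so the example runs without a display.

```python
import numpy as np
import pandas as pd
import scipy.cluster.hierarchy as shc
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the two labeled groups (not real radiomic data)
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=100)       # binary group labels
X_raw = rng.normal(size=(100, 12))     # 12 noise features
X_raw[:, :4] += 2.0 * y[:, None]       # the first 4 features carry group signal

cols = [f"f{i}" for i in range(X_raw.shape[1])]
X = pd.DataFrame(StandardScaler().fit_transform(X_raw), columns=cols)

# LASSO with cross-validated alpha, as in the notes above
model = LassoCV(alphas=np.logspace(-3, 1, 50), cv=10, max_iter=100000).fit(X, y)
coef = pd.Series(model.coef_, index=X.columns)
selected = coef[coef != 0].index.tolist()
print("selected features:", selected)

# Cluster the selected features: rows of X_sel.T are features
X_sel = X[selected]
Z = shc.linkage(X_sel.T, method='ward')

# no_plot=True returns the layout without drawing anything
info = shc.dendrogram(Z, labels=selected, no_plot=True)
print("leaf order:", info['ivl'])
```

Dropping `no_plot=True` and calling `plt.show()` afterwards draws the tree as in the figure above. The leaf order in `info['ivl']` places features that merge early next to each other.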

    Further reading:
    Agglomerative Clustering and Dendrograms - Explained
    Cluster Dendrograms: Agglomerative Clustering and Dendrograms, Explained (Chinese-language article)

    Article link: https://www.haomeiwen.com/subject/zburaktx.html