美文网首页
pyradiomics官方文档学习(6)--FeatureVis

pyradiomics官方文档学习(6)--FeatureVis

作者: 北欧森林 | 来源:发表于2021-02-05 04:35 被阅读0次

    pyradiomics 官方文档里有几个示例文件,里面涉及了包括yaml文件设置、feature extraction、可视化等一系列影像组学常规操作,是非常好的学习资料。源文件链接:https://github.com/AIM-Harvard/pyradiomics

    今天学习在名称为''FeatureVisualization"这份文档。

    pyRadiomics feature visualization using multi-dimensional scaling and heatmaps
    1. 下载数据
      Note:前言介绍说示例数据来自Brigham and Women's Surgical Planning Lab(SPL),但是运行代码会报错。手动核实代码里的网址,发现网页已经不存在了。所以这一节(及另外一个以这个data为示例的文档)里就只能学习代码,不能运行它们了。
    # Download the zip file if it does not exist
    import os, zipfile
    import pandas as pd
    import seaborn as sns
    
    from six.moves import urllib
    
    url = "http://www.spl.harvard.edu/publications/bitstream/download/5270"
    filename = 'example_data/Tumorbase.zip'
    if not os.path.isfile(filename):
        if not os.path.isdir('example_data'):
            os.mkdir('example_data')
        print ("retrieving")
        urllib.request.urlretrieve(url, filename)
    else:
        print ("file already downloaded")
        
    extracted_path = 'example_data/tumorbase'
    if not os.path.isdir(extracted_path):
        print ("unzipping")
        z = zipfile.ZipFile(filename)
        z.extractall('example_data')
        print ("done unzipping")
    
    1. 导入必要的函数
    # Import some libraries
    import SimpleITK as sitk
    from radiomics import featureextractor
    
    1. 提取特征(Extract features)
    # Load up the segmentations, 1 to 10 and extract the features
    params = os.path.join(os.getcwd(), '..', 'examples', 'exampleSettings', 'Params.yaml')
    
    extractor = featureextractor.RadiomicsFeatureExtractor(params)
    # hang on to all our features
    features = {}
    
    for case_id in range(1,11):
        path = 'example_data/tumorbase/AutomatedSegmentation/case{}/'.format(case_id)
        image = sitk.ReadImage(path + "grayscale.nrrd")
        mask = sitk.ReadImage(path + "segmented.nrrd")
        # Tumor is in label value 6
        features[case_id] = extractor.execute ( image, mask, label=6 )
        
    
    # A list of the valid features, sorted
    feature_names = list(sorted(filter ( lambda k: k.startswith("original_"), features[1] )))
    
    # Make a numpy array of all the values
    import numpy as np
    
    samples = np.zeros((10,len(feature_names)))
    for case_id in range(1,11):
        a = np.array([])
        for feature_name in feature_names:
            a = np.append(a, features[case_id][feature_name])
        samples[case_id-1,:] = a
        
    # May have NaNs
    samples = np.nan_to_num(samples)
    
    1. Multidimensional scaling

    Multidimensional scaling or MDS is as way to visualize very high dimensional data in a lower dimensional space. In our case, the feature space is len(feature_names) (or 93) dimensional space. To help us understand the data, we project into 2d space. MDS preserves the relative distance between sample points during the projection, so two samples close together in 2d space would also be close together in the original 93-dimensional space (and vice versa).
    We us non-metric algorithm, because our data are highly non-uniform in the scale of each feature.

    Note: MDS算法思想很简单,一句话就是:保持样本在原空间和低维空间的距离不变。因为距离是样本之间一个很好的分离属性,对于大多数聚类算法来说,距离是将样本分类的重要属性,因此当我们降维后,保持距离不变,那么就相当于保持了样本的相对空间关系不变。(CSDN@changyuanchn)

    from sklearn import manifold
    from sklearn.metrics import euclidean_distances
    from sklearn.decomposition import PCA
    
    similarities = euclidean_distances(samples)
    
    
    seed = np.random.RandomState(seed=3)
    
    mds = manifold.MDS(n_components=2, max_iter=5000, eps=1e-12, random_state=seed,
                       n_init=10,
                       dissimilarity="precomputed", n_jobs=1, metric=False)
    pos = mds.fit_transform(similarities)
    
    1. 作图(Plot)
    # Plot
    
    from matplotlib import pyplot as plt
    from matplotlib.collections import LineCollection
    import matplotlib.cm as cm
    
    
    fig = plt.figure(1)
    ax = plt.axes([0., 0., 1., 1.])
    
    s = 100
    
    # Type of tumor
    meningioma = [0, 1, 2]
    glioma = [3,5,9]
    astrocytoma = [4, 6, 7, 8]
    
    plt.scatter(pos[meningioma, 0], pos[meningioma, 1], color='navy', alpha=1.0, s=s, lw=1, label='meningioma')
    plt.scatter(pos[glioma, 0], pos[glioma, 1], color='turquoise', alpha=1.0, s=s, lw=1, label='glioma')
    plt.scatter(pos[astrocytoma, 0], pos[astrocytoma, 1], color='darkorange', alpha=0.5, s=s, lw=1, label='astrocytoma')
    
    plt.legend(scatterpoints=1, loc=5, shadow=False)
    
    similarities = similarities.max() / similarities * 100
    similarities[np.isinf(similarities)] = 0
    plt.show()
    
    image.png
    1. 绘制热图(Plot features as a heatmap)
    import pandas as pd
    import seaborn as sns
    
    # Construct a pandas dataframe from the samples
    d = pd.DataFrame(data=samples, columns=feature_names)
    corr = d.corr()
    
    # Set up the matplotlib figure, make it big!
    f, ax = plt.subplots(figsize=(15, 10))
    
    # Draw the heatmap using seaborn
    sns.heatmap(corr, vmax=.8, square=True)
    plt.show()
    
    image.png
    1. 绘制聚类热图(Cluster the heatmap)
    # Choose a subset of features for clustering
    dd = d.iloc[:,1:50]
    
    pp = sns.clustermap(dd.corr(), linewidths=.5, figsize=(13,13))
    _ = plt.setp(pp.ax_heatmap.get_yticklabels(), rotation=0)
    
    plt.show()
    
    image.png

    相关文章

      网友评论

          本文标题:pyradiomics官方文档学习(6)--FeatureVis

          本文链接:https://www.haomeiwen.com/subject/dvodtltx.html