Implementing Clustering Algorithms in Python (Part 1)

Author: 此方病 | Published: 2019-07-08 16:04

    The K-Means algorithm in the sklearn package

    1. Function: sklearn.cluster.KMeans

    class sklearn.cluster.KMeans(n_clusters=8, init='k-means++', n_init=10, max_iter=300, tol=0.0001, precompute_distances='auto', verbose=0, random_state=None, copy_x=True, n_jobs=None, algorithm='auto')

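    1. Main parameters:

    n_clusters: the number of clusters to form, i.e. the value of k. Default is 8.

    init: 'k-means++', 'random' or an ndarray

    'k-means++': choose the initial centroids with the k-means++ algorithm. This is the default.

    'random': pick k observations (rows) at random from the data and use them as the initial centroids.

    ndarray: an array of shape (n_clusters, n_features) that gives the initial centroids explicitly.

    n_init: the number of times the algorithm is run with different initial centroids; the run with the lowest inertia is kept.

    max_iter: the maximum number of iterations for a single run. Default is 300.

    random_state: set this to an integer to make the results reproducible.

    n_jobs: the number of parallel jobs (-1 means use all CPUs). This parameter was deprecated in scikit-learn 0.23 and removed in 1.0.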
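
    The three forms of init and the effect of random_state can be tried on a small data set. The following is a minimal sketch, not a recommended configuration: the data reuse the points from the official example, and the starting centroids in `start` are hypothetical values chosen only for illustration. n_jobs is left out so that the snippet also runs on releases that no longer accept it.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data: two well-separated groups of three points each.
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]], dtype=float)

# init='k-means++' with a fixed seed: two separate fits give the same labels.
km_a = KMeans(n_clusters=2, init="k-means++", n_init=10,
              max_iter=300, random_state=0).fit(X)
km_b = KMeans(n_clusters=2, init="k-means++", n_init=10,
              max_iter=300, random_state=0).fit(X)
same_labels = np.array_equal(km_a.labels_, km_b.labels_)

# init='random': the initial centroids are k rows drawn at random from X.
km_random = KMeans(n_clusters=2, init="random", n_init=10,
                   random_state=0).fit(X)

# init as an ndarray of shape (n_clusters, n_features): explicit start.
# One run (n_init=1) is enough because the starting point is fixed.
start = np.array([[0.0, 0.0], [12.0, 3.0]])  # hypothetical starting guesses
km_array = KMeans(n_clusters=2, init=start, n_init=1).fit(X)

print(same_labels)
print(km_random.inertia_, km_array.inertia_)
```

    inertia_ is the sum of squared distances from each point to its assigned centroid, which is the quantity n_init uses to pick the best run.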

    1. Main attributes:

    cluster_centers_: the coordinates of the cluster centers

    labels_: the cluster label of each training point
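
    How the two attributes relate to each other can be checked directly. This is a small sketch under the assumption that the fit has converged: each row of cluster_centers_ should then equal the mean of the points carrying that label, and predict() on the training data should reproduce labels_.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]], dtype=float)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# labels_ has one entry per training row; cluster_centers_ has one row per cluster.
print(kmeans.labels_.shape)           # (6,)
print(kmeans.cluster_centers_.shape)  # (2, 2)

# After convergence, each centroid is the mean of the points assigned to it.
centers_are_means = True
for k in range(kmeans.n_clusters):
    members = X[kmeans.labels_ == k]
    if not np.allclose(members.mean(axis=0), kmeans.cluster_centers_[k]):
        centers_are_means = False

# For the training data, predict() gives the same assignment as labels_.
labels_match = np.array_equal(kmeans.predict(X), kmeans.labels_)
print(centers_are_means, labels_match)
```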

    1. Example from the official documentation:
    >>> from sklearn.cluster import KMeans
    >>> import numpy as np
    >>> X = np.array([[1, 2], [1, 4], [1, 0],
    ...               [10, 2], [10, 4], [10, 0]])
    >>> kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
    >>> kmeans.labels_
    array([1, 1, 1, 0, 0, 0], dtype=int32)
    >>> kmeans.predict([[0, 0], [12, 3]])
    array([1, 0], dtype=int32)
    >>> kmeans.cluster_centers_
    array([[10.,  2.],
           [ 1.,  2.]])
    
    1. My code:
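    # Run in a Jupyter notebook; the magic below renders plots inline.
    %matplotlib inline
    import matplotlib.pyplot as plt
    from sklearn.cluster import KMeans
    import numpy as np
    # Six points: three at x = 1 and three at x = 4.
    X = np.array([[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]])
    plt.plot(X[:, 0], X[:, 1], 'x')          # plot the raw points as crosses
    kmeans = KMeans(n_clusters=2).fit(X)     # no random_state, so labels may swap between runs
    print(kmeans.labels_)                    # cluster index of each training point
    print(kmeans.predict([[0, 0], [4, 4]]))  # nearest-centroid assignment for new points
    print(kmeans.cluster_centers_)           # centroid coordinates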
    

    Evaluation metrics for clustering algorithms

    1. Function: sklearn.metrics.silhouette_score

    sklearn.metrics.silhouette_score(X, labels, metric='euclidean', sample_size=None, random_state=None, **kwds)

    1. Example:

    https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_digits.html#sphx-glr-auto-examples-cluster-plot-kmeans-digits-py
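
    For a quicker look than the digits example, silhouette_score can be used to compare several values of n_clusters. The sketch below assumes synthetic data (three Gaussian blobs generated with NumPy, not from the original post) and keeps the k with the highest mean silhouette coefficient.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data (an assumption for illustration): three Gaussian blobs in 2-D.
rng = np.random.RandomState(0)
centers = np.array([[0, 0], [6, 6], [0, 6]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Mean silhouette coefficient over all samples, in [-1, 1]; higher is better.
    scores[k] = silhouette_score(X, labels, metric="euclidean")

best_k = max(scores, key=scores.get)
print(scores)
print("best k by silhouette:", best_k)
```

    silhouette_score needs at least two distinct labels, which is why the loop starts at k = 2.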

    The Gaussian mixture model (GMM) in the sklearn package

    1. Function: sklearn.mixture.GaussianMixture

    sklearn.mixture.GaussianMixture(n_components=1, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmeans', weights_init=None, means_init=None, precisions_init=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10)

    1. Examples:

    https://scikit-learn.org/stable/auto_examples/mixture/plot_gmm_covariances.html#sphx-glr-auto-examples-mixture-plot-gmm-covariances-py

    https://scikit-learn.org/stable/auto_examples/mixture/plot_gmm_selection.html#sphx-glr-auto-examples-mixture-plot-gmm-selection-py
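
    A shorter starting point than the linked examples is to fit GaussianMixture on data with a known structure. The sketch below assumes synthetic data (two Gaussian blobs generated with NumPy, not from the original post). Unlike KMeans, the model gives soft assignments through predict_proba in addition to the hard labels from predict.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic data (an assumption for illustration): two Gaussian blobs in 2-D.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[5, 5], scale=0.5, size=(100, 2)),
])

gmm = GaussianMixture(n_components=2, covariance_type="full",
                      n_init=1, init_params="kmeans", random_state=0).fit(X)

hard = gmm.predict(X)        # most likely component for each sample
soft = gmm.predict_proba(X)  # posterior probability per component; rows sum to 1

print(gmm.means_)            # estimated component means, shape (2, 2)
print(gmm.weights_)          # mixing weights, which sum to 1
print(soft[:3].round(3))
```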

    Reference

    Overview of clustering algorithms

    https://www.cnblogs.com/lc1217/p/6893924.html

    https://www.cnblogs.com/lc1217/p/6908031.html

    https://www.cnblogs.com/lc1217/p/6963687.html

    GMM vs K-means

    https://www.jianshu.com/p/a4d8fa39c762

    https://www.jianshu.com/p/13898e68c5c6

    sklearn

    https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html

    https://scikit-learn.org/stable/modules/generated/sklearn.metrics.silhouette_score.html

    https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_digits.html#sphx-glr-auto-examples-cluster-plot-kmeans-digits-py

    https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html

    https://scikit-learn.org/stable/auto_examples/mixture/plot_gmm_selection.html#sphx-glr-auto-examples-mixture-plot-gmm-selection-py
