Problem:
- Please build a Gaussian mixture model (GMM) to model the data in the file TrainingData_GMM.csv. Note that the data is composed of 4 clusters, and the model should be trained with the expectation-maximization (EM) algorithm.
- Based on the GMM learned above, assign each training data point to one of the 4 clusters.
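The two tasks above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in this post: the names `gaussian_pdf` and `fit_gmm`, the initialization of the means at randomly chosen data points, the small 1e-6 ridge added to each covariance, and the stopping tolerance are all assumptions made for the example.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    solved = np.linalg.solve(cov, diff.T).T              # cov^{-1} (x - mean), row by row
    expo = -0.5 * np.sum(diff * solved, axis=1)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm

def fit_gmm(X, k=4, n_iter=100, tol=1e-6, seed=0):
    """Fit a k-component GMM with EM; returns parameters, responsibilities
    from the last E-step, and the log-likelihood recorded at each E-step."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = np.full(k, 1.0 / k)
    means = X[rng.choice(n, size=k, replace=False)]      # assumed init: random data points
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(k)])
    history = []
    for _ in range(n_iter):
        # E-step: responsibility gamma[i, j] of component j for point i
        dens = np.column_stack(
            [weights[j] * gaussian_pdf(X, means[j], covs[j]) for j in range(k)]
        )
        total = dens.sum(axis=1, keepdims=True)
        gamma = dens / total
        history.append(float(np.log(total).sum()))       # log-likelihood of current parameters
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
        # M-step: re-estimate weights, means and covariances from gamma
        nk = gamma.sum(axis=0)
        weights = nk / n
        means = (gamma.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            # the 1e-6 ridge is an assumption to keep covariances invertible
            covs[j] = (gamma[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
    return weights, means, covs, gamma, history
```

If the CSV file is a headerless table of comma-separated numbers (an assumption, since its layout is not shown here), it could be loaded with `np.loadtxt("TrainingData_GMM.csv", delimiter=",")` and passed to `fit_gmm`. Hard cluster assignments then follow from `np.argmax(gamma, axis=1)`.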
Questions:
1) Show how the log-likelihood evolves as the training proceeds
[Figure: log-likelihood curve during training. The x-axis is the iteration number and the y-axis is the log-likelihood value.]
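One property worth checking on such a curve: each EM iteration cannot decrease the log-likelihood, so the curve should rise and then flatten. A drop larger than floating-point noise usually points to a bug in the E-step or M-step. The helper below is a small sketch of that check; the name `first_decrease`, the tolerance, and the numbers in the example are illustrative assumptions, not values from this post.

```python
def first_decrease(log_likelihoods, tol=1e-8):
    """Return the index of the first iteration whose log-likelihood is lower
    than the previous one by more than tol, or None if there is no such drop."""
    for i in range(1, len(log_likelihoods)):
        if log_likelihoods[i] < log_likelihoods[i - 1] - tol:
            return i
    return None

# Made-up values, for illustration only
healthy = [-10.0, -5.0, -4.9]
suspicious = [-10.0, -5.0, -6.0]
```

Here `first_decrease(healthy)` returns `None`, while `first_decrease(suspicious)` returns `2`, the iteration at which the value fell.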
2) The learned mathematical expression for the GMM model after training on the given dataset
3) Randomly select 500 data points from the given dataset and plot them on a 2-dimensional coordinate system. Mark the data points coming from the same cluster (using the results of Problem 2) with the same color.
4) Analyze the impact of initialization on the values to which the EM algorithm converges
Different initial parameters have a very large effect on the final convergence of the EM-GMM algorithm; my [image]
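Because EM only climbs to a local maximum of the likelihood, a common way to reduce this sensitivity is to run it from several random starts and keep the run with the highest final log-likelihood. The sketch below shows only that selection step. `best_of_restarts` and `fit` are hypothetical names: `fit` stands for any callable that takes a seed and returns a `(params, final_log_likelihood)` pair, and the toy `fake_fit` with its numbers exists purely to make the example runnable.

```python
def best_of_restarts(fit, seeds):
    """Run fit(seed) for every seed and keep the result with the highest
    final log-likelihood. Returns (best_seed, best_params, best_ll)."""
    best_seed, best_params, best_ll = None, None, float("-inf")
    for seed in seeds:
        params, ll = fit(seed)
        if ll > best_ll:
            best_seed, best_params, best_ll = seed, params, ll
    return best_seed, best_params, best_ll

# Toy stand-in for an EM run: the log-likelihoods are made up for illustration
def fake_fit(seed):
    made_up_ll = {0: -120.0, 1: -95.5, 2: -101.0}
    return {"seed": seed}, made_up_ll[seed]

result = best_of_restarts(fake_fit, [0, 1, 2])
```

With these made-up values, `result` is `(1, {"seed": 1}, -95.5)`. Comparing the final log-likelihoods across seeds also gives a rough picture of how many distinct local optima the data admits.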
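import numpy as np
import pylab

# data, n (number of points) and E() are assumed to be defined earlier in the notebook
node_num = 500
_, gamma = E()  # E-step: gamma[i, k] is the responsibility of cluster k for point i
label = np.argmax(gamma, axis=1)  # hard assignment: cluster with the largest responsibility
# Sample without replacement so that the 500 plotted points are distinct
selected_node_index = np.random.choice(n, size=node_num, replace=False)
node_pos = data[selected_node_index]
label = label[selected_node_index]
pylab.scatter(node_pos[:, 0], node_pos[:, 1], marker='o', c=label, cmap=pylab.cm.Accent)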
[Figure: scatter plot of the 500 sampled points, colored by assigned cluster]