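As the figure above shows, the red and blue points are projected onto clearly separated regions.
When a new point arrives, it can be projected with the formula derived above.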
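import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.datasets import make_moons


def rbf_kpca(X, gamma, k):
    # Pairwise squared Euclidean distances (condensed), expanded to an N x N matrix
    sq_dists = pdist(X, metric='sqeuclidean')
    mat_sq_dists = squareform(sq_dists)
    # RBF kernel matrix
    K = np.exp(-gamma * mat_sq_dists)
    # Center the kernel matrix in feature space
    N = X.shape[0]
    one_N = np.ones((N, N)) / N
    K = K - one_N.dot(K) - K.dot(one_N) + one_N.dot(K).dot(one_N)
    # eigh returns eigenvalues in ascending order, so take the last k eigenpairs
    Lambda, Q = np.linalg.eigh(K)
    alphas = np.column_stack([Q[:, -i] for i in range(1, k + 1)])
    lambdas = [Lambda[-i] for i in range(1, k + 1)]
    return alphas, lambdas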
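def proj_new(X_new, X, gamma, alphas, lambdas):
    # Kernel values between the new point and every training sample
    k = np.exp(-gamma * np.sum((X - X_new) ** 2, axis=1))
    # Divide by lambda, not sqrt(lambda): here the projection of training
    # sample i is taken to be entry a_i of the unit-norm eigenvector a, and
    # K a = lambda * a gives a_i = (K a)_i / lambda.
    # alphas / lambdas are therefore the normalized alphas.
    return k.dot(alphas / lambdas)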
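# Fix random_state for reproducibility
X, y = make_moons(n_samples=100, random_state=123)
alphas, lambdas = rbf_kpca(X, gamma=15, k=1)
# Reuse one of the existing samples as the "new" point to test the projection
X_new = X[25]
X_proj = proj_new(X_new, X, gamma=15, alphas=alphas, lambdas=lambdas)
print(alphas[25])
print(X_proj)
# Output (the overall sign depends on the eigensolver):
# [-0.07877284]
# [-0.07877284]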
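The division by `lambdas` in `proj_new` can be checked numerically. The sketch below is only an illustration under stated assumptions: `centered_rbf_kernel`, `X_demo`, and the choice `gamma=1.0` are hypothetical names and values introduced here, and the data is random rather than the moons set. It shows that, for a unit-norm eigenvector `a` of the centered kernel matrix, dividing `K_c @ a` by the eigenvalue reproduces `a` itself, while dividing by its square root gives the `sqrt(lambda)`-scaled convention.

```python
import numpy as np


def centered_rbf_kernel(X, gamma):
    """Return the centered RBF kernel matrix for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))
    N = X.shape[0]
    one_N = np.full((N, N), 1.0 / N)
    return K - one_N @ K - K @ one_N + one_N @ K @ one_N


rng = np.random.default_rng(0)            # toy data, assumed for illustration
X_demo = rng.normal(size=(30, 2))
K_c = centered_rbf_kernel(X_demo, gamma=1.0)

eigvals, eigvecs = np.linalg.eigh(K_c)    # eigenvalues in ascending order
lam, a = eigvals[-1], eigvecs[:, -1]      # top eigenpair, with ||a|| = 1

by_lambda = K_c @ a / lam                 # reproduces the eigenvector entries
by_sqrt_lambda = K_c @ a / np.sqrt(lam)   # equals sqrt(lam) * a instead

print(np.allclose(by_lambda, a))
print(np.allclose(by_sqrt_lambda, np.sqrt(lam) * a))
```

Which scaling is appropriate depends on what is treated as the projection of a training sample: the raw eigenvector entries, as in the code above, or the entries scaled by `sqrt(lambda)`.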
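PCA whitening, in turn, reduces redundancy: after the data is projected onto the principal components, each component is rescaled to unit standard deviation.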
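A minimal sketch of PCA whitening, assuming plain linear PCA on a small synthetic dataset. The function name `pca_whiten`, the stabilizer `eps`, and the mixing matrix `A` are illustrative choices, not taken from the text above.

```python
import numpy as np


def pca_whiten(X, eps=1e-8):
    """Rotate X onto its principal components and rescale each to unit variance."""
    X_centered = X - X.mean(axis=0)
    cov = X_centered.T @ X_centered / X_centered.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    X_rot = X_centered @ eigvecs             # decorrelated components
    return X_rot / np.sqrt(eigvals + eps)    # divide by each component's std


rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0], [1.5, 0.5]])       # mixes the two features
X_corr = rng.normal(size=(500, 2)) @ A       # correlated toy data
X_white = pca_whiten(X_corr)

# Covariance of the whitened data should be close to the identity matrix
cov_white = X_white.T @ X_white / X_white.shape[0]
print(np.round(cov_white, 3))
```

The small `eps` only guards against division by a near-zero eigenvalue; with well-conditioned data it has no visible effect.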
https://blog.csdn.net/lanchunhui/article/details/50492482
https://www.youtube.com/watch?v=G2NRnh7W4NQ&list=PLt0SBi1p7xrRKE2us8doqryRou6eDYEOy&index=2