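When clustering with `kmeans` from `scipy.cluster.vq`, the data is first passed through the `whiten` function. On inspection, `whiten` normalizes the input by dividing each feature (column) by its standard deviation.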

$$variance = \frac{\sum_{i=1}^{n}(x_{i} - x_{mean})^{2}}{n}$$

$$std\_deviation = \sqrt{variance}$$

After whitening:

$$x_{i}^{'} = \frac{x_{i}}{std\_deviation}$$
Unlike standardization (z-scoring), whitening here does not subtract the mean from the data; it only rescales.
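To make that difference concrete, here is a minimal sketch comparing the two scalings. The toy array and the names `scaled` and `standardized` are illustrative assumptions, not part of the original post. Both versions end up with unit standard deviation per column, but only the standardized one is centered at zero.

```python
import numpy as np
from scipy.cluster.vq import whiten

# Hypothetical toy data: two features with different scales and nonzero means
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 60.0]])

std = data.std(axis=0)  # per-column standard deviation, dividing by n

scaled = whiten(data)                             # divide by std only
standardized = (data - data.mean(axis=0)) / std   # subtract mean, then divide by std

print('std after whiten:        ', scaled.std(axis=0))
print('std after standardizing: ', standardized.std(axis=0))
print('mean after whiten:       ', scaled.mean(axis=0))        # stays away from zero
print('mean after standardizing:', standardized.mean(axis=0))  # approximately zero
```

Because the mean is left in place, whitened points keep their relative offsets from the origin; only the units of each axis change.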
The code below computes the result step by step and then by calling the library function; the two results are the same.

    import numpy as np
    from scipy.cluster.vq import whiten

    features = np.array([[1.9, 2.3],
                         [1.5, 2.5],
                         [0.8, 0.6],
                         [0.4, 1.8],
                         [0.1, 0.1],
                         [0.2, 1.8],
                         [2.0, 0.5],
                         [0.3, 1.5],
                         [1.0, 1.0]])

    mean = np.mean(features, axis=0)            # mean of each column
    np_square = np.power(features - mean, 2)    # squared deviations from the mean
    # print('mean:\n', mean)
    print('features - mean:\n', features - mean)
    # print('square:\n', np_square)
    mean_square = np.mean(np_square, axis=0)    # variance of each column
    # print('mean_square:\n', mean_square)
    std_deviation = np.sqrt(mean_square)        # standard deviation of each column
    # print('std_deviation:\n', std_deviation)
    np_whitened = features / std_deviation      # scale by the standard deviation
    print('* np_whitened ')
    print(np_whitened)

    whitened = whiten(features)                 # same operation via the SciPy function
    print('* whitened ')
    print(whitened)
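The two results agree because the formula above divides by n, which matches the population standard deviation (`ddof=0`, NumPy's default). As a rough check, the sketch below compares `whiten` against both the population and the sample (`ddof=1`) versions. The names `manual_population` and `manual_sample` are introduced here for illustration only.

```python
import numpy as np
from scipy.cluster.vq import whiten

features = np.array([[1.9, 2.3], [1.5, 2.5], [0.8, 0.6],
                     [0.4, 1.8], [0.1, 0.1], [0.2, 1.8],
                     [2.0, 0.5], [0.3, 1.5], [1.0, 1.0]])

# Divide by n (population standard deviation), as in the formula above
manual_population = features / features.std(axis=0, ddof=0)
# Divide by n - 1 (sample standard deviation) for comparison
manual_sample = features / features.std(axis=0, ddof=1)

matches_population = np.allclose(manual_population, whiten(features))
matches_sample = np.allclose(manual_sample, whiten(features))
print('matches population std:', matches_population)
print('matches sample std:    ', matches_sample)
```

Only the population version should match, so a pipeline that scales with the sample standard deviation (the default in some other libraries) will give slightly different whitened values.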
k-means after whitening
    import numpy as np
    from scipy.cluster.vq import kmeans, whiten
    import matplotlib.pyplot as plt

    np.random.seed((1000, 2000))

    # `whitened` here is the 9-point array from the previous snippet
    codes = 3
    kmeans(whitened, codes)
    # (array([[ 2. , 2.],    # random
    #         [ 1. , 0.],
    #         [ 0. , 2.]]), 0.86241)

    # Create 50 datapoints in each of two clusters a and b
    pts = 50
    a = np.random.multivariate_normal([0, 0], [[4, 1], [1, 4]], size=pts)     # mean, covariance, sample count
    b = np.random.multivariate_normal([30, 10], [[10, 2], [2, 1]], size=pts)
    features = np.concatenate((a, b))

    # Whiten data
    whitened = whiten(features)

    # Find 2 clusters in the data
    codebook, distortion = kmeans(whitened, 2)   # returns the cluster centers and the distortion

    # Plot whitened data and cluster centers in red
    plt.scatter(whitened[:, 0], whitened[:, 1])
    # plt.scatter(features[:, 0], features[:, 1], c='k')
    plt.scatter(codebook[:, 0], codebook[:, 1], c='r', label='centroid')
    plt.legend()
    plt.show()
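The centroids returned by `kmeans` live in the whitened coordinate system. As a possible follow-up (not part of the original post), the sketch below uses `vq` to assign each point to its nearest centroid and multiplies the codebook by the per-column standard deviation to express the centroids in the original units. The seed value and the names `labels`, `distances` and `centroids_original` are assumptions made for this example.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq, whiten

rng = np.random.default_rng(0)  # arbitrary seed, only fixes the synthetic data

# Same two-cluster setup as in the snippet above
a = rng.multivariate_normal([0, 0], [[4, 1], [1, 4]], size=50)
b = rng.multivariate_normal([30, 10], [[10, 2], [2, 1]], size=50)
features = np.concatenate((a, b))

std = features.std(axis=0)      # the scale that whiten divides by
whitened = whiten(features)

codebook, distortion = kmeans(whitened, 2)

# vq returns, for each observation, the index of the nearest centroid
# and the distance to it
labels, distances = vq(whitened, codebook)

# whiten only divides by std, so multiplying by std maps centroids back
centroids_original = codebook * std

print('points per cluster:', np.bincount(labels))
print('centroids in original units:\n', centroids_original)
```

This back-transformation works with a plain multiplication only because `whiten` does not subtract the mean; with full standardization the mean would have to be added back as well.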
