Python Data Analysis and Mining in Practice: Study Notes (4) -- Clustering Algorithms

  • lkj155
  • 2
  • 2019-08-30 19:45

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import patches

from sklearn import datasets
from sklearn.mixture import GaussianMixture  # the old GMM class was replaced by GaussianMixture
from sklearn.model_selection import StratifiedKFold

# Load the iris dataset
iris = datasets.load_iris()
# print(iris)  # the dataset holds two parts: data and target

# Split dataset into training and testing (80/20 split)
skf = StratifiedKFold(n_splits=5)                # split the data into 5 folds
indices = skf.split(iris.data, iris.target)      # each split uses 4 folds for training and 1 for testing

# Take the first fold
train_index, test_index = next(iter(indices))

# Extract training data and labels
X_train = iris.data[train_index]
y_train = iris.target[train_index]

# Extract testing data and labels
X_test = iris.data[test_index]
y_test = iris.target[test_index]

# Extract the number of classes
num_classes = len(np.unique(y_train))

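# Compute the per-class means of the training data to seed the GMM.
# Assigning classifier.means_ before fit() has no effect, because fit()
# re-initializes the parameters; the means are passed as means_init instead.
means_init = np.array([X_train[y_train == i].mean(axis=0)
                       for i in range(num_classes)])

# Build GMM
# n_components: number of Gaussian components in the mixture (here num_classes)
# covariance_type: form of each component's covariance matrix ('full', 'tied', 'diag' or 'spherical')
# init_params: how the initial parameters are estimated ('kmeans' or 'random')
# max_iter: maximum number of EM iterations (replaces n_iter from the old GMM class)
classifier = GaussianMixture(n_components=num_classes, covariance_type='full',
                             init_params='kmeans', max_iter=20,
                             means_init=means_init)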

# Train the GMM classifier
classifier.fit(X_train)

# Draw boundaries
plt.figure()
colors = 'bgr'
for i, color in enumerate(colors):
    # Extract eigenvalues and eigenvectors.
    # Following the GaussianMixture attributes, the name was changed to covariances_.
    # Writing covariances_() raised an error: it is an attribute holding a NumPy
    # array, not a method, so it must be used without parentheses.
    eigenvalues, eigenvectors = np.linalg.eigh(classifier.covariances_[i][:2, :2])

    # Normalize the first eigenvector
    # (np.linalg.eigh returns the eigenvectors as columns)
    norm_vec = eigenvectors[:, 0] / np.linalg.norm(eigenvectors[:, 0])

    # Extract the angle of tilt
    angle = np.arctan2(norm_vec[1], norm_vec[0])
    angle = 180 * angle / np.pi

    # Scaling factor to magnify the ellipses
    # (random value chosen to suit our needs)
    scaling_factor = 8
    eigenvalues *= scaling_factor

    # Draw the ellipse
    # (angle is passed by keyword; recent matplotlib versions no longer accept it positionally)
    ellipse = patches.Ellipse(classifier.means_[i, :2],
                              eigenvalues[0], eigenvalues[1],
                              angle=180 + angle, color=color)
    axis_handle = plt.subplot(1, 1, 1)
    ellipse.set_clip_box(axis_handle.bbox)
    ellipse.set_alpha(0.6)
    axis_handle.add_artist(ellipse)

# Plot the data
colors = 'bgr'
for i, color in enumerate(colors):
    cur_data = iris.data[iris.target == i]
    plt.scatter(cur_data[:, 0], cur_data[:, 1], marker='o',
                facecolors='none', edgecolors='black', s=40,
                label=iris.target_names[i])

    test_data = X_test[y_test == i]
    plt.scatter(test_data[:, 0], test_data[:, 1], marker='s',
                facecolors='black', edgecolors='black', s=40,
                label=iris.target_names[i])

# Compute predictions for training and testing data
y_train_pred = classifier.predict(X_train)
accuracy_training = np.mean(y_train_pred.ravel() == y_train.ravel()) * 100
print('Accuracy on training data =', accuracy_training)

y_test_pred = classifier.predict(X_test)
accuracy_testing = np.mean(y_test_pred.ravel() == y_test.ravel()) * 100
print('Accuracy on testing data =', accuracy_testing)

plt.title('GMM classifier')
plt.xticks(())
plt.yticks(())

plt.show()
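The note in the listing about covariances_ being an attribute rather than a method can be checked in isolation. The snippet below is a minimal sketch, not part of the original notes: it fits a GaussianMixture on the full iris data and inspects the array. The variable names and random_state=0 are assumptions added only for this illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.mixture import GaussianMixture

iris = datasets.load_iris()

# random_state is an assumption added here only to make the run repeatable
gmm = GaussianMixture(n_components=3, covariance_type='full', random_state=0)
gmm.fit(iris.data)

# covariances_ is a plain NumPy array attribute, so it is read without parentheses
cov = gmm.covariances_
print(type(cov).__name__, cov.shape)

# With covariance_type='full' there is one 4x4 matrix per component;
# the plotting code keeps only the block for the first two features
cov_2d = cov[0][:2, :2]
print(cov_2d.shape)
```

With covariance_type='full' the array has shape (n_components, n_features, n_features), which is why the listing slices [:2, :2] before calling np.linalg.eigh.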


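The 80/20 split used in the listing can also be verified on its own. This is a small sketch under the same setup as the listing (iris data, StratifiedKFold with n_splits=5, first fold only); it prints the fold sizes and the class counts of the test fold.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import StratifiedKFold

iris = datasets.load_iris()
skf = StratifiedKFold(n_splits=5)

# split() returns a generator; next() takes the first (train, test) pair
train_index, test_index = next(skf.split(iris.data, iris.target))

# 150 samples in 5 folds: 4 folds for training, 1 for testing
print(len(train_index), len(test_index))

# Stratification keeps the class proportions in the test fold
test_counts = np.bincount(iris.target[test_index])
print(test_counts)
```

Because the folds are stratified, each of the three iris classes contributes the same number of samples to the test fold.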
