In this example, we apply unsupervised learning using k-means clustering. We run the KMeans algorithm from sklearn on the first two features (mean radius and mean texture) of the breast cancer dataset that ships with sklearn.
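
For readers new to the algorithm, here is a minimal NumPy sketch of the idea behind k-means (Lloyd's algorithm): assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The function name `kmeans_sketch`, its parameters, and the toy points are made up for illustration; this is not how sklearn implements `KMeans` internally.

```python
import numpy as np

def kmeans_sketch(X, n_clusters, n_iter=100, seed=0):
    """Toy Lloyd's algorithm; illustrative only, not sklearn's implementation."""
    rng = np.random.default_rng(seed)
    # Start from n_clusters distinct samples chosen at random.
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: each centroid moves to the mean of its points.
        new_centroids = centroids.copy()
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[k] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return labels, centroids

# Two well-separated toy blobs (hypothetical data)
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
labels, centroids = kmeans_sketch(points, n_clusters=2)
print(labels)
```

Which blob ends up as cluster 0 and which as cluster 1 depends on the random start, so only the grouping is meaningful, not the ID numbers.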

python code
#https://jupyter.org/try
#Demo5
#M. S. Rakha, Ph.D.
# Post-Doctoral - Queen's University
# UnSupervised Learning - Clustering Kmeans
# Kmeans Clustering
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

from mpl_toolkits.mplot3d import Axes3D
import pandas as pd
from sklearn.cluster import KMeans
from sklearn import datasets
from sklearn.preprocessing import scale
import sklearn.metrics as sm
from sklearn.metrics import confusion_matrix,classification_report

np.random.seed(5)
breastCancer = datasets.load_breast_cancer()

list(breastCancer.target_names)

X = breastCancer.data[:, 0:2]

y = pd.DataFrame(breastCancer.target)
variableNames = breastCancer.feature_names

#first ten records
X[0:10,]

#Building your KMeans model
n_clusters = 2 # The number of clusters
init = 'random' # Initial centroids are picked at random from the data points
n_init = 10 # Number of runs with different initial centroids; the best run (lowest inertia) is kept
clusteringKMeans = KMeans(n_clusters=n_clusters, init=init, n_init=n_init)
clusteringKMeans.fit(X)

##Plotting the model output
breastCancer_df = pd.DataFrame(breastCancer.data)
breastCancer_df = breastCancer_df.iloc[:, 0:2]  # keep the first two columns (mean radius, mean texture)
breastCancer_df.columns = ['meanRadius','meanTexture']
y.columns = ["Targets"]

color_theme = np.array(['red','darkgreen'])
plt.subplot(1,2,1)
plt.scatter(x=breastCancer_df.meanRadius, y=breastCancer_df.meanTexture,c=color_theme[breastCancer.target],s=50)
plt.title('Ground Truth Classification')

plt.subplot(1,2,2)
plt.scatter(x=breastCancer_df.meanRadius, y=breastCancer_df.meanTexture,c=color_theme[clusteringKMeans.labels_],s=50)
plt.title('K-Means Clustering')

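#Evaluate the model
# Note: KMeans cluster IDs are arbitrary. This report treats cluster 0 as class 0 and
# cluster 1 as class 1; if the IDs come out swapped, the scores are inverted.
print(classification_report(y,clusteringKMeans.labels_))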



Below is the performance of the k-means clustering, measured against the ground-truth labels of the dataset.

Code:
              precision    recall  f1-score   support

           0       0.76      0.82      0.78       212
           1       0.89      0.84      0.86       357

    accuracy                           0.83       569
   macro avg       0.82      0.83      0.82       569
weighted avg       0.84      0.83      0.83       569
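
A caveat when reading a report like this: k-means knows nothing about the class labels, so cluster 0 is not guaranteed to correspond to class 0. One common workaround is to map each cluster to the most frequent true class among its members before scoring. The sketch below uses only NumPy; `align_cluster_labels` and the toy arrays are hypothetical names and data for illustration, not part of sklearn.

```python
import numpy as np

def align_cluster_labels(y_true, cluster_labels):
    """Map each cluster ID to the most common true class among its members."""
    y_true = np.asarray(y_true).ravel()
    cluster_labels = np.asarray(cluster_labels).ravel()
    aligned = np.empty_like(y_true)
    for cluster_id in np.unique(cluster_labels):
        mask = cluster_labels == cluster_id
        classes, counts = np.unique(y_true[mask], return_counts=True)
        aligned[mask] = classes[counts.argmax()]
    return aligned

# Toy example: the grouping is nearly right, but the cluster IDs are flipped.
y_true = np.array([0, 0, 0, 1, 1, 1])
clusters = np.array([1, 1, 1, 0, 0, 1])

aligned = align_cluster_labels(y_true, clusters)
print("raw accuracy:    ", (clusters == y_true).mean())
print("aligned accuracy:", (aligned == y_true).mean())
```

In the script above, this would be used as `classification_report(y, align_cluster_labels(y, clusteringKMeans.labels_))`. Note that a majority vote can map two clusters to the same class; a permutation-invariant score such as `sklearn.metrics.adjusted_rand_score` avoids the mapping step altogether.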


The following picture shows the two groups (left: the ground-truth labels; right: the output of k-means).
Attachment:
Image.png




_________________
M. S. Rakha, Ph.D.
Queen's University
Canada




Topic Tags

Artificial Intelligence, Machine Learning, Python





