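This example illustrates an extra analysis that a random forest provides for data scientists: a ranking of the input features by importance. The code below uses scikit-learn's feature_importances_ attribute, which scores each feature by how much its splits reduce impurity, averaged over all trees in the forest.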

python code
#https://jupyter.org/try
#Demo7 - part2
#M. S. Rakha, Ph.D.
# Post-Doctoral - Queen's University
# Supervised Learning - Random Forest
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

np.random.seed(5)
breastCancer = datasets.load_breast_cancer()

list(breastCancer.target_names)

# Use only the first ten features (the "mean" measurements)
X = breastCancer.data[:, 0:10]
y = breastCancer.target


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.50, random_state=42)

# Number of samples in the training and test splits
X_train[:,0].size
X_test[:,0].size

varriableNames= breastCancer.feature_names




randomForestModel = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)

randomForestModel.fit(X_train, y_train);

y_pred = randomForestModel.predict(X_test)


# Report precision, recall, and F1 per class on the held-out test split
print(classification_report(y_test, y_pred))



importances = randomForestModel.feature_importances_
std = np.std([tree.feature_importances_ for tree in randomForestModel.estimators_],
             axis=0)
indices = np.argsort(importances)[::-1]

# Print the feature ranking
print("Feature ranking:")

for f in range(X.shape[1]):
    print("%d. feature %d (%f)" % (f + 1, indices[f], importances[indices[f]]))

# Plot the feature importances of the forest
plt.figure()
plt.title("Feature importances")
plt.bar(range(X.shape[1]), importances[indices],
        color="r", yerr=std[indices], align="center")
plt.xticks(range(X.shape[1]), indices)
plt.xlim([-1, X.shape[1]])
plt.show()

# Prints all 30 feature names; the model uses only the first ten (indices 0-9)
print(varriableNames)



The output of this snippet:

Code:
              precision    recall  f1-score   support

           0       0.92      0.85      0.88        98
           1       0.92      0.96      0.94       187

    accuracy                           0.92       285
   macro avg       0.92      0.90      0.91       285
weighted avg       0.92      0.92      0.92       285

Feature ranking:
1. feature 7 (0.327613)
2. feature 6 (0.197932)
3. feature 2 (0.187159)
4. feature 0 (0.104715)
5. feature 3 (0.102147)
6. feature 5 (0.039644)
7. feature 1 (0.026285)
8. feature 9 (0.008671)
9. feature 4 (0.005309)
10. feature 8 (0.000525)



['mean radius' 'mean texture' 'mean perimeter' 'mean area'
'mean smoothness' 'mean compactness' 'mean concavity'
'mean concave points' 'mean symmetry' 'mean fractal dimension'
'radius error' 'texture error' 'perimeter error' 'area error'
'smoothness error' 'compactness error' 'concavity error'
'concave points error' 'symmetry error' 'fractal dimension error'
'worst radius' 'worst texture' 'worst perimeter' 'worst area'
'worst smoothness' 'worst compactness' 'worst concavity'
'worst concave points' 'worst symmetry' 'worst fractal dimension']
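
Reading the ranking against the name list, feature 7 is 'mean concave points', feature 6 is 'mean concavity', and feature 2 is 'mean perimeter'. A minimal sketch of printing names instead of indices follows; it repeats the setup from the post so it runs on its own, and the names `model`, `names`, `order`, and `ranking` are chosen here for illustration only. Exact scores may differ slightly between scikit-learn versions.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

breast_cancer = datasets.load_breast_cancer()

# Same ten "mean" features as in the post, plus their names
X = breast_cancer.data[:, 0:10]
y = breast_cancer.target
names = breast_cancer.feature_names[0:10]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.50, random_state=42)

model = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)
model.fit(X_train, y_train)

# Sort feature indices from most to least important
importances = model.feature_importances_
order = np.argsort(importances)[::-1]

# Pair each name with its score (illustrative helper list)
ranking = [(str(names[i]), float(importances[i])) for i in order]

print("Feature ranking:")
for rank, (name, score) in enumerate(ranking, start=1):
    print("%d. %s (%f)" % (rank, name, score))
```

Slicing feature_names with the same 0:10 range as the data keeps the names and the importance scores aligned.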

Attachment: RandomForestImportant.png (bar chart of the feature importances plotted by the code above)
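
The feature_importances_ attribute used in the post is computed from impurity reduction inside the trees. A measure closer to "how much error does losing this feature cause" is permutation importance, which shuffles one column at a time on held-out data and records the drop in score. The sketch below is one possible way to do this with sklearn.inspection.permutation_importance (available in scikit-learn 0.22 and later); it is not part of the original demo, and the choice of n_repeats=10 and the variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

breast_cancer = datasets.load_breast_cancer()
X = breast_cancer.data[:, 0:10]
y = breast_cancer.target
names = breast_cancer.feature_names[0:10]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.50, random_state=42)

model = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature 10 times on the test split and measure the
# resulting drop in accuracy (n_repeats=10 is an arbitrary choice here)
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0)

# Rank features by their mean drop in score
order = np.argsort(result.importances_mean)[::-1]

print("Permutation ranking:")
for rank, i in enumerate(order, start=1):
    print("%d. %s (%f +/- %f)" % (
        rank, names[i],
        result.importances_mean[i], result.importances_std[i]))
```

Because it is measured on the test split, this ranking reflects what the fitted model relies on for unseen data, and it can differ from the impurity-based ranking when features are strongly correlated.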




_________________
M. S. Rakha, Ph.D.
Queen's University
Canada

