

This example is a good starting point for learning how to apply machine learning in Python. If you are new to Python and machine learning, it walks you through the simple steps needed to run your first supervised learning model. As a dataset, we use the publicly available Diabetes dataset that ships with the sklearn library. It has records for 442 patients and 10 features: age, sex, BMI, blood pressure, and six blood serum measurements. For simplicity, we pick a single feature, the one at column index 2, which is BMI. The target is a continuous value measuring diabetes disease progression. The model is validated by splitting the dataset into a training set and a testing set. The linear regression model tries to find a linear relationship between the feature X and the target Y. The linear equation is Y=aX+c, where 'a' is the coefficient (slope) and 'c' is the intercept. We train the model on the training split and then measure its performance on the testing split.
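To make the equation Y=aX+c concrete, here is a minimal sketch that recovers 'a' and 'c' with the closed-form least-squares formulas for a single feature. The toy arrays are made-up values chosen for illustration, not taken from the Diabetes dataset, and this is not necessarily how sklearn computes the fit internally.

```python
import numpy as np

# Toy data lying exactly on the line Y = 2X + 1 (made-up values for illustration)
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = 2.0 * X + 1.0

# Closed-form least-squares estimates for one feature:
#   a = sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)
#   c = mean(Y) - a * mean(X)
x_mean, y_mean = X.mean(), Y.mean()
a = np.sum((X - x_mean) * (Y - y_mean)) / np.sum((X - x_mean) ** 2)
c = y_mean - a * x_mean

print("slope a:", a)
print("intercept c:", c)
```

Because the toy points sit exactly on a line, the estimates match the slope and intercept used to generate them. On real data the points scatter around the line, and 'a' and 'c' are the values that minimise the squared error.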

#M. S. Rakha, Ph.D.
#Post-Doctoral - Queen's University
#Supervised Learning - LinearRegression
%matplotlib inline

import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
import pandas as pd

# Load the diabetes dataset
diabetes = datasets.load_diabetes()

# Use only one feature
diabetes_X = diabetes.data[:, np.newaxis, 2]

# Split the data into training/testing sets
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]

# Split the targets into training/testing sets
diabetes_y_train = diabetes.target[:-20]
diabetes_y_test = diabetes.target[-20:]

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)

# Make predictions using the testing set
diabetes_y_pred = regr.predict(diabetes_X_test)

# The intercept and the coefficients
print('Intercept: \n', regr.intercept_)
print('Coefficients: \n', regr.coef_)
# The mean squared error
print("Mean squared error: %.2f"
      % mean_squared_error(diabetes_y_test, diabetes_y_pred))
# Coefficient of determination (R^2): 1 is perfect prediction
print('Variance score: %.2f' % r2_score(diabetes_y_test, diabetes_y_pred))

# Plot outputs
plt.scatter(diabetes_X_test, diabetes_y_test,  color='black')
plt.plot(diabetes_X_test, diabetes_y_pred, color='blue', linewidth=3)


This code uses the scikit-learn library to perform linear regression on the diabetes dataset. The dataset is loaded, and a single feature (column 2) is selected. The data is then split into training and testing sets, with the last 20 records held out for testing. A linear regression object is created and fit to the training data. The model is then used to make predictions on the test data. The code then prints the model's intercept and coefficient, the mean squared error of the predictions, and the R^2 score (labelled "Variance score" in the output). Finally, the code produces a scatter plot of the test data with the predictions drawn on top as a line. The plot shows the relationship between the selected feature and the target variable and how well the linear regression model fits the data.
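The split described above simply slices off the last 20 rows. As a hedged alternative, the sketch below shuffles the row indices before holding out 20 rows, which avoids depending on the order of the records. The arrays are placeholders with the same shapes as the single-feature data, and the seed value is an arbitrary assumption. This is a variation for illustration, not the method used in the script.

```python
import numpy as np

# Stand-in arrays with the same shapes as the single-feature diabetes data
# (442 rows, 1 column); the values are placeholders, not the real dataset.
X = np.arange(442, dtype=float).reshape(-1, 1)
y = np.arange(442, dtype=float)

# Shuffle the row indices reproducibly, then hold out 20 rows for testing
rng = np.random.default_rng(seed=0)
indices = rng.permutation(len(X))
train_idx, test_idx = indices[:-20], indices[-20:]

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

print(X_train.shape, X_test.shape)
```

Indexing X and y with the same index arrays keeps each feature row paired with its target value.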

Linear regression is a good choice for this code because it is a simple and widely used method for modeling the relationship between a dependent variable and one or more independent variables. In this case, the dependent variable is diabetes.target and the independent variable is diabetes.data[:, np.newaxis, 2]. Linear regression can be used to estimate the relationship between these two variables and to make predictions about diabetes.target based on new values of diabetes.data[:, np.newaxis, 2]. The mean squared error and the variance score are used to evaluate the model's performance. The model's output is then plotted to visualize the relationship between the independent and dependent variables. Overall, linear regression is a suitable method for this problem because it is easy to interpret and understand, and it can provide a good fit when the relationship between the variables is linear.
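To show what the two evaluation metrics measure, here is a small sketch that computes the mean squared error and the R^2 score by hand with NumPy. The y_true and y_pred arrays are made-up values for illustration only. The formulas are the standard definitions, which is what mean_squared_error and r2_score compute by default.

```python
import numpy as np

# Made-up true and predicted values, purely for illustration
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 9.0])

# Mean squared error: the average of the squared prediction errors
mse = np.mean((y_true - y_pred) ** 2)

# R^2 = 1 - SS_res / SS_tot, where SS_res is the squared error of the model
# and SS_tot is the squared error of always predicting the mean of y_true
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print("Mean squared error: %.2f" % mse)
print("R^2 score: %.2f" % r2)
```

A lower mean squared error means the predictions are closer to the true values, while an R^2 of 1 means a perfect fit and an R^2 of 0 means the model does no better than predicting the mean.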

Below is the output of running the script in a Jupyter notebook:
Automatically created module for IPython interactive environment
Mean squared error: 2548.07
Variance score: 0.47

[Figure: LinearResults.png]

M. S. Rakha, Ph.D.
Queen's University

Topic Tags

Machine Learning, Python, Artificial Intelligence
