Linear Regression in Machine Learning-plot-algorithms-explain 02


Linear Regression in Machine Learning: Part 2

Welcome to the second part of our linear regression post. In the first post we looked at operations on the dataset. In this post we look at the linear regression itself. Let’s start:

Training a Linear Regression Model

Let’s now begin to train our regression model. We first need to split our data into an X array that contains the features to train on, and a y array with the target variable, in this case the Price column. We will toss out the Address column because it only has text info that the linear regression model cannot use.

X and y array
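
The code for this step is not shown here, so the following is only a minimal sketch of how the split into features and target might look. The small DataFrame `df` holds made-up stand-in rows (the real data is loaded in the first post), and the column names are assumed from the coefficient table later in this post.

```python
import pandas as pd

# Made-up stand-in rows; the real DataFrame comes from part 1 of this post.
df = pd.DataFrame({
    'Avg. Area Income': [60000.0, 70000.0, 80000.0],
    'Avg. Area House Age': [5.0, 6.0, 7.0],
    'Avg. Area Number of Rooms': [6.0, 7.0, 8.0],
    'Avg. Area Number of Bedrooms': [3.0, 4.0, 5.0],
    'Area Population': [30000.0, 35000.0, 40000.0],
    'Price': [1000000.0, 1200000.0, 1500000.0],
    'Address': ['address 1', 'address 2', 'address 3'],
})

# Features: every column except the target and the text-only Address column.
X = df.drop(columns=['Price', 'Address'])
# Target: the column we want to predict.
y = df['Price']

print(X.shape, y.shape)
```

Dropping columns by name, rather than listing the features one by one, is just one way to do it; listing the feature columns explicitly works equally well.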

Train Test Split

Now let’s split the data into a training set and a testing set. We will train our model on the training set and then use the test set to evaluate the model.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
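
As a quick sanity check of what `test_size=0.4` does, the sketch below splits ten made-up rows and shows that four of them end up in the test set. The `random_state=101` argument is an assumption added here so that the split is reproducible; the call above does not set it.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten made-up samples with two features each; row i is [2*i, 2*i + 1].
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=101
)

# 60% of the rows go to training, 40% to testing.
print(X_tr.shape, X_te.shape)
```

The rows of X and y are shuffled together, so each feature row stays paired with its own target value.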

from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(X_train,y_train)
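
For intuition about what `fit` computes: `LinearRegression` fits an ordinary least squares model, so on data that follows a linear rule exactly it should recover that rule. The sketch below uses made-up data (not the housing data), and the names `X_demo`, `y_demo` and `demo_lm` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))

# Known rule with no noise: y = 5 + 3*x0 - 2*x1
y_demo = 5.0 + 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1]

demo_lm = LinearRegression()
demo_lm.fit(X_demo, y_demo)

print(demo_lm.intercept_)  # close to 5.0
print(demo_lm.coef_)       # close to [3.0, -2.0]
```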

Model Evaluation

Let’s evaluate the model by checking out its coefficients and how we can interpret them.
# print the intercept
print(lm.intercept_)
-2640159.79685

coeff_df = pd.DataFrame(lm.coef_,X.columns,columns=['Coefficient'])
coeff_df
                              Coefficient
Avg. Area Income                21.528276
Avg. Area House Age         164883.282027
Avg. Area Number of Rooms   122368.678027
Avg. Area Number of Bedrooms  2233.801864
Area Population                 15.150420

Interpreting the coefficients:

- Holding all other features fixed, a 1 unit increase in **Avg. Area Income** is associated with an **increase of $21.53** in Price.
- Holding all other features fixed, a 1 unit increase in **Avg. Area House Age** is associated with an **increase of $164883.28** in Price.
- Holding all other features fixed, a 1 unit increase in **Avg. Area Number of Rooms** is associated with an **increase of $122368.68** in Price.
- Holding all other features fixed, a 1 unit increase in **Avg. Area Number of Bedrooms** is associated with an **increase of $2233.80** in Price.
- Holding all other features fixed, a 1 unit increase in **Area Population** is associated with an **increase of $15.15** in Price.
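
The "holding all other features fixed" reading can be checked directly: for a linear model, raising one feature by 1 while leaving the rest unchanged shifts the prediction by exactly that feature's coefficient. Here is a minimal sketch on made-up data (not the housing data); the names `demo_lm`, `base` and `bumped` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X_demo = rng.normal(size=(100, 3))
# Made-up linear relationship plus a little noise.
y_demo = X_demo @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

demo_lm = LinearRegression().fit(X_demo, y_demo)

base = np.array([[1.0, 2.0, 3.0]])
bumped = base.copy()
bumped[0, 0] += 1.0  # increase only the first feature by one unit

diff = demo_lm.predict(bumped)[0] - demo_lm.predict(base)[0]
print(diff, demo_lm.coef_[0])  # the two values agree
```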

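Does this make sense? Probably not, because I made up this data. If you want real data to repeat this sort of analysis, check out the [boston dataset](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_boston.html). Note that `load_boston` was removed in scikit-learn 1.2, so on current versions the California housing dataset is a ready replacement:

from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing(as_frame=True)  # downloads the data on first use
print(housing.DESCR)
housing_df = housing.frame  # features and target in one DataFrame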


Predictions from our Model



Let's grab predictions for our test set and see how well the model did.

predictions = lm.predict(X_test)
plt.scatter(y_test,predictions)
<matplotlib.collections.PathCollection at 0x142622c88>
Residual Histogram
# distplot is deprecated in recent seaborn versions; histplot with kde=True is the replacement
sns.histplot(y_test - predictions, kde=True);


Regression Evaluation Metrics

Here are three common evaluation metrics for regression problems:

Mean Absolute Error (MAE) is the mean of the absolute values of the errors:

$$\frac{1}{n}\sum_{i=1}^{n}|y_i-\hat{y}_i|$$


Mean Squared Error (MSE) is the mean of the squared errors:

$$\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2$$


Root Mean Squared Error (RMSE) is the square root of the mean of the squared errors:

$$\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}$$


Comparing these metrics:

  • MAE is the easiest to understand, because it is the average absolute error.
  • MSE is more popular than MAE, because MSE “punishes” larger errors, which tends to be useful in the real world.
  • RMSE is more popular than MSE, because RMSE is interpretable in the “y” units.

All of these are loss functions, because we want to minimize them.
from sklearn import metrics
print('MAE:', metrics.mean_absolute_error(y_test, predictions))
print('MSE:', metrics.mean_squared_error(y_test, predictions))
print('RMSE:', np.sqrt(metrics.mean_squared_error(y_test, predictions)))
MAE: 82288.2225191
MSE: 10460958907.2
RMSE: 102278.829223
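
To make the three definitions concrete, here is a minimal NumPy sketch that computes them by hand on four made-up values (these numbers are not from the housing data). It is meant only as an illustration of the formulas, not as a replacement for the `sklearn.metrics` functions used above.

```python
import numpy as np

# Made-up true values and predictions.
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.0, 5.0, 4.0, 8.0])

errors = y_true - y_pred        # [1, 0, -2, -1]
mae = np.mean(np.abs(errors))   # (1 + 0 + 2 + 1) / 4 = 1.0
mse = np.mean(errors ** 2)      # (1 + 0 + 4 + 1) / 4 = 1.5
rmse = np.sqrt(mse)             # sqrt(1.5), roughly 1.2247

print('MAE:', mae)
print('MSE:', mse)
print('RMSE:', rmse)
```

Note how the single error of size 2 contributes 4 of the 6 units in the squared sum, which is the "punishing" of larger errors mentioned above.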
This was your Machine Learning Project!
Best of luck!
pagarsach14@gmail.com

I am Mr. Sachin Pagar, the founder of Pythonslearning, a passionate educational blogger and author who loves to share informative content on educational resources.
