Linear Regression in Machine Learning: Algorithm, Code, Project 01


Linear Regression in Machine Learning: Part 01


Welcome, everyone. Today we are going to start a new part of our course (Machine Learning). In this section we look at our first regression algorithm, Linear Regression.
This section is divided into three parts:
PART_01 Check out data and see all plots
PART_02 Training and Testing a Linear Regression Model
PART_03 Exercise and Solution
Links to every part are provided at the end of each post.

What is Regression?

Regression searches for relationships among variables. For example, you can observe several employees of a company and try to understand how their salaries depend on features such as experience, education, role, and city. This is a regression problem where the data related to each employee represents one observation. The presumption is that experience, education, role, and city are the independent features, while the salary depends on them.

What is Linear Regression in Machine Learning?

Linear regression is probably one of the most important and widely used regression techniques. It is among the simplest regression methods. One of its main advantages is that its results are easy to interpret.
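To make the salary example concrete, here is a minimal sketch that fits a straight line to a handful of (experience, salary) pairs with NumPy's `np.polyfit`. The numbers and variable names are invented purely for illustration and do not come from any real data set.

```python
import numpy as np

# Hypothetical data: years of experience and yearly salary for five employees.
# These values are invented for illustration only.
experience = np.array([1, 2, 3, 4, 5], dtype=float)
salary = np.array([36000, 39000, 46000, 49000, 55000], dtype=float)

# Fit salary = slope * experience + intercept by ordinary least squares.
# A degree-1 polynomial fit is a simple linear regression.
slope, intercept = np.polyfit(experience, salary, deg=1)

print(f"slope: {slope:.1f}")          # estimated salary change per extra year
print(f"intercept: {intercept:.1f}")  # estimated salary at zero experience

# Use the fitted line to estimate the salary for 6 years of experience.
predicted = slope * 6 + intercept
print(f"predicted salary at 6 years: {predicted:.1f}")
```

The slope is what makes the result easy to interpret: it reads directly as the estimated change in salary for one more year of experience.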

LET’S START WITH PROJECT WORK:

Your neighbor is a real estate agent and wants some help predicting housing prices for regions in the USA. It would be great if you could create a model for her that lets her put in a few features of a house and returns an estimate of what the house would sell for.
She has asked if you could help her out with your new data science skills. You say yes, and decide that Linear Regression might be a good way to solve this problem!
Your neighbor then gives you some information about a bunch of houses in regions of the United States. It is all in the data set USA_Housing.csv.
The data contains the following columns:
  1. ‘Avg. Area Incomes’
  2. ‘Avg. Area House Ages’
  3. ‘Avg. Area Number of Room’
  4. ‘Avg. Area Number of Bedroom’
  5. ‘Area Populations’
  6. ‘Prices’ 
  7. ‘Addresses’


Check out the data

We have the housing price data from our neighbor as a CSV file. Let’s get our environment ready with the libraries we will need and then import the data.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline


USAhousing = pd.read_csv('USA_Housing.csv')


USAhousing.head()









OUTPUT:

   Avg. Area Incomes  Avg. Area House Ages  Avg. Area Number of Room  Avg. Area Number of Bedroom  Area Populations        Prices
0       79545.458574              5.682861                  7.009188                         4.09      23086.800503  1.059034e+06
1       79248.642455              6.002900                  6.730821                         3.09      40173.072174  1.505891e+06
2       61287.067179              5.865890                  8.512727                         5.13      36882.159400  1.058988e+06
3       63345.240046              7.188236                  5.586729                         3.26      34310.242831  1.260617e+06
4       59982.197226              5.040555                  7.839388                         4.23      26354.109472  6.309435e+05

(Addresses column not shown.)
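Beyond looking at the first rows, it usually helps to check the shape, the column types, and whether any values are missing. The sketch below retypes the five rows from the head() output by hand so that it runs without USA_Housing.csv; the name `sample` is just a stand-in, and the Addresses column is left out. On the real data you would call the same methods on USAhousing.

```python
import pandas as pd

# The five rows shown by head(), retyped by hand so this snippet runs
# without USA_Housing.csv. The Addresses column is left out.
sample = pd.DataFrame({
    'Avg. Area Incomes': [79545.458574, 79248.642455, 61287.067179,
                          63345.240046, 59982.197226],
    'Avg. Area House Ages': [5.682861, 6.002900, 5.865890, 7.188236, 5.040555],
    'Avg. Area Number of Room': [7.009188, 6.730821, 8.512727,
                                 5.586729, 7.839388],
    'Avg. Area Number of Bedroom': [4.09, 3.09, 5.13, 3.26, 4.23],
    'Area Populations': [23086.800503, 40173.072174, 36882.159400,
                         34310.242831, 26354.109472],
    'Prices': [1.059034e+06, 1.505891e+06, 1.058988e+06,
               1.260617e+06, 6.309435e+05],
})

print(sample.shape)            # (number of rows, number of columns)
print(sample.dtypes)           # data type of each column
print(sample.isnull().sum())   # count of missing values per column
```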

USAhousing.describe()

OUTPUT:

       Avg. Area Incomes  Avg. Area House Ages  Avg. Area Number of Room  Avg. Area Number of Bedroom  Area Populations        Prices
count        5000.000000           5000.000000               5000.000000                  5000.000000       5000.000000  5.000000e+03
mean        68583.108984              5.977222                  6.987792                     3.981330      36163.516039  1.232073e+06
std         10657.991214              0.991456                  1.005833                     1.234137       9925.650114  3.531176e+05
min         17796.631190              2.644304                  3.236194                     2.000000        172.610686  1.593866e+04
25%         61480.562388              5.322283                  6.299250                     3.140000      29403.928702  9.975771e+05
50%         68804.286404              5.970429                  7.002902                     4.050000      36199.406689  1.232669e+06
75%         75783.338666              6.650808                  7.665871                     4.490000      42861.290769  1.471210e+06
max        107701.748378              9.519088                 10.759588                     6.500000      69621.713378  2.469066e+06
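One possible way to read these numbers is to turn a few of them into simple ratios. The sketch below copies the Prices statistics from the describe() output by hand and does plain arithmetic on them; the variable names are only for illustration.

```python
# Summary statistics for the Prices column, copied from the describe() output.
price_mean = 1.232073e+06
price_median = 1.232669e+06   # the 50% row
price_std = 3.531176e+05
price_q1 = 9.975771e+05       # the 25% row
price_q3 = 1.471210e+06       # the 75% row

# Relative gap between mean and median. A value close to zero hints that
# the distribution is roughly symmetric rather than strongly skewed.
mean_median_gap = (price_mean - price_median) / price_median

# Coefficient of variation: the spread relative to the typical value.
cv = price_std / price_mean

# Interquartile range: the width of the middle 50% of prices.
iqr = price_q3 - price_q1

print(f"mean vs. median gap: {mean_median_gap:.4%}")
print(f"coefficient of variation: {cv:.3f}")
print(f"interquartile range: {iqr:,.0f}")
```

Here the mean and the median of Prices differ by well under 1%, which is a first hint (not a proof) that prices are not heavily skewed.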

Let's create some simple plots to check out the data.

sns.pairplot(USAhousing)
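A pair plot shows the relationships visually; a correlation matrix gives the same picture as numbers. The sketch below is one way to compute it, assuming the text column has to be excluded first. It uses three numeric columns from the five head() rows, retyped by hand, plus placeholder strings for Addresses, since the real address values are not shown in this post. With only five rows the numbers themselves mean little; the point is the sequence of calls.

```python
import pandas as pd

# Three numeric columns from the head() rows, retyped by hand.
sample = pd.DataFrame({
    'Avg. Area Incomes': [79545.458574, 79248.642455, 61287.067179,
                          63345.240046, 59982.197226],
    'Area Populations': [23086.800503, 40173.072174, 36882.159400,
                         34310.242831, 26354.109472],
    'Prices': [1.059034e+06, 1.505891e+06, 1.058988e+06,
               1.260617e+06, 6.309435e+05],
    # Placeholder strings only: the real addresses are not reproduced here.
    'Addresses': ['address 0', 'address 1', 'address 2',
                  'address 3', 'address 4'],
})

# Keep only numeric columns so the text column does not get in the way.
numeric_cols = sample.select_dtypes(include='number')

# Pairwise Pearson correlation between the numeric columns.
corr = numeric_cols.corr()
print(corr)
```

On the full data set, the resulting matrix could be passed to `sns.heatmap(corr, annot=True)` for a visual summary.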