Introduction:
Hi everyone. In this article I share the code for one of my internship machine learning projects: prediction using supervised machine learning. Let's get started.

Project task: predict a student's percentage score from the number of hours studied.
Step 1: Install the required libraries
pip install seaborn
Requirement already satisfied: seaborn in /srv/conda/envs/notebook/lib/python3.6/site-packages (0.11.0)
Requirement already satisfied: numpy>=1.15 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from seaborn) (1.19.4)
Requirement already satisfied: matplotlib>=2.2 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from seaborn) (3.3.3)
Requirement already satisfied: scipy>=1.0 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from seaborn) (1.5.3)
Requirement already satisfied: pandas>=0.23 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from seaborn) (1.1.4)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from matplotlib>=2.2->seaborn) (2.4.7)
Requirement already satisfied: pillow>=6.2.0 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from matplotlib>=2.2->seaborn) (8.0.1)
Requirement already satisfied: python-dateutil>=2.1 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from matplotlib>=2.2->seaborn) (2.8.1)
Requirement already satisfied: cycler>=0.10 in /srv/conda/envs/notebook/lib/python3.6/site-packages/cycler-0.10.0-py3.6.egg (from matplotlib>=2.2->seaborn) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from matplotlib>=2.2->seaborn) (1.3.1)
Requirement already satisfied: pytz>=2017.2 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from pandas>=0.23->seaborn) (2020.4)
Requirement already satisfied: six>=1.5 in /srv/conda/envs/notebook/lib/python3.6/site-packages (from python-dateutil>=2.1->matplotlib>=2.2->seaborn) (1.15.0)
Note: you may need to restart the kernel to use updated packages.
Step 2: Import the required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
Step 3: Read (import) the dataset
(Note: this article uses the dataset provided by The Sparks Foundation.)
Sparkurl= "https://raw.githubusercontent.com/AdiPersonalWorks/Random/master/student_scores%20-%20student_scores.csv"
data= pd.read_csv(Sparkurl)
data.head(10)
Hours Scores
0 2.5 21
1 5.1 47
2 3.2 27
3 8.5 75
4 3.5 30
5 1.5 20
6 9.2 88
7 5.5 60
8 8.3 81
9 2.7 25
Step 4: Visualize and analyze the data
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25 entries, 0 to 24
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Hours 25 non-null float64
1 Scores 25 non-null int64
dtypes: float64(1), int64(1)
memory usage: 528.0 bytes
data.describe()
Hours Scores
count 25.000000 25.000000
mean 5.012000 51.480000
std 2.525094 25.286887
min 1.100000 17.000000
25% 2.700000 30.000000
50% 4.800000 47.000000
75% 7.400000 75.000000
max 9.200000 95.000000
sns.heatmap(data.corr(),linewidth=1)
<AxesSubplot:>

After the heatmap, the next step is to plot the scores against the hours studied.
data.plot(x='Hours',y='Scores',style='o')
plt.title('Hours vs Percentage')
plt.xlabel('Hours')
plt.ylabel('Percentage')
plt.show()
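
The plots above suggest a strong positive linear relationship between hours and scores. As a rough numeric check, here is a minimal sketch that computes the Pearson correlation on only the first ten rows shown by data.head(10). The variable names hours, scores and r are my own, and the value for the full 25-row dataset will differ somewhat.

```python
import numpy as np

# First ten rows of the dataset, copied from the data.head(10) output.
hours = np.array([2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7])
scores = np.array([21, 47, 27, 75, 30, 20, 88, 60, 81, 25])

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between the two variables.
r = np.corrcoef(hours, scores)[0, 1]
print("Pearson correlation (first 10 rows): {:.3f}".format(r))
```

A value close to 1 is what makes a simple linear model a reasonable first choice here.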

Step 5: Prepare the data
x=data.iloc[: , :-1].values
y=data.iloc[: ,1].values
x_train,x_test,y_train,y_test= train_test_split(x,y,test_size=0.2,random_state=0)
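
To see what test_size=0.2 means for a dataset of 25 rows, here is a small sketch using placeholder arrays of the same shape as the real data. The names x_demo, y_demo, x_tr, x_te, y_tr and y_te are hypothetical and used only for this illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays shaped like the real data: 25 rows, 1 feature column.
x_demo = np.arange(25, dtype=float).reshape(-1, 1)
y_demo = np.arange(25)

# Same arguments as in the article: 20% of the rows go to the test set,
# and random_state=0 makes the shuffle reproducible.
x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.2, random_state=0
)

print("train:", x_tr.shape, "test:", x_te.shape)
```

With 25 rows this leaves 20 rows for training and 5 for testing, which matches the five test values printed in Step 7.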
Step 6: Train the model
regressor=LinearRegression()
regressor.fit(x_train, y_train)
line= regressor.coef_*x+regressor.intercept_
plt.scatter(x,y)
plt.plot(x,line, color= 'Red')
plt.show()
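
The red line is an ordinary least-squares fit, so a degree-1 polynomial fit from NumPy should give the same slope and intercept. The sketch below checks this on the first ten rows of the dataset only (not the article's training split), so the numbers will not match the article's regressor exactly. The names model, slope and intercept are my own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# First ten rows of the dataset, copied from the data.head(10) output.
hours = np.array([2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7])
scores = np.array([21, 47, 27, 75, 30, 20, 88, 60, 81, 25], dtype=float)

# scikit-learn expects a 2D feature array, hence reshape(-1, 1).
model = LinearRegression()
model.fit(hours.reshape(-1, 1), scores)

# np.polyfit with deg=1 solves the same least-squares problem.
slope, intercept = np.polyfit(hours, scores, deg=1)

print("sklearn:", model.coef_[0], model.intercept_)
print("polyfit:", slope, intercept)
```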

Step 7: Make predictions on the test set
print(x_test)
y_pred= regressor.predict(x_test)
[[1.5]
[3.2]
[7.4]
[2.5]
[5.9]]
Step 8: Compare actual vs. predicted values
data1= pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
data1
Actual Predicted
0 20 16.884145
1 27 33.732261
2 69 75.357018
3 30 26.794801
4 62 60.491033
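
To make the comparison easier to read, one option is to add an error column to the table. This is only a sketch: it rebuilds the table from the rounded values printed above, and the names comparison, Error and AbsError are my own.

```python
import pandas as pd

# Values copied from the Actual vs. Predicted table above (rounded as printed).
comparison = pd.DataFrame({
    "Actual": [20, 27, 69, 30, 62],
    "Predicted": [16.884145, 33.732261, 75.357018, 26.794801, 60.491033],
})

# Positive error: the model under-predicted; negative: it over-predicted.
comparison["Error"] = comparison["Actual"] - comparison["Predicted"]
comparison["AbsError"] = comparison["Error"].abs()

print(comparison.round(2))
```

The mean of the AbsError column is the mean absolute error reported in the evaluation step.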
Step 9: Test the model with your own input
hours= 9.25
test= np.array([hours])
test= test.reshape(-1,1)
ownpred= regressor.predict(test)
print("Total number of hours= {}".format(hours))
print("Total PredictScore= {}".format(ownpred[0]))
Total number of hours= 9.25
Total PredictScore= 93.69173248737539
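
The reshape(-1,1) call is needed because scikit-learn expects a 2D array with one row per sample and one column per feature. The sketch below predicts for several values at once and confirms that predict() is just slope * hours + intercept. It is fitted on the first ten rows of the dataset only, so its predictions differ from the article's result. The names model, new_hours, predicted and manual are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# First ten rows of the dataset, copied from the data.head(10) output.
hours = np.array([2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7])
scores = np.array([21, 47, 27, 75, 30, 20, 88, 60, 81, 25], dtype=float)
model = LinearRegression().fit(hours.reshape(-1, 1), scores)

# Three inputs become a (3, 1) array: 3 samples, 1 feature each.
new_hours = np.array([1.0, 5.0, 9.25]).reshape(-1, 1)
predicted = model.predict(new_hours)

# The same predictions computed directly from the fitted line.
manual = model.coef_[0] * new_hours.ravel() + model.intercept_

for h, p in zip(new_hours.ravel(), predicted):
    print("hours = {:.2f} -> predicted score = {:.2f}".format(h, p))
```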
Step 10: The final step is to evaluate the model
from sklearn import metrics
print('Mean Absolute Error:',metrics.mean_absolute_error(y_test,y_pred))
print('Mean Squared Error:', metrics.mean_squared_error(y_test,y_pred))
print('Root mean squared Error', np.sqrt(metrics.mean_squared_error(y_test,y_pred)))
Mean Absolute Error: 4.183859899002982
Mean Squared Error: 21.598769307217456
Root mean squared Error 4.647447612100373
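
To show what these three numbers measure, here is a sketch that recomputes them with plain NumPy from the rounded values in the Actual vs. Predicted table, so the results agree with the output above only to a few decimal places. It also computes R squared, which the article does not report; I include it as an optional extra. The names y_true, y_hat, errors and r2 are my own.

```python
import numpy as np

# Values copied from the Actual vs. Predicted table (rounded as printed).
y_true = np.array([20, 27, 69, 30, 62], dtype=float)
y_hat = np.array([16.884145, 33.732261, 75.357018, 26.794801, 60.491033])

errors = y_true - y_hat

mae = np.mean(np.abs(errors))   # average size of the error
mse = np.mean(errors ** 2)      # average squared error, penalizes large misses
rmse = np.sqrt(mse)             # back in the same units as the score

# R squared: share of the variance in y_true explained by the model.
r2 = 1 - np.sum(errors ** 2) / np.sum((y_true - y_true.mean()) ** 2)

print("MAE:  {:.4f}".format(mae))
print("MSE:  {:.4f}".format(mse))
print("RMSE: {:.4f}".format(rmse))
print("R^2:  {:.4f}".format(r2))
```

An RMSE of roughly 4.6 means the predictions are typically off by about four to five percentage points on this small test set.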
Finally, I have completed my first project task: predicting a student's percentage score from the number of hours studied.
Summary:
In this article we saw how to predict a student's percentage score from the number of hours studied. If you have any questions about it, feel free to ask.
BEST OF LUCK!!!