Logistic Regression in Python: Plot and Explain Tutorial

Logistic Regression in Python (Part 2)

Hello friends, in the previous post we saw the first part of Logistic Regression.
For that post, CLICK HERE.
In this post we look at the second part of Logistic Regression in Python, in which we cover the following points:
1. Data cleaning
2. Converting categorical features
3. Training and predicting

Data Cleaning

We want to fill in the missing age data instead of just dropping the rows with missing ages. One way to do this is by filling in the mean age of all the passengers. However, we can be smarter about this and check the average age by passenger class.
For example:
# train is the Titanic training DataFrame loaded in part 1
plt.figure(figsize=(12, 8))
sns.boxplot(x='Pclass', y='Age', data=train, palette='winter')

We can see that the wealthier passengers in the higher classes tend to be older, which makes sense. We will use these average age values to impute the missing values of Age based on Pclass.

Fig. 01: Box plot of Age by Pclass.
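To get the exact class averages instead of eyeballing the box plot, a quick group-by works. This is a minimal check, assuming train is the Titanic training DataFrame loaded in part 1:

# Mean Age per passenger class; these round to roughly 37, 29 and 24,
# which are the hard-coded values used in impute_age below.
print(train.groupby('Pclass')['Age'].mean())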


def impute_age(cols):
    Age = cols[0]
    Pclass = cols[1]

    if pd.isnull(Age):
        if Pclass == 1:
            return 37
        elif Pclass == 2:
            return 29
        else:
            return 24
    else:
        return Age
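As a side note, the same result can be had without hard-coding the class averages. This is only a sketch of an alternative, assuming the same train DataFrame; the rest of this post sticks with impute_age:

# Fill each missing Age with the mean Age of that row's Pclass,
# computed from the data rather than typed in by hand.
train['Age'] = train['Age'].fillna(
    train.groupby('Pclass')['Age'].transform('mean')
)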

Now apply impute_age to fill in the missing ages:

train['Age'] = train[['Age','Pclass']].apply(impute_age, axis=1)

Check that heat map again:

sns.heatmap(train.isnull(), yticklabels=False, cbar=False, cmap='viridis')
Fig. 02: Heat map of missing values after imputing Age.
Great! Let’s go ahead and drop the Cabin column and the rows in Embarked that are NaN, and take a look:


train.drop('Cabin', axis=1, inplace=True)
train.head()

   PassengerId  Survived  Pclass                                               Name     Sex   Age  SibSp  Parch            Ticket     Fare Embarked
0            1         0       3                            Braund, Mr. Owen Harris    male  22.0      1      0         A/5 21171   7.2500        S
1            2         1       1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1      0          PC 17599  71.2833        C
2            3         1       3                             Heikkinen, Miss. Laina  female  26.0      0      0  STON/O2. 3101282   7.9250        S
3            4         1       1       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1      0            113803  53.1000        S
4            5         0       3                           Allen, Mr. William Henry    male  35.0      0      0            373450   8.0500        S

train.dropna(inplace=True)
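To confirm the cleaning worked, it is worth counting the remaining nulls. A minimal check, assuming the steps above have been run:

# Every column should now report zero missing values.
print(train.isnull().sum())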

Converting Categorical Features

We will need to convert categorical features to dummy variables using pandas. Otherwise our machine learning algorithm won’t be able to take those features directly as input. Let’s check the data types first:

train.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 889 entries, 0 to 890
Data columns (total 11 columns):
PassengerId    889 non-null int64
Survived       889 non-null int64
Pclass         889 non-null int64
Name           889 non-null object
Sex            889 non-null object
Age            889 non-null float64
SibSp          889 non-null int64
Parch          889 non-null int64
Ticket         889 non-null object
Fare           889 non-null float64
Embarked       889 non-null object
dtypes: float64(2), int64(5), object(4)
memory usage: 83.3+ KB
sex = pd.get_dummies(train['Sex'], drop_first=True)
embark = pd.get_dummies(train['Embarked'], drop_first=True)
train.drop(['Sex', 'Embarked', 'Name', 'Ticket'], axis=1, inplace=True)
train = pd.concat([train, sex, embark], axis=1)
train.head()

   PassengerId  Survived  Pclass   Age  SibSp  Parch     Fare  male    Q    S
0            1         0       3  22.0      1      0   7.2500   1.0  0.0  1.0
1            2         1       1  38.0      1      0  71.2833   0.0  0.0  0.0
2            3         1       3  26.0      0      0   7.9250   0.0  0.0  1.0
3            4         1       1  35.0      1      0  53.1000   0.0  0.0  1.0
4            5         0       3  35.0      0      0   8.0500   1.0  0.0  1.0
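A quick word on drop_first=True: for a column with k categories, get_dummies would otherwise create k columns that always sum to one, and that redundancy (the dummy variable trap) can confuse a linear model. A small sketch of the difference, using a hypothetical Series:

import pandas as pd

s = pd.Series(['S', 'C', 'Q', 'S'])        # hypothetical Embarked-style values
print(pd.get_dummies(s))                   # three columns: C, Q, S
print(pd.get_dummies(s, drop_first=True))  # two columns: Q, S ('C' becomes the baseline)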
Great! Our data is ready for our model!

Building a Logistic Regression model

Let’s start by splitting our data into a training set and test set (there is another test.csv file that you can play around with in case you want to use all this data for training).

Train Test Split

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(train.drop('Survived', axis=1),
                                                    train['Survived'], test_size=0.30,
                                                    random_state=101)
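As a quick sanity check on the split (this assumes the 889-row DataFrame above, which gives roughly 622 training rows and 267 test rows):

# 70% of rows for training, 30% held out for testing.
print(X_train.shape, X_test.shape)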

Training and Predicting

from sklearn.linear_model import LogisticRegression

logmodel = LogisticRegression()
logmodel.fit(X_train, y_train)


LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
predictions = logmodel.predict(X_test)
Let’s move on to evaluate our model.

Evaluation

Check precision, recall, and f1-score using a classification report:

from sklearn.metrics import classification_report

print(classification_report(y_test, predictions))


             precision    recall  f1-score   support

          0       0.82      0.93      0.83       164
          1       0.81      0.65      0.75       104

avg / total       0.82      0.82      0.83       267
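A confusion matrix is a natural companion to the report above. This is a sketch of an extra check, assuming the same y_test and predictions:

from sklearn.metrics import confusion_matrix

# Rows are true classes, columns are predicted classes:
# [[true negatives, false positives],
#  [false negatives, true positives]]
print(confusion_matrix(y_test, predictions))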

That’s really good! You might want to explore more feature engineering, and try the other test.csv file mentioned earlier.
Tags: Logistic Regression in Python, plot and explain tutorial
FOR A SIMILAR POST, CHECK HERE. If you like the post then please share it with your friends.
BEST OF LUCK!!!