Pandas Profiling

Download the Titanic dataset from the Kaggle website (file name: tested.csv), then follow the steps below in a Jupyter Notebook.

Note: This page is intended for readers who have a basic understanding of Pandas DataFrames. It demonstrates basic DataFrame exploration and profiling.

Profile a DataFrame

In [21]:

import pandas as pd
# Load the Kaggle file; adjust the path to wherever tested.csv was saved.
titanic = pd.read_csv(r"Downloads/tested.csv")

In [22]:

titanic.shape

Out[22]:

(418, 12)

In [23]:

titanic.columns

Out[23]:

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

In [24]:

titanic.dtypes

Out[24]:

PassengerId      int64
Survived         int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object
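The dtypes listing separates numeric columns from text (object) columns, and the two groups are usually explored differently. A minimal sketch of that split follows. It assumes a small hand-built frame named sample (a hypothetical stand-in that copies a few values from this dataset) so that it runs without the CSV file.

```python
import pandas as pd

# Hypothetical stand-in for the full dataset: three rows, five columns.
sample = pd.DataFrame(
    {
        "Pclass": [3, 3, 2],
        "Name": [
            "Kelly, Mr. James",
            "Wilkes, Mrs. James (Ellen Needs)",
            "Myles, Mr. Thomas Francis",
        ],
        "Sex": ["male", "female", "male"],
        "Age": [34.5, 47.0, 62.0],
        "Fare": [7.8292, 7.0000, 9.6875],
    }
)

# Split the column names by dtype family.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
other_cols = sample.select_dtypes(exclude="number").columns.tolist()

print("numeric:", numeric_cols)
print("non-numeric:", other_cols)

# Summary statistics only make sense for the numeric group.
print(sample[numeric_cols].describe())
```

The same two select_dtypes calls should work unchanged on the titanic DataFrame loaded earlier.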

In [25]:

titanic.head()

Out[25]:

	PassengerId	Survived	Pclass	Name	Sex	Age	SibSp	Parch	Ticket	Fare	Cabin	Embarked
0	892	0	3	Kelly, Mr. James	male	34.5	0	0	330911	7.8292	NaN	Q
1	893	1	3	Wilkes, Mrs. James (Ellen Needs)	female	47.0	1	0	363272	7.0000	NaN	S
2	894	0	2	Myles, Mr. Thomas Francis	male	62.0	0	0	240276	9.6875	NaN	Q
3	895	0	3	Wirz, Mr. Albert	male	27.0	0	0	315154	8.6625	NaN	S
4	896	1	3	Hirvonen, Mrs. Alexander (Helga E Lindqvist)	female	22.0	1	1	3101298	12.2875	NaN	S
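The head() output already hints at a data-quality issue: Cabin is NaN in every row shown. Before generating a full profile, a quick per-column overview can be computed with plain pandas. The sketch below is illustrative only. The helper name column_overview is hypothetical, and the sample frame copies the five rows shown above (with Name, SibSp, Parch and Ticket left out for brevity).

```python
import pandas as pd

# The five rows from the head() output, restricted to a subset of columns.
sample = pd.DataFrame(
    {
        "PassengerId": [892, 893, 894, 895, 896],
        "Survived": [0, 1, 0, 0, 1],
        "Pclass": [3, 3, 2, 3, 3],
        "Sex": ["male", "female", "male", "male", "female"],
        "Age": [34.5, 47.0, 62.0, 27.0, 22.0],
        "Fare": [7.8292, 7.0000, 9.6875, 8.6625, 12.2875],
        "Cabin": [None, None, None, None, None],
        "Embarked": ["Q", "S", "Q", "S", "S"],
    }
)


def column_overview(df):
    """Return one row per column: dtype, missing count, missing share, distinct values."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "missing": df.isna().sum(),
            "missing_pct": (df.isna().mean() * 100).round(1),
            "distinct": df.nunique(),  # NaN is not counted as a value
        }
    )


overview = column_overview(sample)
print(overview)
```

Calling column_overview(titanic) on the full DataFrame gives the same kind of table for all twelve columns. It covers only a small part of what the profile report presents.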

In [26]:

import pandas_profiling
# Build the profile report; as the last expression in the cell, it is rendered inline.
pandas_profiling.ProfileReport(titanic)

Out[26]:

(The interactive HTML profile report is displayed here in the notebook.)
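The inline report only exists inside the notebook, so it can be useful to write it to a standalone HTML file as well. The sketch below is one possible way to do that, not the only one. It assumes an installed release whose ProfileReport accepts the title and minimal arguments and provides to_file. Newer releases of the library are published as ydata-profiling, so the helper tries that import name first. The names load_profiler and save_profile, the default output path, and the report_cls parameter are all hypothetical.

```python
import importlib


def load_profiler():
    """Return ProfileReport from whichever package is installed, or None."""
    # Newer releases use the ydata_profiling name; older ones pandas_profiling.
    for module_name in ("ydata_profiling", "pandas_profiling"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        return module.ProfileReport
    return None


def save_profile(df, path="titanic_profile.html", report_cls=None):
    """Write a profile report for df to path; return False if no profiler is found."""
    # report_cls lets the caller pass the report class explicitly.
    if report_cls is None:
        report_cls = load_profiler()
    if report_cls is None:
        return False
    # minimal=True skips the more expensive computations (assumed to be supported).
    report = report_cls(df, title="Titanic profile", minimal=True)
    report.to_file(path)
    return True
```

In the notebook, save_profile(titanic) would then write titanic_profile.html next to the notebook file, provided one of the two packages is installed.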
