Pandas Functions
Explore DataFrame
To get the DataFrame column labels | df.columns |
To get the row and column counts of the DataFrame | df.shape |
To get the data type of each DataFrame column | df.dtypes |
To get the first 5 rows of the DataFrame | df.head() |
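The exploration calls above can be tried on a small frame. This is a minimal sketch: the sample data and the variable names (`columns`, `n_rows`, `n_cols`, `dtypes`, `first_rows`) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee", "Eve", "Fay"],
    "age": [31, 25, 47, 38, 29, 52],
    "score": [88.5, 92.0, 79.5, 85.0, 90.5, 70.0],
})

columns = list(df.columns)   # column labels as a plain list
n_rows, n_cols = df.shape    # shape is a (rows, columns) tuple
dtypes = df.dtypes           # Series with one dtype per column
first_rows = df.head()       # first 5 rows by default; head(n) takes a count

print(columns)
print(n_rows, n_cols)
print(dtypes)
print(first_rows)
```

Note that `df.columns`, `df.shape` and `df.dtypes` are attributes, so they take no parentheses, while `df.head()` is a method.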
Selection
To get the first 10 rows using iloc (position based: rows go before the comma, columns after it) | df.iloc[:10] |
To get the first 5 columns using iloc | df.iloc[:, :5] |
To get columns by name using loc (label based) | df.loc[:, ["column1", "column2", "column3"]] |
To get a random 10% of the rows | df.sample(frac=0.1) |
To get 50 random rows | df.sample(n=50) |
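A short sketch of the selection calls above, assuming a made-up frame with 100 rows and 6 columns. The column names and the `random_state` value are illustrative choices; `random_state` only makes the sample repeatable.

```python
import pandas as pd

# Hypothetical frame: 100 rows, columns named column1 .. column6.
df = pd.DataFrame({f"column{i}": list(range(i, i + 100)) for i in range(1, 7)})

first_ten_rows = df.iloc[:10]        # rows at positions 0..9, all columns
first_five_cols = df.iloc[:, :5]     # all rows, columns at positions 0..4
by_label = df.loc[:, ["column1", "column2", "column3"]]  # columns by name

ten_percent = df.sample(frac=0.1, random_state=0)  # random 10% of the rows
fifty_rows = df.sample(n=50, random_state=0)       # 50 random rows

print(first_ten_rows.shape, first_five_cols.shape, by_label.shape)
print(ten_percent.shape, fifty_rows.shape)
```

`iloc` slices exclude the end position, so `df.iloc[:10]` stops at row 9.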
DataFrame Column Manipulation
Rename columns by name (map each old name to a new one) | df.rename(columns={'col1': 'column1', 'col2': 'column2'}, inplace=True) |
Rename columns by replacing underscores '_' in their names with dots '.' | df.columns = df.columns.str.replace('_', '.') |
Rename columns by adding a prefix or a suffix | df = df.add_prefix('de_') df = df.add_suffix('_lv') |
Selecting a column as index | df.set_index( ) |
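The column operations above can be sketched as follows. The frame and the new column names are hypothetical, and the rename by position is one possible approach (looking up the label at a position first), since `rename` itself works on labels.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"id": [1, 2, 3], "col_1": [10, 20, 30], "col_2": [0.1, 0.2, 0.3]})

# Rename by name: returns a new frame unless inplace=True is passed.
renamed = df.rename(columns={"col_1": "first_value"})

# Rename by position: fetch the label of the 3rd column, then rename it.
third_label = df.columns[2]
renamed_by_pos = df.rename(columns={third_label: "third"})

# Replace underscores with dots; regex=False treats '_' as plain text.
dotted = df.copy()
dotted.columns = dotted.columns.str.replace("_", ".", regex=False)

prefixed = df.add_prefix("de_")   # de_id, de_col_1, de_col_2
suffixed = df.add_suffix("_lv")   # id_lv, col_1_lv, col_2_lv

indexed = df.set_index("id")      # 'id' becomes the row index

print(list(renamed.columns), list(renamed_by_pos.columns))
print(list(dotted.columns), list(prefixed.columns), list(suffixed.columns))
print(indexed)
```

Each call here returns a new object, so the original `df` keeps its column names.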
DataFrame Data Manipulation
Removing rows or columns | df.drop() |
Sort | df.sort_values( ) |
Group | df.groupby( ) |
Filter | df.query( ) to keep matching rows, or df.where( ) to keep the shape and replace non-matching values with NaN |
Find missing Values | df.isnull( ) |
Drop missing Values | df.dropna( ) |
Drop duplicates | df.drop_duplicates( ) |
Rank | df.rank() |
Selecting column by types | df.select_dtypes( ) |
Concatenate two DataFrames | pd.concat() |
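A minimal sketch that runs each data manipulation call above on one small frame. The data and variable names are invented for illustration; the frame includes one missing value and one duplicated row so that every call has a visible effect.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 3 has a missing value, row 4 repeats row 2.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "points": [10.0, 30.0, 20.0, np.nan, 20.0],
    "label": ["x", "y", "z", "w", "z"],
})

no_label = df.drop(columns=["label"])          # remove a column
no_first_row = df.drop(index=[0])              # remove a row by index label
sorted_df = df.sort_values("points", ascending=False)  # NaN goes last
team_totals = df.groupby("team")["points"].sum()       # NaN is skipped

high = df.query("points > 15")                 # keeps only matching rows
masked = df[["points"]].where(df[["points"]] > 15)  # same shape, NaN elsewhere

missing = df.isnull()                          # True where a value is missing
complete = df.dropna()                         # rows without missing values
unique_rows = df.drop_duplicates()             # first copy of each row
ranks = df[["points"]].rank()                  # ties share the average rank
numeric_only = df.select_dtypes(include="number")
stacked = pd.concat([df, df], ignore_index=True)

print(team_totals)
print(high)
print(ranks)
```

The `query` and `where` results show the difference between the two filters: `high` has fewer rows than `df`, while `masked` has the same number of rows with NaN in place of values that fail the condition.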
Top Level Functions
Concatenate two DataFrames | pd.concat() |
Merging (Joining) based on common values | pd.merge( ) |
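The two top-level functions can be compared on a pair of small frames. This is a sketch with made-up data; the names `left`, `right`, `key`, `left_val` and `right_val` are assumptions for the example.

```python
import pandas as pd

# Hypothetical frames sharing a 'key' column.
left = pd.DataFrame({"key": [1, 2, 3], "left_val": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "right_val": ["x", "y", "z"]})

# concat stacks frames; ignore_index=True renumbers the rows 0..n-1.
stacked = pd.concat([left, left], ignore_index=True)

# axis=1 places frames side by side, aligned on the row index.
side_by_side = pd.concat([left, right], axis=1)

# merge joins on common values; the default is an inner join.
inner = pd.merge(left, right, on="key")

# how="outer" keeps keys that appear in either frame.
outer = pd.merge(left, right, on="key", how="outer")

print(stacked)
print(side_by_side)
print(inner)
print(outer)
```

`pd.concat` lines frames up by position or index, while `pd.merge` matches rows on the values of the join column, so only `merge` pairs key 2 on the left with key 2 on the right.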