Hello Python Enthusiast,
Python is considered one of the easiest languages to learn, but many freshers and beginners still find it difficult to understand the syntax and to work out how to perform an operation in a given situation.
To make that easier and clearer for you, I am posting some important operations along with the syntax for each.
Conventions
- Any word enclosed in <> is a placeholder; replace it with your own dataframe, column, or index name, depending on what the syntax calls for.
- Wherever [], single quotes, or double quotes appear, use them as they are.
- Words written without <> are keywords, so use them as they are.
Come, let's learn!
Basic operations in pandas
- Replace a column value in a dataframe
To replace a single value with a new value, or several values with one new value, use the statements below. To replace several values with several new values, pass a second list of the same length in place of <newvalue>.
<Dataframe>['<Columnname>'] = <Dataframe>['<Columnname>'].replace('<value1>', '<newvalue>')
<Dataframe>['<Columnname>'] = <Dataframe>['<Columnname>'].replace(['<value1>', '<value2>'], '<newvalue>')
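Here is a small sketch of both forms on made-up data. The dataframe and the column name "city" are only assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column name "city" is an assumption.
df = pd.DataFrame({"city": ["NY", "LA", "SF", "NY"]})

# Replace a single value with a single new value.
df["city"] = df["city"].replace("NY", "New York")

# Replace several values with one new value.
df["city"] = df["city"].replace(["LA", "SF"], "California")

print(df["city"].tolist())
# ['New York', 'California', 'California', 'New York']
```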
- To read row by row and index by index in pandas
The rows and their index values can be read using the following statements.
for <index>, <row> in <dataframe>.iterrows():
    print(<row>)    # prints the current row of the dataframe
    print(<index>)  # prints the index value of that row
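A minimal sketch of the loop on made-up data follows. The column names and the index labels "a" and "b" are assumptions, and the list named seen is only there to show what each pass of the loop yields.

```python
import pandas as pd

# Hypothetical sample data with custom index labels.
df = pd.DataFrame({"name": ["Asha", "Ben"], "age": [31, 27]}, index=["a", "b"])

seen = []
for index, row in df.iterrows():
    # row is a pandas Series holding that row's values, keyed by column name
    print(index, row["name"], row["age"])
    seen.append((index, row["name"]))
```

Each row arrives as a Series, so individual cells can be read by column name as shown.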
- To concatenate multiple dataframes into a single one
To combine three dataframes into a single dataframe:
<Combined_dataframe> = pd.concat([<dataframe1>, <dataframe2>, <dataframe3>])
All the dataframes to be combined must be placed inside a list [] that is passed to the concat function.
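As a rough sketch, here are three small made-up dataframes being combined. The ignore_index=True variant is not part of the syntax above; it is an extra option of pd.concat that renumbers the rows, shown here only because the default keeps each dataframe's original index values.

```python
import pandas as pd

# Hypothetical sample dataframes sharing the same column.
df1 = pd.DataFrame({"id": [1, 2]})
df2 = pd.DataFrame({"id": [3]})
df3 = pd.DataFrame({"id": [4, 5]})

# Default behaviour: rows are stacked and original index values are kept.
combined = pd.concat([df1, df2, df3])
print(combined.index.tolist())    # [0, 1, 0, 0, 1]

# Optional: renumber the rows from 0 in the combined dataframe.
renumbered = pd.concat([df1, df2, df3], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4]
```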
- To view the top 5 lines of a dataframe
Calling head() returns the first 5 rows of the dataframe by default.
<dataframe>.head()
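A quick sketch on a made-up ten-row dataframe. Passing a number to head(), as in the second call, is an addition to the syntax above and returns that many rows instead of 5.

```python
import pandas as pd

# Hypothetical sample data: ten rows numbered 0 to 9.
df = pd.DataFrame({"n": range(10)})

top5 = df.head()   # first 5 rows (the default)
top3 = df.head(3)  # first 3 rows

print(top5)
print(top3)
```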
- To find out the columns having null values in a dataframe
To find the columns that hold null values:
<null_columns> = <dataframe>.columns[<dataframe>.isnull().any()]
isnull() - checks every value for null
any() - flags a column even if only one of its rows has a null value
null_columns will now contain the names of the columns that have null values in that dataframe.
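Here is a small sketch with made-up data, where columns "a" and "c" contain missing values and "b" does not. The column names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "a" and "c" contain nulls, "b" does not.
df = pd.DataFrame({
    "a": [1, np.nan, 3],
    "b": ["x", "y", "z"],
    "c": [None, "q", None],
})

# isnull() marks each cell, any() reduces that to one True/False per column.
null_columns = df.columns[df.isnull().any()]
print(list(null_columns))  # ['a', 'c']
```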
- To find out the sum of null values in every column in a dataframe
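The number of null values each column holds can be found using the format below.
<dataframe>[<null_columns>].isnull().sum()
This applies the isnull() check to the columns identified as having null values and sums the result for each column. Here <null_columns> is the variable holding those column names, so it is written without quotes.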
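Continuing with the same kind of made-up data, this sketch counts the nulls per column. The names df, null_columns and null_counts are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "a" has one null, "c" has two, "b" has none.
df = pd.DataFrame({
    "a": [1, np.nan, 3],
    "b": ["x", "y", "z"],
    "c": [None, "q", None],
})

null_columns = df.columns[df.isnull().any()]

# Count the nulls in each of the columns found above.
null_counts = df[null_columns].isnull().sum()
print(null_counts)
```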
- To find out whether a specific column has null values in it
To check whether a specific column has null values:
<dataframe>[<dataframe>["<column name>"].isnull()]
This returns the rows in which the specified column is null; an empty result means the column has no null values.
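A minimal sketch on made-up data follows. The second statement, which adds .any() to get a plain yes/no answer, is an assumption about what you may want and is not part of the syntax above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column "a" is null in the second row only.
df = pd.DataFrame({"a": [1, np.nan, 3], "b": ["x", "y", "z"]})

# Rows where column "a" is null.
rows_missing = df[df["a"].isnull()]
print(rows_missing)

# A single True/False answer for the whole column.
has_nulls = df["a"].isnull().any()
print(has_nulls)
```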
- To find the rows that contain at least one null value
The argument axis=1 makes any() check across the columns of each row, giving one result per row; axis=0 (the default) checks down the rows of each column, giving one result per column.
<dataframe>[<dataframe>.isnull().any(axis=1)]
If any column in a row holds a null value, that complete row is returned.
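To see the difference between the two axis values, here is a small sketch on made-up data. The column names are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 1 has a null in "a", row 2 has a null in "b".
df = pd.DataFrame({
    "a": [1, np.nan, 3],
    "b": ["x", "y", None],
    "c": [10, 20, 30],
})

# axis=1: one True/False per row, used here to keep rows with any null.
rows_with_null = df[df.isnull().any(axis=1)]
print(rows_with_null.index.tolist())  # [1, 2]

# axis=0: one True/False per column.
per_column = df.isnull().any(axis=0)
print(per_column.to_dict())
```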
- To replace a null value in a column with some other value
To replace the null values in a column with some other value, use fillna.
<dataframe>['<columnname>'] = <dataframe>['<columnname>'].fillna('<value to replace>')
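A short sketch of fillna on made-up data. The column name "grade" and the fill value "unknown" are assumptions.

```python
import pandas as pd

# Hypothetical sample data with one missing grade.
df = pd.DataFrame({"grade": ["A", None, "B"]})

# Replace the nulls in the column with a chosen value.
df["grade"] = df["grade"].fillna("unknown")

print(df["grade"].tolist())  # ['A', 'unknown', 'B']
```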
Everything posted above covers functions from the pandas library in Python. I will post more about basic Python syntax and libraries such as numpy and matplotlib.
Thank you!!