Pandas Tutorial : Pandas Interview Question

Python Pandas Interview Question

Pandas is a Python library for data analysis and manipulation. Its two core data structures, Series and DataFrame, make it easy to load, clean, reshape, and summarize tabular data.

Here are some common pandas interview questions and examples:

  1. How do you read a CSV file into a pandas DataFrame?
import pandas as pd
df = pd.read_csv('file.csv')
  1. How do you select a column from a DataFrame?
# Select the "age" column 
df['age']
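# Dot notation also works, but only when the column name is a valid Python
# identifier that does not clash with an existing DataFrame attribute or method
df.age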
  1. How do you select multiple columns from a DataFrame?
# Select the "age" and "name" columns 
df[['age', 'name']]
  1. How do you select rows from a DataFrame based on a condition?
# Select rows where the age is greater than 30 
df[df.age > 30]
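As a small, self-contained sketch of row filtering, the example below builds a made-up DataFrame (the names, ages, and genders are hypothetical) and shows how a second condition can be combined with the first. Each condition needs its own parentheses because `&` and `|` bind more tightly than comparison operators.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cid'],
    'age': [25, 35, 45],
    'gender': ['F', 'M', 'F'],
})

# A single condition produces a boolean Series that acts as a row mask
over_30 = df[df['age'] > 30]

# Combine conditions with & (and) or | (or); wrap each one in parentheses
women_over_30 = df[(df['age'] > 30) & (df['gender'] == 'F')]

print(over_30['name'].tolist())        # ['Bob', 'Cid']
print(women_over_30['name'].tolist())  # ['Cid']
```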
  1. How do you group a DataFrame by a column and calculate the mean of each group?
# numeric_only=True skips non-numeric columns, which cannot be averaged
df.groupby('gender').mean(numeric_only=True)
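A minimal runnable sketch of the grouping step, using made-up data; the columns `name`, `gender`, and `age` are assumptions for illustration. Passing `numeric_only=True` keeps the text column `name` out of the calculation, which recent pandas versions would otherwise refuse to average.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cid', 'Dee'],
    'gender': ['F', 'M', 'M', 'F'],
    'age': [30, 40, 20, 50],
})

# Mean of every numeric column within each gender group
group_means = df.groupby('gender').mean(numeric_only=True)
print(group_means)
```

The result is indexed by the group key, so `group_means.loc['F', 'age']` returns the mean age of the `F` group.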
  1. How do you handle missing values in a DataFrame?
# Drop rows with any missing values
df.dropna() 
# Fill missing values with 0
df.fillna(0)
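The sketch below, on a small made-up DataFrame, illustrates a point that often comes up in interviews: `dropna` and `fillna` return new DataFrames by default and leave the original untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing two missing values
df = pd.DataFrame({
    'age': [25.0, np.nan, 40.0],
    'score': [1.0, 2.0, np.nan],
})

dropped = df.dropna()  # keeps only the rows that have no missing values
filled = df.fillna(0)  # replaces every missing value with 0

print(len(dropped))                 # 1
print(int(df.isna().sum().sum()))   # 2, the original is unchanged
```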
  1. How do you pivot a DataFrame?
# Pivot the DataFrame with index "id", columns "group", and values "value"
df.pivot(index='id', columns='group', values='value')
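To make the reshaping concrete, here is a minimal sketch on a made-up long-format DataFrame (the name `long_df` and its values are hypothetical). Each distinct value of `group` becomes its own column.

```python
import pandas as pd

# Hypothetical long-format data: one row per (id, group) pair
long_df = pd.DataFrame({
    'id': [1, 1, 2, 2],
    'group': ['x', 'y', 'x', 'y'],
    'value': [10, 20, 30, 40],
})

# One row per id, one column per group
wide = long_df.pivot(index='id', columns='group', values='value')
print(wide)
```

Note that `pivot` raises an error when an (index, columns) pair occurs more than once; `pivot_table` is the usual choice when duplicates need to be aggregated.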
  1. How do you merge two DataFrames on a common column?
df1 = pd.DataFrame({'key': ['a', 'b', 'c'], 'value': [1, 2, 3]}) 
df2 = pd.DataFrame({'key': ['a', 'b', 'd'], 'value': [4, 5, 6]})
# Inner join on "key" column (the default)
pd.merge(df1, df2, on='key')
# Outer join on "key" column 
pd.merge(df1, df2, on='key', how='outer')
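The following sketch reuses the two DataFrames from the question to show what the joins return. Because both inputs have a `value` column, pandas disambiguates them with the default suffixes `_x` and `_y`; the `suffixes` argument can be used to pick clearer names (the `_left` and `_right` labels here are an arbitrary choice).

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['a', 'b', 'c'], 'value': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['a', 'b', 'd'], 'value': [4, 5, 6]})

# Inner join keeps only the keys present in both inputs: a and b
inner = pd.merge(df1, df2, on='key')

# Outer join keeps every key and fills the gaps with NaN
outer = pd.merge(df1, df2, on='key', how='outer')

# Custom suffixes for the overlapping "value" columns
labelled = pd.merge(df1, df2, on='key', suffixes=('_left', '_right'))

print(list(inner.columns))     # ['key', 'value_x', 'value_y']
print(len(inner), len(outer))  # 2 4
```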
  1. How do you concatenate two DataFrames vertically or horizontally?
df1 = pd.DataFrame({'key': ['a', 'b', 'c'], 'value': [1, 2, 3]}) 
df2 = pd.DataFrame({'key': ['d', 'e', 'f'], 'value': [4, 5, 6]}) 
# Concatenate vertically 
pd.concat([df1, df2]) 
# Concatenate horizontally 
pd.concat([df1, df2], axis=1)
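A short sketch of what the two directions produce, using the same sample DataFrames as above. Vertical concatenation keeps each input's original index unless `ignore_index=True` is passed, and horizontal concatenation aligns rows by index and keeps duplicate column names.

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['a', 'b', 'c'], 'value': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['d', 'e', 'f'], 'value': [4, 5, 6]})

# Vertical: rows are stacked and the original index labels repeat
vertical = pd.concat([df1, df2])
print(list(vertical.index))    # [0, 1, 2, 0, 1, 2]

# Vertical with a fresh 0..n-1 index
renumbered = pd.concat([df1, df2], ignore_index=True)
print(list(renumbered.index))  # [0, 1, 2, 3, 4, 5]

# Horizontal: columns are placed side by side, aligned on the index
horizontal = pd.concat([df1, df2], axis=1)
print(horizontal.shape)        # (3, 4)
```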
  1. How do you apply a function to a column of a DataFrame?
import numpy as np
# Calculate the absolute value of each element in the "value" column
df['value'].apply(np.abs)
# Calculate the length of each name in the "name" column
df['name'].apply(len)
# You can also define your own function
def add_one(x):
    return x + 1

df['value'].apply(add_one)
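For simple element-wise operations like the ones above, pandas usually offers a built-in vectorized equivalent that tends to be faster on large columns. The sketch below compares both styles on made-up data; it is an illustration of the alternative, not a claim that `apply` is wrong.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({'name': ['Ann', 'Robert'], 'value': [-1, 2]})

# Absolute value: apply versus the built-in Series method
abs_apply = df['value'].apply(np.abs)
abs_vec = df['value'].abs()

# String length: apply versus the .str accessor
len_apply = df['name'].apply(len)
len_vec = df['name'].str.len()

# Adding one: apply versus plain arithmetic on the Series
plus_apply = df['value'].apply(lambda x: x + 1)
plus_vec = df['value'] + 1

print(abs_vec.tolist(), len_vec.tolist(), plus_vec.tolist())
# [1, 2] [3, 6] [0, 3]
```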
  1. How do you sort a DataFrame by a column?
# Sort the DataFrame by the "age" column in ascending order
df.sort_values('age')
# Sort the DataFrame by the "age" column in descending order
df.sort_values('age', ascending=False)
  1. How do you rename columns, drop columns, and replace values?
  • Rename columns:
df.rename(columns={'old_name': 'new_name'}, inplace=True)
  • Drop columns:
df.drop(columns=['column_1', 'column_2'], inplace=True)
  • Replace values:
# old_value and new_value stand for the value to find and its replacement
df.replace(to_replace=old_value, value=new_value, inplace=True)
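The three operations can also be written without `inplace=True`, in which case each call returns a new DataFrame and the original stays intact. The sketch below uses made-up data and concrete values (replacing 1 with 100) purely for illustration.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    'old_name': [1, 2, 1],
    'column_1': ['x', 'y', 'z'],
    'keep': [0.1, 0.2, 0.3],
})

renamed = df.rename(columns={'old_name': 'new_name'})  # rename a column
trimmed = renamed.drop(columns=['column_1'])           # drop a column
replaced = trimmed.replace(to_replace=1, value=100)    # replace a value

print(list(replaced.columns))         # ['new_name', 'keep']
print(replaced['new_name'].tolist())  # [100, 2, 100]
print(list(df.columns))               # original columns are unchanged
```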
  1. How do you use loc and iloc?

loc selects rows and columns by label, while iloc selects them by integer position. A loc slice includes its end label, whereas an iloc slice excludes its end position, as in ordinary Python slicing.

Here are a few examples to illustrate the use of loc and iloc in pandas:

import pandas as pd 
# create a sample dataframe 
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['a', 'b', 'c']) 
#    a  b  c
# 0  1  2  3
# 1  4  5  6
# 2  7  8  9
# index the second column using label-based indexing with loc 
df.loc[:, 'b']
# 0 2 
# 1 5 
# 2 8 
# Name: b, dtype: int64
# index the second column using integer-based indexing with iloc 
df.iloc[:, 1]
# 0 2 
# 1 5 
# 2 8 
# Name: b, dtype: int64
# slice the first two rows and first two columns using label-based indexing with loc 
df.loc[0:1, 'a':'b']
#   a b 
# 0 1 2 
# 1 4 5 
# slice the first two rows and first two columns using integer-based indexing with iloc 
df.iloc[0:2, 0:2]
#   a b 
# 0 1 2 
# 1 4 5 
# index a single value using label-based indexing with loc
df.loc[1, 'c']
# 6 
# index a single value using integer-based indexing with iloc
df.iloc[1, 2]
# 6
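With the default 0, 1, 2 index used above, labels and positions happen to coincide, which can hide the difference between the two indexers. The sketch below uses a made-up DataFrame with index labels 10, 20, 30 to make the distinction visible.

```python
import pandas as pd

# Hypothetical data whose index labels differ from the row positions
df = pd.DataFrame({'a': [1, 4, 7], 'b': [2, 5, 8]}, index=[10, 20, 30])

by_label = df.loc[10:20]    # rows labelled 10 and 20 (end label included)
by_position = df.iloc[0:2]  # rows at positions 0 and 1 (end excluded)

print(by_label.equals(by_position))  # True
print(df.loc[20, 'b'])               # 5, row labelled 20
print(df.iloc[1, 1])                 # 5, second row, second column

# Position 0 exists, but no row carries the label 0
try:
    df.loc[0]
except KeyError:
    print('no row is labelled 0')
```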
  1. How do you create a pivot table in pandas?
import pandas as pd
# Create a pivot table with index "city", columns "gender", and values "value"
pd.pivot_table(df, index='city', columns='gender', values='value')
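A self-contained sketch of the call above on made-up data (the city names and values are hypothetical). By default `pivot_table` aggregates duplicate (index, columns) pairs with the mean; the `aggfunc` and `fill_value` arguments change the aggregation and fill empty cells.

```python
import pandas as pd

# Hypothetical sample data: Pune has two F rows, Delhi has no F rows
df = pd.DataFrame({
    'city': ['Pune', 'Pune', 'Pune', 'Delhi'],
    'gender': ['F', 'F', 'M', 'M'],
    'value': [10, 30, 5, 7],
})

# Default aggregation is the mean of each (city, gender) cell
mean_table = pd.pivot_table(df, index='city', columns='gender', values='value')

# Sum instead of mean, with 0 in cells that have no data
sum_table = pd.pivot_table(
    df, index='city', columns='gender', values='value',
    aggfunc='sum', fill_value=0,
)

print(mean_table.loc['Pune', 'F'])  # 20.0
print(sum_table.loc['Pune', 'F'])   # 40
print(sum_table.loc['Delhi', 'F'])  # 0
```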
  1. How do you create a bar plot of a pivot table in pandas?
import pandas as pd
import matplotlib.pyplot as plt
# Create a pivot table
table = pd.pivot_table(df, index='city', columns='gender', values='value')
# Plot the pivot table as a stacked bar plot
table.plot(kind='bar', stacked=True)
plt.show()
  1. What are the common plot types in pandas?

Line plot: A line plot connects data points with line segments, typically to show how a value changes over an ordered variable such as time. To create a line plot with pandas, you can use the plot method and specify the kind parameter as 'line'. For example:

import pandas as pd
import matplotlib.pyplot as plt
# Read in the data
df = pd.read_csv('data.csv')
# Create a line plot
df.plot(x='date', y='sales', kind='line')
# Show the plot
plt.show()

Bar plot: A bar plot draws one bar per category, with the bar's height proportional to its value, which makes it suited to comparing categories. To create a bar plot with pandas, you can use the plot method and specify the kind parameter as 'bar'. For example:

# Create a bar plot
df.plot(x='country', y='sales', kind='bar')
# Show the plot 
plt.show()

Scatter plot: A scatter plot draws one dot per row at the position given by two numeric columns, which helps reveal the relationship between them. To create a scatter plot with pandas, you can use the plot method and specify the kind parameter as 'scatter'. For example:

# Create a scatter plot 
df.plot(x='x_col', y='y_col', kind='scatter')
# Show the plot 
plt.show()

Histogram: A histogram displays the distribution of a numeric variable by counting how many values fall into each bin. To create a histogram with pandas, you can use the hist method. For example:

# Create a histogram 
df['column_name'].hist()
# Show the plot 
plt.show()

