Pandas is one of the most popular Python libraries for data analysis and is widely used in data science projects. If you are applying for a data engineer or data scientist job, you may be asked questions about Pandas in the interview.
Pandas Interview Questions
Q:- What is Pandas?

Pandas is one of the most popular Python libraries for data analysis. It provides highly optimized performance, with back-end source code written in Python, Cython, and C.

We can analyze data in pandas using:

  • Series
  • DataFrames

Pandas is free software released under the three-clause BSD license.

Q:- What is a Pandas Series?

A Pandas Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.).

The axis labels are collectively called the index. A Series is comparable to a single column in an Excel sheet.

Creating a Series from an array

    # import pandas as pd
    import pandas as pd

    # import numpy as np
    import numpy as np

    # build a Series from a NumPy array
    data = np.array(['p', 'a', 'n', 'd', 'a', 's'])
    myseries = pd.Series(data)
    print(myseries)

    # Output
    0    p
    1    a
    2    n
    3    d
    4    a
    5    s
    dtype: object

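Since the index is what makes a Series "labeled", a short sketch of custom labels may help. The names and scores below are made up purely for illustration; only `pd.Series`, its `index` argument, and `iloc` come from pandas itself.

```python
import pandas as pd

# Hypothetical data: labels and values are invented for this example.
scores = pd.Series([90, 85, 77], index=["alice", "bob", "carol"])

# Label-based access goes through the index.
bob_score = scores["bob"]

# Position-based access uses iloc.
first_score = scores.iloc[0]

# A dict can also be passed; its keys become the index labels.
from_dict = pd.Series({"x": 1.5, "y": 2.5})

print(scores)
print(bob_score, first_score)
print(list(from_dict.index))
```

When no index is given, as in the array example above, pandas falls back to the default integer labels 0, 1, 2, and so on.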
Q:- What is a Pandas DataFrame?

A DataFrame is a two-dimensional labeled data structure with columns of potentially different types. You can think of it as a spreadsheet or a dict of Series objects.

Creating a DataFrame from a dictionary

    >>> d = {'col1': [1, 2], 'col2': [3, 4]}
    >>> df = pd.DataFrame(data=d)
    >>> df
       col1  col2
    0     1     3
    1     2     4

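The "dict of Series" view and the "columns of different types" point can be sketched as follows. The column names and values are hypothetical and chosen only to mix types.

```python
import pandas as pd

# Hypothetical columns, picked only to show mixed types.
names = pd.Series(["p", "q", "r"])
counts = pd.Series([1, 2, 3])
ratios = pd.Series([0.5, 1.5, 2.5])

# Each dict key becomes a column name; each Series becomes a column.
frame = pd.DataFrame({"name": names, "count": counts, "ratio": ratios})

# Every column keeps its own dtype.
print(frame.dtypes)

# Selecting a single column gives back a Series.
count_column = frame["count"]
print(type(count_column).__name__)
print(frame.shape)
```

Bracket access is used for the `count` column because attribute access (`frame.count`) would resolve to the DataFrame's `count` method instead.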
Q:- How do I sort a pandas DataFrame or a Series?

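You can sort a Series or a DataFrame by value using sort_values; for a DataFrame, pass the column name(s) to sort by through the by parameter. To sort by index labels instead, use sort_index.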

    >>> s = pd.Series([np.nan, 1, 2, 3, 5])
    >>> s
    0    NaN
    1    1.0
    2    2.0
    3    3.0
    4    5.0
    dtype: float64

    # Sort values in ascending order (NaN is placed last by default)
    >>> s.sort_values(ascending=True)
    1    1.0
    2    2.0
    3    3.0
    4    5.0
    0    NaN
    dtype: float64

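The question also asks about DataFrames, so here is a minimal sketch of sorting one. The frame and its labels are invented for illustration; `sort_values` with `by` and `sort_index` are the pandas calls being shown.

```python
import pandas as pd

# Hypothetical frame with deliberately unordered rows.
df = pd.DataFrame(
    {"col1": [2, 1, 2, 1], "col2": [4.0, 3.0, 1.0, 2.0]},
    index=["d", "b", "a", "c"],
)

# Sort by a single column.
by_one = df.sort_values(by="col2")

# Sort by col1 ascending, breaking ties by col2 descending.
by_two = df.sort_values(by=["col1", "col2"], ascending=[True, False])

# Sort by the row labels rather than the values.
by_label = df.sort_index()

print(by_one)
print(by_two)
print(by_label)
```

All three calls return a new object and leave `df` unchanged.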
Q:- What is Pandas Reindexing?

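Reindexing conforms a Series or DataFrame to a new set of row labels or column labels. Existing data is realigned to match the new labels, and any label that was not present in the original is filled with NaN by default.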
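A small sketch of reindexing, assuming a made-up three-row frame; the labels "z" and "col3" are deliberately absent from the original so the fill behaviour is visible.

```python
import pandas as pd

# Hypothetical frame used only for illustration.
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": [4, 5, 6]},
    index=["a", "b", "c"],
)

# Reorder the rows and request a label ("z") that does not exist yet.
rows = df.reindex(["c", "a", "z"])

# Same request, but fill the missing row with 0 instead of NaN.
filled = df.reindex(["c", "a", "z"], fill_value=0)

# Reindex the columns: col2 is kept, col3 is new and therefore all NaN.
cols = df.reindex(columns=["col2", "col3"])

print(rows)
print(filled)
print(cols)
```

Note that reindex selects and realigns data by label; it does not rename existing labels (rename is the usual tool for that).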

Q:- How do I read a tabular data file into pandas?

You can read a tabular data file into pandas using read_table, which expects tab-separated values by default.

    import pandas as pd

    # Note: replace SOURCE_URL with the actual URL or file path.
    table_data = pd.read_table("SOURCE_URL")
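To try read_table without a real file or URL, one option is to feed it an in-memory buffer. The sample text below is invented for this sketch, and io.StringIO stands in for the actual source.

```python
import io

import pandas as pd

# Hypothetical tab-separated content standing in for a real file.
raw = "name\tscore\nalice\t90\nbob\t85\n"

# read_table splits on tabs by default.
table_data = pd.read_table(io.StringIO(raw))

# A different delimiter can be passed through sep.
piped = pd.read_table(io.StringIO("name|score\ncarol|77\n"), sep="|")

print(table_data)
print(piped)
```

The first line of the input is treated as the header row unless told otherwise.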

For more info read here - Pandas Read Table