
Pandas in Python

Learn how to use the Pandas library in Python for data analysis and manipulation.

pip install pandas

Overview

Pandas is an open-source data analysis and manipulation library for Python. It provides data structures and functions for working with structured (tabular, labeled) data, which makes it a standard tool for data scientists and analysts.

Key features of Pandas include automatic data alignment on labels, missing data handling, label-based slicing, and split-apply-combine grouping through groupby. It is widely used for data cleaning, transformation, and analysis in fields like finance, economics, and data science.
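To make data alignment and label-based slicing concrete, here is a minimal sketch. The Series names (q1, q2) and their values are made up for illustration.

```python
import pandas as pd

# Two Series with partially overlapping labels (illustrative values)
q1 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
q2 = pd.Series([1, 2, 3], index=['b', 'c', 'd'])

# Arithmetic aligns on index labels; a label present in only one
# Series produces a missing value (NaN) in the result
total = q1 + q2
print(total)

# Label-based slicing with .loc includes both endpoints
middle = total.loc['b':'c']
print(middle)
```

Note that the result is aligned by label rather than by position, so 'a' and 'd' come out as NaN instead of being silently paired with the wrong value.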

To get started with Pandas, install the library and familiarize yourself with its two core data structures: Series, a one-dimensional labeled array, and DataFrame, a two-dimensional table whose columns are Series. Common patterns include data loading, cleaning, exploration, and visualization.
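As a rough illustration of how the two core structures relate, the sketch below builds a small DataFrame and pulls one column out of it as a Series. The names, cities, and index labels are invented sample data.

```python
import pandas as pd

# A DataFrame is a two-dimensional table with row and column labels
df = pd.DataFrame(
    {'age': [25, 30, 35], 'city': ['Paris', 'Oslo', 'Lima']},
    index=['Alice', 'Bob', 'Charlie'],
)

# Selecting a single column returns a Series that shares the row labels
age_column = df['age']
print(type(age_column).__name__)

# .loc looks up a single value by row label and column label
print(df.loc['Bob', 'city'])
```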

Code Examples

Creating a DataFrame

import pandas as pd
data = {'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]}
df = pd.DataFrame(data)
print(df)

Reading a CSV file

import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())

Filtering Data

import pandas as pd
df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]})
# Boolean mask: keep only the rows where age is greater than 30
over_30 = df[df['age'] > 30]
print(over_30)

Group By and Aggregate

import pandas as pd
data = {'name': ['Alice', 'Bob', 'Alice'], 'score': [85, 90, 95]}
df = pd.DataFrame(data)
grouped = df.groupby('name').sum()
print(grouped)

Handling Missing Data

import pandas as pd
df = pd.DataFrame({'name': ['Alice', 'Bob', None], 'age': [25, None, 35]})
# Fill each column with a value that matches its type
df_filled = df.fillna({'name': 'Unknown', 'age': df['age'].mean()})
print(df_filled)

Common Methods

read_csv

Loads data from a CSV file into a DataFrame.

to_csv

Writes the DataFrame to a CSV file.
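A minimal round-trip sketch for to_csv and read_csv follows. It writes to an in-memory io.StringIO buffer instead of a file on disk so that it runs anywhere; the sample data is made up.

```python
import io
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob'], 'age': [25, 30]})

# index=False leaves the row labels out of the CSV output
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the same text back into a new DataFrame
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```

Passing a file path such as 'data.csv' in place of the buffer writes to disk instead.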

head

Returns the first n rows of the DataFrame (5 by default).

tail

Returns the last n rows of the DataFrame (5 by default).

describe

Generates descriptive statistics (count, mean, standard deviation, minimum, quartiles, maximum) for the numeric columns of the DataFrame by default.
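As a small sketch of describe, using the same sample data as the earlier examples:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]})

# By default only numeric columns are summarized, so 'name' is skipped
summary = df.describe()
print(summary)

# The result is itself a DataFrame, so individual statistics can be looked up
print(summary.loc['mean', 'age'])
```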

groupby

Groups rows by the values of one or more columns and returns a grouped object, to which an aggregation such as sum or mean is then applied per group.

merge

Combines two DataFrame objects with a database-style join on shared columns or indexes.
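The sketch below shows merge joining two small tables on a shared 'name' column. The tables (people, scores) and their values are hypothetical sample data.

```python
import pandas as pd

people = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]})
scores = pd.DataFrame({'name': ['Alice', 'Bob', 'Dana'], 'score': [85, 90, 70]})

# Default how='inner': keep only names that appear in both tables
inner = people.merge(scores, on='name')
print(inner)

# how='left': keep every row of people; unmatched rows get NaN for score
left = people.merge(scores, on='name', how='left')
print(left)
```

Which value of how to use depends on whether unmatched rows should be dropped or kept with missing values.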

pivot_table

Creates a spreadsheet-style pivot table as a DataFrame.
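For illustration, here is a minimal pivot_table sketch that sums revenue per region and quarter. The sales table and its column names are invented for this example.

```python
import pandas as pd

sales = pd.DataFrame({
    'region': ['North', 'North', 'South', 'South'],
    'quarter': ['Q1', 'Q2', 'Q1', 'Q1'],
    'revenue': [100, 150, 80, 120],
})

# One row per region, one column per quarter, summing revenue in each cell
table = sales.pivot_table(
    index='region', columns='quarter', values='revenue', aggfunc='sum'
)
print(table)
```

Combinations with no rows (here South in Q2) show up as NaN; the fill_value parameter can replace them with a default such as 0.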

fillna

Fills missing values with a specified value.

dropna

Removes rows (or, with axis=1, columns) that contain missing values from a DataFrame.
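A short sketch of dropna, reusing the sample data from the missing-data example:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', None], 'age': [25, None, 35]})

# Drop every row that has at least one missing value
complete_rows = df.dropna()
print(complete_rows)

# Only require 'name' to be present; a missing age is tolerated
named_rows = df.dropna(subset=['name'])
print(named_rows)
```

dropna returns a new DataFrame and leaves the original unchanged.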
