Python for Data Analysis
Python for data work: Pandas, analysis, and workflows.
Core libraries
- Pandas — DataFrames, series, and data manipulation
- NumPy — Arrays and numerical operations
- Matplotlib / Seaborn — Plotting and visualization
- Jupyter — Interactive notebooks for exploration
Common patterns
Reading and shaping
import pandas as pd
df = pd.read_csv("data.csv")
df = df.rename(columns=str.lower).dropna(subset=["key_col"])
Aggregation
summary = df.groupby("category").agg(
count=("id", "count"),
total=("amount", "sum"),
)
Merging
# left and right are DataFrames sharing an "id" column; how="left" keeps every row of left
merged = pd.merge(left, right, on="id", how="left")
Best practices
- Use vectorized operations instead of loops when possible
- Set dtype when loading to control memory
- Prefer method chaining for readability
- Use pd.read_sql for database queries; avoid loading full tables into memory
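
The practices above can be sketched in one short script. This is a minimal illustration, not a recommended layout: the inline CSV, the column names (ID, Category, Amount), the derived amount_doubled column, and the in-memory SQLite table named sales are all assumptions made up for the example.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical inline CSV standing in for a real file such as "data.csv".
raw = io.StringIO(
    "ID,Category,Amount\n"
    "1,a,10.5\n"
    "2,b,3.0\n"
    "3,a,4.5\n"
    "4,,7.0\n"
)

# dtype at load time: int32 and category use less memory than the
# default int64 and object columns.
# Method chaining: each step returns a new DataFrame, so the pipeline
# reads top to bottom without intermediate variables.
df = (
    pd.read_csv(raw, dtype={"ID": "int32", "Category": "category"})
    .rename(columns=str.lower)
    .dropna(subset=["category"])  # drops the row with the empty category
    # Vectorized: one expression over the whole column, no Python loop.
    .assign(amount_doubled=lambda d: d["amount"] * 2)
)

# Hypothetical database: an in-memory SQLite table built with the stdlib.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "a", 10.5), (2, "b", 3.0), (3, "a", 4.5)],
)

# Push the aggregation into SQL so only the small result reaches pandas.
totals = pd.read_sql(
    "SELECT category, SUM(amount) AS total "
    "FROM sales GROUP BY category ORDER BY category",
    conn,
)

# When the full table is needed, chunksize makes read_sql return an
# iterator of DataFrames instead of one large frame.
rows_seen = 0
for chunk in pd.read_sql("SELECT * FROM sales", conn, chunksize=2):
    rows_seen += len(chunk)

conn.close()
```

With a real database, the same read_sql calls would typically take a SQLAlchemy engine or connection in place of the sqlite3 connection used here.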