A hands-on learning repository for data analysis with Pandas, NumPy, Matplotlib, and Seaborn
Installation • Topics • Resources • About
This repository documents everything I've learned while working with Pandas. Starting from the basics like Series and DataFrames, I gradually moved towards more complex topics like grouping, merging, and multi-index operations. Each notebook represents a specific topic explored in detail.
The core philosophy is learning by working with actual datasets, experimenting with different methods, and visualizing results using Matplotlib and Seaborn. I've used Ruff to maintain clean code and uv for efficient dependency management.
- Python 3.13 or above
- Git
Clone the repository:
git clone https://github.com/Tams3d/pandas-journey.git
cd pandas-journey
Create and activate a virtual environment (highly recommended):
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate
Install dependencies:
uv sync
This automatically installs pandas, matplotlib, seaborn, ruff, and all other dependencies specified in pyproject.toml.
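To confirm the environment is set up, a small check like the one below can help. It is only a sketch: the helper name `installed_versions` and the file name `check_env.py` are made up for illustration, and the package list is taken from the dependencies mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the dependency list above.
PACKAGES = ["pandas", "matplotlib", "seaborn", "ruff"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the active environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Saved as `check_env.py` (a hypothetical name), it can be run with `uv run python check_env.py`. Any line reporting `not installed` suggests the sync step did not complete.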
- Series Operations - Creation, indexing, and basic operations
- DataFrame Basics - Structure, creation, and fundamental operations
- Data Access - Techniques for retrieving and filtering data
- Data Modification - Updating, adding, and removing data
- Built-in Methods - Leveraging pandas' extensive method library
- String Operations - Text processing and manipulation
- GroupBy Operations - Split-apply-combine workflows and aggregations
- Merging DataFrames - Joins, concatenation, and combining datasets
- MultiIndex - Working with hierarchical indices
- Time Series - Datetime handling and temporal analysis
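As a quick taste of several of these topics together, here is a minimal sketch. The `sales` and `managers` tables are invented for illustration and are not taken from the notebooks; the notebooks themselves may approach each topic differently.

```python
import pandas as pd

# Hypothetical sales records, invented purely for illustration.
sales = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-17"]
        ),
        "region": ["north", "south", "north", "south"],
        "product": [" Widget", "gadget ", "widget", "Gadget"],
        "units": [10, 4, 7, 12],
    }
)

# String operations: normalise inconsistent product labels.
sales["product"] = sales["product"].str.strip().str.lower()

# Data access: boolean filtering with .loc.
big_orders = sales.loc[sales["units"] >= 10, ["region", "units"]]

# GroupBy: split by region, then sum the units in each group.
units_by_region = sales.groupby("region")["units"].sum()

# Merging: attach a manager to each row with a left join.
managers = pd.DataFrame(
    {"region": ["north", "south"], "manager": ["Asha", "Ben"]}
)
merged = sales.merge(managers, on="region", how="left")

# MultiIndex: grouping by two keys yields a hierarchical index.
units_by_region_product = sales.groupby(["region", "product"])["units"].sum()

# Time series: resample on a DatetimeIndex to get monthly totals.
monthly_units = sales.set_index("date")["units"].resample("MS").sum()

print(units_by_region)
print(units_by_region_product.loc[("north", "widget")])
print(monthly_units)
```

Each step maps to one topic in the list, so the corresponding notebook is the place to look for the details behind any line that is unclear.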
- Creating plots with Matplotlib
- Statistical graphics with Seaborn
- Customizing visualizations for data presentation
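The three visualization points above can be sketched in one small figure. The data, column names, and output file name `scores.png` are assumptions made up for this example. The `Agg` backend line is only there so the script runs without a display; inside a notebook it can be dropped.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; unnecessary inside a notebook

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical measurements, invented for illustration.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b"],
        "hours": [1, 2, 3, 1, 2, 3],
        "score": [52, 61, 70, 48, 66, 79],
    }
)

fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(9, 4))

# Matplotlib: one line per group, drawn directly on the axes.
for name, part in df.groupby("group"):
    ax_line.plot(part["hours"], part["score"], marker="o", label=name)

# Customization: axis labels, title, and legend.
ax_line.set_xlabel("Hours studied")
ax_line.set_ylabel("Score")
ax_line.set_title("Score by hours")
ax_line.legend(title="Group")

# Seaborn: a statistical plot (mean score per group) on the second axes.
sns.barplot(data=df, x="group", y="score", ax=ax_bar)
ax_bar.set_title("Mean score per group")

fig.tight_layout()
fig.savefig("scores.png", dpi=150)
```

Passing `ax=` to the Seaborn call keeps both libraries on the same figure, which makes it easy to mix a hand-built Matplotlib plot with a Seaborn one.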
- Pandas Cheat Sheet – pandas.pydata.org (Official)
  Download PDF - A handy reference guide summarizing the most common Pandas operations.
- Python for Data Analysis – Wes McKinney (English)
  Read online - A comprehensive guide to using Pandas, written by its creator, great for both beginners and experienced users.
- Pandas Tutorial Series – Corey Schafer (English)
  Watch on YouTube - A practical tutorial series on core Pandas features; slightly older but still very relevant.
- Pandas Tutorial – CampusX (Hindi)
  Watch on YouTube - Beginner-friendly, well-paced Pandas tutorial in Hindi, great for getting started with data analysis.
- Pandas Exercises – guipsamora (GitHub)
  Explore on GitHub - A comprehensive collection of practical exercises covering topics like data filtering, merging, grouping, time series analysis, and visualization.
I'm 18 and deeply interested in AI, machine learning, and data science. I've realized that the best way to learn is by building things and experimenting with real problems rather than just reading theory.
Apart from programming, I enjoy creative work like photo and video editing, and 3D design using Blender. I often combine these creative interests with technical skills to build useful tools and workflows.
Right now, I'm focusing on deep learning and generative AI. My long-term goal is to contribute to research and open-source projects in the AI domain. This repository is part of building that foundation, since understanding core tools like Pandas is essential for any serious work in data science or ML.
This project is released under the MIT License. Check the LICENSE file for more details.
⭐ If you find this repository helpful, please consider giving it a star!
This is primarily a learning repository. The code reflects my learning process and may not always follow production-level best practices. The focus is on understanding concepts through hands-on experimentation.
Made with 🤍 by Tamil Selvan