PostgreSQL | SQL | Data Analytics | Windown Functions | CTE
This project analyzes Zomato’s restaurant, customer, rider, and order data using PostgreSQL. It involves data creation, cleaning, transformation, and insightful SQL queries to derive meaningful business metrics such as customer behavior, restaurant performance, and delivery efficiency.
The analysis consists of the following key steps:
- Database Setup – Create all required tables.
- Data Cleaning – Handle null values, enforce relationships, and remove invalid records.
- Data Analysis Queries – Execute SQL queries to answer important business questions.
-
RESTAURANTS
- Contains restaurant details like name, city, and opening hours.
-
RIDERS
- Stores rider information and sign-up dates.
-
CUSTOMERS
- Holds customer registration details.
-
ORDERS
- Contains order details including customer, restaurant, status, and total amount.
-
DELIVERY
- Tracks delivery information, rider assigned, and delivery status.
- ORDERS → CUSTOMERS →
customer_id - ORDERS → RESTAURANTS →
restaurant_id - DELIVERY → ORDERS →
order_id - DELIVERY → RIDERS →
rider_id
- Checked and removed null or incomplete records from
ORDERS,CUSTOMERS, andRESTAURANTS. - Ensured referential integrity using foreign keys.
- Practiced inserting and deleting test data to validate constraints.
- Top 5 most ordered dishes by a specific customer (e.g., Arjun Mehta).
- Average Order Value (AOV) for customers with more than 180 orders.
- High-value customers (spent > ₹100K).
- Customer churn analysis – who ordered in 2023 but not in 2024.
- Customer lifetime value (CLV) and segmentation into Gold/Silver tiers.
- Orders without delivery (potential cancellations).
- Restaurant revenue ranking by city.
- Most popular dishes per city.
- Monthly restaurant growth ratio based on delivered orders.
- Monthly sales trends to observe revenue growth or decline.
- Rider average delivery time.
- Monthly rider earnings (8% of order value).
- Rider ratings based on delivery speed (5⭐, 4⭐, 3⭐ logic).
- Rider efficiency – fastest and slowest average delivery times.
- Order item popularity by season (Spring, Summer, Winter).
- Order frequency by day of the week – peak ordering day per restaurant.
-
Create the database
CREATE DATABASE zomato_sales;
-
Connect to the database
\c zomato_sales;
-
Run the provided SQL script to create tables, clean data, and perform analysis.
-
Verify table creation
\dt
-
Execute analysis queries section by section as per the business question.
- PostgreSQL 14+
- SQL Window Functions (RANK, LAG, etc.)
- CTEs (Common Table Expressions)
- Aggregate Functions (SUM, AVG, COUNT)
This project is open for collaboration and contributions!
If you're passionate about data analytics, SQL optimization, or building data-driven insights, feel free to join in.
- 🧱 Add new analytical SQL queries or visualizations
- 🧼 Improve data cleaning and integrity checks
- 📊 Build dashboards or reports based on query results
- 📘 Enhance documentation or README clarity
- Fork this repository
- Create a new branch for your feature or improvement
- Commit your changes with clear messages
- Submit a pull request with a short description of your contribution
For questions, discussions, or collaboration opportunities, reach out via:
- 📧 Email: hajra.mshahid24@gmail.com
Let’s collaborate to make this Zomato Data Analysis project more powerful and insightful together!