This project explores the Netflix Movies & TV Shows dataset using SQL to derive insights about content distribution, genres, ratings, countries, and trends over time. It covers the complete workflow: importing the data, designing the schema, writing analytical queries, and summarizing the results to support business decisions.
- Design a clean SQL schema for the Netflix dataset
- Organize and validate raw data for analysis
- Write SQL queries to solve real business problems
- Generate insights on content trends and streaming industry patterns
- Demonstrate SQL proficiency for analytics and reporting roles
- SQL
- PostgreSQL
- pgAdmin
- Dataset: netflix_titles.csv (Kaggle)
- Created a structured table (netflix) with well-defined datatypes
- Included fields like title, director, country, duration, release_year, etc.
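
A minimal sketch of what such a table could look like in PostgreSQL. Only the table name and the fields title, director, country, duration, and release_year come from this README; the remaining column names are assumed from the header of the Kaggle netflix_titles.csv file, and every datatype here is an illustrative choice rather than the project's actual definition.

```sql
-- Illustrative schema sketch; datatypes and most column names are assumptions.
DROP TABLE IF EXISTS netflix;

CREATE TABLE netflix (
    show_id      TEXT PRIMARY KEY,  -- assumed unique ID column from the CSV
    type         TEXT,              -- assumed: 'Movie' or 'TV Show'
    title        TEXT,
    director     TEXT,
    casts        TEXT,              -- CSV column "cast", renamed because CAST is a reserved word
    country      TEXT,              -- may hold several comma-separated countries
    date_added   TEXT,              -- assumed text such as 'September 25, 2021'
    release_year INT,
    rating       TEXT,
    duration     TEXT,              -- assumed text such as '90 min' or '2 Seasons'
    listed_in    TEXT,              -- assumed comma-separated genre list
    description  TEXT
);
```

Storing date_added as text keeps the CSV import simple; the value can be parsed at query time when a date is needed.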
Examples include:
- Content added by year
- Most common ratings
- Countries with highest content
- Directors with most releases
- Keyword-based categorization (e.g., "kill", "violence")
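
Two of the questions above could be sketched as follows. These are not the project's actual queries: the date_added and description columns, the date format, and the category labels are assumptions for illustration.

```sql
-- Content added by year.
-- Assumes date_added is text like 'September 25, 2021' and blanks were imported as NULL.
SELECT
    EXTRACT(YEAR FROM TO_DATE(date_added, 'Month DD, YYYY')) AS year_added,
    COUNT(*) AS titles_added
FROM netflix
WHERE date_added IS NOT NULL
GROUP BY year_added
ORDER BY year_added;

-- Keyword-based categorization.
-- ILIKE is a case-insensitive match; the labels are hypothetical.
SELECT
    CASE
        WHEN description ILIKE '%kill%'
          OR description ILIKE '%violence%' THEN 'Flagged'
        ELSE 'Not flagged'
    END AS category,
    COUNT(*) AS titles
FROM netflix
GROUP BY category;
```

Note that a substring match such as '%kill%' also catches words like "skill", so the flagged count is an upper bound rather than an exact measure.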
- Aggregations
- Joins
- GROUP BY / ORDER BY
- Date functions
- String functions
- Window functions
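
As one possible illustration of how string functions, aggregation, and a window function combine, the sketch below ranks countries by number of titles. It assumes the country column can hold several comma-separated names, which is an assumption about the data rather than something stated here; the CTE name and aliases are hypothetical.

```sql
-- Split comma-separated countries into rows, count titles, then rank the counts.
WITH per_country AS (
    SELECT
        TRIM(t.c) AS country_name,
        COUNT(*)  AS titles
    FROM netflix
    CROSS JOIN LATERAL UNNEST(STRING_TO_ARRAY(netflix.country, ',')) AS t(c)
    WHERE TRIM(t.c) <> ''          -- skip empty fragments left by stray commas
    GROUP BY TRIM(t.c)
)
SELECT
    country_name,
    titles,
    RANK() OVER (ORDER BY titles DESC) AS country_rank
FROM per_country
ORDER BY country_rank
LIMIT 5;
```

RANK() assigns the same rank to ties and leaves gaps afterwards; DENSE_RANK() is the alternative when consecutive ranks are preferred.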
- Movies outnumber TV shows in the catalog.
- Most content is rated TV-MA.
- The United States and India have the highest number of titles.
- Keyword-based tagging helps detect themes like violence or crime.
- Yearly additions show strong growth around 2017–2019.
Author: Abbas Imran
Email: abbasimranabdi009@gmail.com
Location: Lucknow, India