ashmeetkaur2003/Netflix-data-analysis

This project is an exploratory data analysis (EDA) of the Netflix content library. It uses a dataset of titles to compute and visualize key metrics about the library's composition and trends. The script relies on Pandas for data cleaning and manipulation and on Matplotlib for an interactive visualization interface. The goal is to turn raw title data into insights about the library's characteristics and audience targeting.

The workflow begins by importing the data and running basic data quality checks: it prints a summary of column data types and a count of missing values per column. Next, rows with no value in the listed_in column are dropped, so the genre analysis only sees rows that actually carry genre data. The cleaned dataset then feeds five analyses: the count of Movies versus TV Shows, the Top 10 most frequent genres (found by splitting and exploding the genre strings), the distribution of content by release year, the distribution of movie durations, and the breakdown of content ratings.
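The cleaning and counting steps described above can be sketched as follows. This is a minimal illustration, not the repository's actual script: only the listed_in column is named in this README, so the other column names (type, release_year, rating, duration), the helper function names, and the inline sample data are assumptions made for the example.

```python
import io

import pandas as pd

# Tiny stand-in for the real CSV so the sketch runs on its own.
# Column names other than listed_in are assumptions.
SAMPLE_CSV = """type,title,listed_in,release_year,rating,duration
Movie,A,"Dramas, Comedies",2019,PG-13,90 min
TV Show,B,"Dramas",2020,TV-MA,2 Seasons
Movie,C,,2018,R,120 min
Movie,D,"Comedies, Dramas",2019,PG-13,101 min
"""


def load_and_clean(source):
    """Read the data, print quality checks, drop rows without genres."""
    df = pd.read_csv(source)
    df.info()                 # column data types and non-null counts
    print(df.isnull().sum())  # missing values per column
    return df.dropna(subset=["listed_in"])


def top_genres(df, n=10):
    """Split the comma-separated genre strings and count each genre."""
    genres = df["listed_in"].str.split(",").explode().str.strip()
    return genres.value_counts().head(n)


def movie_minutes(df):
    """Pull the numeric part out of strings such as '90 min' (movies only)."""
    movies = df[df["type"] == "Movie"]
    return pd.to_numeric(movies["duration"].str.extract(r"(\d+)", expand=False))


df = load_and_clean(io.StringIO(SAMPLE_CSV))
type_counts = df["type"].value_counts()           # Movies vs TV Shows
year_counts = df["release_year"].value_counts()   # titles per release year
rating_counts = df["rating"].value_counts()       # titles per rating
print(type_counts, top_genres(df), movie_minutes(df), sep="\n")
```

With a real dataset, the io.StringIO object would be replaced by the path to the CSV file.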

The most notable feature of this project is its interactive output. Instead of opening five separate static windows, the script uses matplotlib.widgets.Button to present all five plots in a single figure. 'Next' and 'Previous' buttons let the user step through the visualizations, from the ratio of content types to the breakdown of movie runtimes and content ratings. Together, the plots give a snapshot of Netflix's content library.
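One way to wire up this kind of button navigation is sketched below. It is an illustration under assumptions, not the project's own code: the PlotPager class, the two placeholder draw functions, the button positions, and the wrap-around behaviour at the first and last plot are all made up for the example. Only matplotlib.widgets.Button and the 'Next'/'Previous' labels come from the description above.

```python
import matplotlib.pyplot as plt
from matplotlib.widgets import Button


class PlotPager:
    """Redraws one shared Axes with the currently selected plot (hypothetical helper)."""

    def __init__(self, fig, ax, draw_funcs):
        self.fig = fig
        self.ax = ax
        self.draw_funcs = draw_funcs
        self.index = 0
        self.show()

    def show(self):
        self.ax.clear()
        self.draw_funcs[self.index](self.ax)
        self.fig.canvas.draw_idle()

    def next(self, event=None):
        # Assumption: wrap around after the last plot.
        self.index = (self.index + 1) % len(self.draw_funcs)
        self.show()

    def previous(self, event=None):
        self.index = (self.index - 1) % len(self.draw_funcs)
        self.show()


# Placeholder plots with made-up numbers; the real script has five.
def draw_types(ax):
    ax.bar(["Movie", "TV Show"], [2, 1])
    ax.set_title("Movies vs TV Shows")


def draw_years(ax):
    ax.bar(["2019", "2020"], [2, 1])
    ax.set_title("Titles by release year")


fig, ax = plt.subplots()
fig.subplots_adjust(bottom=0.2)  # leave room for the buttons
pager = PlotPager(fig, ax, [draw_types, draw_years])

# Keep references to the buttons so they stay responsive.
ax_prev = fig.add_axes([0.59, 0.05, 0.15, 0.075])
ax_next = fig.add_axes([0.76, 0.05, 0.15, 0.075])
btn_prev = Button(ax_prev, "Previous")
btn_next = Button(ax_next, "Next")
btn_prev.on_clicked(pager.previous)
btn_next.on_clicked(pager.next)
```

Calling plt.show() after this setup opens the window. Each button click clears the shared Axes and redraws the selected plot, which is why a single figure is enough for all views.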

HOW TO SETUP AND RUN:

  1. Install Python and add it to your PATH.
  2. Open a command prompt and install NumPy: pip install numpy
  3. Install pandas: pip install pandas
  4. Install Matplotlib: pip install matplotlib
  5. To run the script, open a command prompt in the folder that contains the Python file.
  6. Run: python hi.py (here hi.py is the name of the Python file). Note: the Python file and the CSV file must be in the same folder.
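Before running the script, the setup above can be checked with a short helper like the one below. This is an optional sketch and not part of the project: the function names are hypothetical, and the CSV file name is left as a parameter because this README does not state it.

```python
import importlib.util
from pathlib import Path


def missing_packages(names=("numpy", "pandas", "matplotlib")):
    """Return the packages from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def csv_next_to(script_path, csv_name):
    """Return True if `csv_name` sits in the same folder as the script."""
    return (Path(script_path).resolve().parent / csv_name).is_file()


print("Missing packages:", missing_packages())
```

An empty list means steps 2 to 4 succeeded. csv_next_to("hi.py", "<your csv file name>") reports whether the data file is in place for step 6.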

