leigberk2002/travel-agency-sql-eda

Travel Agency Operations: SQL Exploratory Data Analysis (EDA)

Project Overview

This project contains an end-to-end Exploratory Data Analysis (EDA) script written in SQL. The goal of this analysis is to evaluate a travel agency's internal operations, assess financial accuracy, and uncover macro-level business trends. By transforming, cleaning, and aggregating project data, this script prepares raw information for strategic decision-making and business intelligence (BI) dashboarding.

Data Source

  • Source: Kaggle
  • Dataset: Travel Agency Project Data (Imported via CSV)

Tech Stack

  • Language: SQL (MySQL dialect)
  • Techniques Utilized: Data Type Casting, Data Cleaning, Time Series Aggregation, Cross-Tabulation, Financial Variance Analysis, Standard Deviation (Risk Assessment).
  • Downstream Tools: Designed to output clean datasets for Tableau or Power BI.

Methodology & Analysis Phases

The SQL script (Travel_agency_operations_eda.sql) is broken down into 10 distinct phases:

  1. Database Setup & Data Import: Establishing the schema and loading the CSV data.
  2. Data Understanding: Initial volume checks and row counts.
  3. Data Preparation & Cleaning: Converting generic strings to DATE and DECIMAL types, checking for NULLs, and verifying data integrity against duplicates.
  4. Univariate Analysis: Analyzing the distributions of individual variables like Project Type and Project Status.
  5. Bivariate & Multivariate Analysis: Examining relationships, such as total revenue generated by different departments and revenue spreads across project roles.
  6. Time Series Analysis: Tracking project volume and revenue growth over years, as well as identifying peak seasonal months.
  7. Cross-Tabulation & Deep Dive Categorical Analysis: Examining categorical intersections, such as the most lucrative project types per audience segment, and identifying operational bottlenecks by mapping project statuses to specific departments.
  8. Financial Accuracy Metrics: Comparing actual vs. estimated revenues to determine forecast accuracy across different project categories and target audiences.
  9. Advanced Time Series: Rolling data up into standard fiscal quarters to assess macro-level business cycles.
  10. Risk Assessment: Calculating the Standard Deviation of revenue by project type to identify the financial volatility/unpredictability of specific offerings.
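
As a rough illustration of phases 3, 8, and 10, the sketch below casts text columns to typed ones, compares actual against estimated revenue, and ranks project types by revenue standard deviation. The column names (project_type, start_date, actual_revenue, estimated_revenue), the view name projects_clean, and the 'YYYY-MM-DD' date format are assumptions made for this example; they are not taken from the script or the dataset and should be adjusted to match the imported CSV.

```sql
-- Hypothetical column names throughout; adjust to the real CSV headers.

-- Phase 3 (sketch): expose typed columns through a view so the raw import stays untouched.
-- Assumes dates were imported as 'YYYY-MM-DD' strings.
CREATE OR REPLACE VIEW projects_clean AS
SELECT
    project_type,
    STR_TO_DATE(start_date, '%Y-%m-%d')        AS start_date,
    CAST(actual_revenue AS DECIMAL(12, 2))     AS actual_revenue,
    CAST(estimated_revenue AS DECIMAL(12, 2))  AS estimated_revenue
FROM travel_agency_projects;

-- Phase 8 (sketch): forecast accuracy per project type.
-- A positive variance means actual revenue exceeded the estimate.
SELECT
    project_type,
    SUM(actual_revenue)                          AS total_actual,
    SUM(estimated_revenue)                       AS total_estimated,
    SUM(actual_revenue) - SUM(estimated_revenue) AS revenue_variance,
    ROUND(
        100 * (SUM(actual_revenue) - SUM(estimated_revenue))
            / NULLIF(SUM(estimated_revenue), 0),  -- NULLIF avoids division by zero
        2
    )                                            AS variance_pct
FROM projects_clean
GROUP BY project_type
ORDER BY variance_pct DESC;

-- Phase 10 (sketch): revenue volatility per project type.
-- STDDEV_SAMP is the sample standard deviation; it returns NULL for a single-row group.
SELECT
    project_type,
    COUNT(*)                               AS project_count,
    ROUND(AVG(actual_revenue), 2)          AS avg_revenue,
    ROUND(STDDEV_SAMP(actual_revenue), 2)  AS revenue_stddev
FROM projects_clean
GROUP BY project_type
ORDER BY revenue_stddev DESC;
```

Using a view rather than altering the table in place is a design choice for this sketch: the original import remains available if a cast turns out to be wrong.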

Key Business Questions Answered

  • Which departments are driving the most revenue?
  • What times of the year represent our peak seasons and highest revenue-generating months?
  • How accurately is the agency forecasting revenue, and which teams are over/under-predicting?
  • Which project types carry the highest financial risk (volatility)?
  • Are there operational bottlenecks within specific departments based on project status pipelines?
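
The queries below are a minimal sketch of how three of these questions might be expressed in MySQL. They assume the table already has typed columns named department, project_status, start_date, and actual_revenue; those names are hypothetical and may differ from the actual dataset.

```sql
-- Hypothetical column names; adjust to the real CSV headers.

-- Which departments drive the most revenue?
SELECT
    department,
    SUM(actual_revenue) AS total_revenue
FROM travel_agency_projects
GROUP BY department
ORDER BY total_revenue DESC;

-- Which calendar months are the peak season?
-- Grouping by both expressions keeps the query valid under ONLY_FULL_GROUP_BY.
SELECT
    MONTH(start_date)     AS month_number,
    MONTHNAME(start_date) AS month_name,
    COUNT(*)              AS project_count,
    SUM(actual_revenue)   AS total_revenue
FROM travel_agency_projects
GROUP BY MONTH(start_date), MONTHNAME(start_date)
ORDER BY total_revenue DESC;

-- Where might bottlenecks be? Count projects per status within each department.
SELECT
    department,
    project_status,
    COUNT(*) AS project_count
FROM travel_agency_projects
GROUP BY department, project_status
ORDER BY department, project_count DESC;
```

A department with an unusually large share of projects in a non-final status would be a candidate for closer inspection.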

How to Run the Code

  1. Clone this repository to your local machine.
  2. Open your preferred SQL IDE (e.g., MySQL Workbench).
  3. Run the initial CREATE DATABASE travel_agency_db; command.
  4. Use the Table Data Import Wizard to import your Kaggle CSV file into a table named travel_agency_projects.
  5. Open Travel_agency_operations_eda.sql and execute the queries sequentially to clean the data and generate insights.
  6. Export the resulting tables to Tableau or Power BI for visualization.
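
Steps 3 and 4 can be checked with a few statements like the following. This is a sketch that uses only the database and table names given above; the IF NOT EXISTS guard is an addition for convenience and is not necessarily part of the script.

```sql
-- Create and select the database (IF NOT EXISTS makes the statement safe to re-run).
CREATE DATABASE IF NOT EXISTS travel_agency_db;
USE travel_agency_db;

-- After the import wizard finishes, confirm the table loaded.
SELECT COUNT(*) AS row_count
FROM travel_agency_projects;

-- Inspect the column names and the types the wizard inferred.
DESCRIBE travel_agency_projects;
```

If DESCRIBE shows date or revenue columns typed as text, that is the cue for the conversion work in the Data Preparation & Cleaning phase.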
