Data Analyst and Business Intelligence professional with a BBA in Business Management and a strong foundation in finance, business analytics, and machine learning.
I deliver end-to-end analytics solutions using SQL, Python, Power BI, and Scikit-learn, with experience across EDA, ETL, KPI reporting, dashboard development, customer segmentation, churn prediction, geospatial analysis, and time series forecasting.
My work combines technical analytics skills with business and financial thinking to translate complex data into clear, actionable insights.
I apply CRISP-DM methodology across all projects and document everything here on GitHub.
- SQL • Python • DAX • M (Power Query) • R
- Pandas • NumPy • Matplotlib • Seaborn
- Scikit-learn (Random Forest, KMeans, Decision Tree)
- Folium • Altair • Streamlit
- OpenRouteService API
- Power BI • Power Query • SQL Server (SSMS)
- Jupyter Notebook • VS Code
- Git & GitHub
- EDA • ETL • Churn Prediction • Customer Segmentation
- Geospatial Analysis • Pareto Analysis
- Time Series Forecasting (ARIMA) • Business Storytelling
Tools: Python | Pandas | Scikit-learn (Random Forest) | Matplotlib | Seaborn
- Analyzed 11,233 sales records and 544 customer profiles from a fictional South Carolina distributor to predict which customers were at risk of churning
- Trained a Random Forest classifier that reached 100% accuracy on a 109-sample test set, then used it to score all active 2025 customers by churn risk level
- Identified days since last purchase as the strongest predictor (~20% feature importance) and found that month-to-month customers churn at 2x the rate of customers on annual contracts
- Delivered 5 targeted retention recommendations including an early-warning flag system for accounts silent 60+ days
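
The early-warning flag for accounts silent 60+ days can be illustrated with a minimal sketch. This is not the repository's code: the function name, customer IDs, and dates are hypothetical, and only the standard library is used to keep the example dependency-free.

```python
from datetime import date

# Threshold taken from the "silent 60+ days" recommendation; the constant name is hypothetical.
SILENCE_THRESHOLD_DAYS = 60


def flag_silent_accounts(last_purchase_by_customer, as_of, threshold_days=SILENCE_THRESHOLD_DAYS):
    """Return {customer_id: days_since_last_purchase} for accounts at or past the threshold."""
    flagged = {}
    for customer_id, last_purchase in last_purchase_by_customer.items():
        days_silent = (as_of - last_purchase).days
        if days_silent >= threshold_days:
            flagged[customer_id] = days_silent
    return flagged


# Made-up customers and dates, for illustration only.
last_purchases = {
    "C001": date(2025, 6, 20),
    "C002": date(2025, 3, 1),
    "C003": date(2025, 5, 1),
}
at_risk = flag_silent_accounts(last_purchases, as_of=date(2025, 6, 30))
print(at_risk)  # {'C002': 121, 'C003': 60}
```

Days since last purchase is the same quantity the model ranked as its strongest predictor, so a rule like this could run alongside the model scores as a simple first filter.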
View Repository
Tools: Python | Scikit-learn (KMeans, Decision Tree) | Pandas | Folium | Matplotlib
- Applied KMeans clustering to geographically segment 1,000 customers into 5 commercial divisions based on latitude & longitude, validated using the Elbow Method
- Trained a Decision Tree Classifier on the cluster output achieving 100% accuracy on 200 test samples, enabling fully automated real-time classification of new customers
- Visualized all customer divisions on interactive maps using Folium
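
The cluster-then-classify workflow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's notebook: the coordinates are synthetic, and the cluster centers, noise level, and random seeds are made up.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for customer coordinates: 5 well-separated groups of 200 points.
centers = np.array(
    [[34.2, -81.0], [32.8, -80.0], [34.9, -82.4], [33.5, -79.0], [35.6, -80.5]]
)
coords = np.vstack([c + rng.normal(scale=0.03, size=(200, 2)) for c in centers])

# Elbow Method: record inertia for a range of k and look for where the curve flattens.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(coords).inertia_
    for k in range(1, 9)
}

# Fit the chosen k and use the cluster labels as the target for a classifier.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(coords)
divisions = kmeans.labels_

X_train, X_test, y_train, y_test = train_test_split(
    coords, divisions, test_size=0.2, random_state=0, stratify=divisions
)
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
accuracy = tree.score(X_test, y_test)

# A new customer location is assigned to a division without re-running the clustering.
new_customer_division = tree.predict([[34.2, -81.0]])[0]
```

Because the labels come from clustering the same two features the tree is trained on, very high test accuracy is expected in a setup like this: the tree is relearning boundaries that KMeans already drew.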
View Repository
Tools: Python | Pandas | OpenRouteService API | Folium | Jupyter Notebook
- Built a route simulation engine integrating the OpenRouteService API to calculate real driving distances and compare delivery scenarios across 6 days
- Quantified that an emergency branch change increased total weekly route distance from 5,522 km to 10,250 km, an increase of 4,728 km (+85.6%)
- Visualized both route scenarios on interactive maps using Folium and delivered business-focused recommendations on inventory planning and contingency routing
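
The distance comparison can be checked with a few lines of arithmetic. The two totals come from the bullet above; the variable names are illustrative.

```python
baseline_km = 5_522    # weekly route distance before the branch change
emergency_km = 10_250  # weekly route distance after the branch change

extra_km = emergency_km - baseline_km
pct_increase = extra_km / baseline_km * 100

print(f"+{extra_km:,} km (+{pct_increase:.1f}%)")  # +4,728 km (+85.6%)
```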
View Repository
Tools: Python | Pandas | NumPy | Matplotlib | Seaborn | Jupyter Notebook
- Analyzed 601,836 transactions (2018–2024) to diagnose a business paradox where revenue grew while sales volume and active customer count both declined
- Applied IQR outlier treatment calibrated against operational context and Pareto analysis to identify that one region-market-segment combination drove 28.95% of total volume loss
- Delivered targeted commercial recovery recommendations prioritized by business impact
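
For readers unfamiliar with IQR outlier treatment, a minimal sketch of the fences is shown here. It is an assumption-laden illustration rather than the project's code: the multiplier `k`, the quartile method, and the sample quantities are all made up, and what to do with flagged rows (drop, cap, or keep) depends on the operational context mentioned above.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Made-up order quantities; 480 is an unusually large order.
order_qty = [10, 12, 11, 13, 12, 14, 11, 480, 12, 13]
lower, upper = iqr_bounds(order_qty)
outliers = [q for q in order_qty if q < lower or q > upper]
print(lower, upper, outliers)  # 8.625 15.625 [480]
```

Raising `k` widens the fences, which is one simple way to keep legitimate bulk orders from being treated as errors.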
View Repository
Tools: SQL Server | Power BI (DAX) | R (ARIMA Forecasting)
- Analyzed ~500 S&P 500 companies post-pandemic using SQL Server, 13 custom DAX measures, and an ARIMA forecast model in R integrated directly inside Power BI
- Found that total revenue more than doubled from $5.6T to $13.1T while net income margin improved from 8% to 11%, presented across 6 interactive dashboards
- Found no direct correlation between company size and profitability: business model drives margins more than revenue scale
View Repository
Tools: SQL Server | Power BI | Power Query | DAX
- Built a snowflake schema data model with 20 custom DAX measures and 3 interactive dashboards with collapsible navigation menus and custom tooltip pages
- Identified $9.39M in revenue, $3.91M in profit, and 41.5% average margin across 5 U.S. regions
- Pareto analysis confirmed that the top 27 SKUs generated ~80% of total revenue; Accessories (~62.6%) and Mountain Bikes (~45.4%) were the top profitability drivers
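
The Pareto cutoff behind a statement like "top N SKUs generated ~80% of revenue" can be sketched as a cumulative-share calculation. The function name and SKU figures here are hypothetical; the dashboard itself was built in Power BI, so this Python version only illustrates the logic.

```python
def pareto_cutoff(revenue_by_item, threshold=0.80):
    """Return the smallest set of top items whose cumulative revenue share reaches the threshold."""
    total = sum(revenue_by_item.values())
    ranked = sorted(revenue_by_item.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for item, revenue in ranked:
        selected.append(item)
        running += revenue
        if running / total >= threshold:
            break
    return selected, running / total


# Made-up SKU revenues, for illustration only.
revenue = {"SKU-A": 500, "SKU-B": 300, "SKU-C": 100, "SKU-D": 60, "SKU-E": 40}
top_skus, share = pareto_cutoff(revenue)
print(top_skus, share)  # ['SKU-A', 'SKU-B'] 0.8
```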
View Repository
- Google Data Analytics Professional Certificate
- SQL for Data Analysis
- Power BI & Data Analytics
- Python Fundamentals
- LinkedIn: linkedin.com/in/leonardo-gama-a99648279
- Streamlit: share.streamlit.io/user/leomgama