Café Order ETL Pipeline

This project is a Proof of Concept (PoC) for a local data management system designed to replace a paper-based order record system for a local café chain. The application extracts, transforms, and loads (ETL) order data from CSV files into a local MySQL database while ensuring data quality and removing Personally Identifiable Information (PII).


Project Overview

The café required a simple yet effective solution to digitize their order records on a Microsoft Windows or macOS infrastructure. This ETL pipeline extracts raw order data, cleans and validates the data by removing incorrect or badly formed records, and scrubs sensitive PII such as card numbers and customer names before loading the sanitized data into a MySQL database.
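The clean-and-scrub step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names, the validation rule, and the sample rows are all assumptions, since the real CSV layout is not documented here. It uses only the standard library `csv` module rather than pandas so that it runs without extra dependencies.

```python
import csv
import io

# Hypothetical column names; the real CSV layout is not described in this README.
PII_COLUMNS = ("customer_name", "card_number")
REQUIRED_COLUMNS = ("order_time", "branch", "items", "total", "payment_type")


def extract(csv_text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def is_valid(row):
    """Accept a row only if it has no surplus fields, every required field
    is non-empty, and the total parses as a number."""
    if None in row:  # DictReader stores surplus fields under the key None
        return False
    if any(not (row.get(col) or "").strip() for col in REQUIRED_COLUMNS):
        return False
    try:
        float(row["total"])
    except ValueError:
        return False
    return True


def transform(rows):
    """Drop badly formed rows, then remove the PII columns from the rest."""
    cleaned = []
    for row in rows:
        if not is_valid(row):
            continue
        cleaned.append(
            {k: (v or "").strip() for k, v in row.items() if k not in PII_COLUMNS}
        )
    return cleaned


# Made-up sample data: the second row has a malformed total and is dropped.
SAMPLE = """order_time,branch,customer_name,items,total,payment_type,card_number
25/08/2021 09:00,Chesterfield,Jane Doe,Regular Latte,2.15,CARD,1234567890123456
25/08/2021 09:05,Chesterfield,John Roe,Large Mocha,not-a-number,CASH,
"""

orders = transform(extract(SAMPLE))
print(orders)
```

Dropping the PII columns outright, rather than masking them, keeps the sketch simple; a real pipeline might instead hash or truncate these fields if some linkage between orders is needed.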

Two user interfaces are included:

  • Command-Line Interface (CLI): A menu-driven interface to manually trigger ETL pipeline functions and interact with the data.
  • Windows Graphical User Interface (GUI): A simple GUI menu for non-technical users to perform the same ETL tasks more intuitively.

Features

  • [x] Extract order data from CSV files
  • [x] Transform data by cleaning, validating, and removing PII
  • [x] Load sanitized data into a local MySQL database
  • [ ] CLI and GUI menus to manually control the ETL process
  • [ ] Display and clear screen functionality in the CLI
  • [x] Designed with Docker for easy deployment
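As a rough illustration of the load step, the sketch below inserts sanitized rows with a parameterized query. To stay runnable without a database server it uses Python's built-in `sqlite3` as a stand-in for MySQL; the table name, columns, and `load` helper are hypothetical. With a MySQL driver the same pattern would apply, but placeholders are typically written `%s` instead of `?`.

```python
import sqlite3


def load(conn, orders):
    """Insert sanitized orders with a parameterized query.

    Returns the number of rows inserted. The table layout is an assumption
    made for this sketch, not the project's actual schema.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_time TEXT, branch TEXT, items TEXT, total REAL, payment_type TEXT)"
    )
    rows = [
        (o["order_time"], o["branch"], o["items"], float(o["total"]), o["payment_type"])
        for o in orders
    ]
    # Parameter binding keeps the CSV values out of the SQL text itself.
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# In-memory SQLite database standing in for the local MySQL container.
conn = sqlite3.connect(":memory:")
sample_orders = [
    {
        "order_time": "25/08/2021 09:00",
        "branch": "Chesterfield",
        "items": "Regular Latte",
        "total": "2.15",
        "payment_type": "CARD",
    }
]
inserted = load(conn, sample_orders)
count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(inserted, count)
```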


Technologies Used

  • Python
  • MySQL
  • Docker
  • Flask (for the web-based GUI)
  • Tkinter (for Windows GUI)
  • pandas (for data transformation)

Installation & Setup

  1. Clone the repository:

    git clone https://github.com/GuledIM/local-cafe-etl.git
    cd local-cafe-etl
  2. Set up Docker and MySQL:

    Ensure Docker is installed on your machine. The project includes a docker-compose.yml file to set up a local MySQL database container.

  3. Install Python dependencies:

    pip install -r requirements.txt
  4. Configure database credentials:

    Update config.py or the corresponding environment variables with your local MySQL credentials.
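One way the credentials from step 4 could be read from environment variables is sketched below. The variable names, defaults, and the `load_db_config` helper are assumptions for illustration; check config.py and docker-compose.yml for the names this project actually uses.

```python
import os


def load_db_config(env=None):
    """Build a connection-settings dict from environment variables.

    All variable names and default values here are hypothetical.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "port": int(env.get("MYSQL_PORT", "3306")),
        "user": env.get("MYSQL_USER", "root"),
        "password": env.get("MYSQL_PASSWORD", ""),
        "database": env.get("MYSQL_DATABASE", "cafe"),
    }


# Passing a plain dict makes the function easy to try without touching
# the real environment.
config = load_db_config({"MYSQL_HOST": "127.0.0.1", "MYSQL_PORT": "3307"})
print(config["host"], config["port"], config["database"])
```

Keeping credentials in the environment rather than in a committed file avoids storing passwords in the repository.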


Usage

  • CLI: Run python cli-app.py to launch the command-line interface.
  • GUI: Run python tkinter-app.py to open the local graphical user interface.
  • Web GUI: Run python flask-app.py to start the web-based graphical user interface, then open it in your browser.

In each interface, users can:

  • Trigger Extract, Transform, and Load functions individually
  • View sanitized data summaries
  • Clear the console screen (CLI only)
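A menu-driven CLI of the kind described above might be structured like the sketch below. It is not the contents of cli-app.py: the `run_menu` helper, the menu labels, and the placeholder actions are all hypothetical. Input and output are passed in as functions so the loop can be driven by a script as well as by a person.

```python
import os


def clear_screen():
    """Clear the console on Windows ("cls") or macOS/Linux ("clear")."""
    os.system("cls" if os.name == "nt" else "clear")


def run_menu(actions, read=input, write=print):
    """Show a numbered menu until the user enters 0.

    actions is a list of (label, callable) pairs. Returns the labels of the
    actions that were run, in order.
    """
    history = []
    while True:
        for number, (label, _) in enumerate(actions, start=1):
            write(f"{number}. {label}")
        write("0. Exit")
        choice = read("Select an option: ").strip()
        if choice == "0":
            return history
        try:
            index = int(choice)
        except ValueError:
            index = -1
        if 1 <= index <= len(actions):
            label, action = actions[index - 1]
            action()
            history.append(label)
        else:
            write("Invalid option, please try again.")


# Placeholder actions; a real application would call the ETL functions here.
ACTIONS = [
    ("Extract", lambda: print("extracting...")),
    ("Transform", lambda: print("transforming...")),
    ("Load", lambda: print("loading...")),
    ("Clear screen", clear_screen),
]

# Scripted demonstration: choose Extract, an invalid option, Load, then exit.
scripted = iter(["1", "9", "3", "0"])
ran = run_menu(ACTIONS, read=lambda prompt: next(scripted), write=lambda text: None)
print(ran)
```

For interactive use, calling `run_menu(ACTIONS)` with the default `input` and `print` would be enough.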

Project Deliverables

  • Product Demo: Showcase of all application functions (approx. 5 mins)
  • Client Presentation: Highlighting benefits such as improved data accuracy, PII compliance, and easier record-keeping (approx. 5 mins)
  • Whiteboard Session: Explanation of design choices, architecture, and alternatives considered (approx. 5 mins)

Reflections & Progress

All ETL functions are complete and have been manually tested, along with the CLI application. The next steps were to finish the CLI application and begin the Tkinter and Flask applications. All three menu options are now complete, but the Flask menu still needs debugging for issues with loading data.


Future Improvements


License

This project is licensed under the MIT License.
