This project is a Proof of Concept (PoC) for a local data management system designed to replace a paper-based order record system for a local café chain. The application extracts, transforms, and loads (ETL) order data from CSV files into a local MySQL database while ensuring data quality and removing Personally Identifiable Information (PII).
The café required a simple yet effective solution to digitize their order records on a Microsoft Windows or macOS infrastructure. This ETL pipeline extracts raw order data, cleans and validates the data by removing incorrect or badly formed records, and scrubs sensitive PII such as card numbers and customer names before loading the sanitized data into a MySQL database.
Three user interfaces are included:
- Command-Line Interface (CLI): A menu-driven interface to manually trigger ETL pipeline functions and interact with the data.
- Windows Graphical User Interface (GUI): A simple Tkinter menu that lets non-technical users perform the same ETL tasks more intuitively.
- Web-based GUI: A Flask application that exposes the same ETL tasks in the browser.
- [x] Extract order data from CSV files
- [x] Transform data by cleaning, validating, and removing PII
- [x] Load sanitized data into a local MySQL database
- [ ] CLI and GUI menus to manually control the ETL process
- [ ] Display and clear screen functionality in the CLI
- [x] Designed with Docker for easy deployment
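The extract, transform, and load steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`order_time`, `branch`, `customer_name`, `items`, `total`, `card_number`) and the sample rows are made up, the standard-library `csv` module is used instead of pandas so the sketch has no dependencies, and an in-memory `sqlite3` database stands in for MySQL so it runs without a database server.

```python
import csv
import io
import sqlite3

# Hypothetical column names; the real CSV layout is not given in this README.
PII_COLUMNS = {"customer_name", "card_number"}
REQUIRED_COLUMNS = ("order_time", "branch", "items", "total")


def extract(csv_text):
    """Read raw order rows from CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def is_valid(row):
    """Reject badly formed rows: wrong field count, blanks, or a bad total."""
    # DictReader stores extra fields under a None key and fills missing ones with None.
    if None in row or any(v is None for v in row.values()):
        return False
    if any(not row.get(col, "").strip() for col in REQUIRED_COLUMNS):
        return False
    try:
        return float(row["total"]) >= 0
    except ValueError:
        return False


def transform(rows):
    """Keep only valid rows and drop the PII columns from each of them."""
    return [
        {k: v.strip() for k, v in row.items() if k not in PII_COLUMNS}
        for row in rows
        if is_valid(row)
    ]


def load(rows, conn):
    """Insert sanitized rows; sqlite3 is a stand-in for the MySQL database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_time TEXT, branch TEXT, items TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (:order_time, :branch, :items, :total)", rows
    )
    conn.commit()


# Made-up sample data: one good row, one with a bad total, one with no timestamp.
SAMPLE = """order_time,branch,customer_name,items,total,card_number
2024-01-05 09:15,Leeds,Jane Doe,Latte,2.50,1234567812345678
2024-01-05 09:20,Leeds,John Roe,Flat White,abc,8765432187654321
,Leeds,Ann Poe,Tea,1.80,1111222233334444
"""

clean_rows = transform(extract(SAMPLE))
conn = sqlite3.connect(":memory:")
load(clean_rows, conn)
```

Validation runs before the PII columns are dropped, so a malformed row is rejected as a whole and no sensitive field from it reaches the database.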
- Python
- MySQL
- Docker
- Flask (for web-based GUI)
- Tkinter (for Windows GUI)
- pandas (for data transformation)
- Clone the repository:
  git clone https://github.com/GuledIM/local-cafe-etl.git
  cd local-cafe-etl
- Set up Docker and MySQL: Ensure Docker is installed on your machine. The project includes a docker-compose.yml file to set up a local MySQL database container.
- Install Python dependencies:
  pip install -r requirements.txt
- Configure database credentials: Update config.py or the environment variables with your local MySQL credentials.
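As one possible shape for the credentials step, the sketch below reads connection settings from environment variables and falls back to local defaults. The variable names (`MYSQL_HOST`, `MYSQL_PORT`, `MYSQL_USER`, `MYSQL_PASSWORD`, `MYSQL_DATABASE`), the default values, and the `load_db_config` helper are all assumptions for illustration; check config.py and docker-compose.yml for the names the project actually uses.

```python
import os


def load_db_config(env=os.environ):
    """Build MySQL connection settings from environment variables.

    All variable names and defaults here are hypothetical.
    """
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "port": int(env.get("MYSQL_PORT", "3306")),
        "user": env.get("MYSQL_USER", "root"),
        "password": env.get("MYSQL_PASSWORD", ""),
        "database": env.get("MYSQL_DATABASE", "cafe"),
    }


# Passing a plain dict instead of os.environ makes the helper easy to check.
example = load_db_config({"MYSQL_HOST": "db", "MYSQL_PORT": "3307"})
```

Keeping credentials in the environment rather than in a committed file avoids storing the database password in the repository.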
- CLI: Run python cli-app.py to launch the command-line interface.
- GUI: Run python tkinter-app.py to open the local graphical user interface.
- GUI: Run python flask-app.py to start the web-based graphical user interface, then open it in your browser.
Within each interface, users can:
- Trigger Extract, Transform, and Load functions individually
- View sanitized data summaries
- Clear the console screen (CLI only)
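A menu loop of the kind described above might look like the sketch below. The function names (`clear_screen`, `run_menu`) and the menu wording are hypothetical and not taken from cli-app.py; the `read` and `write` parameters exist only so the loop can be exercised without a real terminal.

```python
import os


def clear_screen():
    """Clear the console: 'cls' on Windows, 'clear' on macOS and Linux."""
    os.system("cls" if os.name == "nt" else "clear")


def run_menu(actions, read=input, write=print):
    """Show numbered options, run the chosen one, and repeat until 0 is entered.

    `actions` maps a menu label to a zero-argument callable.
    """
    labels = list(actions)
    while True:
        for number, label in enumerate(labels, start=1):
            write(f"{number}. {label}")
        write("0. Exit")
        choice = read("Select an option: ").strip()
        if choice == "0":
            return
        if choice.isdecimal() and 1 <= int(choice) <= len(labels):
            actions[labels[int(choice) - 1]]()
        else:
            write("Invalid option, please try again.")
```

In a real CLI the dictionary would map labels such as "Extract", "Transform", "Load", and "Clear screen" to the corresponding pipeline functions and to `clear_screen`, for example `run_menu({"Clear screen": clear_screen})`.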
- Product Demo: Showcase of all application functions (approx. 5 mins)
- Client Presentation: Highlighting benefits such as improved data accuracy, PII compliance, and easier record-keeping (approx. 5 mins)
- Whiteboard Session: Explanation of design choices, architecture, and alternatives considered (approx. 5 mins)
All ETL functions are complete and have been tested manually, as has the CLI application. The next steps were to finish the CLI application and begin the Tkinter and Flask applications. All three menu options are now complete; the Flask menu still needs debugging for issues in loading data.
This project is licensed under the MIT License.