Synthetic Data Generator

A Flask web application that uses CTGAN (Conditional Tabular GAN) to generate synthetic data from uploaded CSV files or manually entered data.

Features

Upload CSV files and train CTGAN models
Manually enter tabular data for model training
Generate synthetic data based on trained models
Web-based interface with Bootstrap styling
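
The train-then-sample flow behind these features can be sketched as follows. This is a minimal illustration, not the code in app.py: `infer_discrete_columns` and `train_and_sample` are hypothetical helpers, and treating every column that contains a non-numeric value as discrete is an assumption. `CTGAN(epochs=...)`, `fit(data, discrete_columns)`, and `sample(n)` come from the ctgan package, which must be installed along with pandas for the second function to run.

```python
import csv
import io


def _is_number(value):
    """Return True if the CSV cell parses as a float."""
    try:
        float(value)
    except (TypeError, ValueError):
        return False
    return True


def infer_discrete_columns(csv_text):
    """Hypothetical helper: names of columns holding any non-numeric value.

    Assumption: integer-coded categories are NOT detected and would need
    to be listed by hand.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return []
    return [
        name for name in rows[0]
        if not all(_is_number(row[name]) for row in rows)
    ]


def train_and_sample(csv_text, n_rows, epochs=10):
    """Fit CTGAN on CSV text and return n_rows synthetic rows."""
    # Third-party imports are kept local so the helper above stays stdlib-only.
    import pandas as pd
    from ctgan import CTGAN

    data = pd.read_csv(io.StringIO(csv_text))
    model = CTGAN(epochs=epochs)
    model.fit(data, infer_discrete_columns(csv_text))
    return model.sample(n_rows)
```

The small `epochs` default only keeps a first experiment short; a realistic run would likely need more epochs.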

Local Development

Prerequisites

Python 3.9+
pip

Setup

Clone the repository:

git clone https://github.com/anshtrivediaiml/Synthetic-Data-Generator.git
cd Synthetic-Data-Generator

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:
```
pip install -r requirements.txt
```

Set environment variables (optional for development):

export SECRET_KEY="your-secret-key-here"
export FLASK_DEBUG="True"

Run the application:
```
python app.py
```
Open http://localhost:5000 in your browser.

Production Deployment

Using Railway

Create a Railway account at https://railway.app
Connect your GitHub repository
Railway will automatically detect the Dockerfile and deploy
Set environment variables in Railway dashboard:
- SECRET_KEY: A secure random string
- FLASK_DEBUG: False

Using Fly.io

Install Fly CLI: https://fly.io/docs/hands-on/install-flyctl/
Login: fly auth login
Launch: fly launch

Set secrets:

fly secrets set SECRET_KEY="your-secret-key"
fly secrets set FLASK_DEBUG="False"

Deploy: fly deploy

Using Heroku

Install Heroku CLI
Create app: heroku create your-app-name
Set buildpack: heroku buildpacks:set heroku/python
Push to Heroku: git push heroku main

Set environment variables:

heroku config:set SECRET_KEY="your-secret-key"
heroku config:set FLASK_DEBUG="False"

Using Docker Locally

docker build -t ctgan-app .
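docker run -p 8000:7860 ctgan-app

This maps host port 8000 to container port 7860, so the app should be reachable at http://localhost:8000.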

Environment Variables

SECRET_KEY: Flask secret key for sessions (required in production)
FLASK_DEBUG: Set to True for development, False for production
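
One way these two variables could be read at startup is sketched below. It assumes, rather than reproduces, the behaviour of app.py: `load_settings` is a hypothetical function, and refusing to start without SECRET_KEY outside debug mode is one reading of "required in production".

```python
import os


def load_settings(env=None):
    """Hypothetical helper: read SECRET_KEY and FLASK_DEBUG from the environment."""
    env = os.environ if env is None else env

    # Treat common truthy spellings as "debug on"; anything else is off.
    debug = env.get("FLASK_DEBUG", "False").strip().lower() in {"1", "true", "yes"}

    secret = env.get("SECRET_KEY")
    if not secret:
        if not debug:
            # Assumption: production must never fall back to a default key.
            raise RuntimeError("SECRET_KEY must be set when FLASK_DEBUG is not true")
        secret = "dev-only-insecure-key"

    return {"SECRET_KEY": secret, "DEBUG": debug}
```

Passing a plain dict as `env` makes the function easy to exercise without touching the real environment.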

Health Check

The application provides a health check endpoint at /health that returns {"status": "healthy"}.
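
A deployment script or monitor could probe this endpoint with a small client like the one below. It is a sketch using only the standard library: `parse_health` and `check_health` are hypothetical names, and the default base URL assumes the local development server on port 5000.

```python
import json
import urllib.request


def parse_health(body):
    """Return True if the response body is the JSON {"status": "healthy"}."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(base_url="http://localhost:5000", timeout=5.0):
    """Request /health and report whether the app says it is healthy."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and parse_health(resp.read())
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses.
        return False
```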

File Structure

ctgan_flask_app/
├── app.py                 # Main Flask application
├── requirements.txt       # Python dependencies
├── Dockerfile            # Docker configuration
├── Procfile              # Heroku deployment config
├── .gitignore           # Git ignore rules
├── README.md            # This file
├── static/              # Static files (CSS, images)
├── templates/           # HTML templates
├── uploads/             # Uploaded files (ignored in git)
└── outputs/             # Generated models and data (ignored in git)

Deployment Considerations

Persistent Storage

The application stores trained models and generated data in local directories (uploads/ and outputs/). For production deployments:

Railway: Supports persistent disks. Configure a volume for /code/uploads and /code/outputs
Fly.io: Supports persistent volumes. Add volumes in fly.toml for data persistence
Heroku: Uses ephemeral storage, so data is lost whenever a dyno restarts. Consider cloud storage such as AWS S3
Docker: Mount host directories as volumes for persistence
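
Mounted volumes are easier to use when the storage locations are configurable. The sketch below shows one possible approach; `UPLOAD_DIR` and `OUTPUT_DIR` are hypothetical environment variables that this README does not define, and `storage_dirs` is an illustrative helper, not part of app.py.

```python
import os
from pathlib import Path


def storage_dirs(env=None, base="."):
    """Resolve and create the upload and output directories.

    UPLOAD_DIR / OUTPUT_DIR are assumed names; without them the function
    falls back to uploads/ and outputs/ under `base`.
    """
    env = os.environ if env is None else env
    uploads = Path(env.get("UPLOAD_DIR") or Path(base) / "uploads")
    outputs = Path(env.get("OUTPUT_DIR") or Path(base) / "outputs")
    for directory in (uploads, outputs):
        directory.mkdir(parents=True, exist_ok=True)
    return uploads, outputs
```

Pointing both variables at a mounted volume would then keep models and generated data across restarts.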

Scaling

For high-traffic deployments, consider:

Using a database instead of file storage
Implementing caching for model loading
Using async processing for model training

Security Notes

File uploads are limited to 16MB
Do not commit sensitive data to public repositories
Use a strong SECRET_KEY in production
Implement proper authentication if needed
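
Two of these notes can be illustrated with the standard library. The sketch below generates a random value suitable for SECRET_KEY and checks a file size against the 16MB limit. The helper names are hypothetical, and reading "16MB" as 16 * 1024 * 1024 bytes is an assumption about how the app enforces the limit.

```python
import secrets

# Assumption: "16MB" means 16 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 16 * 1024 * 1024


def generate_secret_key(n_bytes=32):
    """Return a random hex string from the OS cryptographic source."""
    return secrets.token_hex(n_bytes)


def within_upload_limit(size_bytes):
    """Return True if a file of this size fits under the upload limit."""
    return 0 <= size_bytes <= MAX_UPLOAD_BYTES


if __name__ == "__main__":
    # Print a key to paste into the hosting platform's secret settings.
    print(generate_secret_key())
```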
