Synthetic Data Generator (Python)

This project generates synthetic data from scratch, based on constraints and simple rules, without requiring an original dataset.

I built this because in analytics, market research, and capstone projects, you often need realistic-looking data to prototype analysis, dashboards, or workflows, but can’t use real or proprietary data.

What this tool does

Generates CSV datasets from a YAML specification
Supports numeric ranges, categories, dates, and IDs
Allows rule-based dependencies between fields
Works for any topic (not tied to a specific domain)

The generator doesn’t assume anything about the data’s meaning; it just follows the structure you define.
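
To make the supported field types concrete, here is a minimal sketch of how one base value per column could be produced. It is not the code in synthgen.py: the column keys (`type`, `min`, `max`, `values`, `start`, `end`, `prefix`) and the `make_value` helper are hypothetical names chosen for illustration, and the column descriptions are written as a plain Python dict standing in for a parsed YAML spec.

```python
import random
from datetime import date, timedelta


def make_value(column, row_index, rng):
    """Return one base value for a column description (hypothetical schema)."""
    kind = column["type"]
    if kind == "int":
        # Inclusive integer range
        return rng.randint(column["min"], column["max"])
    if kind == "float":
        return round(rng.uniform(column["min"], column["max"]), 2)
    if kind == "category":
        return rng.choice(column["values"])
    if kind == "date":
        start = date.fromisoformat(column["start"])
        end = date.fromisoformat(column["end"])
        offset = rng.randint(0, (end - start).days)
        return (start + timedelta(days=offset)).isoformat()
    if kind == "id":
        # Sequential, zero-padded ID with an optional prefix
        return f"{column.get('prefix', '')}{row_index + 1:05d}"
    raise ValueError(f"unknown column type: {kind}")


# Assumed column layout; the real spec.yml may use different keys.
columns = {
    "respondent_id": {"type": "id", "prefix": "R"},
    "age": {"type": "int", "min": 18, "max": 75},
    "region": {"type": "category", "values": ["North", "South", "East", "West"]},
    "signup_date": {"type": "date", "start": "2023-01-01", "end": "2023-12-31"},
}

rng = random.Random(42)  # fixed seed so repeated runs give the same rows
rows = [
    {name: make_value(col, i, rng) for name, col in columns.items()}
    for i in range(5)
]
for row in rows:
    print(row)
```

Nothing in `make_value` knows what "age" or "region" means; it only reads the type and bounds, which is the sense in which the generator is not tied to a domain.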

How it works (conceptually)

You describe the dataset shape in a YAML file (columns, bounds, rules)
The generator creates base values within those bounds
Rules override values where conditions are met
The result is written to a CSV file
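
The steps above can be sketched end to end in a few lines. This is an illustrative outline under stated assumptions, not the actual generator: the spec is a Python dict standing in for the parsed YAML file, the rule format (`if` / `then` with `equals` and `set`) is invented for this example, and the CSV is written to an in-memory buffer rather than to the output folder.

```python
import csv
import io
import random

# Hypothetical spec, shaped like what a parsed YAML file might contain.
spec = {
    "rows": 6,
    "columns": {
        "plan": {"values": ["free", "pro"]},
        "monthly_spend": {"min": 0, "max": 100},
    },
    "rules": [
        # If plan is "free", force monthly_spend to 0.
        {"if": {"column": "plan", "equals": "free"},
         "then": {"column": "monthly_spend", "set": 0}},
    ],
}


def base_row(columns, rng):
    """Step 2: draw a base value for every column within its bounds."""
    row = {}
    for name, col in columns.items():
        if "values" in col:
            row[name] = rng.choice(col["values"])
        else:
            row[name] = rng.randint(col["min"], col["max"])
    return row


def apply_rules(row, rules):
    """Step 3: override values where a rule's condition is met."""
    for rule in rules:
        cond, action = rule["if"], rule["then"]
        if row[cond["column"]] == cond["equals"]:
            row[action["column"]] = action["set"]
    return row


def generate(spec, seed=0):
    rng = random.Random(seed)
    return [
        apply_rules(base_row(spec["columns"], rng), spec["rules"])
        for _ in range(spec["rows"])
    ]


def to_csv(rows, fieldnames):
    """Step 4: serialize the rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = generate(spec)
csv_text = to_csv(rows, list(spec["columns"]))
print(csv_text)
```

Because rules run after the base values are drawn, a rule always wins over the random draw for the rows it matches; rows that match no rule keep their base values.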

Why YAML

YAML keeps the data definition readable and easy to change without editing Python code. Most changes happen in the spec, not the generator.

Limitations

Rule-based dependencies (not statistical modeling)
No guarantee of real-world distributions
Optimized for clarity and flexibility, not scale

Example use cases

Capstone projects
Analytics prototyping
Market research simulations
Synthetic datasets for dashboards or demos

How to run

python synthgen.py
