YC Bench: a Live Benchmark for Forecasting Startup Outperformance in Y Combinator Batches

Paper: https://arxiv.org/abs/2604.02378

Forecasting which startups will dominate a YC batch — months before Demo Day.

Overview

YC Bench turns every Y Combinator batch into a rapid-evaluation environment for startup success prediction. Instead of waiting 7–10 years for exits or large funding rounds, it scores each startup with the Pre-Demo Day Score, a short-term proxy metric that combines public traction signals with Google web mentions.
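
To illustrate how a proxy of this kind could combine two signals, the sketch below converts each signal to a percentile rank and takes a weighted average. It is not the scoring code in scripts/scoring/. The function names, the equal default weights, and the rank normalization are all assumptions made for this example.

```python
def percentile_ranks(values):
    """Map each value to a rank in [0, 1]; tied values share their average rank."""
    n = len(values)
    if n == 0:
        return []
    if n == 1:
        return [1.0]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the run of tied values starting at position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_position = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = avg_position / (n - 1)
        i = j + 1
    return ranks


def proxy_score(traction, mentions, w_traction=0.5, w_mentions=0.5):
    """Hypothetical combined score: weighted average of the two rank-normalized signals."""
    t = percentile_ranks(traction)
    m = percentile_ranks(mentions)
    return [w_traction * a + w_mentions * b for a, b in zip(t, m)]


# Toy inputs (made up): one traction value and one mention count per startup.
scores = proxy_score(traction=[10, 0, 5], mentions=[3, 1, 2])
print(scores)
```

Rank normalization keeps a single heavy-tailed signal, such as a viral spike in mentions, from dominating the sum. Whether the paper uses this or another normalization is not stated here.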

This repository contains all data, collection scripts, scoring code, and analysis for the YC W26 batch (196 startups).

Repository Structure

ycbench/
├── yc_w26_startups.csv              # Main list of YC W26 companies
├── yc_mentions.csv                  # Google mentions during the batch
├── yc_mentions_early.csv            # Pre-application Google mentions (baseline)
├── YC_W26_Google_Mentions.ipynb     # Colab notebook for data collection & visualizations
├── yc_google.py                     # Google mentions scraping utilities
├── requirements.txt
├── scripts/
│   ├── scrape/                      # Data collection scripts
│   ├── processing/                  # Data cleaning pipelines
│   └── scoring/                     # Pre-Demo Day Score computation
├── fix_pipeline.sh
├── paper/                           # LaTeX paper
└── figures/                         # Charts from the paper

Key Features

Scripts to collect fresh Google mentions data
Colab notebook for easy data collection and visualization
Pre-computed mentions (during batch + pre-application baseline)
Traction data integration
Baseline model (pre-YC application Google mentions)
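
The baseline idea, ranking startups by their Google mentions before the application deadline, can be sketched in a few lines. The column names `company` and `mentions` and the sample rows are hypothetical, so check the actual header of yc_mentions_early.csv before adapting this.

```python
import csv
import io

# Made-up sample in an assumed two-column layout (not real data from the repo).
SAMPLE = """company,mentions
Alpha,12
Beta,0
Gamma,47
Delta,3
"""


def rank_by_mentions(csv_text, top_k):
    """Return the top_k company names, sorted by mention count in descending order."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: int(r["mentions"]), reverse=True)
    return [r["company"] for r in rows[:top_k]]


top = rank_by_mentions(SAMPLE, top_k=2)
print(top)  # ['Gamma', 'Alpha']
```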

Quick Start

1. Clone the repository

git clone https://github.com/benstaf/ycbench.git
cd ycbench
pip install -r requirements.txt

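2. Explore with Colab (Recommended)

Open YC_W26_Google_Mentions.ipynb in Google Colab to collect mentions data and generate the visualizations.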

Results — YC W26 Batch

A simple baseline using Google mentions before the YC application deadline achieved:

Precision@20: 70%
Recall@11: 55%
Lift over random: 7×
Forecasting horizon: ~5 months

Full details are available in the paper.
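
For readers who want to score their own rankings, here is a minimal sketch of the three metrics reported above. It assumes lift is defined as precision divided by the base rate of positives. Under that assumption, 70% precision with 7× lift implies a base rate of about 10% (0.70 / 7 = 0.10). The paper defines which startups count as positives; the data below is a toy example.

```python
def precision_at_k(ranked, positives, k):
    """Fraction of the top-k ranked items that are true positives."""
    hits = sum(1 for item in ranked[:k] if item in positives)
    return hits / k


def recall_at_k(ranked, positives, k):
    """Fraction of all true positives that appear in the top-k ranked items."""
    hits = sum(1 for item in ranked[:k] if item in positives)
    return hits / len(positives)


def lift_over_random(precision, n_positives, n_total):
    """Precision relative to the base rate a random ranking would achieve (assumed definition)."""
    return precision / (n_positives / n_total)


# Toy ranking of 10 startups, 3 of which are labeled as outperformers.
ranked = list("ABCDEFGHIJ")
positives = {"A", "C", "J"}

p = precision_at_k(ranked, positives, k=4)
r = recall_at_k(ranked, positives, k=4)
lift = lift_over_random(p, n_positives=len(positives), n_total=len(ranked))
print(p, round(r, 3), round(lift, 3))
```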

Paper

Title: YC Bench: a Live Benchmark for Forecasting Startup Outperformance in Y Combinator Batches
Author: Mostapha Benhenda

📄 https://arxiv.org/abs/2604.02378

Citation

@misc{benhenda2026ycbench,
  title={YC Bench: A Live Benchmark for Forecasting Startup Outperformance in Y Combinator Batches},
  author={Mostapha Benhenda},
  year={2026},
  url={https://arxiv.org/abs/2604.02378}
}

Roadmap

Support for future batches (S26, W27, ...)
Learn optimal signal weights from historical data
Expand traction dataset
Public leaderboard for community models

Contributing

Contributions are welcome, especially:

Improved scraping methods
New predictive signals
Better scoring logic
Support for upcoming batches

Feel free to open issues or submit pull requests.

Built to make startup forecasting faster and more rigorous.
Star the repo if you're working on this problem! 🚀
