A free, automated weekly scraper that collects Indian startup funding data from public sources, filters for EdTech deals, and cross-references investors against the SEBI AIF registry to surface contact details — all saved to public CSVs. No paid subscriptions needed.

| Column | Description |
|---|---|
| `startup_name` | Name of the startup |
| `domain` | Sector / industry (e.g. EdTech, FinTech, HealthTech) |
| `round` | Funding stage (Seed, Series A, Series B, Angel …) |
| `amount_usd` | Ticket size in USD (null if undisclosed) |
| `investor` | Investor name (one row per investor per deal) |
| `headquarters` | City / state where the startup is based |
| `date` | Date of funding announcement |
| `scraped_at` | Date this row was collected |

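
The "one row per investor per deal" layout can be sketched as below. This is only an illustration: `deal_to_rows`, the `investors` input field, and the comma-separated investor string are assumptions, and the sample deal is made up.

```python
def deal_to_rows(deal: dict, scraped_at: str) -> list[dict]:
    """Expand one funding deal into one row per investor."""
    # Assumption: the source lists a deal's investors as a comma-separated string.
    investors = [name.strip() for name in deal["investors"].split(",") if name.strip()]
    return [
        {
            "startup_name": deal["startup_name"],
            "domain": deal["domain"],
            "round": deal["round"],
            "amount_usd": deal.get("amount_usd"),  # None when undisclosed
            "investor": investor,
            "headquarters": deal["headquarters"],
            "date": deal["date"],
            "scraped_at": scraped_at,
        }
        for investor in investors
    ]


# Made-up deal, for illustration only (not real funding data).
example_deal = {
    "startup_name": "ExampleLearn",
    "domain": "EdTech",
    "round": "Seed",
    "amount_usd": None,
    "investors": "Fund A, Fund B",
    "headquarters": "Bengaluru",
    "date": "2026-01-02",
}
rows = deal_to_rows(example_deal, scraped_at="2026-01-05")
```

Here `rows` holds two entries that differ only in `investor`, which keeps per-investor filtering and joins simple at the cost of repeating the deal fields.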

| Column | Description |
|---|---|
| `investor` | Investor name |
| `investor_email` | Contact email from SEBI AIF registry (if matched) |
| `sebi_reg_no` | SEBI AIF registration number (if matched) |
| `aif_category` | AIF category (Category I / II / III) |
| `investor_city` | City from SEBI registry |
| `startup_name` | Name of the EdTech startup funded |
| `domain` | EdTech sub-sector |
| `round` | Funding stage |
| `amount_usd` | Ticket size in USD |
| `headquarters` | Startup headquarters |
| `date` | Date of funding announcement |

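
Only EdTech deals reach this table. A minimal sketch of a keyword filter follows; the keyword list is hypothetical, and matching on the `domain` column is an assumption, since the project's real keywords and matched fields are not listed here.

```python
# Hypothetical keywords; the scraper's actual list may differ.
EDTECH_KEYWORDS = ("edtech", "education", "e-learning", "upskilling", "test prep")


def is_edtech(domain: str) -> bool:
    """Return True if the text contains any EdTech keyword (case-insensitive)."""
    text = domain.lower()
    return any(keyword in text for keyword in EDTECH_KEYWORDS)


def filter_edtech(rows: list[dict]) -> list[dict]:
    """Keep only rows whose `domain` value looks like EdTech."""
    return [row for row in rows if is_edtech(row.get("domain") or "")]
```

Substring matching is deliberately loose, so a value such as "EdTech / K-12" still passes. A stricter filter could compare whole tokens instead.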
| Source | URL | Method |
|---|---|---|
| StartupTalky 2026 | link | HTML table |
| StartupTalky 2025 | link | HTML table |
| StartupTalky 2024 | link | HTML table |
| SEBI AIF Registry | link | HTML scrape |
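
For the "HTML table" method, a rough sketch using only the standard library is shown below. It is not necessarily how `scraper/run.py` does it; `TableParser` and `parse_table` are illustrative names, and the sample markup is made up.

```python
from html.parser import HTMLParser
from typing import Optional


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[list[str]] = []
        self._row: Optional[list[str]] = None
        self._cell: Optional[list[str]] = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (e.g. links) is still part of the open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html: str) -> list[dict]:
    """Treat the first row as the header and return the rest as dicts."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up markup, for illustration only.
sample_html = (
    "<table><tr><th>Startup</th><th>Round</th></tr>"
    "<tr><td><a href='#'>ExampleLearn</a></td><td>Seed</td></tr></table>"
)
parsed = parse_table(sample_html)
```

A real page may contain several tables or merged cells, so a production scraper would likely need to pick the right table and handle irregular rows.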
- Every Monday at 6:00 AM IST, a GitHub Actions workflow runs `scraper/run.py`
- StartupTalky tables (2024, 2025, 2026) are scraped and parsed into structured rows
- Rows matching EdTech keywords are filtered into a separate dataset
- The SEBI AIF registry is scraped to collect fund names, registration numbers, categories, contact emails, and cities
- EdTech investors are fuzzy-matched against SEBI fund names (≥ 50% token overlap) to enrich with contact details
- All CSVs are deduplicated and committed back to the repo
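
The fuzzy-matching step could look roughly like this. The exact definition of "token overlap" is not spelled out above, so this sketch assumes shared tokens divided by the size of the smaller token set; `fund_name` and the function names are hypothetical.

```python
import re
from typing import Optional

THRESHOLD = 0.5  # the "≥ 50% token overlap" rule


def tokens(name: str) -> set[str]:
    """Lowercase a name and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def overlap(a: str, b: str) -> float:
    """Shared tokens divided by the smaller token set (an assumed definition)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / min(len(ta), len(tb))


def best_match(investor: str, funds: list[dict]) -> Optional[dict]:
    """Return the registry entry with the highest overlap, or None below THRESHOLD."""
    best, best_score = None, 0.0
    for fund in funds:
        score = overlap(investor, fund["fund_name"])
        if score > best_score:
            best, best_score = fund, score
    return best if best_score >= THRESHOLD else None


# Made-up registry entries, for illustration only.
registry = [
    {"fund_name": "Example Ventures Fund I", "investor_city": "Mumbai"},
    {"fund_name": "Sample Growth Trust", "investor_city": "Pune"},
]
match = best_match("Example Ventures", registry)
```

A 50% threshold is permissive: under this definition "Alpha Capital" and "Beta Capital Fund" score exactly 0.5 because they share the generic word "capital". Dropping such generic words before comparing is one way to reduce false matches.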

| File | Description |
|---|---|
| `data/india_funding_2024.csv` | All startup funding deals scraped for 2024 |
| `data/india_funding_2025.csv` | All startup funding deals scraped for 2025 |
| `data/india_funding_2026.csv` | All startup funding deals scraped for 2026 |
| `data/edtech_investors.csv` | EdTech-only investors enriched with SEBI AIF contact data |

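
Since the same pages are re-scraped every week, these files are deduplicated before being committed. A minimal sketch of that step is below; the choice of key columns is an assumption, not the project's documented rule.

```python
# Assumption: these columns together identify one investor-deal row.
KEY_COLUMNS = ("startup_name", "round", "investor", "date")


def dedupe(rows: list[dict]) -> list[dict]:
    """Keep the first occurrence of each key, preserving the original order."""
    seen: set[tuple] = set()
    unique = []
    for row in rows:
        # Normalise case and whitespace so trivial differences do not create duplicates.
        key = tuple((row.get(col) or "").strip().lower() for col in KEY_COLUMNS)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Leaving `scraped_at` out of the key means a deal seen again in a later run keeps its first collection date.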
Go to the Actions tab and click Enable workflows
Go to Actions → Weekly India Funding Scraper → Run workflow
No API keys or secrets required — all data sources are publicly accessible.
git clone https://github.com/YOUR_USERNAME/InvestorFinder
cd InvestorFinder
pip install -r requirements.txt
python scraper/run.py

| Component | Cost |
|---|---|
| GitHub Actions | Free (2000 min/month) |
| Everything else | ₹0 |
The latest data is always available in the data/ folder. Direct raw links:
https://raw.githubusercontent.com/YOUR_USERNAME/InvestorFinder/main/data/india_funding_2026.csv
https://raw.githubusercontent.com/YOUR_USERNAME/InvestorFinder/main/data/edtech_investors.csv
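
One way to read these files from Python with only the standard library is sketched below. It assumes undisclosed amounts appear as empty (or non-numeric) cells in the CSV; `parse_csv` and `fetch_rows` are illustrative names, and `YOUR_USERNAME` still needs to be replaced.

```python
import csv
import io
from urllib.request import urlopen

RAW_URL = (
    "https://raw.githubusercontent.com/YOUR_USERNAME/InvestorFinder"
    "/main/data/edtech_investors.csv"
)


def parse_csv(text: str) -> list[dict]:
    """Parse CSV text into dicts, converting amount_usd to float or None."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        raw = (row.get("amount_usd") or "").strip()
        try:
            row["amount_usd"] = float(raw) if raw else None
        except ValueError:
            # Assumption: any non-numeric marker means "undisclosed".
            row["amount_usd"] = None
    return rows


def fetch_rows(url: str = RAW_URL) -> list[dict]:
    """Download a raw CSV and parse it (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return parse_csv(response.read().decode("utf-8"))
```

Calling `fetch_rows()` returns one dict per row, keyed by the column names in the schema tables.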
MIT — free to use, fork, and build on.