Kerala Assembly Election 2026 Predictor

An end-to-end machine learning pipeline that forecasts the results in all 140 constituencies of the 2026 Kerala Legislative Assembly election.

Overview

This project simulates and predicts the electoral outcomes by fusing historical results, parliamentary momentum, local body trends, demographic data, and regional political issues.

The pipeline consists of two main components:

create_dataset.py: A heuristic engine that synthesizes a comprehensive 43-feature dataset for all 140 constituencies. It combines baseline 2021 results with 2024 Lok Sabha momentum, 2025 Local Body trends, demographic makeup, and constituency-specific issue impacts to generate projected vote shares.
train.py: A robust Neural Network pipeline that trains on these features to predict the winning alliance (LDF, UDF, NDA, OTHERS) and exact vote shares.

How We Predict the Election

Predicting elections with data is challenging, especially in Kerala. There are only 140 constituencies, which makes for a very small dataset, and the two dominant alliances (LDF and UDF) win almost every seat. That leaves the AI very few examples from which to learn how a third front like the NDA, or an independent candidate, might win.

To tackle these challenges, our approach uses three strategies:

1. The "Wisdom of Crowds" (Ensemble Learning)

Because our dataset is so small (only 140 rows), training just one AI model is risky: it might simply memorize the data instead of learning real patterns. To reduce that risk, we train 15 separate models, each on a different randomized slice of the state's constituencies. When predicting the final results, we ask all 15 models to vote on the outcome. Averaging their predictions gives a more stable forecast than any single model would produce on its own.
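The subsample-and-average idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in train.py: the function names, the 80% subset fraction, and the use of random subsets without replacement are all assumptions made for the example.

```python
import random

ALLIANCES = ["LDF", "UDF", "NDA", "OTHERS"]


def sample_training_rows(n_rows, n_models, fraction=0.8, seed=0):
    """Return one random subset of row indices per ensemble member.

    The 0.8 fraction is an assumed value chosen for illustration.
    """
    rng = random.Random(seed)
    subset_size = max(1, int(n_rows * fraction))
    return [sorted(rng.sample(range(n_rows), subset_size)) for _ in range(n_models)]


def average_probabilities(per_model_probs):
    """Average the class probabilities that each model gives for one constituency."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [sum(p[c] for p in per_model_probs) / n_models for c in range(n_classes)]


def ensemble_winner(per_model_probs):
    """Pick the alliance with the highest averaged probability."""
    mean_probs = average_probabilities(per_model_probs)
    best = max(range(len(mean_probs)), key=mean_probs.__getitem__)
    return ALLIANCES[best], mean_probs


# Each of the 15 models would be trained on its own subset of the 140 rows.
subsets = sample_training_rows(n_rows=140, n_models=15)

# Toy probabilities from three hypothetical models for a single constituency.
winner, probs = ensemble_winner([
    [0.6, 0.3, 0.1, 0.0],
    [0.4, 0.5, 0.1, 0.0],
    [0.7, 0.2, 0.1, 0.0],
])
```

One model in the toy input prefers UDF, but the averaged probabilities still favor LDF, which is the stabilizing effect the ensemble is meant to provide.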

2. Predicting the Score, Not Just the Winner

Historically, the NDA rarely wins seats in Kerala. If we only ask the AI to predict "Who wins?", it will almost never see enough examples to effectively learn what an NDA victory looks like.

Instead, we ask the AI to do two things at once:

Predict the winning party.
Predict the exact vote share percentage for every party.

Because every party gets some vote share in every constituency, the AI constantly learns what makes a party perform well, even in places where they ultimately lose. By learning how to calculate vote shares, the model organically figures out how traditional strongholds might tip toward a third party in extremely close races.
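One common way to train on both tasks at once is to add a classification loss for the winner to a regression loss for the vote shares. The sketch below shows that combination for a single constituency in plain Python. The function names, the choice of cross-entropy plus mean squared error, and the share_weight parameter are assumptions for illustration; train.py may combine the two objectives differently.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_loss(winner_logits, true_winner, predicted_shares, true_shares, share_weight=1.0):
    """Cross-entropy on the winner plus weighted mean squared error on vote shares.

    share_weight is a hypothetical knob that balances the two tasks.
    """
    probs = softmax(winner_logits)
    classification_loss = -math.log(probs[true_winner])
    regression_loss = sum(
        (p - t) ** 2 for p, t in zip(predicted_shares, true_shares)
    ) / len(true_shares)
    return classification_loss + share_weight * regression_loss


# Toy example: class order is LDF, UDF, NDA, OTHERS; shares are fractions of the vote.
true_shares = [0.45, 0.40, 0.12, 0.03]
good = joint_loss([2.0, 1.0, 0.0, -1.0], 0, [0.45, 0.40, 0.12, 0.03], true_shares)
bad = joint_loss([2.0, 1.0, 0.0, -1.0], 0, [0.30, 0.30, 0.30, 0.10], true_shares)
```

Even when the predicted winner is correct, the loss is higher when the vote shares are off, so the model still receives a learning signal about parties that lost the seat.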

3. Paying Extra Attention to Rare Events

If left to its own devices, an AI will naturally ignore rare events (like an independent candidate winning a seat) to focus on the big, common patterns. During training, we use specialized math techniques that force the AI to pay extra attention to these incredibly rare scenarios, keeping it from taking the easy way out and predicting LDF/UDF every single time.
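This README does not name the specific techniques used. One widely used option is inverse-frequency class weighting, where mistakes on rare classes count more in the loss. The sketch below illustrates that idea only; the function name and the toy label counts are made up for the example and do not reflect the project's data or its actual method.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each class a weight proportional to 1 / (its frequency).

    Uses the common n / (k * count) scaling, so the weights average
    to 1 across all rows.
    """
    counts = Counter(labels)
    n_rows = len(labels)
    n_classes = len(counts)
    return {label: n_rows / (n_classes * count) for label, count in counts.items()}


# Made-up toy labels: the rare class ends up with the largest weight.
toy_labels = ["LDF"] * 6 + ["UDF"] * 3 + ["NDA"] * 1
weights = inverse_frequency_weights(toy_labels)
```

During training, each row's loss would be multiplied by the weight of its true class, so an error on a rare winner costs more than an error on a common one.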

A Word on How the Model Thinks

It's important to understand what this AI is actually doing behind the scenes.

Usually, to build a "true" predictive AI, you feed it historical data (like 2011 election factors) and ask it to predict the 2016 outcome. Once it learns those rules against hard historical truth, you use it to predict the future.

However, because we don't have perfectly paired historical data stretching back decades, we had to be creative. Our dataset builder (create_dataset.py) acts as a human logic engine: it takes the most recent available data (2021 results, 2024 parliamentary momentum, etc.) and uses documented political science formulas to estimate a "projected truth."
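To make the idea of a "projected truth" concrete, here is a deliberately simplified sketch of a rule that blends several signals into projected vote shares. It is not the formula used in create_dataset.py: the function name, the linear blend, the weights, and the input numbers are all hypothetical, and the real script also factors in demographics and constituency-specific issues.

```python
def project_vote_shares(baseline_2021, lok_sabha_2024, local_body_2025,
                        weights=(0.5, 0.3, 0.2)):
    """Blend three vote-share signals and rescale so the shares sum to 100.

    The weights are illustrative placeholders, not the project's values.
    """
    w_base, w_ls, w_local = weights
    blended = {
        alliance: (w_base * baseline_2021[alliance]
                   + w_ls * lok_sabha_2024[alliance]
                   + w_local * local_body_2025[alliance])
        for alliance in baseline_2021
    }
    total = sum(blended.values())
    return {alliance: 100.0 * share / total for alliance, share in blended.items()}


# Made-up percentages for a single hypothetical constituency.
projected = project_vote_shares(
    baseline_2021={"LDF": 46.0, "UDF": 40.0, "NDA": 11.0, "OTHERS": 3.0},
    lok_sabha_2024={"LDF": 38.0, "UDF": 44.0, "NDA": 16.0, "OTHERS": 2.0},
    local_body_2025={"LDF": 41.0, "UDF": 42.0, "NDA": 14.0, "OTHERS": 3.0},
)
projected_winner = max(projected, key=projected.get)
```

Whatever the exact rule, the point in the text holds: the neural network is trained against targets that a rule like this produces, so it learns to reproduce the rule rather than observed 2026 outcomes.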

Our neural network then trains on this data. What it is actually doing is learning to mimic that human political logic, smoothing out the hard-coded math and finding relationships between demographics, geography, and political momentum. It acts as a digital strategist applying that logic consistently statewide, not as an independent crystal ball.

Usage

Generate the dataset:

python create_dataset.py

Train the ensemble and output predictions:

python train.py

The final output is saved to predictions_2026.csv.

