Carbon Transition Risk and Asset Stranding in Agricultural Enterprises

Which agricultural enterprises are most exposed to carbon transition risk, and which would become stranded assets if carbon were priced? This project answers both questions with a modular Python pipeline that constructs emissions proxies from enterprise-level financial and climate data, prices them under three NGFS-inspired carbon-price scenarios, and flags enterprises whose projected carbon costs exceed projected revenues. It was built as a team research project in the CentraleSupélec × ESSEC double-degree curriculum.

Approach

Data. The Agriculture Financial Risk Dataset (Kaggle): 4,981 enterprises × 17 variables — financials (revenue, expenses, loan amount, debt-to-equity), climate exposure (average temperature, rainfall, drought index, flood risk score), and structure (region, enterprise size, quarter).

Pipeline (scripts/run_pipeline.py runs every step end-to-end):

Feature engineering — financial ratios (profit margin, cost ratio, debt ratio) and a composite climate stress index: 0.4·drought + 0.4·flood + 0.2·z(temperature).
Preprocessing — one-hot encoding of categoricals, standardization of numerics.
Emissions proxies — emissions are not observed in the data, so four candidate proxies are built as weighted z-score composites of activity scale (log expenses or log revenue), input-cost intensity, climate stress, and leverage. Their stability is compared via mean absolute deviation and top-decile overlap before fixing a baseline.
Models — Ridge regressions (5-fold CV alpha tuning over a log grid, 80/20 holdout) predicting net profit and the baseline emissions proxy.
Scenario pricing — carbon cost indices under three carbon prices benchmarked on NGFS Phase 3 (REMIND) scenarios, in USD 2010: Delayed Transition ($10/t), Net Zero 2050 ($110/t), Divergent Net Zero ($300/t). See images/ngfs_bar_plot.png for the benchmark levels.
Projection & stranding — 5-year revenue paths (2% growth, NPV at 5% discount), per-scenario future profit and carbon-risk indices, and a stranded flag where projected carbon cost exceeds projected revenue under the severest scenario (Divergent Net Zero).
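
The fully specified pieces of the steps above (the climate stress index, the discounted revenue path, and the stranded flag) can be sketched as follows. This is a minimal illustration, not the code in src/: the function names are hypothetical, the z-score is assumed to use the sample mean and population standard deviation, and the NPV is assumed to discount years 1 through 5 of revenue grown from its current level.

```python
import numpy as np


def climate_stress_index(drought, flood, temperature):
    """Composite index: 0.4*drought + 0.4*flood + 0.2*z(temperature)."""
    drought = np.asarray(drought, dtype=float)
    flood = np.asarray(flood, dtype=float)
    temperature = np.asarray(temperature, dtype=float)
    # Assumption: z-score taken across the sample, population std (ddof=0).
    z_temp = (temperature - temperature.mean()) / temperature.std()
    return 0.4 * drought + 0.4 * flood + 0.2 * z_temp


def revenue_npv(revenue, growth=0.02, discount=0.05, years=5):
    """NPV of a revenue path grown at `growth` and discounted at `discount`."""
    revenue = np.asarray(revenue, dtype=float)
    t = np.arange(1, years + 1)
    # Assumption: year-t revenue is revenue*(1+growth)**t,
    # discounted by (1+discount)**t, summed over t = 1..years.
    factors = ((1 + growth) / (1 + discount)) ** t
    return revenue * factors.sum()


def stranded_flag(projected_carbon_cost, projected_revenue):
    """True where projected carbon cost exceeds projected revenue."""
    return np.asarray(projected_carbon_cost) > np.asarray(projected_revenue)


# Toy inputs, not values from the dataset.
csi = climate_stress_index([0.0, 1.0], [0.0, 1.0], [10.0, 20.0])
npv = revenue_npv([100.0, 200.0])
flags = stranded_flag([500.0, 500.0], npv)
print(csi, npv, flags)
```

The flag here compares whatever cost and revenue quantities it is given. In the pipeline these are index values rather than dollar amounts, so the comparison inherits the units of its inputs.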

Scope assumptions (stated up front given data limits): the dataset is enterprise-level, not investor-level; emissions are proxied, not observed; carbon prices are scenario-based, not forecast. The goal is a climate-financial risk pipeline, not precise emissions accounting.
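
Because emissions are proxied rather than observed, the comparison between proxy variants carries much of the analysis. A minimal sketch of a weighted z-score composite and the two stability metrics follows. The weights, component names, and function names are illustrative assumptions, not the values in src/proxies.py, and top-decile overlap is assumed to mean the share of one proxy's top 10% that also appears in the other proxy's top 10%.

```python
import numpy as np


def zscore(x):
    """Standardize an array using its own mean and population std."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def composite_proxy(components, weights):
    """Weighted sum of z-scored components (both dicts keyed by name)."""
    return sum(weights[name] * zscore(values) for name, values in components.items())


def mean_abs_deviation(a, b):
    """Mean absolute difference between two proxies over the same enterprises."""
    return float(np.mean(np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))))


def top_decile_overlap(a, b):
    """Share of a's top-10% enterprises that are also in b's top 10%."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    k = max(1, len(a) // 10)  # size of the top decile
    top_a = set(np.argsort(a)[-k:].tolist())
    top_b = set(np.argsort(b)[-k:].tolist())
    return len(top_a & top_b) / k


# Synthetic inputs for illustration only.
rng = np.random.default_rng(0)
n = 500
shared = {
    "input_cost_intensity": rng.normal(size=n),
    "climate_stress": rng.normal(size=n),
    "leverage": rng.normal(size=n),
}
# Hypothetical weights; the scale component is the only one that changes.
weights = {"scale": 0.4, "input_cost_intensity": 0.2, "climate_stress": 0.2, "leverage": 0.2}

proxy_expenses = composite_proxy({"scale": rng.normal(11.0, 1.0, n), **shared}, weights)
proxy_revenue = composite_proxy({"scale": rng.normal(11.5, 1.0, n), **shared}, weights)

print(mean_abs_deviation(proxy_expenses, proxy_revenue))
print(top_decile_overlap(proxy_expenses, proxy_revenue))
```

Swapping only the scale component while holding the other inputs fixed is one way to isolate how much the choice of scale base moves the top-decile membership.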

Contents

Path	What it is
`scripts/run_pipeline.py`	Single entry point — runs the full pipeline
`scripts/eda_report.py`	EDA entry point — figures, describe table, summary
`src/config.py`	All paths, scenario prices, and tunable defaults
`src/features.py`	Financial ratios + climate stress index
`src/preprocessing.py`	Encoding and scaling
`src/proxies.py`	Emissions proxy variants + stability metrics
`src/models.py`	Ridge training with CV alpha tuning
`src/carbon.py`, `src/outputs.py`	Scenario carbon costs, revenue projection, stranding analysis
`src/eda.py`, `src/io.py`, `src/reporting.py`	Plots, I/O, markdown/CSV writers
`data/AgriRiskFin_Dataset.csv`	Raw Kaggle dataset
`data/data_cleaned.csv`	Preprocessed snapshot
`data/final_results_with_stranding_analysis.csv`	Full per-enterprise output (66 columns)
`outputs/`	Figures, tables, and reports from the last run

Results

Proxy choice matters more than proxy weights. Proxies sharing a scale base agree strongly (expenses-based v1/v3: 70% top-decile overlap; revenue-based v2/v4: 71%), while proxies across bases agree weakly (23–28%). The scale variable, not the weighting scheme, drives who lands in the top risk decile. Full table: outputs/tables/proxy_stability.md.
Model metrics confirm the dataset's deterministic structure. Both Ridge models reach R² ≈ 1.0 (RMSE ≈ 1e-6): net profit is an accounting identity of revenue and expenses in this synthetic dataset, and the proxy is a linear composite of available features. The models serve as the pipeline's prediction stage, not as evidence of out-of-sample power.
Stranding under the severest scenario. At $300/t (Divergent Net Zero), every enterprise in the sample is flagged as stranded, so the flag does not discriminate between enterprises at this price. Because the risk indices are built from z-scores rather than unit-consistent dollar amounts, the flag is best read as an indicator of scenario severity rather than a calibrated default probability, a limitation the index framing makes explicit.
Per-enterprise scenario scores (carbon cost, adjusted profit, carbon risk, future indices) are exported to data/final_results_with_stranding_analysis.csv.
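
The near-perfect fit reported above is what one would expect when the target is an exact linear function of the features. A small sketch on synthetic data (not the project's dataset) illustrates this with the same ingredients: an 80/20 holdout, standardized inputs, and a Ridge model whose alpha is tuned by 5-fold CV over a log grid. The grid bounds, the seed, and the synthetic distributions are assumptions chosen for illustration, and the use of scikit-learn's RidgeCV is a guess at a convenient equivalent of the tuning in src/models.py.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic enterprises: net profit is an exact accounting identity.
rng = np.random.default_rng(0)
n = 1000
revenue = rng.lognormal(mean=12.0, sigma=0.5, size=n)
expenses = revenue * rng.uniform(0.5, 1.1, size=n)
net_profit = revenue - expenses

X = np.column_stack([revenue, expenses])
X_train, X_test, y_train, y_test = train_test_split(
    X, net_profit, test_size=0.2, random_state=0
)

# Standardize features, then pick alpha by 5-fold CV over a log-spaced grid.
model = make_pipeline(
    StandardScaler(),
    RidgeCV(alphas=np.logspace(-6, 3, 10), cv=5),
)
model.fit(X_train, y_train)

pred = model.predict(X_test)
r2 = r2_score(y_test, pred)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
print(f"alpha={model[-1].alpha_:g}  R2={r2:.6f}  RMSE={rmse:.4g}")
```

With an identity as the target, cross-validation has no reason to prefer shrinkage, so a high holdout R² here says little about predictive power on data where profit is not determined by the inputs.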

Running

Python 3.11. From the repo root:

pip install pandas numpy scikit-learn matplotlib tabulate

python scripts/eda_report.py    # EDA: histograms, correlation heatmap, describe table
python scripts/run_pipeline.py  # full pipeline: proxies, models, scenarios, stranding

Outputs are written to outputs/figures/, outputs/tables/, outputs/reports/, and data/. All defaults (random seed, CV splits, horizon, growth/discount rates, scenario prices) live in src/config.py.

Team

Oscar Caudreliez, with Daniil, Oulaya, Ulysse, and Yuhan — team research project at CentraleSupélec / ESSEC.
