Commit 2acd049

Merge pull request #34 from pyronear/arthur/leaderboard-add-temporal-models
feat(leaderboard): add mtb-change-detection and pyro-detector-baseline
2 parents 01b8497 + 1724850 commit 2acd049

12 files changed

Lines changed: 445 additions & 23 deletions

File tree

experiments/temporal-models/README.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ Experiments exploring temporal/video-based approaches to reduce false positives

 ## 🏆 Leaderboard

-Current rankings on the test set (298 sequences: 149 wildfire + 149 false positive):
+Current rankings on the [pyro-dataset](https://github.com/pyronear/pyro-dataset) **v2.2.0** test set (298 sequences: 149 wildfire + 149 false positive):

 | Rank | Model | Precision | Recall | F1 | FPR | Mean TTD | Median TTD |
 |------|-------|-----------|--------|----|-----|----------|------------|

experiments/temporal-models/temporal-model-leaderboard/README.md

Lines changed: 13 additions & 9 deletions
@@ -1,43 +1,47 @@
-# Temporal Model Leaderboard
+# 🏆 Temporal Model Leaderboard

-Standardized evaluation and ranking of `TemporalModel` implementations on the [pyro-dataset](https://github.com/pyronear/pyro-dataset) sequential test set.
+Standardized evaluation and ranking of `TemporalModel` implementations on the [pyro-dataset](https://github.com/pyronear/pyro-dataset) **v2.2.0** sequential test set.

-## Leaderboard
+## 📊 Leaderboard

 | Rank | Model | Precision | Recall | F1 | FPR | Mean TTD (s) | Median TTD (s) |
 |------|-------|-----------|--------|----|-----|--------------|----------------|
 | 1 | [FSM Tracking Baseline](../tracking-fsm-baseline/) | 0.9474 | 0.9664 | 0.9568 | 0.0537 | 142.0 | 58.0 |
+| 2 | [Pyro-Detector Baseline](../pyro-detector-baseline/) | 0.8563 | 0.9597 | 0.9051 | 0.1611 | 27.0 | 7.0 |
+| 3 | [MTB Change Detection](../mtb-change-detection/) | 0.7165 | 0.9329 | 0.8105 | 0.3691 | 85.4 | 25.0 |

-*Evaluated on 298 sequences (149 wildfire + 149 false positive). Last updated: 2026-03-31.*
+*Evaluated on 298 sequences (149 wildfire + 149 false positive). Last updated: 2026-04-02.*

-## Models
+## 🤖 Models

 | Model | Description | Paper |
 |-------|-------------|-------|
 | [FSM Tracking Baseline](../tracking-fsm-baseline/) | YOLO11s detector + IoU-based FSM tracker. Requires temporal persistence (5 consecutive frames) before raising an alarm. Rule-based, no ML training. | [FLAME (Gragnaniello et al., 2024)](https://doi.org/10.1007/s00521-024-10963-z) |
+| [Pyro-Detector Baseline](../pyro-detector-baseline/) | Production pyro-predictor: YOLO ONNX + per-camera sliding-window temporal smoothing. Alarm when aggregated confidence crosses threshold over N consecutive frames. | -- |
+| [MTB Change Detection](../mtb-change-detection/) | YOLO11s + pixel-wise frame differencing (MTB ratio) to reject static FPs, followed by IoU-based FSM tracker. | [SlowFastMTB (Choi, Kim & Oh, 2022)](https://doi.org/10.3390/s22155602) |

-## Metrics
+## 📏 Metrics

 - **Precision, Recall, F1** -- sequence-level binary classification (smoke vs. no smoke)
 - **FPR** -- false positive rate
 - **Mean / Median TTD** -- time-to-detection in seconds for true positives (time from first frame to trigger frame)

-## Data
+## 📦 Data

 Test set imported via DVC from [pyro-dataset](https://github.com/pyronear/pyro-dataset):
 - 149 wildfire (positive) + 149 false positive (negative) sequences
 - Ground truth determined by directory structure (`wildfire/` vs `fp/`)
 - Max 20 frames per sequence, 30s apart

-## How to Reproduce
+## 🔄 How to Reproduce

 ```bash
 make install
 uv run dvc pull # pull test set + model packages from S3
 uv run dvc repro # run evaluation pipeline
 ```

-## Adding a New Model
+## Adding a New Model

 1. Implement `TemporalModel` in a new experiment under `experiments/temporal-models/`
 2. Package the model (see [tracking-fsm-baseline](../tracking-fsm-baseline/) for the zip format)
Lines changed: 2 additions & 0 deletions
@@ -1 +1,3 @@
 /fsm-tracking-baseline.zip
+/mtb-change-detection.zip
+/pyro-detector-baseline.zip
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+outs:
+- md5: af4db66fccf5be457dff99c0b66a30f5
+  size: 19260701
+  hash: md5
+  path: mtb-change-detection.zip
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+outs:
+- md5: 8df77be9c5cd9a120569730c4160098a
+  size: 38513179
+  hash: md5
+  path: pyro-detector-baseline.zip
Lines changed: 2 additions & 0 deletions
@@ -1 +1,3 @@
 /fsm-tracking-baseline
+/mtb-change-detection
+/pyro-detector-baseline

experiments/temporal-models/temporal-model-leaderboard/data/08_reporting/leaderboard.json

Lines changed: 28 additions & 0 deletions
@@ -12,5 +12,33 @@
     "fpr": 0.0537,
     "mean_ttd_seconds": 142.0,
     "median_ttd_seconds": 58.0
+  },
+  {
+    "model_name": "pyro-detector-baseline",
+    "num_sequences": 298,
+    "tp": 143,
+    "fp": 24,
+    "fn": 6,
+    "tn": 125,
+    "precision": 0.8563,
+    "recall": 0.9597,
+    "f1": 0.9051,
+    "fpr": 0.1611,
+    "mean_ttd_seconds": 27.0,
+    "median_ttd_seconds": 7.0
+  },
+  {
+    "model_name": "mtb-change-detection",
+    "num_sequences": 298,
+    "tp": 139,
+    "fp": 55,
+    "fn": 10,
+    "tn": 94,
+    "precision": 0.7165,
+    "recall": 0.9329,
+    "f1": 0.8105,
+    "fpr": 0.3691,
+    "mean_ttd_seconds": 85.4,
+    "median_ttd_seconds": 25.0
   }
 ]
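
Review note: the reported precision, recall, F1 and FPR in the new entries can be cross-checked against the `tp`/`fp`/`fn`/`tn` counts in the same file. The sketch below is a minimal sanity check using the standard confusion-matrix definitions. It is not the repository's `scripts/leaderboard.py`; the function name `sequence_metrics`, the 4-decimal rounding, and the zero-denominator fallback to `0.0` are assumptions made for illustration.

```python
def sequence_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Derive sequence-level metrics from confusion-matrix counts.

    Hypothetical helper: rounding to 4 decimals and returning 0.0 on an
    empty denominator are assumptions, not confirmed project behaviour.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "precision": round(precision, 4),
        "recall": round(recall, 4),
        "f1": round(f1, 4),
        "fpr": round(fpr, 4),
    }


# Counts copied from the leaderboard.json entries added in this commit.
COUNTS = {
    "pyro-detector-baseline": {"tp": 143, "fp": 24, "fn": 6, "tn": 125},
    "mtb-change-detection": {"tp": 139, "fp": 55, "fn": 10, "tn": 94},
}

if __name__ == "__main__":
    for name, counts in COUNTS.items():
        print(name, sequence_metrics(**counts))
```

Both new entries reproduce the committed values under these definitions, and each set of counts sums to the stated 298 sequences.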

experiments/temporal-models/temporal-model-leaderboard/dvc.lock

Lines changed: 75 additions & 13 deletions
@@ -18,12 +18,12 @@ stages:
       nfiles: 32640
     - path: scripts/evaluate.py
       hash: md5
-      md5: 75f32f92f6ac1d1ba28eb9c7bf5343e1
-      size: 3761
+      md5: b0f7e0612d85e1691ef78aed9bd2c118
+      size: 3979
     - path: src/temporal_model_leaderboard
       hash: md5
-      md5: cbdb9e92766ca170b1bc87485d74c633.dir
-      size: 32176
+      md5: 423d337f89bfe4e71fb126ec3c665aae.dir
+      size: 32284
       nfiles: 12
     outs:
     - path: data/07_model_output/fsm-tracking-baseline
@@ -37,24 +37,86 @@ stages:
     deps:
     - path: data/07_model_output
       hash: md5
-      md5: 6af82611b2a69282012915498c595d56.dir
-      size: 46667
-      nfiles: 4
+      md5: 8679635cba572131844bd723271f57d5.dir
+      size: 139733
+      nfiles: 8
     - path: scripts/leaderboard.py
       hash: md5
       md5: 81dec9584c04f465c99ced1b8c733a08
       size: 2507
     - path: src/temporal_model_leaderboard
       hash: md5
-      md5: cbdb9e92766ca170b1bc87485d74c633.dir
-      size: 32176
+      md5: 423d337f89bfe4e71fb126ec3c665aae.dir
+      size: 32284
       nfiles: 12
     outs:
     - path: data/08_reporting/leaderboard.json
       hash: md5
-      md5: bdfd6120841f4649413c6426e8cd04aa
-      size: 283
+      md5: a0333c0f68c720abf9000938e20ad851
+      size: 842
     - path: data/08_reporting/leaderboard.txt
       hash: md5
-      md5: cc2e649cfd237e8d4d7108e3efee5707
-      size: 279
+      md5: 49c9c9ac97a732437dfee797ee73c523
+      size: 470
+  evaluate_mtb_change_detection:
+    cmd: uv run python scripts/evaluate.py --model-name mtb-change-detection
+      --model-type mtb-change-detection --model-package
+      data/01_raw/models/mtb-change-detection.zip --test-dir
+      data/01_raw/sequential_test/test --output-dir
+      data/07_model_output/mtb-change-detection
+    deps:
+    - path: data/01_raw/models/mtb-change-detection.zip
+      hash: md5
+      md5: af4db66fccf5be457dff99c0b66a30f5
+      size: 19260701
+    - path: data/01_raw/sequential_test
+      hash: md5
+      md5: 828edb7fb8f0909f0d6115d67ffbddc9.dir
+      size: 1594484536
+      nfiles: 32640
+    - path: scripts/evaluate.py
+      hash: md5
+      md5: b0f7e0612d85e1691ef78aed9bd2c118
+      size: 3979
+    - path: src/temporal_model_leaderboard
+      hash: md5
+      md5: 423d337f89bfe4e71fb126ec3c665aae.dir
+      size: 32284
+      nfiles: 12
+    outs:
+    - path: data/07_model_output/mtb-change-detection
+      hash: md5
+      md5: 7b4f525b949035c54a0de70a635456c1.dir
+      size: 46552
+      nfiles: 2
+  evaluate_pyro_detector:
+    cmd: uv run python scripts/evaluate.py --model-name pyro-detector-baseline
+      --model-type pyro-detector-baseline --model-package
+      data/01_raw/models/pyro-detector-baseline.zip --test-dir
+      data/01_raw/sequential_test/test --output-dir
+      data/07_model_output/pyro-detector-baseline
+    deps:
+    - path: data/01_raw/models/pyro-detector-baseline.zip
+      hash: md5
+      md5: 8df77be9c5cd9a120569730c4160098a
+      size: 38513179
+    - path: data/01_raw/sequential_test
+      hash: md5
+      md5: 828edb7fb8f0909f0d6115d67ffbddc9.dir
+      size: 1594484536
+      nfiles: 32640
+    - path: scripts/evaluate.py
+      hash: md5
+      md5: b0f7e0612d85e1691ef78aed9bd2c118
+      size: 3979
+    - path: src/temporal_model_leaderboard
+      hash: md5
+      md5: 423d337f89bfe4e71fb126ec3c665aae.dir
+      size: 32284
+      nfiles: 12
+    outs:
+    - path: data/07_model_output/pyro-detector-baseline
+      hash: md5
+      md5: 0ff18e7a674f5a7c8b0452f3ce20cc4a.dir
+      size: 46514
+      nfiles: 2

experiments/temporal-models/temporal-model-leaderboard/dvc.yaml

Lines changed: 32 additions & 0 deletions
@@ -15,6 +15,38 @@ stages:
     outs:
       - data/07_model_output/fsm-tracking-baseline

+  evaluate_mtb_change_detection:
+    cmd: >-
+      uv run python scripts/evaluate.py
+      --model-name mtb-change-detection
+      --model-type mtb-change-detection
+      --model-package data/01_raw/models/mtb-change-detection.zip
+      --test-dir data/01_raw/sequential_test/test
+      --output-dir data/07_model_output/mtb-change-detection
+    deps:
+      - scripts/evaluate.py
+      - src/temporal_model_leaderboard
+      - data/01_raw/sequential_test
+      - data/01_raw/models/mtb-change-detection.zip
+    outs:
+      - data/07_model_output/mtb-change-detection
+
+  evaluate_pyro_detector:
+    cmd: >-
+      uv run python scripts/evaluate.py
+      --model-name pyro-detector-baseline
+      --model-type pyro-detector-baseline
+      --model-package data/01_raw/models/pyro-detector-baseline.zip
+      --test-dir data/01_raw/sequential_test/test
+      --output-dir data/07_model_output/pyro-detector-baseline
+    deps:
+      - scripts/evaluate.py
+      - src/temporal_model_leaderboard
+      - data/01_raw/sequential_test
+      - data/01_raw/models/pyro-detector-baseline.zip
+    outs:
+      - data/07_model_output/pyro-detector-baseline
+
   leaderboard:
     cmd: >-
       uv run python scripts/leaderboard.py
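
Review note: the two new evaluation stages differ only in the model name, so the pattern could be captured by a small generator. The sketch below is illustrative only and is not part of this commit. The helper `build_evaluate_stage` is hypothetical, and it assumes that `--model-type`, the package file name, and the output directory all reuse the model name, as they do in the two stages added here.

```python
# Paths shared by both evaluation stages added in this commit.
TEST_DIR = "data/01_raw/sequential_test/test"
COMMON_DEPS = [
    "scripts/evaluate.py",
    "src/temporal_model_leaderboard",
    "data/01_raw/sequential_test",
]


def build_evaluate_stage(model_name: str) -> dict:
    """Return a dvc.yaml-style stage body for one model (hypothetical helper).

    Assumes the model type, package name and output directory all equal
    `model_name`, which holds for the two stages in this diff.
    """
    package = f"data/01_raw/models/{model_name}.zip"
    output_dir = f"data/07_model_output/{model_name}"
    cmd = " ".join(
        [
            "uv run python scripts/evaluate.py",
            f"--model-name {model_name}",
            f"--model-type {model_name}",
            f"--model-package {package}",
            f"--test-dir {TEST_DIR}",
            f"--output-dir {output_dir}",
        ]
    )
    return {"cmd": cmd, "deps": [*COMMON_DEPS, package], "outs": [output_dir]}


if __name__ == "__main__":
    stage = build_evaluate_stage("mtb-change-detection")
    print(stage["cmd"])
```

For `mtb-change-detection` the generated command matches the resolved `cmd` string recorded in `dvc.lock`. Stage names are left out on purpose, since `evaluate_pyro_detector` does not follow the full model name.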

experiments/temporal-models/temporal-model-leaderboard/pyproject.toml

Lines changed: 4 additions & 0 deletions
@@ -6,11 +6,15 @@ requires-python = ">=3.11"
 dependencies = [
     "pyrocore",
     "tracking-fsm-baseline",
+    "mtb-change-detection",
+    "pyro-detector-baseline",
 ]

 [tool.uv.sources]
 pyrocore = { path = "../../../lib/pyrocore" }
 tracking-fsm-baseline = { path = "../tracking-fsm-baseline" }
+mtb-change-detection = { path = "../mtb-change-detection" }
+pyro-detector-baseline = { path = "../pyro-detector-baseline" }

 [dependency-groups]
 dev = [
