feat: add MetOp IASI Level 1C infrared brightness temperature data source #811

Open
NickGeneva wants to merge 6 commits into NVIDIA:main from NickGeneva:ngeneva/metop_iasi_datasource

Conversation

@NickGeneva
Collaborator

Description

Add MetOpIASI DataFrameSource for EUMETSAT IASI Level 1C calibrated infrared spectral radiance data from MetOp-A/B/C polar-orbiting satellites. The source parses EPS native binary format files, extracting per-IFOV geolocation, viewing geometry, quality flags, and spectral radiance for user-selected channels, converting to brightness temperature via the inverse Planck function.

Data source details

| Property | Value |
| --- | --- |
| Source type | DataFrameSource |
| Remote store | EUMETSAT Data Store |
| Format | EPS native binary (.nat) |
| Spatial resolution | ~12 km at nadir per IFOV (30 EFOVs × 4 IFOVs per scan line) |
| Spectral resolution | 8461 channels, 645–2760 cm⁻¹, 0.25 cm⁻¹ sampling |
| Date range | 2006-10 (MetOp-A) to present |
| Region | Global (polar orbiter, ~14 orbits/day per satellite) |
| Authentication | OAuth2 via EUMETSAT_CONSUMER_KEY + EUMETSAT_CONSUMER_SECRET env vars |

Data licensing

License: EUMETSAT Data Policy — free and open access
URL: https://www.eumetsat.int/legal-framework/data-policy

Open data, freely available for commercial and non-commercial use.
Registration required for API access.

Dependencies added

No new dependencies needed — eumdac>=3.1.0 is already in the data optional dependency group.

Key implementation details

  • Channel subsetting: Constructor accepts channel_indices parameter (0-based, default 100 representative channels) to avoid loading all 8461 channels per IFOV
  • EPS binary parsing: Sequential record walking with _compute_mdr_field_offsets() to handle variable-size MDR header fields (no hardcoded offsets)
  • GIADR scale factors: Per-band radiance scale factors read from GIADR record (positive integers representing negative exponents)
  • Radiance → BT: Plain inverse Planck (no band correction needed for IASI's narrow 0.25 cm⁻¹ channels)
  • Same pattern as MetOpAMSUA/MHS/AVHRR: eumdac for download, struct-based binary parsing, long-format DataFrame output
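
The scale-factor and radiance-to-BT steps listed above can be illustrated with a minimal sketch. It is not the code from this PR: the function names are hypothetical, and the radiance unit (mW m⁻² sr⁻¹ per cm⁻¹) is an assumption chosen to match the constants used here. The scale factor is applied as a negative power of ten, following the description of the GIADR values as positive integers representing negative exponents.

```python
import numpy as np

# Radiation constants for wavenumber in cm^-1 and radiance in
# mW / (m^2 sr cm^-1). The unit convention is an assumption of this sketch.
C1 = 1.191042e-5  # 2*h*c^2, in mW / (m^2 sr cm^-4)
C2 = 1.4387769    # h*c/k, in K cm


def apply_scale_factor(raw_counts, scale_factor):
    """Scale raw integer samples by 10**(-scale_factor)."""
    return np.asarray(raw_counts, dtype=np.float64) * 10.0 ** (-int(scale_factor))


def planck_radiance(wavenumber_cm, temperature_k):
    """Forward Planck function, used here only to check the inverse."""
    wn = np.asarray(wavenumber_cm, dtype=np.float64)
    return C1 * wn**3 / np.expm1(C2 * wn / temperature_k)


def brightness_temperature(wavenumber_cm, radiance):
    """Plain inverse Planck function (no band correction)."""
    wn = np.asarray(wavenumber_cm, dtype=np.float64)
    rad = np.asarray(radiance, dtype=np.float64)
    with np.errstate(divide="ignore", invalid="ignore"):
        bt = C2 * wn / np.log1p(C1 * wn**3 / rad)
    # Non-positive radiances have no physical brightness temperature.
    return np.where(rad > 0, bt, np.nan)
```

A round trip such as `brightness_temperature(900.0, planck_radiance(900.0, 280.0))` should recover the input temperature, which is a cheap check on the constants and units.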

Validation

See sanity-check validation in PR comments below.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • The CHANGELOG.md is up to date with these changes.
  • An issue is linked to this pull request.
  • Assess and address Greptile feedback (AI code review bot).

…urce

Add MetOpIASI DataFrameSource for EUMETSAT IASI L1C calibrated spectral
radiance data from MetOp-A/B/C satellites. Parses EPS native binary format
with 8461 infrared channels (645-2760 cm⁻¹), converting to brightness
temperature via inverse Planck function.

Includes MetOpIASILexicon, unit tests (22 passing + 4 network xfail),
and documentation updates.
@NickGeneva
Collaborator Author

NickGeneva commented Apr 10, 2026

Sanity-Check Validation

Source: MetOpIASI — IASI Level 1C infrared brightness temperature from MetOp satellites
Time: 2025-01-15 12:00 UTC ±2h
Parameters: satellite=metop-c, 10 representative channels across 3 spectral bands

| Metric | Value |
| --- | --- |
| Output rows | 2,150,420 |
| Channels validated | 10 (spanning 645–2760 cm⁻¹) |
| BT range | 151.6–345.9 K |
| Lat range | -89.72° to 89.66° |
| Lon range | 0.00° to 359.99° |
| Missing / NaN | 0 |
| Satellites | metop-c |
| Orbit segments | 3 (~4h window) |
| Quality range | 0–624 (uint16 flags) |

Per-channel summary:

| Channel | WN (cm⁻¹) | Band | Obs | Min BT (K) | Max BT (K) | Mean BT (K) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 645.25 | 1 | 215,783 | 192.5 | 236.6 | 215.6 |
| 666 | 811.50 | 1 | 215,783 | 178.8 | 311.0 | 265.3 |
| 1331 | 977.75 | 1 | 215,783 | 180.1 | 307.9 | 266.2 |
| 1997 | 1144.25 | 1 | 215,783 | 179.8 | 313.9 | 266.8 |
| 1998 | 1144.50 | 2 | 215,783 | 181.1 | 314.4 | 266.9 |
| 3557 | 1534.25 | 2 | 215,793 | 194.5 | 240.7 | 221.7 |
| 5116 | 1924.00 | 2 | 215,747 | 151.6 | 280.5 | 250.8 |
| 5117 | 1924.25 | 3 | 215,747 | 151.6 | 285.0 | 254.5 |
| 6789 | 2342.25 | 3 | 215,428 | 159.3 | 261.1 | 238.2 |
| 8461 | 2760.25 | 3 | 208,790 | 183.5 | 345.9 | 277.9 |
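
The WN and Band columns follow from the channel number alone. As a small sketch, the helpers below restate the arithmetic used in the sanity-check script in this comment; the function names are hypothetical and are not part of the data source.

```python
def channel_wavenumber(channel):
    """Wavenumber in cm^-1 for a 1-based channel number (script formula)."""
    return (2581 + (channel - 1)) * 25.0 / 100.0


def channel_band(channel):
    """Band number (1, 2 or 3) for a 1-based channel, using the script's limits."""
    if channel <= 1997:
        return 1
    if channel <= 5116:
        return 2
    return 3


# Reproduce the WN and Band columns for the ten validated channels.
for ch in (1, 666, 1331, 1997, 1998, 3557, 5116, 5117, 6789, 8461):
    print(f"{ch:>5} {channel_wavenumber(ch):>8.2f} {channel_band(ch)}")
```

With this formula channel 8461 lands at 2760.25 cm⁻¹, a quarter wavenumber above the 645–2760 cm⁻¹ range quoted in the description. Whether the offset belongs in the formula or in the quoted range is not checked here.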

Key findings:

  • All channels show 100% valid data with physically reasonable BT ranges
  • Ch 1 (645 cm⁻¹, CO₂ band): mean 216 K — correctly sees stratospheric emission
  • Window channels (811–1144 cm⁻¹): mean 265–267 K — correct for January global surface temperature
  • Ch 3557 (1534 cm⁻¹, H₂O band): mean 222 K — correctly sees mid-tropospheric water vapor
  • Ch 8461 (2760 cm⁻¹, SWIR): up to 346 K — daytime solar contribution expected
  • Global polar orbiter coverage confirmed from 3 orbit segments
[Figure: sanity_check_iasi]

Qualitatively, the results look good for all three bands (low, mid, high). The time ranges and orbital positions also look consistent.

Sanity-check script
"""Sanity-check validation for MetOpIASI data source.

This script is for PR review only — do NOT commit to the repo.
Uses cached .nat files from 2025-01-15 to validate the parser
and produce sanity-check plots.
"""

import sys
import time as time_mod
from datetime import datetime, timedelta

import matplotlib

matplotlib.use("Agg")
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
import numpy as np

from earth2studio.data import MetOpIASI
from earth2studio.lexicon import MetOpIASILexicon


def validate_and_summarize(ds, time_val, variable):
    """Fetch data and print variable/channel summary."""
    print("=" * 70)
    print(f"Fetching IASI data for {time_val}, variable='{variable}'")
    print(f"Channel indices: {len(ds._channel_indices)} channels")
    print("=" * 70)

    t0 = time_mod.time()
    df = ds(time_val, variable)
    t1 = time_mod.time()
    print(f"Fetch time: {t1 - t0:.1f}s")
    print(f"Total rows: {len(df):,}")

    if len(df) == 0:
        print("ERROR: No data returned!")
        return df

    channels = sorted(df["channel_index"].unique())
    satellites = sorted(df["satellite"].unique())

    print(f"Satellites: {satellites}")
    print(f"Channels: {len(channels)} (range: {min(channels)}-{max(channels)})")
    print(f"Lat range: {df['lat'].min():.2f} to {df['lat'].max():.2f}")
    print(f"Lon range: {df['lon'].min():.2f} to {df['lon'].max():.2f}")
    print(f"Time range: {df['time'].min()} to {df['time'].max()}")
    print(f"BT range: {df['observation'].min():.1f} to {df['observation'].max():.1f} K")
    print(f"NaN count: {df['observation'].isna().sum()}")
    print(f"Quality range: {df['quality'].min()} to {df['quality'].max()}")

    # Per-channel summary
    print(
        f"\n{'Channel':>8} {'WN (cm-1)':>10} {'Band':>5} {'Obs':>8} "
        f"{'Valid%':>7} {'Min BT':>8} {'Max BT':>8} {'Mean BT':>8}"
    )
    print("-" * 75)

    for ch in channels:
        sub = df[df["channel_index"] == ch]
        n_total = len(sub)
        obs = sub["observation"]
        n_valid = obs.notna().sum()
        pct = (n_valid / n_total * 100) if n_total > 0 else 0

        wn = (2581 + (ch - 1)) * 25.0 / 100.0

        if ch <= 1997:
            band = 1
        elif ch <= 5116:
            band = 2
        else:
            band = 3

        vmin = obs.min() if n_valid > 0 else float("nan")
        vmax = obs.max() if n_valid > 0 else float("nan")
        vmean = obs.mean() if n_valid > 0 else float("nan")

        flag = " *** REMOVE" if pct < 10 else ""
        print(
            f"{ch:>8} {wn:>10.2f} {band:>5} {n_total:>8} "
            f"{pct:>6.1f}% {vmin:>8.1f} {vmax:>8.1f} {vmean:>8.1f}{flag}"
        )

    return df


def plot_sanity_check(df, time_str):
    """Create sanity-check scatter plots on Robinson projection."""
    channels = sorted(df["channel_index"].unique())

    plot_channels = [channels[0], channels[len(channels) // 2], channels[-1]]

    df = df.copy()
    df["lon_plt"] = df["lon"].where(df["lon"] <= 180, df["lon"] - 360)

    fig, axes = plt.subplots(
        1,
        3,
        figsize=(24, 7),
        subplot_kw={"projection": ccrs.Robinson()},
    )

    for ax, ch in zip(axes, plot_channels):
        subset = df[df["channel_index"] == ch].copy()
        obs = subset["observation"].values
        valid_mask = np.isfinite(obs)

        if valid_mask.sum() == 0:
            ax.set_title(f"Channel {ch} — no valid data")
            continue

        obs_valid = obs[valid_mask]
        vmin, vmax = np.percentile(obs_valid, [2, 98])

        wn = (2581 + (ch - 1)) * 25.0 / 100.0

        ax.set_global()
        ax.add_feature(cfeature.COASTLINE, linewidth=0.5)
        ax.add_feature(cfeature.BORDERS, linewidth=0.3, alpha=0.5)

        sc = ax.scatter(
            subset.loc[valid_mask, "lon_plt"].values,
            subset.loc[valid_mask, "lat"].values,
            c=obs_valid,
            s=1,
            cmap="turbo",
            alpha=0.8,
            vmin=vmin,
            vmax=vmax,
            edgecolors="none",
            transform=ccrs.PlateCarree(),
        )
        ax.set_title(
            f"Ch {ch} ({wn:.1f} cm⁻¹)\n"
            f"{valid_mask.sum():,} obs, BT: {vmin:.0f}–{vmax:.0f} K (p2–p98)"
        )
        plt.colorbar(
            sc,
            ax=ax,
            shrink=0.6,
            label="BT (K)",
            orientation="horizontal",
            pad=0.05,
        )

    satellites = sorted(df["satellite"].unique())
    plt.suptitle(
        f"MetOpIASI — {time_str}\nSatellites: {', '.join(satellites)}",
        y=1.02,
    )
    plt.tight_layout()
    outpath = "sanity_check_iasi.png"
    plt.savefig(outpath, dpi=150, bbox_inches="tight")
    print(f"\nSaved: {outpath}")
    return outpath


if __name__ == "__main__":
    time_val = datetime(2025, 1, 15, 12, 0)

    channels = np.concatenate(
        [
            np.linspace(0, 1996, 4, dtype=np.int32),
            np.linspace(1997, 5115, 3, dtype=np.int32),
            np.linspace(5116, 8460, 3, dtype=np.int32),
        ]
    )

    ds = MetOpIASI(
        satellite="metop-c",
        channel_indices=channels,
        time_tolerance=timedelta(hours=2),
        cache=True,
        verbose=True,
    )

    print("Using cached IASI files from 2025-01-15")
    df = validate_and_summarize(ds, time_val, "iasi")

    if len(df) == 0:
        print("\nERROR: No data returned! Check cache files.")
        sys.exit(1)

    outpath = plot_sanity_check(df, "2025-01-15 12:00 UTC ±2h")
    print(f"\nDone! Please inspect: {outpath}")
Print Out / Stats
MetOpIASI fetch time: 56.8s
Rows: 37,546,527
Satellites: ['metop-c']
Channels: [16, 38, 49, 51, 55, 57, 59, 61, 63, 66, 70, 72, 74, 79, 81, 83, 85, 87, 104, 106, 109, 111, 113, 116, 119, 122, 125, 128, 131, 133, 135, 138, 141, 144, 146, 148, 151, 154, 157, 159, 161, 163, 167, 170, 173, 176, 180, 185, 187, 193, 199, 205, 207, 210, 212, 214, 217, 219, 222, 224, 226, 230, 232, 236, 239, 243, 246, 249, 252, 254, 260, 262, 265, 267, 275, 282, 294, 296, 299, 303, 306, 323, 327, 329, 335, 345, 347, 350, 354, 356, 360, 366, 371, 373, 375, 377, 379, 381, 383, 386, 389, 398, 401, 404, 407, 410, 414, 416, 426, 428, 432, 434, 439, 445, 457, 515, 546, 552, 559, 566, 571, 573, 646, 662, 668, 756, 867, 906, 921, 1027, 1046, 1121, 1133, 1191, 1194, 1271, 1479, 1509, 1513, 1521, 1536, 1574, 1579, 1585, 1587, 1626, 1639, 1643, 1652, 1658, 1671, 1786, 1805, 1884, 1991, 2019, 2094, 2119, 2213, 2239, 2271, 2321, 2398, 2701, 2889, 2958, 2993, 3002, 3049, 3105, 3110, 5381, 5399, 5480]
BT range: 157.6–315.2 K
Lat range: -89.72 to 89.66
Lon range: 0.00 to 359.99
Time range: 2025-01-15 10:00:00.167000 to 2025-01-15 13:59:59.903000

Common channels (1-based): 174

======================================================================
Per-Channel BT Comparison (common channels)
======================================================================
    Ch  WN(cm-1)   MetOp N MetOp Mean MetOp Std     UFS N   UFS Mean   UFS Std  dMean(K)
-----------------------------------------------------------------------------------------------
    16    649.00   215,783     233.35     11.81     6,713     233.22      9.83     +0.12
    38    654.50   215,783     216.67     11.22     6,713     215.89      9.48     +0.78
    49    657.25   215,783     219.83     11.51     6,713     219.38      9.44     +0.45
    51    657.75   215,783     217.00     11.28     6,713     216.29      9.46     +0.71
    55    658.75   215,783     219.84     11.49     6,713     219.38      9.43     +0.46
    57    659.25   215,783     216.68     11.27     6,713     215.92      9.50     +0.76
    59    659.75   215,783     225.76     11.96     6,713     225.60      9.76     +0.15
    61    660.25   215,783     220.34     11.55     6,713     219.94      9.46     +0.40
    63    660.75   215,783     216.85     11.24     6,713     216.11      9.46     +0.74
    66    661.50   215,783     225.17     11.93     6,713     225.02      9.73     +0.15
    70    662.50   215,783     222.84     10.88     6,713     222.34      9.00     +0.50
    72    663.00   215,783     232.52     11.84     6,713     232.43      9.82     +0.10
    74    663.50   215,783     224.53     11.65     6,713     224.21      9.52     +0.33
    79    664.75   215,783     222.87     11.66     6,713     222.58      9.52     +0.30
    81    665.25   215,783     223.73     11.60     6,713     223.43      9.47     +0.30
    83    665.75   215,783     221.32     11.54     6,713     220.87      9.48     +0.45
    85    666.25   215,783     220.52     11.58     6,713     220.13      9.47     +0.39
    87    666.75   215,783     218.32     11.63     6,713     217.85      9.55     +0.47
   104    671.00   215,783     220.32     11.57     6,713     219.80      9.53     +0.51
   106    671.50   215,783     227.60     12.02     6,713     227.47      9.83     +0.13
   109    672.25   215,783     217.39     11.38     6,713     216.66      9.54     +0.73
   111    672.75   215,783     221.27     11.65     6,713     220.90      9.52     +0.37
   113    673.25   215,783     227.47     12.10     6,713     227.38      9.89     +0.10
   116    674.00   215,783     216.82     11.35     6,713     216.12      9.50     +0.70
   119    674.75   215,783     228.11     12.16     6,713     228.05      9.95     +0.06
   122    675.50   215,783     216.37     11.38     6,713     215.63      9.57     +0.75
   125    676.25   215,783     228.50     12.23     6,713     228.46     10.02     +0.04
   128    677.00   215,783     216.54     11.37     6,713     215.80      9.56     +0.74
   131    677.75   215,783     227.79     12.14     6,713     227.72      9.93     +0.08
   133    678.25   215,783     221.61     11.61     6,713     221.23      9.50     +0.38
   135    678.75   215,783     216.53     11.38     6,713     215.79      9.55     +0.74
   138    679.50   215,783     228.98     12.24     6,713     228.94     10.02     +0.04
   141    680.25   215,783     216.17     11.34     6,713     215.36      9.59     +0.80
   144    681.00   215,783     228.56     12.23     6,713     228.49     10.02     +0.07
   146    681.50   215,783     219.58     11.40     6,713     219.06      9.41     +0.52
   148    682.00   215,783     216.55     11.32     6,713     215.79      9.53     +0.76
   151    682.75   215,783     227.11     12.08     6,713     226.99      9.87     +0.12
   154    683.50   215,783     215.89     11.21     6,713     215.04      9.53     +0.85
   157    684.25   215,783     227.88     12.10     6,713     227.73      9.90     +0.15
   159    684.75   215,783     218.00     11.18     6,713     217.32      9.34     +0.68
   161    685.25   215,783     216.27     11.14     6,713     215.45      9.44     +0.82
   163    685.75   215,783     226.91     11.89     6,713     226.72      9.73     +0.20
   167    686.75   215,783     215.71     10.91     6,713     214.81      9.34     +0.90
   170    687.50   215,783     225.44     11.76     6,713     225.18      9.61     +0.26
   173    688.25   215,783     215.70     10.56     6,713     214.81      9.05     +0.90
   176    689.00   215,783     226.55     11.74     6,713     226.27      9.62     +0.28
   180    690.00   215,783     216.93     10.41     6,713     216.06      8.87     +0.88
   185    691.25   215,783     217.20      9.89     6,713     216.47      8.32     +0.73
   187    691.75   215,783     216.73     10.07     6,713     215.95      8.51     +0.77
   193    693.25   215,783     216.84      9.14     6,713     216.21      7.64     +0.63
   199    694.75   215,783     217.81      8.02     6,713     217.49      6.49     +0.32
   205    696.25   215,783     219.34      7.24     6,713     219.35      5.63     -0.01
   207    696.75   215,783     218.35      9.17     6,713     217.84      7.53     +0.51
   210    697.50   215,783     220.21      8.48     6,713     219.96      6.76     +0.25
   212    698.00   215,783     220.37      6.85     6,713     220.59      5.20     -0.22
   214    698.50   215,783     219.43      9.28     6,713     219.00      7.57     +0.43
   217    699.25   215,783     221.60      7.00     6,713     221.84      5.30     -0.24
   219    699.75   215,783     221.15      6.60     6,713     221.54      4.93     -0.39
   222    700.50   215,783     219.77      8.91     6,713     219.38      7.21     +0.39
   224    701.00   215,783     222.67      6.42     6,713     223.21      4.73     -0.53
   226    701.50   215,783     222.01      6.42     6,713     222.58      4.74     -0.56
   230    702.50   215,783     222.98      6.29     6,713     223.52      4.62     -0.55
   232    703.00   215,783     223.51      6.27     6,713     224.25      4.60     -0.74
   236    704.00   215,783     225.37      6.08     6,713     226.38      4.47     -1.01
   239    704.75   215,783     226.74      6.37     6,713     228.22      4.82     -1.48
   243    705.75   215,783     227.57      6.21     6,713     228.98      4.71     -1.41
   246    706.50   215,783     229.04      6.68     6,713     230.87      5.17     -1.82
   249    707.25   215,783     229.31      6.56     6,713     231.08      5.11     -1.78
   252    708.00   215,783     231.29      7.06     6,713     233.44      5.61     -2.15
   254    708.50   215,783     226.96      6.41     6,713     228.16      4.69     -1.20
   260    710.00   215,783     229.81      6.67     6,713     231.57      5.04     -1.76
   262    710.50   215,783     231.97      7.14     6,713     234.21      5.68     -2.24
   265    711.25   215,783     235.42      8.15     6,713     238.28      6.69     -2.85
   267    711.75   215,783     231.07      6.81     6,713     233.04      5.26     -1.97
   275    713.75   215,783     231.97      6.97     6,713     234.03      5.43     -2.05
   282    715.50   215,783     230.98      7.14     6,713     233.09      5.48     -2.12
   294    718.50   215,783     230.31      6.44     6,713     232.00      4.88     -1.70
   296    719.00   215,783     226.50      6.13     6,713     227.53      4.49     -1.02
   299    719.75   215,783     221.15      8.32     6,713     220.86      6.60     +0.29
   303    720.75   215,783     227.57     11.74     6,713     227.37      9.59     +0.20
   306    721.50   215,783     228.14      6.57     6,713     229.69      4.97     -1.55
   323    725.75   215,783     235.51      8.32     6,713     238.53      6.63     -3.02
   327    726.75   215,783     249.96     13.32     6,713     255.74     11.22     -5.79
   329    727.25   215,783     233.21      7.64     6,713     235.79      5.99     -2.57
   335    728.75   215,783     232.55      7.47     6,713     234.95      5.79     -2.41
   345    731.25   215,783     247.16     11.70     6,713     251.58      9.69     -4.42
   347    731.75   215,783     234.48      7.90     6,713     237.17      6.21     -2.69
   350    732.50   215,783     248.75     12.87     6,713     254.25     10.86     -5.50
   354    733.50   215,783     233.04      7.61     6,713     235.57      5.96     -2.53
   356    734.00   215,783     248.49     12.77     6,713     253.95     10.78     -5.46
   360    735.00   215,783     233.05      7.60     6,713     235.57      5.93     -2.52
   366    736.50   215,783     233.54      7.71     6,713     236.14      6.06     -2.61
   371    737.75   215,783     242.25     10.50     6,713     246.54      8.73     -4.29
   373    738.25   215,783     240.59      9.81     6,713     244.50      8.09     -3.91
   375    738.75   215,783     251.58     13.70     6,713     257.56     11.49     -5.99
   377    739.25   215,783     243.94     11.02     6,713     248.51      9.16     -4.57
   379    739.75   215,783     239.92      9.43     6,713     243.62      7.70     -3.70
   381    740.25   215,783     250.86     13.06     6,713     256.42     10.81     -5.57
   383    740.75   215,783     241.05      9.24     6,713     244.48      7.48     -3.43
   386    741.50   215,783     234.21      6.82     6,713     236.32      5.40     -2.11
   389    742.25   215,783     242.75     10.15     6,713     246.79      8.32     -4.05
   398    744.50   215,783     251.07     13.11     6,713     256.46     10.74     -5.39
   401    745.25   215,783     248.50     11.95     6,713     253.24      9.74     -4.73
   404    746.00   215,783     254.52     15.22     6,713     261.42     12.84     -6.90
   407    746.75   215,783     252.79     14.43     6,713     259.27     12.13     -6.48
   410    747.50   215,783     255.29     15.55     6,713     262.39     13.11     -7.09
   414    748.50   215,783     245.03     11.00     6,713     249.52      9.01     -4.49
   416    749.00   215,783     255.29     15.33     6,713     262.16     12.83     -6.87
   426    751.50   215,783     248.72     12.47     6,713     254.15     10.36     -5.43
   428    752.00   215,783     257.31     16.31     6,713     264.82     13.71     -7.51
   432    753.00   215,783     249.99     12.84     6,713     255.59     10.63     -5.61
   434    753.50   215,783     258.00     16.65     6,713     265.70     14.02     -7.69
   439    754.75   215,783     249.95     12.21     6,713     254.77      9.91     -4.82
   445    756.25   215,783     254.01     14.35     6,713     260.36     11.85     -6.35
   457    759.25   215,783     257.93     16.60     6,713     265.69     14.04     -7.76
   515    773.75   215,783     264.51     20.60     6,713     274.50     17.84     -9.98
   546    781.50   215,783     264.77     20.76     6,713     274.85     18.00    -10.08
   552    783.00   215,783     264.69     20.66     6,713     274.70     17.88    -10.00
   559    784.75   215,783     257.14     15.28     6,713     263.26     12.57     -6.12
   566    786.50   215,783     264.91     20.77     6,713     274.98     18.00    -10.08
   571    787.75   215,783     265.00     20.85     6,713     275.12     18.08    -10.12
   573    788.25   215,783     264.99     20.85     6,713     275.12     18.10    -10.13
   646    806.50   215,783     264.09     20.05     6,713     273.76     17.25     -9.67
   662    810.50   215,783     265.56     21.13     6,713     275.88     18.39    -10.32
   668    812.00   215,783     265.58     21.14     6,713     275.91     18.40    -10.33
   756    834.00   215,783     265.98     21.29     6,713     276.41     18.57    -10.44
   867    861.75   215,783     266.23     21.36     6,713     276.72     18.67    -10.49
   906    871.50   215,783     263.06     18.72     6,713     271.64     15.75     -8.58
   921    875.25   215,783     266.31     21.38     6,713     276.80     18.69    -10.49
  1027    901.75   215,783     266.70     21.41     6,713     277.20     18.73    -10.50
  1046    906.50   215,783     265.71     20.42     6,713     275.54     17.56     -9.83
  1121    925.25   215,783     264.74     19.45     6,713     273.88     16.57     -9.14
  1133    928.25   215,783     266.81     21.22     6,713     277.21     18.59    -10.40
  1191    942.75   215,783     266.44     20.74     6,713     276.57     18.11    -10.13
  1194    943.50   215,783     267.37     21.36     6,713     277.83     18.72    -10.46
  1271    962.75   215,783     267.67     21.39     6,713     278.15     18.79    -10.48
  1479   1014.75   215,783     253.22     15.27     6,713     260.05     13.15     -6.83
  1509   1022.25   215,783     252.29     16.39     6,713     259.82     14.68     -7.53
  1513   1023.25   215,783     250.27     15.74     6,713     257.36     14.10     -7.09
  1521   1025.25   215,783     249.76     15.70     6,713     256.79     14.07     -7.03
  1536   1029.00   215,783     249.91     15.17     6,713     256.64     13.29     -6.73
  1574   1038.50   215,783     245.22     14.12     6,713     251.11     12.42     -5.89
  1579   1039.75   215,783     241.59     12.89     6,713     246.40     10.97     -4.81
  1585   1041.25   215,783     242.00     13.58     6,713     247.26     11.78     -5.26
  1587   1041.75   215,783     242.87     13.67     6,713     248.30     11.86     -5.43
  1626   1051.50   215,783     246.83     14.38     6,713     252.94     12.53     -6.11
  1639   1054.75   215,783     247.23     14.86     6,713     253.68     13.23     -6.45
  1643   1055.75   215,783     248.17     13.83     6,713     254.12     11.92     -5.96
  1652   1058.00   215,783     249.86     15.06     6,713     256.63     13.43     -6.77
  1658   1059.50   215,783     248.90     14.84     6,713     255.49     13.24     -6.58
  1671   1062.75   215,783     250.60     14.65     6,713     257.18     12.87     -6.58
  1786   1091.50   215,783     263.31     17.97     6,713     271.58     15.07     -8.27
  1805   1096.25   215,783     266.90     21.07     6,713     277.31     18.64    -10.40
  1884   1116.00   215,783     267.08     21.17     6,713     277.54     18.72    -10.46
  1991   1142.75   215,783     267.15     21.17     6,713     277.62     18.72    -10.47
  2019   1149.75   215,783     260.82     16.42     6,713     268.03     13.58     -7.21
  2094   1168.50   215,783     267.18     21.19     6,713     277.64     18.71    -10.46
  2119   1174.75   215,783     253.09     12.51     6,713     257.59     10.33     -4.51
  2213   1198.25   215,793     258.38     14.90     6,713     264.46     12.24     -6.08
  2239   1204.75   215,793     267.38     21.26     6,713     277.87     18.74    -10.49
  2271   1212.75   215,793     257.81     14.47     6,713     263.59     11.80     -5.77
  2321   1225.25   215,793     252.97     12.41     6,713     257.37     10.27     -4.40
  2398   1244.50   215,793     251.96     12.21     6,713     256.24     10.17     -4.29
  2701   1320.25   215,793     242.09      9.43     6,713     244.74      8.28     -2.65
  2889   1367.25   215,793     258.01     14.44     6,713     263.36     11.84     -5.35
  2958   1384.50   215,793     257.25     14.08     6,713     262.35     11.60     -5.10
  2993   1393.25   215,793     250.32     11.54     6,713     253.84      9.95     -3.52
  3002   1395.50   215,793     238.52      8.81     6,713     240.71      8.00     -2.19
  3049   1407.25   215,793     254.46     12.90     6,713     258.77     10.83     -4.31
  3105   1421.25   215,793     244.80     10.29     6,713     247.62      9.16     -2.82
  3110   1422.50   215,793     248.74     11.28     6,713     252.05      9.86     -3.31
  5381   1990.25   215,833     262.11     15.32     6,713     268.21     12.57     -6.09
  5399   1994.75   215,836     264.13     16.61     6,713     271.18     13.76     -7.05
  5480   2015.00   215,835     265.27     17.48     6,713     272.97     14.61     -7.70

======================================================================
Overall Summary
======================================================================
                           MetOpIASI       UFSObsSat
              Source         L1C raw GSI-assimilated
                Rows      37,546,527       1,168,062
            Channels             174             174
     Common channels             174             174
          BT min (K)           157.6           177.5
          BT max (K)           315.2           313.9
         BT mean (K)           239.3           242.8

@NickGeneva
Collaborator Author

Histograms

Comparison with UFSObsSat IASI: BT distributions match well for the upper-atmosphere sounding channels (mean bias below 1 K). Window channels show a cold bias in MetOpIASI relative to UFSObsSat, roughly 5–10 K in the per-channel means (up to −10.5 K in the table above), due to cloud contamination: MetOpIASI returns all FOVs (clear and cloudy), while GSI pre-thins observations. This is expected behavior for unfiltered L1C data.

[Figures: sanity_check_iasi_histograms_p1, sanity_check_iasi_histograms_p2]
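
A per-channel bias table like the one printed earlier can be computed from the two long-format DataFrames with a groupby and a join. This is a sketch, not the script that produced those numbers: `per_channel_bias` is a hypothetical helper, and the toy frames at the bottom contain made-up values for illustration only. The column names `channel_index` and `observation` are the ones used in the scripts in this thread.

```python
import pandas as pd


def per_channel_bias(df_a, df_b, key="channel_index", value="observation"):
    """Per-channel count/mean/std for two sources plus the mean difference."""
    stats_a = df_a.groupby(key)[value].agg(["count", "mean", "std"])
    stats_b = df_b.groupby(key)[value].agg(["count", "mean", "std"])
    # The inner join keeps only channels present in both sources.
    joined = stats_a.join(stats_b, how="inner", lsuffix="_a", rsuffix="_b")
    joined["d_mean"] = joined["mean_a"] - joined["mean_b"]
    return joined


# Toy data (made-up values), only to show the call shape.
toy_a = pd.DataFrame(
    {"channel_index": [16, 16, 573, 573], "observation": [233.0, 234.0, 260.0, 270.0]}
)
toy_b = pd.DataFrame(
    {"channel_index": [16, 573, 999], "observation": [233.0, 275.0, 250.0]}
)
print(per_channel_bias(toy_a, toy_b))
```

Because the two sources sample different subsets of FOVs, a mean difference from this kind of table reflects sampling and screening differences as well as any parsing error.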
Plotting Script
import sys
from datetime import datetime, timedelta

import matplotlib

matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np

from earth2studio.data import MetOpIASI, UFSObsSat


def fetch_data(time_val):
    """Fetch both sources and return aligned DataFrames with common channels."""
    # UFS
    ds_ufs = UFSObsSat(cache=True, verbose=True, time_tolerance=timedelta(hours=2))
    df_ufs = ds_ufs(time_val, "iasi")
    df_ufs = df_ufs[df_ufs["satellite"] == "metop-c"].copy()

    ufs_ch_array = np.array(sorted(df_ufs["channel_index"].unique()), dtype=np.int32)
    print(f"UFS metop-c channels (1-based): {len(ufs_ch_array)}")

    # MetOpIASI.channel_indices accepts 0-based indices (0..8460),
    # UFS channels are already 1-based, so subtract 1 for MetOpIASI input.
    ds_metop = MetOpIASI(
        satellite="metop-c",
        channel_indices=ufs_ch_array - 1,
        time_tolerance=timedelta(hours=2),
        cache=True,
        verbose=True,
    )
    df_metop = ds_metop(time_val, "iasi")

    # Both sources output 1-based channel numbers — no shift needed
    common = sorted(
        set(df_metop["channel_index"].unique()) & set(df_ufs["channel_index"].unique())
    )
    print(f"Common channels (1-based): {len(common)}")
    return df_metop, df_ufs, common


def plot_histogram_grid(df_metop, df_ufs, channels, page, outpath):
    """Plot a 10x10 grid of overlaid BT histograms."""
    nrows, ncols = 10, 10
    fig, axes = plt.subplots(nrows, ncols, figsize=(28, 24))
    fig.subplots_adjust(hspace=0.55, wspace=0.3)

    for idx in range(nrows * ncols):
        row, col = divmod(idx, ncols)
        ax = axes[row, col]

        if idx >= len(channels):
            ax.set_visible(False)
            continue

        ch = channels[idx]
        wn = (2581 + (ch - 1)) * 25.0 / 100.0

        m_obs = (
            df_metop.loc[df_metop["channel_index"] == ch, "observation"].dropna().values
        )
        u_obs = df_ufs.loc[df_ufs["channel_index"] == ch, "observation"].dropna().values

        if len(m_obs) == 0 and len(u_obs) == 0:
            ax.set_visible(False)
            continue

        # Common bin range from both sources
        all_obs = np.concatenate([v for v in (m_obs, u_obs) if len(v) > 0])
        vmin, vmax = np.percentile(all_obs, [1, 99])
        bins = np.linspace(vmin, vmax, 50)

        if len(m_obs) > 0:
            ax.hist(
                m_obs,
                bins=bins,
                density=True,
                alpha=0.5,
                color="#2196F3",
                label="MetOp",
            )
        if len(u_obs) > 0:
            ax.hist(
                u_obs,
                bins=bins,
                density=True,
                alpha=0.5,
                color="#FF5722",
                label="UFS",
            )

        ax.set_title(f"Ch {ch}\n{wn:.1f} cm⁻¹", fontsize=7, pad=2)
        ax.tick_params(axis="both", labelsize=5)
        ax.set_xlabel("BT (K)", fontsize=5)
        ax.set_ylabel("Density", fontsize=5)

        # Only add legend on first subplot
        if idx == 0:
            ax.legend(fontsize=6, loc="upper right")

    page_label = f"Page {page}"
    ch_range = f"Ch {channels[0]}–{channels[-1]}"
    fig.suptitle(
        f"MetOpIASI vs UFSObsSat — BT Histogram Comparison ({page_label}, {ch_range})\n"
        f"Blue = MetOpIASI (L1C raw), Orange = UFSObsSat (GSI)\n"
        f"metop-c, 2025-01-15 12:00 UTC ±2h",
        fontsize=14,
        y=0.995,
    )
    fig.savefig(outpath, dpi=150, bbox_inches="tight")
    plt.close(fig)
    print(f"Saved: {outpath}")


if __name__ == "__main__":
    time_val = datetime(2025, 1, 15, 12, 0)

    df_metop, df_ufs, common = fetch_data(time_val)
    if not common:
        print("ERROR: No common channels!")
        sys.exit(1)

    # Split into pages of 100
    page_size = 100
    for i in range(0, len(common), page_size):
        page_channels = common[i : i + page_size]
        page_num = i // page_size + 1
        outpath = f"/localhome/local-ngeneva/earth2studio/sanity_check_iasi_histograms_p{page_num}.png"
        plot_histogram_grid(df_metop, df_ufs, page_channels, page_num, outpath)

    print(f"\nDone! {(len(common) - 1) // page_size + 1} page(s) generated.")

@greptile-apps
Contributor

greptile-apps bot commented Apr 10, 2026

Greptile Summary

Adds MetOpIASI as a new DataFrameSource for EUMETSAT IASI Level 1C infrared brightness temperature data, following the same eumdac-based download and struct-based binary parsing pattern as the existing MetOp AMSU-A/MHS/AVHRR sources. The EPS native binary parsing (GRH/MPHR/GIADR/MDR record walking, GIADR scale factors, inverse Planck conversion) is logically correct and well tested with a synthetic binary builder. All P2 findings are performance and robustness suggestions that do not affect correctness.
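
For readers unfamiliar with the record walking mentioned above, here is a minimal sketch of the idea on synthetic bytes. It is not the parser from this PR. The 20-byte big-endian Generic Record Header layout (class, instrument group, subclass, subclass version, 4-byte record size, two 6-byte timestamps) and the class codes are assumptions of this sketch; `walk_records` and `make_record` are hypothetical helpers.

```python
import struct

# Assumed GRH layout: four 1-byte fields, uint32 record size, two 6-byte times.
GRH = struct.Struct(">BBBBI6s6s")  # 20 bytes, big-endian, no padding
RECORD_CLASSES = {1: "MPHR", 5: "GIADR", 8: "MDR"}  # assumed class codes


def walk_records(buf):
    """Yield (offset, record_class, subclass, size) for each record in buf."""
    offset = 0
    while offset + GRH.size <= len(buf):
        rec_class, _group, subclass, _version, size, _t0, _t1 = GRH.unpack_from(
            buf, offset
        )
        # The size field covers header plus payload; reject inconsistent sizes.
        if size < GRH.size or offset + size > len(buf):
            raise ValueError(f"bad record size {size} at offset {offset}")
        yield offset, rec_class, subclass, size
        offset += size  # jump straight to the next header


def make_record(rec_class, payload):
    """Build a synthetic record (zeroed timestamps) for testing the walker."""
    header = GRH.pack(rec_class, 0, 0, 1, GRH.size + len(payload), b"\0" * 6, b"\0" * 6)
    return header + payload


data = make_record(1, b"a" * 10) + make_record(5, b"") + make_record(8, b"x" * 100)
for offset, rec_class, _subclass, size in walk_records(data):
    print(offset, RECORD_CLASSES.get(rec_class, "other"), size)
```

Walking by the size field means no absolute offsets are needed for the records themselves, which is the same motivation given for avoiding hardcoded MDR field offsets.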

Confidence Score: 5/5

Safe to merge — all findings are P2 style/performance suggestions with no correctness impact.

The binary parsing logic, Planck conversion constants, unit conversions, wavenumber formula, and geolocation handling are all correct and validated by a thorough test suite. Remaining comments are performance optimisations and a logging improvement — none affect data correctness.

earth2studio/data/metop_iasi.py — spectral extraction loop and per-MDR offset recomputation are worth addressing for production performance on full orbits.

Important Files Changed

| Filename | Overview |
| --- | --- |
| earth2studio/data/metop_iasi.py | Core implementation: EPS binary parsing, wavenumber/radiance-to-BT conversion, and async download are logically correct; inner spectral extraction loop and per-MDR offset recomputation are performance concerns. |
| test/data/test_metop_iasi.py | Comprehensive synthetic binary builder covers GRH/MPHR/GIADR/MDR records; unit tests cover parsing, geolocation, quality flags, empty files, and Planck conversion; mock tests cover the full call path. |
| earth2studio/lexicon/metop_iasi.py | Minimal lexicon mapping 'iasi' to identity modifier; band table and docstring are accurate; no issues. |
| earth2studio/data/__init__.py | Single import line added for MetOpIASI; no issues. |
| earth2studio/lexicon/__init__.py | Single import line added for MetOpIASILexicon; no issues. |

Reviews (1): Last reviewed commit: "feat: add MetOp IASI Level 1C infrared b..."

- Vectorize spectral extraction with np.frombuffer + indexing (was per-element struct.unpack)
- Flatten geo/angle/quality loops into single slice assignments
- Compute MDR field offsets once and reuse as relative offsets
- Add logger.debug for silent entry-listing failure
- Widen GIADR int16 to int32 at parse time to prevent latent truncation
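
The first bullet can be illustrated with a small sketch that contrasts per-element `struct.unpack_from` with a single `np.frombuffer` call followed by fancy indexing. It assumes the spectrum is stored as contiguous big-endian 16-bit integers; the function names and the synthetic buffer are hypothetical and do not come from this PR.

```python
import struct

import numpy as np


def extract_channels_loop(buf, offset, channel_indices):
    """Baseline: one struct call per selected channel."""
    return [struct.unpack_from(">h", buf, offset + 2 * i)[0] for i in channel_indices]


def extract_channels_vectorized(buf, offset, n_samples, channel_indices):
    """View the whole spectrum once, then pick channels by index."""
    # frombuffer makes a read-only view over the bytes; no copy happens here.
    spectrum = np.frombuffer(buf, dtype=">i2", count=n_samples, offset=offset)
    # Fancy indexing copies only the selected samples; widening to int32
    # leaves headroom for later integer arithmetic.
    return spectrum[np.asarray(channel_indices)].astype(np.int32)


# Synthetic buffer: 4 header bytes followed by ten big-endian int16 samples.
payload = b"\xff" * 4 + np.arange(-5, 5, dtype=">i2").tobytes()
picked = [0, 3, 9]
print(extract_channels_loop(payload, 4, picked))
print(extract_channels_vectorized(payload, 4, 10, picked))
```

Both functions return the same values for the same indices. The vectorized form moves the per-sample work out of the Python interpreter, which matters when selecting channels from every IFOV in an orbit.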
@NickGeneva
Collaborator Author

/blossom-ci
