feat: add MetOp IASI Level 1C infrared brightness temperature data source #811
NickGeneva wants to merge 6 commits into NVIDIA:main
Add MetOpIASI DataFrameSource for EUMETSAT IASI L1C calibrated spectral radiance data from the MetOp-A/B/C satellites. Parses the EPS native binary format with 8461 infrared channels (645-2760 cm⁻¹) and converts radiance to brightness temperature via the inverse Planck function. Includes MetOpIASILexicon, unit tests (22 passing + 4 network xfail), and documentation updates.
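For readers unfamiliar with the radiance-to-brightness-temperature step, here is a minimal sketch of an inverse Planck conversion in wavenumber space. It is not the code from this PR: the function names are hypothetical, and it assumes radiance is already scaled to W m⁻² sr⁻¹ (m⁻¹)⁻¹ with wavenumbers given in cm⁻¹.

```python
import numpy as np

# Physical constants in SI units (exact values under the 2019 SI definition).
H = 6.62607015e-34  # Planck constant, J s
C = 2.99792458e8    # speed of light, m / s
K_B = 1.380649e-23  # Boltzmann constant, J / K


def planck_radiance(wavenumber_cm, temperature_k):
    """Blackbody radiance in W m^-2 sr^-1 (m^-1)^-1 for wavenumbers in cm^-1."""
    nu = np.asarray(wavenumber_cm, dtype=np.float64) * 100.0  # cm^-1 -> m^-1
    return 2.0 * H * C**2 * nu**3 / np.expm1(H * C * nu / (K_B * temperature_k))


def brightness_temperature(wavenumber_cm, radiance):
    """Invert the Planck function; non-positive radiances map to NaN."""
    nu = np.asarray(wavenumber_cm, dtype=np.float64) * 100.0  # cm^-1 -> m^-1
    rad = np.asarray(radiance, dtype=np.float64)
    with np.errstate(divide="ignore", invalid="ignore"):
        bt = H * C * nu / (K_B * np.log1p(2.0 * H * C**2 * nu**3 / rad))
    return np.where(rad > 0, bt, np.nan)


# Round trip: a 250 K blackbody should come back as 250 K at every wavenumber.
wn = np.array([645.0, 900.0, 2760.0])
print(brightness_temperature(wn, planck_radiance(wn, 250.0)))
```

Using `expm1` and `log1p` keeps the round trip numerically stable at the short-wave end of the band, where the exponential term is large.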
Histograms
Comparison with UFSObsSat IASI: BT distributions match well for upper-atmosphere sounding channels (< 1 K bias). Window channels show a 5–15 K cold bias in MetOpIASI relative to UFSObsSat due to cloud contamination: MetOpIASI returns all FOVs (clear and cloudy), while GSI pre-thins observations. This is expected behavior for unfiltered L1C data.
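The bias figures above are distribution-level statistics. A rough sketch of how a per-channel mean difference could be computed from two long-format DataFrames follows. It assumes the `channel_index` and `observation` columns used in the plotting script; the helper name `per_channel_bias` is hypothetical, and the observations are not collocated, so this compares population means only.

```python
import pandas as pd


def per_channel_bias(df_a: pd.DataFrame, df_b: pd.DataFrame) -> pd.DataFrame:
    """Mean BT per channel for each source and their difference (a minus b)."""
    mean_a = df_a.groupby("channel_index")["observation"].mean().rename("mean_a")
    mean_b = df_b.groupby("channel_index")["observation"].mean().rename("mean_b")
    # Inner join keeps only channels present in both sources.
    out = pd.concat([mean_a, mean_b], axis=1, join="inner")
    out["bias"] = out["mean_a"] - out["mean_b"]
    return out


# Tiny synthetic example (values are made up for illustration).
df_a = pd.DataFrame(
    {"channel_index": [1, 1, 2, 2], "observation": [220.0, 222.0, 270.0, 280.0]}
)
df_b = pd.DataFrame(
    {"channel_index": [1, 2, 2, 3], "observation": [221.0, 285.0, 287.0, 250.0]}
)
print(per_channel_bias(df_a, df_b))
```

A negative `bias` means source `a` is colder on average, which is the sign expected for window channels when `a` includes cloudy FOVs.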
Plotting Script

import sys
from datetime import datetime, timedelta

import matplotlib

matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np

from earth2studio.data import MetOpIASI, UFSObsSat


def fetch_data(time_val):
    """Fetch both sources and return aligned DataFrames with common channels."""
    # UFS
    ds_ufs = UFSObsSat(cache=True, verbose=True, time_tolerance=timedelta(hours=2))
    df_ufs = ds_ufs(time_val, "iasi")
    df_ufs = df_ufs[df_ufs["satellite"] == "metop-c"].copy()
    ufs_ch_array = np.array(sorted(df_ufs["channel_index"].unique()), dtype=np.int32)
    print(f"UFS metop-c channels (1-based): {len(ufs_ch_array)}")

    # MetOpIASI.channel_indices accepts 0-based indices (0..8460),
    # UFS channels are already 1-based, so subtract 1 for MetOpIASI input.
    ds_metop = MetOpIASI(
        satellite="metop-c",
        channel_indices=ufs_ch_array - 1,
        time_tolerance=timedelta(hours=2),
        cache=True,
        verbose=True,
    )
    df_metop = ds_metop(time_val, "iasi")

    # Both sources output 1-based channel numbers; no shift needed
    common = sorted(
        set(df_metop["channel_index"].unique()) & set(df_ufs["channel_index"].unique())
    )
    print(f"Common channels (1-based): {len(common)}")
    return df_metop, df_ufs, common


def plot_histogram_grid(df_metop, df_ufs, channels, page, outpath):
    """Plot a 10x10 grid of overlaid BT histograms."""
    nrows, ncols = 10, 10
    fig, axes = plt.subplots(nrows, ncols, figsize=(28, 24))
    fig.subplots_adjust(hspace=0.55, wspace=0.3)

    for idx in range(nrows * ncols):
        row, col = divmod(idx, ncols)
        ax = axes[row, col]
        if idx >= len(channels):
            ax.set_visible(False)
            continue

        ch = channels[idx]
        # ch is 1-based, so channel 1 maps to 645.00 cm⁻¹ and channel 8461 to 2760.00 cm⁻¹
        wn = (2581 + (ch - 2)) * 25.0 / 100.0
        m_obs = (
            df_metop.loc[df_metop["channel_index"] == ch, "observation"].dropna().values
        )
        u_obs = df_ufs.loc[df_ufs["channel_index"] == ch, "observation"].dropna().values
        if len(m_obs) == 0 and len(u_obs) == 0:
            ax.set_visible(False)
            continue

        # Common bin range from both sources
        all_obs = np.concatenate([v for v in (m_obs, u_obs) if len(v) > 0])
        vmin, vmax = np.percentile(all_obs, [1, 99])
        bins = np.linspace(vmin, vmax, 50)

        if len(m_obs) > 0:
            ax.hist(
                m_obs,
                bins=bins,
                density=True,
                alpha=0.5,
                color="#2196F3",
                label="MetOp",
            )
        if len(u_obs) > 0:
            ax.hist(
                u_obs,
                bins=bins,
                density=True,
                alpha=0.5,
                color="#FF5722",
                label="UFS",
            )

        ax.set_title(f"Ch {ch}\n{wn:.1f} cm⁻¹", fontsize=7, pad=2)
        ax.tick_params(axis="both", labelsize=5)
        ax.set_xlabel("BT (K)", fontsize=5)
        ax.set_ylabel("Density", fontsize=5)

        # Only add legend on first subplot
        if idx == 0:
            ax.legend(fontsize=6, loc="upper right")

    page_label = f"Page {page}"
    ch_range = f"Ch {channels[0]}–{channels[-1]}"
    fig.suptitle(
        f"MetOpIASI vs UFSObsSat — BT Histogram Comparison ({page_label}, {ch_range})\n"
        f"Blue = MetOpIASI (L1C raw), Orange = UFSObsSat (GSI)\n"
        f"metop-c, 2025-01-15 12:00 UTC ±2h",
        fontsize=14,
        y=0.995,
    )
    fig.savefig(outpath, dpi=150, bbox_inches="tight")
    plt.close(fig)
    print(f"Saved: {outpath}")


if __name__ == "__main__":
    time_val = datetime(2025, 1, 15, 12, 0)
    df_metop, df_ufs, common = fetch_data(time_val)
    if not common:
        print("ERROR: No common channels!")
        sys.exit(1)

    # Split into pages of 100
    page_size = 100
    for i in range(0, len(common), page_size):
        page_channels = common[i : i + page_size]
        page_num = i // page_size + 1
        outpath = f"/localhome/local-ngeneva/earth2studio/sanity_check_iasi_histograms_p{page_num}.png"
        plot_histogram_grid(df_metop, df_ufs, page_channels, page_num, outpath)

    print(f"\nDone! {(len(common) - 1) // page_size + 1} page(s) generated.")
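The script mixes 0-based indices (the `channel_indices` input) with 1-based channel numbers (the DataFrame output), which is easy to get wrong. The sketch below spells out the mapping. It assumes an evenly spaced grid, inferred from the 8461 channels over 645-2760 cm⁻¹ stated in the PR; the helper names are hypothetical.

```python
import numpy as np

N_CHANNELS = 8461         # channel count stated in the PR
FIRST_WAVENUMBER = 645.0  # cm^-1, wavenumber of channel 1
SPACING = 0.25            # cm^-1, i.e. (2760 - 645) / (8461 - 1)


def channel_to_wavenumber(channel_1based):
    """Wavenumber in cm^-1 for 1-based channel numbers (1..8461)."""
    ch = np.asarray(channel_1based)
    if np.any((ch < 1) | (ch > N_CHANNELS)):
        raise ValueError("channel numbers must lie in 1..8461")
    return FIRST_WAVENUMBER + (ch - 1) * SPACING


def to_zero_based(channel_1based):
    """Convert 1-based channel numbers to 0-based indices (0..8460)."""
    return np.asarray(channel_1based) - 1


print(channel_to_wavenumber([1, 8461]))  # first and last channel
print(to_zero_based([1, 8461]))
```

Checking the two end channels against the documented band edges is a quick guard against off-by-one errors in either direction.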
Greptile Summary

| Filename | Overview |
|---|---|
| earth2studio/data/metop_iasi.py | Core implementation: EPS binary parsing, wavenumber/radiance-to-BT conversion, and async download are logically correct; inner spectral extraction loop and per-MDR offset recomputation are performance concerns. |
| test/data/test_metop_iasi.py | Comprehensive synthetic binary builder covers GRH/MPHR/GIADR/MDR records; unit tests cover parsing, geolocation, quality flags, empty files, and Planck conversion; mock tests cover the full call path. |
| earth2studio/lexicon/metop_iasi.py | Minimal lexicon mapping 'iasi' to identity modifier; band table and docstring are accurate; no issues. |
| earth2studio/data/__init__.py | Single import line added for MetOpIASI; no issues. |
| earth2studio/lexicon/__init__.py | Single import line added for MetOpIASILexicon; no issues. |
Reviews (1): Last reviewed commit: "feat: add MetOp IASI Level 1C infrared b..."
- Vectorize spectral extraction with np.frombuffer + indexing (was per-element struct.unpack)
- Flatten geo/angle/quality loops into single slice assignments
- Compute MDR field offsets once and reuse as relative offsets
- Add logger.debug for silent entry-listing failure
- Widen GIADR int16 to int32 at parse time to prevent latent truncation
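To illustrate the first item, here is a small sketch contrasting per-element `struct` unpacking with a single `np.frombuffer` view plus fancy indexing. The buffer layout, offsets, and function names are invented for the example, and big-endian int16 samples are an assumption rather than something taken from the PR code.

```python
import struct

import numpy as np


def extract_per_element(buf: bytes, offset: int, channels) -> list:
    """Slow path: one struct call per requested channel."""
    return [struct.unpack_from(">h", buf, offset + 2 * int(c))[0] for c in channels]


def extract_vectorized(buf: bytes, offset: int, n_total: int, channels) -> np.ndarray:
    """Fast path: view the whole spectrum once, then index the wanted channels."""
    spectrum = np.frombuffer(buf, dtype=">i2", count=n_total, offset=offset)
    # Widen to int32 so later integer arithmetic cannot overflow int16.
    return spectrum[np.asarray(channels)].astype(np.int32)


# Synthetic record: an 8-byte header followed by 16 big-endian int16 samples.
header = b"\x00" * 8
payload = struct.pack(">16h", *range(-8, 8))
record = header + payload
wanted = [0, 5, 15]

print(extract_per_element(record, 8, wanted))
print(extract_vectorized(record, 8, 16, wanted))
```

Both paths return the same values; the vectorized one avoids Python-level loops, which matters when thousands of channels are read per IFOV.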
/blossom-ci



Description
Add `MetOpIASI` DataFrameSource for EUMETSAT IASI Level 1C calibrated infrared spectral radiance data from MetOp-A/B/C polar-orbiting satellites. The source parses EPS native binary format files, extracting per-IFOV geolocation, viewing geometry, quality flags, and spectral radiance for user-selected channels, converting to brightness temperature via the inverse Planck function.

Data source details
`EUMETSAT_CONSUMER_KEY` + `EUMETSAT_CONSUMER_SECRET` env vars

Data licensing

Dependencies added
No new dependencies needed: `eumdac>=3.1.0` is already in the `data` optional dependency group.

Key implementation details
- `channel_indices` parameter (0-based, default 100 representative channels) to avoid loading all 8461 channels per IFOV
- `_compute_mdr_field_offsets()` to handle variable-size MDR header fields (no hardcoded offsets)
- `eumdac` for download, struct-based binary parsing, long-format DataFrame output

Validation
See sanity-check validation in PR comments below.
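As background for the long-format DataFrame output listed under the key implementation details, the sketch below shows one way a per-IFOV brightness temperature array could be flattened to one row per (IFOV, channel) pair. The `lat` and `lon` column names and the helper `to_long_format` are assumptions for illustration; only `channel_index` and `observation` appear in the plotting script.

```python
import numpy as np
import pandas as pd


def to_long_format(lat, lon, bt, channel_indices_0based) -> pd.DataFrame:
    """Flatten an (n_ifov, n_channels) BT array into one row per IFOV and channel."""
    bt = np.asarray(bt, dtype=np.float64)
    n_ifov, n_ch = bt.shape
    channels = np.asarray(channel_indices_0based) + 1  # report 1-based numbers
    return pd.DataFrame(
        {
            "lat": np.repeat(lat, n_ch),                 # each IFOV repeated per channel
            "lon": np.repeat(lon, n_ch),
            "channel_index": np.tile(channels, n_ifov),  # channel list repeated per IFOV
            "observation": bt.reshape(-1),               # row-major: IFOV-major order
        }
    )


# Two IFOVs, three selected channels (values are made up).
df = to_long_format(
    lat=[10.0, 20.0],
    lon=[1.0, 2.0],
    bt=[[200.0, 210.0, 220.0], [201.0, 211.0, 221.0]],
    channel_indices_0based=[0, 99, 8460],
)
print(df)
```

The long layout trades memory for convenience: filtering by channel or joining against another source, as the histogram script does, becomes a plain column comparison.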
Checklist