feat: add JPSS CrIS FSR Level 1 spectral radiance DataFrameSource (PAINFUL) #809
NickGeneva wants to merge 12 commits into NVIDIA:main
Add JPSS_CRIS DataFrameSource for Cross-track Infrared Sounder (CrIS) Full Spectral Resolution (FSR) Level 1 radiance data from NOAA S3 buckets (NOAA-20, NOAA-21, Suomi NPP). Includes JPSSCrISLexicon, unit tests, validation script, and documentation updates.
- 2223 spectral channels (LW: 717, MW: 869, SW: 637)
- Paired HDF5 file decoding (SDR radiance + GEO geolocation)
- Granule key matching for SDR/GEO pairing
- Quality flags combined from per-band QF3 into uint16
- Updated create-data-source skill with two-phase test verification
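The granule key matching mentioned above can be illustrated with a minimal sketch. It assumes the usual JPSS file naming pattern (`<product>_<sat>_d<date>_t<start>_e<end>_b<orbit>_c<created>_...h5`); the regular expression, the function names `granule_key` and `pair_sdr_geo`, and the example file names are hypothetical and are not taken from the PR.

```python
import re

# Assumed JPSS-style file name layout; the PR's actual parser may differ.
_JPSS_NAME = re.compile(
    r"^(?P<product>[A-Z0-9-]+)_(?P<sat>[a-z0-9]+)_d(?P<date>\d{8})"
    r"_t(?P<start>\d{7})_e(?P<end>\d{7})_b(?P<orbit>\d{5})_c\d+_.*\.h5$"
)


def granule_key(path):
    """Return (sat, date, start, end, orbit) for a file path, or None."""
    name = path.rsplit("/", 1)[-1]
    match = _JPSS_NAME.match(name)
    if match is None:
        return None
    return (match["sat"], match["date"], match["start"],
            match["end"], match["orbit"])


def pair_sdr_geo(sdr_paths, geo_paths):
    """Pair each SDR file with the GEO file that shares its granule key."""
    geo_by_key = {}
    for geo in geo_paths:
        key = granule_key(geo)
        if key is not None:
            geo_by_key[key] = geo
    pairs = []
    for sdr in sdr_paths:
        key = granule_key(sdr)
        # SDR files without a matching GEO file are skipped.
        if key is not None and key in geo_by_key:
            pairs.append((sdr, geo_by_key[key]))
    return pairs
```

The key deliberately ignores the product prefix and the creation timestamp, since those are the parts expected to differ between the radiance file and its geolocation file.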
The validation script should not be committed per project conventions.
- Add subsample parameter to JPSS_CRIS for FOR-level spatial sub-sampling (selects every Nth cross-track Field-of-Regard, default 1 = no sub-sampling)
- Remove ~160 lines of unreachable dead code after _decode_hdf5 return
- Fix mypy union-attr errors on fallback S3 listing calls
- Add type annotation for unique_mask in _compile_dataframe
- Add test_jpss_cris_subsample_mock verifying subsample=1/3/5 row counts
…ubsampling
Subsample now selects every Nth granule instead of every Nth FOR along the cross-track dimension. This provides more uniform spatial coverage while still reducing data volume.
Greptile Summary
This PR adds

| Filename | Overview |
|---|---|
| earth2studio/data/jpss_cris.py | New 1291-line DataFrameSource for JPSS CrIS FSR L1 brightness temperatures; sound architecture mirroring JPSS_ATMS, with a few minor correctness nits in QF packing and dead code in _SDR_QF_KEYS |
| earth2studio/lexicon/jpss.py | Adds JPSSCrISLexicon; CRIS_BAND_RANGES upper bounds are off by 2 channels (states 4 high-end guards in comment, actual layout is 2) |
| test/data/test_jpss_cris.py | Comprehensive offline mock tests plus network slow-tests; covers apodized/unapodized paths, subsampling, schema field selection, validation, and filename parsing |
| earth2studio/data/__init__.py | One-line export of JPSS_CRIS (correct) |
| earth2studio/lexicon/__init__.py | Exports JPSSCrISLexicon alongside existing JPSS lexicons (correct) |
Reviews (2): Last reviewed commit: "Merge branch 'main' into ngeneva/cris"
Switch JPSS_CRIS channel_index from 0-based sequential (0..2222) to GSI sensor_chan numbering (1..2219) so that channel indices are directly comparable with UFSObsSat crisfsr data. The GSI convention numbers LWIR channels 1-713, then MWIR 714-1582 and SWIR 1583-2219, omitting four LWIR band-edge channels (1095.625-1097.5 cm-1) that are not assimilated. Those four channels are assigned sensor_chan 0 as a sentinel. Add _CRIS_GSI_SENSOR_CHAN module-level lookup table. Update docstrings and test assertions to reflect the new range.
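As a rough illustration of the numbering described in this commit, the lookup table could be built as below. This is a sketch only: it assumes the four unassimilated LWIR band-edge channels sit at the high-wavenumber end of the 717-channel LWIR block, and the function name `build_sensor_chan` is hypothetical (the PR's own table is `_CRIS_GSI_SENSOR_CHAN`, whose construction is not shown here).

```python
import numpy as np


def build_sensor_chan(n_lw=717, n_mw=869, n_sw=637, n_lw_edge=4):
    """Map 0-based sequential channel positions to GSI-style sensor_chan.

    Assumption: the n_lw_edge sentinel channels are the last channels of
    the LWIR block; they keep the value 0.
    """
    chan = np.zeros(n_lw + n_mw + n_sw, dtype=np.int32)
    n_lw_sci = n_lw - n_lw_edge  # 713 numbered LWIR channels
    # LWIR channels are numbered 1..713.
    chan[:n_lw_sci] = np.arange(1, n_lw_sci + 1)
    # MWIR then SWIR continue the count: 714..1582 and 1583..2219.
    n_rest = n_mw + n_sw
    chan[n_lw:] = np.arange(n_lw_sci + 1, n_lw_sci + 1 + n_rest)
    return chan
```

With the default sizes this yields 2219 non-zero entries and four zeros, matching the ranges quoted in the commit message.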
Apply inverse Planck function to convert raw spectral radiance (mW m^-2 sr^-1 (cm^-1)^-1) from JPSS CrIS SDR files into brightness temperature (K), matching the units used by UFSObsSat. This makes the observation column directly comparable between the two sources. Adds module-level _CRIS_WAVENUMBER array and _radiance_to_bt() helper. Updates docstrings and test assertions to reflect BT output.
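The inverse Planck conversion described in this commit can be sketched as follows, for radiance in mW m^-2 sr^-1 (cm^-1)^-1 and wavenumber in cm^-1. The function names and constant names are illustrative and are not the PR's `_radiance_to_bt()` helper; the handling of non-positive radiance (returning NaN) is an assumption.

```python
import numpy as np

# Radiation constants in wavenumber units.
C1 = 1.191042972e-5  # 2*h*c^2, mW m^-2 sr^-1 (cm^-1)^-4
C2 = 1.438776877     # h*c/k, K cm


def planck_radiance(wavenumber, temperature):
    """Planck radiance B(nu, T) = C1 nu^3 / (exp(C2 nu / T) - 1)."""
    wavenumber = np.asarray(wavenumber, dtype=np.float64)
    return C1 * wavenumber**3 / np.expm1(C2 * wavenumber / temperature)


def radiance_to_bt(radiance, wavenumber):
    """Invert the Planck function: T = C2 nu / ln(1 + C1 nu^3 / L).

    Non-positive radiances have no physical brightness temperature and
    are returned as NaN (an assumption of this sketch).
    """
    radiance = np.asarray(radiance, dtype=np.float64)
    wavenumber = np.asarray(wavenumber, dtype=np.float64)
    valid = radiance > 0
    safe = np.where(valid, radiance, 1.0)  # placeholder avoids log of invalid values
    bt = C2 * wavenumber / np.log1p(C1 * wavenumber**3 / safe)
    return np.where(valid, bt, np.nan)
```

A quick sanity check is the round trip: feeding `planck_radiance(nu, T)` back through `radiance_to_bt` should recover `T` for any wavenumber in the CrIS bands.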
/blossom-ci

Description
Add JPSS_CRIS DataFrameSource for JPSS Cross-track Infrared Sounder (CrIS) Full Spectral Resolution (FSR) Level 1 spectral radiance observations from NOAA Open Data on AWS. This complements the existing JPSS_ATMS microwave sounder source by adding the infrared hyperspectral sounder from the same JPSS satellite constellation (NOAA-20, NOAA-21, Suomi NPP).
Data source details
Data licensing
Dependencies added
No new dependencies needed: uses h5py, s3fs, pandas, pyarrow, and numpy, which are already in core deps.
Changes
- earth2studio/data/jpss_cris.py (~1315 lines)
- JPSSCrISLexicon in earth2studio/lexicon/jpss.py (2223 channel mappings)
- crisfsr added to E2STUDIO_VOCAB
- test/data/test_jpss_cris.py with mock and network tests
- datasources_dataframe.rst
Key implementation details
- subsample parameter
- Hamming apodization ([0.23, 0.54, 0.23] kernel) with optional bypass via apodize=False
CrIS channel mapping: validation against UFS/GSI
The channel_index column uses the GSI sensor_chan numbering convention (1–2211 contiguous for science channels) to enable direct comparison with UFSObsSat CrIS observations.
(Guard bands dropped after apodization)
Apodization verification
The Hamming apodization implementation was verified against the CrIS SDR ATBD (JPSS 474-00032, Section 3.7.2, Equation 54). The 3-tap spectral convolution [0.23, 0.54, 0.23] is mathematically equivalent to interferogram-domain Hamming multiplication, which was confirmed numerically to machine precision (~1e-14) via DCT-I round-trip testing. Boundary handling at band edges uses reflect padding, which differs from the ATBD's one-sided formula, but this only affects guard channels that are trimmed in the output.
Verification sources consulted
read_cris.f90, setuprad.f90, crtm_interface.f90: confirmed BUFR CRCHNM channel numbers match CRTM sensor_channel via direct integer equality (no offset/transform)
Checklist
Dependencies
None — all required packages are already in core dependencies.
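For reference, the equivalence claimed under "Apodization verification" can be reproduced with a small NumPy-only sketch. It applies the 3-tap [0.23, 0.54, 0.23] kernel with reflect padding, then repeats the operation in the transform domain by multiplying with the Hamming window 0.54 + 0.46 cos(pi k / (N - 1)). The transform here is an FFT of the even-symmetric extension, which equals an unnormalized DCT-I; the function names are hypothetical and this is not the PR's test code.

```python
import numpy as np


def hamming_apodize(spectrum):
    """3-tap Hamming apodization along the last axis with reflect padding."""
    spectrum = np.asarray(spectrum, dtype=np.float64)
    pad = [(0, 0)] * (spectrum.ndim - 1) + [(1, 1)]
    padded = np.pad(spectrum, pad, mode="reflect")
    return (0.23 * padded[..., :-2]
            + 0.54 * padded[..., 1:-1]
            + 0.23 * padded[..., 2:])


def hamming_apodize_interferogram(spectrum):
    """Same result via window multiplication in the transform domain."""
    spectrum = np.asarray(spectrum, dtype=np.float64)
    n = spectrum.shape[-1]
    # Even (whole-sample symmetric) extension of length 2n - 2.
    even = np.concatenate([spectrum, spectrum[..., -2:0:-1]], axis=-1)
    coeffs = np.fft.rfft(even, axis=-1)  # n coefficients, DCT-I equivalent
    k = np.arange(n)
    window = 0.54 + 0.46 * np.cos(np.pi * k / (n - 1))
    out = np.fft.irfft(coeffs * window, n=even.shape[-1], axis=-1)
    return out[..., :n]
```

Comparing the two outputs with `np.allclose` on random input is a simple way to confirm that the spectral-domain kernel and the interferogram-domain window agree, including at the band edges where reflect padding applies.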