Add UFS tracing instrumentation to nuopc/cmeps driver#1075

Draft
NickSzapiro-NOAA wants to merge 17 commits into CICE-Consortium:main from NickSzapiro-NOAA:ufs_tracing_consortium

Conversation

@NickSzapiro-NOAA
Contributor

@NickSzapiro-NOAA NickSzapiro-NOAA commented Dec 16, 2025

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    Add UFS tracing instrumentation to nuopc/cmeps driver
  • Developer(s):
    @DusanJovic-NOAA
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    No changes in UFS regression tests (Add tracing instrumentation ufs-community/ufs-weather-model#2884)
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?
    • Yes
    • No
  • Does this PR update the Icepack submodule? If so, the Icepack submodule must point to a hash on Icepack's main branch.
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please document the changes in detail, including why the changes are made. This will become part of the PR commit log.

ufs-community/ufs-weather-model#2884 adds a simple tracing module to UFS and updates some subcomponents' nuopc drivers to produce a trace file that can be used to identify performance issues. The tracing module is not built or used by default; it can be enabled with the build option `-DUFS_TRACING=ON`. The generated trace files can be visualized with the chrome-tracing tool or, for example, the Perfetto UI online tool (https://ui.perfetto.dev/), as in ufs-community/ufs-weather-model#2883. Tracing has been useful in optimizing GFS runtime.
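To illustrate the kind of begin/end events such a tracing module writes, here is a minimal Python sketch. It is not the UFS implementation (which lives in the Fortran drivers); the `trace_region` helper, the microsecond time unit, and the numeric `ts`/`pid` values are assumptions made for illustration. Only the event field names (`name`, `ph`, `ts`, `pid`, `tid`) come from the trace excerpt posted later in this conversation.

```python
import io
import json
import time
from contextlib import contextmanager


@contextmanager
def trace_region(out, name, tid, pid=1):
    """Write a begin ("B") and an end ("E") event around the enclosed code.

    Hypothetical helper for illustration only.
    """
    def emit(phase):
        event = {
            "name": name,
            "ph": phase,
            # Timestamp in microseconds (an assumption about the time unit).
            "ts": time.monotonic_ns() // 1000,
            "pid": pid,
            "tid": tid,
        }
        # One comma-terminated event per line, as in the posted excerpt.
        out.write(json.dumps(event) + ",\n")

    emit("B")
    try:
        yield
    finally:
        emit("E")


# Usage: trace one phase into an in-memory buffer instead of a file.
buffer = io.StringIO()
with trace_region(buffer, "InitializeAdvertise", tid="cice"):
    time.sleep(0.001)  # stand-in for the real work of the phase

trace_lines = buffer.getvalue().splitlines()
```

Writing the end event in a `finally` clause keeps every begin event paired even if the traced region raises, which is what a timeline viewer needs to draw a closed slice.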

@NickSzapiro-NOAA
Contributor Author

This is a draft PR because 1) the UFS PR hasn't been merged yet and 2) I want to ask whether the changes to the nuopc/cmeps cap are acceptable.

@dabail10 @anton-seaice, in particular: is there any objection to the UFS_TRACING ifdef at this level (or any other preferences or criticisms)? An alternative is to put the ifdefs at a lower layer, more like cice_wrapper_mod.F90 (with a wrapper like med_ufs_trace_wrapper_mod in NOAA-EMC/CMEPS#151).

Contributor

@anton-seaice anton-seaice left a comment

All ok from me, we might find that using the same driver code eventually becomes unmanageable, but for now happy to keep trying to keep it all together.

I should have asked more about the tracing work on Tuesday. Is there a summary repo / issue / notebook somewhere ?

Comment on lines +142 to +145
call ESMF_GridCompGet(gcomp, vm=vm,rc=rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return
call ESMF_VMGet(vm, localpet=mype, rc=rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return
Contributor

    if (ChkErr(rc,__LINE__,u_FILE_u)) return
    call ESMF_VMGet(vm, localpet=mype, rc=rc)
    if (ChkErr(rc,__LINE__,u_FILE_u)) return

Are these used anywhere?

Contributor Author

They set mype, which is needed before my_task and master_task become available in InitializeAdvertise.

Contributor

@dabail10 dabail10 left a comment

This generally looks fine, but we already have variables for this sort of thing in CICE.

if (ChkErr(rc,__LINE__,u_FILE_u)) return

#ifdef UFS_TRACING
if (mype == 0) call ufs_trace_init()
Contributor

We should remain consistent with the rest of the code and check:

if (my_task == master_task)

Contributor Author

Yes, I thought the same. But the tracing starts before my_task and master_task are available in InitializeAdvertise.

Contributor

Can we call init_grid1 before this phase?

Contributor Author

Maybe not (?), as I think call input_data has to happen before call init_grid1, as in cice_init1, so that all the namelist variables are available.

But it's more that this conflicts with the intent to start tracing as soon as possible in all subcomponents (before calling much else), to get the fullest timeline of the run, and for the instrumentation to be "unobtrusive".

fwiw, I think master_task and mype==0 are really the same since

Contributor Author

I guess these lines could be moved up from InitializeAdvertise to SetServices if there is a really strong preference:

call ESMF_GridCompGet(gcomp, vm=vm, rc=rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return
call ESMF_VMGet(vm, mpiCommunicator=lmpicom, localPet=localPet, PetCount=npes, rc=rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return
#ifdef CESMCOUPLED
call ESMF_VMGet(vm, pet=localPet, peCount=nthrds, rc=rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return
if (nthrds==1) then
call NUOPC_CompAttributeGet(gcomp, "nthreads", value=cvalue, rc=rc)
if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=u_FILE_u)) return
read(cvalue,*) nthrds
endif
!$ call omp_set_num_threads(nthrds)
#endif
!----------------------------------------------------------------------------
! Initialize cice communicators
!----------------------------------------------------------------------------
call init_communicate(lmpicom) ! initial setup for message passing
mastertask = .false.
if (my_task == master_task) mastertask = .true.

Contributor

I think it is not guaranteed on all systems that master_task = 0. Does the call to ufs_trace_init really have to happen at this phase? I realize there are often issues with circular dependencies. I feel like the tracing doesn't need to happen before the scatter and PE decomposition.

Contributor Author

I think it's just easier to trace fully than to "be smart" about what needs to be traced.

One example where accounting for and separating the different phases proved very useful was in trying to reduce GFS initialization time (ufs-community/ufs-weather-model#2831). For instance, removing some advertised fields from FV3 reduced InitializeAdvertise (top to bottom):
image

character(*), parameter :: u_FILE_u = &
__FILE__

integer :: mype = -1
Contributor

There is already my_task available in the code.

@DeniseWorthen
Contributor

DeniseWorthen commented Dec 16, 2025

All ok from me, we might find that using the same driver code eventually becomes unmanageable, but for now happy to keep trying to keep it all together.

I should have asked more about the tracing work on Tuesday. Is there a summary repo / issue / notebook somewhere ?

I think both CESM and UFS have benefited greatly from sharing common nuopc_caps between MOM6, CICE6 and WW3, coupled to CMEPS (and CDEPS). I think we should strive to maintain that interoperability.

But when there are features or options that are really used by only one system, I think we all benefit from making those use cases as unobtrusive as possible, thus the existence of various wrapper mods in the shared NUOPC caps. That was my thinking in asking Nick to get feedback from the other cap users.

The ufs-tracing was something one of our colleagues developed during the recent push to get the GFSv17 ready for operations. The existing ESMF-based profiling didn't provide the sort of "time line" feature he wanted, so he developed a very basic tracing code that could sit inside each component. I think it is an exceptionally useful little feature, but it would need a CESM user to enable it across their infrastructure system (driver etc).

@NickSzapiro-NOAA
Contributor Author

I should have asked more about the tracing work on Tuesday. Is there a summary repo / issue / notebook somewhere ?

@anton-seaice Not afaik, but it's intended to be a straightforward setup. Each component writes to its own trace file, and the files are then combined (ufs_tracing/combine_traces.sh) to look like:

{"name":"SetServices", "ph":"B", "ts":"1416357460", "pid":"1", "tid":"cice"},
{"name":"SetServices", "ph":"E", "ts":"1416359964", "pid":"1", "tid":"cice"},
{"name":"InitializeP0", "ph":"B", "ts":"1416431949", "pid":"1", "tid":"cice"},
{"name":"InitializeP0", "ph":"E", "ts":"1416432126", "pid":"1", "tid":"cice"},
{"name":"InitializeAdvertise", "ph":"B", "ts":"1419655739", "pid":"1", "tid":"cice"},
{"name":"InitializeAdvertise", "ph":"E", "ts":"1425338210", "pid":"1", "tid":"cice"},
{"name":"InitializeRealize", "ph":"B", "ts":"1498516061", "pid":"1", "tid":"cice"},
{"name":"InitializeRealize", "ph":"E", "ts":"1506396931", "pid":"1", "tid":"cice"},
{"name":"ModelAdvance", "ph":"B", "ts":"1580100026", "pid":"1", "tid":"cice"},
{"name":"ModelAdvance", "ph":"E", "ts":"1581058909", "pid":"1", "tid":"cice"},
...
{"name":"ModelAdvance", "ph":"B", "ts":"2194075055", "pid":"1", "tid":"cice"},
{"name":"ModelAdvance", "ph":"E", "ts":"2196529209", "pid":"1", "tid":"cice"},
{"name":"ModelFinalize", "ph":"B", "ts":"2196531389", "pid":"1", "tid":"cice"},
{"name":"ModelFinalize", "ph":"E", "ts":"2196531441", "pid":"1", "tid":"cice"},

Then the combined file can be opened with the Perfetto UI online tool (https://ui.perfetto.dev/) and navigated with the W/A/S/D keys.
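Besides viewing the timeline, the begin/end pairs in the events listed above can be reduced to per-phase elapsed times. The following is a minimal Python sketch, not part of ufs_tracing; the helper names `parse_events` and `phase_durations` are hypothetical. It assumes one comma-terminated event per line, non-nested phases per component, and that `ts` differences are elapsed time in whatever unit the tracer writes (microseconds if it follows the Chrome trace event convention).

```python
import json
from collections import defaultdict

# A subset of the events listed above; the "..." line marks elided events.
TRACE_TEXT = """\
{"name":"SetServices", "ph":"B", "ts":"1416357460", "pid":"1", "tid":"cice"},
{"name":"SetServices", "ph":"E", "ts":"1416359964", "pid":"1", "tid":"cice"},
{"name":"InitializeAdvertise", "ph":"B", "ts":"1419655739", "pid":"1", "tid":"cice"},
{"name":"InitializeAdvertise", "ph":"E", "ts":"1425338210", "pid":"1", "tid":"cice"},
{"name":"ModelAdvance", "ph":"B", "ts":"1580100026", "pid":"1", "tid":"cice"},
{"name":"ModelAdvance", "ph":"E", "ts":"1581058909", "pid":"1", "tid":"cice"},
...
{"name":"ModelAdvance", "ph":"B", "ts":"2194075055", "pid":"1", "tid":"cice"},
{"name":"ModelAdvance", "ph":"E", "ts":"2196529209", "pid":"1", "tid":"cice"},
"""


def parse_events(text):
    """Parse comma-terminated event lines, skipping anything that is not an event."""
    events = []
    for line in text.splitlines():
        line = line.strip().rstrip(",")
        if line.startswith("{"):
            events.append(json.loads(line))
    return events


def phase_durations(events):
    """Sum elapsed time per (tid, name) by pairing each "B" with the next "E"."""
    open_ts = {}
    totals = defaultdict(int)
    for event in events:
        key = (event["tid"], event["name"])
        ts = int(event["ts"])  # timestamps are stored as strings in the excerpt
        if event["ph"] == "B":
            open_ts[key] = ts
        elif event["ph"] == "E" and key in open_ts:
            totals[key] += ts - open_ts.pop(key)
    return dict(totals)


durations = phase_durations(parse_events(TRACE_TEXT))
for (tid, name), elapsed in durations.items():
    print(f"{tid:6s} {name:20s} {elapsed:>10d}")
```

Repeated phases such as ModelAdvance are summed, so the result gives a quick per-phase total for comparing two runs without opening the viewer.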

@NickSzapiro-NOAA NickSzapiro-NOAA changed the title Add UFS tracing instrumentation to nuopc driver Add UFS tracing instrumentation to nuopc/cmeps driver Dec 17, 2025