Sync to CICE-Consortium (2026-02-08) #108

Open
NickSzapiro-NOAA wants to merge 15 commits into NOAA-EMC:develop from NickSzapiro-NOAA:sync_cice_2026-02

@NickSzapiro-NOAA
Collaborator

@NickSzapiro-NOAA NickSzapiro-NOAA commented Feb 10, 2026

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    Sync CICE-Consortium/main into EMC fork, including baseline changes for zapping residual ice, bug fixes in history variables, and new CMIP7 history variables. Also adds history restart feature.
  • Developer(s):
    See PRs at CICE-Consortium
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    UFS regression testing (Update CICE (2026-02) ufs-community/ufs-weather-model#3086) and preceding CICE-Consortium testing
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?
    • Yes
    • No
  • Does this PR update the Icepack submodule? If so, the Icepack submodule must point to a hash on Icepack's main branch.
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please document the changes in detail, including why the changes are made. This will become part of the PR commit log.

EMC/CICE sync, including baseline changes for zapping residual ice, bug fixes in history variables, and new CMIP7 history variables. Also adds history restart feature. Closes #109

eclare108213 and others added 13 commits November 26, 2025 11:15
…rs to remove residual ice (CICE-Consortium#1067)

Removes residual amounts of ice that are not otherwise handled by the numerics. The controlling parameters (itd_area_min and itd_mass_min, implemented in Icepack) set minimum ice area and mass values below which all ice is removed following the thermodynamics and ridging calculations. For the B-grid, these parameters are currently set to the dynamics stability minima, which are being reduced to extremely small values based on testing in multiple modeling systems. If needed, users can revert these parameters to the original, larger values by adding them to ice_in. Setting them to 0 turns off the new zapping completely. These parameters are set to the larger, original values in the C-grid test scripts, pending further work.

This updates Icepack and changes answers.
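The thresholding described above can be sketched in a few lines of Python. This is a minimal illustration, not the Icepack implementation: the function name, the threshold values, and the choice to zap when either quantity falls below its minimum are all assumptions made for the example.

```python
def zap_residual(aice, mass, area_min, mass_min):
    """Return (aice, mass) with residual ice removed.

    Assumption: ice is zapped when either quantity falls below its
    threshold; the real Icepack test may combine them differently.
    """
    if aice < area_min or mass < mass_min:
        return 0.0, 0.0
    return aice, mass


# Hypothetical threshold values, chosen only for illustration.
print(zap_residual(1.0e-9, 1.0e-8, area_min=1.0e-7, mass_min=1.0e-6))
print(zap_residual(0.5, 400.0, area_min=1.0e-7, mass_min=1.0e-6))
# Thresholds of 0 never trigger for non-negative inputs, so zapping is off.
print(zap_residual(1.0e-9, 1.0e-8, area_min=0.0, mass_min=0.0))
```

With both thresholds at 0 the comparison can never be true for non-negative area and mass, which matches the statement that setting the parameters to 0 turns the zapping off.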
After an upgrade, several of Carpenter's modules were removed. These changes update the modules and software versions used to compile and run CICE.
There is a great deal of confusion about how various history variables are time-averaged, e.g. SIMIP history output implementation CICE-Consortium#1038, Fixes for sitemptop, sitempbot, and sitempsnic. CICE-Consortium#1054. This PR attempts to clarify the situation. These averages are also relevant for conservative coupling.
…ormat) (CICE-Consortium#1079)

Add ability to read an extended grid (supported for pop netcdf file format)

Add subroutine popgrid_nc_ext to read an extended grid pop netcdf file
Add 'nc_ext' option to grid_format namelist
The extended grid will apply to the kmt file as well as these are specified by the same grid_format namelist
Modify gridbox_verts to operate on a local array instead of a global array, this should improve performance and removes redundant extrapolation calculations. This approach also supports both regular and extended grid reads.
The implementation largely duplicates subroutine popgrid_nc but for an extended grid in subroutine popgrid_nc_ext. The extended grid represents the active points plus the full halo. As much as possible, the extended grid (LON, LAT, ANGLE, KMT) is read in on the halo instead of being computed. For some grid metrics (DXT, DYT, DXU, DYU, etc), extrapolation is still required onto the halo.

Remove some trailing blanks in other places as needed.
Adds a namelist flag to allow significant wave height to be passed into the ice model from a coupler. In addition, this PR moves wave_spec_height out of icepack interface argument lists, since it is initialized via icepack_init_parameters.

See CICE-Consortium/Icepack#545

Update Icepack to #0bcde255637a594

Update ice_step_mod.F90 in opticep unit test to be consistent with latest changes

---------

Co-authored-by: apcraig <anthony.p.craig@gmail.com>
Updated all of the variable names, long names, and units to correspond to the CMIP7 data request.
Added new variables requested in the CMIP7 data request.
Added documentation about the CMIP6 to CMIP7 update.
Simplified the accumulation of some fields where possible and added prognostic sea ice density.
Added accumulation of variables relative to aice_init or aice.
Bug fix for flwout (sifllwutop) where aice_init = 0, but aice > 0.
Bug fix for shortwave absorbed and albedo computation (more coming later)
Bug fix: Some variables that were scaled by aice, should be multiplied by aice (not aice_init) to get the _ai quantities,  including fswabs, fsens, flat, etc.
Removed f_CMIP flag and added set_nml.cmip option instead.
Added comment field for SIMIP variables that uses part of the description field in the CMIP data request table.
Added long_name field to address issue: time_bounds, lat?_bounds, lon?_bounds attributes CICE-Consortium#1057
Partly addresses aice versus aice_init aice vs. aice/aice_init factor in ice_history CICE-Consortium#1033
Partial fix for albedo variables [albedo]_ai history variables over 100% CICE-Consortium#1051
Addresses issue: Some CMIP variables are computed using a mix of U and T quantities CICE-Consortium#904
Add history restart to netcdf and pio IO options. Binary was not included due to the complexity of having to track history fields in binary files. History restart files are written automatically for history streams that are averaged and when a restart is written during the middle of a history accumulation period. There is one history restart file per history stream. Files are written in the restart directory using the history name, an appended "_r[histfreq]", and the model date. An ice_read_hist subroutine was added to the ice_history_write.F90 file. For binary, calling this returns with a warning message that history restarts are not implemented. When history restarts are read, the model will only read files and fields that are found and continue with the accumulator initialized to zero for fields that are not found. For production runs, this should work fine. If a user modifies the history streams in the middle of a run, then an assessment should be made of which fields are valid on the first restart run.

The history restart files are basically history files, written at double precision, writing the accumulated fields. In addition, some additional fields are written including time_beg, avgct, albcnt, and snwcnt which represent accumulation counters for time average history output.
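Why saving the accumulated sums and counters is enough for a bit-for-bit restart can be seen in a toy Python sketch. The dictionary keys and helper names here are hypothetical (only avgct echoes a counter named above), and a plain dict copy stands in for writing and reading the history restart file.

```python
def accumulate(state, samples):
    """Add samples to a running sum and bump the sample counter."""
    for x in samples:
        state["sum"] += x
        state["avgct"] += 1
    return state


def average(state):
    """Time average over everything accumulated so far."""
    return state["sum"] / state["avgct"]


samples = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

# Uninterrupted run over the whole averaging period.
full = accumulate({"sum": 0.0, "avgct": 0}, samples)

# Run stopped mid-period: the partial sum and counter are saved
# (standing in for the history restart file) and read back.
first = accumulate({"sum": 0.0, "avgct": 0}, samples[:3])
saved = dict(first)
resumed = accumulate(dict(saved), samples[3:])

print(average(full) == average(resumed))
```

Because the additions happen in the same order in both cases, the two averages are identical down to the last bit, which is the property the restart tests check.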

A new histall10d set_nml option was added that turns on 3 averaged history streams and all history fields. When used in a restart test, the scripts will verify bit-for-bit history files and history restart files across the restart. Several tests were added to the io_suite to include formal testing of bit-for-bit history restarts. Two fields, mlt_onset and frz_onset, are not turned on with histall10d because they do not restart properly and they are unable to restart bit-for-bit on the history file, see CICE-Consortium#1068.

Several history fields have a bug in them and have been written out incorrectly, and these bugs were fixed. The bug in these cases was that the fields were accumulated during the timestep across categories but were not zeroed out at the start of the timestep. As a result, those fields were accumulating over the entire run incorrectly. The fields that had to be zeroed out were evaps and evaps plus upNO, upNH, bTiz, bphi, iDi, and iki associated with bgc. The bit-for-bit history restart test discovered these errors.
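The class of bug described above, a field summed over categories without being reset each timestep, can be reproduced with a toy Python loop. The function and its arguments are hypothetical and each category simply contributes 1.0; this is not the CICE code.

```python
def run(nsteps, ncat, zero_each_step):
    """Return per-timestep totals of a field summed over categories."""
    field = 0.0
    totals = []
    for _ in range(nsteps):
        if zero_each_step:
            field = 0.0          # reset before the category loop
        for _ in range(ncat):
            field += 1.0         # each category contributes 1.0
        totals.append(field)
    return totals


print(run(3, 5, zero_each_step=True))   # [5.0, 5.0, 5.0]
print(run(3, 5, zero_each_step=False))  # [5.0, 10.0, 15.0]
```

Without the reset, the per-timestep value keeps growing over the run, which is the symptom the bit-for-bit history restart test exposed.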

Add a new namelist, write_histrest, to turn off history restart writing. The default is that history restarts are on.

Update set_nml.cmip to fix an error in f_apond_ai setting.
Update Copyright to 2026

Remove trailing whitespace

Update Icepack to #2f31ee37f3a70, Icepack v1.5.3
Bug fix for lwout in CESM driver
Also some FSD stuff for coupling
Fix define for sitimefrac
Add the CESM3 namelist changes
…ICE-Consortium#1089)

Update Icepack to #daa41638c6cef to include

Enforce minimum snow grain radius (CICE-Consortium#552)

If the snow grain radius is set to zero, possibly because of zapping small ice or if ice disappears mid-timestep, then updates of snow grain radius will produce NaNs. Snow grain radius is usually bounded between a min and max so this generally doesn't happen, but a recent coupled E3SM bgc run crashed with this error. While the error seems to be relatively rare, this bug fix changes answers when the snow grain radius is nonzero but still less than the minimum.
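A minimal Python sketch of bounding a value between a minimum and a maximum, as described above. The function name and the numeric bounds are placeholders chosen for illustration, not the Icepack parameters.

```python
def bound_radius(rsnw, rsnw_min, rsnw_max):
    """Clamp a snow grain radius into [rsnw_min, rsnw_max]."""
    return min(max(rsnw, rsnw_min), rsnw_max)


# Hypothetical bounds, for illustration only.
RSNW_MIN, RSNW_MAX = 50.0, 1500.0

print(bound_radius(0.0, RSNW_MIN, RSNW_MAX))    # 50.0: zero is lifted to the minimum
print(bound_radius(20.0, RSNW_MIN, RSNW_MAX))   # 50.0: nonzero but below the minimum
print(bound_radius(300.0, RSNW_MIN, RSNW_MAX))  # 300.0: already in range, unchanged
```

The second case is the one that changes answers: a radius that is nonzero but below the minimum is now raised to the minimum.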
Derecho shared node jobs intermittently abort with error message
"start failed on dec2436: No reply from shepherd after 108s"
due to PBS/MPI launch conflicts. Derecho qstat output was also recently changed to return output for completed jobs which prevented the job checking scripts from identifying jobs that have completed.

Update derecho shared batch job submission to both increase the number of shared node jobs and control the number of jobs per shared node by submitting the shared jobs on more cores than needed. In the end, an upgrade to PBS seemed to fix the shared node aborts, so this change was commented out in the PR. Derecho will continue to be closely watched.

Fix potential bug in setting ICE_MACHINE_QSTAT if the string has spaces in it.

Update job checking logic to avoid PBS output that shows completed jobs, added -v " historical ". This is far from ideal and not particularly future proof, but PBS qstat has become a mess.

Update create fails to identify test suite jobs that failed to run then generate a script to resubmit them.
@gspetro-NOAA

Could we get a review of this PR so that we can schedule its WM parent PR 3086?

@NickSzapiro-NOAA
Collaborator Author

Thanks @gspetro-NOAA . I'd like to check the new baselines for a few more days and then I'll request reviews

Minor fix to initialize worka=0 for sifb history variable accumulation (as is done elsewhere in ice_history) so that uninitialized values are not accumulated. This is needed to fix out-of-range history values for sifb (in UFS)
Initialize worka for sifb
@gspetro-NOAA

@DeniseWorthen Any chance you can review this PR so that we can process ufs-community/ufs-weather-model#3086 ?

@DeniseWorthen
Collaborator

Sure, I had looked at it previously but didn't formally approve.

Comment on lines +975 to 981
enddo
enddo
do j = 1,ny_block
do i = 1,nx_block
if (kmt(i,j,iblk) >= p5) hm(i,j,iblk) = c1
enddo
enddo
Collaborator Author

@NickSzapiro-NOAA NickSzapiro-NOAA Feb 25, 2026

TODO: check kmt and hm mpi exchange after this

Collaborator Author

yes, in makemask

enddo
enddo
endif
call scatter_global(work1, work_g2, &
Collaborator Author

TODO: gridbox_verts used to scatter_global

@NickSzapiro-NOAA
Collaborator Author

Can I check in with you here @apcraig @DeniseWorthen ? The changes brought in here to update EMC/CICE to CICE-Consortium fail to reproduce when changing the number of MPI tasks in several ufs-weather-model regression tests ufs-community/ufs-weather-model#3086 (comment)

The changes in cicecore/cicedyn/infrastructure/ice_grid.F90 are really all I see. Does any of this look suspicious to you?

@apcraig

apcraig commented Feb 25, 2026

OK, just deleted my last comment. Our standalone testing suggests bit-for-bit results when running different tasks/threads. @NickSzapiro-NOAA, are you suggesting that is no longer the case in UFS or are you asking whether the CICE answers have changed relative to UFS current version?

@NickSzapiro-NOAA
Collaborator Author

These UFS mpi regression tests pass with CICE at CICE-Consortium#1054 @apcraig . That is no longer the case for CICE at top of CICE-Consortium/main

@apcraig

apcraig commented Feb 25, 2026

OK, CICE-Consortium#1054 is Nov, 2025. Since then answer changes were introduced in CICE-Consortium#1067 and, if you're using the new snow physics, CICE-Consortium#1089.

@NickSzapiro-NOAA
Collaborator Author

Maybe I should be clearer. The baselines have changed (with zap residual). Coupled runs using different MPI tasks don't match each other

@apcraig

apcraig commented Feb 25, 2026

OK, I understand now. Standalone CICE testing suggests runs are bit-for-bit with different block size, task, and thread counts. Let me know if I can help pin this issue down.

@NickSzapiro-NOAA
Collaborator Author

Great that CICE standalone passes @apcraig ! We use grid_format='nc' so that narrows things.

I can only figure it's related to CICE-Consortium#1079, particularly the changes around gridbox_verts, as we now call gridbox_verts after scatter_global and ice_HaloExtrapolate (?)

@apcraig

apcraig commented Feb 25, 2026

Another idea. We do not test a MOM grid. It would be great to add one. Maybe someone can provide a coarse grid we can use. But, it's possible something was missed along the way related to the MOM grid implementation?

@NickSzapiro-NOAA
Collaborator Author

NickSzapiro-NOAA commented Feb 26, 2026

I'll try to isolate the troublesome commit

Well, @DeniseWorthen has made several MOM fix files from 1/4 to 9 degrees:
https://noaa-ufs-regtests-pds.s3.amazonaws.com/index.html#input-data-20251015/MOM6_FIX/
But we haven't used mom_nc gridtype yet. Probably Anton has more there

I'm sorry I should know more about CICE standalone+unit testing @apcraig
Are there tests to check that CICE standalone for our 1 degree passes
https://noaa-ufs-regtests-pds.s3.amazonaws.com/index.html#input-data-20251015/CICE_FIX/100/

@apcraig

apcraig commented Feb 26, 2026

@NickSzapiro-NOAA, there are no standalone tests with any MOM grids. I will try to setup some standalone testing with one of the lower resolution grids. Separately, I encourage you to try to identify when the problem was introduced. There were several updates over the last few months, including some that impacted infrastructure.

@NickSzapiro-NOAA
Collaborator Author

Thanks @apcraig . fwiw, reverting extended grid commit 26a5cfe doesn't fix it. I'm trying the zap residual and history changes too separately

Denise found that the first restart reproduces, but the next one doesn't ... maybe the thresholding in zap residual is sensitive to the order of operations across tasks (?)

@apcraig

apcraig commented Feb 26, 2026

My understanding of zap residual is that it's just local and the parameters are identical at all grid points. I would be surprised if that result had a decomposition issue. But definitely worth confirming.

@NickSzapiro-NOAA
Collaborator Author

Quick summary is that there is a lack of reproducibility in this update when changing the number of MPI tasks for 3 UFS RTs (cpld_mpi_p8, cpld_mpi_gfsv17, cpld_mpi_pdlib_p8). Towards isolating the problem, I tried these today:

  • Reverting extended grid 26a5cf doesn't fix
  • Changing itd_area_min=itd_mass_min=0 doesn't fix
  • Reverting histrest 29f63e doesn't fix
  • Reverting CMIP7 a214a7 + 27a498 fixes cpld_mpi_p8 but not cpld_mpi_gfsv17 or cpld_mpi_pdlib_p8
  • Remove aicen sifb from history doesn't fix

Also, there is a code difference in ice_read_hist for netcdf vs. pio if history restart file does not exist

@apcraig

apcraig commented Feb 27, 2026

Changing code and fixing some cases sounds to me like a compiler issue. Maybe the next step is to reduce the optimization of the compilation in the ice model just to see if that makes the problem go away. If it does, then that doesn't absolve the ice model completely, but might point us one way or another. Is there something about the compilation in UFS that is too aggressive? Does this happen on all machines? Has the CICE code update created a situation where the compiler is being too aggressive or generating an error? Can a CICE modification provide a temporary fix?

What machine does this happen on? What compiler is being used? What compiler options are being used? I could try to duplicate in the standalone model.

@NickSzapiro-NOAA
Collaborator Author

NickSzapiro-NOAA commented Feb 27, 2026

Thanks @apcraig . It's the same failure on Ursa, Derecho, and Gaea.c6 at least, all with Intel OneAPI as in spack-stack 1.9.2 like
https://github.com/ufs-community/ufs-weather-model/blob/develop/modulefiles/ufs_derecho.intel.lua
https://github.com/ufs-community/ufs-weather-model/blob/develop/modulefiles/ufs_common.lua

I haven't tried with less optimization and will test in debug mode

It's puzzling ... it's only happening on the 1 degree mesh coupled to active atmosphere.
Not with data atmosphere, 1/4 degree, or 5 degree tests

@DeniseWorthen
Collaborator

DeniseWorthen commented Feb 27, 2026

@NickSzapiro-NOAA Using the cpld_control_pdlib_p8_intel and cpld_mpi_pdlib_p8_intel I was able to get repro by setting ice_ic = 'default'.

@NickSzapiro-NOAA
Collaborator Author

Interesting. Is there ice in Hudson Bay with that @DeniseWorthen ? Maybe RTs have ice on land or such

@apcraig

apcraig commented Feb 27, 2026

Just throwing out some additional ideas. Could there be a mismatch with the mask on the grid and initial condition? I don't know why this would be a problem though. Could there be an issue with a diagnostic that is seeing sea ice on land from the initial condition? What is not bit-for-bit, does the entire model solution diverge or are just some sea ice diagnostics different?

@DeniseWorthen
Collaborator

DeniseWorthen commented Feb 27, 2026

This is the difference in the exported ifrac using the mediator history files at the first coupling timestep:

[Screenshot: difference in exported ifrac between the two runs]

The IC we're using has always reproduced in our control/mpi test previously (we've been using it for years). It is March, so there is ice in Hudson Bay.

More info, we're using slenderX2; on either nprocs=10 or nprocs=20 on a 360x320 domain.

EDIT: This difference is at the 2nd not first coupling timestep (makes a big difference!)

@apcraig

apcraig commented Feb 27, 2026

So diffs are small (~1.0e-7) and only in Hudson Bay. There are no local mods in your version, right?

There was a change, CICE-Consortium#1062, that updated the haloUpdate and other infrastructure features quite dramatically. This should have been bit-for-bit and it came before CICE-Consortium#1054, so I assume it's not the problem. Your CICE-Consortium#1054 "version" has CICE-Consortium#1062 merged too, right?

Maybe a haloupdate has changed somewhere and during initialization, halo values are not updated when they should be. That would be consistent with different results with different block sizes. Can you run on 1 pe with different block sizes? If not, how about a fixed pe count with different block sizes?

@NickSzapiro-NOAA
Collaborator Author

So, in addition to cpld_mpi_p8, cpld_mpi_gfsv17, and cpld_mpi_pdlib_p8 all failing with "intel" compiler options across HPCs, these UFS RTs also fail with intel debug, gnu, and intelllvm compiler options on Ursa.

The EMC/CICE branch is kept similar to CICE-Consortium/main. Currently, the only code difference is not relevant (i.e., the UFS tracing PR has been merged here but not yet at Consortium)

fyi, these "mpi" tests change the number of tasks for each coupled component from their control, like:

#  cpld_control_gfsv17 --------
export INPES=$INPES_cpl_unstr
export JNPES=$JNPES_cpl_unstr
export WRTTASK_PER_GROUP=$(( WPG_cpl_unstr * THRD_cpl_unstr ))

OCN_tasks=$OCN_tasks_cpl_unstr
ICE_tasks=$ICE_tasks_cpl_unstr
WAV_tasks=$WAV_tasks_cpl_unstr

export atm_omp_num_threads=$THRD_cpl_unstr
export med_omp_num_threads=$atm_omp_num_threads

vs.

# cpld_mpi_gfsv17 --------
export INPES=$INPES_cpl_unstr_mpi
export JNPES=$JNPES_cpl_unstr_mpi
export atm_omp_num_threads=$THRD_cpl_unstr_mpi
export WRTTASK_PER_GROUP=$(( WPG_cpl_unstr_mpi * THRD_cpl_unstr_mpi ))

OCN_tasks=$OCN_tasks_cpl_unstr_mpi
ICE_tasks=$ICE_tasks_cpl_unstr_mpi
WAV_tasks=$WAV_tasks_cpl_unstr_mpi

export CICE_NPROC=$ICE_tasks
export np2=`expr $CICE_NPROC / 2`
export CICE_BLCKX=`expr $NX_GLB / $np2`
export CICE_BLCKY=`expr $NY_GLB / 2`
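The block-size arithmetic in the last four lines above can be worked through in Python. This sketch plugs in the 360x320 domain and the nprocs=10 and nprocs=20 counts quoted earlier in the thread; treating those as NX_GLB, NY_GLB, and CICE_NPROC for this test is an assumption, and the function name is made up for the example.

```python
def slender_blocks(nx_glb, ny_glb, nproc):
    """Mirror the expr arithmetic above using integer division."""
    np2 = nproc // 2                    # np2 = CICE_NPROC / 2
    return nx_glb // np2, ny_glb // 2   # (CICE_BLCKX, CICE_BLCKY)


for nproc in (10, 20):
    print(nproc, slender_blocks(360, 320, nproc))
# 10 (72, 160)
# 20 (36, 160)
```

Under these assumed inputs, doubling the ice task count halves the block width while the block height stays at half the domain, so the two runs use different decompositions of the same grid.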

Most notably, if I change only the ice MPI tasks from the control and keep the other components with the same tasks, tests pass in intel debug mode on Ursa. So it's weird ... how can the code changes in CICE relate to changing the number of tasks in another component?
I can try to isolate which other component.

Since diffs are in first coupling interval, I imagine it's ice related to ATM or CMEPS.
The coupling to ice can have its own threshold(s) for where there is sea ice too ... the MIN_SEAICE=1.0e-6 for FV3 is suspiciously similar to the scale on the Hudson Bay diff map. These tests now zap that as residual ice so I don't know if there can be an issue in coupled component when ice some<--->none

On another line of thought, still curious why reverting the CMIP7 history PRs changed a test failure to pass. Maybe flwout?

@DeniseWorthen
Collaborator

@NickSzapiro-NOAA I think you're on to something here w/ the ATM min_seaice parameter. It would explain why I can't get a DATM config to fail.

I don't remember checking explicitly, but if it is in ATM, I think we'd see that the fields toATM on the first coupling timestep are the same, but the first fields from ATM (2nd coupling interval) are different.

@apcraig

apcraig commented Mar 2, 2026

Several thoughts. The CMIP7 history PRs do change model output and diagnostics but do not alter the prognostic solution. But not sure how that affects answer changes with different pes.

Can you clarify a few things. If you revert the zap PR, CICE-Consortium#1067, either by undoing the PR or setting

dyn_area_min    = 0.001d0
dyn_mass_min    = 0.01d0

do you recover the prior bit-for-bit capability? I know you tested these set to zero, but that turns it off. What you want to do is "turn it up" to recover the prior settings. If that has an impact, my guess is there is some interaction (still TBD) between the initial condition, the zapping parameter, and the coupling MIN_SEAICE. Although none of those things should be block/task variable. With the smaller values currently implemented (or setting the two parameters to zero), you will carry around more small ice concentrations. Maybe, with the old dyn_*_min values, the MIN_SEAICE was never invoked but now it is? Do your initial conditions have small ice concentrations that used to get zapped and now don't, particularly in the Labrador?

@NickSzapiro-NOAA
Collaborator Author

NickSzapiro-NOAA commented Mar 3, 2026

Let me take a few more hours to clean up a branch with reproducer(s)

For your clarifying points, we still have

dyn_area_min    = 0.001d0 
dyn_mass_min    = 0.01d0

with a todo to try to reduce these with current UFS regression tests.

And no, I was not able to keep current answers after the zap_residual PR. UFS tests do have concentrations down to ~ puny . It seemed that the added zap_residual = .true. condition changed answers but I did not confirm. We figured to accept the baseline change from zap residual as a (good) science change from Consortium.

For UFS, maybe a good constraint is to have dyn_area_min < min_seaice so CICE is solving over the ice atmosphere sees. If it all works...

@NickSzapiro-NOAA
Collaborator Author

NickSzapiro-NOAA commented Mar 4, 2026

I started a ufs-weather-model reproducer branch here:
https://github.com/NickSzapiro-NOAA/ufs-weather-model/tree/tests_cice_202602

Two points to highlight:

I made the reproducer branch pointing to CICE-Consortium/main , if that's cleaner

And thanks for the dialogue. It's been so helpful

@NickSzapiro-NOAA
Collaborator Author

Changing ice+ocn and ice+wav tasks reproduce the control, but changing ice+atm+med fails.

So the lack of reproducibility is in UFSATM or CMEPS. Tracing fice through the code, the current suspect is that, when use_cice_alb is set, the ice albedos are left uninitialized if fice<min_seaice:

 if (Model%use_cice_alb) then
      if (sfc%var2(i,j,50) < -9990.0_kind_phys) then
        !$omp parallel do default(shared) private(nb, ix, im)
        do nb = 1, Atm_block%nblks
          do ix = 1, Atm_block%blksz(nb)
            im = Model%chunk_begin(nb)+ix-1
            if (Sfcprop%oceanfrac(im) > zero .and. &
                 Sfcprop%fice(im) >= Model%min_seaice) then
              Sfcprop%albdirvis_ice(im) = 0.6_kind_phys
              Sfcprop%albdifvis_ice(im) = 0.6_kind_phys
              Sfcprop%albdirnir_ice(im) = 0.6_kind_phys
              Sfcprop%albdifnir_ice(im) = 0.6_kind_phys
            endif
          enddo
        enddo
      endif

The code is in 2 places for some reason (?)
https://github.com/NOAA-EMC/ufsatm/blob/071307b2f42c1c368cd5bbe8df635fe4dc4cb8cf/io/fv3atm_sfc_io.F90#L1629-L1644

https://github.com/NOAA-EMC/ufsatm/blob/071307b2f42c1c368cd5bbe8df635fe4dc4cb8cf/io/fv3atm_sfc_io.F90#L1834-L1849

@DeniseWorthen
Collaborator

For my test case, the first differences come back from the ATM; ICE sends the identical fields

cprnc -m ufs.cpld.cpl.hi.atm.2021-03-22-22320.nc ../mpi/ufs.cpld.cpl.hi.atm.2021-03-22-22320.nc |grep RMS
49: RMS atmImp_Faxa_lwnet                8.0238E-03            NORMALIZED  1.2332E-04
58: RMS atmImp_Faxa_rain                 1.4614E-10            NORMALIZED  2.7758E-06
74: RMS atmImp_Faxa_snow                 9.4701E-10            NORMALIZED  1.1749E-04
132: RMS atmImp_Sa_pbot                   5.5183E-06            NORMALIZED  5.6027E-11
155: RMS atmImp_Sa_shum                   1.0501E-08            NORMALIZED  1.1476E-06
171: RMS atmImp_Sa_tbot                   1.0620E-04            NORMALIZED  3.6948E-07
208: RMS atmImp_Sa_z                      3.9742E-06            NORMALIZED  3.7928E-07

I think we need to track back in ATM commits to find the culprit.
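For readers unfamiliar with the RMS column in the comparison above, here is a minimal Python sketch of a root-mean-square difference between two fields. The function name and sample values are invented for illustration, and the sketch does not attempt to reproduce how cprnc computes its RMS or NORMALIZED columns.

```python
import math


def rms_diff(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("fields must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Made-up values standing in for one field from the control and mpi runs.
control = [271.35, 268.10, 250.00, 245.75]
mpi = [271.35, 268.10, 250.00, 245.75]
print(rms_diff(control, mpi))   # 0.0 for identical fields

mpi[2] = 250.5                  # perturb a single point
print(rms_diff(control, mpi))   # 0.25
```

A reproducing pair of runs gives exactly 0.0 for every field, so any nonzero RMS, however small, marks a field that differs somewhere.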

@NickSzapiro-NOAA
Collaborator Author

NickSzapiro-NOAA commented Mar 6, 2026

Changing the line in ufsatm/io/fv3atm_sfc_io.F90 from Sfcprop%fice(im) >= Model%min_seaice to Sfcprop%fice(im) > Model%zero, the mpi tests still fail when changing ATM tasks.

I can try some older atmosphere hashes but there's no guarantee this ever worked (as zap residual is new). At what point do we make this a UFSATM issue?

@DeniseWorthen
Collaborator

DeniseWorthen commented Mar 6, 2026

My thinking is that a) all the differences show up only on tile3 and only in that specific region; we're zapping ice globally, so why don't we see differences globally? b) ATM gets identical values from ICE w/ either mpi or control but sends back diff values c) the control reproduces itself (not definitive, but less likely to be an uninitialized variable) d) I can't get a DATM config to fail. I think it has to be a decomp bug in the ATM.

I've also edited my comment associated w/ the ice difference field from the mediator ice history files (above). This difference is the 2nd coupling timestep. On the first coupling, the ice mediator fields are identical. At the 2nd, the ice sends back diff values because it got different values from ATM at the first coupling.

Development

Successfully merging this pull request may close these issues.

Sync with CICE-Consortium (2026-02)

6 participants