Multilayer Canopy #1996

slevis-lmwg · 2023-05-04T17:20:55Z

Description of changes

I am putting Gordon Bonan's code on a branch starting at ctsm5.1.dev123. I have not tested it yet. I'm opening this PR to keep a record of the work and to allow Gordon to review how I brought his code into ctsm5.1.dev123.

UPDATE: I have merged the branch from PR #2155 into this one and updated to a later ctsm5.1.dev tag.

Specific notes

Contributors other than yourself, if any:
@gbonan @wwieder @olyson

CTSM Issues Fixed (include github issue #):
Fixes #1922

Are answers expected to change (and if so in what way)?
Yes. New science.

Any User Interface Changes (namelist or namelist defaults changes)?
Not yet.

Testing performed, if any:
I have not tested yet.

slevis-lmwg · 2023-05-15T23:08:08Z

I'm testing like this:
./create_test SMS_Ld3_PS.f09_g17.IHistClm50BgcCrop.cheyenne_intel.clm-f09_dec1990Start_GU_LULCC

For now the code builds and begins to run but stops with this error:

Attempting to read RSL look-up table .....
(GETFIL): attempting to find local file
psihat.nc
(GETFIL): failed getting file from full path: ../rsl_lookup_tables/psihat.nc

 ERROR: GETFIL: FAILED to get ../rsl_lookup_tables/psihat.nc

slevis-lmwg · 2023-06-01T00:46:36Z

Obtained psihat.nc from @gbonan's repository and submitted new test.

For now I placed the file in /glade/p/cesmdata/inputdata/lnd/clm2/rsl_lookup_tables and kept the path hardwiring in the code.

When we are ready to share more broadly, we will discuss with Erik whether to keep the file here and ./rimport to the dataset repository or place elsewhere and provide differently.

The new test failed here /glade/scratch/slevis/SMS_Ld3_PS.f09_g17.IHistClm50BgcCrop.cheyenne_intel.clm-f09_dec1990Start_GU_LULCC.20230531_190934_uacu7c/run with

Attempting to initialize multilayer canopy vertical structure .....
 ENDRUN:
 ERROR:  ERROR: initVerticalStructure: zw(p,0) improperly defined

Troubleshooting with write statements led to strange behavior, so I have added _D to the test for debug mode.

slevis-lmwg · 2023-06-01T23:52:55Z

I have found that
zw(p,0) = -2.8e-14
in line 128 of MLinitVerticalMod.F90

This seems like a computer precision error to me, so I proposed to Gordon that I change the if statement to:

if (zw(p,0) < 0._r8 .and. zw(p,0) >= -1.e-10_r8) then
   zw(p,0) = 0._r8
else if (zw(p,0) > 1.e-10_r8 .or. zw(p,0) < -1.e-10_r8) then
   call endrun (msg=' ERROR: initVerticalStructure: zw(p,0) improperly defined')
end if
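
The tolerance logic proposed above can be tried outside the model. The following is a minimal Python sketch, not the code in MLinitVerticalMod.F90: the function name `snap_to_zero` and its `tol` argument are illustrative assumptions, and only the 1e-10 tolerance and the -2.8e-14 value come from this comment.

```python
def snap_to_zero(zw0, tol=1.0e-10):
    """Mimic the proposed check on zw(p,0).

    Small negative values within tol of zero are reset to exactly zero,
    values farther than tol from zero raise an error, and small positive
    values within tol pass through unchanged (as in the Fortran above).
    Hypothetical helper, for illustration only.
    """
    if -tol <= zw0 < 0.0:
        return 0.0
    if zw0 > tol or zw0 < -tol:
        raise ValueError("initVerticalStructure: zw(p,0) improperly defined")
    return zw0


# Value reported from line 128 of MLinitVerticalMod.F90 in this comment
print(snap_to_zero(-2.8e-14))
```

With this logic the reported round-off value is treated as zero, while a genuinely wrong value such as -1.e-3 would still stop the run.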

slevis-lmwg · 2023-06-05T21:54:38Z

Switching to this simpler test:
./create_test SMS_Ld3_D.f19_g17.I2000Clm50BgcCrop.cheyenne_intel --walltime 00:25:00
gave this new error:
ERROR: initVerticalStructure: plant area does not match CLM input
Running it with f09 gave the zw(p,0) improperly defined error again.
I'm troubleshooting.

slevis-lmwg · 2023-06-06T01:33:52Z

Increasing the following error tolerances and submitting the f19 test above:

zw(p,0) improperly defined to 1e-9
pai_err to 1e-5
sai_err to 1e-5

Now I get an error that seems bigger than the previous ones:

Attempting to initialize multilayer canopy vertical structure .....
 ncan, p, nlevmlcan =         304           5         100
 ENDRUN:
 ERROR:  ERROR: initVerticalStructure: ncan > nlevmlcan

I printed a few more variables for p = 5:

ntop, n_zref, dz_zref, zref, ztop =
11   293   29.776757   30.89297   1.116

This leads me to believe that ncan(p) > nlevmlcan due to small ztop. Does this make sense to you @gbonan? If so, how would you like to handle it differently?

slevis-lmwg · 2024-02-27T23:13:22Z

Notes from discussing with Will and Gordon last week:

Will updated the #2155 branch (PLUMBER2 plumbing: csv file, wrapper script, usermods, and scripts) to a recent dev tag (dev168).
Merge the #2155 branch into this one.
Try Will's PLUMBER2 example /glade/campaign/cgd/tss/people/wwieder/cheyenne_archive/PLUMBER2_tests/ZM-Mon_test2
Are the single- and multi-layer history fields working? There may be one example of each in the code already. Does anything come out to history?
Try US-NR1 (the AmeriFlux and FLUXNET code for Niwot Ridge).
Gordon takes over from there. Gordon and I will have a "tutorial" on derecho and/or izumi.

slevis-lmwg · 2024-03-02T00:31:23Z

Attempting the 3rd checkbox above:
./create_newcase --case ~/cases_plumber2/ZM-Mon_test2_slevis --res CLM_USRDAT --compset HIST_DATM%1PT_CLM51%BGC_SICE_SOCN_SROF_SGLC_SWAV_SESP --run-unsupported --user-mods-dir PLUMBER2/ZM-Mon

The create_newcase ends with errput: ERROR: No variable PLUMBER2SITE found in case

This may not be a fatal error, so I'm proceeding with

./case.setup
./case.build
./case.submit

The first attempt failed due to the presence of quotes after the equal sign for CLM_USRDAT.PLUMBER2:datafiles in user_nl_datm_streams.

The second attempt failed with

(GETFIL): failed getting file from full path: /glade/u/home/wwieder/CTSM/tools/site_and_regional/subset_data_single_point/surfdata_1x1_PLUMBER2__hist_16pfts_Irrig_CMIP6_simyr2000_c231005.nc

due to the missing "ZM-Mon" in the fsurdat file name.

Third attempt SUCCESSFUL.

ekluzek · 2024-03-02T00:43:03Z

@slevis-lmwg the setting of PLUMBER2SITE should have come from the update from #2155 so I'm doubtful this will work. But, you should also check that the update from #2155 came in correctly.

slevis-lmwg · 2024-03-02T02:14:00Z

GPP_ML and LWP_ML appear in history, though I don't know whether they contain valid data. I checked the corresponding checkbox for now, because the simulation completes.

slevis-lmwg · 2024-03-06T01:34:50Z

Gordon, I created a new case for US-NR1 and submitted a run. It fails with this error:

Attempting to initialize multilayer canopy vertical structure .....
ENDRUN:
ERROR:  ERROR: initVerticalStructure: ncan > nlevmlcan

The site that I tested last week, ZM-Mon, completed a year without error. Maybe Niwot Ridge has no trees and triggers this error?

gbonan · 2024-03-06T19:05:27Z

Hi Sam,

Is this an SP run or a BGC run? The difference is that canopy height is specified in SP but is calculated in BGC. If the canopy height is small (e.g., zero in an initial BGC run) and the forcing height is large, the maximum number of canopy layers can be exceeded. I cannot remember if we discussed this last week, but the run should be SP.

Here is what should be happening: The tower height (forcing height) at US-NR1 is 26 m, and the canopy height is 12-13 m. The model should be using a layer thickness of 0.5 m, so there is a total of 52 layers up to the tower forcing height. This is less than the maximum number of layers (nlevmlcan = 100).

Here are some things to check in subroutine initVerticalStructure (MLinitVerticalMod.F90):

What is the value for zref(p)? This should be the tower height (about 26 m).
What is the value for ztop(p)? This should be the canopy height (about 12-13 m).
What is the value for dz_within? This should be 0.5 m for a tall canopy (> 2 m), but will be set to 0.1 m for a short canopy (< 2 m).
What is the value for ncan(p)? This should be 52 (i.e., 26 m / 0.5 m).
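
The layer-count arithmetic in this comment can be sketched as a quick sanity check. This is a rough Python estimate under stated assumptions, not the algorithm in initVerticalStructure: the function names are hypothetical, a canopy height of exactly 2 m is treated as short, and 12.5 m stands in for the quoted 12-13 m canopy height. The second case uses the zref and ztop values printed for p = 5 in the earlier f19 test.

```python
import math

NLEVMLCAN = 100  # maximum number of canopy layers quoted in this thread


def layer_thickness(ztop):
    """Layer thickness (m): 0.5 for a tall canopy (> 2 m), 0.1 otherwise.
    Treating ztop == 2 m as short is an assumption."""
    return 0.5 if ztop > 2.0 else 0.1


def estimate_ncan(zref, ztop):
    """Rough layer count up to the forcing height: zref / dz, rounded up.
    Not the actual algorithm in initVerticalStructure."""
    return math.ceil(zref / layer_thickness(ztop))


cases = [
    ("US-NR1 (SP)", 26.0, 12.5),           # tower height, assumed canopy height
    ("p = 5, f19 test", 30.89297, 1.116),  # values printed earlier in this PR
]
for label, zref, ztop in cases:
    n = estimate_ncan(zref, ztop)
    status = "ok" if n <= NLEVMLCAN else "exceeds nlevmlcan"
    print(f"{label}: ncan ~ {n} ({status})")
```

The estimate gives 52 layers for US-NR1 and roughly three times nlevmlcan for the short-canopy patch. It is not expected to reproduce the model's ncan = 304 exactly, only to show how a small ztop combined with a large zref pushes the layer count past the limit.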

wwieder · 2024-03-06T19:11:39Z

Maybe one other question, were you able to run a 'normal' SP or BGC case at US-NR1 before turning on the ML canopy?

I do know that the default surface dataset for this site has a very high LAI for evergreen needleleaf trees, but I don't know what is being assigned for HTOP.

slevis-lmwg · 2024-03-07T00:29:02Z

Thank you for these comments, Gordon and Will.

I have switched the case from BGC to SP and got past the initialization error.
The model now completes 4 time steps and triggers this error:

(shr_strdata_readstrm) reading file ub: /glade/work/oleson/PLUMBER2/input_files/US-NR1/LAI_stream_US-NR1_1999-2014.nc       6
 
 hist_htapes_wrapup : Writing current time sample to local history file 
 ./US-NR1_sp.clm2.h1.1999-01-01-25200.nc at nstep =            4 
  for history time interval beginning at   6.250000000000000E-002  and ending at
    8.333333333333333E-002
 
(shr_orb_params) ------ Computed Orbital Parameters ------
[...]
(shr_orb_params) -----------------------------------------
(shr_strdata_readstrm) reading file ub: /glade/work/oleson/PLUMBER2/input_files/US-NR1/LAI_stream_US-NR1_1999-2014.nc       7
 ENDRUN:
 ERROR: ERROR: Norman: total longwave conservation error

I like @wwieder's suggestion that I confirm next whether this runs with the multilayer canopy OFF:

I checked out the PLUMBERcsv branch and submitted the same case.
This is out to nstep > 1000, so it seems to work.
The PLUMBERcsv branch is at dev168 and multilayer_canopy is at dev170, so I updated the former to dev170 and repeated.
Seems to work.

gbonan · 2024-03-07T00:40:24Z

Sam, The error message relates to longwave radiation in the multilayer canopy. Check that CLM runs okay without the multilayer canopy. Then we will have to figure out how to debug this error. I have not encountered it in my own uncoupled simulations, so I suspect something is not getting initialized correctly. Gordon

slevis-lmwg · 2024-03-13T16:53:47Z

src/multilayer_canopy/MLCanopyFluxesMod.F90

+    ! distinction between grid cell (g), column (c), and patch (p)
+    ! variables. All multilayer canopy variables are for patches.
+
+    do fp = 1, num_mlcan


Debugging:
Focus on this loop and next ending with line 417.

Thus far the "write statements" seem to point to tleaf(p=1, ic=9), which increases out of control from 285 to 1264 K by nstep = 10. The trend in tleaf(1,9) is already apparent from nstep > 3.

Likely related... I see lw_source < 0 for ic = 9 and 10. From the tiny fracsun I infer that ic = 9 is nearest to the ground.

What I don't know how to interpret is whether the tridiagonal matrix solver has anything to do with the problem.

Gordon, it may help if you look at the output that I'm looking at.

Open the last lnd.log file in:
/glade/derecho/scratch/slevis/US-NR1_sp/run

The write statements are in

/glade/work/slevis/git/multilayer_canopy/src/multilayer_canopy/src/multilayer_canopy/MLCanopyFluxesMod.F90
/glade/work/slevis/git/multilayer_canopy/src/multilayer_canopy/src/multilayer_canopy/MLLongwaveRadiationMod.F90

I think you can type git diff in /glade/work/slevis/git/multilayer_canopy to see clearly what I did.

gbonan · 2024-03-18T16:29:28Z

Sam, Thanks. This is very good detective work! And the model output and print statements are easy to follow.

You are correct in identifying lw_source < 0 for layers ic=9 and ic=10 on the very first time step as the problem. lw_source is the emitted longwave radiation from the leaf layer. It has two components: lw_source_sun for sunlit leaves and lw_source_sha for shaded leaves. These depend on leaf temperature (tleaf) for the sunlit and shaded leaves. These temperatures are okay (283.600006103516 K), so the component fluxes should be good.

The sunlit and shaded fluxes are then weighted by the sunlit fraction (fracsun) and shaded fraction (1-fracsun). The sunlit fraction is very small (10^-39), but this small value is not directly causing the negative lw_source. The weighted fluxes are also multiplied by (1-td). td > 1 for layers ic=9 and ic=10, which is causing the negative lw_source for these layers.

td is the fractional transmittance of diffuse radiation for the canopy layer. It should be < 1. td is calculated in MLSolarRadiationMod (lines 196-203). For each leaf layer, the code loops through 9 angle classes (angle = 5, ..., 85 degrees) to calculate an extinction coefficient (gdirj) that is then weighted and summed over the 9 classes. Within this loop print out:

solar_zen(p), kb(p,ic), tb(p,ic)
gdirj, phi1(p,ic), phi2(p,ic), angle, chil(p,ic)
exp(-gdirj / cos(angle) * dpai(p,ic) * clump_fac_ic(p,ic)) * sin(angle) * cos(angle)
dpai(p,ic), clump_fac_ic(p,ic)
td(p,ic)

You only have to do this for ic=9 and ic=10 and for the first time step. The first line of output is for the direct beam transmittance (tb). I want to see if something strange is going on with that too.

slevis-lmwg · 2024-03-18T18:23:52Z

Ok, thank you, Gordon. You can go back to the same place to see my code changes.

The lnd.log files (including the latest one) now appear in /glade/derecho/scratch/slevis/archive/US-NR1_sp/logs because I ran for 3 timesteps and the simulation completed (i.e. didn't crash) and, therefore, copied files to the archive directory.

gbonan · 2024-03-18T20:09:02Z

Thanks, Sam. I think I see what is going on. The total plant area index (leaves + stem) for these two layers is very small:

ic = 10, dpai = 3.569665593485061E-003
ic = 9, dpai = 1.325427162346319E-003

These low values of dpai cause td > 1. In my tower simulations, dpai is always greater than 0.01. If that value is used, I do not get td > 1. So let's try a simple test of printing out some canopy structure information (so I can see the default values) and then resetting dlai and dsai to at least 0.01. Let's see if this gets beyond the error.

MLCanopyFluxesMod.F90: line 404:

! Vertical profiles
write(iulog,*) lai(p),sai(p),mlcanopy_inst%ztop_canopy(p),mlcanopy_inst%zbot_canopy
write(iulog,*) mlcanopy_inst%ntop_canopy,mlcanopy_inst%nbot_canopy
do ic = 1, ncan(p)
   dlai(p,ic) = ...
   dsai(p,ic) = ...
   dpai(p,ic) = ...
   write (iulog,*) ic,dlai(p,ic),dsai(p,ic),dpai(p,ic),mlcanopy_inst%dz_profile(p,ic)
   ! Now reset values to minimum
   dlai(p,ic) = max(dlai(p,ic), 0.01_r8)
   dsai(p,ic) = max(dsai(p,ic), 0.01_r8)
   dpai(p,ic) = dlai(p,ic) + dsai(p,ic)
end do

What this does is print out canopy structure:

lai, sai, top height, bottom height
layer index for canopy top, layer index for canopy bottom
for each layer: layer index, lai, sai, lai+sai, layer thickness

Then it resets lai and sai for each layer to at least 0.01.
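
One way to see how a very thin layer could produce td > 1 is to evaluate a nine-class angle sum directly. The Python sketch below is an illustration under several assumptions that this thread does not confirm: the sum is treated as a midpoint-rule quadrature scaled by 2 times the 10-degree class width, the projection coefficient is held at a constant 0.5 (the model computes gdirj from phi1 and phi2), and the clumping factor is 1. Only the angle classes, the form of the summed term, and the dpai values come from the comments above.

```python
import math


def diffuse_transmittance(dpai, gdir=0.5, clump=1.0):
    """Midpoint-rule estimate of diffuse transmittance through one layer,
    summed over 9 sky-angle classes centered at 5, 15, ..., 85 degrees.

    gdir (constant projection coefficient), clump, and the 2 * dangle
    scaling are assumptions made for this sketch.
    """
    dangle = math.radians(10.0)  # width of each angle class
    total = 0.0
    for j in range(9):
        angle = math.radians(5.0 + 10.0 * j)
        beam = math.exp(-gdir / math.cos(angle) * dpai * clump)
        total += beam * math.sin(angle) * math.cos(angle)
    return 2.0 * dangle * total


# dpai values for ic = 9 and ic = 10 from this comment, plus 0 and 0.01
for dpai in (0.0, 1.325427162346319e-3, 3.569665593485061e-3, 0.01):
    print(f"dpai = {dpai:.6f}  td = {diffuse_transmittance(dpai):.6f}")
```

Under these assumptions the quadrature sums to slightly more than 1 when dpai is 0, so the two thin layers come out with td > 1 while dpai = 0.01 falls just below 1. Whether this is what happens in MLSolarRadiationMod would need to be confirmed against the actual code.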

slevis-lmwg · 2024-03-18T21:10:12Z

Gordon, I commented out the error check right after the lines of code that you suggested and now this works. The simulation is in progress in the same scratch/.../run directory that I mentioned above.

gbonan · 2024-03-18T23:27:39Z

Sam, It looks like the code ran for 20937 timesteps (when I last checked), which is over a year. You can stop the run. There is no need to keep it going. Check to see if there are valid multilayer canopy fields on the history file (GPP_ML, LWP_ML). It looks like LWP_ML has some numbers, but GPP_ML is empty.

slevis-lmwg · 2024-03-19T00:15:17Z

Gordon, I see the same. Shall we debug that next?

gbonan · 2024-03-19T19:51:53Z

Yes, debug the two multilayer fields that go onto the history file (GPP_ML, LWP_ML). The header is correct for both, but the values seem to be wrong. GPP_ML is a single-level field, but has no data. LWP_ML is a multi-level field (nlevmlcan) and seems to have bad data. Data is written for 73 levels, but the leaf layers are levels 9 to 24. All other layers should have missing data. You can set up the run for 1 month (July, 31 days) and write history files every time step.

slevis-lmwg · 2024-03-21T00:38:12Z

Hi Gordon,
For now I ran 16 timesteps rather than a month, and the output got moved to the /archive directory:

/glade/derecho/scratch/slevis/archive/US-NR1_sp/lnd/hist
/glade/derecho/scratch/slevis/archive/US-NR1_sp/logs

I used ncdump to make ascii copies of the first timestep from this run and of the first month from the earlier run. My initial concern is that GPP_ML is NaN from the first timestep.

You'll see output from write statements in the latest lnd.log file showing that gpp = NaN because sunlit agross = NaN for ic = 1 to 8. (Sorry I had intended but didn't write out the ic values.) So for starters I would stop adding the non-existent layers to the sum, right?

I will put this aside tomorrow and hope to get back to it on Friday.

gbonan · 2024-03-21T21:28:07Z

Hi Sam, Try running again printing out values for ic (so we know what layers we are looking at). Then we should discuss how to debug the error. It seems to suggest that agross (which comes from LeafPhotosynthesis) is not calculated correctly. It might be better for me to debug because I know the code. I talked with Erik and Adrianna and they both suggested running on izumi using a login node. They said the code will run without queuing, so jobs can turn around quickly (i.e., add write statements, run the model, look at output, add more write statements, ...). We can talk about this to see if it is the best path forward.

slevis-lmwg · 2024-03-22T00:46:16Z

Ok, Gordon, look for the new lnd.log file in the same /archive directory and let me know what you'd like to do next.

About izumi, I did not know that you could run outside the queues (and would be interested in seeing how), but even within the izumi queues there's rarely any wait-time.

gbonan · 2024-03-22T16:46:32Z

This is very helpful, Sam. I think I see the problem (or at least one problem). In our previous quick fix to prevent small values of dpai, we messed up the indexing of what canopy layers have leaves and stems.

nbot = Bottom canopy layer with leaves and stems. My notes show nbot = 9
ntop = Top canopy layer with leaves and stems. My notes show ntop = 24

Our quick fix was to set dlai and dsai to a minimum of 0.01 for all layers (1 to ncan) regardless of whether they had leaves or stems. So there is a mismatch now between the top and bottom of the canopy (ntop, nbot) and layers with leaves and stems (dpai > 0). I think this is what is causing NaN for agross. So let's try this fix in MLCanopyFluxesMod.F90 (lines 411-413):

if (dlai(p,ic) > 0._r8) dlai(p,ic) = max(dlai(p,ic), 0.01_r8)
if (dsai(p,ic) > 0._r8) dsai(p,ic) = max(dsai(p,ic), 0.01_r8)
dpai(p,ic) = dlai(p,ic) + dsai(p,ic)

This way, dlai and dsai are only reset in layers that have leaves and stems.
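
The difference between the two versions of the quick fix can be sketched on a toy profile. This is a Python illustration, not the model code: the helper names and the six-layer dlai profile are hypothetical, and only the 0.01 minimum and the "reset only where > 0" rule come from this comment.

```python
DPAI_MIN = 0.01  # per-layer minimum used in the quick fix


def clamp_all(profile):
    """First quick fix: raise every layer to the minimum, even empty ones."""
    return [max(x, DPAI_MIN) for x in profile]


def clamp_vegetated(profile):
    """Revised fix: raise only layers that already hold leaves or stems."""
    return [max(x, DPAI_MIN) if x > 0.0 else x for x in profile]


def vegetated_layers(profile):
    """1-based indices of layers with plant area > 0."""
    return [ic for ic, x in enumerate(profile, start=1) if x > 0.0]


# Hypothetical 6-layer dlai profile: layers 1-2 lie below the canopy (empty),
# layer 3 is a thin bottom layer, layers 4-6 carry most of the leaf area.
dlai = [0.0, 0.0, 0.0013, 0.4, 0.6, 0.2]

print("clamp_all:      ", vegetated_layers(clamp_all(dlai)))
print("clamp_vegetated:", vegetated_layers(clamp_vegetated(dlai)))
```

In this toy case the unconditional clamp marks all six layers as vegetated, while the guarded version leaves layers 1 and 2 empty, so the set of layers with plant area stays consistent with a canopy bottom at layer 3.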

ekluzek · 2024-03-22T17:41:24Z

@gbonan and @slevis-lmwg FYI on running on the login node. You do this using:

./case.submit --no-batch

I remember having some weird subtleties the last time I did it. So if you see that -- let me know and I
can likely help. I don't remember well enough off hand what they were. So just try that and see if it
works.

As @slevis-lmwg says, sending to the queue on Izumi is rarely a problem, while it often is on Derecho.

slevis-lmwg · 2024-03-22T21:08:10Z

src/multilayer_canopy/MLCanopyFluxesMod.F90

-          dlai(p,ic) = max(dlai(p,ic), 0.01_r8)
-          dsai(p,ic) = max(dsai(p,ic), 0.01_r8)
+          if (dlai(p,ic) > 0._r8) dlai(p,ic) = max(dlai(p,ic), 0.01_r8)
+          if (dsai(p,ic) > 0._r8) dsai(p,ic) = max(dsai(p,ic), 0.01_r8)
          dpai(p,ic) = dlai(p,ic) + dsai(p,ic)


Gordon, these are the lines that you recommended changing (410 - 411).

slevis-lmwg · 2024-03-22T21:09:53Z

src/multilayer_canopy/MLCanopyNitrogenProfileMod.F90

-       end if
+!      if (abs(numerical-analytical) > 1.e-06_r8) then
+!         call endrun (msg='ERROR: CanopyNitrogenProfile: canopy integration error')
+!      end if


After changing lines 410-411 in MLCanopyFluxesMod.F90, this error got triggered, so I commented it out, because I do not think that we should expect it to pass anymore.

slevis-lmwg · 2024-03-22T21:12:32Z

src/multilayer_canopy/MLCanopyFluxesMod.F90

+       ! introducing minimum dlai and dsai a few lines up from here
+!      if (abs(totpai - (lai(p)+sai(p))) > 1.e-06_r8) then
+!         call endrun (msg=' ERROR: MLCanopyFluxes: plant area index not updated correctly')
+!      end if


I don't think I mentioned before, but this error was triggered by the previous change to lines 410-411. Again, I commented it out because I do not think that we should expect it to pass anymore.
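
Why this check stops passing once a per-layer minimum is applied can be shown with a small numerical example. The Python sketch below uses hypothetical dlai and dsai profiles and a hypothetical helper name. Only the 0.01 minimum and the 1.e-06 tolerance come from this PR.

```python
TOL = 1.0e-6      # tolerance in the commented-out check
DPAI_MIN = 0.01   # per-layer minimum applied a few lines earlier


def clamp_vegetated(profile):
    """Raise non-empty layers to the minimum, leave empty layers at zero."""
    return [max(x, DPAI_MIN) if x > 0.0 else x for x in profile]


# Hypothetical profiles with one thin layer at the canopy bottom
dlai = [0.0, 0.0013, 0.4, 0.6, 0.2]
dsai = [0.0, 0.0002, 0.05, 0.05, 0.02]
lai, sai = sum(dlai), sum(dsai)  # totals the check compares against

totpai = sum(clamp_vegetated(dlai)) + sum(clamp_vegetated(dsai))
mismatch = abs(totpai - (lai + sai))
print(f"mismatch = {mismatch:.4f}, check fails: {mismatch > TOL}")
```

Raising the thin layer to the minimum adds plant area that the totals lai(p) and sai(p) do not contain, so the summed profile exceeds the expected total by far more than the tolerance.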

slevis-lmwg · 2024-03-22T21:14:14Z

src/multilayer_canopy/MLCanopyFluxesMod.F90

-       lat = grc%latdeg(g) * pi / 180._r8
-       lon = grc%londeg(g) * pi / 180._r8
+       lat = grc%lat(g)
+       lon = grc%lon(g)


I did not bring up this one before, either. It's a simplification that does not change answers.

slevis-lmwg · 2024-03-22T21:21:37Z

Gordon, I think that results look better now, but I will let you tell me what you think.

I used the same write statements in the latest lnd.log file, and I made new ncdumps in the corresponding /lnd/hist directory. Both are in the same /archive area as before.

gbonan · 2024-03-22T22:27:48Z

Yea!!! Well done, Sam. It looks like the model is running correctly, based on the history files. The output for GPP_ML and LWP_ML looks good.

As a recap, the problem was thin canopy layers with very small dlai and/or dpai. We put in a quick fix, but that caused additional problems. The fix we have now works, but requires some warnings to be turned off because the sum of dlai+dsai does not equal the expected total lai+sai. So I will need to investigate the correct fix. The best way to do that will be for me to run the code so that I can investigate alternatives.

It does sound like running on izumi is a good option. So, Sam, maybe you could get the case moved over to izumi and try it out. Then you can give me a tutorial on running the code. Thanks for your help with this, and have a good weekend.

slevis-lmwg · 2024-03-23T00:20:57Z

Ok, I have the same case running on izumi now.

gbonan · 2024-03-25T17:48:56Z

I have a login on izumi. We can schedule a meeting at your convenience so I can learn how to run the case.

slevis-lmwg · 2024-03-26T21:33:54Z

@gbonan and I met today and completed a step-by-step tutorial on izumi on cloning the ctsm, checking out this PR's branch, creating a new case, and building and running the case.

Gordon, you should receive a github notification that I gave you collaborator permissions so that you may push commits to this PR. When you're ready we will go over instructions for that.

slevis-lmwg · 2024-03-26T21:58:29Z

Though, best if you act on the "collaborator" invitation by the end of the week, because it has an expiration date, I'm pretty sure...

gbonan · 2024-03-26T22:37:10Z

Sam, I accepted the github invitation. Also, I have been running the code today on izumi. All is going well. Thanks for the tutorial. Gordon

Untested first draft of Multilayer Canopy code

29c72a9

slevis-lmwg self-assigned this May 4, 2023

Corrections for the code to build and begin running

b3f245a

Corrections and updates to get past certain errors

4641c8d

Test SMS_Ld3_PS.f09_g17.IHistClm50BgcCrop.cheyenne_intel.clm-f09_dec1990Start_GU_LULCC now fails with ERROR: initVerticalStructure: zw(p,0) improperly defined

slevis-lmwg added 3 commits June 7, 2023 11:37

Code updates provided by @gbonan

447fc97

Merge remote-tracking branch 'escomp/master' into multilayer_canopy

b996a7a

Resolved conflicts: cime_config/buildlib

ekluzek added the "enhancement" (new capability or improved behavior of existing capability) and "tag: enh - new science" labels on Jan 20, 2024

slevis-lmwg added 4 commits February 27, 2024 16:31

Merge remote-tracking branch 'olyson/PLUMBERcsv' into multilayer_canopy

90fcf64

slevis resolved conflicts:

Merge tag 'ctsm5.1.dev170' into multilayer_canopy

018ac84

Add hillslope hydrology parameterization Changes include multiple soil columns per vegetated landunit, additional meteorological downscaling, new subsurface lateral flow equations, and a hillslope routing parameterization.

Correct typo in PLUMBER2/defaults/shell_commands

fb60fe8

Correct conflict that got missed in a "git merge" earlier

9152d3a

slevis-lmwg added tag: enh - new science and removed tag: enh - new science labels Mar 5, 2024

Rm QFLX_SUB_SNOW from requested history vars as it doesn't exist

5146260

Rename QFLX_SUB_SNOW -> QFLX_SOLIDEVAP_FROM_TOP_LAYER in user_nl_clm

95710d8

Code updates as a result of debugging

76deeb3

Debugging updates

d730e34

Multilayer canopy code to run at US-NR1

7de8f78

samsrabin added the "science" (enhancement to or bug impacting science) label and removed the "enh - new science" label on Aug 8, 2024
