Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multilayer Canopy #1996

Draft
wants to merge 15 commits into
base: master
Choose a base branch
from
Draft

Conversation

slevis-lmwg
Copy link
Contributor

@slevis-lmwg slevis-lmwg commented May 4, 2023

Description of changes

I am putting Gordon Bonan's code on a branch starting at ctsm5.1.dev123. I have not tested, yet. I'm opening this PR to keep a record of the work and to allow Gordon to review how I brought his code into ctsm5.1.dev123.

UPDATE: I have merged PR's #2155 branch to this one and updated to a later ctsm5.1.dev tag.

Specific notes

Contributors other than yourself, if any:
@gbonan @wwieder @olyson

CTSM Issues Fixed (include github issue #):
Fixes #1922

Are answers expected to change (and if so in what way)?
Yes. New science.

Any User Interface Changes (namelist or namelist defaults changes)?
Not, yet.

Testing performed, if any:
I have not tested, yet.

@slevis-lmwg slevis-lmwg self-assigned this May 4, 2023
@slevis-lmwg
Copy link
Contributor Author

I'm testing like this:
./create_test SMS_Ld3_PS.f09_g17.IHistClm50BgcCrop.cheyenne_intel.clm-f09_dec1990Start_GU_LULCC

For now the code builds and begins to run but stops with this error:

Attempting to read RSL look-up table .....
(GETFIL): attempting to find local file
psihat.nc
(GETFIL): failed getting file from full path: ../rsl_lookup_tables/psihat.nc

 ERROR: GETFIL: FAILED to get ../rsl_lookup_tables/psihat.nc

@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Jun 1, 2023

Obtained psihat.nc from @gbonan's repository and submitted new test.

For now I placed the file in /glade/p/cesmdata/inputdata/lnd/clm2/rsl_lookup_tables and kept the path hardwiring in the code.

When we are ready to share more broadly, we will discuss with Erik whether to keep the file here and ./rimport to the dataset repository or place elsewhere and provide differently.

The new test failed here /glade/scratch/slevis/SMS_Ld3_PS.f09_g17.IHistClm50BgcCrop.cheyenne_intel.clm-f09_dec1990Start_GU_LULCC.20230531_190934_uacu7c/run with

Attempting to initialize multilayer canopy vertical structure .....
 ENDRUN:
 ERROR:  ERROR: initVerticalStructure: zw(p,0) improperly defined

Troubleshooting with write statements led to strange behavior, so I have added _D to the test for debug mode.

@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Jun 1, 2023

I have found that
zw(p,0) = -2.8e-14
in line 128 of MLinitVerticalMod.F90

This seems like a computer precision error to me, so I proposed to Gordon that I change the if statement to:

if (zw(p,0) < 0._r8 and zw(p,0) >= -1.e-10_r8) then
   zw(p,0) = 0._r8
else if (zw(p,0) > 1.e-10_r8 .or. zw(p,0) < -1.e-10_r8) then
   call endrun (msg=' ERROR: initVerticalStructure: zw(p,0) improperly defined')
end if

Test
SMS_Ld3_PS.f09_g17.IHistClm50BgcCrop.cheyenne_intel.clm-f09_dec1990Start_GU_LULCC
now fails with
ERROR: initVerticalStructure: zw(p,0) improperly defined
@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Jun 5, 2023

Switching to this simpler test:
./create_test SMS_Ld3_D.f19_g17.I2000Clm50BgcCrop.cheyenne_intel --walltime 00:25:00
gave this new error:
ERROR: initVerticalStructure: plant area does not match CLM input
Running it with f09 gave the zw(p,0) improperly defined error again.
I'm troubleshooting.

@slevis-lmwg
Copy link
Contributor Author

Increasing the following error tolerances and submitting the f19 test above:

  • zw(p,0) improperly defined to 1e-9
  • pai_err to 1e-5
  • sai_err to 1e-5

Now I get an error that seems bigger than the previous ones:

Attempting to initialize multilayer canopy vertical structure .....
 ncan, p, nlevmlcan =         304           5         100
 ENDRUN:
 ERROR:  ERROR: initVerticalStructure: ncan > nlevmlcan

I printed a few more variables for p = 5:

ntop, n_zref, dz_zref, zref, ztop =
11   293   29.776757   30.89297   1.116

This leads me to believe that ncan(p) > nlevmlcan due to small ztop. Does this make sense to you @gbonan? If so, how would you like to handle it differently?

This global test
SMS_Ld3_D.f19_g17.I2000Clm50BgcCrop.cheyenne_intel
currently fails with this error, likely due to small ztop:

Attempting to initialize multilayer canopy vertical structure .....
ncan, p, nlevmlcan =         304           5         100
ntop, n_zref, dz_zref, zref, ztop =
11   293   29.776757   30.89297   1.116
 ENDRUN:
 ERROR:  ERROR: initVerticalStructure: ncan > nlevmlcan
@ekluzek ekluzek added enhancement new capability or improved behavior of existing capability tag: enh - new science labels Jan 20, 2024
@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Feb 27, 2024

Notes from discussing with Will and Gordon last week:

Add hillslope hydrology parameterization

Changes include multiple soil columns per vegetated landunit, additional meteorological downscaling, new subsurface lateral flow equations, and a hillslope routing parameterization.
@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Mar 2, 2024

Attempting the 3rd checkbox above:
./create_newcase --case ~/cases_plumber2/ZM-Mon_test2_slevis --res CLM_USRDAT --compset HIST_DATM%1PT_CLM51%BGC_SICE_SOCN_SROF_SGLC_SWAV_SESP --run-unsupported --user-mods-dir PLUMBER2/ZM-Mon

The create_newcase ends with errput: ERROR: No variable PLUMBER2SITE found in case

This may not be a fatal error, so I'm proceeding with

./case.setup
./case.build
./case.submit

The first attempt failed due to the presence of quotes after the equal sign for CLM_USRDAT.PLUMBER2:datafiles in user_nl_datm_streams.

The second attempt failed with

(GETFIL): failed getting file from full path: /glade/u/home/wwieder/CTSM/tools/site_and_regional/subset_data_single_point/surfdata_1x1_PLUMBER2__hist_16pfts_Irrig_CMIP6_simyr2000_c231005.nc                     

due to the missing "ZM-Mon" in the fsurdat file name.

Third attempt SUCCESSFUL.

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 2, 2024

@slevis-lmwg the setting of PLUMBER2SITE should have come from the update from #2155 so I'm doubtful this will work. But, you should also check that the update from #2155 came in correctly.

@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Mar 2, 2024

GPP_ML and LWP_ML appear in history, though I don't know whether they contain valid data. I checked the corresponding checkbox for now, because the simulation completes.

@slevis-lmwg
Copy link
Contributor Author

Gordon, I created a new case for US-NR1 and submitted a run. It fails with this error:

Attempting to initialize multilayer canopy vertical structure .....
ENDRUN:
ERROR:  ERROR: initVerticalStructure: ncan > nlevmlcan

The site that I tested last week, ZM-Mon, completed a year without error. Maybe Niwot Ridge has no trees and triggers this error?

@gbonan
Copy link
Collaborator

gbonan commented Mar 6, 2024 via email

@wwieder
Copy link
Contributor

wwieder commented Mar 6, 2024

Maybe one other question, were you able to run a 'normal' SP or BGC case at US-NR1 before turning on the ML canopy?

I do know that the default surface dataset for this site has a very high LAI for evergreen needleleaf trees, but I don't know what is being assigned for HTOP?

@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Mar 7, 2024

Thank you for these comments, Gordon and Will.

I have switched the case from BGC to SP and got past the initialization error.
The model now completes 4 time steps and triggers this error:

(shr_strdata_readstrm) reading file ub: /glade/work/oleson/PLUMBER2/input_files/US-NR1/LAI_stream_US-NR1_1999-2014.nc       6
 
 hist_htapes_wrapup : Writing current time sample to local history file 
 ./US-NR1_sp.clm2.h1.1999-01-01-25200.nc at nstep =            4 
  for history time interval beginning at   6.250000000000000E-002  and ending at
    8.333333333333333E-002
 
(shr_orb_params) ------ Computed Orbital Parameters ------
[...]
(shr_orb_params) -----------------------------------------
(shr_strdata_readstrm) reading file ub: /glade/work/oleson/PLUMBER2/input_files/US-NR1/LAI_stream_US-NR1_1999-2014.nc       7
 ENDRUN:
 ERROR: ERROR: Norman: total longwave conservation error

I like @wwieder's suggestion that I confirm next whether this runs with the multilayer canopy OFF:

  • I checked out the PLUMBERcsv branch and submitted the same case.
  • This is out to nstep > 1000, so it seems to work.
  • The PLUMBERcsv branch is at dev168 and multilayer_canopy is at dev170, so I updated the former to dev170 and repeated.
  • Seems to work.

@gbonan
Copy link
Collaborator

gbonan commented Mar 7, 2024 via email

! distinction between grid cell (g), column (c), and patch (p)
! variables. All multilayer canopy variables are for patches.

do fp = 1, num_mlcan
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debugging:
Focus on this loop and next ending with line 417.

Copy link
Contributor Author

@slevis-lmwg slevis-lmwg Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This far the "write statements" seem to point to tleaf(p=1, ic=9), which increases out of control from 285 to 1264K by nstep = 10. The trend in tleaf(1,9) is already apparent from nstep > 3.

Likely related... I see lw_source < 0 for ic = 9 and 10. From the tiny fracsun I infer that ic = 9 is nearest to the ground.

What I don't know how to interpret is whether the tridiagonal matrix solver has anything to do with the problem.

Gordon, it may help if you look at the output that I'm looking at.

  • Open the last lnd.log file in:
    /glade/derecho/scratch/slevis/US-NR1_sp/run
  • The write statements are in
/glade/work/slevis/git/multilayer_canopy/src/multilayer_canopy/src/multilayer_canopy/MLCanopyFluxesMod.F90
/glade/work/slevis/git/multilayer_canopy/src/multilayer_canopy/src/multilayer_canopy/MLLongwaveRadiationMod.F90

I think you can type git diff in /glade/work/slevis/git/multilayer_canopy to see clearly what I did.

@gbonan
Copy link
Collaborator

gbonan commented Mar 18, 2024 via email

@slevis-lmwg
Copy link
Contributor Author

Ok, thank you, Gordon. You can go back to the same place to see my code changes.

The lnd.log files (including the latest one) now appear in /glade/derecho/scratch/slevis/archive/US-NR1_sp/logs because I ran for 3 timesteps and the simulation completed (i.e. didn't crash) and, therefore, copied files to the archive directory.

@gbonan
Copy link
Collaborator

gbonan commented Mar 18, 2024 via email

@slevis-lmwg
Copy link
Contributor Author

Gordon, I commented out the error check right after the lines of code that you suggested and now this works. The simulation is in progress in the same scratch/.../run directory that I mentioned above.

@gbonan
Copy link
Collaborator

gbonan commented Mar 18, 2024 via email

@slevis-lmwg
Copy link
Contributor Author

Gordon, I see the same. Shall we debug that next?

@gbonan
Copy link
Collaborator

gbonan commented Mar 19, 2024 via email

@slevis-lmwg
Copy link
Contributor Author

Hi Gordon,
For now I ran 16 timesteps rather than a month, and the output got moved to the /archive directory:

/glade/derecho/scratch/slevis/archive/US-NR1_sp/lnd/hist
/glade/derecho/scratch/slevis/archive/US-NR1_sp/logs

I used ncdump to make ascii copies of the first timestep from this run and of the first month from the earlier run. My initial concern is that GPP_ML is NaN from the first timestep.

You'll see output from write statements in the latest lnd.log file showing that gpp = NaN because sunlit agross = NaN for ic = 1 to 8. (Sorry I had intended but didn't write out the ic values.) So for starters I would stop adding the non-existent layers to the sum, right?

I will put this aside tomorrow and hope to get back to it on Friday.

@gbonan
Copy link
Collaborator

gbonan commented Mar 21, 2024 via email

@slevis-lmwg
Copy link
Contributor Author

Ok, Gordon, look for the new lnd.log file in the same /archive directory and let me know what you'd like to do next.

About izumi, I did not know that you could run outside the queues (and would be interested in seeing how), but even within the izumi queues there's rarely any wait-time.

@gbonan
Copy link
Collaborator

gbonan commented Mar 22, 2024 via email

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 22, 2024

@gbonan and @slevis-lmwg FYI on running on the login node. You do this using:

./case.submit --no-batch

I remember having some weird subtleties the last time I did it. So if you see that -- let me know and I
can likely help. I don't remember well enough off hand what they were. So just try that and see if it
works.

As @slevis-lmwg says sending to the queue on Izumi is rarely a problem. While it often is on Derecho.

dlai(p,ic) = max(dlai(p,ic), 0.01_r8)
dsai(p,ic) = max(dsai(p,ic), 0.01_r8)
if (dlai(p,ic) > 0._r8) dlai(p,ic) = max(dlai(p,ic), 0.01_r8)
if (dsai(p,ic) > 0._r8) dsai(p,ic) = max(dsai(p,ic), 0.01_r8)
dpai(p,ic) = dlai(p,ic) + dsai(p,ic)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gordon, these are the lines that you recommended changing (410 - 411).

end if
! if (abs(numerical-analytical) > 1.e-06_r8) then
! call endrun (msg='ERROR: CanopyNitrogenProfile: canopy integration error')
! end if
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After changing lines 410-411 in MLCanopyFluxesMod.F90, this error got triggered, so I commented it out, because I do not think that we should expect it to pass anymore.

! introducing minimum dlai and dsai a few lines up from here
! if (abs(totpai - (lai(p)+sai(p))) > 1.e-06_r8) then
! call endrun (msg=' ERROR: MLCanopyFluxes: plant area index not updated correctly')
! end if
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I mentioned before, but this error was triggered by the previous change to lines 410-411. Again, I commented it out because I do not think that we should expect it to pass anymore.

lat = grc%latdeg(g) * pi / 180._r8
lon = grc%londeg(g) * pi / 180._r8
lat = grc%lat(g)
lon = grc%lon(g)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not bring up this one before, either. It's a simplification that does not change answers.

@slevis-lmwg
Copy link
Contributor Author

Gordon, I think that results look better now, but I will let you tell me what you think.

I used the same write statements in the latest lnd.log file, and I made new ncdumps in the corresponding /lnd/hist directory. Both are in the same /archive area as before.

@gbonan
Copy link
Collaborator

gbonan commented Mar 22, 2024 via email

@slevis-lmwg
Copy link
Contributor Author

Ok, I have the same case running on izumi now.

@gbonan
Copy link
Collaborator

gbonan commented Mar 25, 2024 via email

@slevis-lmwg
Copy link
Contributor Author

@gbonan and I met today and completed a step-by-step tutorial on izumi on cloning the ctsm, checking out this PR's branch, creating a new case, and building and running the case.

Gordon, you should receive a github notification that I gave you collaborator permissions so that you may push commits to this PR. When you're ready we will go over instructions for that.

@slevis-lmwg
Copy link
Contributor Author

Though, best if you act on the "collaborator" invitation by the end of the week, because it has an expiration date, I'm pretty sure...

@gbonan
Copy link
Collaborator

gbonan commented Mar 26, 2024 via email

@samsrabin samsrabin added science Enhancement to or bug impacting science and removed enh - new science labels Aug 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new capability or improved behavior of existing capability science Enhancement to or bug impacting science
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Multi-layer canopy code
5 participants