Non-reproducible SSP with NorESM2.0.9 #588

Open
mvdebolskiy opened this issue Nov 8, 2024 · 59 comments
Labels: bug

@mvdebolskiy

Describe the bug
Please provide a clear and concise description of what the bug is.

  • NorESM version: release-noresm2.0.9
  • HPC platform: betzy, fram
  • Compiler (if applicable): intel
  • Compset (if applicable): NSSP585frc2
  • Resolution (if applicable): f19_tn14
  • Error message (if applicable): None

To Reproduce
Steps to reproduce the behavior:

  1. Follow instructions here
  2. Set STOP_N=1,STOP_OPTION=nyears,REST_N=1,REST_OPTION=nyears,RESUBMIT=3
  3. Run ./xmlchange --subgroup case.run JOB_WALLCLOCK_TIME=02:30:00
  4. Submit the case
  5. Wait for the run to finish
  6. Compare the history files with ncdiff

Expected behavior
The results should be bit-for-bit (B4B) identical to, or at least within roundoff of, /nird/projects/NS9560K/noresm/cases/NSSP585frc2_f19_tn14_20191014/.
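
The distinction between B4B and roundoff agreement can be made concrete with a small Python sketch. It assumes the history fields have already been read into flat lists of floats; the function names and the `rel_tol` threshold are hypothetical and are not part of NorESM or NCO.

```python
import math
import struct


def bits(value):
    """Return the IEEE-754 double-precision bytes of a float."""
    return struct.pack("<d", value)


def classify(reference, candidate, rel_tol=1e-12):
    """Classify two equal-length sequences of floats.

    Returns "bfb" if every pair is bitwise identical, "roundoff" if every
    pair agrees within rel_tol (an assumed threshold), else "different".
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(reference, candidate))
    if all(bits(r) == bits(c) for r, c in pairs):
        return "bfb"
    if all(math.isclose(r, c, rel_tol=rel_tol, abs_tol=0.0) for r, c in pairs):
        return "roundoff"
    return "different"
```

A field that differs only in the last bit is reported as "roundoff", while a shift of several kelvin in TREFHT, as seen later in this thread, is reported as "different".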

Screenshots
image
image

Additional context

@adagj copied me to /cluster/projects/nn2345k/olivie/cases-cmip6/NSSP585frc2_f19_tn14_20191014, and I've checked that the PE layouts and env_*.xml files match (apart from batch and mach_specific).

@TomasTorsvik
Contributor

@mvdebolskiy - thanks for reporting, I didn't look at SSP runs when testing.
Could be useful to know if this is specific to SSP585, all SSP simulations or something more general.
I don't have time for testing now, will try to look at this over the weekend.

@oyvindseland

Did you run it on Fram or Betzy?

@mvdebolskiy
Author

@oyvindseland I ran on both. Both differ from the history files on NIRD.

@DirkOlivie
Contributor

The original simulation was on Fram. Has the new one been run with the same number of PEs for the atmosphere (768)?

@mvdebolskiy
Author

@DirkOlivie Yes. I ran on fram too.
I used your case as a reference. A diff between your env_mach_pes.xml and the one in my case shows no differences.

@mvdebolskiy
Author

Also, I ran N1850, and it also differs from the history on NIRD. I wonder if I am doing something wrong.

@mvertens

mvertens commented Nov 8, 2024

Were there SourceMods or xml changes that were used in the runs on NIRD? Are the case directories still around?

@mvdebolskiy
Author

NSSP585frc2_f19_tn14_20191014 uses MICOM2 instead of blom.
Main differences in namelists are in ocn_in.
atm_in for release-noresm2.0.9 has aerosol tables while NSSP585frc2_f19_tn14_20191014 does not have them.
This is the README from Dirk's old case:

2019-10-13 21:26:53: ./create_newcase --case /cluster/projects/nn2345k/olivie/cases-cmip6/NSSP585frc2_f19_tn14_20191014 --res f19_tn14 --compset NSSP585frc2 --mach fram --project nn2345k --user-mods-dir cmip6_noresm_hifreq_xaer --run-unsupported
 ---------------------------------------------------
2019-10-13 21:26:53: Compset longname is SSP585_CAM60%NORESM%FRC2_CLM50%BGC-CROP_CICE%NORESM-CMIP6_MICOM%ECO_MOSART_SGLC_SWAV_BGC%BDRDDMS
 ---------------------------------------------------
2019-10-13 21:26:53: Compset specification file is /cluster/projects/nn2345k/olivie/noresm-cmip6/noresm2-20191012-scenarios/cime_config/config_compsets.xml
 ---------------------------------------------------
2019-10-13 21:26:53: Pes     specification file is /cluster/projects/nn2345k/olivie/noresm-cmip6/noresm2-20191012-scenarios/cime_config/config_pes.xml
 ---------------------------------------------------
2019-10-13 21:26:53: Forcing is 
 ---------------------------------------------------
2019-10-13 21:26:53: Using None coupler instances
 ---------------------------------------------------
2019-10-13 21:26:53: Component ATM is CAM cam6 physics:
 ---------------------------------------------------
2019-10-13 21:26:53: ATM_GRID is 1.9x2.5
 ---------------------------------------------------
2019-10-13 21:26:53: Component LND is clm5.0:BGC (vert. resol. CN and methane) with prognostic crop:
 ---------------------------------------------------
2019-10-13 21:26:53: LND_GRID is 1.9x2.5
 ---------------------------------------------------
2019-10-13 21:26:53: Component ICE is Sea ICE (cice) model version 5 :with NORESM modifications appropriate for CMIP6 experiments
 ---------------------------------------------------
2019-10-13 21:26:53: ICE_GRID is tnx1v4
 ---------------------------------------------------
2019-10-13 21:26:53: This component includes user_mods /cluster/projects/nn2345k/olivie/noresm-cmip6/noresm2-20191012-scenarios/components/cice/cime_config/usermods_dirs/noresm-cmip6
 ---------------------------------------------------
2019-10-13 21:26:53: Component OCN is MICOM default:MICOM/Ecosystem:
 ---------------------------------------------------
2019-10-13 21:26:53: OCN_GRID is tnx1v4
 ---------------------------------------------------
2019-10-13 21:26:53: Component ROF is MOSART: MOdel for Scale Adaptive River Transport
 ---------------------------------------------------
2019-10-13 21:26:53: ROF_GRID is r05
 ---------------------------------------------------
2019-10-13 21:26:53: Component GLC is Stub glacier (land ice) component
 ---------------------------------------------------
2019-10-13 21:26:53: GLC_GRID is null
 ---------------------------------------------------
2019-10-13 21:26:53: Component WAV is Stub wave component
 ---------------------------------------------------
2019-10-13 21:26:53: WAV_GRID is null
 ---------------------------------------------------
2019-10-13 21:26:53: ESP_GRID is None
 ---------------------------------------------------
2019-10-13 21:26:53: INFORMATION ABOUT YOUR GIT VERSION CONTROL SYSTEM :
 ---------------------------------------------------
2019-10-13 21:26:53: remote branch:origin	https://DirkOlivie@github.com/metno/noresm-dev.git (fetch)
origin	https://DirkOlivie@github.com/metno/noresm-dev.git (push)
 ---------------------------------------------------
2019-10-13 21:26:53: git branch:* featureCESM2.1.0-OsloDevelopment 11073fc [origin/featureCESM2.1.0-OsloDevelopment] Merge branch 'featureCESM2.1.0-OsloDevelopment' of https://github.com/metno/noresm-dev into featureCESM2.1.0-OsloDevelopment
  master                           0e21727 [origin/master] settings for parallel IO and processors count for quarter degree ocean with HAMOCC
 ---------------------------------------------------
2019-10-13 21:26:53: git log:commit 11073fc88428f330dc17fb6762a78ce4992c560b
Merge: ede8130 1888282
Author: Alf Kirkevåg <alf.kirkevag@met.no>
Date:   Fri Oct 11 11:33:26 2019 +0200

    Merge branch 'featureCESM2.1.0-OsloDevelopment' of https://github.com/metno/noresm-dev into featureCESM2.1.0-OsloDevelopment
 ---------------------------------------------------

@JorgSchwinger
Contributor

@mvdebolskiy @oyvindseland @DirkOlivie @mvertens

I have set up and tested a scenario that was originally run on betzy in 2021 (NSSP534frc2_f19_tn14_20210427). This is branched from NSSP585frc2_f19_tn14_20191014 (hybrid restart) at 2040-01-01. The original simulation was done with release2.0.5.

I get bit-for-bit identical results with release2.0.9, also after a restart.

Did we ever test bit-for-bit reproducibility on fram after there was an upgrade some time ago?

@mvdebolskiy
Author

@JorgSchwinger I am not comparing to 2.0.5, but rather to cmip6 simulations made in 2019.

@JorgSchwinger
Contributor

Yes, I know, but 2.0.5 IS (one of the versions of) the CMIP6 code of the model. It is (should be) bit-for-bit compatible with releases 2.0.0-2.0.4 (the SSP5-3.4 I tested is a CMIP6 simulation).

@mvdebolskiy
Author

I am comparing my simulations against what is listed here and located in /projects/NS9560K/noresm/cases.

@JorgSchwinger
Contributor

My point is: I'm not sure we should expect to be able to reproduce old fram simulations bit-for-bit. We tested bfb regularly on betzy, and my test shows that 2.0.9 still gives bfb for CMIP6 simulations run on betzy.

@mvdebolskiy
Author

But where can I find new simulations that were done on betzy? Also, are they submitted to ESGF?

@DirkOlivie
Contributor

Hi Matvey, a CMIP6 experiment run on Betzy long ago (December 2020) is
/cluster/projects/nn9560k/olivie/cases-cmip6/NF1850norbc_f19_20201226 (the data is stored on nird). Are you looking for something like that?
I can try to rerun it with NorESM2.0.9 on Betzy (not today), but you can also try it yourself if you like. Best regards, Dirk

@oyvindseland

Since I had a 2.0.9 setup up and running, I tested it. The case
/cluster/projects/nn2345k/oyvinds/NorESM2-CMIP6/cases/NF1850norbc_test209_20241112
reproduces Dirk's case BFB.

I do not know why there is a SourceMods subroutine in the case, so I copied that file to my case folder as well.

@JorgSchwinger
Contributor

The NSSP534frc2_f19_tn14_20210427 is on nird in /projects/NS9560K/noresm/cases/

It has also been published on ESGF (ssp534-over)

(I already tested that it is bfb with release 2.0.9)

@mvdebolskiy
Author

> The NSSP534frc2_f19_tn14_20210427 is on nird in /projects/NS9560K/noresm/cases/
>
> It has also been published on ESGF (ssp534-over)
>
> (I already tested that it is bfb with release 2.0.9)

Can you point to your case?
Because I suspect that the instructions in the docs which I have followed are not correct.

@JorgSchwinger
Contributor

On betzy:

/cluster/projects/nn2345k/schwinger/cases/NSSP534frc2_f19_tn14_20210427

@mvdebolskiy
Author

mvdebolskiy commented Nov 19, 2024

I do not have access to nn2345k. Whoever is the PI there has to add me, or you can copy the case to nn9560k.

Also, are you cloning your old cases?

@JorgSchwinger
Contributor

Ok, copied:

/cluster/projects/nn9560k/schwinger/cases/NSSP534frc2_f19_tn14_20210427

I didn't use the clone command. I executed the create_newcase command as found in the README.case file.

@oyvindseland

Not a scenario, but when I tested Dirk's simulation above I also used create_newcase, not clone.

@mvdebolskiy
Author

mvdebolskiy commented Nov 20, 2024

@JorgSchwinger
I am looking at your case:

grep REFCASE env*

env_run.xml:    <entry id="RUN_REFCASE" value="NSSP585frc2_f19_tn14_20191014">
env_run.xml:    <entry id="GET_REFCASE" value="FALSE">
env_run.xml~:    <entry id="RUN_REFCASE" value="case.std">
env_run.xml~:    <entry id="GET_REFCASE" value="FALSE">
grep REFDATE env*
env_run.xml:    <entry id="RUN_REFDATE" value="2040-01-01">
env_run.xml~:    <entry id="RUN_REFDATE" value="0001-01-01">

I will try that. Can you try to make a case that starts in 2015 from a history run?

@oyvindseland can you put your casedir into /cluster/projects/nn9560/ on fram?
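
The grep check above can be extended to every entry when comparing two cases. The following Python sketch assumes only what the grep output shows, namely `<entry id="..." value="...">` elements; the wrapping `<file>` and `<group>` elements in the sample strings and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample values taken from the grep output above; the wrapper elements are assumed.
CASE_XML = """<file><group>
  <entry id="RUN_REFCASE" value="NSSP585frc2_f19_tn14_20191014"/>
  <entry id="GET_REFCASE" value="FALSE"/>
  <entry id="RUN_REFDATE" value="2040-01-01"/>
</group></file>"""

BACKUP_XML = """<file><group>
  <entry id="RUN_REFCASE" value="case.std"/>
  <entry id="GET_REFCASE" value="FALSE"/>
  <entry id="RUN_REFDATE" value="0001-01-01"/>
</group></file>"""


def entry_values(xml_text):
    """Map each <entry> id attribute to its value attribute."""
    root = ET.fromstring(xml_text)
    return {e.get("id"): e.get("value") for e in root.iter("entry")}


def diff_entries(xml_a, xml_b):
    """Return {id: (value_a, value_b)} for entries whose values differ."""
    a, b = entry_values(xml_a), entry_values(xml_b)
    return {key: (a.get(key), b.get(key))
            for key in sorted(set(a) | set(b))
            if a.get(key) != b.get(key)}


if __name__ == "__main__":
    for key, (old, new) in diff_entries(BACKUP_XML, CASE_XML).items():
        print(f"{key}: {old} -> {new}")
```

An empty result from `diff_entries` would correspond to the "no differences" outcome reported for env_mach_pes.xml earlier in the thread.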

@TomasTorsvik
Contributor

@mvdebolskiy, @gold2718 - we didn't use the test framework for NorESM tags until the 2.0.8 release. I see that Steve made a test in 2023 that is probably 2.0.7, judging by the case name noresm_v7_cam6_3_123. We don't have new baseline runs for 2.0.9, so I will create them now.

@adagj
Contributor

adagj commented Dec 10, 2024

Dear all,
@oyvindseland @DirkOlivie @matsbn @TomasTorsvik @mvdebolskiy @JorgSchwinger @MichaelSchulzMETNO

I have compared the SSP585 simulation Evelien ran (NSSP585frc2_f19_tn14_20241129) with the CMIP6 version (NSSP585frc2_f19_tn14_20191014). The differences in the averages over the years 2070-2099 are much larger than I would expect.

| Variable | NSSP585frc2_f19_tn14_20241129 | NSSP585frc2_f19_tn14_20191014 | Difference (20241129 - 20191014) | RMSE |
| --- | --- | --- | --- | --- |
| RESTOM | 4.370 | 2.333 | 2.037 | 12.498 |
| RESSURF | 4.316 | 2.323 | 1.993 | 14.461 |
| CLDHGH | 49.909 | 33.466 | 16.443 | 22.291 |
| CLDLOW | 39.805 | 38.516 | 1.289 | 8.593 |
| CLDMED | 17.944 | 18.440 | -0.496 | 4.926 |
| CLDTOT | 71.175 | 60.724 | 10.452 | 15.588 |
| TREFHT | 295.146 | 291.050 | 4.097 | 4.801 |
| TS | 296.227 | 292.184 | 4.043 | 4.857 |
| TS_LAND | 294.710 | 288.026 | 6.684 | 7.045 |
| LWCF | 33.399 | 23.032 | 10.367 | 15.676 |
| SWCF | -55.304 | -48.742 | -6.562 | 13.643 |

Link to table

The new simulation is warming much faster:

Other notable differences include:

Links to the diagnostics:

@oyvindseland

Hi
Given those cloud values in NSSP585frc2_f19_tn14_20241129, it looks like you are not using the NorESM tuning / namelist values. It looks more like CESM2 default values. The cloud forcing values are very different from anything I have ever seen in NorESM.
Have you checked CaseDocs?

@adagj
Contributor

adagj commented Dec 10, 2024

The only cloud-related difference I see in atm_in is that the CAM user namelist for NSSP585frc2_f19_tn14_20241129 adds:

micro_mg_falspeed_factor = 1.0 
micro_mg_falspeed_temp = 238.15 

which I don't see in NSSP585frc2_f19_tn14_20191014 atm_in.

I have copied both cases to /datalake/NS9560K/adagj/ on NIRD. Feel free to dig!

@DirkOlivie @oyvindseland

@mvdebolskiy
Author

@adagj The clouds nl options do not change anything. It's just a scaling factor for ice fallout that is equal to 1 (does not scale anything).

@adagj
Contributor

adagj commented Dec 10, 2024

Maybe these figures can provide some hints? The cloud cover is really different, even in the first year...

@oyvindseland @DirkOlivie

I also compared the CMIP6 SSP585 simulations from NorESM2-LM and NorESM2-MM, and those are quite similar - as expected: https://ns2345k.web.sigma2.no/datalake/diagnostics/noresm/adagj/NSSP585frc2_f09_tn14_20200919/CAM_DIAG/yrs2070to2099-NSSP585frc2_f19_tn14_20191014-yrs2070to2099/set1/table_GLBL_ANN.asc

@oyvindseland

Are the results including the log files available somewhere? @mvdebolskiy @adagj

@oyvindseland

Sorry I did not see until now that Ada also copied the results.

@adagj
Contributor

adagj commented Dec 11, 2024

> Are the results including the log files available somewhere? @mvdebolskiy @adagj

/datalake/NS9560K/adagj/NSSP585frc2_f19_tn14_20241129

@adagj
Contributor

adagj commented Dec 11, 2024

Something funky is happening at high latitudes. Maybe we should look into the sea ice?

This figure shows the difference between MM and LM, for the first 5 years
image

And this is the same comparison, but between the new simulation and LM:
image

@oyvindseland

It looks like the simulation is done with a very different code version though?
From README.case
git branch:* (HEAD detached at cime5.6.10_NorESM2_3_r5)

So an early version of NorESM2.3 ?

@oyvindseland

No, sorry. I checked further and it looks fine. The cime commit was a bit misleading.

@oyvindseland

> The clouds nl options do not change anything. It's just a scaling factor for ice fallout that is equal to 1 (does not scale anything).

The options do not exist in the standard code, though. Is there any extra code, or is it just preparation for new code?

@mvdebolskiy
Author

> The options do not exist in the standard code, though. Is there any extra code, or is it just preparation for new code?

It's just changes for the GEOMIP. I've checked the scaling factor, and it's b4b with 2.0.9 when it's 1.

@oyvindseland

I am looking at the first month, and the signal in the high clouds is very clear even then.
There does not seem to be any signal in aerosol concentration, number, or radiative properties.
I cannot see any noticeable impact on water droplet or ice number in the aerosol freezing temperature regime, i.e. temperatures warmer than 233 K.

@adagj
Contributor

adagj commented Dec 11, 2024

I checked ch4vmr and co2vmr, and those are the same @DirkOlivie

@oyvindseland

I checked a simulation that Dirk made based on the same code as the CMIP6 version with enough changes to make it run, and the results were comparable to the CMIP6 simulation. @DirkOlivie

@DirkOlivie
Contributor

DirkOlivie commented Dec 11, 2024

@oyvindseland The simulation (NSSP585frc2_f19_tn14_20241121) was actually done with NorESM2.0.9 (I initially told you that it was with the original CMIP6-code, but that was not the case - sorry for the confusion). This simulation has simulated 1 year (a 2nd year is submitted and in the queue on fram).

On fram the simulation is in:
/cluster/projects/nn2345k/olivie/cases-cmip6-test/NSSP585frc2_f19_tn14_20241121

On nird the first year of the simulation is in:
/datalake/NS2345K/olivie/noresm/cases-group/fram-test/NSSP585frc2_f19_tn14_20241121

@adagj
Contributor

adagj commented Dec 11, 2024

The run log from Evelien: run_environment.txt.1039657.241205-070533.txt

@oyvindseland

> The simulation (NSSP585frc2_f19_tn14_20241121) was actually done with NorESM2.0.9 (I initially told you that it was with the original CMIP6-code,
So an even better result then actually.
@mvdebolskiy Perhaps the best solution is just to rebuild the experiment from a clean clone.

@adagj
Contributor

adagj commented Dec 11, 2024

Hi all,
I have compared the Externals.cfg files, and there is one difference. For CAM Evelien is pointing to her fork:

[cam]
branch = feature/cirrus-speed-scaling
protocol = git
repo_url = https://github.com/EvelienvanDijk/CAM
local_path = components/cam
required = True

As far as I can see, this is a copy of the 2.0.9 tag (cam_cesm2_1_rel_05-Nor_v1.0.5), except for the new commit EvelienvanDijk/CAM@d6508a1, which, given the user_nl_cam options (listed above), shouldn't have an impact. But maybe it does?

@oyvindseland

@oyvindseland

oyvindseland commented Dec 12, 2024

This may at least give a difference in micro_mg2.

Old code:

    falouti = fi(i,:) * dumi(i,:)
    faloutni = fni(i,:) * dumni(i,:)

Note that falouti and faloutni are arrays, for example:

    real(r8) :: falouti(nlev)

New code:

    do k = 1,nlev
       falouti = fi(i,k) * dumi(i,k)
       faloutni = fni(i,k) * dumni(i,k)
       if (t(i,k) < micro_mg_falspeed_temp) then
          falouti = fi(i,k) * dumi(i,k) * micro_mg_falspeed_factor
          faloutni = fni(i,k) * dumni(i,k) * micro_mg_falspeed_factor
       endif
    end do

The result, as far as I can see, is that falouti is constant for all k values, equal to fi(i,nlev)*dumi(i,nlev).
If you replace falouti and faloutni with falouti(k) and faloutni(k) inside the loop, I think it should be fine.
This is just one example. You need to check all similar structures that you have changed.
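
The effect can be illustrated with a small Python emulation. In Fortran, assigning a scalar to a whole array sets every element, which the sketch mimics with a slice assignment. The function names and input values are made up for illustration, and the functions are a simplified stand-in for the micro_mg2 code, not a copy of it.

```python
def fallout_old(fi, dumi):
    """Old code: element-wise product over all levels."""
    return [f * d for f, d in zip(fi, dumi)]


def fallout_as_posted(fi, dumi, t, temp_limit, factor):
    """New code as posted: each pass overwrites the WHOLE array with a scalar."""
    nlev = len(fi)
    falouti = [0.0] * nlev
    for k in range(nlev):
        # Emulates Fortran "falouti = fi(i,k) * dumi(i,k)" with falouti an array.
        falouti[:] = [fi[k] * dumi[k]] * nlev
        if t[k] < temp_limit:
            falouti[:] = [fi[k] * dumi[k] * factor] * nlev
    return falouti


def fallout_indexed(fi, dumi, t, temp_limit, factor):
    """Suggested fix: assign to element k only."""
    nlev = len(fi)
    falouti = [0.0] * nlev
    for k in range(nlev):
        falouti[k] = fi[k] * dumi[k]
        if t[k] < temp_limit:
            falouti[k] = fi[k] * dumi[k] * factor
    return falouti


if __name__ == "__main__":
    fi = [1.0, 2.0, 3.0]          # illustrative values only
    dumi = [10.0, 10.0, 10.0]
    t = [220.0, 240.0, 260.0]
    print(fallout_old(fi, dumi))
    print(fallout_as_posted(fi, dumi, t, 238.15, 1.0))
    print(fallout_indexed(fi, dumi, t, 238.15, 1.0))
```

With a factor of 1, the indexed version matches the old code at every level, whereas the posted version carries the last level's value at every level. That is consistent with results changing even though the namelist factor itself does no scaling.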

@mvdebolskiy
Author

mvdebolskiy commented Dec 12, 2024

Oh, right. Thanks for catching that.
Also, I'll upload unchanged 2.0.9 results as soon as fram compute nodes are up.

@mvdebolskiy
Author

@adagj can you run diagnostics on this one: /nird/datalake/NS9560K/mdeb/noresm2.0/NSSP585frc2_f19_tn14_fram_release-noresm2.0.9_20241213? It runs until 2035.

@adagj
Contributor

adagj commented Dec 19, 2024

Yes,

The simulations are not BFB, but they are very similar and, I would say, within the variability range, although I haven't conducted any proper analysis...

@DirkOlivie @mvdebolskiy @oyvindseland

@oyvindseland

Good. Then it looks like the issue is finally resolved?

@TomasTorsvik
Contributor

@oyvindseland @mvdebolskiy
Is the "old code" in micro_mg2 still the default in NorESM2.0.9? If so, I would agree that the issue is resolved.

@mvdebolskiy
Author

@TomasTorsvik I am running the same case with iimpi, will ping ada for diagnostics when done, to see if that improves things.

@oyvindseland

@mvdebolskiy Are you still trying to get it bfb, or do you want to run a longer simulation to test significance? One thing you cannot do is draw conclusions about significance from short tests. Even the simplest ensemble set-up, i.e. a change in the last digit of temperature in one grid point, may show local temperature changes of 5-10 degrees in a matter of model days.
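
The sensitivity to a last-digit perturbation can be shown with a toy chaotic system. The Python sketch below uses the logistic map as a stand-in, which is an assumption made purely for illustration; it is not a climate model and says nothing quantitative about NorESM.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole orbit."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


def divergence(x0, perturbation, steps):
    """Absolute difference between a reference and a perturbed orbit."""
    reference = logistic_orbit(x0, steps)
    perturbed = logistic_orbit(x0 + perturbation, steps)
    return [abs(a - b) for a, b in zip(reference, perturbed)]


if __name__ == "__main__":
    diffs = divergence(0.3, 1e-15, 200)
    for step in (0, 10, 20, 40, 60, 100):
        print(step, diffs[step])
```

The two orbits start out indistinguishable and end up completely decorrelated, which is why a short non-bfb run cannot be judged by pointwise differences and needs an ensemble or a longer averaging period.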

@mvdebolskiy
Author

@oyvindseland I know. The dependencies should be backwards-compatible and result in bfb runs. In addition, it does not hurt to see if the variability is smaller in the new run.
