Update CTSM Glacier Dataset - mountain glaciers and how to handle floating ice shelves #1406
Comments
@renerwijn Thank you very much for taking this on, and for your detailed description. I know that this is a side-track from the main work you're trying to do, so I really appreciate the time you have put into this. The issue of how to handle land mask always makes my head hurt, especially for PCT fields like this. I think every time I think about this, I come up with a different answer of what's "right". That said, my current thinking is that you probably don't want to do any adjustment of the pct glacier based on land mask. i.e., I'm not sure if this
is right. But this might warrant a Zoom discussion. I'm not sure which version of the Gardner-Fyke documentation you have read, but I just found a "full" version, which compared with the main version had these additional bullet points under the list of 3-minute datasets:
The second bullet point seems partly relevant to your question, though it may not completely answer it. Let me know if you'd like to schedule a Zoom call. My schedule is pretty open next week. |
@renerwijn, thanks so much for initiating this conversation and including all the details. Like @billsacks, my instinct would be not to adjust pct_glacier based on land mask, but I'm not sure I've thought through the issue completely. I had a question about the Greenland data set. The latest version of Mathieu Morlighem's BedMachine data is version 4, which came out in the past few months: https://nsidc.org/data/IDBMG4. Was there a reason why you chose version 3 instead of version 4? |
@billsacks Thanks for the reply. A Zoom call sounds like a good plan. What about next week Monday or Tuesday (9 AM or 5 PM MDT)?
I agree with you that this is a good point to discuss. I guess that since the land mask only has binary values (0 or 1), it cannot hurt to multiply the ice cover by the land mask, since the ice cover will not change (as long as the land mask indicates there is land, of course). But let's discuss it more. About the full description: we didn't have a full version of Jeremy Fyke's description. Could you share the full description with us? @whlipscomb Good point. I have checked which file I used and it is indeed version 4 instead of version 3, so I have used the most recent update. I guess I got confused because the website of Mathieu Morlighem's ice sheet modeling group (https://sites.uci.edu/morlighem/dataproducts/bedmachine-greenland/) describes BedMachine Greenland v3. But their website links to NSIDC (https://nsidc.org/data/IDBMG4) (same link as yours), where I downloaded the source data, i.e. BedMachine Greenland v4. Maybe the website of the ice sheet modeling group has not been updated yet, and that's why version 3 stuck in my mind. |
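The land-mask point raised above can be illustrated with a minimal sketch. The three-cell transect and its values are hypothetical; only the binary (0 or 1) land mask and the idea of multiplying percent ice cover by it come from the discussion. It suggests that the multiplication leaves grounded ice unchanged but removes ice wherever the mask flags ocean, which is the floating ice shelf case.

```python
import numpy as np

# Hypothetical transect of three cells:
# grounded ice, floating ice shelf (mask says ocean), open ocean.
pct_ice = np.array([100.0, 80.0, 0.0])   # percent ice cover per cell
landmask = np.array([1, 0, 0])           # binary land mask (1 = land, 0 = ocean)

# Multiplying by the mask keeps ice only where the mask indicates land.
masked = pct_ice * landmask

for ice, mask, out in zip(pct_ice, landmask, masked):
    print(f"ice={ice:6.1f}  landmask={mask}  ice*landmask={out:6.1f}")
```

In this toy case the shelf cell drops from 80% to 0%, so whether the multiplication is harmless depends on how the mask treats floating ice.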
A few summary points from today's discussion:
Thanks for summarizing that so concisely, Bill. It's much clearer when you put it that way. During our meeting, my final mapping file completed (AIS bedmachine->30arcsec), so I now have all the ESMF conserve mapping weights required to complete this work, and will list them here for reference. 30arcsec -> 3x3min wgts: GrIS BedMachine -> 30arcsec: AIS BedMachine -> 30arcsec I went ahead and mapped the BedMachine data using the above wgts: As we discussed, I did some jiu-jitsu to generate the AIS wgts (hence the Rotate2RotateBack modifier in the filenames), but it seems to have worked. The grid rotations are analytical, so I was able to recover the lat-lons of the initial grid exactly. Here is what the mask looks like on the 30arcsec grid: And here are the surface elevations: You'll notice that weird white line going through the center of the grid. This seems to always show up when I do grid rotations, and I actually don't know what it is, but it doesn't have any values associated with it. @renerwijn, let me know if this becomes a problem when you're stitching this into the gmted dataset, and I can try another method for generating the AIS mapping weights. [edit - the white lines seem to be a plotting artifact; there are valid values in the array that coincide with these white lines.] |
@billsacks Thanks for the summary. It will help me to finish the job. @adamrher Thanks for the mapping weights and remapped BedMachine data. About the white line in the surface topography, I think I can work around it. Maybe I can use the GMTED topography to fill up the white line area (I assume it is missing data). I will come back to you once I have an issue or completed the job. [edit - I didn't see your edit. In that case I don't think it will be a problem and can just work with the data] |
@billsacks @adamrher @whlipscomb @gunterl Please check and let me know if something needs to be changed. Btw I have added one extra variable called TOPO, which represents the surface topography. Further I could not exactly remember what we agreed, but through the process to generate this dataset I have stitched in the BedMachine mask into the land mask as well (like the topography). @billsacks I have a few questions concerning the generation of the CTSM surface datasets.
[edit - I will complete the README and work flow description later] |
@renerwijn – Thanks for the update. This is a beautiful dataset. When I was plotting some HMA glacier percentages, I noticed a possible double-counting issue. In general, we should have pct_glacier equal to the sum over k of pct_glc_gic, and in nearly all grid cells this is true to within roundoff. But in cell (5406,2364), we have pct_glacier = 100.0, while the sum over k of pct_glc_gic = 104.2. Could you please take a look at this? |
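The consistency check described above can be sketched as follows. This is a minimal illustration on synthetic arrays, not the script that was actually used: the variable names pct_glacier and pct_glc_gic come from the thread, while the array layout (elevation class first), the tolerance, and the function name are assumptions.

```python
import numpy as np

def find_glacier_mismatches(pct_glacier, pct_glc_gic, tol=1e-3):
    """Find cells where pct_glacier differs from the sum over k of pct_glc_gic.

    Assumed shapes: pct_glacier is (nlat, nlon); pct_glc_gic is (nlev, nlat, nlon).
    Returns the (row, col) indices of mismatching cells and the summed field.
    """
    total = pct_glc_gic.sum(axis=0)
    bad = np.abs(total - pct_glacier) > tol
    return np.argwhere(bad), total

# Tiny synthetic example with one double-counted cell (sum = 104.2, total = 100.0).
pct_glacier = np.array([[100.0, 40.0], [0.0, 12.5]])
pct_glc_gic = np.zeros((3, 2, 2))
pct_glc_gic[:, 0, 0] = [50.0, 30.0, 24.2]
pct_glc_gic[:, 0, 1] = [10.0, 20.0, 10.0]
pct_glc_gic[:, 1, 1] = [12.5, 0.0, 0.0]

mismatches, total = find_glacier_mismatches(pct_glacier, pct_glc_gic)
print(mismatches)
```

On a real dataset the two fields would be read from the NetCDF file first; the tolerance would need to be chosen to sit above roundoff.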
@whlipscomb - Thanks for pointing this out. |
So I'm thinking you may want to make a new mapping file. I am unclear about the "mask" / "nomask" options when running ESMF_RegridWeightGen in mkmapdata.sh. @ekluzek is this mask option used when generating the mksrf_glacier_3x3min -> target_grid mapping weights? If so, is it taking the mask from
@renerwijn – Thanks for the quick reply on the glacier double-counting issue. It makes sense that this would be related to the RGI overlaps. |
Yes, you can change the file name in CTSM's bld/namelist_files/namelist_defaults_ctsm_tools.xml – look for mksrf_fglacier in that file. But we can also do that as a last step once everyone is happy with how this looks. Regarding questions (2) and (3) from #1406 (comment), they are closely connected, and like too many other things, the answer is a little more complex than you might hope.... The answer to this depends on whether you are using a recent version of CTSM (ctsm5.1.dev040 or later). In ctsm5.1.dev040, we simplified the process for bringing in new datasets that have different masks. Before that tag, if you introduce a data file with a new mask, you also need to create a new SCRIP grid file with that mask (see the reference to file Based on the error you're getting, I'm guessing you're using a CTSM tag earlier than ctsm5.1.dev040. Things will be easier if you update to a recent version and run mksurfdata_map from there. The downside is that there will be answer changes in other surface dataset fields due to the new method. The changes should be relatively small (unless there are other changes that have happened to mksurfdata_map in the intervening time that I'm not remembering). You will also be an early user of the new method, so it's possible that you'll run into other issues, but I'm happy to help if you do. I'm not sure how important it is that you avoid changes in other fields for this work. One approach could be: try with the new tag, then compare all fields in the new and old surface dataset to see the magnitude of changes (we have a tool to do this quickly if you don't), and reconsider if we're seeing too large changes in other fields. Let me know how you'd like to proceed. |
By the way, sorry for not bringing this to your attention earlier. I sometimes forget that people aren't always using the bleeding-edge version of the code.... |
@billsacks Thanks, I managed to install the newest tag (ctsm5.1.dev047). I will give it a try over the weekend or after the weekend. @whlipscomb @adamrher @gunterl @billsacks |
@renerwijn – The glacier dataset looks fantastic. I'm so glad to have an updated version. Are you planning to make a document describing the steps used to make the dataset? As we've learned, this can be very helpful when it's time for a future update. I was wondering if it would make sense to add this data set to the DASH repository and create a doi, with you as the lead author. Then the data set would be easily citable, and you're more likely to receive due credit. DASH info here: https://dashrepo.ucar.edu/ |
@whlipscomb - Yes I am planning to make a document describing the steps used to make the dataset. As a matter of fact I already have a README file. I can use that one to finish the dataset description and its workflow. About the DASH repository, I guess that is a good idea, but maybe it is better to discuss which files need to be preserved. During my last meeting with @adamrher @billsacks @gunterl there was already some discussion on this point. @billsacks - As I already mentioned I was able to install the newest version of CTSM and its tools (ctsm5.1.dev047). I was able to create the new mapping files and subsequently to run ./mksurfdata.pl -res usrspec -usr_gname HMACUBIT -usr_gdate 210712 -y 1979 -glc_nec 36. This time the aforementioned issue did not pop up, however I ran into another issue. It seems while processing the mkglacierregion the mapping areas are not conserved (see screenshot). Do you have an idea how this issue can be solved? |
@renerwijn – Sounds good. When your dataset description is done, we can talk about what's appropriate to submit to DASH. For now, I think the highest priority is to resolve the mapping issue and then restart the HMA runs with the new glacier dataset. Thanks! |
@renerwijn regarding the mapping error: This is weird: it's saying that areas on the output grid are 0 and on the input grid are negative. How did you create this mapping file – using CTSM's mkmapdata or some other tool? It seems strange that you're hitting this error for glacier region but not for other fields. Did you use the same process for all of them? I wonder: If you temporarily delete these lines: CTSM/tools/mksurfdata_map/src/mksurfdat.F90 Lines 601 to 603 in 05c1eae
is this same error encountered for any other fields? Let me know if you'd like to schedule a time to talk more about this, if you remain stuck. |
@billsacks right now we have the HMA grid installed in a sandbox that uses tag ctsm1.0.dev105. Is that going to be a problem if we use surface datasets generated from this more recent ctsm5.1.dev047? |
I think that should be okay, but I'm not 100% sure. @ekluzek do you know? If you want to return to the original attempt of using ctsm1.0.dev105 to generate the surface dataset (which might make sense anyway, given the new error @renerwijn is getting), then that should work as long as the scrip grid file you create for the 3min glacier dataset has a mask field that agrees with the mask on the dataset. It has been a while since I've created a new raw dataset, so I may be forgetting some details, but I can try to help with this if you need help (though I may be out in the second half of this week). |
I gave this a shot just to increase Rene's options. I used the ncl routine rectilinear_to_SCRIP and included landmask from the new 3x3 glacier dataset w/ this option: https://www.ncl.ucar.edu/Document/Functions/ESMF/rectilinear_to_SCRIP.shtml#GridMask. It's here: In the meantime I'm running mkmapdata scripts using ctsm5.1.dev047 to see if I can't at least generate the default fsurdat for the HMA grid (not using the new glacier dataset). So far it's working. I'll update this thread with whether I'm successful or not. |
@renerwijn I was able to generate an HMACUBIT fsurdat file using ctsm5.1.dev047. I did not use the new glacier dataset or the modified glacier region file w/ the HMA glacier region. The only issue I had is I needed to modify mkmapdata.sh since it didn't recognize the If you want to see my regridbatch.sh or other files in the mkmapdata and mksurfdata_map directories, you can peruse them here: |
@billsacks @adamrher @whlipscomb @gunterl @billsacks @adamrher - Thanks for the different solutions. Actually, I also used regridbatch.sh to generate the mapping files when using ctsm5.1.dev047. The only thing I did differently is that I created an environment variable for CSMDATA that points to the Cheyenne path. I was able to produce the surface datasets with the old files, but not with the new ones. In parallel, I installed an older version, ctsm1.0.dev105, as well. There I created the mapping files using, among others, the scrip file made by @adamrher. After that I was able to generate the surface dataset successfully. The dataset can be found in /glade/scratch/wijngaard/glacier_dataset/glacier2netcdf/surfdata_HMACUBIT_hist_78pfts_CMIP6_simyr1979_c210713.nc I have already checked the new version for its mean elevation differences between glaciers and grid cells. I have attached one screenshot below with the differences between the mean glacier elevation and the mean grid cell elevation. As you can see, the update did its work, and the discrepancies that were most pronounced in the southeastern part of High Mountain Asia have disappeared. |
I am super happy with this new dataset! Thank you so much, Rene. I think we should focus now on resuming our HMA runs that use the older ctsm tag ... let's chat over email on how best to proceed with this work.
I believe this is the final hurdle to closing this issue: getting the new glacier dataset to work with the new mkmapdata code available in ctsm5.1.dev040 or later. I may be able to look at this later this week, but since we are both mere users of the tools (i.e., mortals), our debugging abilities are limited and we may have to punt to next week when Bill Sacks is back, or until we can get the attention of Erik. In terms of providing documentation, why don't you start by pasting your README into a google doc and share it so we can all comment on it? One issue we have to figure out is what to do with all these massive 50-100GB intermediate files needed to generate the final product. Ideally we can just provide the scripts to reproduce everything so we don't need to keep all these files around. |
Today I created a document and drafted the dataset description and the README. I have sent you an invitation to edit/read the document. As for the amount of data, I have made several compressed files (*tar.gz) including the input glacier/ice sheet outlines, the polygon grids, the glacier grids, and the data required for generating the final dataset. That reduced the amount of space needed significantly. Including the final dataset, but excluding the conservative weighting map, the amount of storage is now 29GB (i.e. 15 GB for the final dataset and 14GB for the remaining files). |
@ekluzek - this final dataset is in place at /glade/p/cesmdata/cseg/inputdata/lnd/clm2/rawdata/mksrf_glacier_3x3min_simyr2000.c20210708.nc . I'm assigning you to this because I assume you'll be the one to do the final piece of bringing this in as the new default, along with other surface dataset updates for ctsm5.2. I have not imported this to the svn repository, because I figured we should make sure everything looks right with a surface dataset generated with this new dataset before importing it. Note that @renerwijn @adamrher and others have looked closely at this, but they have not looked at a surface dataset generated with the latest version of CTSM, because they ran into problems with this latest version. So, when we get to the point of generating new surface datasets, it would be good to get someone to do a quick eyeball check to make sure that this field still looks right with the new surface dataset generation process. |
Actually, I have gone ahead and tried generating a surface dataset out of latest ctsm master with the new mksrf_glacier file, by running
I examined the diffs and/or actual fields (in old & new) for these variables, and this all looks reasonable. I haven't looked carefully, but from some quick looks nothing jumps out as problematic. If anyone else wants to look and sign off on this, the files are here: |
I have now confirmed that the new mksurfdata_esmf has NOT been pointing to the new glacier dataset. I will make the change. |
Does switching to the new glacier dataset eliminate the need for both old datasets, the standard and the mergeGreenland? |
Good question. Yes, I'm pretty sure the plan was for this one dataset to replace both of the old ones. I think we were going to drop support for the mergeGreenland (aka mergeGIS or merge_gis) option, because I don't think it was actually being used. |
I also recall this conversation from Bill, that we couldn't think of a good reason to keep both around. Note that this dataset will only be used over Greenland when running with a stub GLC. When CISM is active (even in NOEVOLVE), the ice mask is currently passed to CLM via CISM, which overrides the ice mask in the surfdat file. Since this new glacier dataset uses the same ice mask as used to initialize CISM (BedMachine), there should be very little difference (if any) between these two separate configurations. |
This issue was addressed for ctsm5.2 purposes in #1732 |
Looking at the PR you linked, the mksurf namelist has replaced the 3x3min glacier dataset with:
Which is the new dataset created in this issue. Is it OK to close this issue before all the surfdat files have been re-generated w/ the new datasets, or has that been done already? |
We have not, yet, generated all the fsurdat files with the new raw data. |
@ekluzek and @slevisconsulting decided that this issue can be closed before generating all the new fsurdat files. |
Background Issue
Over High Mountain Asia, discrepancies were found in the glacier elevation distribution retrieved from the CTSM/CLM glacier dataset (mksrf_glacier_3x3min_simyr2000.c120926.nc). The discrepancies were found in particular over the southeastern part of High Mountain Asia, where the mean glacier elevation was found to be up to 2000 m lower (i.e., around 3000 m) than the mean CLM grid cell elevation (around 5000 m). This gave a reason to find out where these discrepancies came from; they were eventually found to be related to 1) inaccuracies in the Randolph Glacier Inventory version 1 glacier outlines (relative to RGI version 6) (see screenshot), and 2) potential smoothing issues in the input topography (GLOBE). In addition, the glacier dataset covers elevations up to 6000 m altitude plus one extra bin covering altitudes between 6000 m and 10000 m, whereas the HMA grid works with a 36-EC scheme that covers altitudes up to 7000 m plus one extra EC covering altitudes between 7000 m and 10000 m. This means that part of the glaciers (between 6000 and 7000 m altitude) is not well captured by CLM. For this reason, we decided to update the existing CLM glacier dataset.
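The elevation-bin mismatch described above can be made concrete with a small sketch. Only the upper limits (6000 m and 7000 m, each followed by one catch-all bin up to 10000 m) and the count of 36 elevation classes come from the text; the uniform spacing of the lower bins and the helper name bin_index are hypothetical choices made purely for illustration.

```python
import numpy as np

# Hypothetical uniform lower bins; the real bin edges are not given in the text.
dataset_edges = np.append(np.linspace(0.0, 6000.0, 61), 10000.0)  # top bin: 6000-10000 m
ec36_edges = np.append(np.linspace(0.0, 7000.0, 36), 10000.0)     # 35 bins + 1 = 36 ECs

def bin_index(elev_m, edges):
    """Return the zero-based index of the bin that contains elev_m."""
    return int(np.digitize(elev_m, edges)) - 1

elev = 6500.0  # a glacier elevation between 6000 m and 7000 m
i_data = bin_index(elev, dataset_edges)
i_ec = bin_index(elev, ec36_edges)

print("dataset bin:", dataset_edges[i_data], "-", dataset_edges[i_data + 1], "m")
print("36-EC bin:  ", ec36_edges[i_ec], "-", ec36_edges[i_ec + 1], "m")
```

Under these assumed edges, ice at 6500 m falls into the single 6000 to 10000 m catch-all bin of the dataset, while the 36-EC scheme would resolve it in a regular bin below 7000 m.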
Brief description dataset
The first version of the glacier dataset, developed by Jeremy Fyke, Alex Gardner, and Bill Sacks, uses glacier outlines retrieved from RGI version 1 (Arendt et al., 2012), vector data from the University of Zurich (Rastner et al., 2012) for the Greenland Ice Sheet, and vector data from the SCAR Antarctic Digital Database for the Antarctic Ice Sheet. The topography was retrieved from the 30-arcsec GLOBE topography (Hastings et al., 1999).
The updated version of the glacier dataset uses glacier outlines retrieved from RGI version 6 (Arendt et al., 2017). The vector data for the GrIS and AIS are retrieved from the masks of BedMachine version 3 and version 2 (Morlighem et al., 2017,2019), respectively. The 30-arcsec topography and land mask are retrieved from the 30-arcsec GMTED2010. BedMachine and GMTED2010 were chosen to keep the datasets consistent with what is already used for the CAM topography (GMTED) and within CISM (BedMachine).
Workflow
The workflow is comparable with the workflow that was used to develop the first glacier dataset. The workflow (until this point) consists of the following points:
Issues
In step 5a I mention the correction of the land mask. The land mask of GMTED2010 encompasses 2 possible values/flags: (1) for land and (0) for oceans, seas, and lakes. The generated 30-arcsec ice cover datasets contain the percent ice cover, including those of the floating ice shelves surrounding the GrIS and AIS. The grid cells containing floating ice shelf overlay land mask grid cells that are flagged as ocean. In order to get the total land fraction per grid cell covered by ice I need to multiply the ice cover datasets by the land mask. However, this becomes an issue for the floating ice shelves where the land mask is equal to 0. Therefore, I presume a correction of the land mask is required. My questions are as follows:
Goal
The goal is to use the updated glacier dataset as input for the HMA simulations. Further the updated glacier dataset can serve as an update in general for the input datasets required in the CTSM code base.
People involved: @whlipscomb @gunterl @adamrher