
Enable CDAT-migrated E3SM Diags #651

Status: Open (wants to merge 2 commits into main)
Conversation

forsyth2 (Collaborator)

Issue resolution

Select one: This pull request is...

  • a bug fix: increment the patch version
  • a small improvement: increment the minor version
  • an incompatible (non-backwards compatible) API change: increment the major version

1. Does this do what we want it to do?

Objectives:

  • Enable zppy to call the CDAT-migrated E3SM Diags

Required:

  • Product Management: I have confirmed with the stakeholders that the objectives above are correct and complete.
  • Testing: I have added or modified at least one "min-case" configuration file to test this change. Every objective above is represented in at least one cfg.
  • Testing: I have considered likely and/or severe edge cases and have included them in testing.

If applicable:

  • Testing: this pull request introduces an important feature or bug fix that we must test often. I have updated the weekly-test configuration files, not just a "min-case" one.
  • Testing: this pull request adds at least one new possible parameter to the cfg. I have tested using this parameter with and without any other parameter that may interact with it.

2. Are the implementation details accurate & efficient?

Required:

  • Logic: I have visually inspected the entire pull request myself.
  • Logic: I have left GitHub comments highlighting important pieces of code logic. I have had these code blocks reviewed by at least one other team member.

If applicable:

  • Dependencies: This pull request introduces a new dependency. I have discussed this requirement with at least one other team member. The dependency is noted in zppy/conda, not just an import statement.

3. Is this well documented?

Required:

  • Documentation: by looking at the docs, a new user could easily understand the functionality introduced by this pull request.

4. Is this code clean?

Required:

  • Readability: The code is as simple as possible and well-commented, such that a new team member could understand what's happening.
  • Pre-commit checks: All the pre-commit checks have passed.

If applicable:

  • Software architecture: I have discussed relevant trade-offs in design decisions with at least one other team member. It is unlikely that this pull request will increase tech debt.

@forsyth2 (Collaborator Author)

forsyth2 commented Dec 10, 2024

To do:

  • (currently running) test just e3sm diags on all available sets: zppy -c tests/integration/generated/test_min_case_e3sm_diags_comprehensive_v3_chrysalis.cfg
  • Run the full weekly test suite
  • Merge this PR

Comment on lines 12 to 16
output = "#expand user_output#zppy_weekly_comprehensive_v3_output/#expand unique_id#/#expand case_name#"
partition = "#expand partition_short#"
qos = "#expand qos_short#"
www = "#expand user_www#zppy_weekly_comprehensive_v3_www/#expand unique_id#"
years = "1985:1989:2",
forsyth2 (Collaborator Author)

output and www need updated names!

@forsyth2 (Collaborator Author)

Currently running full weekly test suite (doing this pre-merge because the CDAT migration was a large change, so there's a higher chance of breaking main).

$ python tests/integration/utils.py 
CFG FILES HAVE BEEN GENERATED FROM TEMPLATES WITH THESE SETTINGS:
UNIQUE_ID=test-346-20241210
unified_testing=False
diags_environment_commands=source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate e3sm_diags_main_20241209
global_time_series_environment_commands=source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate zi_dev_weekly_20241122
e3sm_to_cmip_environment_commands=
environment_commands=
Reminder: `e3sm_to_cmip_environment_commands=''` => the environment of the `ts` task will be used
zppy -c tests/integration/generated/test_weekly_comprehensive_v3_chrysalis.cfg
zppy -c tests/integration/generated/test_weekly_comprehensive_v2_chrysalis.cfg
zppy -c tests/integration/generated/test_weekly_bundles_chrysalis.cfg # Runs 1st part of bundles cfg
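The reminder line in that output describes a fallback rule: an empty `e3sm_to_cmip_environment_commands` means the `ts` task's environment is reused. A hypothetical sketch of that resolution logic (`resolve_env_commands` is illustrative, not zppy's actual function):

```python
def resolve_env_commands(task_env_commands: str, fallback_env_commands: str) -> str:
    """Illustrative only: an empty string means 'not set', so the task
    inherits the fallback environment-activation commands (e.g. the `ts`
    task's environment, or the latest E3SM Unified environment)."""
    return task_env_commands if task_env_commands else fallback_env_commands
```

For example, `resolve_env_commands("", "conda activate e3sm_unified")` returns the fallback, while a non-empty per-task setting wins.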

@forsyth2 (Collaborator Author)

Continuing testing from the previous comment gives:

zppy -c tests/integration/generated/test_weekly_bundles_chrysalis.cfg # Runs 2nd part of bundles cfg

cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v3_output/test-346-20241210/v3.LR.historical_0051/post/scripts/
grep -v "OK" *status

cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v2_output/test-346-20241210/v2.LR.historical_0201/post/scripts
grep -v "OK" *status

cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_bundles_output/test-346-20241210/v3.LR.historical_0051/post/scripts
grep -v "OK" *status

# Run integration tests
cd ~/ez/zppy
pytest tests/integration/test_*.py

That gives:

======================================================================= short test summary info ========================================================================
FAILED tests/integration/test_bash_generation.py::test_bash_generation - assert 256 == 0
FAILED tests/integration/test_weekly.py::test_comprehensive_v2_images - AssertionError
FAILED tests/integration/test_weekly.py::test_comprehensive_v3_images - AssertionError
FAILED tests/integration/test_weekly.py::test_bundles_images - AssertionError
============================================================== 4 failed, 10 passed in 1505.03s (0:25:05) ===============================================================

Bash generation test

The `FAILED tests/integration/test_bash_generation.py::test_bash_generation - assert 256 == 0` failure is benign. The diffs arise specifically because the e3sm_diags.bash template gets updated in this PR; i.e., they are expected.

Image check failures

There are a large number of image check diffs in subdirectories of:

But as far as I can tell, the errors are mostly benign. Example errors:

  • Many involve actual having "RMSE" and "CORR" further to the left than in expected.
  • Slight differences in text shape/size -- e.g., actual and expected
  • Slight differences in contours like actual and expected
  • Differences in lat/lon labels -- actual and expected

Some errors seem more concerning:

Conclusions

@chengzhuzhang @tomvothecoder What's the best path forward here? None of the grep -v "OK" *status checks showed errors. That said, there are simply too many diffs in the image check tests to manually check them all. Most of the errors seem benign and likely arise from variations in matplotlib, but I found at least one instance of a more concerning change (noted above).

Do we accept these diffs, merge this PR, and update the expected images? Or do these need to be investigated more thoroughly?

@forsyth2 forsyth2 marked this pull request as ready for review December 10, 2024 19:58
@tomvothecoder (Collaborator)

> @chengzhuzhang @tomvothecoder What's the best path forward here? None of the grep -v "OK" *status checks showed errors. That said, there are simply too many diffs in the image check tests to manually check them all. Most of the errors seem benign and likely arise from variations in matplotlib, but I found at least one instance of a more concerning change (noted above).
>
> Do we accept these diffs, merge this PR, and update the expected images? Or do these need to be investigated more thoroughly?

I think we should expect that most of the differences will be benign. The underlying results from the dev branch won't exactly match the CDAT codebase, which can produce diffs.

Completely different stats and colors: actual and expected

However, any formatting differences (e.g., labels for axes) or major differences like the one you mentioned above should be investigated. I will open a separate GitHub issue in e3sm_diags to investigate the diffs and address the ones that should be fixed.

Afterwards, we can update the expected results and you can re-run the zppy test again.

@forsyth2 (Collaborator Author)

Thanks @tomvothecoder!

@forsyth2 (Collaborator Author)

forsyth2 commented Dec 23, 2024

zppy & post-CDAT-migration e3sm_diags to-do list, for the 1/15 Unified release-candidate deadline:

@tomvothecoder @chengzhuzhang ^FYI for timeline planning

@tomvothecoder (Collaborator)

@forsyth2 I updated your checklist above. I just merged E3SM-Project/e3sm_diags#907 and you should be able to re-test now. Hopefully this fixes the diffs in zppy.

@forsyth2 (Collaborator Author)

@tomvothecoder Thanks! E3SM-Project/e3sm_diags#907 has extensive changes, so I will integrate that into zppy with this PR in the next release candidate.

@forsyth2 force-pushed the issue-346-diags-post-refactor branch 2 times, most recently from f77a75b to 84a2b59 on January 15, 2025 23:36
@forsyth2 force-pushed the issue-346-diags-post-refactor branch from 84a2b59 to c2188bf on January 15, 2025 23:37
@forsyth2 (Collaborator Author)

Since the zppy release was delayed anyway yesterday (from the issues described on #634 (reply in thread)), I tried testing this pull request using the latest E3SM Diags (E3SM-Project/e3sm_diags#907); however, I ran into a number of issues.

# Set up branch
cd ~/ez/zppy
git checkout issue-346-diags-post-refactor

# Set up and test zppy-interfaces env
# zi_dev_weekly_20250114 still exists from weekly testing

(That environment was created in #634 (comment))

# Set up e3sm_diags env
cd ~/ez/e3sm_diags
git status
# Check no file changes will persist when we switch branches.
git fetch upstream
git checkout main
git reset --hard upstream/main
git log
# Confirmed last commit, from 1/15: Address diffs v2.12.1 to v3 (#907)
conda clean --all --y
conda env create -f conda-env/dev.yml -n e3sm_diags_20250115
conda activate e3sm_diags_20250115
pip install .

# Set up zppy env
cd ~/ez/zppy
conda clean --all --y
conda env create -f conda/dev.yml -n zppy_dev_pr651_20250115
conda activate zppy_dev_pr651_20250115
pip install .

# Unit tests for zppy
pytest tests/test_*.py
# 23 passed, 1 warning in 0.35s
# (DeprecationWarning: invalid escape sequence \.)

# Integration testing for zppy

# Edit tests/integration/utils.py:
# UNIQUE_ID = "test_zppy_pr651_20250115"
#        "diags_environment_commands": "source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate e3sm_diags_20250115",
#        "global_time_series_environment_commands": "source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate zi_dev_weekly_20250114",
# Keep this as-is: generate_cfgs(unified_testing=False, dry_run=False)

# Run zppy to produce actual results to compare in the integration tests
python tests/integration/utils.py
# That prints:
# CFG FILES HAVE BEEN GENERATED FROM TEMPLATES WITH THESE SETTINGS:
# UNIQUE_ID=test_zppy_pr651_20250115
# unified_testing=False
# diags_environment_commands=source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate e3sm_diags_20250115
# global_time_series_environment_commands=source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate zi_dev_weekly_20250114
# environment_commands=
# Reminder: `environment_commands=''` => the latest E3SM Unified environment will be used

zppy -c tests/integration/generated/test_weekly_comprehensive_v3_chrysalis.cfg
zppy -c tests/integration/generated/test_weekly_comprehensive_v2_chrysalis.cfg
zppy -c tests/integration/generated/test_weekly_bundles_chrysalis.cfg # Runs 1st part of bundles cfg
# Wait to finish

# Here as of 2025-01-15 end of day
# Picking up 2025-01-16 
cd ~/ez/zppy
lcrc_conda
conda activate zppy_dev_pr651_20250115
pip install .
sq # alias for: squeue -o "%8u %.7a %.4D %.9P %7i %.2t %.10r %.10M %.10l %j" --sort=P,-t,-p -u ac.forsyth2
# The MPAS-Analysis errors:
# USER     ACCOUNT NODE PARTITION JOBID   ST     REASON       TIME TIME_LIMIT NAME
# ac.forsy    e3sm    1   compute 666747  PD Dependency       0:00      30:00 mpas_analysis_ts_1980-1990_climo_1985-1990
# ac.forsy    e3sm    1     debug 666748  PD Dependency       0:00      30:00 global_time_series_1980-1990
# ac.forsy    e3sm    1   compute 666819  PD Dependency       0:00      30:00 mpas_analysis_ts_1980-1990_climo_1985-1990
# ac.forsy    e3sm    1     debug 666820  PD Dependency       0:00      30:00 global_time_series_1980-1990

(These actually appear to be from earlier runs. The MPAS-Analysis errors are noted starting with #634 (reply in thread)).

zppy -c tests/integration/generated/test_weekly_bundles_chrysalis.cfg # Runs 2nd part of bundles cfg
sq
# USER     ACCOUNT NODE PARTITION JOBID   ST     REASON       TIME TIME_LIMIT NAME
# ac.forsy    e3sm    1   compute 666747  PD Dependency       0:00      30:00 mpas_analysis_ts_1980-1990_climo_1985-1990
# ac.forsy    e3sm    1     debug 666748  PD Dependency       0:00      30:00 global_time_series_1980-1990
# ac.forsy    e3sm    1   compute 666819  PD Dependency       0:00      30:00 mpas_analysis_ts_1980-1990_climo_1985-1990
# ac.forsy    e3sm    1     debug 666820  PD Dependency       0:00      30:00 global_time_series_1980-1990
# ac.forsy    e3sm    1   compute 667322  PD   Priority       0:00    7:00:00 ilamb_1985-1986
# ac.forsy    e3sm    1   compute 667323  PD   Priority       0:00    7:00:00 bundle3
# ac.forsy    e3sm    1   compute 667324  PD   Priority       0:00    7:00:00 bundle1

# Notice, bundle1 is rerunning... probably because of failed MPAS-Analysis?

cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v3_output/test_zppy_pr651_20250115/v3.LR.historical_0051/post/scripts/
grep -v "OK" *status
# e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.status:ERROR (9)
# e3sm_diags_atm_monthly_180x360_aave_mvm_model_vs_model_1987-1988_vs_1985-1986.status:RUNNING 666785
# e3sm_diags_lnd_monthly_mvm_lnd_model_vs_model_1987-1988_vs_1985-1986.status:ERROR (9)
emacs e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.o666784 
# slurmstepd: error: *** STEP 666784.0 ON chr-0221 CANCELLED AT 2025-01-15T18:24:58 ***
# Only ran for 44s
emacs e3sm_diags_atm_monthly_180x360_aave_mvm_model_vs_model_1987-1988_vs_1985-1986.o666785 
# slurmstepd: error: *** JOB 666785 ON chr-0222 CANCELLED AT 2025-01-15T23:24:35 DUE TO TIME LIMIT ***
emacs e3sm_diags_lnd_monthly_mvm_lnd_model_vs_model_1987-1988_vs_1985-1986.o666786 
# slurmstepd: error: *** STEP 666786.0 ON chr-0140 CANCELLED AT 2025-01-15T18:24:47 ***
# Only ran for 36s
emacs mpas_analysis_ts_1985-1989_climo_1985-1989.o666787
# All good
emacs mpas_analysis_ts_1985-1995_climo_1990-1995.o666788
# All good
emacs global_time_series_1985-1995.o666789 
# cp: cannot stat 'global_time_series_1985-1995_results/viewer': No such file or directory

cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v2_output/test_zppy_pr651_20250115/v2.LR.historical_0201/post/scripts
grep -v "OK" *status
# e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1982-1983.status:RUNNING 666815
# e3sm_diags_atm_monthly_180x360_aave_mvm_model_vs_model_1980-1981_vs_1980-1981.status:RUNNING 666816
# e3sm_diags_lnd_monthly_mvm_lnd_model_vs_model_1982-1983_vs_1980-1981.status:RUNNING 666817
# global_time_series_1980-1990.status:WAITING 666820
# mpas_analysis_ts_1980-1984_climo_1980-1984.status:ERROR (1)
# mpas_analysis_ts_1980-1990_climo_1985-1990.status:WAITING 666819
emacs e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1982-1983.o666815 
# slurmstepd: error: *** JOB 666815 ON chr-0140 CANCELLED AT 2025-01-15T23:35:35 DUE TO TIME LIMIT ***
emacs e3sm_diags_atm_monthly_180x360_aave_mvm_model_vs_model_1980-1981_vs_1980-1981.o666816
# slurmstepd: error: *** JOB 666816 ON chr-0221 CANCELLED AT 2025-01-15T23:26:35 DUE TO TIME LIMIT ***
emacs e3sm_diags_lnd_monthly_mvm_lnd_model_vs_model_1982-1983_vs_1980-1981.o666817 
# slurmstepd: error: *** JOB 666817 ON chr-0224 CANCELLED AT 2025-01-15T23:24:35 DUE TO TIME LIMIT ***
emacs mpas_analysis_ts_1980-1984_climo_1980-1984.o666818
# KeyError: 'config_am_conservationcheck_enable'
# ...
# There were errors in task climatologyMapArgoSalinity: plotJFM_latlon_depth_-600

# No errors
cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_bundles_output/test_zppy_pr651_20250115/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
# bundle1.status:ERROR
# bundle3.status:ERROR
# e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1985-1986.status:ERROR (9)
# e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.status:ERROR (9)
# e3sm_diags_atm_monthly_180x360_aave_mvm_model_vs_model_1987-1988_vs_1985-1986.status:ERROR (9)
emacs bundle1.o667324 # 2nd run
# === e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1985-1986.bash ===
# ...
# OSError: No file found for '' and 'MAM' in /lcrc/group/e3sm/diagnostics/observations/Atm/climatology/
# ...
# slurmstepd: error: *** STEP 667324.1 ON chr-0074 CANCELLED AT 2025-01-16T11:17:29 ***
emacs bundle3.o667323
# === e3sm_diags_atm_monthly_180x360_aave_mvm_model_vs_model_1987-1988_vs_1985-1986.bash ===
# ...
# OSError: The dataset file has no matching source variables for AODMOM
# ...
# slurmstepd: error: *** STEP 667323.0 ON chr-0224 CANCELLED AT 2025-01-16T11:17:07 ***

@chengzhuzhang (Collaborator)

chengzhuzhang commented Jan 16, 2025

@forsyth2 thank you for continuing to test.
I looked at the zppy error logs.

For e3sm_diags:

  1. There is slurmstepd: error: mpi/pmi2: failed to read PMI1 init command in two tasks:
    atm_monthly_180x360_aave_model_vs_obs,
    atm_monthly_180x360_aave_mvm_model_vs_model
    This one is mysterious to me.

  2. However, the unit conversion error in lnd_monthly_mvm_lnd_model_vs_model_1987-1988 is likely related to the recent refactor that replaced genutil.udunits. I will file an issue in e3sm_diags for a fix.
    This run also failed with a slurm error within one minute:

srun: Job step aborted: Waiting up to 92 seconds for job step to finish.
slurmstepd: error: *** STEP 666786.0 ON chr-0140 CANCELLED AT 2025-01-15T18:24:47 ***

@forsyth2 (Collaborator Author)

Testing with CDAT-migrated E3SM Diags for zppy v3.0.0rc2.

Testing summary

v2 test:

| Commits | num. mismatched images | num. missing images | num. passing images | num. total images |
| --- | --- | --- | --- | --- |
| Both commits -- no more copying files* | 2322 | 81 | 1973 | 4376 |
| Just the first commit | 2299 | 117 | 1960 | 4376 |

v3 test:

| Commits | num. mismatched images | num. missing images | num. passing images | num. total images |
| --- | --- | --- | --- | --- |
| Both commits -- no more copying files* | 2868 | 102 | 2246 | 5216 |
| Just the first commit | 2845 | 138 | 2233 | 5216 |

*Address the issue of file paths (latest comment on that: E3SM-Project/e3sm_diags#866 (comment))

Note: missing images don't show up as image diffs the way mismatched images do. A missing image means we expected that image (it's in the expected results), but there was no actual image to diff against.

There are quite a few errors of both types (missing, mismatched) to consider...

Testing details

Testing steps
# Set up and test zppy-interfaces env
cd ~/ez/zppy-interfaces
git status
# Check no file changes will persist when we switch branches.
git fetch upstream main
git checkout -b test_zi_weekly_20250117 upstream/main
git log
# Last commit, from 1/17: Further PR template updates (#15)
conda clean --all --y
conda env create -f conda/dev.yml -n zi_dev_weekly_20250117
conda activate zi_dev_weekly_20250117
pip install .
pytest tests/unit/global_time_series/test_*.py
# 10 passed in 24.69s

# Set up e3sm_diags env
cd ~/ez/e3sm_diags
git status
# Check no file changes will persist when we switch branches.
git fetch upstream
git checkout main
git reset --hard upstream/main
git log
# Confirmed last commit, from 1/17: Revert "Drop Python 3.9 support (#919)"
conda clean --all --y
conda env create -f conda-env/dev.yml -n e3sm_diags_20250117
conda activate e3sm_diags_20250117
pip install .

With both commits:

# Set up branches
cd ~/ez/zppy
git status
# Check no file changes will persist when we switch branches.
git checkout issue-346-diags-post-refactor
git log
# b90506f7d18b77d41a4a74b5beff413bb6f1fe2b => 1/14 Update PR template (#658)
git fetch upstream
git rebase -i
# Drop the testing commit
# 2 commits:
# Remove symbolic links for ts
# Enable CDAT-migrated E3SM Diags
# Now rebased on 1/17: Further PR template updates (#662)

# Set up zppy env
conda clean --all --y
conda env create -f conda/dev.yml -n zppy_dev_pr651_both_commits_20250117
conda activate zppy_dev_pr651_both_commits_20250117
pip install .

# Unit tests for zppy
pytest tests/test_*.py
# 23 passed in 0.29s

# Integration testing for zppy

# Edit tests/integration/utils.py:
# UNIQUE_ID = "test_pr651_both_commits_20250117"
#        "diags_environment_commands": "source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate e3sm_diags_20250117",
#        "global_time_series_environment_commands": "source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate zi_dev_weekly_20250117",
# Keep this as-is: generate_cfgs(unified_testing=False, dry_run=False)

# Run zppy to produce actual results to compare in the integration tests
python tests/integration/utils.py

zppy -c tests/integration/generated/test_weekly_comprehensive_v3_chrysalis.cfg
zppy -c tests/integration/generated/test_weekly_comprehensive_v2_chrysalis.cfg

cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v3_output/test_pr651_both_commits_20250117/v3.LR.historical_0051/post/scripts/
grep -v "OK" *status
# No errors
cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v2_output/test_pr651_both_commits_20250117/v2.LR.historical_0201/post/scripts
grep -v "OK" *status
# No errors

cd ~/ez/zppy
git checkout issue-346-diags-post-refactor
git branch # Check that we're on issue-346-diags-post-refactor
ls tests/integration/test_*.py
pytest tests/integration/test_bash_generation.py
# 1 failed in 2.15s
# Diffs are expected
pytest tests/integration/test_campaign.py
# 6 passed in 1.93s
pytest tests/integration/test_defaults.py
# 1 passed in 0.32s
pytest tests/integration/test_last_year.py
# 1 passed in 0.26s
pytest tests/integration/test_weekly.py # Skip bundles tests for now
# FAILED tests/integration/test_weekly.py::test_comprehensive_v2_images - AssertionError
# FAILED tests/integration/test_weekly.py::test_comprehensive_v3_images - AssertionError
# 2 failed, 3 skipped in 1351.74s (0:22:31)

# For v2:
# Total number of images checked: 4376
# Missing images: 81
# For v3:
# Total number of images checked: 5216
# Missing images: 102

With first commit only:

# Set up branches
cd ~/ez/zppy
git status
# Check no file changes will persist when we switch branches.
git checkout issue-346-diags-post-refactor
git checkout -b issue-346-diags-post-refactor-1st-commit
git fetch upstream
git rebase -i upstream/main
# Keep only the 1st commit:
# Enable CDAT-migrated E3SM Diags
# Now rebased on 1/17: Further PR template updates (#662)

# Set up zppy env
conda clean --all --y
conda env create -f conda/dev.yml -n zppy_dev_pr651_1st_commit_20250117
conda activate zppy_dev_pr651_1st_commit_20250117
pip install .

# Unit tests for zppy
pytest tests/test_*.py
# 23 passed in 0.23s

# Integration testing for zppy

# Edit tests/integration/utils.py:
# UNIQUE_ID = "test_pr651_1st_commit_20250117"
#        "diags_environment_commands": "source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate e3sm_diags_20250117",
#        "global_time_series_environment_commands": "source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate zi_dev_weekly_20250117",
# Keep this as-is: generate_cfgs(unified_testing=False, dry_run=False)

# Run zppy to produce actual results to compare in the integration tests
python tests/integration/utils.py

zppy -c tests/integration/generated/test_weekly_comprehensive_v3_chrysalis.cfg
zppy -c tests/integration/generated/test_weekly_comprehensive_v2_chrysalis.cfg

cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v3_output/test_pr651_1st_commit_20250117/v3.LR.historical_0051/post/scripts/
grep -v "OK" *status
# No errors
cd /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v2_output/test_pr651_1st_commit_20250117/v2.LR.historical_0201/post/scripts
grep -v "OK" *status
# No errors

cd ~/ez/zppy
git branch # Check that we're on issue-346-diags-post-refactor-1st-commit
ls tests/integration/test_*.py
pytest tests/integration/test_bash_generation.py
# 1 failed in 2.49s
# Diffs are expected
pytest tests/integration/test_campaign.py
# 6 passed in 1.78s
pytest tests/integration/test_defaults.py
# 1 passed in 0.33s
pytest tests/integration/test_last_year.py
# 1 passed in 0.25s
pytest tests/integration/test_weekly.py # Skip bundles tests for now
# FAILED tests/integration/test_weekly.py::test_comprehensive_v2_images - AssertionError
# FAILED tests/integration/test_weekly.py::test_comprehensive_v3_images - AssertionError
# 2 failed, 3 skipped in 1339.78s (0:22:19)

# For v2:
# Total number of images checked: 4376
# Missing images: 117
# For v3:
# Total number of images checked: 5216
# Missing images: 138

@chengzhuzhang (Collaborator)

@forsyth2 nice to see all task statuses show OK. Could you provide instructions on how to review the diffs?

@forsyth2 (Collaborator Author)

See /lcrc/group/e3sm/public_html/diagnostic_output/ac.forsyth2/pr651_results_20250117

Diffs can be found at:

The image_check directories have multiple subdirectories. You'll need to navigate through a few layers to get to Actual, Diff, and Expected images.

I've also copied outputs of running the tests into the following files:

  • 1_commit_v2.txt
  • 1_commit_v3.txt
  • 2_commits_v2.txt
  • 2_commits_v3.txt

grep for `Total number of images checked`, `Missing images`, and `Mismatched images` to get the counts shown in the table on #651 (comment).

@forsyth2 (Collaborator Author)

@chengzhuzhang @tomvothecoder I'm just checking in to see if either of you were able to look into the diffs any further.

I know we discussed the challenges of the image diff checks. There are a number of more nuanced approaches to image diff checking that I could try to implement, but we would still have the problem that many of the diffs are qualitatively the same thing. For instance, looking at some of the diffs, I see axis label differences ("-60 => 60 S"). It would be fantastic if we had some way for the code to automatically recognize these and group them together -- e.g., "here are all the diffs that are just axis label differences" or "here are all the diffs where the metrics differ". That might be infeasible, though.
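As a rough illustration of the grouping idea (purely hypothetical; classifying each diff would be the genuinely hard part), once every diff is tagged with a qualitative label, the bucketing itself is simple:

```python
from collections import defaultdict

def group_diffs(diff_labels):
    """Bucket image diffs by a qualitative label.

    diff_labels: iterable of (image_path, label) pairs, where the label
    (e.g. "axis-label shift", "metrics differ") would have to come from
    some classifier -- the hard part alluded to above.
    """
    groups = defaultdict(list)
    for image_path, label in diff_labels:
        groups[label].append(image_path)
    return dict(groups)
```

A reviewer could then inspect one representative per group instead of every diff individually.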

Successfully merging this pull request may close these issues: Replace CDAT