fix flaky issue in test_set_fans_speed #16162

Merged 1 commit on Dec 19, 2024


@sdszhang sdszhang commented Dec 19, 2024

Description of PR

Summary:
Fixes an issue where the wrong DUT may be selected to stop thermalctld in the chassis fan tests.

The TestChassisFans::setup fixture selects the first DUT as the duthost on which to stop/start the thermal control daemon.
However, the test case itself runs on the host chosen by enum_rand_one_per_hwsku_hostname.

As a result, the two hosts can differ, as in the following log: lc4 is selected to stop thermalctld, but the test runs on the supervisor.

16/12/2024 12:42:40 base._run                                L0071 DEBUG  | /var/src/sonic-mgmt_xxx/tests/common/devices/sonic.py::stop_pmon_daemon_service#888: [xxx-lc4-1] AnsibleModule::shell, args=["docker exec pmon supervisorctl stop thermalctld"], kwargs={"module_ignore_errors": true}
......
16/12/2024 12:42:54 __init__._log_sep_line                   L0170 INFO   | ==================== platform_tests/api/test_chassis_fans.py::TestChassisFans::test_set_fans_speed[xxx-sup-1] call ====================
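
When triaging a run like the one above, the mismatch can be spotted by comparing the host in the Ansible debug line with the host in the test separator line. The following is only a rough sketch: the regular expressions and helper names are assumptions made for illustration (they are not part of sonic-mgmt), and the sample lines are shortened copies of the log excerpt above.

```python
import re

# Host that an Ansible module ran on, e.g. "[xxx-lc4-1] AnsibleModule::shell".
# (Assumed pattern, derived only from the log excerpt in this PR.)
ANSIBLE_HOST_RE = re.compile(r"\[([^\]]+)\] AnsibleModule::")

# Parametrized test id in a separator line, e.g. "test_set_fans_speed[xxx-sup-1] call".
TEST_HOST_RE = re.compile(r"::(\w+)\[([^\]]+)\] call")

# Shortened copies of the two log lines quoted in the PR description.
STOP_LINE = (
    'sonic.py::stop_pmon_daemon_service#888: [xxx-lc4-1] AnsibleModule::shell, '
    'args=["docker exec pmon supervisorctl stop thermalctld"], '
    'kwargs={"module_ignore_errors": true}'
)
CALL_LINE = (
    "==== platform_tests/api/test_chassis_fans.py::TestChassisFans::"
    "test_set_fans_speed[xxx-sup-1] call ===="
)


def ansible_host(line):
    """Return the host an Ansible module was run on, or None if the line has none."""
    match = ANSIBLE_HOST_RE.search(line)
    return match.group(1) if match else None


def host_under_test(line):
    """Return the host a parametrized test is called on, or None if the line has none."""
    match = TEST_HOST_RE.search(line)
    return match.group(2) if match else None


if __name__ == "__main__":
    stopped_on = ansible_host(STOP_LINE)
    tested_on = host_under_test(CALL_LINE)
    if stopped_on != tested_on:
        print(f"thermalctld stopped on {stopped_on}, but test runs on {tested_on}")
```

If the two hosts differ, thermalctld is most likely still running on the host under test.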

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

The test_set_fans_speed test case is flaky.

How did you do it?

Use enum_rand_one_per_hwsku_hostname instead of duthost in the setup fixture, so that it targets the same DUT as the test case.
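
The effect of the change can be modeled with a small, self-contained sketch. Everything here is a stand-in: `FakeDut`, `setup_before_fix`, and `setup_after_fix` are hypothetical names, and a context manager plays the role of the pytest fixture. The actual fixture in test_chassis_fans.py relies on sonic-mgmt's own host objects and is not reproduced here.

```python
from contextlib import contextmanager


class FakeDut:
    """Hypothetical stand-in for a DUT host; tracks only the thermalctld state."""

    def __init__(self, hostname):
        self.hostname = hostname
        self.thermalctld_running = True

    def stop_thermalctld(self):
        self.thermalctld_running = False

    def start_thermalctld(self):
        self.thermalctld_running = True


@contextmanager
def setup_before_fix(duthosts, selected_hostname):
    """Old behavior (modeled): always use the first DUT, ignoring the selected host."""
    duthost = next(iter(duthosts.values()))
    duthost.stop_thermalctld()
    try:
        yield duthost
    finally:
        duthost.start_thermalctld()


@contextmanager
def setup_after_fix(duthosts, selected_hostname):
    """New behavior (modeled): use the same host that the test case runs on."""
    duthost = duthosts[selected_hostname]
    duthost.stop_thermalctld()
    try:
        yield duthost
    finally:
        duthost.start_thermalctld()


def make_duthosts():
    """Build a two-host chassis with the line card listed first, as in the log."""
    return {name: FakeDut(name) for name in ("xxx-lc4-1", "xxx-sup-1")}


if __name__ == "__main__":
    for setup in (setup_before_fix, setup_after_fix):
        duthosts = make_duthosts()
        with setup(duthosts, "xxx-sup-1"):
            running = duthosts["xxx-sup-1"].thermalctld_running
            print(f"{setup.__name__}: thermalctld running on xxx-sup-1 = {running}")
```

In this model, the old setup leaves thermalctld running on the supervisor while the test changes fan speeds there, which is one plausible source of the flakiness; the new setup stops it on the host under test and restarts it afterwards.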

How did you verify/test it?

Ran the test 3 times; all runs passed.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@sdszhang sdszhang requested a review from prgeor as a code owner December 19, 2024 09:06
@mssonicbld
Collaborator

/azp run


Azure Pipelines successfully started running 1 pipeline(s).

@sdszhang
Contributor Author

Test result

platform_tests/api/test_chassis_fans.py::TestChassisFans::test_set_fans_speed[xxx-sup-1] PASSED [ 97%]
platform_tests/api/test_chassis_fans.py::TestChassisFans::test_set_fans_led[xxx-sup-1] SKIPPED [100%]

@sdszhang sdszhang changed the title from "fix errors in test_set_fans_speed" to "fix flaky issue in test_set_fans_speed" Dec 19, 2024
@augusdn augusdn self-requested a review December 19, 2024 23:17
Contributor

@augusdn augusdn left a comment

LGTM

@yejianquan yejianquan merged commit 1637441 into sonic-net:master Dec 19, 2024
17 checks passed
yejianquan pushed a commit that referenced this pull request Dec 19, 2024
Cherry pick #16162

co-authorized by: jianquanye@microsoft.com
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jan 2, 2025
@mssonicbld
Collaborator

Cherry-pick PR to 202411: #16298

mssonicbld pushed a commit that referenced this pull request Jan 3, 2025
@sdszhang
Contributor Author

sdszhang commented Jan 3, 2025

Manual cherry-pick PR #16163 has been merged for 202405.
