Fix multiple failures in KillProcess test on KVM. #16303
base: master
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
7aa6deb to b5e660f
Have you tested locally?
Can we consider issue #16238 fixed? If so, you can close it, and let's monitor the results in the PR tests.
Thanks, I will close the bug once this is merged.
@@ -15,23 +15,20 @@
("gnmi", False, "Dbus does not support gnmi service management"),
("nonexistent", False, "Dbus does not support nonexistent service management"),
("", False, "Dbus stop_service called with no service specified"),
("snmp", True, ""),
It is indeed valid. Restarting the snmp container produces an error log because of this line:
https://github.com/sonic-net/sonic-buildimage/blob/39e2131a7b76f6c3d5257b7e02c540dd33a24d5b/files/build_templates/docker_image_ctl.j2#L114
{%- elif docker_container_name == "snmp" %}
$SONIC_DB_CLI STATE_DB HSET 'DEVICE_METADATA|localhost' chassis_serial_number $(decode-syseeprom -s)
This logs an error because the command fails:
sudo decode-syseeprom -s
Failed to read system EEPROM info
I think this is known:
# For kvm testbed, command `show platform syseeprom` will return the expected Error
This also causes a similar issue when killing pmon (I think this is due to the "missing sonic_platform module" error).
So for now let's just skip these two on the vs platform. I don't think this affects our ability to qualify the KillProcess implementation.
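A minimal sketch of how the two services could be excluded on the virtual platform, assuming the parametrized cases are (service, expected_success, expected_message) tuples as in the diff. The names `VS_SKIPPED_SERVICES` and `cases_for_platform`, and the use of the string "vs" as the platform identifier, are illustrative assumptions and not code from this PR.

```python
from typing import List, Tuple

# Assumed shape of one test case: (service, expected_success, expected_message).
Case = Tuple[str, bool, str]

# Assumption: services whose restart logs known errors on the virtual (vs) platform.
VS_SKIPPED_SERVICES = {"snmp", "pmon"}


def cases_for_platform(cases: List[Case], platform: str) -> List[Case]:
    """Return the cases to run, dropping the known-noisy services on "vs"."""
    if platform != "vs":
        return list(cases)
    return [case for case in cases if case[0] not in VS_SKIPPED_SERVICES]
```

Filtering the case list keeps the remaining services covered on KVM while leaving the full list in place for physical platforms.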
("dhcp_relay", True, ""),
("radv", True, ""),
("restapi", True, ""),
("lldp", True, ""),
("sshd", True, ""),
("swss", True, ""),
It turns out the test wasn't written correctly: we need to explicitly wait for critical processes to start after killing swss. Killing and restarting swss appears to make many other processes restart, and if we start the next test case immediately without waiting, swss errors are logged during that next test case.
Fixed after adding back the wait for critical processes.
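The wait described in this thread can be sketched as a generic polling helper. This is an illustration only: `wait_until_ready` and its parameters are hypothetical names, and the `check` callable stands in for whatever the test suite uses to decide that critical processes are running again.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    timeout: float,
    interval: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll check() until it returns True or `timeout` seconds have elapsed.

    Returns True if check() succeeded, False if the deadline passed first.
    check() is always called at least once.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A test would call such a helper right after restarting swss and fail if it returns False, so the next test case only starts once the dependent processes are back up.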
Description of PR
Summary:
isAutoNegEnabled: Failed to get port AutoNeg status for port pid:1000000000021
Fixes #16238
Type of change
Back port request
Approach
The test tries to kill a process and then restart it, but the API throws an exception if the process does not exist.
Here we catch the exception and skip the test.
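A minimal sketch of the catch-and-skip pattern, under stated assumptions: `ProcessNotFound`, `kill_process`, and `kill_or_skip` are hypothetical stand-ins, not the API used in this PR. The sketch uses the standard library's `unittest.SkipTest` so it runs without extra dependencies; in a pytest suite, `pytest.skip(...)` would be the usual equivalent.

```python
import unittest


class ProcessNotFound(Exception):
    """Hypothetical error raised when the target process does not exist."""


def kill_process(running: set, name: str) -> None:
    """Hypothetical stand-in for the kill API: fails if the process is absent."""
    if name not in running:
        raise ProcessNotFound(name)
    running.discard(name)


def kill_or_skip(running: set, name: str) -> None:
    """Kill the process, or mark the test as skipped if it is not present."""
    try:
        kill_process(running, name)
    except ProcessNotFound as exc:
        # Turn the missing-process error into a skip instead of a test failure.
        raise unittest.SkipTest(f"process {name!r} is not running; skipping") from exc
```

Catching only the specific missing-process error keeps other failures visible as real test failures.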
What is the motivation for this PR?
Fix the test for KVM.
How did you do it?
How did you verify/test it?
Any platform specific information?
Supported testbed topology if it's a new test case?
Documentation