[Snappi] Adding Ungraceful Restart script for BGP Outbound cases #16359

Closed
wants to merge 20 commits into from

Conversation

selldinesh
Contributor

Description of PR

Summary: Adding an ungraceful restart script as part of the BGP Outbound cases
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

To add an ungraceful restart test case to the BGP Outbound cases.

How did you do it?

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@mssonicbld
Collaborator

/azp run


Azure Pipelines successfully started running 1 pipeline(s).

@deepak-singhal0408 deepak-singhal0408 self-requested a review January 6, 2025 22:39
@mssonicbld
Collaborator

/azp run


Azure Pipelines successfully started running 1 pipeline(s).

@deepak-singhal0408
Contributor

@selldinesh The PR checker is failing at the Pre_test Validate step.

Could you fix the import error?

2025-01-07T00:18:39.3756408Z ==================================== ERRORS ====================================
2025-01-07T00:18:39.3756592Z _ ERROR collecting snappi_tests/multidut/bgp/test_bgp_outbound_ungraceful_restart.py _
2025-01-07T00:18:39.3756889Z ImportError while importing test module '/var/src/sonic-mgmt/tests/snappi_tests/multidut/bgp/test_bgp_outbound_ungraceful_restart.py'.
2025-01-07T00:18:39.3757094Z Hint: make sure your test modules/packages have valid Python names.
2025-01-07T00:18:39.3757225Z Traceback:
2025-01-07T00:18:39.3757365Z /usr/lib/python3.8/importlib/__init__.py:127: in import_module
2025-01-07T00:18:39.3757528Z return _bootstrap._gcd_import(name[level:], package, level)
2025-01-07T00:18:39.3757695Z snappi_tests/multidut/bgp/test_bgp_outbound_ungraceful_restart.py:9: in <module>
2025-01-07T00:18:39.3757877Z from tests.snappi_tests.multidut.bgp.files.bgp_outbound_helper_pr import (
2025-01-07T00:18:39.3758131Z E ModuleNotFoundError: No module named 'tests.snappi_tests.multidut.bgp.files.bgp_outbound_helper_pr'
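
The log above is a collection-time failure: pytest could not resolve the helper module named in the `from ... import` line. One way to catch this kind of problem locally before pushing is to ask Python whether a dotted module path resolves. The sketch below is only an illustration and is not part of sonic-mgmt; `module_is_importable` is a hypothetical helper name.

```python
import importlib.util


def module_is_importable(dotted_name: str) -> bool:
    """Return True if `dotted_name` can be located on the current sys.path."""
    try:
        # find_spec imports parent packages, then looks up the final component.
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package in the dotted path cannot be imported.
        return False
```

Run from the `sonic-mgmt` root, a call such as `module_is_importable("tests.snappi_tests.multidut.bgp.files.bgp_outbound_helper_pr")` would be expected to return `False` for as long as the file named in the traceback is missing.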

@mssonicbld
Collaborator

/azp run


linux-foundation-easycla bot commented Jan 7, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.


Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run


Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run


Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run


Azure Pipelines successfully started running 1 pipeline(s).

@selldinesh
Contributor Author

/azp run


Commenter does not have sufficient privileges for PR 16359 in repo sonic-net/sonic-mgmt

selldinesh and others added 7 commits January 9, 2025 15:04
…st (sonic-net#16157)

What is the motivation for this PR?
Elastictest handles distributed PR test runs across multiple KVMs well, which lets us add more test scripts to the PR checker.
However, some traffic tests cannot run on the KVM platform, so we need to skip them where necessary.

How did you do it?
This PR adds qos tests to the KVM-based PR test framework with the following scope and modifications:

Excludes fanout switch-related configurations, which are not applicable in the KVM test environment.
Traffic tests have been intentionally skipped due to the limitations of running traffic in the KVM environment.
How did you verify/test it?
What is the motivation for this PR?
In our Impacted Area Based PR testing, we noticed some failures caused by the error "Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 5166 (dpkg)". In this PR, we add a timeout for acquiring the dpkg lock to avoid this issue.

How did you do it?
In this PR, we add a timeout for acquiring dpkg lock to avoid this issue.
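
The commit message does not show how the timeout is applied. As a rough sketch of one possible approach, apt accepts a `DPkg::Lock::Timeout` option that makes it wait for the dpkg frontend lock for a number of seconds instead of failing immediately. The function name and default value below are assumptions for illustration only.

```python
def apt_install_cmd(packages, lock_timeout_s=600):
    """Build an apt-get command that waits up to `lock_timeout_s` seconds for the dpkg lock.

    Hypothetical helper: the actual change in the PR may use a different mechanism.
    """
    return [
        "apt-get",
        "-o", f"DPkg::Lock::Timeout={lock_timeout_s}",
        "install", "-y",
        *packages,
    ]
```

The resulting list can be passed to `subprocess.run`; with the option set, a concurrent `dpkg` process delays the install rather than aborting it.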

How did you verify/test it?
The JR2 tuning values were copied over from the J2c+ tuning values here: sonic-net#13660

But the JR2 tunings are slightly different as the shared buffer pool is different on JR2 vs J2c+.
J2c+:
…ic-net#15708)

We are seeing UnboundLocalError when running sonic-mgmt tests against
a single-ASIC linecard:

```
UnboundLocalError: local variable 'dst_sys_port_id' referenced before assignment
```

Upon further investigation, this was determined to be happening
because a previous attempt to fix this issue (PR sonic-net#13700) completely
omitted the ASIC prefix, but the entries in SYSTEM_PORT in config_db
do have an Asic0 prefix even on a single ASIC DUT.

Resolve this by specifically adding the Asic0 prefix in the case of a
single-ASIC T2 DUT, instead of leaving the prefix out.
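
A minimal sketch of the idea, assuming `SYSTEM_PORT` keys have the form `<hostname>|<asic>|<port>` and each entry carries a `system_port_id` field. Both the key layout and the helper names are assumptions made for illustration and are not taken from the actual test code.

```python
def system_port_key(hostname, port, asic_name=None):
    """Build a SYSTEM_PORT lookup key (assumed format: host|asic|port).

    On a single-ASIC DUT no ASIC name is passed in, so fall back to
    "Asic0" instead of leaving the prefix out.
    """
    asic = asic_name if asic_name else "Asic0"
    return "|".join([hostname, asic, port])


def find_system_port_id(system_ports, hostname, port, asic_name=None):
    """Look up the system port id, failing loudly if the key is absent."""
    key = system_port_key(hostname, port, asic_name)
    if key not in system_ports:
        # An explicit error is easier to debug than an UnboundLocalError
        # raised later because a loop never matched any entry.
        raise KeyError(f"no SYSTEM_PORT entry for {key}")
    return system_ports[key]["system_port_id"]
```

Raising on a missing key is a design choice in this sketch: it surfaces a key-format mismatch at the lookup site rather than at the first use of an unassigned variable.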

Tested by manually running qos tests on a T2 single ASIC DUT with
these changes.
Create /etc/tacacs folder on PTF when it's missing

Why I did it
The TACACS test failed because the /etc/tacacs+ folder does not exist; it has recently gone missing in some versions of the PTF container image.

Error message:

E msg = Destination directory /etc/tacacs+ does not exist

How I did it
Create /etc/tacacs folder on PTF when it's missing

How to verify it
All test cases pass.

Description for the changelog
Create /etc/tacacs folder on PTF when it's missing
liuh-80 and others added 11 commits January 9, 2025 15:05
…ailed. (sonic-net#16373)

Remove golden config file and revert config when load golden config failed.

Why I did it
Fix bug sonic-net#16338
The TACACS test case failed to reload the golden config, and because of a code bug the golden config file was not removed. Every test case after that failed because login failed.

How I did it
Add try/finally to make sure the golden config is always removed after the config reload.
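
The pattern can be sketched as follows. This is an illustration under stated assumptions, not the code from the fix: `reload_config` and `revert_config` are hypothetical callables standing in for whatever the test uses to load and roll back configuration.

```python
import os


def load_with_golden_config(golden_path, reload_config, revert_config):
    """Reload config from a golden file and always clean the file up.

    `reload_config` and `revert_config` are hypothetical callables
    supplied by the caller.
    """
    try:
        reload_config(golden_path)
    except Exception:
        # Reload failed: restore the previous config, then re-raise
        # so the test still reports the failure.
        revert_config()
        raise
    finally:
        # Runs on both success and failure, so a stale golden config
        # cannot break login for later test cases.
        if os.path.exists(golden_path):
            os.remove(golden_path)
```

Because the cleanup lives in `finally`, it runs even when the `except` branch re-raises.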

How to verify it
All test cases pass.

Description for the changelog
Remove golden config file and revert config when load golden config failed.
Description of PR
Optimize bgp/test_reliable_tsa.py with multithreading to reduce the running time.

Summary:
Fixes # (issue)

Approach
What is the motivation for this PR?
The bgp/test_reliable_tsa.py takes a very long time to finish on T2 chassis (5.5h ~ 6h), so we wanted to optimize it using multithreading to reduce the running time. After the optimization, the running time is reduced to ~3.5h.
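
The commit message does not show how the threading is wired in. As a generic sketch of the idea, using only the standard library rather than any sonic-mgmt helper, the same action can be submitted for every host at once and the results collected afterwards. `run_on_all` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_all(hosts, action, max_workers=8):
    """Run `action(host)` for every host concurrently and collect the results.

    Any exception raised inside `action` is re-raised here when its
    result is read, so failures are not silently dropped.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(action, host) for host in hosts}
        return {host: future.result() for host, future in futures.items()}
```

Threads suit this workload when each action mostly waits on remote commands, since the waiting time overlaps instead of adding up across linecards.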

How did you do it?
How did you verify/test it?
I ran the updated code and can confirm it's working as expected. Elastictest link with flaky test case re-run link

co-authorized by: jianquanye@microsoft.com
…e missing (sonic-net#16357)

What is the motivation for this PR?
Stress testing sometimes leaves exabgp on the PTF in an incorrect state, so this change restarts exabgp before re-announcing routes in the sanity check.

How did you do it?
Restart exabgp before re-announcing routes
Add try/except to handle failures when re-announcing routes
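
The two steps above can be sketched as below. This is a hedged illustration only: `restart_exabgp` and `announce_routes` are hypothetical callables, and the real sanity-check code may be structured differently.

```python
import logging

logger = logging.getLogger(__name__)


def recover_routes(restart_exabgp, announce_routes):
    """Restart exabgp, then re-announce routes; report success as a bool.

    Both arguments are hypothetical callables supplied by the caller.
    """
    # Restart first so the announcement goes to a process in a known state.
    restart_exabgp()
    try:
        announce_routes()
    except Exception as exc:
        # Log and report failure instead of aborting the whole sanity check.
        logger.warning("re-announcing routes failed: %s", exc)
        return False
    return True
```

Returning a boolean lets the caller decide whether a failed recovery should fail the sanity check or only be recorded.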
How did you verify/test it?
Run test with sanity check
…IPv6 neighbor addresses on KVM testbeds. (sonic-net#16371)

Temporarily skipping test_arp_update_for_failed_standby_neighbor for IPv6 neighbor addresses on KVM testbeds. 

Signed-off-by: Mahdi Ramezani <mramezani@microsoft.com>
…net#16169)

In sonic-net#8149 the multi-asic and multi-dut variants were added to test_qos_sai.py.
This required updating calls to `dynamically_compensate_leakout` to specify either the `src_client` or `dst_client`, but a couple of calls in `PGSharedWatermarkTest` passed the wrong client.

For more details on the failure this causes see sonic-net#16167

Summary:
Fixes sonic-net#16167
Fix test ipfwd/test_nhop_group.py for Arista 7050CX3 SKU
… updated. (sonic-net#16396)

Signed-off-by: Mahdi Ramezani <mramezani@microsoft.com>
…#15199)

Summary:
Covers following

TestGap on VOQ Chassis: [Test Gap]Chassis-VOQ: ECMP hashing tests when member goes down/UP sonic-net#14985
Generic all platform ECMP Hashing test with Member flap trigger
Type of change
 Bug fix
 Testbed and Framework(new/improvement)
 Test case(new/improvement)
Back port request
 202012
 202205
 202305
 202311
 202405
Approach
What is the motivation for this PR?
Currently there is no test case to cover ECMP hashing upon a member flap trigger. This gap is present on pizza box DUTs as well.
Also, on a VOQ chassis, an ECMP member flap on one linecard needs to be synced to remote linecards. We currently don't have a test case covering this.

How did you do it?
Wrote a new test case, test_ecmp_group_member_flap (common to all topology types).
Underneath, it uses the existing fib_test infrastructure.

High level:
Here, I verify ECMP member flap behavior on the default route.
The test is skipped if the DUT doesn't have a default route or the route has fewer than 2 paths.
Initially, the test verifies traffic forwarding and ECMP hashing for this default route.
Then one of the member ports is brought down, and the traffic and ECMP hashing tests are carried out again.
Finally, the member port is brought back up and the traffic/hash test is carried out once more.

For VOQ chassis, I pass an additional parameter (skip_src_ports) to ensure that incoming PTF traffic lands on a remote linecard. This ensures that ECMP member down/up is handled properly on remote linecards.
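
The three phases described above (baseline, member down, member restored) can be modelled with a toy simulation. This sketch does not use the fib_test infrastructure or real traffic: the CRC-based hash, the function names, and the 25% tolerance are all assumptions chosen for illustration.

```python
import zlib
from collections import Counter


def pick_member(flow, members):
    """Map a flow tuple to one ECMP member with a deterministic toy hash."""
    return members[zlib.crc32(repr(flow).encode()) % len(members)]


def distribution(flows, members):
    """Count how many flows land on each member."""
    return Counter(pick_member(flow, members) for flow in flows)


def is_balanced(counts, members, tolerance=0.25):
    """True if every member's count is within `tolerance` of an even split."""
    expected = sum(counts.values()) / len(members)
    return all(abs(counts.get(m, 0) - expected) <= tolerance * expected
               for m in members)


def member_flap_phases(flows, members, flapped):
    """Return the flow distribution before, during, and after one member flap."""
    if len(members) < 2:
        # Mirrors the skip condition: ECMP needs at least two paths.
        raise ValueError("need at least two ECMP paths")
    before = distribution(flows, members)
    during = distribution(flows, [m for m in members if m != flapped])
    after = distribution(flows, members)
    return before, during, after
```

In each phase the check of interest is that no flow is sent to a member that is down and that the remaining members share the flows roughly evenly, which is what `is_balanced` approximates.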

How did you verify/test it?
Verified the test on T2 Chassis.
Also verified on T1 DUT.
@mssonicbld
Collaborator

/azp run


Pull request contains merge conflicts.

@deepak-singhal0408
Contributor

@selldinesh, closing this one.
Created a new PR, #16636, to accommodate the above changes plus a few additional ones.

Projects
Status: Done