Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAI validation #15324

Open
wants to merge 34 commits into
base: master
Choose a base branch
from
Open

SAI validation #15324

wants to merge 34 commits into from

Conversation

opcoder0
Copy link
Contributor

@opcoder0 opcoder0 commented Nov 1, 2024

Description of PR

  • Introduces a new method to access SONiC Redis database instance using redis-py native Python library instead of relying on sonic-db-cli.
  • It also uses the Redis' pub-sub infrastructure to listen to key-space notifications to monitor DB changes instead of using the traditional wait_until which performs sleep->check cycle.
  • Adds changes to perform SAI validation for ACL tests.
  • Uses this library to monitor STATE_DB changes using the new notification library in test_bfd.py::test_bfd_basic.

Summary:
Fixes # Not applicable

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

Motivation is to add a simple and easy to use library that allows tests (where applicable) to perform SAI validation and allow tests to use event notification-based method to monitor DB changes.

How did you do it?

Refer to description above.

How did you verify/test it?

Verified ACL tests (on t0) and BFD basic tests (on t1) testbeds.

Any platform specific information?

No

Supported testbed topology if it's a new test case?

NA

Documentation

TBD.

@opcoder0 opcoder0 marked this pull request as ready for review November 4, 2024 05:44
@wangxin
Copy link
Collaborator

wangxin commented Nov 27, 2024

Is there a way to avoid committing a deb package into this repo? I have 2 concerns:

  1. The repo should be used for source code only. Try not to commit any package that is publicly available to the repo unless there is a very very good reason.
  2. The package version is kind of fixed. Extra effort would be required to upgrade version of the packages.

@opcoder0
Copy link
Contributor Author

Is there a way to avoid committing a deb package into this repo? I have 2 concerns:

  1. The repo should be used for source code only. Try not to commit any package that is publicly available to the repo unless there is a very very good reason.
  2. The package version is kind of fixed. Extra effort would be required to upgrade version of the packages.

Regarding (1.) As the package will need to be installed on the DUT which most likely won't have Internet access. And I am not sure if it is okay to assume the server (in MS / community test environments) on which the tests run have Internet access either. Hence, I checked them into the repository. I see similar practice followed for some files specific to Mellanox -

./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/sx-complib_1.mlnx.4.5.4414_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/sx-gen-utils_1.mlnx.4.5.4414_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/libnl-3-200_3.5.0-1_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/python-sdk-api_1.mlnx.4.5.4414_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/sxd-libs_1.mlnx.4.5.4414_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/applibs_1.mlnx.4.5.4414_amd64.deb

Regarding (2.) The two versions of socat correspond to the SONiC base OS versions (bullseye and bookworm).

@wangxin
Copy link
Collaborator

wangxin commented Dec 6, 2024

Is there a way to avoid committing a deb package into this repo? I have 2 concerns:

  1. The repo should be used for source code only. Try not to commit any package that is publicly available to the repo unless there is a very very good reason.
  2. The package version is kind of fixed. Extra effort would be required to upgrade version of the packages.

Regarding (1.) As the package will need to be installed on the DUT which most likely won't have Internet access. And I am not sure if it is okay to assume the server (in MS / community test environments) on which the tests run have Internet access either. Hence, I checked them into the repository. I see similar practice followed for some files specific to Mellanox -

./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/sx-complib_1.mlnx.4.5.4414_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/sx-gen-utils_1.mlnx.4.5.4414_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/libnl-3-200_3.5.0-1_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/python-sdk-api_1.mlnx.4.5.4414_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/sxd-libs_1.mlnx.4.5.4414_amd64.deb
./ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pkgs/applibs_1.mlnx.4.5.4414_amd64.deb

Regarding (2.) The two versions of socat correspond to the SONiC base OS versions (bullseye and bookworm).

Most likely these packages are no longer being used. And this is not a good practice anyway. I don't feel it is a good idea to go further on this path.
In the code base, we can assume that either DUTs have direct internet access or via http proxy.
Maybe we can also consider building this package into SONiC image. Then in tests, the code can firstly check if there is such tool. If not, then try to install it.

@wangxin
Copy link
Collaborator

wangxin commented Dec 9, 2024

tests/common/database is a new folder. Need to add a file __init__.py file under this folder because the parent folder is a python package.


## Introduction

SONiC management tests are Pytest modules running in the SONiC management container on the developer/CI/test environment and PTF tests running from the PTF container on the testbed server. As part of the setup and tear down activities the tests make configuration changes to SONiC, run the tests and verify if the tests ran successfully by making additional configuration checks and finally tear down the configuration changes. The tests use command line utilities on the DUT like `config`, `sonic-db-cli` or `redis-cli` to set and get configuration values. In some cases tests export / dump the contents of the database to examine its results and verify if tests ran successfully. These configuration changes are propogated to the ASIC through ASIC_DB. The aim of this design document is to identify a mechanism for tests to validate the configuration changes against ASIC_DB entries or SAI object types.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SONiC management tests -> SONiC tests in the sonic-mgmt repository?


## Introduction

SONiC management tests are Pytest modules running in the SONiC management container on the developer/CI/test environment and PTF tests running from the PTF container on the testbed server. As part of the setup and tear down activities the tests make configuration changes to SONiC, run the tests and verify if the tests ran successfully by making additional configuration checks and finally tear down the configuration changes. The tests use command line utilities on the DUT like `config`, `sonic-db-cli` or `redis-cli` to set and get configuration values. In some cases tests export / dump the contents of the database to examine its results and verify if tests ran successfully. These configuration changes are propogated to the ASIC through ASIC_DB. The aim of this design document is to identify a mechanism for tests to validate the configuration changes against ASIC_DB entries or SAI object types.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SONiC management container -> sonic-mgmt container.


- **Using threading and polling** as implemented in [SWSS VS Tests](https://github.com/sonic-net/sonic-swss/blob/master/tests/README.md), [conftest](https://github.com/sonic-net/sonic-swss/blob/master/tests/conftest.py), [dvslib/dvs_database](https://github.com/sonic-net/sonic-swss/blob/master/tests/dvslib/dvs_database.py)

- **Using Redis pubsub** This approach takes advantage of Redis pubsub to get notification of keyspace changes instead of using threading or polling to watch for key value changes. This mechanism improves performance and predictability of watching for keyspace changes. The tests don't have to rely on threads or sleep timers to check for value changes. The keyspace notifications are enabled by default for `ASIC_DB`. Keyspace notifications are enabled by default (`AKE`) in SONiC Redis instance. There are two kinds of message types received in pubsub `psubscribe` and `pmessage`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In "Keyspace notifications are enabled by default (AKE) in SONiC Redis instance."

What does AKE mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AKE are flags which translate to "all the events except for new key events (n) and missed events (m)". More details are available here https://redis.io/docs/latest/develop/use/keyspace-notifications/#configuration. I have updated the design doc with this link too.

After=database.service

[Service]
ExecStart=/usr/bin/socat TCP4-LISTEN:6381,fork TCP:localhost:6379
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The template has port 6381 hard coded. The added pytest option --exposed-db-port is not actually being used.


time.sleep(1)
# check all STATE_DB BFD_SESSION_TABLE neighbors' state is Up
status, actual_wait = wait_until_condition(q, prefix, neighbor_addrs,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If exception is raised in wait_until_condition, it would return None. Can this tuple assignment pass if wait_until_condition returns None? If not, then there is a problem here. And other wait_until_* tools have the similar problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. ThreadPoolExecutor.submit raises RuntimeError, TypeError and BrokenThreadPool all of these shouldn't be handled. The concurrent.futures.Future.result submit() can raise TimeoutError, CancelledError and Exception which have been handled by logging and returning appropriate return values.

@opcoder0
Copy link
Contributor Author

Most likely these packages are no longer being used. And this is not a good practice anyway. I don't feel it is a good idea to go further on this path. In the code base, we can assume that either DUTs have direct internet access or via http proxy. Maybe we can also consider building this package into SONiC image. Then in tests, the code can firstly check if there is such tool. If not, then try to install it.

Thanks @wangxin. I agree that committing Debian packages to the repository is not the best way forward. I am checking with the SONiC team to see if socat can be / is part of the image. I observed a few builds 202311 and 202405 already had socat on them. However I am not sure if they are there across all the builds based on bullseye and bookworm. I observed a dirty build based on master didn't have it for some reason.

Modify template to use exposed db port;
Update documentation;
Add __init__.py to tests/common/database;
@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Pull request contains merge conflicts.

@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants