This repository has been archived by the owner on Mar 20, 2024. It is now read-only.

multiple device request issue #81

Merged: 5 commits merged on Nov 28, 2023

@maryamtahhan maryamtahhan commented Nov 7, 2023

The issue: if a single pod requests devices from different pools, multiple UDS servers end up serving the same container, and each one attempts to mount its UDS into the pod at /tmp/afxdp.sock.

A similar issue exists for the AFXDP_DEVICES env var that's set in each container.

This patch fixes the first issue by mounting the xsk socket at /tmp/afxdp_dp/<netdev>/afxdp.sock.
It fixes the second issue by setting the env var in the container to AFXDP_DEVICES_<pool_name>.
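
As a rough illustration of the naming scheme described above, the following Go sketch derives the per-device socket path and the per-pool env var name. The constant and function names are hypothetical and are not taken from the plugin's source. Upper-casing the pool name is an assumption based on the AFXDP_DEVICES_ACCESS / AFXDP_DEVICES_CORE example posted later in this conversation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Values taken from the PR description; the identifiers themselves are
// hypothetical and do not come from the plugin's constants.go.
const (
	udsMountRoot   = "/tmp/afxdp_dp"
	udsSockName    = "afxdp.sock"
	devicesEnvBase = "AFXDP_DEVICES"
)

// podSocketPath returns the in-pod mount point of the UDS for one netdev,
// following the /tmp/afxdp_dp/<netdev>/afxdp.sock layout.
func podSocketPath(netdev string) string {
	return path.Join(udsMountRoot, netdev, udsSockName)
}

// devicesEnvVar returns the env var name for one pool, following
// AFXDP_DEVICES_<pool_name>. Assumption: the pool name is upper-cased.
func devicesEnvVar(pool string) string {
	return devicesEnvBase + "_" + strings.ToUpper(pool)
}

func main() {
	fmt.Println(podSocketPath("ens3f0np0"))
	fmt.Println(devicesEnvVar("access"))
}
```

Because both names are keyed on something unique per request (the netdev or the pool), two UDS servers serving the same container no longer compete for one path or one variable.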

  • We still need to update test code.
  • Update CNDP
  • Update DPDK - patch ready to send to DPDK.

The issue is that if a single pod requests different devices from
different pools, multiple UDS servers end up serving the same
container, and all of them attempt to mount their UDS into the pod
as /tmp/afxdp.sock.
A similar issue exists for the AFXDP_DEVICES env var that's
set in each container. This patch fixes the first issue by
mounting the xsk socket at /tmp/afxdp_dp/<netdev>/afxdp.sock.
A similar issue also existed for the bpf map pinning support.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
@maryamtahhan maryamtahhan force-pushed the feat_hotfix_multiple_uds branch from 6a31a01 to d717535 Compare November 7, 2023 13:21
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
@maryamtahhan maryamtahhan marked this pull request as ready for review November 15, 2023 09:59
maryamtahhan commented Nov 15, 2023

An example of what's set for a pod that requests 2 devices from 2 different pools [access, core]:

[root@dpdk dpdk]# env | grep AFXDP
AFXDP_DEVICES_ACCESS=ens3f0np0
AFXDP_DEVICES_CORE=ens3f1np1


[root@dpdk dpdk]# ls -r /tmp/*

/tmp/ens3f1np1:
afxdp.sock

/tmp/ens3f0np0:
afxdp.sock
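
An application inside the pod could discover its devices by scanning the environment for the per-pool variables shown above. The sketch below is one hedged way to do that in Go. The function name is hypothetical, and the handling of multiple devices in one variable (split on commas or spaces) is an assumption, since the example only shows one device per pool.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Prefix of the per-pool variables, as seen in the env output above.
const devicesEnvPrefix = "AFXDP_DEVICES_"

// devicesByPool scans KEY=VALUE entries and returns a map from the pool
// suffix (e.g. "ACCESS") to the device names listed in that variable.
func devicesByPool(environ []string) map[string][]string {
	pools := make(map[string][]string)
	for _, kv := range environ {
		key, value, ok := strings.Cut(kv, "=")
		if !ok || !strings.HasPrefix(key, devicesEnvPrefix) {
			continue
		}
		pool := strings.TrimPrefix(key, devicesEnvPrefix)
		if pool == "" {
			continue
		}
		// Assumption: several devices, if present, are separated by
		// commas or spaces. The example output shows only one per pool.
		pools[pool] = strings.FieldsFunc(value, func(r rune) bool {
			return r == ',' || r == ' '
		})
	}
	return pools
}

func main() {
	for pool, devs := range devicesByPool(os.Environ()) {
		fmt.Println(pool, devs)
	}
}
```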

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
@maryamtahhan

Tested with DPDK

testpmd output

./build/app/dpdk-testpmd --log-level pmd.net.af_xdp:debug  -l 0-2 --no-pci --main-lcore=2 --vdev net_af_xdp0,iface=ens3f0np0,start_queue=22,queue_count=1,use_cni=1,sock=/tmp/afxdp_dp/ens3f0np0/afxdp.sock --vdev net_af_xdp1,iface=ens3f1np1,start_queue=22,queue_count=1,use_cni=1,sock=/tmp/afxdp_dp/ens3f1np1/afxdp.sock -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
EAL: Detected CPU lcores: 64
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: 8 hugepages of size 1073741824 reserved, but no mounted hugetlbfs found for that size
EAL: VFIO support initialized
rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
init_internals(): Zero copy between umem and mbuf enabled.
rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
init_internals(): Zero copy between umem and mbuf enabled.
TELEMETRY: No legacy callbacks, legacy socket not created
Interactive-mode selected
Auto-start selected
Set macswap packet forwarding mode
testpmd: create a new mbuf pool <mb_pool_0>: n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
testpmd: create a new mbuf pool <mb_pool_1>: n=163456, size=2176, socket=1
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 1)
eth_rx_queue_setup(): Set up rx queue, rx queue id: 0, xsk queue id: 22
make_request_cni(): Request: [/connect,dpdk]
make_request_cni(): Response: [/connect,dpdk]
make_request_cni(): Request: [/version]
make_request_cni(): Response: [/version]
make_request_cni(): Request: [/xsk_map_fd,ens3f0np0]
make_request_cni(): Response: [/xsk_map_fd,ens3f0np0]
make_request_cni(): Request: [/fin]
make_request_cni(): Response: [/fin]
Port 0: 40:A6:B7:96:C8:D8
Configuring Port 1 (socket 1)
eth_rx_queue_setup(): Set up rx queue, rx queue id: 0, xsk queue id: 22
make_request_cni(): Request: [/connect,dpdk]
make_request_cni(): Response: [/connect,dpdk]
make_request_cni(): Request: [/version]
make_request_cni(): Response: [/version]
make_request_cni(): Request: [/xsk_map_fd,ens3f1np1]
make_request_cni(): Response: [/xsk_map_fd,ens3f1np1]
make_request_cni(): Request: [/fin]
make_request_cni(): Response: [/fin]

DP logs

INFO[2023-11-27 14:34:54] [udsserver.go:147] [start] Unix domain socket initialised. Listening for new connection.
INFO[2023-11-27 14:35:20] [udsserver.go:162] [start] New connection accepted. Waiting for requests.
DEBU[2023-11-27 14:35:20] [uds.go:191] [Read] Read: /connect,dpdk
DEBU[2023-11-27 14:35:20] [uds.go:208] [Read] Request contains no file descriptor
INFO[2023-11-27 14:35:20] [udsserver.go:248] [read] Pod unvalidated - Request: /connect,dpdk
DEBU[2023-11-27 14:35:20] [udsserver.go:341] [validatePod] Pod dpdk - Validating pod hostname
DEBU[2023-11-27 14:35:20] [resources_api.go:78] [getPodResources] Opening Pod Resource API connection
DEBU[2023-11-27 14:35:20] [resources_api.go:94] [getPodResources] Requesting pod resource list
DEBU[2023-11-27 14:35:20] [resources_api.go:90] [func2] Closing Pod Resource API connection
DEBU[2023-11-27 14:35:20] [udsserver.go:350] [validatePod] Pod dpdk - Found on node
INFO[2023-11-27 14:35:20] [udsserver.go:382] [validatePod] Pod dpdk is valid for this UDS connection
INFO[2023-11-27 14:35:20] [udsserver.go:253] [write] Pod dpdk - Response: /host_ok
DEBU[2023-11-27 14:35:20] [uds.go:229] [Write] Write: /host_ok
DEBU[2023-11-27 14:35:20] [uds.go:191] [Read] Read: /version
DEBU[2023-11-27 14:35:20] [uds.go:208] [Read] Request contains no file descriptor
INFO[2023-11-27 14:35:20] [udsserver.go:248] [read] Pod dpdk - Request: /version
INFO[2023-11-27 14:35:20] [udsserver.go:253] [write] Pod dpdk - Response: 0.1
DEBU[2023-11-27 14:35:20] [uds.go:229] [Write] Write: 0.1
DEBU[2023-11-27 14:35:20] [uds.go:191] [Read] Read: /xsk_map_fd,ens3f0np0
DEBU[2023-11-27 14:35:20] [uds.go:208] [Read] Request contains no file descriptor
INFO[2023-11-27 14:35:20] [udsserver.go:248] [read] Pod dpdk - Request: /xsk_map_fd,ens3f0np0
DEBU[2023-11-27 14:35:20] [udsserver.go:280] [handleFdRequest] Pod dpdk - Device ens3f0np0 recognised
INFO[2023-11-27 14:35:20] [udsserver.go:261] [writeWithFD] Pod dpdk - Response: /fd_ack, FD: 28
DEBU[2023-11-27 14:35:20] [uds.go:221] [Write] Write: /fd_ack, FD: 28
DEBU[2023-11-27 14:35:20] [uds.go:191] [Read] Read: /fin
DEBU[2023-11-27 14:35:20] [uds.go:208] [Read] Request contains no file descriptor
INFO[2023-11-27 14:35:20] [udsserver.go:248] [read] Pod dpdk - Request: /fin
INFO[2023-11-27 14:35:20] [udsserver.go:253] [write] Pod dpdk - Response: /fin_ack
DEBU[2023-11-27 14:35:20] [uds.go:229] [Write] Write: /fin_ack
DEBU[2023-11-27 14:35:20] [uds.go:298] [cleanup] Closing Unix listener
DEBU[2023-11-27 14:35:20] [uds.go:301] [cleanup] Closing connection
DEBU[2023-11-27 14:35:20] [uds.go:304] [cleanup] Closing socket file
DEBU[2023-11-27 14:35:20] [uds.go:306] [cleanup] Removing socket file
INFO[2023-11-27 14:35:20] [udsserver.go:162] [start] New connection accepted. Waiting for requests.
DEBU[2023-11-27 14:35:20] [uds.go:191] [Read] Read: /connect,dpdk
DEBU[2023-11-27 14:35:20] [uds.go:208] [Read] Request contains no file descriptor
INFO[2023-11-27 14:35:20] [udsserver.go:248] [read] Pod unvalidated - Request: /connect,dpdk
DEBU[2023-11-27 14:35:20] [udsserver.go:341] [validatePod] Pod dpdk - Validating pod hostname
DEBU[2023-11-27 14:35:20] [resources_api.go:78] [getPodResources] Opening Pod Resource API connection
DEBU[2023-11-27 14:35:20] [resources_api.go:94] [getPodResources] Requesting pod resource list
DEBU[2023-11-27 14:35:20] [resources_api.go:90] [func2] Closing Pod Resource API connection
DEBU[2023-11-27 14:35:20] [udsserver.go:350] [validatePod] Pod dpdk - Found on node
INFO[2023-11-27 14:35:20] [udsserver.go:382] [validatePod] Pod dpdk is valid for this UDS connection
INFO[2023-11-27 14:35:20] [udsserver.go:253] [write] Pod dpdk - Response: /host_ok
DEBU[2023-11-27 14:35:20] [uds.go:229] [Write] Write: /host_ok
DEBU[2023-11-27 14:35:20] [uds.go:191] [Read] Read: /version
DEBU[2023-11-27 14:35:20] [uds.go:208] [Read] Request contains no file descriptor
INFO[2023-11-27 14:35:20] [udsserver.go:248] [read] Pod dpdk - Request: /version
INFO[2023-11-27 14:35:20] [udsserver.go:253] [write] Pod dpdk - Response: 0.1
DEBU[2023-11-27 14:35:20] [uds.go:229] [Write] Write: 0.1
DEBU[2023-11-27 14:35:20] [uds.go:191] [Read] Read: /xsk_map_fd,ens3f1np1
DEBU[2023-11-27 14:35:20] [uds.go:208] [Read] Request contains no file descriptor
INFO[2023-11-27 14:35:20] [udsserver.go:248] [read] Pod dpdk - Request: /xsk_map_fd,ens3f1np1
DEBU[2023-11-27 14:35:20] [udsserver.go:280] [handleFdRequest] Pod dpdk - Device ens3f1np1 recognised
INFO[2023-11-27 14:35:20] [udsserver.go:261] [writeWithFD] Pod dpdk - Response: /fd_ack, FD: 34
DEBU[2023-11-27 14:35:20] [uds.go:221] [Write] Write: /fd_ack, FD: 34
DEBU[2023-11-27 14:35:20] [uds.go:191] [Read] Read: /fin
DEBU[2023-11-27 14:35:20] [uds.go:208] [Read] Request contains no file descriptor
INFO[2023-11-27 14:35:20] [udsserver.go:248] [read] Pod dpdk - Request: /fin
INFO[2023-11-27 14:35:20] [udsserver.go:253] [write] Pod dpdk - Response: /fin_ack
DEBU[2023-11-27 14:35:20] [uds.go:229] [Write] Write: /fin_ack
DEBU[2023-11-27 14:35:20] [uds.go:298] [cleanup] Closing Unix listener
DEBU[2023-11-27 14:35:20] [uds.go:301] [cleanup] Closing connection
DEBU[2023-11-27 14:35:20] [uds.go:304] [cleanup] Closing socket file
DEBU[2023-11-27 14:35:20] [uds.go:306] [cleanup] Removing socket file
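
The logs above show the same four-step text exchange for each device: /connect, /version, /xsk_map_fd and /fin. The Go sketch below models only that text exchange, with an in-memory stand-in for the DP so it can be run anywhere. All names are hypothetical, the request and reply strings are copied from the logs, and the sketch assumes each read returns exactly one message. The file descriptor that the logs show being sent with /fd_ack is deliberately left out.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// step is one request/reply pair of the handshake seen in the logs.
type step struct {
	request string
	expect  string
}

// handshakeSteps lists the text exchange for one device. The strings are
// copied from the DP logs; "0.1" is the version reported in that run.
func handshakeSteps(hostname, device string) []step {
	return []step{
		{"/connect," + hostname, "/host_ok"},
		{"/version", "0.1"},
		{"/xsk_map_fd," + device, "/fd_ack"},
		{"/fin", "/fin_ack"},
	}
}

// runHandshake plays the client side: send each request, check each reply.
// Assumption: one Read returns exactly one complete message.
func runHandshake(conn io.ReadWriter, hostname, device string) error {
	buf := make([]byte, 256)
	for _, s := range handshakeSteps(hostname, device) {
		if _, err := conn.Write([]byte(s.request)); err != nil {
			return fmt.Errorf("write %q: %w", s.request, err)
		}
		n, err := conn.Read(buf)
		if err != nil {
			return fmt.Errorf("read reply to %q: %w", s.request, err)
		}
		if got := string(buf[:n]); got != s.expect {
			return fmt.Errorf("request %q: got %q, want %q", s.request, got, s.expect)
		}
	}
	return nil
}

// fakeDP is a stand-in for the DP side, for local experimentation only.
// It closes the connection when done or on the first unexpected request.
func fakeDP(conn net.Conn, hostname, device string) {
	defer conn.Close()
	buf := make([]byte, 256)
	for _, s := range handshakeSteps(hostname, device) {
		n, err := conn.Read(buf)
		if err != nil || string(buf[:n]) != s.request {
			return
		}
		if _, err := conn.Write([]byte(s.expect)); err != nil {
			return
		}
	}
}

// loopbackHandshake wires the client to fakeDP over an in-memory pipe.
func loopbackHandshake(hostname, device string) error {
	client, server := net.Pipe()
	defer client.Close()
	go fakeDP(server, hostname, device)
	return runHandshake(client, hostname, device)
}

func main() {
	for _, dev := range []string{"ens3f0np0", "ens3f1np1"} {
		fmt.Println(dev, loopbackHandshake("dpdk", dev))
	}
}
```

With one socket per device, each handshake runs against its own UDS server, which matches the two separate connect-to-fin sequences in the DP logs.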

@maryamtahhan

CNDP PR CloudNativeDataPlane/cndp#351

@@ -73,14 +73,14 @@ build: builddp buildcni
 docker: ## Build docker image
 	@echo "****** Docker Image ******"
 	@echo
-	docker build -t localhost:5000/afxdp-device-plugin -f images/amd64.dockerfile .
+	docker build -t afxdp-device-plugin -f images/amd64.dockerfile .
@patrickog11 - can you remind me why we had "localhost:5000" here?
Is it needed? Was it just a workaround on our side?

@garyloug garyloug merged commit 720d92b into intel:main Nov 28, 2023
6 checks passed
@garyloug garyloug mentioned this pull request Nov 28, 2023
ovsrobot pushed a commit to ovsrobot/dpdk that referenced this pull request Dec 15, 2023
With the original 'use_cni' implementation (which used a
hardcoded socket rather than a configurable one),
if a DPDK pod requests multiple net devices
and these devices are from different pools, then
the container attempts to mount all the netdev UDSes
in the pod as /tmp/afxdp.sock, which means that at best
only 1 netdev will handshake correctly with the AF_XDP
DP. This patch addresses this by making the socket
parameter configurable using a new vdev param called
'uds_path' and removing the previous 'use_cni' param.
This patch also fixes incorrect references to the
AF_XDP DP as CNI and updates the documentation with a
working example. This change has been tested with the
AF_XDP DP PR 81[1], with both single and multiple interfaces.

[1] intel/afxdp-plugins-for-kubernetes#81

v6:
* Add link to PR 81 in commit message
* Add release notes changes to this patchset

v5:
* Fix alignment for ETH_AF_XDP_USE_DP_UDS_PATH_ARG
* Remove use_cni references in af_xdp.rst

v4:
* Rename af_xdp_cni.rst to af_xdp_dp.rst
* Removed all incorrect references to CNI throughout af_xdp
  PMD file.
* Fixed Typos in af_xdp_dp.rst

v3:
* Remove `use_cni` vdev argument as it's no longer needed.
* Update incorrect CNI references for the AF_XDP DP in the
  documentation.
* Update the documentation to run a simple example with the
  AF_XDP DP plugin in K8s.

v2:
* Rename sock_path to uds_path.
* Update documentation to reflect when CAP_BPF is needed.
* Fix testpmd arguments in the provided example for Pods.
* Use AF_XDP API to update the xskmap entry.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Reviewed-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
ovsrobot pushed a commit to ovsrobot/dpdk that referenced this pull request Dec 22, 2023
The original 'use_cni' implementation was added
to enable support for the AF_XDP PMD in a K8s env
without any escalated privileges.
However, 'use_cni' used a hardcoded socket rather
than a configurable one. If a DPDK pod requests
multiple net devices and these devices are from
different pools, then the AF_XDP PMD attempts to
mount all the netdev UDSes in the pod as /tmp/afxdp.sock,
which means that at best only 1 netdev will handshake
correctly with the AF_XDP DP. This patch addresses
this by making the socket parameter configurable using
a new vdev param called 'uds_path' and removing the
previous 'use_cni' param. This change has been tested
with the AF_XDP DP PR 81[1], with both single and
multiple interfaces. This patch also renames the
af_xdp_cni.rst doc to af_xdp_dp.rst and changes
incorrect references to the DP as CNI. Lastly,
this patch adds this feature to the release notes.

[1] intel/afxdp-plugins-for-kubernetes#81

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Reviewed-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Shibin Koikkara Reeny <shibin.koikkara.reeny@intel.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
maryamtahhan added a commit to maryamtahhan/dpdk that referenced this pull request Jan 8, 2024
The original 'use_cni' implementation was added
to enable support for the AF_XDP PMD in a K8s env
without any escalated privileges.
However, 'use_cni' used a hardcoded socket rather
than a configurable one. If a DPDK pod requests
multiple net devices and these devices are from
different pools, then the AF_XDP PMD attempts to
mount all the netdev UDSes in the pod as /tmp/afxdp.sock,
which means that at best only 1 netdev will handshake
correctly with the AF_XDP DP. This patch addresses
this by making the socket parameter configurable using
a new vdev param called 'uds_path' and removing the
previous 'use_cni' param. This change has been tested
with the AF_XDP DP PR 81[1], with both single and
multiple interfaces. This patch also renames the
af_xdp_cni.rst doc to af_xdp_dp.rst and changes
incorrect references to the DP as CNI. Lastly,
this patch adds this feature to the release notes.

[1] intel/afxdp-plugins-for-kubernetes#81

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Reviewed-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Shibin Koikkara Reeny <shibin.koikkara.reeny@intel.com>
---
v7:
* Give a more descriptive commit msg headline.
* Fixup typos in documentation.

v6:
* Add link to PR 81 in commit message
* Add release notes changes to this patchset

v5:
* Fix alignment for ETH_AF_XDP_USE_DP_UDS_PATH_ARG
* Remove use_cni references in af_xdp.rst

v4:
* Rename af_xdp_cni.rst to af_xdp_dp.rst
* Removed all incorrect references to CNI throughout af_xdp
  PMD file.
* Fixed Typos in af_xdp_dp.rst

v3:
* Remove `use_cni` vdev argument as it's no longer needed.
* Update incorrect CNI references for the AF_XDP DP in the
  documentation.
* Update the documentation to run a simple example with the
  AF_XDP DP plugin in K8s.

v2:
* Rename sock_path to uds_path.
* Update documentation to reflect when CAP_BPF is needed.
* Fix testpmd arguments in the provided example for Pods.
* Use AF_XDP API to update the xskmap entry.
ovsrobot pushed a commit to ovsrobot/dpdk that referenced this pull request Feb 29, 2024
The original 'use_cni' implementation was added
to enable support for the AF_XDP PMD in a K8s env
without any escalated privileges.
However, 'use_cni' used a hardcoded socket rather
than a configurable one. If a DPDK pod requests
multiple net devices and these devices are from
different pools, then the AF_XDP PMD attempts to
mount all the netdev UDSes in the pod as /tmp/afxdp.sock,
which means that at best only 1 netdev will handshake
correctly with the AF_XDP DP. This patch addresses
this by making the socket parameter configurable using
a new vdev param called 'dp_path' alongside the
original 'use_cni' param. If the 'dp_path' parameter
is not set alongside the 'use_cni' parameter, then
it's configured inside the AF_XDP PMD (transparently
to the user). This change has been tested
with the AF_XDP DP PR 81[1], with both single and
multiple interfaces.

[1] intel/afxdp-plugins-for-kubernetes#81

Fixes: 7fc6ae5 ("net/af_xdp: support CNI Integration")
Cc: stable@dpdk.org

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
@maryamtahhan maryamtahhan deleted the feat_hotfix_multiple_uds branch April 9, 2024 09:44