iptables contention between vpc-cni and kube-proxy #2948
Comments
@orsenthil can you advise whether the lock contention is avoidable and what the possible actionable solutions are?
@duxing VPC CNI validates API server connectivity as part of its bootup process and this check requires Workarounds:
@duxing - I missed the ping. As @achevuru mentioned, the contention, if observed, occurs only during aws-node (VPC CNI) pod startup and not while the pods are running. Kube-Proxy provides flags like (
hi @achevuru! Thanks for the suggestions! I did some research after submitting this issue and planned to try meanwhile, one thing that can be improved (IMO) in VPC CNI is logging / metrics. If we can have more logs (maybe DEBUG logs) related to waiting on When this issue happened, the logs from VPC CNI had 0 warning logs and 0 error logs (everything was info or debug). It wasn't until a few days later, desperately checking other logs from the log collection tool, that I realized Do you think it's reasonable for
hi @orsenthil !
that's absolutely right. I noticed this as well: when this issue happened to new nodes, existing nodes were perfectly fine, even if they needed to assign new EIPs. thanks again for helping! @orsenthil @achevuru
@duxing VPC CNI logs should show a relevant error message if it runs into an iptables contention issue. I believe in your case, VPC CNI had the lock and
thx for confirming! In case another entity acquired the lock first, what about adding a debug log for the iptables update duration? This value can be calculated from multiple consecutive logs from the same instance, but that isn't easy to query. If we have a single log entry, this duration can be queried easily and graphed to capture issues and history. e.g.:
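To make the single-entry duration log suggested above concrete, here is a minimal, hypothetical sketch in Python. It is not VPC CNI code (which is written in Go); the logger name `ipamd-sketch`, the helper `log_duration`, the field name `duration_seconds`, and the placeholder `update_iptables_rules` are all assumptions made for illustration.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; the real ipamd logger is not shown in this thread.
logger = logging.getLogger("ipamd-sketch")


@contextmanager
def log_duration(operation, log=logger):
    """Emit exactly one DEBUG entry with the elapsed time of the wrapped block."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        # One entry per operation, so the duration can be queried and graphed directly.
        log.debug("%s finished duration_seconds=%.3f", operation, elapsed)


def update_iptables_rules():
    """Placeholder standing in for the real rule update (assumption)."""
    time.sleep(0.05)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    with log_duration("iptables update"):
        update_iptables_rules()
```

Because the duration is measured in a `finally` clause, the entry is still written when the wrapped update fails, which is the case where the timing matters most.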
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days
Issue closed due to inactivity.
What happened:
On an EKS cluster with many Services (1000 in my case) and many pods (300 pods), a big iptables ruleset leads to long execution times for some iptables rules (5s+).
This leads to xtables lock contention between kube-proxy and vpc-cni, despite specifying -w:
This race condition between kube-proxy and vpc-cni has led to longer initialization times for vpc-cni and frequent pod crashes due to the failing readiness check (60s delay + 3 * 10s interval). Related issue #2945
Using some of the logs from eks_i-0019a68d504566810_2024-06-06_1830-UTC_0.7.6.tar.gz to walk through this issue (uploaded, see the "Attach logs" section):
From ipamd.log I can tell the pod was restarted 5 times by the time I collected the logs. The following part of the logs overlaps with the kube-proxy logs around the same time, leading to the contention.
From the kube-proxy log, consecutive DEBUG logs at 2024-06-06T16:49:46:
From ipamd.log, consecutive DEBUG logs between 2024-06-06T16:49:41 and 2024-06-06T16:49:49:
Attach logs
I've got logs from running the CNI log collection tool on 3 different instances that ran into this issue:
eks_i-0130dc8295b19b0e3_2024-06-06_1901-UTC_0.7.6.tar.gz and eks_i-0019a68d504566810_2024-06-06_1830-UTC_0.7.6.tar.gz have been uploaded via file="<filename>"; curl -s https://d1mg6achc83nsz.cloudfront.net/ebf57d09395e2150ac2485091ba7c48aa46181dbdcae78620987d3d7d36ace4b/us-east-1/$file | bash
eks_i-02c1cd4484684230c_2024-06-05_1932-UTC_0.7.6.tar.gz has been emailed.
What you expected to happen:
kube-proxy is supposed to actually wait for 5s rather than saying 5s but only waiting 0.00001s. If this is not expected, this is a problem with the kube-proxy addon from EKS.
kube-proxy: moved from v1.29.1-eksbuild.2 to v1.29.3-eksbuild.2 and noticed this issue. Maybe it existed before as well.
kube-proxy may need to update iptables throughout its entire lifecycle, so this contention may not be entirely avoidable. I'd love to know if it's feasible to tell vpc-cni to wait for the part of iptables that's necessary for its own initialization.
If vpc-cni runs into a lock contention, it should spit out some logs about the situation as well as what it's going to do, e.g. add "Another app is currently holding the xtables lock; wait for X seconds" to the ipamd DEBUG logger.
How to reproduce it (as minimally and precisely as possible):
EKS@1.29
ami-0a5010afd9acfaa26 / amazon-eks-node-1.29-v20240227
r5.4xlarge (EKS managed nodegroup)
kube-proxy: v1.29.3-eksbuild.2
vpc-cni: v1.18.1-eksbuild.3
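The contention itself can be illustrated without a cluster. As far as I understand, iptables serializes callers with an exclusive flock on a lock file, and -w makes a caller keep retrying until a deadline. The sketch below mimics that pattern in Python on a temporary file. It does not touch the real xtables lock, and every name in it (`acquire_with_wait`, `hold_lock`, `demo`, the lock file name) is hypothetical.

```python
import fcntl
import os
import tempfile
import threading
import time


def acquire_with_wait(path, wait_seconds, poll_interval=0.01):
    """Try to take an exclusive flock on path, retrying until wait_seconds elapse.

    Returns the open file descriptor on success, or None on timeout.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    deadline = time.monotonic() + wait_seconds
    while True:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except BlockingIOError:
            # Someone else holds the lock; give up only once the deadline has passed.
            if time.monotonic() >= deadline:
                os.close(fd)
                return None
            time.sleep(poll_interval)


def hold_lock(path, hold_seconds, ready):
    """Stand-in for a long-running rule update that keeps the lock busy."""
    fd = acquire_with_wait(path, wait_seconds=1.0)
    ready.set()
    time.sleep(hold_seconds)
    os.close(fd)  # closing the descriptor releases the flock


def demo():
    """Return (short_wait_timed_out, long_wait_acquired)."""
    with tempfile.TemporaryDirectory() as tmp:
        lock_path = os.path.join(tmp, "xtables.lock.demo")
        ready = threading.Event()
        holder = threading.Thread(target=hold_lock, args=(lock_path, 0.3, ready))
        holder.start()
        ready.wait()
        # A wait shorter than the holder's update gives up without the lock.
        short_fd = acquire_with_wait(lock_path, wait_seconds=0.05)
        # A wait longer than the holder's update eventually gets the lock.
        long_fd = acquire_with_wait(lock_path, wait_seconds=2.0)
        holder.join()
        acquired = long_fd is not None
        if acquired:
            os.close(long_fd)
        return short_fd is None, acquired


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that a waiter whose deadline is shorter than the holder's update time fails, regardless of which side is "right". With 5s+ rule updates and a 5s wait, either component can end up on the losing side.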
Anything else we need to know?:
Environment:
kubectl version: v1.29.4-eks-036c24b
vpc-cni: v1.18.1-eksbuild.3
cat /etc/os-release:
uname -a: