
Temporal Pod Does Not Bind to Correct Address with Dual-Stack GKE Cluster #50948

Open
tmas-definitive opened this issue Jan 6, 2025 · 1 comment
Labels
area/platform · autoteam · community · helm · team/deployments · type/bug

Comments

@tmas-definitive

tmas-definitive commented Jan 6, 2025

Helm Chart Version

1.1.0

What step the error happened?

On deploy

Relevant information

When deploying a Helm release of OSS Airbyte, the temporal pod listens only on IPv6, while the other pods attempt to connect over IPv4. This only appears to happen on a dual-stack GKE cluster; an IPv4-only cluster deploys and runs as expected.

To resolve this, I had to add extra Helm settings that force Temporal to bind and listen on the same address family the other pods use. This was achieved with the following settings in a helm_release Terraform resource:

set {
  name  = "temporal.extraEnv[0].name"
  value = "BIND_ON_IP"
}

set {
  name  = "temporal.extraEnv[0].valueFrom.fieldRef.fieldPath"
  value = "status.podIP"
}
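
For anyone not deploying through Terraform, the same workaround can be expressed directly in a Helm values file. This is a minimal sketch assuming the chart's temporal.extraEnv passthrough renders these entries verbatim into the temporal container's env list:

temporal:
  extraEnv:
    # Make Temporal bind to the pod's primary pod IP (IPv4 on this cluster)
    # instead of defaulting to an IPv6 listen address on dual-stack nodes.
    - name: BIND_ON_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP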

Relevant log output

Caused by: io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: airbyte-temporal/10.212.96.117:7233
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
	at io.grpc.netty.shaded.io.netty.channel.unix.Errors.newConnectException0(Errors.java:166) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.channel.unix.Errors.handleConnectErrno(Errors.java:131) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.channel.unix.Socket.finishConnect(Socket.java:359) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:710) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:687) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[grpc-netty-shaded-1.66.0.jar:1.66.0]
	at java.base/java.lang.Thread.run(Thread.java:1583) ~[?:?]
2024-12-17 18:30:35 WARN i.a.c.t.TemporalUtils(getTemporalClientWhenConnected):163 - Waiting for namespace default to be initialized in temporal

There are similar errors in several pods. The crux of the issue is that several pods repeatedly fail to connect to the temporal pod and restart continuously. Affected pods:

airbyte-workload-launcher-*
airbyte-worker-*
airbyte-server-*
airbyte-cron-*
@marcosmarxm
Member

I added this to the deployment team's backlog, @tmas-definitive.
