Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReadException "too many concurrent streams" #1987

Closed
visox opened this issue Oct 31, 2023 · 9 comments · Fixed by #2413
Closed

ReadException "too many concurrent streams" #1987

visox opened this issue Oct 31, 2023 · 9 comments · Fixed by #2413

Comments

@visox
Copy link

visox commented Oct 31, 2023

Hi i am running a service which makes lots of calls to some other service, lets say 10 calls per second.

The service runs properly but eventually there are periods when it starts to crash with

sttp.client3.SttpClientException$ReadException: Exception when sending request: POST https://xxxxx
	at sttp.client3.SttpClientExceptionExtensions.defaultExceptionToSttpClientException(SttpClientExceptionExtensions.scala:25)
	at sttp.client3.SttpClientExceptionExtensions.defaultExceptionToSttpClientException$(SttpClientExceptionExtensions.scala:9)
	at sttp.client3.SttpClientException$.defaultExceptionToSttpClientException(SttpClientException.scala:24)
	at sttp.client3.HttpClientAsyncBackend.adjustExceptions$$anonfun$1(HttpClientAsyncBackend.scala:147)
	at sttp.client3.SttpClientException$$anon$1.applyOrElse(SttpClientException.scala:35)
	at sttp.client3.SttpClientException$$anon$1.applyOrElse(SttpClientException.scala:34)
	at zio.ZIO.tryRescue$1$$anonfun$1(ZIO.scala:359)
	at scala.util.Either.fold(Either.scala:190)
	at zio.ZIO.tryRescue$1(ZIO.scala:359)
	at zio.ZIO.catchSome$$anonfun$1(ZIO.scala:361)
	at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1121)
	at zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
	at zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
	at zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
	at zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
	at zio.internal.ZScheduler$$anon$4.run(ZScheduler.scala:476)
Caused by: java.io.IOException: too many concurrent streams
	at java.net.http/jdk.internal.net.http.Http2Connection.reserveStream(Http2Connection.java:484)
	at java.net.http/jdk.internal.net.http.Http2ClientImpl.getConnectionFor(Http2ClientImpl.java:106)
	at java.net.http/jdk.internal.net.http.ExchangeImpl.get(ExchangeImpl.java:94)
	at java.net.http/jdk.internal.net.http.Exchange.establishExchange(Exchange.java:369)
	at java.net.http/jdk.internal.net.http.Exchange.responseAsyncImpl0(Exchange.java:553)
	at java.net.http/jdk.internal.net.http.Exchange.responseAsyncImpl(Exchange.java:406)
	at java.net.http/jdk.internal.net.http.Exchange.responseAsync(Exchange.java:398)
	at java.net.http/jdk.internal.net.http.MultiExchange.responseAsyncImpl(MultiExchange.java:409)
	at java.net.http/jdk.internal.net.http.MultiExchange.lambda$responseAsync0$2(MultiExchange.java:342)
	at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1773)
  lazy val sttpV = "3.8.+"

        "com.softwaremill.sttp.client3" %% "core" % sttpV,
      "com.softwaremill.sttp.client3" %% "zio" % sttpV,

Also its a POST request with some timeout a json body and some auth headers

in sttp code i do see

case e: java.io.IOException                   => Some(new ReadException(request, e))

and also from the stack trace one can see its an IO exception with too many concurrent streams

Any obvious solution ?

@kciesielski
Copy link
Member

@visox I'd like to look into this, if it's still relevant. Does the problem still occur?

@hochgi
Copy link

hochgi commented Nov 14, 2024

@kciesielski I stumbled on this as well.
Also with zio backend. (sttp client3 version: 3.9.1)

Caused by: sttp.client3.SttpClientException$ReadException: Exception when sending request: GET https://<REDACTED>
	at sttp.client3.SttpClientExceptionExtensions.defaultExceptionToSttpClientException(SttpClientExceptionExtensions.scala:25)
	at sttp.client3.SttpClientExceptionExtensions.defaultExceptionToSttpClientException$(SttpClientExceptionExtensions.scala:11)
	at sttp.client3.SttpClientException$.defaultExceptionToSttpClientException(SttpClientException.scala:24)
	at sttp.client3.HttpClientAsyncBackend.$anonfun$adjustExceptions$1(HttpClientAsyncBackend.scala:147)
	at sttp.client3.SttpClientException$$anonfun$adjustExceptions$1.applyOrElse(SttpClientException.scala:35)
	at sttp.client3.SttpClientException$$anonfun$adjustExceptions$1.applyOrElse(SttpClientException.scala:34)
	at zio.ZIO.$anonfun$catchSome$1(ZIO.scala:359)
	at scala.util.Either.fold(Either.scala:190)
	at zio.ZIO.tryRescue$1(ZIO.scala:359)
	at zio.ZIO.$anonfun$catchSome$4(ZIO.scala:361)
	... 6 more
Caused by: java.io.IOException: too many concurrent streams
	at java.net.http/jdk.internal.net.http.Http2Connection.reserveStream(Http2Connection.java:449)
	at java.net.http/jdk.internal.net.http.Http2ClientImpl.getConnectionFor(Http2ClientImpl.java:104)
	at java.net.http/jdk.internal.net.http.ExchangeImpl.get(ExchangeImpl.java:93)
	at java.net.http/jdk.internal.net.http.Exchange.establishExchange(Exchange.java:343)
	at java.net.http/jdk.internal.net.http.Exchange.responseAsyncImpl0(Exchange.java:475)
	at java.net.http/jdk.internal.net.http.Exchange.responseAsyncImpl(Exchange.java:380)
	at java.net.http/jdk.internal.net.http.Exchange.responseAsync(Exchange.java:372)
	at java.net.http/jdk.internal.net.http.MultiExchange.responseAsyncImpl(MultiExchange.java:408)
	at java.net.http/jdk.internal.net.http.MultiExchange.lambda$responseAsync0$2(MultiExchange.java:341)
	at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1773)
	... 1 more

@hochgi
Copy link

hochgi commented Nov 14, 2024

@kciesielski & @visox a workaround for me was just to retry:

sttpClient.send(httpRequest).retry(Schedule.fibonacci(1.second).whileOutput(60.seconds.>))

I'm guessing in a burst of too many requests this may happen, so eventually it'll succeed if we retry enough.

@ioaoue
Copy link

ioaoue commented Jan 21, 2025

I encountered a similar issue with an app frequently canceling requests and I think I was able to reproduce it. Here is an example code that can be run using scala-cli:

//> using jvm temurin:21
//> using javaOpt -XX:ActiveProcessorCount=4
//> using scala 3.3.4
//> using dep "dev.zio::zio:2.1.14"
//> using dep "com.softwaremill.sttp.client3::core:3.10.2"
//> using dep "com.softwaremill.sttp.client3::zio:3.10.2"
//> using dep "com.softwaremill.sttp.tapir::tapir-core:1.11.13"
//> using dep "com.softwaremill.sttp.tapir::tapir-armeria-server:1.11.13"

import com.linecorp.armeria.common.SessionProtocol
import com.linecorp.armeria.server.Server as ArmeriaServer
import scala.concurrent.Future
import scala.jdk.FutureConverters.*
import sttp.client3.*
import sttp.client3.httpclient.zio.HttpClientZioBackend
import sttp.model.*
import sttp.tapir.*
import sttp.tapir.server.armeria.ArmeriaFutureServerInterpreter
import zio.*

val port = 8080

def randomPause =
  Random.nextIntBounded(100).flatMap(n => ZIO.sleep(Duration.fromNanos(n)))

def runClient =
  val request = basicRequest.get(uri"http://localhost:$port/test").response(asString)
  for
    backend <- HttpClientZioBackend.scoped()
    _       <- request.send(backend) // initialize the connection
    requests = List.fill(20) {
      randomPause *>
        request.send(backend).catchAll(e => ZIO.succeed(e.printStackTrace())) *> // expecting an error here
        randomPause
    }
    _ <- {
      ZIO.log("Sending requests") *>
        ZIO.raceAll(requests.head, requests.tail)
    }.repeat(Schedule.spaced(Duration.fromMillis(500)))
  yield ()

def startServer =
  val responseText = "Test".repeat(100)
  for
    runtime <- ZIO.runtime[Any]
    _ <- ZIO.acquireRelease {
      ZIO.fromFuture { implicit ec =>
        val testEndpoint =
          endpoint.get.out(stringBody).serverLogic { _ =>
            Unsafe.unsafe[Future[Either[Unit, String]]] { implicit unsafe =>
              runtime.unsafe.runToFuture(randomPause.as(Right(responseText)))
            }
          }
        val tapirService = ArmeriaFutureServerInterpreter().toService(testEndpoint)
        val server = ArmeriaServer
          .builder()
          .port(port, SessionProtocol.HTTP)
          .http2MaxStreamsPerConnection(50)
          .http2MaxResetFramesPerWindow(Int.MaxValue, 1)
          .service(tapirService)
          .build()
        server.start().asScala.map(_ => server)
      }
    } { server =>
      ZIO.fromFuture(_ => server.stop().asScala).orDie
    }
  yield ()

object Demo extends ZIOAppDefault:
  override def run =
    for
      _ <- startServer
      _ <- runClient
    yield ()

This code repeatedly creates 20 fibers that send requests, and cancels them as soon as one of them completes. On my PC it starts logging "too many concurrent streams" errors after 1-2 minutes, even though the number of concurrent requests is always below the limit.

Using the debugger, I found that the numReservedClientStreams counter inside HttpClient is not always reset after all requests are cancelled. This may indicate that the HTTP/2 stream is not always closed, i.e. a resource leak occurs.

The documentation for HttpClient says:

In order for the resources associated with these streams to be reclaimed, and for the HTTP request to be considered completed, a caller must eventually obtain the streaming response body and close, cancel, or read the returned streams to exhaustion.

But it seems the HttpClientAsyncBackend implementation does not guarantee that the stream will be closed if the fiber was interrupted after receiving the HttpResponse, but before reading the response body. For example, it can be interrupted immediately after this async call completes.

@adamw
Copy link
Member

adamw commented Jan 22, 2025

@ioaoue thanks for a reproducer! I confirm that it happens on my machine as well.

Not sure about the interruption point you mentioned, though. The .async call encompasses sending the request & handling the response. So this whole code block is a "unit of interruption" here?

@ioaoue
Copy link

ioaoue commented Jan 22, 2025

If I understand correctly, the .async call does not encapsulate receiving the entire response. The documentation for HttpClient says:

Once an HttpResponse is received, the headers, response code, and body (typically) are available. Whether the response body bytes have been read or not depends on the type, T, of the response body.

In case of HttpClientAsyncBackend, the type T is returned by the bodyHandlerBodyToBody method. But this method does not read the response body, it only returns an effect that will read the body. The effect is only executed by calling .flatten here. So if the effect is canceled immediately after .async completes, the response body will not be read.

@adamw
Copy link
Member

adamw commented Jan 22, 2025

Ah, yes you're right of course. Now we just need to come up with a proper fix :)

@adamw
Copy link
Member

adamw commented Jan 23, 2025

See PR #2413 for a PoC of a fix. The code you posted now runs for a long time without throwing exceptions.

We can use ZIO's acquire/release to properly manage the lifecycle of the publisher. This has several consequences:

  • acquire is non-interruptible - but in fact, it never was - the returned canceler was "lying" (it didn't really cancel a request that was already in the process of being sent; it only cancelled requests which weren't yet scheduled to be sent). That's why one of the cancellation tests fails, as it should. So this only surfaced a problem - or rather a feature of the backend - which was always there
  • we now need to generalize this to other HttpClient backends
  • and extend cancellation test suite to verify when exactly requests are cancellable

@adamw
Copy link
Member

adamw commented Jan 23, 2025

Actually, it's a bit better - I forgot (but got reminded ;) ) that the futures returned by HttpClient are in fact cancellable (despite the javadoc saying something else). The only problem is that cancellation is asynchronous, which is not perfect, but as good as it gets (we can't obtain information if cancellation is done - same problem as in #1746). So we can use ZIO.acquireReleaseInterruptibleExit to get better cancellation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants