Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

curl-based request does not maintain HTTP keepalive after 4xx response #1984

Open
sharky5102 opened this issue Jul 4, 2022 · 1 comment
Open
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue

Comments

@sharky5102
Copy link

Describe the bug

I have noticed that the AWS CPP SDK does not maintain HTTP keepalive with services when the service responds with a HTTP 4xx or 5xx status code - the connection is closed directly after receive such a response. The result is a high volume of HTTP reconnects and, in my case, high CPU usage due to excessive loading of /etc/pki/ssl/certs/ca-bundle.crt for each new connection (and the abundance of HTTP 400 errors in DynamoDB for ConditionalCheckFailed exceptions)

The root cause is behavior from libcurl, in which some code determines that 'upload was incomplete' after receiving an HTTP error code which is 300 or above (see https://github.com/curl/curl/blob/master/lib/http.c#L4128 ). The reason for this is that the AWS SDK CPP is providing a CURLOPT_READFUNCTION but not a CURLOPT_POSTFIELDSIZE. Normally this would result in a chunked upload, but CurlHttpClient.cpp is overriding this by providing a Content-length and Transfer-Encoding header explicitly. As a result, libcurl is unable to determine if the entire request body has been sent after receiving the response. Disputably this is a bug in libcurl, but the fix in the SDK seems pretty easy.

On my system the fix caused significant CPU usage improvements (like, 50%) due to not having to re-load the cert bundle for each request but it is heavily dependent on the number of 4xx's received.

Expected Behavior

HTTP keepalive should work when we receive a 400

Current Behavior

HTTP connection is dropped after any 3xx, 4xx or 5xx response from a service

Reproduction Steps

The easiest way to show this is by modeling the interaction with libcurl. The minimal code for libcurl to show the reconnection, modeled after the interaction that CurlHttpClient has with libcurl is:

#include <stdio.h>
#include <curl/curl.h>


size_t read_callback(char *ptr, size_t size, size_t nmemb, void *userdata)
{
  return fread(ptr, size, nmemb, (FILE *)userdata);
}

int main(void)
{
  CURL *curl;
  CURLcode res;

  curl = curl_easy_init();
  struct curl_slist *list = NULL;

  if(curl) {
    FILE *file = fopen("small", "rb");
    fseek(file, 0, 2);
    size_t size = ftell(file);
    fseek(file, 0, 0);
    char buf[1024];

    sprintf(buf, "Content-length: %d", size);

    list = curl_slist_append(list, buf);
    list = curl_slist_append(list, "Expect:");
    list = curl_slist_append(list, "transfer-encoding:");

    curl_easy_setopt(curl, CURLOPT_URL, "https://dynamodb.us-east-1.amazonaws.com");
    curl_easy_setopt(curl, CURLOPT_POST, 1L);
    curl_easy_setopt(curl, CURLOPT_READFUNCTION, read_callback);
    curl_easy_setopt(curl, CURLOPT_READDATA, (void *)file);
    curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, list);

    // ------------------------------------------ UNCOMMENT TO FIX
    //    curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, size);

    // FIST REQUEST
    res = curl_easy_perform(curl);
    if(res != CURLE_OK)
      fprintf(stderr, "curl_easy_perform() failed: %s\n",
              curl_easy_strerror(res));

    // SECOND REQUEST
    fseek(file, 0, 0);

    res = curl_easy_perform(curl);
    if(res != CURLE_OK)
      fprintf(stderr, "curl_easy_perform() failed: %s\n",
              curl_easy_strerror(res));

    curl_easy_cleanup(curl);
  }
  return 0;
}

Just compile with g++ -o curltest curltest.cpp -lcurl and create a file called small with some random content for it to attempt to upload.

This test makes two requests to dynamodb which are clearly incorrect, resulting in a 404 from dynamodb. What should happen, is that the second request re-uses the connection from the first, but it does not (see output below), unless the "UNCOMMENT TO FIX" line is uncommented.

Without the fix, this will output:

...
* HTTP error before end of send, stop sending
<
* Closing connection 0
...
* Hostname dynamodb.us-east-1.amazonaws.com was found in DNS cache
*   Trying 52.119.226.214:443...
* Connected to dynamodb.us-east-1.amazonaws.com (52.119.226.214) port 443 (#1)
...

uncommenting the fix produces:

...
* We are completely uploaded and fine
...
* Connection #0 to host dynamodb.us-east-1.amazonaws.com left intact
* Found bundle for host dynamodb.us-east-1.amazonaws.com: 0x7073d0 [serially]
* Can not multiplex, even if we wanted to!
* Re-using existing connection! (#0) with host dynamodb.us-east-1.amazonaws.com
* Connected to dynamodb.us-east-1.amazonaws.com (52.119.226.80) port 443 (#0)

Possible Solution

diff --git a/aws-cpp-sdk-core/source/http/curl/CurlHttpClient.cpp b/aws-cpp-sdk-core/source/http/curl/CurlHttpClient.cpp
index 5e6c8fc..8a06bca 100644
--- a/aws-cpp-sdk-core/source/http/curl/CurlHttpClient.cpp
+++ b/aws-cpp-sdk-core/source/http/curl/CurlHttpClient.cpp
@@ -653,6 +653,7 @@ std::shared_ptr<HttpResponse> CurlHttpClient::MakeRequest(const std::shared_ptr<
             curl_easy_setopt(connectionHandle, CURLOPT_READDATA, &readContext);
             curl_easy_setopt(connectionHandle, CURLOPT_SEEKFUNCTION, SeekBody);
             curl_easy_setopt(connectionHandle, CURLOPT_SEEKDATA, &readContext);
+            curl_easy_setopt(connectionHandle, CURLOPT_POSTFIELDSIZE, request->GetSize());
             if (request->IsEventStreamRequest())
             {
                 curl_easy_setopt(connectionHandle, CURLOPT_NOPROGRESS, 0L);

Additional Information/Context

Currently I'm using a different workaround, which is setting CURLOPT_KEEP_SENDING_ON_ERROR which also allows the connection to stay open. This is settable without modifying the AWS SDK CPP by setting this curl flag manually, but that seems like a bit of a hack.

AWS CPP SDK version used

I used 1.8 but the same code is in 1.9

Compiler and Version used

gcc

Operating System and version

linux AL2

@sharky5102 sharky5102 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 4, 2022
@jmklix jmklix self-assigned this Nov 18, 2022
@jmklix jmklix added needs-reproduction This issue needs reproduction. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Nov 18, 2022
@jmklix jmklix added feature-request A feature should be added or improved. and removed bug This issue is a bug. needs-reproduction This issue needs reproduction. labels Apr 19, 2024
@jmklix jmklix removed their assignment Apr 19, 2024
@jmklix
Copy link
Member

jmklix commented Apr 19, 2024

This is currently behaving as expected, but it is a feature request that we might add at some point in the future

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue
Projects
None yet
Development

No branches or pull requests

2 participants