[BUG] Error while checking bucket ownership with connector athena-federation-jdbc to mysql #1702

Open
evbo opened this issue Jan 14, 2024 · 1 comment
Labels
bug Something isn't working

evbo commented Jan 14, 2024

Describe the bug
This bug was originally reported in an earlier issue that has since been closed. I am reopening it as a new issue at the request of @akuzin1.

To Reproduce
Create buckets in multiple regions in your S3 account (e.g. us-west-1, us-west-2). Then create a new Athena Data Source for a MySQL RDS instance in a private VPC, making sure a VPC endpoint for S3 is enabled. Even with correct security groups and subnet IDs, the Lambda fails: its call to listBuckets tries to list buckets that are outside the region served by the VPC endpoint for S3.

Expected behavior
A basic select * from mysql_data_source should run without connector errors even when the S3 account has buckets in multiple regions. The code should be updated to stop calling listBuckets and to interact only with the configured spill-bucket.
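
As a rough illustration of the behavior requested above, here is a minimal sketch of an ownership check that touches only the configured spill bucket. `BucketProbe`, `SpillBucketOwnershipCheck`, the bucket name and the account ID are all hypothetical names made up for this example; none of them come from the connector's code. In a real connector the probe would presumably wrap a single-bucket S3 call (for example `headBucket` or `getBucketLocation` on the SDK's `AmazonS3` client), but that wiring is an assumption and is left out so the sketch stays self-contained.

```java
/**
 * Hypothetical single-bucket probe. Illustration only; this interface is
 * NOT part of the Athena Query Federation SDK.
 */
interface BucketProbe {
    /** Returns normally if the bucket is reachable and owned by the account; throws otherwise. */
    void assertOwnedBy(String bucket, String accountId);
}

/** Checks only the configured spill bucket and caches the outcome. */
final class SpillBucketOwnershipCheck {
    enum BucketState { UNCHECKED, VALID, INVALID }

    private final BucketProbe probe;
    private final String spillBucket;
    private final String accountId;
    private BucketState state = BucketState.UNCHECKED;

    SpillBucketOwnershipCheck(BucketProbe probe, String spillBucket, String accountId) {
        this.probe = probe;
        this.spillBucket = spillBucket;
        this.accountId = accountId;
    }

    /** Probes the spill bucket at most once; no other bucket is ever requested. */
    synchronized BucketState check() {
        if (state != BucketState.UNCHECKED) {
            return state; // reuse the cached result
        }
        try {
            probe.assertOwnedBy(spillBucket, accountId);
            state = BucketState.VALID;
        } catch (RuntimeException e) {
            // e.g. a 403 from S3; the bucket is treated as not usable for spilling
            state = BucketState.INVALID;
        }
        return state;
    }

    public static void main(String[] args) {
        // Stub probe that always succeeds; a real one would call S3.
        BucketProbe alwaysOk = (bucket, account) -> { };
        SpillBucketOwnershipCheck check =
                new SpillBucketOwnershipCheck(alwaysOk, "my-spill-bucket", "111122223333");
        System.out.println(check.check());
    }
}
```

Because the probe is only ever given the spill bucket's name, buckets in other regions never enter the picture, which is the property this issue is asking for.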

Screenshots / Exceptions / Errors

GENERIC_USER_ERROR: Encountered an exception[java.lang.RuntimeException] from your LambdaFunction[...] executed in context[retrieving meta-data] with message[Error while checking bucket ownership for ...]
This query ran against the "..." database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: ...

Connector Details (please complete the following information):

  • Version: 2023.49.2
  • Name: mysql
  • Athena Query IDs [if applicable]

Additional context

From the previously closed issue, two questions remain unanswered:

  • Can the docs be clarified for spill-bucket? Should we enter the bucket name or a URL? I used the bucket name. Was that wrong?
  • Can this connector be updated to never call listBuckets? An account can have buckets in other regions, outside the reach of the VPC endpoint, and there is no way to avoid that. Could the connector instead make only requests that involve the spill-bucket? For instance, could only getBucket be used?
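
On the first question, one way the ambiguity could be sidestepped is for the configuration value to be normalized before use. The sketch below is only an illustration of that idea and does not describe what the connector actually accepts. `SpillBucketSetting` and `toBucketName` are hypothetical names, and the pattern is a simplified form of the general S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, dots and hyphens; starting and ending with a letter or digit).

```java
import java.util.regex.Pattern;

/** Hypothetical helper; not part of the connector. */
final class SpillBucketSetting {
    // Simplified version of the general S3 bucket naming rules.
    private static final Pattern BUCKET_NAME =
            Pattern.compile("^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$");

    /** Accepts "my-bucket" or "s3://my-bucket/prefix" and returns the bare bucket name. */
    static String toBucketName(String configured) {
        String value = configured.trim();
        if (value.startsWith("s3://")) {
            value = value.substring("s3://".length());
        }
        int slash = value.indexOf('/');
        if (slash >= 0) {
            value = value.substring(0, slash); // drop any key prefix
        }
        if (!BUCKET_NAME.matcher(value).matches()) {
            throw new IllegalArgumentException(
                    "spill bucket must be a bucket name or an s3:// URL, got: " + configured);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toBucketName("my-spill-bucket"));
        System.out.println(toBucketName("s3://my-spill-bucket/athena-spill/"));
    }
}
```

Rejecting anything else with a clear message (an https:// URL, for instance) would at least turn a confusing ownership error into an obvious configuration error.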

CloudWatch logs:

2024-01-14 22:07:29 ... WARN CompositeHandler:116 - handleRequest: Completed with an exception.
java.lang.RuntimeException: Error while checking bucket ownership for ...
at com.amazonaws.athena.connector.lambda.domain.spill.SpillLocationVerifier.updateBucketState(SpillLocationVerifier.java:100) ~[task/:?]
at com.amazonaws.athena.connector.lambda.domain.spill.SpillLocationVerifier.checkBucketAuthZ(SpillLocationVerifier.java:74) ~[task/:?]
at com.amazonaws.athena.connector.lambda.handlers.MetadataHandler.doHandleRequest(MetadataHandler.java:288) ~[task/:?]
at com.amazonaws.athena.connector.lambda.handlers.CompositeHandler.handleRequest(CompositeHandler.java:144) ~[task/:?]
at com.amazonaws.athena.connector.lambda.handlers.CompositeHandler.handleRequest(CompositeHandler.java:112) [task/:?]
at lambdainternal.EventHandlerLoader$2.call(EventHandlerLoader.java:925) [aws-lambda-java-runtime-0.2.0.jar:?]
at lambdainternal.AWSLambda.startRuntime(AWSLambda.java:268) [aws-lambda-java-runtime-0.2.0.jar:?]
at lambdainternal.AWSLambda.startRuntime(AWSLambda.java:207) [aws-lambda-java-runtime-0.2.0.jar:?]
at lambdainternal.AWSLambda.main(AWSLambda.java:196) [aws-lambda-java-runtime-0.2.0.jar:?]
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: ...; S3 Extended Request ID: ...=; Proxy: null)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1418) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1387) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561) ~[task/:?]
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541) ~[task/:?]
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5520) ~[task/:?]
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5467) ~[task/:?]
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5461) ~[task/:?]
at com.amazonaws.services.s3.AmazonS3Client.listBuckets(AmazonS3Client.java:1056) ~[task/:?]
at com.amazonaws.services.s3.AmazonS3Client.listBuckets(AmazonS3Client.java:1062) ~[task/:?]
at com.amazonaws.athena.connector.lambda.domain.spill.SpillLocationVerifier.updateBucketState(SpillLocationVerifier.java:88) ~[task/:?]

evbo commented Jan 19, 2024

This issue is discussed here:
https://stackoverflow.com/a/67261680/1080804

@akuzin1 Instead of calling listBuckets, could the connector call getBucketLocation and filter out any buckets that are not in the Lambda's current region?
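
To make the suggestion concrete, here is a minimal sketch of the filtering step. `RegionBucketFilter` and `inRegion` are hypothetical names, and the region lookup is passed in as a plain function so the example runs without the AWS SDK; in practice it might wrap `getBucketLocation`, which is an assumption. The sketch also assumes the candidate bucket names are already known (for example from configuration), and whether the lookup itself succeeds through a regional VPC endpoint for buckets in other regions has not been verified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper; not part of the connector. */
final class RegionBucketFilter {
    /**
     * Keeps only the buckets whose looked-up region equals the Lambda's region.
     * Buckets whose lookup fails (e.g. access denied) are skipped rather than
     * failing the whole check.
     */
    static List<String> inRegion(List<String> bucketNames,
                                 String lambdaRegion,
                                 Function<String, String> regionLookup) {
        List<String> kept = new ArrayList<>();
        for (String name : bucketNames) {
            String region;
            try {
                region = regionLookup.apply(name);
            } catch (RuntimeException e) {
                continue; // unreachable bucket: ignore it
            }
            if (lambdaRegion.equals(region)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stub lookup: pretend bucket names ending in "-west-2" live in us-west-2.
        Function<String, String> lookup =
                name -> name.endsWith("-west-2") ? "us-west-2" : "us-west-1";
        List<String> names = List.of("spill-west-2", "data-west-1");
        System.out.println(inRegion(names, "us-west-2", lookup));
    }
}
```

If only the spill bucket matters, the same lookup could be applied to that single name, which avoids needing a bucket list at all.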
