-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-36770: [C++] Use custom endpoint for s3 using environment variable AWS_ENDPOINT_URL #36791
Conversation
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format?
or
In the case of PARQUET issues on JIRA the title also supports:
See also: |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test to cpp/src/arrow/filesystem/s3fs_test.cc
?
cpp/src/arrow/filesystem/s3fs.cc
Outdated
@@ -337,6 +338,10 @@ Result<S3Options> S3Options::FromUri(const Uri& uri, std::string* out_path) { | |||
} else { | |||
options.ConfigureDefaultCredentials(); | |||
} | |||
auto endpoint_env = arrow::internal::GetEnvVar(kDefaultEndpointEnvVar); | |||
if(endpoint_env.ok()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if(endpoint_env.ok()) { | |
if (endpoint_env.ok()) { |
cpp/src/arrow/filesystem/s3fs.cc
Outdated
@@ -127,6 +127,7 @@ using internal::ToAwsString; | |||
using internal::ToURLEncodedAwsString; | |||
|
|||
static const char kSep = '/'; | |||
constexpr char kDefaultEndpointEnvVar[] = "AWS_ENDPOINT"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using AWS_ENDPOINT_URL
instead of AWS_ENDPOINT
because aws-sdk-cpp will support AWS_ENDPOINT_URL
aws/aws-sdk-cpp#2587 ?
(How about accepting AWS_ENDPOINT_URL
in arrow-rs too?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use the same name for constant name?
For example, kAwsEndpointEnvVar
for this case.
cpp/src/arrow/filesystem/s3fs.cc
Outdated
@@ -337,6 +338,10 @@ Result<S3Options> S3Options::FromUri(const Uri& uri, std::string* out_path) { | |||
} else { | |||
options.ConfigureDefaultCredentials(); | |||
} | |||
auto endpoint_env = arrow::internal::GetEnvVar(kDefaultEndpointEnvVar); | |||
if(endpoint_env.ok()) { | |||
options.endpoint_override = endpoint_env.MoveValueUnsafe(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we simplify this?
options.endpoint_override = endpoint_env.MoveValueUnsafe(); | |
options.endpoint_override = *endpoint_env; |
@kou thanks for your comments, I updated PR, please review again. |
--amend --amend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Could you update the PR title and description before we merge this?
|
1 similar comment
|
updated, thanks @kou |
Thanks. |
@assignUser We have #36791 (comment) message but we couldn't assign #36770 to abdmal automatically. |
Hi @kou, This is due to #34389, they will have to comment on the issue first. |
Ah! Thanks! |
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit f43bfd6. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
…riable AWS_ENDPOINT_URL (apache#36791) ### Rationale for this change we need a way to read custom object storage (such as minio host or other s3-like storage). use environment variable `AWS_ENDPOINT_URL ` ### What changes are included in this PR? set variable endpoint_override according the environment variable ### Are these changes tested? unittest and tested on pyarrow ### Are there any user-facing changes? No * Closes: apache#36770 Authored-by: yiwei.wang <yiwei.wang@weride.ai> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…riable AWS_ENDPOINT_URL (apache#36791) ### Rationale for this change we need a way to read custom object storage (such as minio host or other s3-like storage). use environment variable `AWS_ENDPOINT_URL ` ### What changes are included in this PR? set variable endpoint_override according the environment variable ### Are these changes tested? unittest and tested on pyarrow ### Are there any user-facing changes? No * Closes: apache#36770 Authored-by: yiwei.wang <yiwei.wang@weride.ai> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Rationale for this change
we need a way to read custom object storage (such as minio host or other s3-like storage).
use environment variable
AWS_ENDPOINT_URL
What changes are included in this PR?
set variable endpoint_override according the environment variable
Are these changes tested?
unittest and tested on pyarrow
Are there any user-facing changes?
No