Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix remote call of s3Cluster function #583

Open
wants to merge 8 commits into
base: project-antalya-24.12.2
Choose a base branch
from

Conversation

ianton-ru
Copy link

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix remote call of s3Cluster function

Documentation entry for user-facing changes

remote('remote_host', ''s3Cluster(.....)') did not work with error

2024.11.24 09:07:37.307264 [ 11 ] {5548e29a-ab3e-4306-9925-c5d39e47fa9c} <Error> executeQuery: Code: 49. DB::Exception: Distributed task iterator is not initialized: While executing Remote. (LOGICAL_ERROR) (version 24.11.1.1) (from 172.16.1.1:55326) (in query: SELECT * FROM remote('s0_0_1', s3Cluster('cluster_simple', 'http://minio1:9001/root/data/{clickhouse,database}/*', 'minio', '[HIDDEN]', 'CSV', 'name String, value UInt32, polygon Array(Array(Tuple(Float64, Float64)))')) ORDER BY (name, value, polygon) ASC LIMIT 1), Stack trace (when copying this message, always include the lines below):

0. ./contrib/llvm-project/libcxx/include/exception:141: Poco::Exception::Exception(String const&, int) @ 0x0000000014cd1e72
1. /home/iantonspb/ch-build/./src/Common/Exception.cpp:109: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000ba7cad9
2. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x00000000069abf0c
3. DB::Exception::Exception<>(int, FormatStringHelperImpl<>) @ 0x00000000069b5e8b
4. /home/iantonspb/ch-build/./src/QueryPipeline/RemoteQueryExecutor.cpp:715: DB::RemoteQueryExecutor::processReadTaskRequest() @ 0x000000000fecbf12
5. /home/iantonspb/ch-build/./src/QueryPipeline/RemoteQueryExecutor.cpp:614: DB::RemoteQueryExecutor::processPacket(DB::Packet) @ 0x000000000feca25a
6. /home/iantonspb/ch-build/./src/QueryPipeline/RemoteQueryExecutor.cpp:553: DB::RemoteQueryExecutor::readAsync() @ 0x000000000fecb60c
7. /home/iantonspb/ch-build/./src/Processors/Sources/RemoteSource.cpp:182: DB::RemoteSource::tryGenerate() @ 0x000000001295b590
8. /home/iantonspb/ch-build/./src/Processors/ISource.cpp:108: DB::ISource::work() @ 0x0000000012648b45
9. /home/iantonspb/ch-build/./src/Processors/Executors/ExecutionThreadContext.cpp:49: DB::ExecutionThreadContext::executeTask() @ 0x000000001265fc00
10. /home/iantonspb/ch-build/./src/Processors/Executors/PipelineExecutor.cpp:289: DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x0000000012655c3f
11. /home/iantonspb/ch-build/./src/Processors/Executors/PipelineExecutor.cpp:255: DB::PipelineExecutor::executeImpl(unsigned long, bool) @ 0x000000001265541d
12. /home/iantonspb/ch-build/./src/Processors/Executors/PipelineExecutor.cpp:126: DB::PipelineExecutor::execute(unsigned long, bool) @ 0x0000000012655207
13. /home/iantonspb/ch-build/./src/Processors/Executors/PullingAsyncPipelineExecutor.cpp:83: void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::PullingAsyncPipelineExecutor::pull(DB::Chunk&, unsigned long)::$_0>(DB::PullingAsyncPipelineExecutor::pull(DB::Chunk&, unsigned long)::$_0&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x0000000012662d8c
14. ./contrib/llvm-project/libcxx/include/__functional/function.h:848: ? @ 0x000000000bb49da3
15. ./contrib/llvm-project/libcxx/include/__functional/invoke.h:359: ? @ 0x000000000bb4f07b
16. ? @ 0x00007f08de1c4ac3
17. ? @ 0x00007f08de256850

The reason is this:
https://github.com/ClickHouse/ClickHouse/blob/master/src/TableFunctions/TableFunctionObjectStorageCluster.cpp#L34
s3Cluster function has two stages, and stage is choose depends of query_kind - INITIAL_QUERY for first stage and SECONDARY_QUERY for second stage. When s3Cluster called from remote function, both stages were with SECONDARY_QUERY.

Now requests from remote function use INITIAL_QUERY.

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

CI Settings (Only check the boxes if you know what you are doing):

  • Allow: All Required Checks
  • Allow: Stateless tests
  • Allow: Stateful tests
  • Allow: Integration Tests
  • Allow: Performance tests
  • Allow: All Builds
  • Allow: batch 1, 2 for multi-batch jobs
  • Allow: batch 3, 4, 5, 6 for multi-batch jobs

  • Exclude: Style check
  • Exclude: Fast test
  • Exclude: All with ASAN
  • Exclude: All with TSAN, MSAN, UBSAN, Coverage
  • Exclude: All with aarch64, release, debug

  • Run only fuzzers related jobs (libFuzzer fuzzers, AST fuzzers, etc.)
  • Exclude: AST fuzzers

  • Do not test
  • Woolen Wolfdog
  • Upload binaries for special builds
  • Disable merge-commit
  • Disable CI cache

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

CI Settings (Only check the boxes if you know what you are doing):

  • Allow: All Required Checks
  • Allow: Stateless tests
  • Allow: Stateful tests
  • Allow: Integration Tests
  • Allow: Performance tests
  • Allow: All Builds
  • Allow: batch 1, 2 for multi-batch jobs
  • Allow: batch 3, 4, 5, 6 for multi-batch jobs

  • Exclude: Style check
  • Exclude: Fast test
  • Exclude: All with ASAN
  • Exclude: All with TSAN, MSAN, UBSAN, Coverage
  • Exclude: All with aarch64, release, debug

  • Run only fuzzers related jobs (libFuzzer fuzzers, AST fuzzers, etc.)
  • Exclude: AST fuzzers

  • Do not test
  • Woolen Wolfdog
  • Upload binaries for special builds
  • Disable merge-commit
  • Disable CI cache

@@ -2702,8 +2702,11 @@ void Context::setCurrentQueryId(const String & query_id)

client_info.current_query_id = query_id_to_set;

if (client_info.query_kind == ClientInfo::QueryKind::INITIAL_QUERY)
if (client_info.query_kind == ClientInfo::QueryKind::INITIAL_QUERY
&& (getApplicationType() != ApplicationType::SERVER || client_info.initial_query_id.empty()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the comment in ClientInfo.h
when query_kind == INITIAL_QUERY
initial_query_id is equal to current.
Does not it contradict with the condition?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it must be some new kind, something like REMOTE_INITIAL_QUERY, but this breaks backward compatibility on protocol level.

@@ -442,6 +442,7 @@ void executeQuery(
not_optimized_cluster->getName());

read_from_remote->setStepDescription("Read from remote replica");
read_from_remote->setRemoteFunction(is_remote_function);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, "setIsRemoteFunction" is slightly more natural.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed

"""
)

assert TSV(pure_s3) == TSV(s3_distributed)
Copy link
Collaborator

@ilejn ilejn Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the PR really s3Cluster specific?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, affects all *Cluster object storage functions. Suggest to make test for others?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about new tests, may be. Depends on your feeling how probably is to break something that worked before or accidentally create a "bridge" that e.g. bypasses security.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test for iceberg, most actual for us.

@ianton-ru ianton-ru force-pushed the project-antalya-24.12.2-remote-s3-cluster branch from 7473648 to 8b6064f Compare January 14, 2025 12:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants