Support multi-level partition tables #56
philippemnoel added the feature and priority-high labels on Aug 8, 2024
Hi @philippemnoel, I am interested in working on this feature. Could you please assign it to me?
Absolutely, it is yours! Thank you for your work :)
philippemnoel added the priority-low label and removed the priority-high label on Oct 20, 2024
For now, we only intend to support Hive partitioning.
What feature are you requesting?
Support for foreign tables in a multi-level partitioned table setup (see example).
Why are you requesting this feature?
When you have a lot of large files in one object store key/directory, scanning all of the files can be prohibitively time-consuming. Changing the storage structure to use Hive-style partition keys may not always be an option.
What is your proposed implementation for this feature?
This is an example of a multi-level partition table, but in many cases only the first partition level would be needed.
Example:
A “root” table that will have two partition “levels”:
The table for the first partition level:
Many partition tables may be created at this level - one for each `id_1` value in the root table.
Note: If these first two tables have to be created as foreign tables in order for the last partition level table to be a foreign table, they shouldn’t require a `files` option.
The table for the next partition level looks like this:
Many partition tables may be created at this level as well - one for each `id_2` value in the first partition level tables.
Note: when creating this final “leaf” partition table as a foreign table, a `files` option should be required.
So, when querying the root table with a where clause like this:
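As an illustration only, and not the example originally attached to this request, the hierarchy described above and such a query might look like the sketch below. It uses standard PostgreSQL declarative partitioning syntax. The table names, the `value` column, the literal partition values, the server name `parquet_server` (assumed to already exist), and the S3 path are all hypothetical. Whether the leaf foreign table is accepted as a partition is exactly what this issue asks for.

```sql
-- Root table: partitioned by id_1, stores no rows itself.
-- A regular (non-foreign) table is assumed here, so no files option is involved.
CREATE TABLE events (
    id_1  integer NOT NULL,
    id_2  integer NOT NULL,
    value text
) PARTITION BY LIST (id_1);

-- First partition level: one table per id_1 value,
-- itself partitioned by id_2. Again no files option.
CREATE TABLE events_1 PARTITION OF events
    FOR VALUES IN (1)
    PARTITION BY LIST (id_2);

-- Leaf partition: a foreign table, one per id_2 value.
-- This is the only level that carries a files option.
-- parquet_server and the path are placeholders.
CREATE FOREIGN TABLE events_1_10 PARTITION OF events_1
    FOR VALUES IN (10)
    SERVER parquet_server
    OPTIONS (files 's3://example-bucket/events/1/10.parquet');

-- Query against the root table; partition pruning on both keys
-- should route it to the single leaf events_1_10.
SELECT *
FROM events
WHERE id_1 = 1
  AND id_2 = 10;
```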
...only one Parquet file should be scanned - the one specified by the `files` option of the “leaf” partition table. Or, if a glob pattern is used in the `files` option of the leaf partition table, only the (ideally) small number of matching files will be scanned.
Full Name: Patrick Park
Affiliation: Payzer