Support deletion vectors #30

Open
dpxcc opened this issue Nov 14, 2024 · 0 comments

dpxcc commented Nov 14, 2024

What feature are you requesting?

Use deletion vectors to track deleted rows in data files.

Why are you requesting this feature?

Currently, our system uses a copy-on-write approach for UPDATE and DELETE operations: the entire data file is replaced even if only a single row is deleted. Deletion vectors would be more efficient, because a delete would only need to mark the affected rows instead of rewriting the file.

What is your proposed implementation for this feature?

When rows are deleted from a data file, a deletion vector is created: a bitmap that records, for each row in that data file, whether the row has been deleted.
The deletion vector is stored separately from the data file, either in its own file or in a Postgres heap table.
During a columnstore table scan, the deletion vector is used to skip deleted rows.
If a data file accumulates too many deletions, a new data file containing only the undeleted rows is created to replace it.
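
The steps above could be sketched roughly as follows. This is only an illustration of the idea, not the project's implementation: `DataFile`, `maybe_compact`, and the 0.5 threshold are hypothetical names and values chosen for the example, the data file is modeled as an in-memory list, and persisting the bitmap (in its own file or in a Postgres heap table) is left out.

```python
class DataFile:
    """Hypothetical stand-in for an immutable columnstore data file."""

    def __init__(self, rows):
        self.rows = list(rows)
        # Deletion vector: one bit per row, a set bit means "deleted".
        self.deletion_vector = bytearray((len(self.rows) + 7) // 8)

    def delete(self, row_idx):
        """Mark a row as deleted without touching the row data."""
        if not 0 <= row_idx < len(self.rows):
            raise IndexError(row_idx)
        self.deletion_vector[row_idx // 8] |= 1 << (row_idx % 8)

    def is_deleted(self, row_idx):
        return bool(self.deletion_vector[row_idx // 8] & (1 << (row_idx % 8)))

    def scan(self):
        """Yield only the rows whose bit is not set in the deletion vector."""
        for i, row in enumerate(self.rows):
            if not self.is_deleted(i):
                yield row

    def deleted_count(self):
        # Padding bits past the last row are never set, so this is exact.
        return sum(bin(byte).count("1") for byte in self.deletion_vector)

    def deleted_fraction(self):
        return self.deleted_count() / len(self.rows) if self.rows else 0.0


def maybe_compact(data_file, threshold=0.5):
    """Rewrite the file with only live rows once deletions exceed the threshold.

    The threshold is an assumption; the proposal only says "too many deletions".
    """
    if data_file.deleted_fraction() > threshold:
        return DataFile(data_file.scan())
    return data_file
```

With this sketch, deleting a single row flips one bit instead of rewriting the file, and a rewrite only happens once `maybe_compact` sees that more than the chosen fraction of rows is deleted. A production version would more likely use a compressed bitmap than a raw `bytearray`.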

dpxcc added the feature label Nov 14, 2024
dpxcc modified the milestones: tbd, 0.2.0 Nov 14, 2024