-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: introduce num wal files to keep #25801
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall looks good, left some nitpicky name comments and a couple other things.
influxdb3/src/commands/serve.rs
Outdated
/// Number of wal files to keep behind, wal flush does not clear the wal files immediately | ||
/// instead they are only deleted when num wal files count exceed this keep behind size | ||
#[clap( | ||
long = "num-wal-files-to-keep", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be renamed to snapshotted-wal-files-to-keep
. It's not the actual total number of WAL files, it's the number that have been snapshotted. So if you have 200 files that haven't been snapshotted and this setting is 50, you'd have 250 total files at that point.
influxdb3_wal/src/object_store.rs
Outdated
use tokio::sync::Mutex; | ||
use tokio::sync::{oneshot, OwnedSemaphorePermit, Semaphore}; | ||
|
||
// struct WalObjectStoreArgs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover?
influxdb3_wal/src/object_store.rs
Outdated
@@ -25,6 +36,11 @@ pub struct WalObjectStore { | |||
added_file_notifiers: parking_lot::Mutex<Vec<Arc<dyn WalFileNotifier>>>, | |||
/// Buffered wal ops go in here along with the state to track when to snapshot | |||
flush_buffer: Mutex<FlushBuffer>, | |||
/// oldest, last and latest wal file nums are used to keep track of what wal files | |||
/// to remove from the OS | |||
num_wal_files_to_keep: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same, rename this so it indicates that it's the number of snapshotted wal files to keep.
all_wal_file_paths | ||
.iter() | ||
.filter(|path| { | ||
if let Some(last_wal_sequence_number) = last_wal_sequence_number { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I'm not sure of looking through this, does the persisted snapshot have the last known wal file number saved, or is it the wal file number it's snapshotting to? It should have both, but the one we care about here is the one it's snapshotting to. Can you verify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The persisted snapshot file carries the wal file number that came in the current wal contents, that's why @hiltontj had to use >=
instead of >
. SnapshotDetails
holds the last_wal_file_sequence_number
(which is based on wal period we're snapshotting). If we used the one coming in SnapshotDetails
we can then say load files only >
this wal file number
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so we need to update so that we're also keeping a record of what WAL file we're snapshotting to. That way we initialize with the correct one (i.e. not the latest WAL file written). In the case of a forced snapshot, those numbers will be the same, but in the case of normal snapshots, the snapshot number will be < latest wal number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed it in commit - good spot, thanks.
} | ||
|
||
if path_count == paths.len() { | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we sort before this so that paths always returns sorted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is from existing code, I thought it's doing the right already - yes I can sort it.
influxdb3_wal/src/object_store.rs
Outdated
#[derive(Debug)] | ||
struct WalFileRemover { | ||
oldest_wal_file: Option<WalFileSequenceNumber>, | ||
last_wal_sequence_number: Option<WalFileSequenceNumber>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename to last_snapshotted_wal_sequence_number
for clarity. The latest wal file number will always be >= this one.
This commit allows a configurable number of wal files to be left behind in OS. This is necessary as enterprise replicas rely on these files. closes: #25788
80a61d5
to
9096eee
Compare
9096eee
to
5e76230
Compare
2f6f29a
to
a03f3fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I think it looks good. Let's get it tested downstream!
This commit allows a configurable number of wal files to be left behind in OS. This is necessary as enterprise replicas rely on these files.
closes: #25788