skip CRC checksumming during diskless full sync with TLS enabled. #1479

Open
wants to merge 14 commits into
base: unstable
from

Conversation

talxsha

@talxsha talxsha commented Dec 23, 2024

Implemented a mechanism to skip CRC64 checksumming during full sync when the RDB payload is never written to disk and the connection already provides data integrity checks (such as TLS). In that case the checksum adds overhead with minimal benefit.

Nodes can skip CRC calculations when these conditions are met:

  1. Running diskless sync on the primary.
  2. Running diskless load on the replica.
  3. Primary-replica connection is integrity checked.
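
The three conditions above can be pictured as a single predicate. This is only a minimal sketch and not the code from this PR: the `sync_state` struct, its field names, and `can_skip_crc` are hypothetical names made up for the illustration.

```c
#include <stdbool.h>

/* Hypothetical summary of one full-sync session. These fields are
 * illustrative and do not mirror the actual server structures. */
typedef struct sync_state {
    bool primary_diskless_sync;  /* primary streams the RDB straight to the socket */
    bool replica_diskless_load;  /* replica parses the RDB straight from the socket */
    bool conn_integrity_checked; /* transport verifies integrity itself, e.g. TLS */
} sync_state;

/* The checksum may be skipped only when all three conditions hold. */
static bool can_skip_crc(const sync_state *s) {
    return s->primary_diskless_sync &&
           s->replica_diskless_load &&
           s->conn_integrity_checked;
}
```

If any one condition is false (for example, the replica loads from a file on disk), the predicate returns false and the checksum stays in place.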

Closes #1129

@ranshid
Member

ranshid commented Dec 23, 2024

@talxsha before I look into this, let's put some details in the top comment. Linking the issue is not what we usually do.
Please briefly state what problem we are solving and what this solution includes.

@madolson
Member

Added the functionality to disable CRC calculations during diskless full sync with TLS enabled.

Also add justification for why we should do this only when TLS is enabled. The network already has built-in checksumming, and steady-state replication is not checksummed, so I'm still not convinced about the tradeoff we are making.

@ranshid
Member

ranshid commented Dec 24, 2024

@madolson should I tag it as a major-decision? I think it is worth discussing.

@madolson
Member

@madolson should I tag it as a major-decision? I think it is worth discussing.

For now it's not. It's just an internal one. I would probably just ping PingXie directly, and the core team in case anyone else is interested.

@talxsha talxsha marked this pull request as ready for review December 29, 2024 16:24
Signed-off-by: Tal Shachar <talxsha@amazon.com>
…o bypass_crc, encapsulated condition checks for skipping CRC, and changed connIsTLS condition to connIntegrityChecked in ConnectionType. Some changes in the test as well

Signed-off-by: Tal Shachar <talxsha@amazon.com>
Signed-off-by: talxsha <160726520+talxsha@users.noreply.github.com>
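
The commit above moves from asking "is this TLS?" to asking "does this connection type check integrity?". A minimal sketch of that idea follows, assuming a per-type hook in a table of function pointers. All names here (`connection_type`, `integrity_checked`, `ct_tls`, `ct_tcp`, `conn_integrity_checked`) are hypothetical and are not the identifiers or signatures used in the PR.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct connection connection;

/* Hypothetical, trimmed-down connection type: only the hook relevant here. */
typedef struct connection_type {
    const char *name;
    /* NULL means the transport makes no integrity guarantee. */
    bool (*integrity_checked)(const connection *conn);
} connection_type;

struct connection {
    const connection_type *type;
};

/* TLS records are authenticated, so this type reports true. */
static bool tls_integrity_checked(const connection *conn) {
    (void)conn;
    return true;
}

static const connection_type ct_tls = {"tls", tls_integrity_checked};
static const connection_type ct_tcp = {"tcp", NULL};

/* Generic wrapper: callers ask about the property, not about the type. */
static bool conn_integrity_checked(const connection *conn) {
    if (conn == NULL || conn->type == NULL || conn->type->integrity_checked == NULL) {
        return false;
    }
    return conn->type->integrity_checked(conn);
}
```

With this shape, a future transport that also authenticates its payload would only need to supply its own hook. The replication code asking the question would not change.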
src/server.h Outdated
@@ -1988,6 +1993,7 @@ struct valkeyServer {
char *rdb_filename; /* Name of RDB file */
int rdb_compression; /* Use compression in RDB? */
int rdb_checksum; /* Use RDB checksum? */
int bypass_crc; /* Skip RDB checksum? Applicable only for TLS enabled diskless full sync */
Member

I am not sure why we need to keep this server flag.
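
For context on what a skip flag would control, here is a minimal sketch of an output stream that updates a running CRC-64 on every write unless told to skip it. It is not the PR's code: `out_stream`, `stream_write`, and `crc64_update` are hypothetical names, the flag lives on the stream only to keep the example small, and the bitwise CRC is a slow stand-in that is not guaranteed to match the server's own crc64 output.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-64 using the reflected form of polynomial 0xad93d23594c935a9.
 * Slow but short; for illustration only. */
static uint64_t crc64_update(uint64_t crc, const unsigned char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1) ? (crc >> 1) ^ UINT64_C(0x95AC9329AC4BC9B5) : (crc >> 1);
        }
    }
    return crc;
}

/* Hypothetical output stream that tracks a running checksum. */
typedef struct out_stream {
    uint64_t cksum;       /* running checksum, stays 0 when skipped */
    size_t bytes_written; /* total payload bytes seen */
    bool skip_checksum;   /* when true, no CRC work is done per write */
} out_stream;

static void stream_write(out_stream *s, const void *buf, size_t len) {
    if (!s->skip_checksum) {
        s->cksum = crc64_update(s->cksum, (const unsigned char *)buf, len);
    }
    /* A real stream would also hand the bytes to the socket here. */
    s->bytes_written += len;
}
```

The per-byte loop is the overhead in question: with the flag set, each write only does the bookkeeping, and the checksum field is left at zero.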

Signed-off-by: ranshid <88133677+ranshid@users.noreply.github.com>
Signed-off-by: ranshid <88133677+ranshid@users.noreply.github.com>
Signed-off-by: ranshid <88133677+ranshid@users.noreply.github.com>
Signed-off-by: ranshid <88133677+ranshid@users.noreply.github.com>
Member

@ranshid ranshid left a comment

@madolson the code seems fine now. Would you like to take a quick look as well?

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Member

@madolson madolson left a comment

The code looks okay, mostly some nits to improve clarity.

talxsha and others added 3 commits January 5, 2025 11:22
Signed-off-by: talxsha <160726520+talxsha@users.noreply.github.com>
…ecessary metric and added a log instead. Using sendCommandArgv when sending replica capa

Signed-off-by: Tal Shachar <talxsha@amazon.com>
@ranshid ranshid changed the title from "CRC removal during diskless full sync with TLS enabled." to "skip CRC checksumming during diskless full sync with TLS enabled." Jan 14, 2025
@madolson madolson added the release-notes label (This issue should get a line item in the release notes) Jan 15, 2025

codecov bot commented Jan 17, 2025

Codecov Report

Attention: Patch coverage is 73.52941% with 9 lines in your changes missing coverage. Please review.

Project coverage is 71.00%. Comparing base (dc9ca1b) to head (5ea896f).
Report is 11 commits behind head on unstable.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/replication.c | 66.66% | 8 Missing ⚠️ |
| src/rdb.c | 85.71% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1479      +/-   ##
============================================
+ Coverage     70.98%   71.00%   +0.02%     
============================================
  Files           120      120              
  Lines         65061    65090      +29     
============================================
+ Hits          46185    46219      +34     
+ Misses        18876    18871       -5     

| Files with missing lines | Coverage Δ |
|---|---|
| src/connection.h | 87.77% <100.00%> (+0.27%) ⬆️ |
| src/rdma.c | 100.00% <ø> (ø) |
| src/rio.c | 84.83% <100.00%> (ø) |
| src/rio.h | 100.00% <ø> (ø) |
| src/server.h | 100.00% <ø> (ø) |
| src/socket.c | 91.62% <ø> (ø) |
| src/tls.c | 100.00% <ø> (ø) |
| src/unix.c | 73.49% <ø> (ø) |
| src/rdb.c | 77.03% <85.71%> (+0.09%) ⬆️ |
| src/replication.c | 87.31% <66.66%> (-0.14%) ⬇️ |

... and 12 files with indirect coverage changes

@ranshid
Member

ranshid commented Jan 17, 2025

@talxsha please fix the spellcheck/format issues. Also, the tests should probably be tagged with cluster:skip.

Labels
release-notes This issue should get a line item in the release notes
Development

Successfully merging this pull request may close these issues.

Skip CRC64 checksumming when doing diskless replication
3 participants