Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cancel activedefrag in case we empty db #1411

Open
wants to merge 2 commits into
base: unstable
Choose a base branch
from

Conversation

ranshid
Copy link
Member

@ranshid ranshid commented Dec 9, 2024

In case we empty database (eg user triggers flushdb command) the activedefrag might be in the process of scanning over the db kvstore.
Example:

Logged sanitizer errors (pid 53103):
=================================================================
==53103==ERROR: AddressSanitizer: heap-use-after-free on address 0x6080000026b8 at pc 0x0001006130d0 bp 0x00016fa72ec0 sp 0x00016fa72eb8
READ of size 4 at 0x6080000026b8 thread T0
    #0 0x1006130cc in defragStageKvstoreHelper+0x650 (valkey-server:arm64+0x1002870cc)
    #1 0x100610d64 in activeDefragTimeProc+0x2e0 (valkey-server:arm64+0x100284d64)
    #2 0x1003c128c in aeProcessEvents+0x630 (valkey-server:arm64+0x10003528c)
    #3 0x10040334c in main+0x6d7c (valkey-server:arm64+0x10007734c)
    #4 0x19d1ff150 in start+0x9a8 (dyld:arm64e+0xfffffffffff4d150)

0x6080000026b8 is located 24 bytes inside of 96-byte region [0x6080000026a0,0x608000002700)
freed by thread T3 here:
    #0 0x101338d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40)
    #1 0x1005bed10 in lazyfreeFreeDatabase+0x54 (valkey-server:arm64+0x100232d10)
    #2 0x100578750 in bioProcessBackgroundJobs+0x4e8 (valkey-server:arm64+0x1001ec750)
    #3 0x101335858 in asan_thread_start(void*)+0x40 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x51858)
    #4 0x19d589f90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #5 0x19d584d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)

previously allocated by thread T0 here:
    #0 0x101338fd0 in calloc+0x9c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54fd0)
    #1 0x1004120a0 in valkey_calloc+0x3c (valkey-server:arm64+0x1000860a0)
    #2 0x1003cd844 in kvstoreCreate+0xa8 (valkey-server:arm64+0x100041844)
    #3 0x1005bf778 in emptyDbAsync+0xbc (valkey-server:arm64+0x100233778)
    #4 0x100458ce8 in emptyData+0x354 (valkey-server:arm64+0x1000ccce8)
    #5 0x100459cd4 in flushallCommand+0x1c0 (valkey-server:arm64+0x1000cdcd4)
    #6 0x1003e9500 in call+0x3cc (valkey-server:arm64+0x10005d500)
    #7 0x1003ec7a4 in processCommand+0x17b4 (valkey-server:arm64+0x1000607a4)
    #8 0x100424358 in processInputBuffer+0x498 (valkey-server:arm64+0x100098358)
    #9 0x100422204 in readQueryFromClient+0xd4 (valkey-server:arm64+0x100096204)
    #10 0x100668688 in connSocketEventHandler+0x160 (valkey-server:arm64+0x1002dc688)
    #11 0x1003c0fb8 in aeProcessEvents+0x35c (valkey-server:arm64+0x100034fb8)
    #12 0x10040334c in main+0x6d7c (valkey-server:arm64+0x10007734c)
    #13 0x19d1ff150 in start+0x9a8 (dyld:arm64e+0xfffffffffff4d150)

Thread T3 created by T0 here:
    #0 0x1013301c8 in pthread_create+0x5c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4c1c8)
    #1 0x100578104 in bioInit+0x28c (valkey-server:arm64+0x1001ec104)
    #2 0x1003ff6b4 in main+0x30e4 (valkey-server:arm64+0x1000736b4)
    #3 0x19d1ff150 in start+0x9a8 (dyld:arm64e+0xfffffffffff4d150)

In order to avoid such issues we will reset activedefrag to start all over again.

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
@ranshid ranshid marked this pull request as ready for review December 9, 2024 16:52
@ranshid ranshid requested a review from madolson December 9, 2024 16:53
@ranshid
Copy link
Member Author

ranshid commented Dec 9, 2024

@JimB123 can you please take a look?
It was a quick solution that came to mind, but maybe there is something better we can do?

Copy link

codecov bot commented Dec 9, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 70.85%. Comparing base (e8078b7) to head (cbb5dac).
Report is 4 commits behind head on unstable.

Files with missing lines Patch % Lines
src/defrag.c 75.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1411      +/-   ##
============================================
+ Coverage     70.78%   70.85%   +0.06%     
============================================
  Files           118      118              
  Lines         63561    63596      +35     
============================================
+ Hits          44994    45059      +65     
+ Misses        18567    18537      -30     
Files with missing lines Coverage Δ
src/db.c 89.30% <100.00%> (+<0.01%) ⬆️
src/server.h 100.00% <ø> (ø)
src/defrag.c 89.51% <75.00%> (-0.09%) ⬇️

... and 12 files with indirect coverage changes

@JimB123
Copy link
Contributor

JimB123 commented Dec 9, 2024

Before doing this, I'd like to look at the defrag code to see if there's a better solution. But this is certainly an option.

@JimB123
Copy link
Contributor

JimB123 commented Dec 10, 2024

Would a better solution be to modify defragStageDbKeys and defragStageDbKeys and defragStageExpiresKvstore to use a DBID rather than a serverDb* as the reference? In this way, flushing/replacing the DB doesn't matter, defrag will dereference to the proper kvstore on every loop?

The only thing that needs to be check is if kvstoreDictScanDefrag can tolerate a cursor derived from a different DB. Would it just continue from an arbitrary point? Or would it potentially crash? Do we need to harden kvstoreDictScanDefrag to survive a bad cursor?

I'm worried about a RL case I saw once... A customer had 2 DBs, a big one and a small one. They were repeatedly flushing the small one (every few seconds). If we cancel the entire defrag operation, the bigger/stable DB might keep getting interrupted before completing defrag - and repeatedly have to start over from the beginning.

@ranshid
Copy link
Member Author

ranshid commented Dec 10, 2024

I'm worried about a RL case I saw once... A customer had 2 DBs, a big one and a small one. They were repeatedly flushing the small one (every few seconds). If we cancel the entire defrag operation, the bigger/stable DB might keep getting interrupted before completing defrag - and repeatedly have to start over from the beginning.

Yes this is something we discussed and I agree is a potential issue. I think working with indexes should always be preferred.
So you suggest to change the kvstoreIterState to hold the dbid and fetch the relevant kvs in defragStageKvstoreHelper?
That would probably require to propagate the specific kvs TYPE (eg keys or expire) to the defragStageKvstoreHelper.
Doable I think.

The only thing that needs to be check is if kvstoreDictScanDefrag can tolerate a cursor derived from a different DB.

I do not think there is any issue with that the cursor is blind to memory location AFAIK.

@JimB123
Copy link
Contributor

JimB123 commented Dec 10, 2024

So you suggest to change the kvstoreIterState to hold the dbid and fetch the relevant kvs in defragStageKvstoreHelper?
That would probably require to propagate the specific kvs TYPE (eg keys or expire) to the defragStageKvstoreHelper.

Not exactly, I think that reduces the flexibility of the kvstore helper - which is used for other kvstores (like pubsub).

What I'm thinking is to:

  • Change defragKeysCtx, replacing serverDB *db with int dbid`.
  • Change the target on defragStageDbKeys to be a DBID (using intptr_t)
  • defragStageKvstoreHelper would still receive a kvstore*, BUT now that the kvstore is being determined by the index, each timer invocation, it's possible that the assertion in defragStageKvstoreHelper might pop. I think, in this case, we can just return DEFRAG_DONE rather than asserting to indicate that we've completed that kvs and can move onto the next stage.

@zuiderkwast
Copy link
Contributor

zuiderkwast commented Dec 14, 2024

The only thing that needs to be check is if kvstoreDictScanDefrag can tolerate a cursor derived from a different DB. Would it just continue from an arbitrary point? Or would it potentially crash?

Arbitrary point. No crash.

But what if the SWAPDB command is used? Worst case, defrag will miss some keys? I can't imagine a use case for repeatedly calling SWAPDB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants