Concurrency Correctness (Cache) #42

Open
nhusung opened this issue Apr 28, 2023 · 1 comment
nhusung commented Apr 28, 2023

Shouldn't it be memory_order_acquire here:

const uint32_t s = atomic_load_explicit(s_bucket, memory_order_relaxed);

I'm not an expert on the C/C++ memory model, but as far as I understand it, the compiler or CPU is free to reorder the instructions roughly like this (i.e., move the loads of the bucket values before the load of the status):

    uint64_t bucket_a = bucket->a, bucket_b = bucket->b, bucket_c = bucket->c;
    const uint32_t s = atomic_load_explicit(s_bucket, memory_order_relaxed);
    // abort if locked or if part of a 2-part cache entry
    if (s & 0xc0000000) return 0;
    // abort if different hash
    if ((s ^ (hash>>32)) & 0x3fff0000) return 0;
    // abort if key different
    if (bucket_a != a || bucket_b != b || bucket_c != c) return 0;
    *res = bucket->res;
    // abort if status field changed after compiler_barrier()
    return atomic_load_explicit(s_bucket, memory_order_acquire) == s ? 1 : 0;

memory_order_acquire guarantees that “no reads or writes in the current thread can be reordered before this load,” which would fix the problem. In return one could probably change the memory_order_acquire of the second atomic load into memory_order_relaxed, at least I don’t see any need for memory_order_acquire here.

Also, atomic_compare_exchange_weak in the put functions could probably be replaced by atomic_compare_exchange_weak_explicit with memory_order_acq_rel for the success case and memory_order_relaxed for the failure case. This may have no effect on the generated assembly for strongly ordered architectures such as x86-64, but may have a performance impact on others. Furthermore, it would be great if the compiler_barrier() macro and the accompanying comments could be removed; they are somewhat confusing when reading the code for the first time.

@nhusung nhusung changed the title Concurrency Correctness Concurrency Correctness (Cache) Apr 28, 2023

nhusung commented Apr 30, 2023

In addition, the loads and stores here (and similarly for cache_get6() and cache_put6()) need to be atomic operations:

sylvan/src/sylvan_cache.c

Lines 182 to 183 in 0278500

if (bucket->a != a || bucket->b != b || bucket->c != c) return 0;
*res = bucket->res;

sylvan/src/sylvan_cache.c

Lines 209 to 212 in 0278500

bucket->a = a;
bucket->b = b;
bucket->c = c;
bucket->res = res;

Section 5.1.2.4 of the C23 language specification draft states that if “there is ambiguity about which side effect to a non-atomic object is visible, then there is a data race and the behavior is undefined.” Assume that thread T1 executes cache_get() up to the point immediately before the loads cited above (L182). Then thread T2 executes cache_put() for the same bucket up to the point immediately before the stores cited above (L209). Now, there is no proper happens-before relationship between the cited loads and stores. Therefore, the values visible to T1 are ambiguous and we have undefined behavior. As I read the language specification, it does not matter that this ambiguity is finally “resolved” by checking the status on line 185 in cache_get().
