Eager Empty Bucket Checking #148
@@ -125,6 +125,18 @@ class Sketch {

std::mutex mutex; // lock the sketch for applying updates in multithreaded processing

/**
Line up `*` and provide a little more documentation. Specify that this operates per column and that all non-empty buckets are above this cutoff.
Better name?
couldn't think of one
src/sketch.cpp
Outdated
@@ -18,6 +27,14 @@ Sketch::Sketch(vec_t vector_len, uint64_t seed, size_t _samples, size_t _cols) :
buckets[i].alpha = 0;
buckets[i].gamma = 0;
}

#ifdef EAGER_BUCKET_CHECK
We want to have two forms of serialization:
- Direct "no-copy" serialization where we pull the bucket array data out directly.
- Compressed serialization which leverages the known column sizes to be more data size efficient.
To pull this off we need the buckets and flags to be stored contiguously. Suggest allocating a few extra buckets that are then used for the flags.
NOTE: We will all have to think about the right way to "receive" these serialized forms on the other side.
src/sketch.cpp
Outdated
@@ -78,6 +118,13 @@ void Sketch::update(const vec_t update_idx) {
size_t bucket_id = i * bkt_per_col + depth;
likely_if(depth < bkt_per_col) {
Bucket_Boruvka::update(buckets[bucket_id], update_idx, checksum);
#ifdef EAGER_BUCKET_CHECK
likely_if(!Bucket_Boruvka::is_empty(buckets[bucket_id])) {
unlikely_if(is_empty)
Investigate the performance of the following:
unlikely_if(is_empty)
set_bit()
update()
unlikely_if(is_empty)
clear_bit()
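A minimal sketch of the ordering suggested above, for benchmarking purposes only. Everything here is a hypothetical stand-in: `ToyBucket`, `toy_is_empty`, `toy_update`, and `update_with_flag` are not the repository's `Bucket_Boruvka` API, and `__builtin_expect` (GCC/Clang) stands in for `unlikely_if`. It assumes a bucket is empty exactly when both `alpha` and `gamma` are zero, and that an update never has both `idx` and `checksum` equal to zero.

```cpp
#include <cassert>
#include <cstdint>

// Toy bucket; the field names follow the diff, the rest is illustrative.
struct ToyBucket {
  uint64_t alpha = 0;
  uint64_t gamma = 0;
};

inline bool toy_is_empty(const ToyBucket &b) { return (b.alpha | b.gamma) == 0; }

// XOR-style update, assumed here to mirror what a bucket update does.
inline void toy_update(ToyBucket &b, uint64_t idx, uint64_t checksum) {
  b.alpha ^= idx;
  b.gamma ^= checksum;
}

// Check-before / check-after ordering: both branches are hinted as unlikely.
inline void update_with_flag(ToyBucket &b, uint64_t &mask, int row,
                             uint64_t idx, uint64_t checksum) {
  const uint64_t bit = uint64_t{1} << row;
  // An empty bucket is about to become nonempty, so set its flag first.
  if (__builtin_expect(toy_is_empty(b), 0)) mask |= bit;
  toy_update(b, idx, checksum);
  // If the update cancelled the bucket's contents, clear the flag again.
  if (__builtin_expect(toy_is_empty(b), 0)) mask &= ~bit;
}
```

After every call the flag bit equals "bucket is nonempty", which is the invariant the column mask needs; whether this ordering beats a single post-update check is exactly what the benchmark would have to show.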
src/sketch.cpp
Outdated
return {buckets[bucket_id].alpha, GOOD};

for (size_t col = first_column; col < first_column + cols_per_sample; ++col) {
int row = effective_size(col)-1;
int(effective_size(col))
src/sketch.cpp
Outdated
{
// first, check for emptyness
Bucket *current_row = buckets + (col_idx * bkt_per_col);
if (Bucket_Boruvka::is_empty(buckets[num_buckets - 1]))
I would drop this or move it into the `#else`. It's a more expensive call than the `clzll`.
}
#ifdef EAGER_BUCKET_CHECK
unlikely_if(nonempty_buckets[col_idx] == 0) return 0;
return (uint8_t)((sizeof(unsigned long long) * 8) - __builtin_clzll(nonempty_buckets[col_idx]));
Is there a convenient way to use `ctzll`? It's about 3 times faster than `clzll`.
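For reference, the `clzll` computation in the hunk above can be isolated as a small standalone function. This is only a sketch: `column_mask_t` and `effective_size_from_mask` are hypothetical names, and it assumes bit `r` of the mask is set when row `r` of the column is nonempty (first bucket at the least significant bit).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one column's entry in nonempty_buckets.
using column_mask_t = unsigned long long;

// Returns (index of deepest nonempty bucket) + 1, or 0 for an empty column.
inline uint8_t effective_size_from_mask(column_mask_t mask) {
  // __builtin_clzll has an undefined result for 0, so handle that case first.
  if (mask == 0) return 0;
  return static_cast<uint8_t>(sizeof(column_mask_t) * 8 - __builtin_clzll(mask));
}
```

With this bit order the deepest nonempty row is the highest set bit, which is why a leading-zero count is needed; using a trailing-zero count instead would require reversing the bit order of the mask.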
@@ -17,7 +17,11 @@

constexpr uint64_t KB = 1024;
Pls clean this file. (i.e. get rid of the HashSet etc.)
}
BENCHMARK(BM_Std_Set_Hash_Iterator)->RangeMultiplier(2)->Range(1, 1 << 14);

BENCHMARK_MAIN();
Add newline
@@ -5,6 +5,15 @@
#include <vector>
#include <cassert>

inline static void set_bit(vec_t &t, int position) {
Suggest switching the position of first bucket to most significant bit.
t |= 1 << (sizeof(vec_t) * 8 - position)
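One way the most-significant-bit-first layout could look, as a hedged sketch rather than the PR's code. It assumes `vec_t` is a 64-bit unsigned integer; the `_msb_first` function names and `kVecBits` are hypothetical. Two details differ from the one-liner above: the shifted constant is `vec_t`-wide (a plain `1` is an `int`, so shifting it by 32 or more is undefined), and the shift is `bits - 1 - position` so that position 0 stays in range.

```cpp
#include <cassert>
#include <cstdint>

using vec_t = uint64_t;  // assumption: vec_t is a 64-bit unsigned integer

constexpr int kVecBits = static_cast<int>(sizeof(vec_t) * 8);

// Row 0 maps to the most significant bit, row 63 to the least significant.
inline void set_bit_msb_first(vec_t &t, int position) {
  t |= vec_t{1} << (kVecBits - 1 - position);
}

inline void clear_bit_msb_first(vec_t &t, int position) {
  t &= ~(vec_t{1} << (kVecBits - 1 - position));
}

// (index of deepest nonempty row) + 1, computed from the lowest set bit.
inline int effective_size_msb_first(vec_t t) {
  if (t == 0) return 0;  // __builtin_ctzll has an undefined result for 0
  return kVecBits - __builtin_ctzll(t);
}
```

With the first bucket at the top bit, the deepest nonempty row is the lowest set bit, so the effective size falls out of a trailing-zero count instead of a leading-zero count.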
Implements the following changes:
- `effective_size()` tracks the number of rows in a sketch column that are at or below a nonzero bucket. That is, it tracks (index of deepest nonempty bucket) + 1.
- `nonempty_buckets` bit array. This makes `effective_size()` take a constant number of instructions via `clz`. The flags are updated on calls to the three merge functions and to `update()`.
- `sample()` and `exhaustive_sample()` have been updated to only check a small constant number of buckets below `effective_size()` for each column. This is based on a number of theoretical observations about the concentration of good buckets near the "best" bucket, on both the right and left. This should speed up calls to them by a factor of 2-4x without sacrificing success probability.
- `merge()` and `range_merge()` now use `effective_size()` to skip merging buckets that are definitely zero.
- `serialize()` and the deserialize constructor also use `effective_size()` to shrink the space needed for storing and sending sketches.

The performance testing needs to be done thoroughly, but this should slightly speed up merging, significantly speed up querying, and slightly slow down updating.
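To illustrate the merge change described above, here is a toy single-column merge that skips rows past the source column's effective size. It is a sketch under assumptions, not the PR's implementation: `ToyBucket`, `toy_effective_size`, and `toy_merge_column` are hypothetical names, merging is assumed to be a field-wise XOR, the mask is assumed to use bit `r` for row `r`, and a bucket is assumed empty when `alpha` and `gamma` are both zero.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy bucket; the field names follow the diff, the rest is illustrative.
struct ToyBucket {
  uint64_t alpha = 0;
  uint64_t gamma = 0;
};

// (index of deepest nonempty row) + 1, or 0 when the column mask is empty.
inline int toy_effective_size(uint64_t mask) {
  if (mask == 0) return 0;  // __builtin_clzll has an undefined result for 0
  return 64 - __builtin_clzll(mask);
}

// XOR-merge one column of src into dst, touching only rows that can be
// nonzero in src, and keep dst's nonempty flags in sync.
inline void toy_merge_column(std::vector<ToyBucket> &dst, uint64_t &dst_mask,
                             const std::vector<ToyBucket> &src, uint64_t src_mask) {
  const int rows = toy_effective_size(src_mask);
  for (int r = 0; r < rows; ++r) {
    dst[r].alpha ^= src[r].alpha;
    dst[r].gamma ^= src[r].gamma;
    const uint64_t bit = uint64_t{1} << r;
    if ((dst[r].alpha | dst[r].gamma) != 0) dst_mask |= bit;
    else dst_mask &= ~bit;
  }
}
```

Rows at or beyond the source's effective size are zero in the source, and XOR with zero leaves the destination unchanged, so skipping them does not alter the result.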