-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added support for sunion and sinter #147
base: master
Are you sure you want to change the base?
Conversation
@@ -2429,7 +2431,13 @@ public Long msetnx(String... keysvalues) { | |||
|
|||
@Override | |||
public Set<String> sinter(String... keys) { | |||
throw new UnsupportedOperationException("not yet implemented"); | |||
Set<String> allResults = d_smembers(keys[0]).getResult(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assign the global set with the first keys smembers.
Set<String> allResults = d_smembers(keys[0]).getResult(); | ||
//loop through keys and intersect to a set, use d_smembers | ||
for(int i = 1; i < keys.length; i++) { | ||
allResults = Sets.intersection(allResults, d_smembers(keys[i]).getResult()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Guava set intersection.
@@ -2448,7 +2456,14 @@ public Long sort(String key, String dstkey) { | |||
|
|||
@Override | |||
public Set<String> sunion(String... keys) { | |||
throw new UnsupportedOperationException("not yet implemented"); | |||
Set<String> allResults = new HashSet<String>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just get all the smembers and push it to a single set.
I was under the impression that for both SINTER and SUNION we need to fetch the keys of the sets from all the nodes. So we probably need an implementation similar to |
@ipapapa Inside sunion and sinter I am using d_smembers which is done using executeWithFailover. So in this case it will also talk to the replication as well. For the operation we just need the keys and we can do union or intersection at run time. I have unit tested this in house and it works fine too. Since smembers(a single set) cannot be across nodes as its a single key then it will work fine. This operation is not similar to keys or scan. Please let me know your thoughts. |
@balajivenki all operations are SMEMBERS returns all the keys of a single SET. Whereas SUNION and SINTER are operations across sets, and therefore across keys that represent those sets. The keys may potentially be in different Dynomite nodes. It will work only if all the sets are in the same Dynomite node and you are not using token awareness but rather leverage the coordinator node. I think the implementation should probably be similar to |
@ipapapa During my test I was able to do union and intersection on 3 different sets that is stored on 3 different nodes on the same rack. You are right I am leveraging the coordinator code done for smembers and doing the intersection and union at run time. Even if we implement the way how SCAN is done, we will end up doing it at run time only as Redis node is not aware of the other sets. I will take a look at how dyno_scan is implemented and I am not sure how that will help here as these two are two different options. SCAN and KEYS are scatter and gather operations but this is purely gather and operate on operations. |
One another technical difficulty with executeWithRing is that it looks for operation success plus the token aware is not on String[]. So I have my code changed with executeWithRing and I do union based on List<OperationResult<Set>>. In this case instead of running SUNION on all three nodes(my rack is 3 nodes and 3 keys across all three nodes) it is running SUNION key1 key2 key3 three times on node1 instead of executing it once on all three nodes(I captured it by running MONITOR command on those nodes). I think it executes them based on the first key. When I execute SUNION key1 key2 key3 then it runs it 3 times on node1 where key1 sits SUNION key2 key1 key3 then it runs it 3 times on node2 where key2 sits Now even if we solve it then for SINTER we will have completely wrong results because intersection will result as empty in this case. We cannot simply run SINTER key1 key2 key3 on three nodes and combine the data because it will be wrong. In a nutshell I think the best way to do is to use SMEMBERS and do union and intersection from client side and that is what my PR is. Hope this make sense. |
@ipapapa Any update on this? Does my explanation make sense? Please clarify. |
I have added @timiblossom and @jcacciatore to get their feedback on this as well. You may be correct that I would suggest adding some unit tests for these operations so that at least we have some basis on the correctness. |
@balajivenki For that technical issue, since currently Dynomite does not have any extra code to handle SINTER and SUNION to help out, the client or Dyno in this case has to do extra works to overcome this. Here are some info that might help you:
You can't simply run "SINTER key1 key2 key3" on all nodes because all requests will get sent to one Dynomite server which own key1. Hence the result will be incorrect as you already observed. You don't want to convert and send to all nodes like this: Another option is that Dynomite team can support this command at the server side to do all of the scatter-gather operations and the client only needs to send one such request to any Dynomite coordinator. Let me know if I am not clear on any points. Thanks. |
@timiblossom is correct, the right way to do this is to group the keys by token prior to sending requests. The mechanics of this should be straightforward. The broader question for this PR lies in error handling. There are several operations that fall into this category of key-grouped requests and they should all behave in the same manner in terms of error handling, the semantics of which would only apply to the dyno client. My cursory thinking on that is to have configurable policies with sensible defaults that app clients can use to tweak the behavior. More concretely, say key2 points to a large set and that while scanning the set one of the requests fails. Should it be retried, fallback, fail the entire SINTER request ? Of course, having this functionality properly supported by dynomite itself is the more desirable solution, since any client could then take advantage of this functionality. |
@timiblossom I completely agree with you and my code changes are exactly what you have expected and explained above. SINTER or SUNION cannot be executed on all nodes and so I am using SMEMBERS and do the union or intersection after collecting the data. @jcacciatore For the error handling I can add some NPE checks but the failover thing isnt executeWithFailOver does it for us? |
@ipapapa and @timiblossom any update on this PR? The actual changes meet with the review comments. |
From my side, we need to add proper unit testing for it because we need to keep the quality high and validate the implementation (We recently got burned from a PR that did not have a proper test coverage). @balajivenki there is a unit test framework and a number of unit tests for you to take a look. @balajivenki You also need to address the (1), (2) and (3) points @timiblossom mentioned above, unless @timiblossom is OK moving forward and pushing off the implementation to the server. |
Thanks @ipapapa I will add the unit test cases to my changes soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the above comments if you want to push this forward
We needed sinter and sunion and so added implementation for the same. For sinter assigning the first set from s_members for initializing and using Guava Sets.intersection for the rest of the sets. For sunion, I am getting all smembers and just simply pushing into a global set. Please review it and approve the same.