Currently join and joinEach behave slightly differently.
join uses the HashJoin algorithm under the hood, whereas joinEach is based on a nested loop algorithm.
The problem is that the Nested Loop implementation forces the use of join_prefix: if we try to join two dataframes on an id column that is called id on both sides, we get a DuplicatedEntriesException coming from the Rows::merge() method.
What we should do is remove the join columns from the right dataset before merging, to avoid duplicates.
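
To make the proposal concrete, here is a minimal, library-agnostic sketch in Python. It is not the library's actual implementation or API: `merge`, `nested_loop_join`, `DuplicatedEntriesError` and the `drop_right_join_columns` flag are hypothetical names that only mimic the behaviour described above, with rows modelled as plain dictionaries and an inner join assumed.

```python
class DuplicatedEntriesError(Exception):
    """Hypothetical stand-in for the DuplicatedEntriesException described above."""


def merge(left_row: dict, right_row: dict) -> dict:
    # Refuse to merge rows that share a column name,
    # mimicking the failure reported for the merge step.
    duplicates = left_row.keys() & right_row.keys()
    if duplicates:
        raise DuplicatedEntriesError(f"duplicated entries: {sorted(duplicates)}")
    return {**left_row, **right_row}


def nested_loop_join(left_rows, right_rows, on, drop_right_join_columns=True):
    # Inner join: compare every left row with every right row.
    joined = []
    for left_row in left_rows:
        for right_row in right_rows:
            if all(left_row[col] == right_row[col] for col in on):
                if drop_right_join_columns:
                    # Proposed fix: the join columns already exist on the
                    # left side, so drop them from the right row first.
                    right_row = {k: v for k, v in right_row.items() if k not in on}
                joined.append(merge(left_row, right_row))
    return joined


left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
right = [{"id": 1, "total": 10}, {"id": 3, "total": 30}]

result = nested_loop_join(left, right, on=["id"])
print(result)
```

With `drop_right_join_columns=False` the same call raises `DuplicatedEntriesError`, which is the situation that currently has to be worked around with a prefix. Dropping the right-side join columns loses no information in an equality join, because the matched values are identical on both sides.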