HBASE-28484: Add ability to replicate to a different tableName #6578

Open
wants to merge 10 commits into
base: master

eab148
Contributor

@eab148 eab148 commented Jan 6, 2025

Design document

Jira

Currently, replication works only if the source and sink clusters both contain tables with the same (tableName, family) pairs. This requirement exists so that the sink cluster knows where to persist the data it receives from the source cluster. This PR loosens the naming constraint and lets clients configure the namespace and table name used on the sink cluster.

@eab148 eab148 force-pushed the HBASE-28484-eboland-impl branch from 0ad9568 to 670844c Compare January 8, 2025 19:34
@eab148 eab148 force-pushed the HBASE-28484-eboland-impl branch from 670844c to 0448f31 Compare January 8, 2025 19:36
@eab148
Contributor Author

eab148 commented Jan 8, 2025

@Apache9 Happy New Year! Do you have any thoughts on the new design and implementation? Thank you for all of your help so far!

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 🆗 | reexec | 0m 28s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | _ master Compile Tests _ | | | |
| +0 🆗 | mvndep | 0m 10s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 2m 54s | | master passed |
| +1 💚 | compile | 3m 36s | | master passed |
| +1 💚 | checkstyle | 0m 47s | | master passed |
| +1 💚 | spotbugs | 2m 0s | | master passed |
| +1 💚 | spotless | 0m 43s | | branch has no errors when running spotless:check. |
| | _ Patch Compile Tests _ | | | |
| +0 🆗 | mvndep | 0m 11s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 2m 49s | | the patch passed |
| +1 💚 | compile | 3m 41s | | the patch passed |
| +1 💚 | javac | 3m 41s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 49s | | the patch passed |
| +1 💚 | spotbugs | 2m 15s | | the patch passed |
| +1 💚 | hadoopcheck | 10m 43s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 0m 41s | | patch has no errors when running spotless:check. |
| | _ Other Tests _ | | | |
| +1 💚 | asflicense | 0m 17s | | The patch does not generate ASF License warnings. |
| | | 39m 5s | | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6578/6/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #6578 |
| JIRA Issue | HBASE-28484 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 31118b2d8ce1 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 0c2d864 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 83 (vs. ulimit of 30000) |
| modules | C: hbase-common hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6578/6/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache9 Apache9 self-requested a review January 13, 2025 14:17
@Apache9
Contributor

Apache9 commented Jan 13, 2025

Will take a look soon.

Thanks for preparing the design doc and also the PR.

@krconv

krconv commented Jan 24, 2025

@Apache9 Would you mind taking a look at this when you get a chance?

@Apache9 Apache9 left a comment

The solution looks OK, but there are still some concerns about how to make use of the ReplicationSinkTranslator...

@@ -139,6 +140,8 @@ public ReplicationSink(Configuration conf, RegionServerCoprocessorHost rsServerH
Class<? extends SourceFSConfigurationProvider> c =
Class.forName(className).asSubclass(SourceFSConfigurationProvider.class);
this.provider = c.getDeclaredConstructor().newInstance();
} catch (RuntimeException e) {

Is this for debugging?

@@ -153,6 +156,8 @@ private WALEntrySinkFilter setupWALEntrySinkFilter() throws IOException {
filter = walEntryFilterClass == null
? null
: (WALEntrySinkFilter) walEntryFilterClass.getDeclaredConstructor().newInstance();
} catch (RuntimeException e) {

Here too.

@@ -968,6 +968,9 @@ public enum OperationStatusCode {
public static final String REPLICATION_SINK_SERVICE_CLASSNAME_DEFAULT =
"org.apache.hadoop.hbase.replication.ReplicationSinkServiceImpl";
public static final String REPLICATION_BULKLOAD_ENABLE_KEY = "hbase.replication.bulkload.enabled";
public static final String REPLICATION_SINK_TRANSLATOR = "hbase.replication.sink.translator";

We'd better not put things in HConstants. Could we put them in the package where they are used?

import org.apache.hadoop.hbase.TableName;
import org.apache.yetus.audience.InterfaceAudience;

@InterfaceAudience.Public

Should this be IA.LimitedPrivate("CONFIG")? Do we expect users to use it directly in their code?

Class<?> translatorClass = this.conf.getClass(HConstants.REPLICATION_SINK_TRANSLATOR,
IdentityReplicationSinkTranslator.class, ReplicationSinkTranslator.class);
try {
return (ReplicationSinkTranslator) translatorClass.getDeclaredConstructor().newInstance();

IIRC we have a ReflectionUtils or something similar for calling a class's constructor. Also, don't we need to pass the Configuration object here? The translator may need to load some configuration.
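
To make the question concrete, here is a minimal sketch of reflective construction that prefers a constructor taking the configuration and falls back to a no-arg one. It uses only `java.lang.reflect`, not HBase's own reflection helper. Everything in it is a stand-in invented for illustration: `java.util.Properties` plays the role of `Configuration`, and `SinkTranslator`, `PrefixTranslator`, and the key `sketch.sink.prefix` are hypothetical names that do not come from the patch.

```java
import java.util.Properties;

final class TranslatorFactorySketch {

  /** Hypothetical, simplified translator: maps a source table name to a sink table name. */
  interface SinkTranslator {
    String translate(String sourceTable);
  }

  /** Needs no configuration, so it only offers a no-arg constructor. */
  static final class IdentityTranslator implements SinkTranslator {
    public IdentityTranslator() {
    }

    @Override
    public String translate(String sourceTable) {
      return sourceTable;
    }
  }

  /** Reads its prefix from configuration, so it offers a Properties constructor. */
  static final class PrefixTranslator implements SinkTranslator {
    private final String prefix;

    public PrefixTranslator(Properties conf) {
      // "sketch.sink.prefix" is an invented key, used only in this example.
      this.prefix = conf.getProperty("sketch.sink.prefix", "");
    }

    @Override
    public String translate(String sourceTable) {
      return prefix + sourceTable;
    }
  }

  /** Prefer a (Properties) constructor; otherwise use the no-arg constructor. */
  static SinkTranslator create(Class<? extends SinkTranslator> type, Properties conf) {
    try {
      try {
        return type.getDeclaredConstructor(Properties.class).newInstance(conf);
      } catch (NoSuchMethodException e) {
        return type.getDeclaredConstructor().newInstance();
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
    }
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty("sketch.sink.prefix", "replica_");
    System.out.println(create(PrefixTranslator.class, conf).translate("orders"));
    System.out.println(create(IdentityTranslator.class, conf).translate("orders"));
  }
}
```

With this shape, a translator that needs settings can read them at construction time, while implementations that need none stay unchanged.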

throw e;
} catch (Exception e) {
LOG.warn("Failed to instantiate " + translatorClass);
return new IdentityReplicationSinkTranslator();

Is falling back to the default implementation the right choice here? I'm not sure...
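
The two options can be put side by side. The sketch below is not the patch's code: `SinkTranslator`, `createOrFallback`, and `createOrFail` are hypothetical names, and table names are plain strings. It only contrasts a lenient factory, which logs and returns the identity translator, with a strict one, which surfaces the misconfiguration.

```java
final class FailFastSketch {

  /** Hypothetical, simplified translator interface. */
  interface SinkTranslator {
    String translate(String sourceTable);
  }

  /** Default behaviour: sink table name equals source table name. */
  static final class IdentityTranslator implements SinkTranslator {
    public IdentityTranslator() {
    }

    @Override
    public String translate(String sourceTable) {
      return sourceTable;
    }
  }

  /** Lenient: log the failure and fall back to the identity translator. */
  static SinkTranslator createOrFallback(String className) {
    try {
      return instantiate(className);
    } catch (ReflectiveOperationException | ClassCastException e) {
      System.err.println("Failed to instantiate " + className + ", using identity: " + e);
      return new IdentityTranslator();
    }
  }

  /** Strict: refuse to continue with a translator the operator did not ask for. */
  static SinkTranslator createOrFail(String className) {
    try {
      return instantiate(className);
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalStateException("Cannot instantiate sink translator " + className, e);
    }
  }

  private static SinkTranslator instantiate(String className)
      throws ReflectiveOperationException {
    Class<? extends SinkTranslator> type =
      Class.forName(className).asSubclass(SinkTranslator.class);
    return type.getDeclaredConstructor().newInstance();
  }

  public static void main(String[] args) {
    // A class name that does not exist stands in for a typo in the configuration.
    System.out.println(createOrFallback("no.such.Translator").translate("orders"));
    try {
      createOrFail("no.such.Translator");
    } catch (IllegalStateException e) {
      System.out.println("strict factory rejected the configuration: " + e.getMessage());
    }
  }
}
```

With the lenient variant, a typo in the configured class name would send edits to same-named tables on the sink, which may not be what the operator intended. The strict variant trades availability for making that mistake visible.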

import org.apache.yetus.audience.InterfaceAudience;

@InterfaceAudience.Public
public interface ReplicationSinkTranslator {

It would be better to add some Javadoc explaining the meaning of each method and how this class is meant to be used.
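
As an illustration of the kind of Javadoc being asked for, here is a small self-contained sketch. It is not the interface from the patch: `Table` is a simplified stand-in for HBase's `TableName`, only a table-name method is shown, and the documented semantics (never returning null, for instance) are assumptions. `NamespaceTranslator` is a hypothetical implementation added to show usage.

```java
import java.util.Objects;

final class TranslatorJavadocSketch {

  /** Simplified stand-in for a table name: a namespace plus a qualifier. */
  static final class Table {
    final String namespace;
    final String qualifier;

    Table(String namespace, String qualifier) {
      this.namespace = Objects.requireNonNull(namespace);
      this.qualifier = Objects.requireNonNull(qualifier);
    }

    @Override
    public String toString() {
      return namespace + ":" + qualifier;
    }
  }

  /**
   * Maps the table that a replicated edit was written to on the source cluster to the table
   * that should receive it on the sink cluster.
   */
  interface SinkTranslator {
    /**
     * Resolves the sink table for a source table.
     * @param sourceTable table name recorded in the replicated entry
     * @return the table on the sink cluster that the edits are applied to (assumed non-null)
     */
    Table getSinkTableName(Table sourceTable);
  }

  /** Default behaviour: the sink table has the same name as the source table. */
  static final class IdentityTranslator implements SinkTranslator {
    @Override
    public Table getSinkTableName(Table sourceTable) {
      return sourceTable;
    }
  }

  /** Hypothetical example: place every replicated table in one fixed sink namespace. */
  static final class NamespaceTranslator implements SinkTranslator {
    private final String sinkNamespace;

    NamespaceTranslator(String sinkNamespace) {
      this.sinkNamespace = Objects.requireNonNull(sinkNamespace);
    }

    @Override
    public Table getSinkTableName(Table sourceTable) {
      return new Table(sinkNamespace, sourceTable.qualifier);
    }
  }

  public static void main(String[] args) {
    Table source = new Table("default", "orders");
    System.out.println(new IdentityTranslator().getSinkTableName(source));
    System.out.println(new NamespaceTranslator("replica").getSinkTableName(source));
  }
}
```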

mutation.setClusterIds(clusterIds);
mutation.setAttribute(ReplicationUtils.REPLICATION_ATTR_NAME,
TableName sinkTableName = translator.getSinkTableName(tableName);
ExtendedCell sinkCell = translator.getSinkExtendedCell(tableName, cell);

This is why I want to see the Javadoc for this method: why do we need to pass the original table name in? Also, I think we will only do tableName mapping, so do we need to call getSinkTableName above every time, given that all the cells in a WALEntry are for the same table?
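
The second point can be sketched as follows, assuming (as the comment does) that every cell in an entry belongs to the entry's single table. The types are invented for illustration: `Entry` stands in for a WAL entry, cells and table names are plain strings, and `CountingTranslator` exists only to show how many lookups happen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

final class PerEntryTranslationSketch {

  /** Stand-in for a WAL entry: one source table and the cells written to it. */
  static final class Entry {
    final String table;
    final List<String> cells;

    Entry(String table, List<String> cells) {
      this.table = table;
      this.cells = cells;
    }
  }

  /** Hypothetical translator that counts how often it is consulted. */
  static final class CountingTranslator implements UnaryOperator<String> {
    int calls;

    @Override
    public String apply(String sourceTable) {
      calls++;
      return "replica:" + sourceTable;
    }
  }

  /** Resolves the sink table once per entry, then reuses it for every cell of that entry. */
  static List<String> translate(List<Entry> entries, UnaryOperator<String> translator) {
    List<String> routed = new ArrayList<>();
    for (Entry entry : entries) {
      String sinkTable = translator.apply(entry.table); // one lookup per entry
      for (String cell : entry.cells) {
        routed.add(sinkTable + "/" + cell);
      }
    }
    return routed;
  }

  public static void main(String[] args) {
    CountingTranslator translator = new CountingTranslator();
    List<Entry> entries = Arrays.asList(
      new Entry("orders", Arrays.asList("c1", "c2", "c3")),
      new Entry("users", Arrays.asList("c4", "c5")));
    System.out.println(translate(entries, translator));
    System.out.println("translator calls: " + translator.calls);
  }
}
```

Hoisting the lookup out of the per-cell loop keeps the number of translator calls proportional to the number of entries rather than the number of cells.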
