Add support for VARIANT type in the Databricks destination #51002
bdevlieger started this conversation in Connector Ideas and Features
Replies: 1 comment
-
I would be very interested in this feature! VARIANT helps a lot with sudden schema changes while still performing well compared to JSON strings.
-
Databricks has a relatively new VARIANT data type, which is made specifically for semi-structured data. I think this would be a great addition, since it's a lot faster and easier to work with than JSON strings.
I'm not a Kotlin developer, but I tried to look into the code where a change would be needed. I believe the cases for structs and arrays are handled in `DatabricksSqlGenerator.kt`, on lines 49-63. Note that the Delta table must have at least Reader Version 3 and Writer Version 7, and must have the `variant-preview` table property set, to support the VARIANT data type. For new tables this is configured automatically. More details about the requirements can be found here. These requirements can affect compatibility with other systems and are not backward compatible with existing tables, so I believe it's best to make this an (immutable) setting on the Databricks destination connector.

We have found that the VARIANT type makes tables a lot more robust against changes in nested structures. I'm open to help out wherever I can.
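To make the type-mapping idea above a bit more concrete, here is a rough Kotlin sketch of what an opt-in setting could look like. Everything in it is hypothetical: `FieldType`, `DestinationConfig`, `useVariant`, `toDatabricksType` and `toVariantExpression` are names made up for illustration and are not taken from `DatabricksSqlGenerator.kt`. It also assumes that semi-structured values are stored as JSON strings when the setting is off. The only Databricks-specific piece is the `PARSE_JSON` SQL function, which converts a JSON string into a VARIANT value.

```kotlin
// Hypothetical, simplified stand-in for the connector's type model.
// The real classes in the Airbyte codebase will look different.
enum class FieldType { STRING, INTEGER, STRUCT, ARRAY, UNKNOWN }

// Hypothetical connector setting. Declared as a `val` so that it cannot
// change after the destination has been configured.
data class DestinationConfig(val useVariant: Boolean)

// Maps a field type to a Databricks column type.
// Assumption: without the setting, semi-structured data is kept as a JSON string.
fun toDatabricksType(type: FieldType, config: DestinationConfig): String =
    when (type) {
        FieldType.STRING -> "STRING"
        FieldType.INTEGER -> "BIGINT"
        FieldType.STRUCT, FieldType.ARRAY, FieldType.UNKNOWN ->
            if (config.useVariant) "VARIANT" else "STRING"
    }

// Builds a SQL expression that turns a JSON string column into a VARIANT value.
// Note: the column name is quoted but not escaped; this is only a sketch.
fun toVariantExpression(column: String): String = "PARSE_JSON(`$column`)"
```

With `DestinationConfig(useVariant = true)`, structs and arrays would map to `VARIANT` and the raw JSON column would be wrapped in `PARSE_JSON(...)`; with `useVariant = false`, the mapping falls back to `STRING`, so existing tables would not need the newer reader and writer versions.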