⚗️ Test supporting Python Models (via Create-A-Derived-Table) #3293
Labels
data-platform-apps-and-tools
This issue is owned by Data Platform Apps and Tools
enhancement
enhancing an existing feature
stale
User Story
As a… user of Create-A-Derived-Table
I want to… be able to work with tables in Python code, rather than pure SQL
So that… I can run machine learning tasks in a reproducible deployment pipeline.
Value / Purpose
Currently, we do not support using Spark via Athena. However, customers have legitimate reasons why they cannot deploy their modeled data via SQL alone: they may be applying machine learning techniques, or other processes that are easier to implement in Python. As such, we should use data-engineering-sandbox to explore what we would need to do technically to allow a user to deploy a Python Model.
Useful Contacts
@jhpyke
User Types
No response
Hypothesis
If we... [do a thing]
Then... [this will happen]
Proposal
Spark-enabled Athena Workgroup in Sandbox
workbook via the console.
workbook to pull some TPC-DS data in and do some Pythonic transformations (modelling of data, conditional transformations that would be hard to do in SQL, managing datestamps that don't conform to standard SQL, etc.)
Additional Information
If you have time within the spike, you should then try and apply this information to Create-A-Derived-Table specifically.
Create-A-Derived-Table to create a Python Model (if 📈 Update Create-A-Derived-Table to newest DBT-Core/DBT-Athena Versions #3290 is not complete before this ticket is done, you will need to manually bump your dbt-core/dbt-athena-community versions to support this)
sandbox
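As a starting point for the spike, here is a minimal sketch of what a dbt Python model handling non-standard datestamps might look like. It is an illustration under stated assumptions, not a tested Create-A-Derived-Table deployment: the helper normalise_datestamp, the format list CANDIDATE_FORMATS, the upstream model name raw_events and the column names event_date and event_date_iso are all hypothetical, and the model function assumes the (dbt, spark_session) entry point that dbt-athena documents for Python models.

```python
from datetime import datetime
from typing import Optional

# Formats to try, in order. This list is an assumption for illustration;
# the real set depends on what the source data actually contains.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def normalise_datestamp(raw: Optional[str]) -> Optional[str]:
    """Return an ISO 8601 date string, or None if no format matches."""
    if raw is None:
        return None
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None


def model(dbt, spark_session):
    # pyspark is imported lazily so the helper above can be unit tested
    # on a machine without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    dbt.config(materialized="table")

    # "raw_events" and the column names are hypothetical placeholders.
    df = dbt.ref("raw_events")
    to_iso = F.udf(normalise_datestamp, StringType())
    return df.withColumn("event_date_iso", to_iso(F.col("event_date")))
```

Keeping the parsing logic in a plain function means it can be checked locally, for example normalise_datestamp("02/01/2003") returns "2003-01-02", before it is run in a workbook or deployed as a model.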
Definition of Done