[MLOPS-596] Add Lakehouse Monitor Resource #3238

aravind-segu · 2024-02-09T10:30:20Z

Changes

Adds Lakehouse Monitor Resource to the Terraform Provider

Tests

make test run locally
relevant change in docs/ folder
covered with integration tests in internal/acceptance
relevant acceptance tests are passing
using Go SDK

codecov-commenter · 2024-02-09T17:42:13Z

Codecov Report

Attention: Patch coverage is 78.57143%, with 15 lines in your changes missing coverage. Please review.

Project coverage is 83.47%. Comparing base (d4812c5) to head (ad9481d).
Report is 4 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #3238   +/-   ##
=======================================
  Coverage   83.46%   83.47%           
=======================================
  Files         174      176    +2     
  Lines       16067    16165   +98     
=======================================
+ Hits        13410    13493   +83     
- Misses       1845     1854    +9     
- Partials      812      818    +6

Files	Coverage Δ
provider/provider.go	`94.65% <100.00%> (+0.02%)`	⬆️
catalog/resource_lakehouse_monitor.go	`78.26% <78.26%> (ø)`

... and 2 files with indirect coverage changes

monitoring/resourse_lakehouse_monitor.go

alexott · 2024-02-14T09:03:24Z

@aravind-segu integration tests are passing (at least on AWS)

mgyucht

Thank you for contributing this resource! We do need to make some adjustments to improve the long-term maintainability story, and they may necessitate deeper changes. If it isn't possible to do this right from the start, we need to prioritize the work needed for this to be properly supported.

monitoring/resource_lakehouse_monitor_test.go

monitoring/resourse_lakehouse_monitor.go

internal/acceptance/lakehouse_monitor_test.go

monitoring/resourse_lakehouse_monitor.go

alexott · 2024-02-29T14:09:05Z

Integration test is still failing:

    init_test.go:236: Step 1/2 error: After applying this test step, the plan was not empty.
        stdout:
        
        
        Terraform used the selected providers to generate the following execution
        plan. Resource actions are indicated with the following symbols:
          ~ update in-place
        
        Terraform will perform the following actions:
        
          # databricks_lakehouse_monitor.testMonitorInference will be updated in-place
          ~ resource "databricks_lakehouse_monitor" "testMonitorInference" {
              - drift_metrics_table_name   = "sandbox$tdkkfakkkfadj.things$tdkkfakkkfadj.bar$tdkkfakkkfadj_inference_drift_metrics" -> null
                id                         = "sandbox$tdkkfakkkfadj.things$tdkkfakkkfadj.bar$tdkkfakkkfadj_inference"
              - monitor_version            = "0" -> null
              - profile_metrics_table_name = "sandbox$tdkkfakkkfadj.things$tdkkfakkkfadj.bar$tdkkfakkkfadj_inference_profile_metrics" -> null
              - status                     = "MONITOR_STATUS_PENDING" -> null
                # (3 unchanged attributes hidden)
        
                # (1 unchanged block hidden)
            }
        
        Plan: 0 to add, 1 to change, 0 to destroy.
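
A plan like the one above, where server-populated attributes are planned to go from a value to null, is what Terraform typically produces when those attributes are written to state but not declared as computed in the resource schema. The sketch below is a deliberately simplified model of that behavior in plain Go, not the Terraform plugin SDK: the `attr` type and `plannedRemovals` function are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// attr is a simplified, hypothetical model of a schema attribute.
type attr struct {
	Computed bool
}

// plannedRemovals returns the state attributes that a plan would reset to
// null in this model: attributes that are absent from the configuration and
// are not marked Computed in the schema.
func plannedRemovals(schema map[string]attr, config, state map[string]string) []string {
	var out []string
	for name := range state {
		if _, set := config[name]; set {
			continue // the user configured it, so it is not removed
		}
		if schema[name].Computed {
			continue // the provider is allowed to fill it in
		}
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	config := map[string]string{"table_name": "sandbox.things.bar"}
	state := map[string]string{
		"table_name":      "sandbox.things.bar",
		"status":          "MONITOR_STATUS_PENDING",
		"monitor_version": "0",
	}

	// Without Computed, the server-populated fields show up as a diff.
	schema := map[string]attr{"table_name": {}, "status": {}, "monitor_version": {}}
	fmt.Println(plannedRemovals(schema, config, state))

	// Marking them Computed leaves the plan empty.
	schema["status"] = attr{Computed: true}
	schema["monitor_version"] = attr{Computed: true}
	fmt.Println(plannedRemovals(schema, config, state))
}
```

In the real provider the equivalent change would be made in the resource schema definition; the model here only shows why the diff disappears once the fields are treated as computed.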

alexott · 2024-02-29T14:29:24Z

After fixing the computed fields, there is another error in the update test:

    init_test.go:236: Step 2/2 error: Error running apply: exit status 1
        
        Error: cannot update lakehouse monitor: Data Monitor 'sandbox$tijgeledgkhhb.things$tejcfjihaeblg.bar$tejcfjihaeblg_inference' does not exist.
        
          with databricks_lakehouse_monitor.testMonitorInference,
          on terraform_plugin_test.tf line 36, in resource "databricks_lakehouse_monitor" "testMonitorInference":
          36:         resource "databricks_lakehouse_monitor" "testMonitorInference" {

aravind-segu · 2024-03-04T22:28:20Z

Integration tests are passing: https://github.com/databricks-eng/eng-dev-ecosystem/actions/runs/8147735527

alexott

Small changes are still required in the code.

Plus, there is no documentation yet.

alexott · 2024-03-05T13:46:34Z

catalog/resource_lakehouse_monitor.go

+			create.FullName = d.Get("table_name").(string)
+
+			endpoint, err := w.LakehouseMonitors.Create(ctx, create)
+			WaitForMonitor(w, ctx, create.FullName)


We need to put

if err != nil { return err }

before this line, and then have this line as:

err = WaitForMonitor(w, ctx, create.FullName)

Otherwise we don't capture wait errors

alexott · 2024-03-05T13:47:06Z

catalog/resource_lakehouse_monitor.go

+			err = common.StructToData(endpoint, monitorSchema, d)
+			if err != nil {
+				return err
+			}
+			return nil


Just rewrite these lines as

return common.StructToData(endpoint, monitorSchema, d)

mgyucht

Two main comments:

Waiters are supported in the SDK by default; you just need to annotate your API appropriately. I'll send you the link to this offline.
How exactly is table_name supposed to work? It seems to be a required field, but it is not in the Create or Update requests.

catalog/resource_lakehouse_monitor.go

alexott

In general, this looks good. We need to decide about readiness waiting: should we wait for the new Go SDK, or implement it later?

catalog/resource_lakehouse_monitor.go

docs/resources/lakehouse_monitor.md

alexott

in general good, pending decision on if we should wait for OpenAPI spec changes for wait command

alexott · 2024-03-06T19:10:00Z

docs/resources/lakehouse_monitor.md

+### Computed Fields
+* `monitor_version` - The version of the monitor config (e.g. 1,2,3). If negative, the monitor may be corrupted
+* `drift_metrics_table_name` - The full name of the drift metrics table. Format: __catalog_name__.__schema_name__.__table_name__.
+* `profile_metrics_table_name` - The full name of the profile metrics table. Format: __catalog_name__.__schema_name__.__table_name__.
+* `status` - Status of the Monitor 
+* `dashboard_id` - The ID of the generated dashboard.


small nit - it's better to move it to the Attribute Reference section

aravind-segu · 2024-03-06T19:11:18Z

> in general good, pending decision on if we should wait for OpenAPI spec changes for wait command

I checked with Miles, and he is OK to push this for now. I will wait for his approval as well.

mgyucht

LGTM, provided you address @alexott's comment to move computed fields under the Attribute Reference section of the doc.

catalog/resource_lakehouse_monitor.go

docs/resources/lakehouse_monitor.md

arpitjasa-db · 2024-03-06T19:36:09Z

docs/resources/lakehouse_monitor.md

+}
+
+resource "databricks_lakehouse_monitor" "testTimeseriesMonitor" {
+    table_name = "${databricks_catalog.sandbox.name}.${databricks_schema.things.name}.${databricks_table.myTestTable.name}"


Hmm, I wonder if this is a bit too much. We ended up separating this into 3 parameters for the UC Model for this reason: https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/registered_model

According to their Python API docs (https://api-docs.databricks.com/python/lakehouse-monitoring/latest/databricks.lakehouse_monitoring.html#databricks.lakehouse_monitoring.create_monitor), it can be {catalog}.{schema}.{table}, {schema}.{table}, or {table}, and the API fills in the current catalog or schema. So we technically don't need to split it up and force users to fill in all three fields. I think the recommendation should be to fill in all three, but if users read other docs, they should be able to use any of the three options.

Got it, so if someone doesn't specify schema and catalog via Terraform though, what would be the "current" catalog and schema it would infer?

No, for databricks_sql_table you always specify the catalog and schema, and the id will be the three-level name.
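
To make the naming discussion concrete, the sketch below validates a fully qualified `catalog.schema.table` name and splits it into its parts. It is an illustration only: `tableRef` and `parseFullName` are hypothetical names, the resource in this PR does not necessarily validate names this way, and the simple split on `.` assumes identifiers that do not themselves contain dots.

```go
package main

import (
	"fmt"
	"strings"
)

// tableRef holds the three levels of a fully qualified table name.
type tableRef struct {
	Catalog, Schema, Table string
}

// parseFullName accepts only the three-level form catalog.schema.table.
// Assumption: identifiers contain no dots (no backtick-quoted names).
func parseFullName(name string) (tableRef, error) {
	parts := strings.Split(name, ".")
	if len(parts) != 3 {
		return tableRef{}, fmt.Errorf("expected catalog.schema.table, got %q", name)
	}
	for _, p := range parts {
		if p == "" {
			return tableRef{}, fmt.Errorf("empty name component in %q", name)
		}
	}
	return tableRef{Catalog: parts[0], Schema: parts[1], Table: parts[2]}, nil
}

func main() {
	ref, err := parseFullName("sandbox.things.bar")
	fmt.Println(ref, err)

	// A two-level name is rejected rather than resolved against a
	// "current" catalog, which Terraform has no way to know.
	_, err = parseFullName("things.bar")
	fmt.Println(err)
}
```

Requiring all three levels sidesteps the question raised above about which "current" catalog and schema would be inferred.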

arpitjasa-db · 2024-03-06T19:37:39Z

docs/resources/lakehouse_monitor.md

+    table_name = "${databricks_catalog.sandbox.name}.${databricks_schema.things.name}.${databricks_table.myTestTable.name}"
+    assets_dir = "/Shared/provider-test/databricks_lakehouse_monitoring/${databricks_table.myTestTable.name}"
+    output_schema_name = "${databricks_catalog.sandbox.name}.${databricks_schema.things.name}"
+    snapshot  {} 


This syntax looks a bit odd. Would it be better to expose this as a boolean instead and then convert it in the resource implementation?

We debated this. We changed the Go SDK to use an empty struct in place of any. This is also the example expected in the Go SDK, so we thought it would be clear. Miles also did not want too many differences between the Go SDK structs and the Terraform input structs.

Also, if the team introduces any snapshot-relevant parameters in the future, we don't need additional changes in the Terraform provider, as we are already using the struct.
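
The trade-off can be illustrated with Go's standard `encoding/json` package. In this sketch a pointer to an empty struct acts as the presence flag, and fields can be added to the struct later without changing the shape of the enclosing request. The `snapshotConfig` and `createMonitor` types are hypothetical and only loosely modeled on the fields shown in the docs diff; they are not the Go SDK's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// snapshotConfig is intentionally empty today; parameters can be added
// later without changing the enclosing request type.
type snapshotConfig struct{}

// createMonitor is a hypothetical request type for illustration.
type createMonitor struct {
	AssetsDir string          `json:"assets_dir"`
	Snapshot  *snapshotConfig `json:"snapshot,omitempty"`
}

// encode marshals the request to a JSON string.
func encode(c createMonitor) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// A non-nil pointer is emitted as an empty object.
	withSnapshot, _ := encode(createMonitor{AssetsDir: "/Shared/x", Snapshot: &snapshotConfig{}})
	fmt.Println(withSnapshot)

	// A nil pointer is dropped entirely because of omitempty.
	without, _ := encode(createMonitor{AssetsDir: "/Shared/x"})
	fmt.Println(without)
}
```

A boolean would express the same on/off choice today, but it would need to be replaced by a nested block as soon as snapshot monitors gained their own parameters.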

arpitjasa-db · 2024-03-06T19:38:25Z

docs/resources/lakehouse_monitor.md

+
+* `table_name` - (Required) - The full name of the table to attach the monitor too. Its of the format {catalog}.{schema}.{tableName}
+* `assets_dir` - (Required) - The directory to store the monitoring assets (Eg. Dashboard and Metric Tables)
+* `output_schema_name` - (Required) - Schema where output metric tables are created


Let's clarify that it needs to be catalog.schema

alexott · 2024-03-06T19:50:23Z

docs/resources/lakehouse_monitor.md

    }
 }

 resource "databricks_lakehouse_monitor" "testTimeseriesMonitor" {
-    table_name = "${databricks_catalog.sandbox.name}.${databricks_schema.things.name}.${databricks_table.myTestTable.name}"
-    assets_dir = "/Shared/provider-test/databricks_lakehouse_monitoring/${databricks_table.myTestTable.name}"
+    table_name = "${databricks_catalog.sandbox.name}.${databricks_schema.things.name}.${databricks_sql_table.myTestTable.name}"


This could be simplified to databricks_sql_table.myTestTable.id now: https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/sql_table#id

alexott · 2024-03-06T19:51:23Z

docs/resources/lakehouse_monitor.md

+### Snapshot Monitor
+```hcl
+resource "databricks_lakehouse_monitor" "testMonitorInference" {
+    table_name = "${databricks_catalog.sandbox.name}.${databricks_schema.things.name}.${databricks_table.myTestTable.name}"


samuhepp · 2024-03-12T10:35:35Z

Hey! Will this be included in the next release? If so, when is the approximate target date for that? Keen to use this instead of the notebook approach.

aravind-segu requested review from a team as code owners February 9, 2024 10:30

aravind-segu requested review from tanmay-db and removed request for a team February 9, 2024 10:30

aravind-segu added 4 commits February 9, 2024 02:32

Add Resource and unit tests

0c60fa7

Completed Unit Tests

bcb0d78

Added integration test

8b65e8f

remove blank test file

6891b57

aravind-segu force-pushed the LakehouseMonitor branch from b1f910f to 6891b57 Compare February 9, 2024 10:32

aravind-segu requested review from arpitjasa-db and alexott February 9, 2024 10:34

aravind-segu added 2 commits February 9, 2024 10:34

fix tests

2fd5ab1

Update Provider

0f487cd

alexott added 2 commits February 12, 2024 15:11

Try to fix integration tests

72abbed

Use force_destroy = true for catalog

2a614a1

alexott requested changes Feb 12, 2024

Add computed fields and update schema

555ea42

aravind-segu requested a review from alexott February 14, 2024 00:04

mgyucht requested changes Feb 14, 2024

alexott reviewed Feb 15, 2024

aravind-segu added 8 commits February 21, 2024 01:06

Address PR Comments

f9f5db0

Update Get

b13d47b

Lint

087dfb2

Merge branch 'main' into LakehouseMonitor

f845b81

fix test

29a2259

Merge branch 'main' into LakehouseMonitor

5f7120f

Merge branch 'main' into LakehouseMonitor

cfb96af

Update to new GO SDK and add snapshot integration tests

3f01227

aravind-segu requested a review from alexott February 27, 2024 10:45

Add Wait for Creation of monitor

589ad01

aravind-segu force-pushed the LakehouseMonitor branch from 996f913 to 589ad01 Compare March 4, 2024 19:55

aravind-segu added 2 commits March 4, 2024 11:55

Merge branch 'main' into LakehouseMonitor

7c9953c

Update Integration Tests

669a8a5

aravind-segu requested a review from mgyucht March 4, 2024 22:28

alexott requested changes Mar 5, 2024

Add Documentation

7572abd

aravind-segu requested a review from alexott March 5, 2024 20:48

mgyucht reviewed Mar 6, 2024

alexott reviewed Mar 6, 2024

Add computed fields to documentation

8033692

alexott approved these changes Mar 6, 2024

mgyucht approved these changes Mar 6, 2024

update docs

ad9481d

arpitjasa-db reviewed Mar 6, 2024

Accept Suggestiongs

f9ce3b8

alexott reviewed Mar 6, 2024

alexott approved these changes Mar 6, 2024

alexott added this pull request to the merge queue Mar 7, 2024

Merged via the queue into databricks:main with commit 57ee88b Mar 7, 2024
5 checks passed

hectorcast-db mentioned this pull request Mar 28, 2024

Release v1.39.0 #3411

Merged

aravind-segu mentioned this pull request Apr 15, 2024

Add support for Lakehouse monitoring in bundles databricks/cli#1307

Merged
