Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom annotations for bundle-specific JSON schema fields #1957

Merged
merged 38 commits into from
Dec 18, 2024

Conversation

ilyakuz-db
Copy link
Contributor

@ilyakuz-db ilyakuz-db commented Dec 4, 2024

Changes

Adds annotations to json-schema for fields which are not covered by OpenAPI spec.

Custom descriptions were copy-pasted from documentation PR which is still WIP so descriptions for some fields are missing

Further improvements:

  • documentation autogen based on json-schema
  • fix missing descriptions

Tests

This script is not part of CLI package so I didn't test all corner cases. Few high-level tests were added to be sure that schema annotations is in sync with actual config


// Title of the object, rendered as inline documentation in the IDE.
// https://json-schema.org/understanding-json-schema/reference/annotations
Title string `json:"title,omitempty"`
Copy link
Contributor Author

@ilyakuz-db ilyakuz-db Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Title is not necessary but it's in the spec and monaco editor (vscode, Workspace) supports it. Maybe it will be useful in some cases where the field name itself is ambiguous

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the title render in the inline IDE interpolation? Is there an improvement vs no title?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answered here #1957 (comment)

Copy link
Contributor

@shreyas-goenka shreyas-goenka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This functionality would be great to have. I have a couple of differing opinions about how broad the scope of custom annotations should be and how to structure them.

Please TAL!

Makefile Outdated Show resolved Hide resolved
func generateSchema(workdir, outputFile string) {
annotationsPath := path.Join(workdir, "annotations.yml")
annotationsOpenApiPath := path.Join(workdir, "annotations_openapi.yml")
annotationsOpenApiOverridesPath := path.Join(workdir, "annotations_openapi_overrides.yml")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should limit the scope of custom annotations to only non-OpenAPI overrides. I don't see much if any overrides for OpenAPI fields. Any such one-of overrides can be easily be added similar to removeJobsFields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reasoning is that annotations_openapi.yml and annotations_openapi_overrides.yml are large files that serve little purpose and complicate understanding how OpenAPI descriptions are loaded onto the schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to use annotations_openapi.yml as untouched auto-generated file (we don't modify it), to be able to generate schema without running genkit every time. And in annotations_openapi_overrides.yml we can add any changes we need, e.g. fill missing descriptions.

Currently annotations_openapi_overrides consists only the fields where descriptions are missing, and it seems valid to override in this case since it affects json-schema usability

But we could merge all overrides in single file together in annotations.yml, @pietern WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be cautious about overexposing here since this will be harder to maintain as we launch SDK v1 with versioning per package. The interface of PkgName + TypeName might not be the best one for structs outside the CLI repo.

It seems valid to override in this case since it affects json-schema usability

Yeah, but we should not maintain docs for multiple API fields that we do not own. One-off overrides can be achieved via removeJobsFields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate more about SDK v1, how it will change json-schema, will it also include different versions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep it around but as minimal as possible. Good to have an escape hatch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate more about SDK v1, how it will change json-schema, will it also include different versions?

IIRC we'll have different Go packages for each version, jobsv1, jobsv2 etc so the package name in the key might no longer correspond to the new package names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We embed these types and we infer their path from the actual struct.

The only issue here is that we add another place where we depend on these paths and that we need to update when this move happens. That's a reasonable cost and worth it, IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep it around but as minimal as possible. Good to have an escape hatch.

We can skip descriptions for those and I will remove all PLACEHOLDER references in OpenAPI overrides and keep the file empty until we need overrides. As for annotations_openapi.yml - we could keep as is to keep an ability to run schema updates without genkit, @shreyas-goenka WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only issue here is that we add another place where we depend on these paths and that we need to update when this move happens. That's a reasonable cost and worth it, IMO.

I will add test that checks that descriptions are still valid after path changes as suggested here

bundle/internal/schema/main_test.go Show resolved Hide resolved
github.com/databricks/cli/bundle/config.Artifact:
description: Defines the attributes to build an artifact.
title: Artifact
github.com/databricks/cli/bundle/config.Artifact.build:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about grouping the struct fields together? Something like:

github.com/databricks/cli/bundle/config.Artifact:
  executable:
    description: PLACEHOLDER
  files:
    description: PLACEHOLDER
  path:
    description: PLACEHOLDER
  type:
    description: PLACEHOLDER

It'll make this file easier to read and maintain. We don't need any docs for the top level Root{} struct so we are not missing out on completeness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Looks like it should be safe to skip non-property descriptions like github.com/databricks/cli/bundle/config.Artifact since it is already included in config.Root.artifacts as property

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated according to this idea, for some rare cases where we still need root type I added "_" key


// Title of the object, rendered as inline documentation in the IDE.
// https://json-schema.org/understanding-json-schema/reference/annotations
Title string `json:"title,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the title render in the inline IDE interpolation? Is there an improvement vs no title?

}

func (d *annotationHandler) addAnnotations(typ reflect.Type, s jsonschema.Schema) jsonschema.Schema {
refPath := jsonschema.TypePath(typ)
Copy link
Contributor

@shreyas-goenka shreyas-goenka Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather keep typePath a private method because it deals with how to serialize #ref and work with the way our OpenAPI spec is structured.

We can instead directly use typ.PkgPath() + "." + typ.Name() here and error if either of them is an empty string. The rationale being that the annotations override will only be applicable for Go structs and not other types like primitive strings.

@ilyakuz-db
Copy link
Contributor Author

How does the title render in the inline IDE interpolation? Is there an improvement vs no title?

It is shown as header, I think the only improvement that we can add some custom title that we want to emphasize, e.g. for target keys, showing something like <target>. Although I'm not sure if schema supports descriptions for non-static keys, need to be checked

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test to prevent package renames from breaking documentation? A test that ensures all types defined in the annotations overrides actually exist in the CLI repo.

We should also consider limiting the scope of the annotations to only structs in the CLI repo if the keys are based of the type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will add tests for that

We should also consider limiting the scope of the annotations to only structs in the CLI repo if the keys are based of the type.

Do you mean to check properties only for structs that are in the CLI package?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test for detached annotations

Makefile Outdated Show resolved Hide resolved
"reflect"
"strings"

"github.com/ghodss/yaml"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use yamlsaver for saving to disk.

This gives you some control over the node style if you need it (e.g. use blocks for all descriptions).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to yamlsaver with yaml.LiteralStyle to make editing easier, this style seems the best for this purpose

func generateSchema(workdir, outputFile string) {
annotationsPath := path.Join(workdir, "annotations.yml")
annotationsOpenApiPath := path.Join(workdir, "annotations_openapi.yml")
annotationsOpenApiOverridesPath := path.Join(workdir, "annotations_openapi_overrides.yml")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep it around but as minimal as possible. Good to have an escape hatch.

Comment on lines +42 to +44
schema:
@echo "✓ Generating json-schema ..."
@go run ./bundle/internal/schema ./bundle/internal/schema ./bundle/schema/jsonschema.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need the path to itself?

Copy link
Contributor Author

@ilyakuz-db ilyakuz-db Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I passed this as argument to avoid hardcoding relative paths in Go (to keep an ability to execute both from make schema and from other places like Run from IDE). Other possible option to keep portability is to use embed but I also need write operations so it seems like not an option in my case

bundle/internal/schema/annotations.go Show resolved Hide resolved

rootTypeAnnotation, ok := annotations[RootTypeKey]
if ok {
assingAnnotation(&s, rootTypeAnnotation)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo

Suggested change
assingAnnotation(&s, rootTypeAnnotation)
assignAnnotation(&s, rootTypeAnnotation)

item, ok := annotations[k]
if !ok {
item = annotation{}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value of the map is a value type, so item will already be zero-initialized if it doesn't exist.

The if !ok is not needed.

bundle/internal/schema/annotations.go Show resolved Hide resolved
bundle/internal/schema/annotations.go Show resolved Hide resolved
addInterpolationPatterns,
})
if err != nil {
log.Fatal(err)
}

err = a.sync(annotationsPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you include a comment here; I suspect this writes the input file back with missing types included.

bundle/internal/schema/main_test.go Show resolved Hide resolved
assert.NoError(t, err)
copied, err := os.ReadFile(annotationsPath)
assert.NoError(t, err)
assert.Equal(t, string(original), string(copied), "Missing JSON-schema descriptions for new config fields in bundle/internal/schema/annotations.yml")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't run on Windows because it uses path.Join instead of filepath.Join. If you search/replace that the test will work on Windows as well. Not a blocking change for this PR, but would rather not have Windows-skips if it is not strictly necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was also an issue with line-breaks so I changed the way how this test compares values, now it parses file content and compare dynamic representation of yaml fields

@ilyakuz-db ilyakuz-db requested a review from pietern December 17, 2024 16:05
@pietern pietern changed the title feat: Custom annotations for json-schema fields in bundle Custom annotations for bundle-specific JSON schema fields Dec 18, 2024
Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @ilyakuz-db.

Merging the main branch again to resolve the conflict and then this will go in.

Can you TAL at the remaining comment?

func (d *annotationHandler) addAnnotations(typ reflect.Type, s jsonschema.Schema) jsonschema.Schema {
refPath := getPath(typ)
shouldHandle := strings.HasPrefix(refPath, "github.com")
if !shouldHandle {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which types don't have this prefix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some primitive types like int64, string etc

Copy link

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/cli

Inputs:

  • PR number: 1957
  • Commit SHA: 835fb05b873acd806cabe097770b7eb3bc18ff32

Checks will be approved automatically on success.

@pietern pietern enabled auto-merge December 18, 2024 10:00
@eng-dev-ecosystem-bot
Copy link
Collaborator

Test Details: go/deco-tests/12390589198

@pietern pietern added this pull request to the merge queue Dec 18, 2024
Merged via the queue into main with commit 042c8d8 Dec 18, 2024
10 checks passed
@pietern pietern deleted the feat/custom-annotations-json-schema branch December 18, 2024 10:25
github-merge-queue bot pushed a commit that referenced this pull request Dec 19, 2024
## Changes

I noticed that #1957 took a dep on this library even though we no longer
need it. This change removes the dep and cleans up other (unused) uses
of the library. We originally relied on this library to deserialize
bundle configuration and JSON payloads to non-bundle CLI commands.

Relevant commits:
* The YAML flag was added to support apps (very early), and is not
longer used: e408b70
* First use for bundle configuration loading: e47fa61
* Switch bundle configuration loading to use `libs/dyn`: 87dd46a

## Tests

The build works without the dependency.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants