Proposal: Keyword Preservation During Export #688

willvedd · 2022-11-04T14:00:52Z

Start Date: November 4, 2022
Github Issue: Proposal: Keyword Preservation During Export · Issue #688

Summary

We propose a mechanism for preserving keyword replacement markers during export, to reduce the burden of manually placed keyword markers being overwritten by remote tenant values on subsequent exports.

Issue

Customers manually place keyword markers throughout their configuration files to enable dynamic replacement of values depending on the environment. When customers import their local configuration into Auth0, these markers are untouched. However, when customers subsequently re-export configuration from Auth0 to their local files, all keyword markers are replaced with the literal values of the targeted tenant. Users are then forced to manually re-add all keyword markers throughout their configuration files.

Because of this limitation, the recommended approach for customers is a uni-directional workflow. In practice, this workflow begins with a single export and executes only imports thereafter. To preserve keywords, any tenant modifications need to occur in the local configuration; no remote changes are allowed. However, this is inflexible, and customers sometimes need the ability to pull remote changes into their local configuration.

Example

The following example highlights the occurrence of this issue:

Keyword mappings:

{
  "AUTH0_KEYWORD_REPLACE_MAPPINGS": {
    "ENV": "production",
    "DOMAIN": "travel0.com"
  }
}

Local configuration file:

organizations:
  - name: org1
    connections: []
    display_name: Travel0 - ##ENV##
tenant:
  allowed_logout_urls:
    - https://mycompany.org/logoutCallback
    - https://##DOMAIN##/logout

Local configuration file after a0deploy export:

organizations:
  - name: org1
    connections: []
    display_name: Travel0 - production #Keyword marker overwritten by remote value
tenant:
  allowed_logout_urls:
    - https://mycompany.org/logoutCallback
    - https://travel0.com/logout #Keyword marker overwritten by remote value
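
To make the loss of information concrete, here is a minimal TypeScript sketch of the substitution step. The helper name `applyKeywordMappings` and its regular expression are assumptions made for illustration, not the Deploy CLI's actual code. It shows that once a marker is replaced on import, the tenant holds only the literal value, so a plain export cannot tell which part of a string came from a keyword.

```typescript
type KeywordMappings = Record<string, string>;

// Hypothetical helper: substitutes every ##KEY## marker with its mapped value.
// Markers whose key is missing from the mappings are left untouched.
function applyKeywordMappings(text: string, mappings: KeywordMappings): string {
  return text.replace(/##([A-Za-z0-9_]+)##/g, (marker: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(mappings, key)
      ? mappings[key]
      : undefined;
    return value ?? marker;
  });
}

const mappings: KeywordMappings = { ENV: "production", DOMAIN: "travel0.com" };

// What the tenant receives on import: a literal string with no trace of ##DOMAIN##.
const imported = applyKeywordMappings("https://##DOMAIN##/logout", mappings);
console.log(imported);
```

Because the substitution is one-way, any preservation mechanism has to consult the existing local files to recover where the markers were.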

Two Sources of Truth (“reconciliation problem”)

A fundamental point to address is that there are two sources of truth that need to be reconciled: the local configuration files and the remote tenant.

A useful analogy is when two developers are working on the same branch simultaneously and periodically pulling code from remote. In certain cases, Git won’t know how to merge changes and requires the user to reconcile manually. There are two sources of truth in this system: the developer’s local code and remote.

Currently, when a customer manages their Auth0 tenant, any delta in state is overwritten, regardless of direction. For exports, all local configuration gets overwritten by remote settings. Likewise with imports, all remote settings get overwritten by those of the local configuration files.

However, in the case of keyword preservation, we introduce the possibility of retaining changes from both remote and local. When that happens, the difference in state between remote and local needs to be reconciled.

It is impossible to know the intent and expectations of the user. Are they expecting the remote values to override local ones and be merged intelligently? Are they expecting local values to be preserved despite remote changes? Or perhaps some blend of the two?

Problematic Cases

Example 1 - Removal of keyword marker value on remote

{
  "AUTH0_KEYWORD_REPLACE_MAPPINGS": {
    "DOMAIN": "travel0.com"
  }
}

Local configuration:

tenant:
  allowed_logout_urls:
    - http://localhost:3000/logoutCallback
    - https://##DOMAIN##/logout

Remote:

{
  "tenant": {
    "allowed_logout_urls": [
      "http://localhost:3000/logoutCallback", 
      "http://pr-branch-45.travel0.com/logout"
    ]
  }
}

This case is problematic because the replaced value on remote got removed and replaced with a different value. We as humans can see the intention here, but the system is unable to make an accurate determination.

Example 2 - String value w/ keyword marker changed on remote

{
  "AUTH0_KEYWORD_REPLACE_MAPPINGS": {
    "ENV": "production"
  }
}

Local:

organizations:
  - name: travel0
    display_name: Travel0 (on ##ENV##)

Remote:

{
  "organizations": [
    {
      "id": "travel0-org",
      "name": "travel0",
      "display_name": "Production Travel0 Organization"
    }
  ]
}

This case is problematic because the property the keyword is associated with has changed on remote. This is another instance of the reconciliation problem: the system doesn’t know whether to keep the remote value or the local value.

Possible Solutions

There are a few options available for addressing instances of the reconciliation problem. The Deploy CLI will attempt to preserve keywords when the situation is trivial. However, at points of problematic reconciliation, the following mechanisms could be applied:

Attempt automatic reconciliation

At points of problematic reconciliation, always prefer a default environment (e.g., local).

Pros: No additional steps for the export command, preserves the majority of keywords and remote changes, operable in automated workflows.

Cons: Results may not line up with expectations, certain remote changes may be blown away.

Defer decisions to developer

Require the developer to make a choice at each reconciliation instance. In practice, this may be exposed through an interactive mechanism.

Pros: Developer can accurately express which remote configuration to keep, which to preserve locally. Enables highly accurate keyword preservation.

Cons: Additional steps added to export command, not suitable for automated processes, only as accurate as developer’s ability to manually reconcile.

Hybrid

Present a choice to the developer before export as to which environment to prefer during problematic reconciliations. In practice, this may be exposed through an argument passed in during export.

Pros: Developer has some ability to express intentions, preserves the majority of keywords and remote changes, operable in automated workflows.

Cons: Treats all reconciliations broadly, does not allow ability to prefer some remote changes while simultaneously preferring local ones.

Proposed Implementation

Given the technical challenges and questions around DX, a final implementation of keyword preservation could be quite large and complex. To enable a balance of timely delivery and thoughtful evolution, this functionality is planned to be delivered iteratively over time. The below outlines the initial functionality of keyword preservation.

Prerequisites

Customers who wish to leverage the keyword preservation functionality will need to have the following:

Existing configuration files in a local directory
AUTH0_KEYWORD_REPLACE_MAPPINGS configuration value defined

Without configuration files and keyword mappings present, there is nothing to preserve. Customers who do not satisfy these requirements while attempting to use the feature will incur an error.

Opt-in Boolean Flag

Keyword preservation on export would be enabled through a --attempt-preserve-keywords (or similarly named) boolean flag. Because the flag is opt-in, it maintains backwards compatibility and avoids the need for a new major release.

The name is intentional to convey the possibility of problematic reconciliations, making it clear that the functionality is not performing any magic.

Prefer Local Values During Reconciliation

For the sake of simplicity, the reconciliation problem can be heavily mitigated if we prefer the local state during instances of problematic reconciliation. By preferring the local values during reconciliation, we avoid a manual step for the user and provide a predictable behavior that can be comprehended.

This behavior will allow the preservation of all keyword markers in local configuration and keep most changes on remote. However, this opens the possibility of some remote changes being discarded when local values are preferred. These instances will be communicated to the user through logging output.
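
The behavior described above can be sketched roughly as follows. This is a simplified illustration over flat string fields, not the eventual implementation: the names `reconcilePreferLocal` and `applyKeywordMappings`, the flat `Fields` shape, and the marker regular expression are all assumptions. Fields whose local value contains a marker keep the local value, and any field where the substituted local value differs from remote is reported so it can be logged.

```typescript
type KeywordMappings = Record<string, string>;
type Fields = Record<string, string>;

const MARKER = /##([A-Za-z0-9_]+)##/;

// Hypothetical helper: substitutes ##KEY## markers with their mapped values.
function applyKeywordMappings(text: string, mappings: KeywordMappings): string {
  return text.replace(new RegExp(MARKER.source, "g"), (marker: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(mappings, key)
      ? mappings[key]
      : undefined;
    return value ?? marker;
  });
}

// Hypothetical reconciliation that prefers local values wherever a marker is present.
function reconcilePreferLocal(
  local: Fields,
  remote: Fields,
  mappings: KeywordMappings
): { merged: Fields; discarded: string[] } {
  const merged: Fields = {};
  const discarded: string[] = [];
  for (const [field, remoteValue] of Object.entries(remote)) {
    const localValue = local[field];
    if (localValue === undefined || !MARKER.test(localValue)) {
      merged[field] = remoteValue; // no keyword involved: remote wins, as in a normal export
      continue;
    }
    merged[field] = localValue; // keep the marker
    if (applyKeywordMappings(localValue, mappings) !== remoteValue) {
      discarded.push(field); // remote changed this field; the change is not kept
    }
  }
  return { merged, discarded };
}

const outcome = reconcilePreferLocal(
  { display_name: "Travel0 (on ##ENV##)", logout_url: "https://##DOMAIN##/logout", name: "travel0" },
  { display_name: "Production Travel0 Organization", logout_url: "https://travel0.com/logout", name: "travel0-renamed" },
  { ENV: "production", DOMAIN: "travel0.com" }
);
for (const field of outcome.discarded) {
  console.warn(`Remote change to "${field}" was not kept; local keyword marker preserved`);
}
```

In this sketch the rename of `name` is taken from remote, `logout_url` keeps its marker with no conflict, and `display_name` keeps its marker while the remote edit is reported as discarded.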

Future Iterations

The proposed implementation sets the groundwork for future improvements to be built atop. Optimizations and configurability could be added in the future to enhance the developer experience.

A default preference for local configuration values may not be sensible in some instances. In certain situations, a developer may want to keep remote changes more than they want to preserve keywords. This preference could be expressed as an argument to the existing flag: --attempt-preserve-keywords="prefer-remote". This lightweight option would allow developers to better express intentions without adding an extra step to the export process.

Algorithmic improvements to string reconciliation and diffing could reduce the occurrence of problematic reconciliations. As feedback on this functionality becomes clearer, we can identify typical use cases and begin to cater to them. Some changes could even be applied transparently to the developer.

Finally, there may be an implementation where a fully interactive CLI mode is necessary to triage the changes in remote and local environments on a per-reconciliation basis. This type of mechanism may have some overlap with the requested test mode functionality whereby differences in remote and local are surfaced to the developer.

Requested Feedback

What use cases have begged the need for keyword preservation? Are changes often made to both remote and local configurations?
Are exports that require keyword preservation performed ad-hoc or in automation?
During cases of “problematic reconciliation”, is it sensible to prefer the preservation of local configuration values over remote values?

Relevant Github Issues:

hayashi-ay · 2022-11-07T01:55:00Z

Thanks for the proposal but I personally don't need these features. They only complicate the cli.

The cause of the problem of keyword replacement is that the import and export configuration files are the same, but in our operation, they are separate files, so there is no problem.

The exported config files are also used as a backup in case the tenant's configuration changes unexpectedly during operation via Auth0 dashboard or the cli.

Of course, it is necessary to manually reflect the changes to the import configuration files, but there are not that many lines to change and no more worrying about whether REMOTE or LOCAL should be prioritized.

jcerjak · 2022-11-13T14:55:10Z

Thanks for the writeup, @willvedd. In general, I think the proposal is sensible. I have some comments/proposals, please see below.

Comments

There are a few options available for addressing instances of the reconciliation problem. Certainly, the Deploy CLI will attempt to preserve keywords when the situation is trivial

I think it would be good to define what situations are trivial, and what are considered a reconciliation problem.

Given your comment about magic:

The name is intentional to convey the possibility of problematic reconciliations, making it clear that the functionality is not performing any magic.

I would initially go for the simplest and safest solution, and not try to do any merge-like reconciliations at all. I propose it would work like this:

If remote value matches local value (with keyword replacements applied) EXACTLY, keep the local value (with keywords).
Otherwise keep the remote value.
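
A minimal TypeScript sketch of the two rules above, for a single string value. The function names are hypothetical and this is only an illustration of the exact-match idea, not Deploy CLI code.

```typescript
type KeywordMappings = Record<string, string>;

// Hypothetical helper: substitutes ##KEY## markers with their mapped values.
function applyKeywordMappings(text: string, mappings: KeywordMappings): string {
  return text.replace(/##([A-Za-z0-9_]+)##/g, (marker: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(mappings, key)
      ? mappings[key]
      : undefined;
    return value ?? marker;
  });
}

// Keep the local value (with markers) only when it reproduces the remote value exactly.
function preserveIfExactMatch(local: string, remote: string, mappings: KeywordMappings): string {
  return applyKeywordMappings(local, mappings) === remote ? local : remote;
}

const unchanged = preserveIfExactMatch(
  "https://##DOMAIN##/logout",
  "https://travel0.com/logout",
  { DOMAIN: "travel0.com" }
); // remote matches, so the marker survives

const changed = preserveIfExactMatch(
  "Travel0 (on ##ENV##)",
  "Production Travel0 Organization",
  { ENV: "production" }
); // remote was edited, so the remote value is exported and shows up in a diff
console.log(unchanged, changed);
```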

This would allow:

Basic export/import roundtrip flow to work
Easily detect any remote changes (e.g. by doing a git diff), and manually reconcile them by editing local config.

For workflows that rely on frequent remote updates, a more complex/magic approach might be needed. But that can be added later, and shouldn't be the default.

Prefer Local Values During Reconciliation
This behavior will allow the preservation of all keyword markers in local configuration and keep most changes on remote. However, this opens the possibility of some remote changes being unkept when local values are preferred. These instances will be communicated to the user through logging outputs.

I think it's much easier to do a git diff than to go through logging output. It would also make it possible to set up a job that periodically checks for changes on remote and triggers an alert (e.g., to detect whether a malicious user is making changes on remote). So, I would vote to prefer remote values, and let the user reconcile problematic cases manually.

Requested feedback

What use cases have begged the need for keyword preservation? Are changes often made to both remote and local configurations?

After doing initial export, we plan to make changes in local config, then deploy. But sometimes there might be some remote changes which we want to export again. Additionally, we want to detect if there were some changes done on remote - for example, if an admin was temporarily testing something, like editing allowed logout URLs, and then forgot to remove these changes. It is impractical to see a bunch of changes due to keyword preservation not working. We want to see only meaningful changes when doing an export.

Are exports that require keyword preservation performed ad-hoc or in automation?

Right now, we plan to do exports and imports ad-hoc, also to double check the changes before deploying them (when dry-run mode is added in #70).

But we might set up an automated export to detect any changes on remote (e.g. as part of CI). For this use case keyword preservation is crucial so that we can fail the build (or trigger automated alert), if there were some changes done on remote.

During cases of “problematic reconciliation”, is it sensible to prefer the preservation of local configuration values over remote values?

As described above, I think it would be easier to review the changes if remote values would be preferred.

willvedd · 2023-03-03T14:20:03Z

Copy-pasting my comment on the original Github issue:

The initial iteration of this feature is complete and released in v7.17.0.

Goal:
Preserve the majority of manually-placed ##KEYWORD_REPLACE## markers in the configuration files when performing subsequent exports. Otherwise, they will be overwritten with the remote values and require the toilsome step of re-adding them.

Usage:
Keyword preservation is an opt-in feature that can be enabled through the AUTH0_PRESERVE_KEYWORDS boolean configuration property (docs).

Prerequisites:
To leverage the keyword preservation feature, the following criteria must be satisfied:

Presence of local configuration files at the same location as the export target
Defined keyword replace mappings via the AUTH0_KEYWORD_REPLACE_MAPPINGS configuration property

Limitations:
The keyword preservation functionality will attempt to preserve as many keywords as possible while also maintaining the accuracy of your resource configuration files. In the majority of cases, it will work without any intervention by the user. However, some key limitations exist:

In the case of a keyword-replaced configuration field with differing values between local and remote, the local configuration value will always be favored. This will
Arrays without specific identifiers are not eligible for preservation. Ex: [ "http://site.com/logout", "localhost:3000/logout", "##LOGOUT_URL##" ]. This is because the ordering of these values is non-deterministic. To preserve these values, it is recommended to use the @@ARRAY_REPLACE@@ keyword replace syntax for the entire value.
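
The array limitation can be illustrated with a small TypeScript sketch. It assumes, purely for illustration, that array entries are matched between local and remote by a `name` field, and that a marker is restored only when the substituted local value equals the remote value; the released feature's actual matching logic may differ. Entries that carry an identifier can be paired up regardless of order, whereas a plain array of strings offers no such key.

```typescript
type KeywordMappings = Record<string, string>;
interface Org { name: string; display_name: string }

// Hypothetical helper: substitutes ##KEY## markers with their mapped values.
function applyKeywordMappings(text: string, mappings: KeywordMappings): string {
  return text.replace(/##([A-Za-z0-9_]+)##/g, (marker: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(mappings, key)
      ? mappings[key]
      : undefined;
    return value ?? marker;
  });
}

// Pair remote entries with local ones by identifier, then restore markers where they still apply.
function preserveByIdentifier(local: Org[], remote: Org[], mappings: KeywordMappings): Org[] {
  const localByName = new Map<string, Org>();
  for (const org of local) localByName.set(org.name, org);
  return remote.map((remoteOrg) => {
    const localOrg = localByName.get(remoteOrg.name);
    if (
      localOrg !== undefined &&
      applyKeywordMappings(localOrg.display_name, mappings) === remoteOrg.display_name
    ) {
      return { ...remoteOrg, display_name: localOrg.display_name };
    }
    return remoteOrg;
  });
}

// Remote returns the entries in a different order; matching by name still works.
const preserved = preserveByIdentifier(
  [{ name: "org1", display_name: "Travel0 - ##ENV##" }, { name: "org2", display_name: "Other" }],
  [{ name: "org2", display_name: "Other" }, { name: "org1", display_name: "Travel0 - production" }],
  { ENV: "production" }
);
console.log(preserved);
```

With an array such as the logout URLs above, there is no field to key on, so an entry at a given position locally cannot be reliably paired with its remote counterpart.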

Final Words:
Despite the above limitations, we believe this feature will remove the majority of the toil related to managing overwritten keyword replace markers. And best of all, it should be transparent to the developer in most cases. Keep in mind that this is only the first iteration. Despite extensive testing, adjustments may need to be made to accommodate real-world usage. Also, the functionality is expected to be iterated on over time to best suit the needs of developers.

Appreciate everyone's patience and feedback on this one, really excited to see how it's received!

willvedd added feature request proposal Proposing new or revised functionality for comment and removed feature request labels Nov 4, 2022

willvedd pinned this issue Nov 4, 2022

willvedd mentioned this issue Nov 4, 2022

Preserve keyword replacement during export #328

Closed

jcerjak mentioned this issue Dec 12, 2022

Potentially dangerous and inconsistent behavior for secrets #689

Open

6 tasks

This was referenced Feb 21, 2023

DXCDT-376: Preserve keywords function #745

Merged

DXCDT-377: E2E test cases for keyword presevation #753

Merged

This was referenced Mar 1, 2023

DXCDT-384: Keyword preservation in auxiliary files #758

Merged

DXCDT-387: Keyword preservation documentation #759

Merged

DXCDT-388: Enabling preservation of array replace markers in YAML #760

Merged

willvedd closed this as completed Mar 3, 2023

willvedd unpinned this issue Mar 4, 2023

jcerjak mentioned this issue Mar 10, 2023

Error on export when keyword preservation is enabled #763

Closed

6 tasks

willvedd mentioned this issue Mar 24, 2023

Keyword preservation does not preserve "identifier" fields #770

Closed

6 tasks

TimeTravelerFromNow mentioned this issue Sep 9, 2024

Directory JSON actions secrets matches YAML #954

Merged

2 tasks
