Proposal: Keyword Preservation During Export #688

Closed
willvedd opened this issue Nov 4, 2022 · 3 comments
Labels
proposal Proposing new or revised functionality for comment

Comments

@willvedd
Contributor

willvedd commented Nov 4, 2022

Summary

We propose a mechanism for preserving keyword replacement markers during export, so that manually placed markers are no longer overwritten by remote tenant values on subsequent exports.

Issue

Customers manually place keyword markers throughout their configuration files to enable dynamic replacement of values depending on the environment. When customers import their local configuration into Auth0, these markers are untouched. However, when customers subsequently re-export configuration from Auth0 to their local files, all keyword markers are wiped out and replaced with the literal values of the targeted tenant. Users are then forced to manually re-add all keyword markers throughout their configuration files.

Because of this limitation, the recommended approach for customers is a uni-directional workflow. In practice, this workflow begins with a single export and is followed only by imports. To preserve keywords, any tenant modification must be made in the local configuration; no remote changes are allowed. However, this is inflexible, and customers sometimes need the ability to pull remote changes into their local configuration.

Example

The following example highlights the occurrence of this issue:

Keyword mappings:

{
  "AUTH0_KEYWORD_REPLACE_MAPPINGS": {
    "ENV": "production",
    "DOMAIN": "travel0.com"
  }
}

Local configuration file:

organizations:
  - name: org1
    connections: []
    display_name: Travel0 - ##ENV##
tenant:
  allowed_logout_urls:
    - https://mycompany.org/logoutCallback
    - https://##DOMAIN##/logout

Local configuration file after a0deploy export:

organizations:
  - name: org1
    connections: []
    display_name: Travel0 - production #Keyword marker overwritten by remote value
tenant:
  allowed_logout_urls:
    - https://mycompany.org/logoutCallback
    - https://travel0.com/logout #Keyword marker overwritten by remote value
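
To make the one-way nature of the substitution concrete, here is a minimal Python sketch of marker expansion on import. It is an illustration only, not the Deploy CLI's actual code; the function name `apply_keywords` and the regular expression are assumptions made for this example.

```python
import re

# Assumed marker shape for this sketch: ##KEY## with letters, digits, underscores.
MARKER = re.compile(r"##([A-Za-z0-9_]+)##")


def apply_keywords(text: str, mappings: dict) -> str:
    """Replace each ##KEY## marker with its mapped value.

    Markers whose key is not in the mappings are left untouched.
    """
    def substitute(match):
        key = match.group(1)
        return str(mappings[key]) if key in mappings else match.group(0)

    return MARKER.sub(substitute, text)


mappings = {"ENV": "production", "DOMAIN": "travel0.com"}

# What the tenant receives on import: the marker is gone, only the value remains.
print(apply_keywords("https://##DOMAIN##/logout", mappings))  # https://travel0.com/logout
```

Once the expanded string is stored on the tenant, nothing in it records that `travel0.com` came from `##DOMAIN##`, which is why a plain export cannot restore the marker.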

Two Sources of Truth (“reconciliation problem”)

A fundamental point to address is that there are two sources of truth that need to be reconciled: the local configuration files and the remote tenant.

A useful analogy is when two developers are working on the same branch simultaneously and periodically pulling code from remote. In certain cases, Git won’t know how to merge changes and requires the user to reconcile manually. There are two sources of truth in this system: the developer’s local code and remote.

When a customer currently manages their Auth0 tenant, any delta in state is overwritten, regardless of direction. For exports, all local configuration gets overwritten by remote settings. Likewise with imports, all remote settings get overwritten by that of the local configuration files.

However, in the case of keyword preservation, we introduce the possibility of retaining changes from both remote and local. When that happens, the difference in state between remote and local needs to be reconciled.

It is impossible to know the intent and expectations of the user. Are they expecting the remote values to override local ones and be merged intelligently? Are they expecting local values to be preserved despite remote changes? Or perhaps some blend of the two?

Problematic Cases

Example 1 - Removal of keyword marker value on remote

Keyword mappings:

{
  "AUTH0_KEYWORD_REPLACE_MAPPINGS": {
    "DOMAIN": "travel0.com"
  }
}

Local configuration:

tenant:
  allowed_logout_urls:
    - http://localhost:3000/logoutCallback
    - https://##DOMAIN##/logout

Remote:

{
  "tenant": {
    "allowed_logout_urls": [
      "http://localhost:3000/logoutCallback", 
      "http://pr-branch-45.travel0.com/logout"
    ]
  }
}

This case is problematic because the replaced value on remote got removed and replaced with a different value. We as humans can see the intention here, but the system is unable to make an accurate determination.
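
The ambiguity in Example 1 can be shown with a short Python sketch. It expands the local entries and compares them with the remote list; the helper names `expand` and `unmatched_entries` are hypothetical and exist only for this illustration.

```python
import re

MARKER = re.compile(r"##([A-Za-z0-9_]+)##")


def expand(text, mappings):
    """Replace ##KEY## markers with mapped values (hypothetical helper)."""
    return MARKER.sub(lambda m: str(mappings.get(m.group(1), m.group(0))), text)


def unmatched_entries(local_items, remote_items, mappings):
    """Return (local entries with no remote match, remote entries with no local match)."""
    expanded_local = {expand(item, mappings): item for item in local_items}
    orphaned_local = [item for value, item in expanded_local.items()
                      if value not in remote_items]
    new_remote = [value for value in remote_items if value not in expanded_local]
    return orphaned_local, new_remote


mappings = {"DOMAIN": "travel0.com"}
local_urls = ["http://localhost:3000/logoutCallback", "https://##DOMAIN##/logout"]
remote_urls = ["http://localhost:3000/logoutCallback",
               "http://pr-branch-45.travel0.com/logout"]

orphaned_local, new_remote = unmatched_entries(local_urls, remote_urls, mappings)
print(orphaned_local)  # ['https://##DOMAIN##/logout']
print(new_remote)      # ['http://pr-branch-45.travel0.com/logout']
```

The comparison yields one orphaned local entry and one new remote entry, but nothing in the data says whether the remote entry is an edit of the local one or an unrelated addition paired with a deletion.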

Example 2 - String value with keyword marker changed on remote

Keyword mappings:

{
  "AUTH0_KEYWORD_REPLACE_MAPPINGS": {
    "ENV": "production"
  }
}

Local:

organizations:
  - name: travel0
    display_name: Travel0 (on ##ENV##)

Remote:

{
  "organizations": [
    {
      "id": "travel0-org",
      "name": "travel0",
      "display_name": "Production Travel0 Organization"
    }
  ]
}

This case is problematic because the property the keyword is associated with has changed on remote. This is another instance of the reconciliation problem: the system doesn’t know whether to keep the remote value or the local value.

Possible Solutions

There are a few options available for addressing instances of the reconciliation problem. Certainly, the Deploy CLI will attempt to preserve keywords when the situation is trivial. At points of problematic reconciliation, however, the following mechanisms could be applied:

Attempt automatic reconciliation

At points of problematic reconciliation, always prefer a default environment (e.g., local).

Pros: No additional steps for export command, will preserve majority of keywords and remote changes, operable in automated workflows.

Cons: Results may not line up with expectations, certain remote changes may be blown away.

Defer decisions to developer

Require developer to make choice at each reconciliation instance. In practice, may be exposed through an interactive mechanism.

Pros: Developer can accurately express which remote configuration to keep, which to preserve locally. Enables highly accurate keyword preservation.

Cons: Additional steps added to export command, not suitable for automated processes, only as accurate as developer’s ability to manually reconcile.

Hybrid

Present a choice to developer before export as to which environment to prefer during problematic reconciliations. In practice, may be exposed through an argument passed in during export.

Pros: Developer has some ability to express intentions, will preserve majority of keywords and remote changes, operable in automated workflows.

Cons: Treats all reconciliations broadly, does not allow ability to prefer some remote changes while simultaneously preferring local ones.

Proposed Implementation

Given the technical challenges and questions around DX, a final implementation of keyword preservation could be quite large and complex. To enable a balance of timely delivery and thoughtful evolution, this functionality is planned to be delivered iteratively over time. The below outlines the initial functionality of keyword preservation.

Prerequisites

Customers who wish to leverage the keyword preservation functionality will need to have the following:

  • Existing configuration files in a local directory
  • AUTH0_KEYWORD_REPLACE_MAPPINGS configuration value defined

Without configuration files and keyword mappings present, there is nothing to preserve. Customers who do not satisfy these requirements while attempting to use the feature will incur an error.

Opt-in Boolean Flag

Keyword preservation on export would be enabled through a --attempt-preserve-keywords (or similarly named) boolean flag. Making the feature opt-in maintains backwards compatibility and avoids the need for a new major release.

The name is intentional to convey the possibility of problematic reconciliations, making it clear that the functionality is not performing any magic.

Prefer Local Values During Reconciliation

For the sake of simplicity, the reconciliation problem can be heavily mitigated if we prefer the local state during instances of problematic reconciliation. By preferring the local values during reconciliation, we avoid a manual step for the user and provide a predictable behavior that can be comprehended.

This behavior will allow the preservation of all keyword markers in local configuration and keep most changes on remote. However, this opens the possibility of some remote changes being unkept when local values are preferred. These instances will be communicated to the user through logging outputs.
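
A rough Python sketch of the prefer-local behavior described above follows. It assumes configuration is represented as nested dictionaries and that the remote export drives which keys exist; the function `reconcile_prefer_local` and the notice format are hypothetical, and lists are deliberately taken from remote as-is.

```python
import re

MARKER = re.compile(r"##([A-Za-z0-9_]+)##")


def expand(text, mappings):
    """Replace ##KEY## markers with mapped values (hypothetical helper)."""
    return MARKER.sub(lambda m: str(mappings.get(m.group(1), m.group(0))), text)


def reconcile_prefer_local(local, remote, mappings, path=""):
    """Merge remote into local, keeping any local string that contains a marker.

    Returns (merged, notices); a notice is recorded whenever a remote value
    is discarded in favor of a keyword-bearing local value.
    """
    merged, notices = {}, []
    for key, remote_value in remote.items():
        here = f"{path}.{key}" if path else key
        local_value = local.get(key)
        if isinstance(remote_value, dict):
            nested_local = local_value if isinstance(local_value, dict) else {}
            merged[key], nested = reconcile_prefer_local(
                nested_local, remote_value, mappings, here)
            notices.extend(nested)
        elif isinstance(local_value, str) and MARKER.search(local_value):
            merged[key] = local_value  # local always wins when it holds a marker
            if expand(local_value, mappings) != remote_value:
                notices.append(
                    f"{here}: kept local value, discarded remote {remote_value!r}")
        else:
            merged[key] = remote_value  # no marker involved: take the remote value
    return merged, notices


mappings = {"ENV": "production"}
local = {"organization": {"name": "travel0",
                          "display_name": "Travel0 (on ##ENV##)"}}
remote = {"organization": {"id": "travel0-org", "name": "travel0",
                           "display_name": "Production Travel0 Organization"}}

merged, notices = reconcile_prefer_local(local, remote, mappings)
print(merged["organization"]["display_name"])  # Travel0 (on ##ENV##)
for notice in notices:
    print(notice)
```

In this sketch the new remote `id` field is kept, the marker survives, and the overwritten remote display name surfaces as a notice rather than disappearing silently.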

Future Iterations

The proposed implementation sets the groundwork for future improvements to be built atop. Optimizations and configurability could be added in the future to enhance the developer experience.

A default preference for local configuration values may not be sensible in some instances. In certain situations, a developer may want to keep remote changes more than they want to preserve keywords. This reconciliation preference could be expressed as an argument to the existing flag: --attempt-preserve-keywords="prefer-remote". This lightweight option would allow developers to better express intentions without adding an extra step to the export process.

Algorithmic improvements to string reconciliation and diffing could reduce the occurrence of problematic reconciliations. As feedback on this functionality becomes clearer, we can identify typical use cases and begin to cater to them. Some changes could even be applied transparently to the developer.

Finally, there may be an implementation where a fully interactive CLI mode is necessary to triage the changes in remote and local environments on a per-reconciliation basis. This type of mechanism may have some overlap with the requested test mode functionality whereby differences in remote and local are surfaced to the developer.

Requested Feedback

  • What use cases have begged the need for keyword preservation? Are changes often made to both remote and local configurations?
  • Are exports that require keyword preservation performed ad-hoc or in automation?
  • During cases of “problematic reconciliation”, is it sensible to prefer the preservation of local configuration values over remote values?

Relevant Github Issues:

@willvedd willvedd added feature request proposal Proposing new or revised functionality for comment and removed feature request labels Nov 4, 2022
@willvedd willvedd pinned this issue Nov 4, 2022
@hayashi-ay

Thanks for the proposal but I personally don't need these features. They only complicate the cli.

The cause of the problem of keyword replacement is that the import and export configuration files are the same, but in our operation, they are separate files, so there is no problem.

The exported config files are also used as a backup in case the tenant's configuration changes unexpectedly during operation via Auth0 dashboard or the cli.

Of course, it is necessary to manually reflect the changes to the import configuration files, but there are not that many lines to change and no more worrying about whether REMOTE or LOCAL should be prioritized.

@jcerjak

jcerjak commented Nov 13, 2022

Thanks for the writeup, @willvedd. In general, I think the proposal is sensible. I have some comments/proposals, please see below.

Comments

There are a few options available for addressing instances of the reconciliation problem. Certainly, the Deploy CLI will attempt to preserve keywords when the situation is trivial

I think it would be good to define what situations are trivial, and what are considered a reconciliation problem.

Given your comment about magic:

The name is intentional to convey the possibility of problematic reconciliations, making it clear that the functionality is not performing any magic.

I would initially go for the simplest and safest solution, and not try to do any merge-like reconciliations at all. I propose it would work like this:

  • If remote value matches local value (with keyword replacements applied) EXACTLY, keep the local value (with keywords).
  • Otherwise keep the remote value.

This would allow:

  • Basic export/import roundtrip flow to work
  • Easily detect any remote changes (e.g. by doing a git diff), and manually reconcile them by editing local config.

For workflows that rely on frequent remote updates, a more complex/magic approach might be needed. But that can be added later, and shouldn't be the default.
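
The exact-match rule proposed above could be sketched in Python roughly as follows. This is only an illustration of the two bullet points; `preserve_if_exact` and `expand` are hypothetical names, not functions of the Deploy CLI.

```python
import re

MARKER = re.compile(r"##([A-Za-z0-9_]+)##")


def expand(text, mappings):
    """Replace ##KEY## markers with mapped values (hypothetical helper)."""
    return MARKER.sub(lambda m: str(mappings.get(m.group(1), m.group(0))), text)


def preserve_if_exact(local_value, remote_value, mappings):
    """Keep the local value only when its expansion equals the remote value exactly."""
    if isinstance(local_value, str) and expand(local_value, mappings) == remote_value:
        return local_value
    return remote_value


mappings = {"ENV": "production", "DOMAIN": "travel0.com"}

# Round trip with no remote change: the marker is kept.
print(preserve_if_exact("https://##DOMAIN##/logout",
                        "https://travel0.com/logout", mappings))

# Remote was edited: the remote value is kept and shows up in a git diff.
print(preserve_if_exact("Travel0 (on ##ENV##)",
                        "Production Travel0 Organization", mappings))
```

With this rule, an export of an unchanged tenant produces no diff at all, so any diff that does appear corresponds to a real remote change.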

Prefer Local Values During Reconciliation
This behavior will allow the preservation of all keyword markers in local configuration and keep most changes on remote. However, this opens the possibility of some remote changes being unkept when local values are preferred. These instances will be communicated to the user through logging outputs.

I think it's much easier to do a git diff than to go through logging output. It would also make it possible to set up a job that periodically checks for changes on remote and triggers an alert (e.g., to detect whether a malicious user is making changes on remote). So, I would vote to prefer remote values, and let the user reconcile problematic cases manually.

Requested feedback

What use cases have begged the need for keyword preservation? Are changes often made to both remote and local configurations?

After doing initial export, we plan to make changes in local config, then deploy. But sometimes there might be some remote changes which we want to export again. Additionally, we want to detect if there were some changes done on remote - for example, if an admin was temporarily testing something, like editing allowed logout URLs, and then forgot to remove these changes. It is impractical to see a bunch of changes due to keyword preservation not working. We want to see only meaningful changes when doing an export.

Are exports that require keyword preservation performed ad-hoc or in automation?

Right now, we plan to do exports and imports ad-hoc, also to double check the changes before deploying them (when dry-run mode is added in #70).

But we might set up an automated export to detect any changes on remote (e.g. as part of CI). For this use case keyword preservation is crucial so that we can fail the build (or trigger automated alert), if there were some changes done on remote.
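
As a rough illustration of such a drift check, the Python sketch below expands the markers in the local configuration and lists the paths where the result differs from the exported remote configuration. All names (`expand_tree`, `diff_paths`, `find_drift`) are hypothetical, and the sketch assumes both configurations have already been loaded into dictionaries.

```python
import re

MARKER = re.compile(r"##([A-Za-z0-9_]+)##")


def expand_tree(value, mappings):
    """Recursively expand ##KEY## markers in every string of a nested structure."""
    if isinstance(value, str):
        return MARKER.sub(lambda m: str(mappings.get(m.group(1), m.group(0))), value)
    if isinstance(value, list):
        return [expand_tree(item, mappings) for item in value]
    if isinstance(value, dict):
        return {key: expand_tree(item, mappings) for key, item in value.items()}
    return value


def diff_paths(expected, actual, path=""):
    """Return dotted paths at which two nested structures differ."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        paths = []
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}.{key}" if path else key
            paths.extend(diff_paths(expected.get(key), actual.get(key), child))
        return paths
    return [] if expected == actual else [path]


def find_drift(local, remote, mappings):
    """List paths where remote no longer matches the keyword-expanded local config."""
    return diff_paths(expand_tree(local, mappings), remote)


mappings = {"DOMAIN": "travel0.com"}
local = {"tenant": {"allowed_logout_urls": [
    "http://localhost:3000/logoutCallback", "https://##DOMAIN##/logout"]}}
remote = {"tenant": {"allowed_logout_urls": [
    "http://localhost:3000/logoutCallback", "http://pr-branch-45.travel0.com/logout"]}}

drift = find_drift(local, remote, mappings)
print(drift)  # ['tenant.allowed_logout_urls']
```

A CI job could fail the build whenever the returned list is non-empty. Note that this sketch compares lists in order and reports fields that exist only on remote as drift, so a real check would likely need to account for both.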

During cases of “problematic reconciliation”, is it sensible to prefer the preservation of local configuration values over remote values?

As described above, I think it would be easier to review the changes if remote values would be preferred.

@willvedd
Contributor Author

willvedd commented Mar 3, 2023

Copy-pasting my comment on the original Github issue:

The initial iteration of this feature is complete and released in v7.17.0.

Goal:
Preserve the majority of manually-placed ##KEYWORD_REPLACE## markers in the configuration files when performing subsequent exports. Otherwise, they will be overwritten with the remote values and require the toilsome step of re-adding them.

Usage:
Keyword preservation is an opt-in feature that can be enabled through the AUTH0_PRESERVE_KEYWORDS boolean configuration property (docs).

Prerequisites:
To leverage the keyword preservation feature, the following criteria must be satisfied:

  • Presence of local configuration files at the same location as the export target
  • Defined keyword replace mappings via the AUTH0_KEYWORD_REPLACE_MAPPINGS configuration property

Limitations:
The keyword preservation functionality will attempt to preserve as many keywords as possible while also maintaining the accuracy of your resource configuration files. In the majority of cases, it will work without any intervention by the user. However, some key limitations exist:

  • In the case of a keyword-replaced configuration field with differing values between local and remote, the local configuration value will always be favored. This will
  • Arrays without specific identifiers are not eligible for preservation. Ex: [ "http://site.com/logout", "localhost:3000/logout", "##LOGOUT_URL##" ]. This is because the ordering of these values is non-deterministic. To preserve these values instead, it is recommended to use the @@ARRAY_REPLACE@@ keyword replace syntax for the entire value.
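
To illustrate the second limitation, the Python sketch below shows why replacing the entire array makes preservation tractable: the whole field is compared against a single mapping instead of matching unordered entries one by one. It assumes the @@KEY@@ form substitutes the JSON encoding of the mapped value; the key name LOGOUT_URLS and both functions are hypothetical, and this is not the Deploy CLI's actual implementation.

```python
import json


def apply_array_keywords(text, mappings):
    """Replace @@KEY@@ (quoted or bare) with the JSON encoding of the mapped value."""
    for key, value in mappings.items():
        encoded = json.dumps(value)
        text = text.replace(f'"@@{key}@@"', encoded).replace(f"@@{key}@@", encoded)
    return text


def preserve_array_marker(key, remote_value, mappings):
    """Keep the @@KEY@@ marker when the remote array equals the mapped array exactly."""
    return f"@@{key}@@" if mappings.get(key) == remote_value else remote_value


mappings = {"LOGOUT_URLS": ["https://travel0.com/logout",
                            "http://localhost:3000/logout"]}

# Import direction: the marker expands to the whole array.
print(apply_array_keywords("allowed_logout_urls: @@LOGOUT_URLS@@", mappings))

# Export direction: an unchanged remote array maps back to the marker.
print(preserve_array_marker(
    "LOGOUT_URLS",
    ["https://travel0.com/logout", "http://localhost:3000/logout"],
    mappings))
```

The comparison in this sketch is order-sensitive, so a remote array returned in a different order would not map back to the marker.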

Final Words:
Despite the above limitations, we believe this feature will remove the majority of the toil related to managing overwritten keyword replace markers. And best of all, it should be transparent to the developer in most cases. Keep in mind that this is only the first iteration. Despite extensive testing, adjustments may need to be made to accommodate real-world usage. Also, the functionality is expected to be iterated on over time to best suit the needs of developers.

Appreciate everyone's patience and feedback on this one, really excited to see how it's received!
