Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XWIKI-22726: Allow customizing the validation of HQL queries through configuration #3747

Merged
merged 4 commits into from
Dec 13, 2024

Conversation

tmortagne
Copy link
Member

Jira URL

https://jira.xwiki.org/browse/XWIKI-22726

Changes

Description

  • introduce an (internal for now) extension point to provide HQL query validators in charge of check if a query is safe to be allowed for user who don't have programming right
  • move the current validation to the new DefaultHQLQueryValidator
  • introduce a regex based ConfigurableHQLQueryValidator and its corresponding xwiki.properties documentation

Clarifications

  • while it would be interesting to make a similar extension point public one day, I don't have the required time to spend on designing the cleanest extension point possible for this (we probably need something more generic)

Executed Tests

TODO: add regex related tests

Expected merging strategy

  • Prefers squash: Yes
  • Backport on branches: 16.10.x, 16.4.x, 15.10.x

@tmortagne tmortagne marked this pull request as draft December 10, 2024 16:53
#-#
# query.hql.safe=select pro1, prop2 from CustomTable
# query.hql.safe=select\\s+((prop1|prop2|prop3)\\s*,?\\s*)+\\s+from MyCustomTable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the config is internal, we shouldn't expose nor document it here. The mere fact to describe it here makes it an official configuration that people may use.

Copy link
Member Author

@tmortagne tmortagne Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is internal, but this does not say anything about the configuration (most of the configuration is used by internal code). To me, this is a public configuration that people may use (I'm not sure an internal configuration make much sense in general, in fact).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so the config is public but the api is internal. Hmm... that's dangerous since it's going to be hard to change the key in the future if it's needed. For example if we allow to have several implementations, we need some hint in the key.

Something like: query.hql.validation.standard.safe|unsafe, where standard is the hint of the impl, and safe|unsafe are some config properties of that impl.

WDYT?

Thx

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the code works is that it asks all implementations, not just one. And the first one that answers positively or negatively wins. So there is no reason to configure one implementation, we need the ability to have several implementations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thx Michael. Are you saying that all impls need to support the safe and unsafe config parameters (or that they are independent of the impls), and thus that impls don't need to support any config options (ie that the safe and unsafe params are not config params of the default impl)?

Copy link
Member Author

@tmortagne tmortagne Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all impls need to support the safe and unsafe config parameters

No. One of the validators is a configuration based validator and is the one which reads this. Other validators are doing completely different things. It does not make any sense having the hint of the validator (which is currently not an API) in the configuration key.

To me, the only thing that needs to be discussed is the name of the configuration key from the point of view of the user of the configuration. And I really don't think including in the key implementation details (like the fact that some specific validator is reading that configuration and the name of the validator) is a good idea.

}

/**
* @param statementString the statement to evaluate
* @return true if the select is allowed for user without PR
* @deprecated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing documentation.

Copy link
Member Author

@tmortagne tmortagne Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really have any alternative to put there, actually. It's technically an API (because it's protected) so I have to keep it, but I don't think it was really the intent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make sense to point to the new HQLStatementValidator, even if it is internal? Or to make that interface non-internal?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make sense to point to the new HQLStatementValidator, even if it is internal?

I'm afraid some people might actually use an internal tool without thinking about it if that's what we recommend (since that's what this javadoc is about).

Or to make that interface non-internal?

I really don't want to start an infinite debate on if it's good enough to be public (I would prefer something more generic myself).

#-# Note: be very careful to not use too large regular expression when adding a safe pattern as it could introduce a vulnerability.
#-#
# query.hql.safe=select pro1, prop2 from CustomTable
# query.hql.safe=select\\s+((prop1|prop2|prop3)\\s*,?\\s*)+\\s+from MyCustomTable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small typos and ideas of improvements:

  1. Grammar error. Proposal:

    While HQL queries executed by users without programming right are already validated, it's sometimes necessary to [...]

  2. It would be even nicer to give a real use case example.

  3. Small grammar error. s/allows/allow (since there's more than 1 property)

  4. Small grammar error. s/expression/expressions (it's plural since there's a list).

  5. s/to considering it unsafe/to the unsafe pattern or s/to applying the unsafe pattern

  6. s/select pro1/select prop1

Thx

Copy link
Member Author

@tmortagne tmortagne Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be even nicer to give a real use case example.

To me, the example which indicate a way to allow selecting from a custom table you introduced and consider safe to read is a real use case. Any example that would be based on a standard table would probably mean that there is a bug in the default validator.

@tmortagne tmortagne marked this pull request as ready for review December 11, 2024 18:10
@tmortagne tmortagne force-pushed the feature-XWIKI-22726 branch 2 times, most recently from db0867d to d5b0b51 Compare December 13, 2024 08:55
Copy link
Contributor

@michitux michitux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, just some small comments mainly about the documentation.

*
* @version $Id$
* @since 1.6M1
*/
@SuppressWarnings("checkstyle:ClassFanOutComplexity")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we put an explanation why it is okay to put the ignore rule?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"I tried but it's hard" did not felt like a compelling argument.

}

/**
* @param statementString the statement to evaluate
* @return true if the select is allowed for user without PR
* @deprecated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make sense to point to the new HQLStatementValidator, even if it is internal? Or to make that interface non-internal?

*
* @version $Id$
* @since 17.0.0RC1
* @since 16.10.2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder (applies to many other places, too) if we should really indicate 17.0.0RC1 as since-version if we already have 16.10.2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to prefer being very explicit in the code and indicate each branch where it was committed. There is no release note related confusion involved at this level.

…configuration

* refactor the internal validator system to make it easier to reusable for other use cases
* make sure the configurable validator is executed before the standard one
* improve safe named queries initialization
* tried to remove stuff from HqlQueryExecutor, but was not enough to get rid of the ClassFanOutComplexity unfortunately
* various documentation improvements
…configuration

* rename a component hint to be more consistent with the class name
* fail the whole validator if any of the patterns is wrong
* add some validator priority related documentation
@tmortagne
Copy link
Member Author

Thanks a lot for the input @michitux and @vmassol. Merging.

@tmortagne tmortagne merged commit 620256c into master Dec 13, 2024
3 checks passed
@tmortagne tmortagne self-assigned this Dec 13, 2024
Copy link

The backport to stable-15.10.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-stable-15.10.x stable-15.10.x
# Navigate to the new working tree
cd .worktrees/backport-stable-15.10.x
# Create a new branch
git switch --create backport-3747-to-stable-15.10.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 620256c23459ca643ad15269d3979e15ccb9ca8a
# Push it to GitHub
git push --set-upstream origin backport-3747-to-stable-15.10.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-stable-15.10.x

Then, create a pull request where the base branch is stable-15.10.x and the compare/head branch is backport-3747-to-stable-15.10.x.

Copy link

The backport to stable-16.4.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-stable-16.4.x stable-16.4.x
# Navigate to the new working tree
cd .worktrees/backport-stable-16.4.x
# Create a new branch
git switch --create backport-3747-to-stable-16.4.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 620256c23459ca643ad15269d3979e15ccb9ca8a
# Push it to GitHub
git push --set-upstream origin backport-3747-to-stable-16.4.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-stable-16.4.x

Then, create a pull request where the base branch is stable-16.4.x and the compare/head branch is backport-3747-to-stable-16.4.x.

github-actions bot pushed a commit that referenced this pull request Dec 13, 2024
…configuration (#3747)

* introduce an (internal) extensible validation system HQL queries
* introduce a new validator based on configurable regex of safe/unsafe regex patterns

(cherry picked from commit 620256c)
tmortagne pushed a commit that referenced this pull request Dec 13, 2024
…configuration (#3747) (#3763)

* introduce an (internal) extensible validation system HQL queries
* introduce a new validator based on configurable regex of safe/unsafe regex patterns

(cherry picked from commit 620256c)
tmortagne added a commit that referenced this pull request Dec 13, 2024
…configuration (#3747)

* introduce an (internal) extensible validation system HQL queries
* introduce a new validator based on configurable regex of safe/unsafe regex patterns

(cherry picked from commit 620256c)
tmortagne added a commit that referenced this pull request Dec 13, 2024
…configuration (#3747)

* introduce an (internal) extensible validation system HQL queries
* introduce a new validator based on configurable regex of safe/unsafe regex patterns

(cherry picked from commit 620256c)
@tmortagne tmortagne deleted the feature-XWIKI-22726 branch December 13, 2024 13:01
tmortagne added a commit that referenced this pull request Dec 16, 2024
tmortagne added a commit that referenced this pull request Dec 16, 2024
…configuration (#3747)

* ignore Sonar false positive

(cherry picked from commit 494f8ec)
tmortagne added a commit that referenced this pull request Dec 16, 2024
…configuration (#3747)

* ignore Sonar false positive

(cherry picked from commit 494f8ec)
tmortagne added a commit that referenced this pull request Dec 16, 2024
…configuration (#3747)

* ignore Sonar false positive

(cherry picked from commit 494f8ec)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants