-
-
Notifications
You must be signed in to change notification settings - Fork 581
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Behavior when attempting to *override* an official dialect is not well defined when recursing into subschemas containing the official dialect's $schema identifier #994
Comments
Hi. Thanks. I'm not sure when I'll have time to dig through asdf, so it'd certainly help if you could simplify this to something that doesn't involve asdf by diagnosing which usage of this library leads to behavior you don't expect (e.g. perhaps by producing a self contained bit of code that has this behavior). Otherwise I'm happy to leave this open and whenever I get a chance to try to help. |
@Julian, sorry for not creating an asdf independent reproducer yesterday. The following is a reproducer of the issue, though I think it can be simplified: import jsonschema.validators as mvalidators
def check_schema(meta_schema, schema, validate_default=True):
validators = mvalidators.Draft4Validator.VALIDATORS.copy()
if validate_default:
def _validate_default(validator, default, instance, schema):
# Do something drastic to demonstrate this function getting executed
raise RuntimeError("This error should be raised")
validators.update({"default": _validate_default})
cls = mvalidators.create(
meta_schema=meta_schema,
validators=validators,
type_checker=mvalidators.Draft4Validator.TYPE_CHECKER,
id_of=mvalidators.Draft4Validator.ID_OF,
)
validator = cls(meta_schema)
validator.validate(schema)
s = {"type": "object", "properties": {"a": {"type": "integer", "default": "foo"}}}
meta_schema = {
"$schema": "http://json-schema.org/draft-04/schema",
"allOf": [
{"$ref": "http://json-schema.org/draft-04/schema"},
]
}
# Attempt to run the `_validate_default` validator
try:
check_schema(meta_schema, s)
except RuntimeError:
# We want the error to be raised to demonstrate `_validate_default` being run.
pass
else:
raise RuntimeError("An Error should have been raised")
# Run again without the `_validate_default` validator
check_schema(meta_schema, s, False) Under jsonschema 4.9.1 (and below to at least 4.0.1) this will run without issue. However in jsonschema 4.15.0 (and 4.10.0, though I have not tried any of the versions in between), this does fail with:
This shows that the |
Note that changing jsonschema/jsonschema/validators.py Line 238 in 2f15a98
to result = NewValidator(**changes)
result.VALIDATORS.update(self.VALIDATORS)
return result Will fix the first part ( |
Nice! No worries, that looks great, will have a look in the morning, thanks for it. |
A better fix than the one in #994 (comment) is to replace jsonschema/jsonschema/validators.py Line 228 in 2f15a98
BaseValidator = validator_for(schema, default=cls)
NewValidator = extend(
validator=BaseValidator,
validators=self.VALIDATORS
) It seems to pass my through my reproducer but fails this unit test: jsonschema/jsonschema/tests/test_validators.py Lines 1510 to 1531 in 2f15a98
This is a bit ironic, since I was one of the ones that raised the issue that prompted that unit test. |
Better is relative, it appears to have created a recursion error (don't have time to explore this today) in a different (mostly unrelated) part of asdf, and caused asdf to benchmark significantly slower. |
I have to look a bit closer, but part of what's here is expected behavior after the bugfix -- specifically your meta schema is saying it wants to follow the rules of draft 4 (by claiming its |
The bit of asdf at issue is here. In asdf, we require all schemas to follow the our own "yaml-schema" meta-schema which is asdf's extension of the draft 4 meta-schema. Indeed, "yaml-schema" is supposed to be written in a way which makes it a valid schema under draft 4. Note that much of the relevant asdf code I stripped out was mostly the code necessary to guess (if needed) and load asdf's meta-schema into the python objects that jsonschema consumes. This means basically had to hand code asdf's meta-schema for the example to work, which resulted in me adding the minimum portions of the yaml-schema to recreate the issue. The reason for using |
Tagging @eslavich for comments about asdf. |
@WilliamJamieson I'm confused by this example: The |
Here's a similar but I think more realistic example: import jsonschema.validators as mvalidators
def check_schema(meta_schema, schema, data, validate_default=True):
validators = mvalidators.Draft4Validator.VALIDATORS.copy()
if validate_default:
def _validate_default(validator, default, instance, schema):
# Do something drastic to demonstrate this function getting executed
raise RuntimeError("This error should be raised")
validators.update({"default": _validate_default})
cls = mvalidators.create(
meta_schema=meta_schema,
validators=validators,
type_checker=mvalidators.Draft4Validator.TYPE_CHECKER,
id_of=mvalidators.Draft4Validator.ID_OF,
)
validator = cls(schema)
validator.validate(data)
data = {"a": 14}
s = {"type": "object", "properties": {"a": {"type": "integer", "default": "foo"}}}
meta_schema = {
"$schema": "http://json-schema.org/draft-04/schema",
"allOf": [
{"$ref": "http://json-schema.org/draft-04/schema"},
]
}
# Attempt to run the `_validate_default` validator
try:
check_schema(meta_schema, s, data, True)
except RuntimeError:
# We want the error to be raised to demonstrate `_validate_default` being run.
pass
else:
raise RuntimeError("An Error should have been raised")
# Run again without the `_validate_default` validator
check_schema(meta_schema, s, data, False) It succeeds on all of jsonschema 4.9, 4.10, and 4.15. |
Succeeds or fails? Are you saying it's working or broken? |
@eslavich I think you have changed what is being validated. What is breaking is our validation of |
I'm having the same issue since 4.10. We are using this recipe to set default values while validating, which works great on 4.9. On 4.10+ though, it loses the special This behavior makes sense to me, the default filling was on top of the draft4 schema that we are using, (and that was declared in the $schema,) but I'm not sure what the best way to override this behavior is and force the custom default filling to descend through the $refs. |
I can try to have another look at this next week |
Well, I think I just figured a reasonable workaround. Declaring that our custom extended validator validates the default schema. Not sure if this should just be added to the docs, or if there is a better way. CURRENT_SCHEMA_VERSION = 'http://json-schema.org/draft-04/schema#'
# Make a couple validators that have some extensions to the default draft4 validator
SchemaValidator = jsonschema.validators.extend(jsonschema.Draft4Validator, validators)
SchemaValidatorWDefaults = jsonschema.validators.extend(SchemaValidator, {'properties': validate_properties_w_defaults})
...
# Right before using any of our custom validators, declare that they validate the default draft4 schema
if set_defaults:
jsonschema.validators.validates(CURRENT_SCHEMA_VERSION)(SchemaValidatorWDefaults)
validator = SchemaValidatorWDefaults(schema, resolver=resolver, format_checker=format_checker)
else:
jsonschema.validators.validates(CURRENT_SCHEMA_VERSION)(SchemaValidator)
validator = SchemaValidator(schema, resolver=resolver, format_checker=format_checker)
# Use the validator EDIT: We didn't want to declare a custom $schema uri, because our schemas get used by some external tools in Javascript world where the default behavior is fine. Also, because depending on the context, we sometimes enable the default filling, and sometimes not, as can probably be inferred from above snippet. |
ASDF managed to resolve this issue with asdf-format/asdf#1203. Those changes maybe useful to resolving the issue. |
I tweaked my solution a little bit to make sure that the validator that sets defaults is unregistered right after being used. It caused problems in other places in our code if it remained installed. |
The "right" solution should likely involve using some new dialect URI (i.e. not |
Yeah, I figured that was the 'right' solution, and I'm pretty sure it would work. It was just an inconvenient solution on our end, since we don't want to change the validation semantics, just add the defaults processing sometimes, and we are using separate libraries on JS side, that would also have to be taught about the new dialect URI. I'm okay with current behavior, since we can forcefully override the validator for a given dialect URI (even if that isn't a recommended solution.) Might look in to fixing it properly on our end later when I have time/motivation. |
The documentation for
I have been using Note that the semantics of my callables are exactly the same as draft 4 semantics - my callables are just adding a few extra attributes to validation errors, so that I can make better error messages (viz #119). I don't use @gazpachoking's solution in #994 (comment) works, though I agree this is a kind of weird workaround. |
To reproduce: import jsonschema
def validate_type(validator, value, instance, schema):
raise Exception("hit")
NewValidator = jsonschema.validators.extend(jsonschema.Draft4Validator, {"type": validate_type})
validator = NewValidator({
"$schema": "http://json-schema.org/draft-04/schema#",
# Note: The referenced schema has "$schema": "http://json-schema.org/draft-04/schema#".
"$ref": "https://standard.open-contracting.org/schema/1__1__5/release-schema.json"
})
for error in validator.iter_errors({"ocid": 0, "id": "1", "date": "", "tag": ["tender"], "initiationType": "tender"}):
# No exception raised.
print(error.validator) Before jsonschema 4.10, an exception would be raised (i.e. I don't see a scenario where a user would want to override the validators for a top-level schema, but not the validators for referenced schema. If I'm using I suppose the |
I can try to have another look at this issue once 4.18 is out --
This line means "if you define a validator called Redefining the semantics of what it means to be draft 4 is what changed, it's hard to say how much of that is "supported behavior" or not but it certainly was a bugfix with regard to (specified via the specification) correcting the behavior when referencing some dialect in a
I don't think anything like that is the case right now, it's purely the case where you write
you get Draft4Validator, not yours, so you're already doing the equivalent of passing some validator's schema to some other validator (e.g. passing a draft 6 schema to the draft 4 validator). But all the above being said, as I say, it's possible there's a way to still make this work, or that an API tweak is needed. |
It sounds like the solution to this (perhaps narrow) use case is to use |
#1197 has another example of this kind of behavior: from referencing import Registry, Resource
from referencing.jsonschema import DRAFT7
from collections.abc import Mapping
import jsonschema
import collections
class Config(collections.ChainMap):
def __init__(self, *maps):
super().__init__(*maps)
def __getitem__(self, key):
return self._chained_getitem(key)
def _chained_getitem(self, key, **kwargs):
"""
Actual implementation of ``__getitem__`` with chained behavior.
When returning a new instance of `DeepChainMap`, the ``kwargs`` are
passed to the constructor. Subclasses should override this method
instead of ``__getitem__`` in order to pass additional ``__init__``
args.
"""
chained_values = []
for m in self.maps:
try:
chained_values.append(m[key])
except KeyError:
continue
if not chained_values:
raise KeyError(key)
first = chained_values[0]
# Note: Although instances of dict are also instances of Mapping,
# isinstance(x, dict) is much faster than isinstance(x, Mapping), and
# dict being the most common case we explicitly check for it first.
if not isinstance(first, (dict, Mapping)):
return first
nested = []
for m in chained_values:
if isinstance(m, (dict, Mapping)):
nested.append(m)
else:
break
return self.__class__(*nested, **kwargs)
config = Config({'a': {'b': 'c', 'd': {}}})
main_schema = {
'$id': 'main_schema',
'type': 'object',
'properties':
{'a': {'properties': {'b': {},
'd': {}},
'$ref': 'schema_1'}
}
}
schema_1 = {'type': 'object',
'properties': {'b': {'enum': ['c']},
'd': {'$ref': 'd_schema'}},
'required': ['b']}
schema_2 = {'$id': 'schema_2',
'fixed_inputs': {
'type': 'object',
'default': {},
'properties': {'e': {'type': 'integer', 'minimum': 1}}
}
}
d_schema = {'$schema': 'http://json-schema.org/draft-07/schema#',
'$id': 'd_schema',
'$ref': 'schema_2#/fixed_inputs'}
def retrieve(uri: str):
if uri == 'schema_1':
contents = schema_1
elif uri == 'schema_2':
contents = schema_2
elif uri == 'd_schema':
contents = d_schema
return Resource.from_contents(contents, DRAFT7)
registry = Registry(retrieve=retrieve)
def is_my_object(checker, instance):
return (
isinstance(instance, (dict, Mapping))
)
type_checker = jsonschema.Draft7Validator.TYPE_CHECKER.redefine(
"object", is_my_object,
)
CustomValidator = jsonschema.validators.extend(
jsonschema.Draft7Validator,
type_checker=type_checker,
)
validator = CustomValidator(schema=main_schema, registry=registry)
validator.validate(config) |
This is related to part of #981 (comment); namely, the broken validation of
default
values in asdf. However, I believe the issue itself originates in theevolve
:jsonschema/jsonschema/validators.py
Lines 222 to 238 in 2f15a98
Essentially, we are creating a a new validator using the
Draft4Validator
, which contains an additionalvalidators
function (under thedefault
keyword). See https://github.com/asdf-format/asdf/blob/bf954a3921df6b3e5f36c211c18257ac97d4f423/asdf/schema.py#L705-L750 in the asdf code. However, it appears that the additionaldefault
validator is being lost when attempting to validate something. The following should produce aValidationError
with the latest version of asdf:Indeed, if one has jsonschema 4.9.1 installed, one gets a
ValidationError
; however, if one has jsonschema 4.15.0 installed, one gets no validation error.After carefully stepping the the code as it is running with jsonschema 4.15.0 I have found that at some point while iterating, the
evolve
method returns a validator class which does not havedefault
in itsVALIDATORS
attribute when it should have one. Making it so that there is no validator available when it does try to validatedefault
. This occurs whenvalidator_for
does not returnDefaultValidator
:jsonschema/jsonschema/validators.py
Line 1143 in 2f15a98
Draft4Validator
which does not containdefault
in itsVALIDATORS
dictionary. Meaning the validator returned byevolve
has now lost thedefault
validator for further iterations.The text was updated successfully, but these errors were encountered: