XMLSchemaException does not catch all exceptions #404

Open
nuntius35 opened this issue Jun 22, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@nuntius35

Some errors in an XSD file raise xml.etree.ElementTree.ParseError instead of an XMLSchemaException. The documentation of XMLSchemaException claims that it lets you catch all exceptions.

Example

File: invalid_schema.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning" vc:minVersion="1.1">
</xsd:schem>

File: validate_schema.py

"""Loads an xsd file"""
import xmlschema


def main():
    """Exception raised by invalid schema is expected to be caught"""
    try:
        xmlschema.XMLSchema11("invalid_schema.xsd")
    except xmlschema.XMLSchemaException as e:
        print(e)


if __name__ == '__main__':
    main()

Run the Python script with: python validate_schema.py

Expected behaviour: XMLSchema11 raises e.g. XMLSchemaValidatorError and the exception is caught.

Actual behaviour: the exception xml.etree.ElementTree.ParseError is raised and not caught.
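
The same error can be reproduced with the standard library alone, which suggests it comes from the XML parsing step rather than from schema validation. This is a minimal sketch, not part of the original report: the helper name parse_error_for is hypothetical, and the schema text is inlined as bytes instead of being read from invalid_schema.xsd.

```python
from xml.etree import ElementTree

# Same content as invalid_schema.xsd: the closing tag is misspelled.
INVALID_SCHEMA = b"""<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning" vc:minVersion="1.1">
</xsd:schem>
"""


def parse_error_for(source: bytes):
    """Return the ParseError raised while parsing `source`, or None if it parses."""
    try:
        ElementTree.fromstring(source)
    except ElementTree.ParseError as err:
        return err
    return None


error = parse_error_for(INVALID_SCHEMA)
print(type(error).__name__, "-", error)
# ParseError derives from SyntaxError, not from any xmlschema class,
# so "except xmlschema.XMLSchemaException" cannot match it.
print(isinstance(error, SyntaxError))
```

Until the library wraps this error, a caller could presumably catch both cases by listing ParseError next to xmlschema.XMLSchemaException in the except clause.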

@brunato
Member

brunato commented Jun 23, 2024

Hi,
the documentation says that the base exception "let you catch all the errors generated by the library". This case is on the boundary because it is a ParseError from the ElementTree library.

Anyway, it's worth thinking about, because other errors are caught and re-raised.

In this case the error is in the syntax of the XML source, so the library should raise an XMLResourceError or an exception derived from it. For v4.0, XMLResource will be extended to also support parsing with lxml and custom URL openers. All XML data access in the library is delegated to this class, so giving it its own error type hierarchy could help to distinguish XML data access/parsing errors from XML validation errors.

A possible design could be:

from xml.etree.ElementTree import ParseError
from xmlschema import XMLSchemaException


class XMLResourceError(XMLSchemaException):
    """Base class that lets you catch all the errors generated by an XML resource."""


class XMLResourceOSError(XMLResourceError, OSError):
    """Raised when an error is found accessing an XML resource."""


class XMLResourceParseError(XMLResourceError, ParseError):
    """Raised when an error is found parsing an XML resource."""


class XMLResourceBlocked(XMLResourceError):
    """Raised when an XML resource access is blocked by security settings."""


class XMLResourceForbidden(XMLResourceError):
    """Raised when the parsing of an XML resource is forbidden for safety reasons."""

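Deriving XMLResourceParseError from ParseError instead of SyntaxError preserves backward compatibility: code that already catches ParseError keeps working, and since ParseError is itself a subclass of SyntaxError, so does code that catches SyntaxError.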
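
To illustrate the compatibility point, here is a rough sketch of how a re-raised parse error could be caught under either the old or the new exception type. It is not the library's implementation: XMLSchemaException is replaced by a local stand-in so the example runs with the standard library only, and parse_resource and caught_by are hypothetical helpers.

```python
from xml.etree.ElementTree import ParseError, fromstring


class XMLSchemaException(Exception):
    """Stand-in for xmlschema.XMLSchemaException (assumption, stdlib-only sketch)."""


class XMLResourceError(XMLSchemaException):
    """Base class for errors on an XML resource (from the proposal above)."""


class XMLResourceParseError(XMLResourceError, ParseError):
    """Raised when an error is found parsing an XML resource."""


def parse_resource(source: bytes):
    """Hypothetical helper: parse XML, re-raising ParseError as XMLResourceParseError."""
    try:
        return fromstring(source)
    except ParseError as err:
        wrapped = XMLResourceParseError(str(err))
        # Carry over the expat details that ElementTree attaches to ParseError.
        wrapped.code = err.code
        wrapped.position = err.position
        raise wrapped from err


def caught_by(exc_type, source: bytes) -> bool:
    """Return True if parsing `source` raises an error that `exc_type` catches."""
    try:
        parse_resource(source)
    except exc_type:
        return True
    return False


BROKEN = b"<root></rot>"  # mismatched closing tag

for exc_type in (XMLSchemaException, ParseError, SyntaxError):
    print(exc_type.__name__, caught_by(exc_type, BROKEN))
```

Each line prints True: the wrapped error is caught by the library's base exception as well as by the handlers that existing callers may already have for ParseError or SyntaxError.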

@brunato brunato added the enhancement New feature or request label Jul 30, 2024