-
-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
change: Stop caching parsed docstrings #340
Comments
Since we'll want to render sections from objects to strings, section items ( Additionally, to reduce the cost of docstring modifications, which will then involve parsing then redumping, as well as to prevent spurious warnings too early in the process, I suggest we rework our parsers to use two phases:
Blocks should be yielded alongside their section kind, as well as their ending (starting?) line numbers. This will allow these use-cases:
|
Hmmm, I'm in a rabbit hole. Initially I wanted to stop caching docstrings, because extensions would modify the parsed sections themselves, and that would prevent changing (fixing?) the style later, for example with a local mkdocstrings option (see mkdocstrings/python#183). But. If instead we tell the extension to parse and re-dump, that doesn't change anything. It will dump something with the wrong style. Even if we just insert contents without parsing, it will split and insert with the wrong style. Ultimately, what this means, is that not caching doesn't solve the linked issue at all. What would solve the issue is to properly assign the right style to each docstring before they are ever parsed.
import re
import griffe
class ApplyDocstringStyle(griffe.Extension):
def __init__(self, regex: str = "<!-- style: (google|numpy|sphinx) -->") -> None:
self.regex = re.compile(regex)
def on_instance(self, *, obj: griffe.Object, **kwargs) -> None:
if obj.docstring:
if match := self.regex.search(obj.docstring.value):
obj.docstring.parser = match.group(1)
import re
import griffe
class ApplyDocstringStyle(griffe.Extension):
def __init__(self, regex: str = ".*# style: (google|numpy|sphinx)$") -> None:
self.regex = re.compile(regex)
def on_instance(self, *, obj: griffe.Object, **kwargs) -> None:
if obj.docstring:
if match := self.regex.search(obj.docstring.source):
obj.docstring.parser = match.group(1)
import griffe
from fnmatch import fnmatch
# path.to.obj1: google
# path.to.obj2: numpy
# path.to.obj3.*: sphinx
# path.to.obj4*: google
class ApplyDocstringStyle(griffe.Extension):
def __init__(self, config: dict[str, str]):
self.instances = {}
self.globs = {}
for key, value in config.items():
if "*" in key:
self.globs[key] = value
else:
self.instances[key] = value
def on_instance(self, *, obj: griffe.Object, **kwargs) -> None:
if obj.path in self.instances:
if obj.docstring:
obj.docsring.parser = self.instances[obj.path]
else:
for pattern, style in self.globs:
if fnmatch(obj.path, pattern):
if obj.docstring:
obj.docstring.parser = style So in the end, we'll continue caching parsed sections. We can still rework our parsers as described above:
|
I'm really in a rabbit hole. With this recent change: 88fb6b6, parsing docstrings now computes inherited members. We stopped caching those to prevent issues, and updated both Griffe and mkdocstrings-python to avoid perf hits. But parsed docstrings are still cached, and could be cached too early (extension parsing and modifying cached sections). This would only be an issue in the scenario of mkdocstrings/python#175, but this is still annoying. So we might want to provide an API that lets extensions modify docstrings without parsing them, or at the very least without caching them. |
Is your change request related to a problem? Please describe.
Cached parsed docstrings are causing more issues than they solve. See mkdocstrings/python#183. Cache invalidation of course! The second of the two hardest problems in computer science, after naming things and before off-by-1 errors.
Templates can very easily iterate on the sections once. Other programmatic uses can store the sections in a variable. No need to cache it. I think I initially decided to cache it because it would allow extensions to easily modify sections, which are data structures, instead of having to mess with the raw docstring. But we can solve this too, see next section.
Describe the solution you'd like
Instead of caching the parsed docstrings, provide utilities to make it as easy to modify raw docstrings. Typically:
docstring.append(DocstringSectionStuff(...))
. Here, we don't care what style is used:append
will internally parse the docstring using the attached style (or auto-infer it when the feature is available), append the section, and re-dump everything.append
is just a suggestion, we should design a good API. For example, and related to perfs, we should allow appending several sections at once, etc. Context manager (transactions)?Describe alternatives you've considered
Not considering. Needs to be done.
Additional context
This will need a deprecation period of course, with warnings and all. We should also reach out to extension authors using cached parsed sections. We use this ourselves in griffe-warnings-deprecated I believe.
The text was updated successfully, but these errors were encountered: