Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change: Consider removing fallback feature #61

Closed
pawamoy opened this issue Jan 8, 2025 · 1 comment
Closed

change: Consider removing fallback feature #61

pawamoy opened this issue Jan 8, 2025 · 1 comment
Assignees
Labels
refactor Change suggestion, not a bug nor a feature.

Comments

@pawamoy
Copy link
Member

pawamoy commented Jan 8, 2025

Is your change request related to a problem? Please describe.

Two or three years ago, I committed this change into mkdocstrings: mkdocstrings/mkdocstrings@c7ac043.

fix: Don't preemptively register identifiers as anchors

Registering every identifier of a rendered object is wrong,
since only one of these identifiers actually appear on the page.
What it tried to achieve instead was to register every identifier
as pointing to the rendered anchor. But autorefs doesn't have
mechanisms to do that unambiguously, yet.

Turns out autorefs now has such a mechanism: its plugin's register_anchor method now accepts a third argument, anchor, in addition to identifier. This change in autorefs was brought to support Markdown anchors, a solution to create unique heading aliases and avoid undeterministic resolution of cross-references when their identifiers appear on several pages.

Thanks to this change, we can now register all the anchors of an object (its aliases) so that they point to the actually rendered anchor on the page. This change will be done in mkdocstrings' AutoDocProcessor._process_headings method:

for anchor in anchors:
    if anchor != rendered_anchor:
        self._autorefs.register_anchor(page, anchor, rendered_anchor)

With this change, the fallback mechanism becomes useless. Indeed, when we fallback, we simply ask the handler to provide the known anchors, and we try again with those. Since we will now register these anchors ahead of time, if an identifier isn't found, the fallback mechanism will not find it either.

All in all, if handlers provide all relevant anchors of a given object (through their get_anchors method), then we can be sure that we won't ever need to fallback. So we could actually remove this mechanism entirely. It was wrong anyway, since it would query all handlers for anchors, which could yield false-positives (identifier matching in another handler). This could be fixed since autoref HTML elements now carry a domain value, but there's no point in fixing it if we can remove it.

Describe the solution you'd like

Entirely remove the fallback mechanism. There will be a nice ripple effect in handlers too: they won't have to maintain a fallback_config attribute, and the associated logic in their collect method. The base handler can also be cleaned up, as well as the Handlers class.

Handlers previously handled "fallback collection" differently, because the fallback logic was also triggered by optional cross-references to external projects for which we don't have an inventory loaded. In these cases, we didn't want to try and load additional data from the disk, just to get anchors, since we knew for sure the corresponding objects would not be rendered in the docs anyway (and so we wouldn't be able to cross-ref them).

Without the fallback logic firing, we pretty much know that each object we try to collect when getting anchors for each heading rendered on a page is already collected (and possibly cached) or at least is not external, so we don't need any special value or parameter to do collect data conditionally.

Is there something fundamental I'm missing about languages other than Python, which would require that we keep the fallback mechanism? Based solely on the logic, I would say no: the fallback just means fetching aliases to try with them. If we already registered these aliases, no point in falling back again.

Describe alternatives you've considered

Fix the fallback mechanism to only query handlers of the relevant domain. I tried locally, it works, with just one deprecation message (signature of callables accepted by autorefs.get_fallback_anchor now accepting a second argument, the domain).

Additional context

We will first have to implement the changes mentioned above in mkdocstrings.

@pawamoy pawamoy added the refactor Change suggestion, not a bug nor a feature. label Jan 8, 2025
@pawamoy pawamoy self-assigned this Jan 8, 2025
pawamoy added a commit that referenced this issue Jan 10, 2025
This will be useful in mkdocstrings, where we want to register URLs for all aliases of a rendered object's identifier early in the process, so that we can drop the fallback mechanism in autorefs.

Primary URLs will take precedence when resolving cross-references, to avoid logging warnings about multiple URLs found.

For example:

- Object `a.b.c.d` has aliases `a.b.d` and `a.d`
- Object `a.b.c.d` is rendered.
- We register `a.b.c.d` -> page#a.b.c.d as primary
- We register `a.b.d` -> page#a.b.c.d as secondary
- We register `a.d` -> page#a.b.c.d as secondary
- Later, if `a.b.d` or `a.d` are rendered, we will register primary and secondary URLs the same way
- This way we are sure that each of `a.b.c.d`, `a.b.d` or `a.d` will link to their primary URL, if any, or their secondary URL, accordingly

Related-to-issue-61: #61
pawamoy added a commit that referenced this issue Jan 10, 2025
@pawamoy
Copy link
Member Author

pawamoy commented Jan 12, 2025

Code is marked for removal thanks to Yore comments, we can close.

@pawamoy pawamoy closed this as completed Jan 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Change suggestion, not a bug nor a feature.
Projects
None yet
Development

No branches or pull requests

1 participant