change: Consider removing fallback feature #61

pawamoy · 2025-01-08T21:05:55Z

Is your change request related to a problem? Please describe.

Two or three years ago, I committed this change into mkdocstrings: mkdocstrings/mkdocstrings@c7ac043.

fix: Don't preemptively register identifiers as anchors

Registering every identifier of a rendered object is wrong,
since only one of these identifiers actually appear on the page.
What it tried to achieve instead was to register every identifier
as pointing to the rendered anchor. But autorefs doesn't have
mechanisms to do that unambiguously, yet.

Turns out autorefs now has such a mechanism: its plugin's register_anchor method now accepts a third argument, anchor, in addition to identifier. This change in autorefs was brought to support Markdown anchors, a solution to create unique heading aliases and avoid undeterministic resolution of cross-references when their identifiers appear on several pages.

Thanks to this change, we can now register all the anchors of an object (its aliases) so that they point to the actually rendered anchor on the page. This change will be done in mkdocstrings' AutoDocProcessor._process_headings method:

for anchor in anchors:
    if anchor != rendered_anchor:
        self._autorefs.register_anchor(page, anchor, rendered_anchor)

With this change, the fallback mechanism becomes useless. Indeed, when we fallback, we simply ask the handler to provide the known anchors, and we try again with those. Since we will now register these anchors ahead of time, if an identifier isn't found, the fallback mechanism will not find it either.

All in all, if handlers provide all relevant anchors of a given object (through their get_anchors method), then we can be sure that we won't ever need to fallback. So we could actually remove this mechanism entirely. It was wrong anyway, since it would query all handlers for anchors, which could yield false-positives (identifier matching in another handler). This could be fixed since autoref HTML elements now carry a domain value, but there's no point in fixing it if we can remove it.

Describe the solution you'd like

Entirely remove the fallback mechanism. There will be a nice ripple effect in handlers too: they won't have to maintain a fallback_config attribute, and the associated logic in their collect method. The base handler can also be cleaned up, as well as the Handlers class.

Handlers previously handled "fallback collection" differently, because the fallback logic was also triggered by optional cross-references to external projects for which we don't have an inventory loaded. In these cases, we didn't want to try and load additional data from the disk, just to get anchors, since we knew for sure the corresponding objects would not be rendered in the docs anyway (and so we wouldn't be able to cross-ref them).

Without the fallback logic firing, we pretty much know that each object we try to collect when getting anchors for each heading rendered on a page is already collected (and possibly cached) or at least is not external, so we don't need any special value or parameter to do collect data conditionally.

Is there something fundamental I'm missing about languages other than Python, which would require that we keep the fallback mechanism? Based solely on the logic, I would say no: the fallback just means fetching aliases to try with them. If we already registered these aliases, no point in falling back again.

Describe alternatives you've considered

Fix the fallback mechanism to only query handlers of the relevant domain. I tried locally, it works, with just one deprecation message (signature of callables accepted by autorefs.get_fallback_anchor now accepting a second argument, the domain).

Additional context

We will first have to implement the changes mentioned above in mkdocstrings.

The text was updated successfully, but these errors were encountered:

This will be useful in mkdocstrings, where we want to register URLs for all aliases of a rendered object's identifier early in the process, so that we can drop the fallback mechanism in autorefs. Primary URLs will take precedence when resolving cross-references, to avoid logging warnings about multiple URLs found. For example: - Object `a.b.c.d` has aliases `a.b.d` and `a.d` - Object `a.b.c.d` is rendered. - We register `a.b.c.d` -> page#a.b.c.d as primary - We register `a.b.d` -> page#a.b.c.d as secondary - We register `a.d` -> page#a.b.c.d as secondary - Later, if `a.b.d` or `a.d` are rendered, we will register primary and secondary URLs the same way - This way we are sure that each of `a.b.c.d`, `a.b.d` or `a.d` will link to their primary URL, if any, or their secondary URL, accordingly Related-to-issue-61: #61

Issue-61: #61

pawamoy · 2025-01-12T21:48:59Z

Code is marked for removal thanks to Yore comments, we can close.

pawamoy added the refactor Change suggestion, not a bug nor a feature. label Jan 8, 2025

pawamoy self-assigned this Jan 8, 2025

pawamoy added a commit that referenced this issue Jan 10, 2025

refactor: Deprecate fallback mechanism

5e89cd8

Issue-61: #61

pawamoy closed this as completed Jan 12, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

change: Consider removing fallback feature #61

change: Consider removing fallback feature #61

pawamoy commented Jan 8, 2025 •

edited

Loading

pawamoy commented Jan 12, 2025

change: Consider removing fallback feature #61

change: Consider removing fallback feature #61

Comments

pawamoy commented Jan 8, 2025 • edited Loading

Is your change request related to a problem? Please describe.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

pawamoy commented Jan 12, 2025

pawamoy commented Jan 8, 2025 •

edited

Loading