mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-11 21:37:17 +00:00
Unsafe caching of a previous renderer #457
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @AndreyChechel on GitHub (Apr 29, 2021).
There seems to be an issue in the way how a previous renderer is cached in RendererBase.Write method.
Each registered renderer has an
Acceptmethod saying whether the renderer can process a particular object. Since theAcceptmethod is virtual, it's possible to create custom selective renderers. The problem with a cached renderer is that it's never tested whether it can accept another object or not - the implementation makes an assumption just by the object type, whereas a customAcceptmethod can have much more criteria to check.Moreover, even if the
Acceptmethod had invoked on the cached previous render before using it, we'd have still had an issue, because some renderer(s) located in the pipeline prior to the cached one could accept a new object while rejecting the previous object of the same type. That contemplation leads us to the conclusion that previous renderer cache will always be unsafe and risky to use.Here's an issue we ran into. We needed to emit custom markup for HTML anchors defined as
[label](#anchor). Those anchors are actually regularLinkInlineobjects. We wrote a custom renderer with an overriddenAcceptmethod checking that a link URL is an anchor URL, so it's only invoked for anchors. TheAnchorRendererwas placed beforeLinkInlineRendererin the pipeline, so theAnchorRendererwas supposed to be invoked for each link parsed from a markdown document. However, if a paragraph has two URLs: a regular one, and an anchor - the first URL is processed withLinkInlineRendererthat is cached and then reused for the anchor. This way theAnchorRendereris never actually invoked.The workaround is to create a parser that would produce an object of different type, smth like
AnchorLinkInlineso it's not picked up byLinkInlineRenderer. Which is fine, but requires more entities implemented and adds extra computation costs on parsing stage.@xoofx commented on GitHub (Apr 29, 2021):
Yes, it was a choice in design and yes, it was known that, wrongly used, you could get invalid behavior.
Yep, that's the reason why it was not like that 😉
So It's by design. The Accept can't rely on anything else than statically type information, not runtime information.
@xoofx commented on GitHub (Apr 29, 2021):
I mean, that whatever the solution is, it has a cost, dynamic dispatch of renderer based on runtime data was not a goal, and is left to specialized users that can re-implement an entirely new renderer pipeline if they really want to go that route.
@AndreyChechel commented on GitHub (Apr 29, 2021):
Thanks for the quick response.
I see your point regarding the performance - it totally makes sense. Well, one thing that can be still confusing is that the
Acceptmethod gets an actual object instance instead of the object type. Passing a type will likely eliminate potential issues with custom renderers misusing theAcceptmethod.Anyways, I'm glad that the issue is known and being taken into the account 👍
@MihaZupan commented on GitHub (Apr 29, 2021):
That sounds right. We should also fix the one location where we override this internaly:
0a0040450f/src/Markdig/Extensions/AutoLinks/NormalizeAutoLinkRenderer.cs (L14-L27)(
&& link.IsAutoLink;above)@xoofx commented on GitHub (Apr 29, 2021):
I can explain why: This was actually for performance reason 😉
Initially the method was taking Type, but Type.IsAssignableFrom was slower than performing a check through
isdirectly on the instance, hence the reason why the instance was passed. I haven't benchmarked this with a recent runtime, it could have changed, but yeah, that was - afair - the underlying reason@MihaZupan commented on GitHub (Apr 29, 2021):
This should now be a one-time cost per pipeline since we're caching the renderers (and the
renderersPerTypeis not being reset).@xoofx commented on GitHub (Apr 29, 2021):
the caching was already done I think at that time, but it was still a (small) perf hit. But we can double check for sure if it is no longer an issue.
@AndreyChechel commented on GitHub (Apr 29, 2021):
Adding a rough benchmark result for
isoperator vs.Type.IsAssignableFromfor .NET5@MihaZupan commented on GitHub (Apr 29, 2021):
It was already cached in the sense that we would only perform the check once per type/renderer. So for a very large document this would be just noise.
renderersPerTypeis rebuilt for each renderer instance, so now that the renderers are getting reused this will always be noise.