mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-08 13:54:54 +00:00
Add support for normalize - markdown document roundtrip #147
Originally created by @xoofx on GitHub (Oct 24, 2017).
Followup of #17 and #32 and #154
This is an issue to track the remaining work to handle full markdown document round-trip
6717be5)de5ed11963)GFMmarkdown, then the rest)GFM)GFM)
cc: @tthiery
@xoofx commented on GitHub (Oct 27, 2017):
So I have fixed a few cases with lists, for better handling of loose lists.
I have added support for escape characters.
I have also started to add `NormalizeOptions` to control a bit the markdown output.
The limit text width will require a bit more infrastructure work on the `TextRendererBase`, still nothing done for this (wip)
@xoofx commented on GitHub (Oct 27, 2017):
I have added support for HtmlBlock renderer. The normalize is able to process the CommonMark specs and the output is almost identical to the original which is already a good start.
@duncanawoods commented on GitHub (Sep 21, 2019):
`TaskLists` are marked as done but normalise actually strips the task syntax:
prints
@duncanawoods commented on GitHub (Sep 22, 2019):
Ah, I see, have to use the same pipeline:
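The snippet that followed is not preserved here. A minimal sketch of what "use the same pipeline" could look like, assuming the Markdig NuGet package and its `UseTaskLists()` extension; the class and method names are made up for illustration:

```csharp
using System.IO;
using Markdig;
using Markdig.Renderers.Normalize;

// Hypothetical helper, not part of Markdig.
public static class NormalizeWithPipeline
{
    public static string Normalize(string markdown)
    {
        // One pipeline, with the task-list extension enabled.
        var pipeline = new MarkdownPipelineBuilder()
            .UseTaskLists()
            .Build();

        // Parse with that pipeline so task-list nodes end up in the AST.
        var document = Markdown.Parse(markdown, pipeline);

        var writer = new StringWriter();
        var renderer = new NormalizeRenderer(writer);

        // Let the same pipeline register its extension renderers on the
        // normalizer; skipping this step leaves task-list nodes without a renderer.
        pipeline.Setup(renderer);

        renderer.Render(document);
        writer.Flush();
        return writer.ToString();
    }
}
```

Calling `NormalizeWithPipeline.Normalize("- [x] done\n- [ ] todo\n")` goes through both steps with a single pipeline instance, which is the point of the remark above.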
@tthiery commented on GitHub (Sep 22, 2019):
@duncanawoods so your issue is resolved?
@duncanawoods commented on GitHub (Sep 22, 2019):
Yep - thanks :)
My feedback is - the renderer API could be clearer e.g. taking the pipeline as an argument. The only way I found out I need to set up a pipeline was reading the internals while looking to create a PR!
@tthiery commented on GitHub (Sep 22, 2019):
I do not know how @xoofx generally handles it here, maybe discuss the API change ahead of time in a separate issue. I think PRs are generally welcome, also here in the normalizer topic.
@MihaZupan commented on GitHub (Sep 22, 2019):
I think this issue was meant for tracking the progress of the Normalisation rendering implementation.
Issue proposals and PRs are of course welcome!
@raffaeler commented on GitHub (Jul 15, 2020):
I add this case here as it looks similar to the one posted in the comments:
Code:
Input string:
Output string:
The error is of course the `[]:` at the end of the output string.
My goal is:
Since I just started working with this library, I may be doing something wrong ...
TIA
@xoofx commented on GitHub (Jul 15, 2020):
Yeah... the problem is that the `NormalizeRenderer` has never been finished. It's probably several hours of work to fix the remaining errors just for core CommonMark (without extensions). Don't have time nor a personal interest for this feature at the moment, so unless someone is willing to do that work, the `NormalizeRenderer` is not really usable.
@raffaeler commented on GitHub (Jul 15, 2020):
Hi @xoofx nice to see you here as well instead of some random city around the globe! :)
I am not really interested in `NormalizeRenderer` specifically, but just getting back the text for the markdown that I have modified by removing a block.
How can I get back the text of the modified document?
Thank you
@xoofx commented on GitHub (Jul 15, 2020):
Oops, sorry, should have started by saying hello Raffaele! 😉
That's what the `NormalizeRenderer` was supposed to do (not only normalize but "roundtrip", even if this is a much longer way to support full roundtrip)
@raffaeler commented on GitHub (Jul 15, 2020):
:)
argh, this was unexpected. I took the roundtrip for granted when I started looking for a markdown library :-/
Do you have any suggestion to get rid of the problem for "just" going back to the text?
Looping on blocks and their descendants recursively?
TIA
@xoofx commented on GitHub (Jul 15, 2020):
That's exactly what `NormalizeRenderer` is supposed to do: go through the nodes recursively and print the equivalent Markdown, there is no magic in it.
For the specific error you have, it's probably a small fix, but I can't have a look at it.
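To make "go through the nodes recursively" concrete, here is a rough sketch of such a walk over the Markdig AST. It only prints node type names instead of Markdown, and the `AstWalker` class is a made-up illustration, not how `NormalizeRenderer` is actually structured:

```csharp
using System;
using Markdig;
using Markdig.Syntax;
using Markdig.Syntax.Inlines;

// Hypothetical helper, not part of Markdig.
public static class AstWalker
{
    public static void Dump(string markdown)
    {
        MarkdownDocument document = Markdown.Parse(markdown);
        Visit(document, 0);
    }

    private static void Visit(MarkdownObject node, int depth)
    {
        // A real renderer would emit Markdown for the node here.
        Console.WriteLine(new string(' ', depth * 2) + node.GetType().Name);

        switch (node)
        {
            // Container blocks (document, lists, quotes, ...) hold child blocks.
            case ContainerBlock container:
                foreach (Block child in container)
                    Visit(child, depth + 1);
                break;

            // Leaf blocks (paragraphs, headings, ...) may hold inline content.
            case LeafBlock leaf when leaf.Inline != null:
                Visit(leaf.Inline, depth + 1);
                break;

            // Container inlines (emphasis, links, ...) hold child inlines.
            case ContainerInline inlines:
                foreach (Inline child in inlines)
                    Visit(child, depth + 1);
                break;
        }
    }
}
```

A renderer built this way has to emit something sensible for every node type it meets, which is why unfinished cases show up as wrong or missing output.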
@tthiery commented on GitHub (Jul 15, 2020):
@xoofx sorry that I left this project half-done. Work life increased a lot and my private project use case vanished.
@raffaeler commented on GitHub (Jul 15, 2020):
/cc @xoofx

I just found the bug is in the `Parser` and not in the Normalizer.
The parser incorrectly adds a LinkReferenceDefinitionGroup at the end of the document
Later on, the Normalizer renders (empty) the `[]:` tags which stand for the link and do not derive from the html comments.
Given the instability of the parser, unfortunately I have no choice but to search for a new library :(
@xoofx commented on GitHub (Jul 15, 2020):
I don't think it's a bug in the parser. I believe it is because the `LinkReferenceDefinitionGroupRenderer` was incorrectly added in the first place (while there is no renderer like this for the Html renderer).
@raffaeler commented on GitHub (Jul 15, 2020):
@xoofx I don't get why the renderer should be involved at that time.
The screenshot is what I see on the document right after parsing the markdown.
I didn't render anything at that moment.
@xoofx commented on GitHub (Jul 15, 2020):
I know and it is "correct". The `LinkReferenceDefinitionGroup` is added automatically by the system. Don't know why there is an item in it (that part is maybe a bug here), but `LinkReferenceDefinitionGroup` is used for book-keeping links (backward and forward) and made accessible through the AST. In the Html renderer, this element is not rendered and it should not be rendered also in normalize. That's more the actual bug.
@xoofx commented on GitHub (Jul 15, 2020):
As a workaround, without modifying the renderer, you could remove `LinkReferenceDefinitionGroup` after the document is parsed.
@raffaeler commented on GitHub (Jul 15, 2020):
Thanks anyway @xoofx I was already trying another solution.
I have to test more cases, but apparently the following `if` that I added in `LinkReferenceDefinitionRenderer` resolves the problem:
Note: `MdRenderer` is the class I used to replace the renderer.
@xoofx commented on GitHub (Jul 15, 2020):
It won't work when you have links in your document, because `LinkReferenceDefinitionGroup` is still rendered, and the links in it, while it should not be.
You can remove the `LinkReferenceDefinitionGroupRenderer` from the normalize renderer directly which is probably a more correct workaround:
`normalizeRenderer.ObjectRenderers.TryRemove<LinkReferenceDefinitionGroupRenderer>();`
@raffaeler commented on GitHub (Jul 15, 2020):
You mean `LinkReferenceDefinitionRenderer`, right? ... I did it and it works. Not sure if this change may have side-effects.
BTW `LinkReferenceDefinitionGroupRenderer` didn't work.
@xoofx commented on GitHub (Jul 15, 2020):
Oh right, actually both `LinkReferenceDefinitionRenderer` and `LinkReferenceDefinitionGroupRenderer` should be removed, otherwise RendererBase is still visiting the children of `LinkReferenceDefinitionGroup` (which is a `ContainerBlock`)
Forgot that `LinkReferenceDefinition` is also not used for rendering, I added them by mistake in the NormalizeRenderer back in the days.
@xoofx commented on GitHub (Jul 15, 2020):
Never mind, it seems that it was added on purpose for cases like this:
In HTML they are not rendered, but in Markdown, they should be rendered.
So the problem is likely the empty link that is put in the `LinkReferenceDefinitionGroup`
I haven't touched this code for the past 3 years, so yeah... it's old, and my memory cannot keep up at some point 😅
@raffaeler commented on GitHub (Jul 15, 2020):
So, in the end, the `if` statement I added as per this comment should solve the problem.
@xoofx commented on GitHub (Jul 16, 2020):
Yeah, it's enough as a workaround.
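The `if` discussed above is not preserved in this thread. As a different, document-side sketch of the same idea, one could drop empty link reference definitions from the AST before normalizing. This assumes the stray `[]:` comes from a `LinkReferenceDefinition` whose label and URL are both empty, which the thread does not confirm; the helper name is hypothetical:

```csharp
using System.Linq;
using Markdig.Syntax;

// Hypothetical helper, not part of Markdig and not the patch discussed above.
public static class LinkDefinitionCleanup
{
    public static void RemoveEmptyLinkDefinitions(MarkdownDocument document)
    {
        // Collect first, then remove, to avoid mutating the tree while enumerating it.
        var empty = document.Descendants<LinkReferenceDefinition>()
            .Where(def => string.IsNullOrEmpty(def.Label) && string.IsNullOrEmpty(def.Url))
            .ToList();

        foreach (var def in empty)
        {
            // Parent is the containing block (for example the LinkReferenceDefinitionGroup).
            def.Parent?.Remove(def);
        }
    }
}
```

Unlike removing the renderers, this leaves real link reference definitions in place, so they can still be written back out as Markdown.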
@kieronlanning commented on GitHub (May 24, 2025):
Just came across this issue myself, where tables are being normalised to plain text. Is there a fix for this anywhere?
@xoofx commented on GitHub (May 24, 2025):
Nope. It will only come if someone wants it and creates a PR for it.
@kieronlanning commented on GitHub (May 24, 2025):
@xoofx I think possibly I don't need a normalise renderer (or maybe I do?).
After making some modifications to the MarkdownDocument, I want to get the Markdown content as a string... is that the same thing as the normaliser, or is there an alternative?
@xoofx commented on GitHub (May 26, 2025):
There is a roundtrip approach from #481 (which came after normalizer) that could better fit, but it can be more laborious to get right. Though, similarly to normalize, extensions like pipetable were not implemented.
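A minimal sketch of that roundtrip route, assuming a Markdig version that ships `RoundtripRenderer` and the `trackTrivia` parse option; the wrapper class is made up for illustration:

```csharp
using System.IO;
using Markdig;
using Markdig.Renderers.Roundtrip;
using Markdig.Syntax;

// Hypothetical helper, not part of Markdig.
public static class RoundtripExample
{
    public static string ToMarkdown(string markdown)
    {
        // trackTrivia keeps whitespace and other source details on the AST,
        // which the roundtrip renderer needs to reproduce the original text.
        MarkdownDocument document = Markdown.Parse(markdown, trackTrivia: true);

        // ... modify the document here (for example, remove a block) ...

        var writer = new StringWriter();
        var renderer = new RoundtripRenderer(writer);
        renderer.Write(document);
        return writer.ToString();
    }
}
```

As noted above, extensions such as pipe tables are not covered by this renderer, so this sketch only applies to core CommonMark content.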
@kieronlanning commented on GitHub (May 26, 2025):
Thank you, I'll give it a shot.