mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-09 21:42:15 +00:00
HTML block parsed incorrectly? #555
Originally created by @MihailsKuzmins on GitHub (Aug 29, 2022).
I am referring to this issue https://github.com/MyNihongo/MudBlazor.Markdown/issues/117
Please refer to this sample:
The input is this HTML string, but Markdig returns 4 elements (`<details>` + `<summary>`, some text in the middle, closing tag for `</details>`). I would expect it to return a single element which is just the HTML `HtmlBlock`, but maybe I am wrong in my assumption.
Could you please comment on whether or not this behaviour is correct?
@MihaZupan commented on GitHub (Aug 29, 2022):
Markdig is parsing according to the CommonMark spec here - HTML parsing is limited to what the spec defines.
`HtmlBlock` is more of a "this is the part we can't treat as Markdown" rather than a full HTML AST.
Consider this example where `foo` is not treated as Markdown, while `bar` is (because of the extra blank line).
In this case the `HtmlBlock` starts with `<details>` and ends on the first blank line.
If you want an actual HTML syntax tree, pass Markdig's output to a library like AngleSharp.
Regarding the HTML Markdig generates, it matches what all the other CommonMark-compliant parsers do.
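To make the blank-line rule concrete, here is a minimal Python sketch of a simplified version of the CommonMark behaviour described above: a line that opens with a block-level tag starts an HTML block, and that block ends at the first blank line. This is not Markdig's implementation. The names `split_html_blocks`, `BLOCK_TAGS` and `START_RE` are hypothetical, the tag list is only a small subset of the one in the spec, and the other HTML block start conditions are ignored.

```python
import re

# Assumption: a small subset of the block-level tag names that CommonMark
# lists for this kind of HTML block (the list in the spec is much longer).
BLOCK_TAGS = ("details", "summary", "div", "p", "table", "ul", "ol", "li")

# A start line: up to 3 spaces of indentation, "<" or "</", a known tag name,
# then whitespace, ">", "/>", or the end of the line.
START_RE = re.compile(
    r"^ {0,3}</?(?:%s)(?:[ \t]|/?>|$)" % "|".join(BLOCK_TAGS),
    re.IGNORECASE,
)


def split_html_blocks(text):
    """Split text into (kind, lines) chunks, where kind is "html" or "other".

    Simplified model: a chunk ends at a blank line, and a start line also
    interrupts a running non-HTML chunk.
    """
    chunks = []
    current_kind = None
    current = []

    def flush():
        nonlocal current, current_kind
        if current:
            chunks.append((current_kind, current))
        current, current_kind = [], None

    for line in text.splitlines():
        if not line.strip():
            # A blank line closes whatever chunk is open, HTML or not.
            flush()
            continue
        starts_html = bool(START_RE.match(line))
        if current_kind == "other" and starts_html:
            # An HTML start line may interrupt ordinary text.
            flush()
        if current_kind is None:
            current_kind = "html" if starts_html else "other"
        current.append(line)
    flush()
    return chunks


if __name__ == "__main__":
    # Hypothetical input, not the sample from the original report.
    sample = "<details>\n<summary>Title</summary>\n\nsome *text*\n\n</details>\n"
    for kind, lines in split_html_blocks(sample):
        print(kind, lines)
```

With the hypothetical input in the demo, the opening tags, the middle text and the closing tag come out as separate chunks, which is the same kind of split the issue reports. Only the "html" chunks correspond to what an `HtmlBlock` would cover; the middle text is ordinary Markdown.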
@MihailsKuzmins commented on GitHub (Aug 30, 2022):
OK, thanks, I just wanted to confirm it. Indeed, the empty line seems to be the end of the HTML block according to the link you sent.
