mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-03 21:36:36 +00:00
Parser: Inline HTML combined with code block (if no blank line) #430
Originally created by @paultechguy on GitHub (Feb 2, 2021).
When using the parser, if a code block immediately follows inline HTML, the two are combined into a single HTML block. See the end of this post for what the CommonMark spec says. The CommonMark online editor handles both cases below correctly, parsing them as two separate blocks. With Markdig, for example, if the Markdown is:
The above is returned by the parser's Parse method as one whole HtmlBlock. The way to fix this is to separate the inline HTML from the code block with blank line(s):
The above is returned as two separate blocks by the Markdig parser.
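Until the parser offers an option for this, one possible workaround is to normalize the Markdown text before it reaches the parser. The following is a minimal sketch in Python, not part of Markdig: the function name `separate_fences` and the `<div>` sample are made up for illustration. It assumes fences sit at the top level (not nested in lists or block quotes) and that the input uses `\n` line endings.

```python
import re

# An opening or closing fence: up to 3 spaces of indent, then 3+ backticks or tildes.
FENCE_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def separate_fences(markdown: str) -> str:
    """Insert a blank line before every opening fence that directly follows a non-blank line."""
    out = []
    open_fence = None  # marker of the code block we are currently inside, if any
    for line in markdown.split("\n"):
        match = FENCE_RE.match(line)
        if open_fence is None:
            if match:
                # Opening fence: make sure the previous line is blank.
                if out and out[-1].strip():
                    out.append("")
                open_fence = match.group(1)
        elif (
            match
            and match.group(1)[0] == open_fence[0]
            and len(match.group(1)) >= len(open_fence)
            and line.strip() == match.group(1)
        ):
            # Closing fence: same character, at least as long, nothing else on the line.
            open_fence = None
        out.append(line)
    return "\n".join(out)


# Hypothetical input: HTML immediately followed by a fenced code block.
sample = "<div>hello</div>\n```\ncode\n```"
print(separate_fences(sample))
```

The result can then be passed to the parser as usual. Note that this rewrites the source text, so line numbers reported by the parser will shift by the number of inserted blank lines.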
Is there a way for the Markdig parser to return two separate blocks regardless of format?
The CommonMark spec indicates:
A fenced code block may interrupt a paragraph, and does not require a blank line either before or after.
@xoofx commented on GitHub (Feb 2, 2021):
Nope, but it's indeed a corner case. I believe most implementers took the HTML rule as superseding the fenced block rule.
Almost all CommonMark implementations follow the same behavior as Markdig. It seems that GitHub Flavored Markdown chose a different path in that particular case, which is unfortunate...
So we can't really change that, unless you are willing to make a PR with an extension to add an option to turn that off, but I'm not sure how easy it is to fit that in...
@paultechguy commented on GitHub (Feb 2, 2021):
To add more context from my testing: I've tried a lot of online parsers, and most of them are sensitive to the type of inline HTML that appears before a code block. If the HTML is a CSS block element (div, img, p), the parser treats the HTML and the code block as a single HTML block. If the HTML is a CSS inline element (span, a, em), the parser treats them separately, as an inline block and a fenced code block respectively. This sort of makes sense. The Markdig parser is sensitive to the HTML element type in the same way. A flag to toggle this handling might be useful.
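A possible explanation for this sensitivity is the set of HTML block start conditions in the CommonMark spec: a line that begins with one of the listed block-level tag names, or that consists of a single complete tag, starts an HTML block that only ends at a blank line, so a fence on the next line is absorbed. The Python sketch below is a rough approximation of that check, not Markdig's actual logic: `BLOCK_TAGS` is an abbreviated subset of the spec's list, the tag regexes are simplified, and the function name is made up.

```python
import re

# Abbreviated subset of the block-level tag names listed in the CommonMark spec.
BLOCK_TAGS = {
    "address", "blockquote", "div", "h1", "h2", "h3", "h4", "h5", "h6",
    "hr", "li", "ol", "p", "table", "ul",
}
# Tags whose HTML blocks end at their closing tag rather than at a blank line.
RAW_TEXT_TAGS = {"pre", "script", "style", "textarea"}

# "<" or "</", a tag name, then a space, tab, ">", "/>" or end of line.
TAG_START_RE = re.compile(r"^ {0,3}</?([A-Za-z][A-Za-z0-9-]*)(?=[ \t>]|/>|$)")
# Simplified: exactly one complete tag on the line, followed only by whitespace.
LONE_TAG_RE = re.compile(r"^ {0,3}</?[A-Za-z][A-Za-z0-9-]*(\s[^<>]*)?/?>\s*$")


def absorbs_following_fence(line: str) -> bool:
    """Guess whether `line` starts an HTML block that runs until the next blank line."""
    match = TAG_START_RE.match(line)
    if not match:
        return False
    name = match.group(1).lower()
    if name in RAW_TEXT_TAGS:
        return False  # different end condition, out of scope for this sketch
    if name in BLOCK_TAGS:
        return True   # block-level tag name at the start of the line
    return bool(LONE_TAG_RE.match(line))  # any other tag, alone on its line


for html in ["<div>hello</div>", '<img src="x.png">', "<span>hello</span>"]:
    print(html, absorbs_following_fence(html))
```

Under these assumptions, `img` behaves like a block element only because the tag usually stands alone on its line, while `span`, `a`, and `em` are normally followed by text on the same line and therefore start an ordinary paragraph that a fence can interrupt.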