Parser: Inline HTML combined with code block (if no blank line) #430

claunia · 2026-01-29T14:36:31Z

claunia commented

2026-01-29 14:36:31 +00:00

Originally created by @paultechguy on GitHub (Feb 2, 2021).

When using the parser, if a code block immediately follows inline HTML, both are combined into a single HTML block. See the end for info on the CommonMark spec. The CommonMark online editor handles both cases below fine, parsing them as two separate blocks. For example, in Markdig, if the Markdown is:

    <img src="http://domain.com" />
    ```html
      <strong>Foo</strong>
    ```

The above is returned by the parser's Parse method as one whole HtmlBlock. The way to fix this is to separate the inline HTML from the code block with blank line(s):

    <img src="http://domain.com" />
    
    ```html
      <strong>Foo</strong>
    ```

The above is returned as two separate blocks by the Markdig parser.

Is there a way for the Markdig parser to return two separate blocks regardless of format?

The CommonMark spec (https://spec.commonmark.org/0.29/#fenced-code-blocks) indicates:
A fenced code block may interrupt a paragraph, and does not require a blank line either before or after.

claunia added the question invalid labels 2026-01-29 14:36:31 +00:00

claunia closed this issue

2026-01-29 14:36:31 +00:00

claunia commented

2026-01-29 14:36:32 +00:00

@xoofx commented on GitHub (Feb 2, 2021):

> Is there a way for the Markdig parser to return two separate blocks regardless of format?
> The CommonMark spec indicates:
> A fenced code block may interrupt a paragraph, and does not require a blank line either before or after.

Nope, but it's indeed a corner case. I believe most implementers took the HTML rule as superseding the fenced block rule.

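Almost all CommonMark implementations follow the same behavior as Markdig (https://babelmark.github.io/?text=%3Cimg+src%3D%22http%3A%2F%2Fdomain.com%22+%2F%3E%0A%60%60%60html%0A++++++%3Cstrong%3EFoo%3C%2Fstrong%3E%0A%60%60%60). It seems that GitHub Flavored Markdown chose a different path in that particular case, which is unfortunate...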

So we can't really change that, unless you are willing to make a PR with an extension to add an option to turn that off, but I'm not sure how easy it is to fit that in...
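
Since there is no built-in option, one possible workaround is to apply the blank-line fix from the original report mechanically before the text reaches the parser. The following is only a minimal sketch under stated assumptions, not a Markdig feature: the function name `separate_html_from_fences` is hypothetical, a line is treated as "HTML" simply because it starts with `<`, and fences are recognized as three or more backticks or tildes with at most three spaces of indentation. It tracks whether it is inside a fenced block so that the contents of code blocks are left untouched.

```python
import re

# A fence line: up to 3 spaces of indent, then 3+ backticks or 3+ tildes.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def separate_html_from_fences(markdown: str) -> str:
    """Insert a blank line before an opening fence that directly follows
    a line starting with "<" (a rough heuristic for an HTML line)."""
    out = []
    open_fence = None  # marker of the fence we are currently inside, if any
    for line in markdown.split("\n"):
        match = FENCE.match(line)
        if open_fence is None:
            if match:
                # Opening fence: separate it from a preceding HTML-looking line.
                if out and out[-1].lstrip().startswith("<"):
                    out.append("")
                open_fence = match.group(1)
        elif (
            match
            and match.group(1)[0] == open_fence[0]
            and len(match.group(1)) >= len(open_fence)
            and line.strip() == match.group(1)
        ):
            # Closing fence: same character, at least as long, nothing after it.
            open_fence = None
        out.append(line)
    return "\n".join(out)


source = '<img src="http://domain.com" />\n```html\n  <strong>Foo</strong>\n```'
print(separate_html_from_fences(source))
```

The output of this function would then be passed to the parser in place of the original text. Note that this changes the input for every document, including ones where the author intentionally kept a fence inside an HTML block, so it is probably only suitable when the two-block result is always wanted.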

claunia commented

2026-01-29 14:36:32 +00:00

@paultechguy commented on GitHub (Feb 2, 2021):

To add more context from my testing: I've tried a lot of online parsers, and most of them are sensitive to the type of inline HTML that appears before a code block. If the HTML is a CSS block element (div, img, p), the parser treats the HTML and code block as a single HTML block. If the HTML is a CSS inline element (span, a, em), the parser treats them separately, as an inline block and a fenced code block respectively. This sort of makes sense. The Markdig parser is also sensitive to the HTML element type, as I describe. A flag to toggle this handling might be useful.

claunia referenced this issue

2026-01-29 14:48:45 +00:00

[PR #430] [MERGED] Fix relative uri detection to be cross-platform compatible #1025

claunia referenced this issue

2026-01-29 14:48:49 +00:00

[PR #430] Fix relative uri detection to be cross-platform compatible #1029