How can I exempt stretches of text from being processed as Markdown? #306

Closed
opened 2026-01-29 14:33:19 +00:00 by claunia · 17 comments
Owner

Originally created by @deanebarker on GitHub (Jul 1, 2019).

Is there any way to turn off the parser for stretches of code? Like:

```
This is some Markdown.

# markdownoff
This is some crazy HTML construct that I don't want Markdown to touch or even know about.
# markdownon

This is some more Markdown.
```

Clearly that doesn't work, but that's basically what I'm looking for.

claunia added the enhancement and PR Welcome! labels 2026-01-29 14:33:19 +00:00

@MihaZupan commented on GitHub (Jul 1, 2019):

Text inside HTML blocks will be copied verbatim.
For example, this Markdown:

```md
*foo*

<div>
*bar*
</div>
```

produces this HTML:

```html
<p><em>foo</em></p>
<div>
*bar*
</div>
```

@deanebarker commented on GitHub (Jul 1, 2019):

That's what I thought, but I don't think it's what I'm seeing. I'll re-test. If I can't duplicate, I'll close.


@deanebarker commented on GitHub (Jul 2, 2019):

I did some testing, and this depends on there being no blank lines inside the HTML blocks. If you have this:

```
*foo*
<div>

*bar*
</div>
```

That will generate HTML inside the div because of the blank line after the opening tag.

In my situation, due to some pre-processing that generates the Markdown, I can't guarantee there won't be any blank lines.

Do I have any options to toggle Markdown processing in this situation?
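
The blank-line behaviour described above can be reproduced with a short sketch. It assumes the Markdig NuGet package is referenced; the class, constant, and method names are illustrative and not part of the library. It renders the same `<div>` with and without a blank line after the opening tag so the two outputs can be compared:

```c#
using System;
using Markdig;

public static class BlankLineDemo
{
    // HTML block with no blank line inside: the content is passed through untouched.
    public const string Tight = "<div>\n*bar*\n</div>\n";

    // Same block with a blank line after the opening tag. Under CommonMark rules
    // the blank line ends the HTML block, so "*bar*" is parsed as Markdown again.
    public const string Loose = "<div>\n\n*bar*\n</div>\n";

    // Renders with the default pipeline, as Markdown.ToHtml() does.
    public static string Render(string markdown) => Markdown.ToHtml(markdown);
}
```

Printing `BlankLineDemo.Render(BlankLineDemo.Tight)` and `BlankLineDemo.Render(BlankLineDemo.Loose)` side by side should show `*bar*` left alone in the first case and wrapped in `<em>` in the second.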


@stevehurcombe commented on GitHub (Jul 2, 2019):

You could wrap the content in a `<div>` yourself? Then it's a known quantity.

S.


@deanebarker commented on GitHub (Jul 2, 2019):

There might be child tags, also with blank lines. Worst case I could try to strip all newlines in a post-post-processing step, but I feel like this would be a handy addition to the library in general. There are other situations where this might come up.


@xoofx commented on GitHub (Jul 2, 2019):

It requires a new kind of raw custom container. PR welcome. If someone tackles this, I would like to see a proposal of the syntax, including a check on other major Markdown libraries (including Pandoc) in case we could reuse/agree on a similar syntax.


@MihaZupan commented on GitHub (Jul 2, 2019):

You could also use fenced code blocks for this, I think that would be the least intrusive language-wise.

If you don't want to see `<code>` and language html in the result, you can replace those blocks when post-processing the document.

For example using "mdignore" as the info:

```c#
// `document` is the MarkdownDocument returned by Markdown.Parse(markdown)
foreach (var codeBlock in document.Descendants<FencedCodeBlock>())
{
    if (codeBlock.Info == "mdignore")
    {
        // Swap the code block for an HtmlBlock at the same position in its parent
        var parent = codeBlock.Parent;
        int childIndex = parent.IndexOf(codeBlock);
        parent.Remove(codeBlock);
        parent.Insert(childIndex, new HtmlBlock(null)
        {
            Lines = codeBlock.Lines
        });
        // HtmlBlock is used as it's a simple non-abstract LeafBlock
    }
}
```

If you want to extract the actual content from the block (for example if you want to use a ParagraphBlock instead):

```c#
// `markdown` is the original source string that was parsed into `document`
foreach (var codeBlock in document.Descendants<FencedCodeBlock>())
{
    if (codeBlock.Info == "mdignore")
    {
        var parent = codeBlock.Parent;
        int childIndex = parent.IndexOf(codeBlock);
        parent.Remove(codeBlock);

        // First character after the opening fence line
        int startIndex = markdown.IndexOf('\n', codeBlock.Span.Start) + 1;
        // Last character before the closing fence line (StringSlice end is inclusive)
        int endIndex = markdown.LastIndexOf('\n', codeBlock.Span.End) - 1;
        if (markdown[endIndex] == '\r') endIndex--;

        var replacement = new ParagraphBlock
        {
            Inline = new ContainerInline()
        };
        replacement.Inline.AppendChild(new LiteralInline()
        {
            Content = new StringSlice(markdown, startIndex, endIndex)
        });

        parent.Insert(childIndex, replacement);
    }
}
```

Side note:
Even for a simple example like this, it would be useful to have some more "document API", like `ReplaceBy` for blocks. I can do a PR for that.
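
To make the side note concrete, here is a rough sketch of what such a helper could look like as an extension method. `ReplaceBy` for blocks and the `BlockReplaceExtensions` class are hypothetical names used for illustration only; they are not claimed to be part of Markdig. The sketch relies only on `ContainerBlock` behaving as a list of blocks (`IndexOf`, `RemoveAt`, `Insert`):

```c#
using System;
using Markdig.Syntax;

public static class BlockReplaceExtensions
{
    // Hypothetical helper (an assumption, not a Markdig API): swaps `block`
    // for `replacement` at the same position in its parent container.
    public static void ReplaceBy(this Block block, Block replacement)
    {
        ContainerBlock parent = block.Parent
            ?? throw new InvalidOperationException("Block has no parent.");

        int index = parent.IndexOf(block);
        parent.RemoveAt(index);            // detaches the old block
        parent.Insert(index, replacement); // attaches the new one in its place
    }
}
```

With a helper like this, the loop body above could shrink to `codeBlock.ReplaceBy(new HtmlBlock(null) { Lines = codeBlock.Lines });`.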


@deanebarker commented on GitHub (Jul 3, 2019):

Here's what I found in AsciiDoc: [Literal blocks](http://www.methods.co.nz/asciidoc/userguide.html#X65)

```
Some Markdown...
...................................
Stuff you don't want processed as Markdown...
...................................
Some more Markdown...
```

Elsewhere it says that block fencing is:

> normally a series of four or more repeated characters

So, I think it would open and close with four dots.


@deanebarker commented on GitHub (Jul 3, 2019):

Here is Textile's version: [notextile](https://www.promptworks.com/textile/html-integration-and-escapement#no-textile)

```
<notextile>
Don't touch this!
</notextile>
```

The inline version is double equals:

```
Don't ==**bold**== this.
```

@MihaZupan commented on GitHub (Jul 3, 2019):

> Some Markdown...
> ...................................
> Stuff you don't want processed as Markdown...
> ...................................
> Some more Markdown...

That looks to me like a similar approach to using code blocks. Since they are already a part of regular markdown, is there some limitation preventing their use in your case?


@deanebarker commented on GitHub (Jul 3, 2019):

Explain to me the syntax. I see the mdignore stuff above, to go in the "info", but I'm a little fuzzy on how that syntax might actually work/look.


@MihaZupan commented on GitHub (Jul 3, 2019):

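The [simplest syntax](https://babelmark.github.io/?text=++++*this+is+just+text*) is the indented code block, where you indent a block with 4 spaces.

I personally prefer the [fenced code blocks](https://babelmark.github.io/?text=%60%60%60%0A*simple+fenced+code+block*%0A%60%60%60%0A%0A~~~%0A**same+as+this**%0A~~~): you enclose text with three (or more) backticks or tildes, and you can also specify the ["info" aka language](https://babelmark.github.io/?text=%60%60%60c%23%0AConsole.WriteLine(%22Hello+world!%22)%3B%0A%60%60%60). This is what you use to get syntax highlighting on GitHub, for example.

This will enclose the inner text in `<pre><code></code></pre>` tags, so that's what I meant by replacing those elements from the document. Then markdown like [this](https://babelmark.github.io/?text=%60%60%60%60%60mdignore%0A*some+text*%0A%60%60%60%60%60%0A~~~mdignore%0A*some+text*%0A~~~) would not emit extra tags but simply embed the text as it was in the source.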


@deanebarker commented on GitHub (Jul 3, 2019):

So...?

````
```mdignore
Don't touch this
````


@MihaZupan commented on GitHub (Jul 3, 2019):

Yes, just three backticks at the start and end:

````md
```mdignore
Copied verbatim
```
````

@deanebarker commented on GitHub (Jul 3, 2019):

Okay, putting together your stuff from above, I get this, which does work.

Markdown

````
This is some *Markdown*.
```mdignore
<aside>

      This is my content in an <code>ASIDE</code> tag
      with all sorts of indents and line breaks and stuff.

</aside>
```
This is some more *Markdown*.
````

Code (note: I've never gone past the Markdown.ToHtml() static method before, so this might suck...)

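```c#
using System.IO;
using Markdig;
using Markdig.Renderers;
using Markdig.Syntax;

// `markdown` holds the source text shown above
var document = Markdown.Parse(markdown);
foreach (var codeBlock in document.Descendants<FencedCodeBlock>())
{
  if (codeBlock.Info == "mdignore")
  {
    // Replace the fenced block with an HtmlBlock so its lines are emitted verbatim
    var parent = codeBlock.Parent;
    int childIndex = parent.IndexOf(codeBlock);
    parent.Remove(codeBlock);
    parent.Insert(childIndex, new HtmlBlock(null)
    {
      Lines = codeBlock.Lines
    });
  }
}

// Render the modified document to an HTML string
var writer = new StringWriter();
var renderer = new HtmlRenderer(writer);
renderer.Render(document);
writer.Flush();
var html = writer.ToString();
```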

Result

```html
<p>This is some <em>Markdown</em>.</p>
<aside>

      This is my content in an <code>ASIDE</code> tag
      with all sorts of indents and line breaks and stuff.

</aside>
<p>This is some more <em>Markdown</em>.</p>
```

Does that code look fine? If so (and even if not), I have a solution and we can probably resolve.

(I do think this might be handy to build into the core @xoofx, even if just like @MihaZupan has suggested above.)


@MihaZupan commented on GitHub (Jul 3, 2019):

The code looks fine. If you find yourself using a custom pipeline, don't forget to also call `pipeline.Setup(renderer);` before rendering.
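
For reference, a minimal sketch of rendering with a custom pipeline might look like the following. The `PipelineRendering` class name is made up for illustration, and `UseAdvancedExtensions()` merely stands in for whatever extensions a real project enables; the point is that the same pipeline is used for parsing and is applied to the renderer via `Setup` before `Render` is called:

```c#
using System.IO;
using Markdig;
using Markdig.Renderers;

public static class PipelineRendering
{
    public static string Render(string markdown)
    {
        // Example pipeline (an assumption); substitute your own configuration.
        MarkdownPipeline pipeline = new MarkdownPipelineBuilder()
            .UseAdvancedExtensions()
            .Build();

        // Parse with the pipeline so its extensions take part in parsing.
        var document = Markdown.Parse(markdown, pipeline);

        // Any post-processing of the document (such as the mdignore
        // replacement loop from this thread) would go here.

        var writer = new StringWriter();
        var renderer = new HtmlRenderer(writer);
        pipeline.Setup(renderer); // registers the extensions' renderers
        renderer.Render(document);
        writer.Flush();
        return writer.ToString();
    }
}
```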


@deanebarker commented on GitHub (Jul 3, 2019):

Just implemented this in my production project. It works beautifully -- solves so many problems and eliminates so many workarounds.

Thank you for all the help on this, and I hope this issue resolution helps others down the road.


Reference: starred/markdig#306