Any way to stop the parser from adding strange <p> tags? #528

Closed
opened 2026-01-29 14:38:50 +00:00 by claunia · 4 comments

Originally created by @zerogr4vity on GitHub (Apr 18, 2022).

I am using Markdig.Markdown.ToHtml() to display the contents of a .md file on a webpage. If I provide the following sample string, I get the expected output:
Markdig.Markdown.ToHtml("# Text A\nText B\n\n## Text C")
becomes

<h1>Text A</h1>
<p>Text B</p>

<h2>Text C</h2>

However, if I put the following in a markdown file, I get a weird output:
sample.md:

# Text A
Text B

## Text C

becomes

<p># Text A Text B</p>
<h2>Text C</h2>

I get the expected output when I view the file with Markdown Editor VS2022 extension. I also verified that the file contains the proper characters right before calling the method. Any idea what I can do to achieve the correct output?

claunia added the question label 2026-01-29 14:38:50 +00:00

@xoofx commented on GitHub (Apr 20, 2022):

> However, if I put the following in a markdown file, I get a weird output:

Do you have the code that you use to load/parse it? Do you have the exact content on disk? Which line endings does it use: CRLF, LF, or a combination of both?


@xoofx commented on GitHub (Apr 20, 2022):

Added a test in commit 17b5500 but can't reproduce your issue. Some context is missing...


@zerogr4vity commented on GitHub (Apr 20, 2022):

> Do you have the code that you use to load/parse it? Do you have the exact content on disk? Which line endings does it use: CRLF, LF, or a combination of both?

The file contains only CRLF endings.

public class ContractAgreement
{
    public byte[] Text { get; set; }
    public override string ToString()
    {
        return Encoding.Default.GetString(Text);
    }
}

ContractAgreements/Details.cshtml:

@using Markdig
@model ContractViewer.Models.ContractAgreement

@{
    ViewData["Title"] = "Details";
}

<h1>Details</h1>
...
<div>
    @Html.Raw(Markdown.ToHtml(Model.ToString()))
</div>
...

[sample.md](https://github.com/xoofx/markdig/files/8523687/sample.md)

@xoofx commented on GitHub (Apr 20, 2022):

Your file starts with a UTF-8 BOM ("EF BB BF"), which Encoding.Default.GetString does not remove, so it ends up as garbage at the beginning of the string. If you used e.g. a StreamReader over the bytes with byte order mark detection enabled, it would work. Not sure why you are using a byte[], but you might want to revisit this.

Closing as it doesn't look like an issue in markdig, but an encoding issue on your side.
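
For readers hitting the same symptom, the StreamReader suggestion can be sketched roughly as follows. This is only a minimal sketch, assuming the file is UTF-8 and that the content is held in a byte[] as in the ContractAgreement class above. The helper name MarkdownBytes.Decode is hypothetical and is not part of Markdig or .NET. The point is that a leading BOM decoded by Encoding.GetString becomes an invisible U+FEFF character in front of the "#", so the first line is no longer recognized as a heading, whereas StreamReader skips the BOM while reading.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of Markdig.
public static class MarkdownBytes
{
    // Decodes raw file bytes to text. StreamReader detects a leading
    // byte order mark and skips it instead of returning it as U+FEFF.
    // UTF-8 is assumed as the fallback when no BOM is present.
    public static string Decode(byte[] bytes)
    {
        using var stream = new MemoryStream(bytes);
        using var reader = new StreamReader(
            stream, Encoding.UTF8, detectEncodingFromByteOrderMarks: true);
        return reader.ReadToEnd();
    }
}
```

With a helper like this, ToString() in the model could return MarkdownBytes.Decode(Text), and the view would keep calling Markdown.ToHtml(Model.ToString()) unchanged. If the bytes come straight from a file, reading it with File.ReadAllText should have the same effect, since it also detects the BOM.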

Reference: starred/markdig#528