Turning on TrackTrivia prevents EmphasisInline elements being created #472

Closed
opened 2026-01-29 14:37:37 +00:00 by claunia · 6 comments

Originally created by @nikkilocke on GitHub (Jul 9, 2021).

I am trying to create a converter which will convert Markdown to a form suitable for posting Telegram messages. I have got quite far, to the extent that I can parse some markdown, and turn it into a text string, with MessageEntity objects which show the offset, length and attributes (e.g. a Url for a link) - which is how Telegram does formatting.

Unfortunately the text string has the "insignificant" white space removed - for instance, my first test markdown is:

```
Test stuff
- **bold text**
- _italic text_
- ~~strikethrough text~~
- https://google.com?search=autolink
- [Full link](https://google.com)
- [**Bold full link**](https://google.com)
- **[Bold full link](https://google.com)**
```

I ran this through the Roundtrip renderer, and it came out as `Test stuff-bold text-italic text-~~strikethrough text~~-https://google.com?search=autolink-Full link-Bold full link-Bold full link`

My Telegram renderer (which removes the markdown furniture) shows the same.

My renderer is a subclass of RoundtripRenderer, which extracts the bold, italic and url elements, and finds these Entities:

```
Type:Offset:Length:Text:Url
Bold:10:10:-bold text:
Italic:20:12:-italic text:
Url:90:10:-Full link:https://google.com
Url:100:15:-Bold full link:https://google.com
Bold:100:15:-Bold full link:
Bold:115:15:-Bold full link:
Url:115:15:-Bold full link:https://google.com
```

I need to preserve the white space, so I tried setting EnableTrackTrivia in the parser.

Unfortunately the document then has no EmphasisInline elements in it. The roundtrip output is (correctly):

```
Test stuff
- **bold text**
- _italic text_
- ~~strikethrough text~~
- https://google.com?search=autolink
- [Full link](https://google.com)
- [**Bold full link**](https://google.com)
- **[Bold full link](https://google.com)**
```

My Telegram renderer (which removes the markdown furniture for items it recognises) shows:

```
Test stuff
- **bold text**
- _italic text_
- ~~strikethrough text~~
- https://google.com?search=autolink
- Full link
- Bold full link
- **Bold full link**
```

but most of the inline emphasis entities are missing:

```
Type:Offset:Length:Text:Url
Url:112:9:Full link:https://google.com
Url:125:14:Bold full link:https://google.com
Bold:125:14:Bold full link:
Url:145:14:Bold full link:https://google.com
```

Should TrackTrivia turn off recognising inline emphasis? If so, is there another way to retain the newlines and spaces in the original markdown?

claunia added the bug label 2026-01-29 14:37:37 +00:00

@nikkilocke commented on GitHub (Jul 9, 2021):

Just FYI, I have looked carefully at the code from NormalizeRenderer, and modified all my renderers to do what that does, and the output is now acceptable, although I would prefer it to match the input more exactly if possible.

So the problem is no longer serious for me, but you might find it intriguing, and worth investigating, as it may be a bug in the parser.


@xoofx commented on GitHub (Aug 27, 2021):

Note that `NormalizeRenderer` should not be used with TrackTrivia; use `RoundtripRenderer` instead. `NormalizeRenderer` might be deprecated at some point, as the normalize part would be better done as a modification of the AST that can then be fed into the `RoundtripRenderer`.


@generateui commented on GitHub (Nov 10, 2021):

Can you provide a minimal test that fails on your input and includes an assertion? That'd go a long way towards fixing this.
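A minimal sketch of what such a test could look like, assuming the Markdig NuGet package is referenced. The class name `TrackTriviaEmphasisRepro` and the helper `CountEmphasis` are hypothetical names made up for illustration; the sample input is taken from the markdown in the original report. It parses the same text with and without trivia tracking and fails if the number of `EmphasisInline` nodes differs.

```csharp
using System;
using System.Linq;
using Markdig;
using Markdig.Syntax;
using Markdig.Syntax.Inlines;

public static class TrackTriviaEmphasisRepro
{
    // Hypothetical helper: counts the EmphasisInline nodes Markdig produces
    // for the given markdown, with or without trivia tracking enabled.
    public static int CountEmphasis(string markdown, bool trackTrivia)
    {
        var builder = new MarkdownPipelineBuilder();
        if (trackTrivia)
        {
            builder.EnableTrackTrivia();
        }

        MarkdownDocument document = Markdown.Parse(markdown, builder.Build());
        return document.Descendants<EmphasisInline>().Count();
    }

    public static void Main()
    {
        // Two list items from the report: one bold, one italic.
        const string markdown = "- **bold text**\n- _italic text_\n";

        int plainCount = CountEmphasis(markdown, trackTrivia: false);
        int triviaCount = CountEmphasis(markdown, trackTrivia: true);
        Console.WriteLine($"without trivia: {plainCount}, with trivia: {triviaCount}");

        // The assertion: trivia tracking should not change which inlines are recognised.
        if (plainCount != triviaCount)
        {
            throw new InvalidOperationException(
                "EnableTrackTrivia changed the number of EmphasisInline nodes.");
        }
    }
}
```

Comparing the two counts rather than hard-coding a number keeps the check focused on the reported behaviour: whatever the default pipeline recognises, the trivia-tracking pipeline should recognise too.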


@jo3w4rd commented on GitHub (Mar 9, 2022):

I notice this as well, just using `Markdown.ToHtml()`.

The code:

```
// Needs: using System; using System.IO; using Markdig;
public static void TrackTrivia()
{
    string filePath = "D:\\Repos\\MarkdownTest.md";
    string markdown = File.ReadAllText(filePath);

    // Echo the input first, then the HTML rendered with trivia tracking on.
    Console.WriteLine(markdown);
    var pipeline = new MarkdownPipelineBuilder()
        .EnableTrackTrivia()
        .Build();
    Console.WriteLine(Markdown.ToHtml(markdown, pipeline));
}
```

produces output:

```
# Look at emphasis

**bold** __bold, too__

*italic* _also italic_

`code`

Plain

<h1>Look at emphasis</h1>
<p>**bold** __bold, too__
</p>
<p>*italic* _also italic_
</p>
<p><code>code</code>
</p>
<p>Plain
</p>
```

If you remove the `EnableTrackTrivia()` call, the output is correct:

```
<h1>Look at emphasis</h1>
<p><strong>bold</strong> <strong>bold, too</strong></p>
<p><em>italic</em> <em>also italic</em></p>
<p><code>code</code></p>
<p>Plain</p>
```

@xoofx commented on GitHub (Mar 9, 2022):

Yeah, if `EnableTrackTrivia()` is making such changes, then it's definitely a serious bug.


@xoofx commented on GitHub (Mar 11, 2022):

I believe this should be fixed by 983187e and available in 0.28.0

Please note that I have opened a new issue #604

I would strongly suggest not using `EnableTrackTrivia()` for rendering to HTML. `EnableTrackTrivia()` was mainly introduced for roundtrip. I have seen other rendering issues with it.

Otherwise I'm curious about the use case for using EnableTrackTrivia() with rendering to HTML?
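The suggestion above can be sketched as two separate pipelines: a default one for HTML and a trivia-tracking one used only for roundtrip output. This is an illustrative sketch, assuming the Markdig NuGet package is referenced; the class name `SeparatePipelines` and its two methods are hypothetical names, not part of Markdig.

```csharp
using System.IO;
using Markdig;
using Markdig.Renderers.Roundtrip;
using Markdig.Syntax;

public static class SeparatePipelines
{
    // HTML rendering uses a plain pipeline, with no trivia tracking.
    public static string ToHtml(string markdown)
    {
        MarkdownPipeline pipeline = new MarkdownPipelineBuilder().Build();
        return Markdown.ToHtml(markdown, pipeline);
    }

    // Roundtrip rendering is the case trivia tracking was introduced for:
    // parse with EnableTrackTrivia() and write the AST back out as markdown.
    public static string Roundtrip(string markdown)
    {
        MarkdownPipeline pipeline = new MarkdownPipelineBuilder()
            .EnableTrackTrivia()
            .Build();
        MarkdownDocument document = Markdown.Parse(markdown, pipeline);

        var writer = new StringWriter();
        var renderer = new RoundtripRenderer(writer);
        renderer.Write(document);
        return writer.ToString();
    }
}
```

Keeping the two paths apart means the HTML output never depends on how trivia tracking affects parsing.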

Reference: starred/markdig#472