Is there a way to render a summary by short circuiting or truncating the rendering? #337

Open
opened 2026-01-29 14:34:08 +00:00 by claunia · 3 comments
Owner

Originally created by @haacked on GitHub (Nov 4, 2019).

Is there an extension to render a summary of markdown text by truncating the content after a set number of words?

For example, suppose I'm building a site where people write articles in Markdown. But on the home page, I want to display the first 300 words of each article. I'd like to avoid rendering the whole thing and then parsing the HTML to find the first 300 words (while properly closing any open tags). It'd be nice if there's a way I could do it as Markdig is parsing (or rendering) the Markdown.

If no such thing exists, I could try to implement this myself, but would appreciate some hints as to how I'd go about it.


@MihaZupan commented on GitHub (Nov 4, 2019):

By far the simplest way is for you to find the first 300 words and pass that substring to Markdig. The downside is that if links are defined after that (as is common with Markdown where they are at the bottom), those wouldn't work. This could also break headings, emphasis text...
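A minimal sketch of this first approach, assuming words are delimited by whitespace. `NaiveSummary` and `FirstWords` are hypothetical names for illustration, not part of Markdig. The helper keeps the original line breaks, since Markdown block structure depends on them.

```c#
using System;

public static class NaiveSummary
{
    // Hypothetical helper: returns the prefix of `text` that ends right after
    // its first `maxWords` whitespace-delimited words. Whitespace inside the
    // prefix (including newlines) is preserved as-is.
    public static string FirstWords(string text, int maxWords)
    {
        if (string.IsNullOrEmpty(text) || maxWords <= 0)
            return string.Empty;

        int words = 0;
        bool inWord = false;
        for (int i = 0; i < text.Length; i++)
        {
            if (!char.IsWhiteSpace(text[i]))
            {
                inWord = true;
                continue;
            }

            // A word just ended; stop once the limit is reached.
            if (inWord && ++words == maxWords)
                return text.Substring(0, i);
            inWord = false;
        }

        // Fewer than maxWords words: nothing to cut.
        return text;
    }
}
```

The result could then be rendered with `Markdown.ToHtml(NaiveSummary.FirstWords(markdown, 300))`. As noted above, any link reference definitions or closing emphasis markers that fall past the cut are lost.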

A much more correct approach, if the first one isn't suitable for you, is to edit the syntax tree to remove paragraphs. This is a better approach IMO. Ping me if you wanna go this route.
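A rough sketch of the syntax-tree idea, assuming a coarse policy: keep whole top-level blocks while their literal text fits a character budget, then drop the rest. `BlockTrimmer`, `RenderSummary`, and `CountLiteralCharacters` are hypothetical names, and counting only `LiteralInline` text is a simplification (code spans, for example, are not counted). Because the full document is parsed first, reference links defined at the bottom should already be resolved before the trailing blocks are removed.

```c#
using System.IO;
using Markdig;
using Markdig.Renderers;
using Markdig.Syntax;
using Markdig.Syntax.Inlines;

public static class BlockTrimmer
{
    // Hypothetical helper: parses the whole document, removes trailing
    // top-level blocks that exceed the budget, then renders what is left.
    public static string RenderSummary(string markdown, int maxCharacters)
    {
        MarkdownPipeline pipeline = new MarkdownPipelineBuilder().Build();
        MarkdownDocument document = Markdown.Parse(markdown, pipeline);

        int remaining = maxCharacters;
        int keep = 0;
        for (; keep < document.Count; keep++)
        {
            int length = CountLiteralCharacters(document[keep]);
            // Always keep the first block so the summary is never empty.
            if (keep > 0 && length > remaining)
                break;
            remaining -= length;
        }

        // Remove from the end so earlier indices stay valid.
        for (int i = document.Count - 1; i >= keep; i--)
            document.RemoveAt(i);

        var writer = new StringWriter();
        var renderer = new HtmlRenderer(writer);
        pipeline.Setup(renderer);
        renderer.Render(document);
        writer.Flush();
        return writer.ToString();
    }

    // Sums the length of plain-text inlines inside a block.
    private static int CountLiteralCharacters(Block block)
    {
        int count = 0;
        foreach (LiteralInline literal in block.Descendants<LiteralInline>())
            count += literal.Content.Length;
        return count;
    }
}
```

This never cuts inside a paragraph, so the output can be noticeably shorter than the budget; trimming within a block requires walking and editing the inlines as well.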


@haacked commented on GitHub (Nov 4, 2019):

> A much more correct approach, if the first one isn't suitable for you, is to edit the syntax tree to remove paragraphs. This is a better approach IMO. Ping me if you wanna go this route.

I'm looking for a correct approach. This sounds like a good option.


@MihaZupan commented on GitHub (Nov 5, 2019):

I wrote a [sample trimming implementation](https://gist.github.com/MihaZupan/51e6e30542f45e73f70e1fc6636a49b7). Right now it tries to keep as many elements as will fit within the limit and then starts discarding elements.

You can change how new lines affect the limit. The current implementation limits the output by character count; to switch to word count, you only have to change the [TrimSpan](https://gist.github.com/MihaZupan/51e6e30542f45e73f70e1fc6636a49b7#file-trimmarkdowndocument-cs-L119) implementation.

I did a bit of testing, but there are probably some edge cases I didn't test that won't count towards the limit properly.

```c#
// pipeline setup

MarkdownDocument document = Markdown.Parse(markdown, pipeline);

TrimDocument(document, numberOfCharactersToKeep: 200);

// rendering to html
```

There are some implementation details, such as whether you want to cut a link in the middle. If not, change [this](https://gist.github.com/MihaZupan/51e6e30542f45e73f70e1fc6636a49b7#file-trimmarkdowndocument-cs-L82) to `charactersAvailable -= autoLink.Url.Length`.
Reference: starred/markdig#337