mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-03 21:36:36 +00:00
How to remove things like CodeBlocks from ToPlainText rendering #669
Originally created by @kaylumah on GitHub (Apr 5, 2024).
Hi Xoofx,
The repo does not have discussions enabled, so I am submitting it here. I apologise in advance if there is a better place to put this kind of question.
For my blog I am looking into a clean way to count the number of words present in a specific article.
I came across the ToPlainText method for my Markdown. That appears to make it mostly clean text. However, it leaves in things like the code blocks (my blog is technical, so lots of code snippets).
Is there an extension point I missed, in which I can remove code blocks from the PlainText view?
Any pointers would be appreciated.
Thanks for the awesome work you did on both Markdig and Scriban.
Max
@xoofx commented on GitHub (Apr 5, 2024):
Not that I'm aware of, but you can just take the Markdown AST, search for and remove the code blocks, and render it to plain text afterwards.
In my own blog post engine, I do it differently: I convert to HTML and extract the text from there with NUglify.
@kaylumah commented on GitHub (Apr 5, 2024):
I don't see an equivalent ToText as an extension:
8e22754db4/src/Markdig/Markdown.cs (L136)
So based on
8e22754db4/src/Markdig/Markdown.cs (L240)
I think I need to do something like this:
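For readers following along, here is a minimal sketch of what such a helper could look like. It assumes the helper only needs to mirror the renderer settings that Markdown.ToPlainText appears to use internally: an HtmlRenderer with HTML output for blocks and inlines, and HTML escaping, all switched off. The class and method names (PlainTextSketch, ToPlainText) are hypothetical and not part of Markdig.

```csharp
using System.IO;
using Markdig;
using Markdig.Renderers;
using Markdig.Syntax;

public static class PlainTextSketch
{
    // Hypothetical helper (not part of Markdig): renders an already parsed
    // document as plain text, so the AST can be edited between parsing and rendering.
    public static string ToPlainText(MarkdownDocument document, MarkdownPipeline pipeline)
    {
        using var writer = new StringWriter();

        // Assumption: these are the same switches Markdown.ToPlainText sets,
        // which make the HTML renderer emit text only.
        var renderer = new HtmlRenderer(writer)
        {
            EnableHtmlForBlock = false,
            EnableHtmlForInline = false,
            EnableHtmlEscape = false,
        };

        // Let the pipeline's extensions register their renderers as well.
        pipeline.Setup(renderer);
        renderer.Render(document);
        writer.Flush();
        return writer.ToString();
    }
}
```

It would be called with the same pipeline that was used for parsing, for example `PlainTextSketch.ToPlainText(Markdown.Parse(markdown, pipeline), pipeline)`.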
@BeneHenke commented on GitHub (Sep 10, 2024):
You can iterate through your MarkdownDocument and remove blocks like this.
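// ToList() (System.Linq) snapshots the matches so the tree is not modified while enumerating;
// removing through Parent also covers code blocks nested in lists or quotes.
foreach (CodeBlock item in document.Descendants<CodeBlock>().ToList()) { item.Parent?.Remove(item); }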
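Putting the suggestions in this thread together for the original word-count question, an end-to-end sketch might look like the following. It rests on a few assumptions. The code blocks are collected into a list before removal and detached from their direct parent, so that blocks nested inside list items or quotes are handled too. The plain-text rendering uses the HtmlRenderer switches that Markdown.ToPlainText appears to set. A "word" is taken to be any run of non-whitespace characters. WordCountSketch and CountWords are hypothetical names, not Markdig API.

```csharp
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using Markdig;
using Markdig.Renderers;
using Markdig.Syntax;

public static class WordCountSketch
{
    // Hypothetical helper: counts the words in the prose of a Markdown string,
    // ignoring fenced and indented code blocks.
    public static int CountWords(string markdown)
    {
        var pipeline = new MarkdownPipelineBuilder().Build();
        MarkdownDocument document = Markdown.Parse(markdown, pipeline);

        // Snapshot the matches first, then detach each code block from its
        // direct parent (the document itself, a list item, a quote, ...).
        foreach (var codeBlock in document.Descendants<CodeBlock>().ToList())
        {
            codeBlock.Parent?.Remove(codeBlock);
        }

        // Render what is left as plain text (HTML tags and escaping disabled).
        using var writer = new StringWriter();
        var renderer = new HtmlRenderer(writer)
        {
            EnableHtmlForBlock = false,
            EnableHtmlForInline = false,
            EnableHtmlEscape = false,
        };
        pipeline.Setup(renderer);
        renderer.Render(document);
        writer.Flush();

        // Assumption: a word is any run of non-whitespace characters.
        return Regex.Matches(writer.ToString(), @"\S+").Count;
    }
}
```

Inline code spans are inline elements rather than CodeBlock instances, so they still contribute to the count. Whether that is desirable depends on how the blog wants to measure reading length.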