mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-03 21:36:36 +00:00
Question: Determine Word Count? #409
Originally created by @Mike-E-angelo on GitHub (Oct 22, 2020).
Greetings... thank you for making this great library!
In my case, I am looking for a Markdown parser/library/API that can quickly tell me the line count and word count of a given Markdown document. I see that there is a MarkdownDocument.LineCount, so that takes care of one of the requirements.
Is there a way of determining the word count of a document in a similar fashion? I see there's a way of iterating through the document via #381, so I will be attempting to do this, but wanted to quickly ping here to see if there is an easier/more obvious way of doing so.
Thank you for any assistance you can provide. 👍
@xoofx commented on GitHub (Oct 23, 2020):
It's not straightforward to go through all the AST to collect words. Why don't you perform that on the input string directly with a regex?
Regex.Matches(markdownText, @"\b\w+\b").Count
Regex.Matches(markdownText, "\n").Count
@Mike-E-angelo commented on GitHub (Oct 23, 2020):
That's actually what I am doing with plain text documents, @xoofx. 😁 The thought did occur to do the same with Markdown documents but wanted to ensure there wasn't some obvious feature I was overlooking with the API here to be more accurate and/or engrained, for lack of a better word.
So, it looks like I will go with RegEx, then. Thank you for your time and input!
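For anyone landing here later, the two regex one-liners suggested in this thread can be wrapped up roughly as follows. This is only a minimal sketch that runs on the raw Markdown string and does not use Markdig at all; the class name MarkdownStats, the method names, and the sample text are hypothetical and chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Markdig): counts on the raw input string.
public static class MarkdownStats
{
    // Counts runs of word characters, using the pattern suggested in the thread.
    public static int CountWords(string markdownText) =>
        Regex.Matches(markdownText, @"\b\w+\b").Count;

    // Counts newline characters; a last line without a trailing "\n" is not counted.
    public static int CountNewlines(string markdownText) =>
        Regex.Matches(markdownText, "\n").Count;

    public static void Main()
    {
        string markdownText = "# Title\n\nSome *emphasized* text.\n";
        Console.WriteLine(CountWords(markdownText));    // prints 4
        Console.WriteLine(CountNewlines(markdownText)); // prints 3
    }
}
```

Because this works on the source text rather than the parsed document, the count is approximate: words inside link URLs and code spans are included, and a contraction such as "don't" is counted as two matches.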