mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-04 13:54:44 +00:00
Possible bug with string *bob*&_margaret_ #396
Originally created by @xakpc on GitHub (Aug 21, 2020).
Hey!
I switched from CommonMark to Markdig to get more control over which Markdown gets parsed and which gets ignored, and found that one of my unit tests failed.
Here is the test case:
[TestCase("*bob*&_margaret_", "<em>bob</em>&<em>margaret</em>")]
CommonMark result:
<em>bob</em>&<em>margaret</em>
Markdig result:
<em>bob</em>&_margaret_
Is this a bug?
P.S. Check how it renders on GitHub: bob&margaret
@MihaZupan commented on GitHub (Aug 21, 2020):
Looks like this could be a bug: BabelMark
@iskcal commented on GitHub (Aug 25, 2020):
I found that the problem may be in the function CheckUnicodeCategory of class CharHelper, from line 174. This function does not treat the character & as valid punctuation, because & may be used for HTML entities or to print Unicode characters. Markdig therefore considers &_margaret_ to be a single word, and _ has no effect inside a word. If & is replaced with other punctuation, the input is parsed properly. For example, *bob*+_margaret_ is transformed into <em>bob</em>+<em>margaret</em>.
Interestingly, the code of this function seems to come from CommonMark.NET, yet the behavior of CommonMark.NET is correct. If we don't want to change CheckUnicodeCategory, we can add a function that recognizes what & stands for, and it might be a good idea to see how CommonMark.NET implements it.
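To make the explanation above concrete, here is a minimal sketch of the CommonMark flanking test for a single _ delimiter. It is not Markdig's code: the names FlankingSketch, IsPunct, and UnderscoreCanOpen, and the ampersandIsPunctuation switch, are hypothetical and exist only to show how the outcome flips depending on whether & counts as punctuation. It assumes ASCII-only punctuation and a delimiter run of length one.

```csharp
using System;

public static class FlankingSketch
{
    // Hypothetical stand-in for a punctuation check (NOT Markdig's CheckUnicodeCategory).
    // Covers ASCII punctuation only; the flag decides how '&' is classified.
    public static bool IsPunct(char c, bool ampersandIsPunctuation)
    {
        if (c == '&') return ampersandIsPunctuation;
        return c < 128 && (char.IsPunctuation(c) || char.IsSymbol(c));
    }

    // Applies the CommonMark rule for a single '_' at text[index]:
    // it can open emphasis if it is left-flanking and either not
    // right-flanking or preceded by punctuation.
    public static bool UnderscoreCanOpen(string text, int index, bool ampersandIsPunctuation)
    {
        // Start and end of the text are treated as whitespace.
        char prev = index > 0 ? text[index - 1] : ' ';
        char next = index + 1 < text.Length ? text[index + 1] : ' ';

        bool prevWs = char.IsWhiteSpace(prev);
        bool nextWs = char.IsWhiteSpace(next);
        bool prevP = IsPunct(prev, ampersandIsPunctuation);
        bool nextP = IsPunct(next, ampersandIsPunctuation);

        bool leftFlanking = !nextWs && (!nextP || prevWs || prevP);
        bool rightFlanking = !prevWs && (!prevP || nextWs || nextP);

        return leftFlanking && (!rightFlanking || prevP);
    }

    public static void Main()
    {
        const string input = "*bob*&_margaret_";
        int i = input.IndexOf('_'); // the '_' right after '&'

        // '&' treated as punctuation: the '_' may open emphasis.
        Console.WriteLine(UnderscoreCanOpen(input, i, true));   // True
        // '&' treated as a word character: the '_' is intraword and cannot open.
        Console.WriteLine(UnderscoreCanOpen(input, i, false));  // False
    }
}
```

With & classified as a word character, the _ is both left- and right-flanking and is not preceded by punctuation, so the rule rejects it as an opener. That matches the Markdig output reported in this issue. Replacing & with + makes the check pass regardless of the flag, which is consistent with the *bob*+_margaret_ observation.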