mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-11 05:44:45 +00:00
Consider Non-breaking space (unicode spaces) as regular spaces for formatting #576
Originally created by @linkdotnet on GitHub (Oct 18, 2022).
Hey,
if we have, for example, no-break spaces in ## (aka h2) elements, they don't get transformed into <h2> elements, because the library checks whether or not a "regular space" is used. A no-break space is U+00A0 (info here: https://www.compart.com/en/unicode/U+00A0).
Also, GitHub does not consider them (whereas stackedit.io does recognize them).
## This should be an h2 element
If you have a Mac you can easily reproduce this with Option+SpaceBar, which produces the non-breaking space.
My request would be to treat those characters as valid spaces so that the above example would work.
I am not sure what the Markdown specification says about those "unicode spaces".
@MihaZupan commented on GitHub (Oct 18, 2022):
I think GitHub may have stripped these characters from your example.
Do you mean if the heading is "#\u00A0# foo"? Or "## f\u00A0oo"?
For the former, that doesn't make sense as the input, just like it wouldn't make sense with regular spaces ("# # foo"). The spec doesn't mention that anything can be in between the # characters.
@linkdotnet commented on GitHub (Oct 18, 2022):
The latter example, "## f\u00A0oo".
This example "should" work:
## My h2 title.
You can copy & paste this into a website like this, which shows you non-printable characters.

@xoofx commented on GitHub (Oct 18, 2022):
I think the spec says ## must be followed by a space or a tab (not a Unicode character in the space category), so if you look at all the other CommonMark-compliant parsers here, they don't parse "## foo" correctly either.
So I would say it's constrained as per spec, and so won't fix. Thoughts @MihaZupan?
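As a rough illustration of the rule described above, here is a minimal Python sketch (not Markdig's actual implementation). The regex and the helper name `looks_like_atx_heading` are assumptions made for illustration; they encode the CommonMark requirement that the opening run of 1 to 6 `#` characters, indented by at most three spaces, is followed by a space, a tab, or the end of the line.

```python
import re

# Hypothetical helper, not part of Markdig or any CommonMark library.
# Up to 3 spaces of indentation, 1-6 '#', then a space, a tab, or end of line.
ATX_OPENING = re.compile(r"^ {0,3}#{1,6}(?:[ \t]|$)")


def looks_like_atx_heading(line: str) -> bool:
    """Return True if the line opens like an ATX heading under the rule above."""
    return ATX_OPENING.match(line) is not None


print(looks_like_atx_heading("## foo"))       # True: regular space after ##
print(looks_like_atx_heading("##\u00a0foo"))  # False: U+00A0 is not a space or tab
```

Under this reading, a no-break space after `##` leaves the line as an ordinary paragraph, which matches the behavior reported in this issue.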
@MihaZupan commented on GitHub (Oct 18, 2022):
My interpretation of the spec is the same as @xoofx's.
If the spec intends the broader interpretation of whitespace, it generally calls it out explicitly as "whitespace" or "Unicode whitespace".
Given that commonmark.js and GitHub don't support it either, I would also lean towards not changing anything in Markdig.
In any case, you can apply the fix on your side
or a more targeted
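The snippets this comment refers to are not preserved in this copy. As a hedged sketch of what such input preprocessing might look like, the Python below (not the original C# and not Markdig API) shows a blanket replacement and a more targeted one. Both helper names are hypothetical, and the targeted regex assumes that only the no-break space directly after an ATX heading marker should be normalized.

```python
import re

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Hypothetical pattern: a heading marker at the start of a line, followed by NBSP.
HEADING_NBSP = re.compile(r"^( {0,3}#{1,6})" + NBSP, re.MULTILINE)


def replace_all_nbsp(markdown: str) -> str:
    """Blanket fix: turn every no-break space into a regular space."""
    return markdown.replace(NBSP, " ")


def replace_heading_nbsp(markdown: str) -> str:
    """Targeted fix: only touch the NBSP right after a run of 1-6 '#'."""
    return HEADING_NBSP.sub(r"\1 ", markdown)


source = "##\u00a0My\u00a0title\ntext\u00a0here"
print(repr(replace_all_nbsp(source)))
print(repr(replace_heading_nbsp(source)))
```

The cleaned string would then be passed to the Markdown parser as usual. Note that the targeted version is line-based and does not know about fenced code blocks, so a matching line inside a code fence would be rewritten as well.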
@linkdotnet commented on GitHub (Oct 18, 2022):
Fair point. I guess for now we can close the issue. Thanks for your inputs!