mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-03 21:36:36 +00:00
Line breaks in convert to plain text #181
Originally created by @DDAndyChen on GitHub (Jan 16, 2018).
Using the Markdown.ToPlainText method, the following markdown
is converted to
Some line breaks are dropped, so "Smith" and "Start" run together into one word, as do "2018-02-07" and "Position".
I think the output would be better like this:
that is
@xoofx commented on GitHub (Jan 16, 2018):
Yep, good catch, likely a bug
@hemantkd commented on GitHub (Mar 18, 2018):
Hi @xoofx,
I'm interested in fixing this bug.
I have cloned the repository and am using Visual Studio 2017.
I'm getting the following Build error:
"The specified language targets for uap10.0 is missing. Ensure correct tooling is installed.\r\nMissing File: C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\MSBuild\Microsoft\WindowsXaml\v15.0\Microsoft.Windows.UI.Xaml.CSharp.targets Markdig C:\Users\XXX\.nuget\packages\msbuild.sdk.extras\1.0.9\build\netstandard1.0\MSBuildSdkExtras.Common.targets"
Does the Solution absolutely need the UWP tooling enabled to be able to Build successfully?
@xoofx commented on GitHub (Mar 18, 2018):
Yes
@hemantkd commented on GitHub (Mar 22, 2018):
Hi @xoofx,
Thanks for earlier.
After spending some time manually installing the UAP version required by the project through the Visual Studio Installer, I finally managed to build the solution successfully.
I also spent some time looking at the code and the CommonMark spec.
[Removed the test scenario from the comment as I have realised that it was incorrect]
@xavierdecoster commented on GitHub (Oct 5, 2018):
Any plans on fixing this bug? :) Or is there a work-around?
@xoofx commented on GitHub (Oct 6, 2018):
No, I don't have any spare time left, PR welcome
@xavierdecoster commented on GitHub (Oct 8, 2018):
Sorry to hear that.
It's easily reproducible using the following test (the top one for HTML is already there; I just replicated it for Markdown.ToPlainText() to reproduce the issue):
You'll notice this new test fails.
I noticed it's parsing a single ParagraphBlock, and the LineReader parses 2 lines (*1* and *2*), whilst dropping/ignoring the CRLF characters at the end (LineReader.cs, ln 55).
I've no idea why the last occurrence of the new line character is discarded when parsing to plain text, whereas it's not when parsing to HTML... Trying to spot the difference.
If you have any idea/pointers where to look at, please share :)
@xoofx commented on GitHub (Oct 8, 2018):
The problem is not in the parser but in the renderer. The HTML renderer is used today to render plain text as well. There are likely a few places in the code where we don't output a newline; the HTML output doesn't care, but the text version does. I don't like the idea of using the HTML renderer to output plain text, but for some that was the quickest solution. I don't think fixing it is much work in the codebase, but you need to dig into it (HtmlRenderer is quite simple, so it should not be difficult).
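To illustrate the point above, here is a toy model in Python. It is not Markdig's code: the `Text` and `HardBreak` classes, the `render` function and its `newline_on_break` flag are all hypothetical, and the assumption that the missing newline sits at a line-break node is a guess for illustration only. It sketches how one renderer shared between HTML and plain text can hide a missing newline in HTML mode (the tag carries the break) while fusing words in text mode.

```python
from typing import List, Union


class Text:
    """Hypothetical inline node holding a run of literal text."""

    def __init__(self, content: str) -> None:
        self.content = content


class HardBreak:
    """Hypothetical inline node marking a line break between two text runs."""


Inline = Union[Text, HardBreak]


def render(inlines: List[Inline], html: bool, newline_on_break: bool) -> str:
    """Render inline nodes with one code path for both HTML and plain text.

    newline_on_break models whether the renderer also writes a newline
    character at a break, which is the step plain-text output depends on.
    """
    out = []
    for node in inlines:
        if isinstance(node, Text):
            out.append(node.content)
        else:
            if html:
                out.append("<br />")  # the tag is only written in HTML mode
            if newline_on_break:
                out.append("\n")  # without this, text mode loses the break
    return "".join(out)


inlines = [Text("Smith"), HardBreak(), Text("Start")]

print(render(inlines, html=True, newline_on_break=False))   # Smith<br />Start
print(render(inlines, html=False, newline_on_break=False))  # SmithStart
print(repr(render(inlines, html=False, newline_on_break=True)))  # 'Smith\nStart'
```

In this sketch the HTML output displays the same way whether or not the newline is written, so a skipped newline goes unnoticed until the same path is used for plain text.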
@neilha commented on GitHub (Nov 27, 2018):
Hi @xoofx - thanks for merging the pull request. Do you have a timeline for publishing a new version to NuGet? Thanks.
@bstoked commented on GitHub (Jan 14, 2020):
Wondering if this fix has been published to nuget yet? On a project that definitely needs it. :) Thanks.
@MihaZupan commented on GitHub (Jan 14, 2020):
Last NuGet is from 3 months ago so definitely yes.