Compare commits

...

24 Commits

Author SHA1 Message Date
Alexandre Mutel
bce4b70dc6 Merge pull request #649 from MihaZupan/commonmark-whitespace-punctuation
Align Whitespace and Punctuation definitions with CommonMark
2022-08-12 07:46:24 +02:00
Alexandre Mutel
1f71520de9 Merge pull request #650 from gfoidl/htmlhelper-TryParseHtmlTagOpenTag_remove_branches
Remove some branches in HtmlHelper.TryParseHtmlTagOpenTag by using bitmask
2022-08-12 07:45:31 +02:00
Günther Foidl
bfd7b6460c Remove some branches in HtmlHelper.TryParseHtmlTagOpenTag by using bitmasks 2022-07-21 12:11:03 +02:00
Miha Zupan
0e26ec5382 Align Whitespace and Punctuation definitions with CommonMark 2022-07-17 20:22:26 +02:00
Alexandre Mutel
5f80d86265 Merge pull request #642 from mnaoumov/issue-579
More accurate check for YAML renderers
2022-06-15 07:52:51 +02:00
Alexandre Mutel
b85cc0daf5 Merge pull request #644 from PaulVrugt/master
added support for ToPlainText for pipetables extension
2022-06-15 07:52:06 +02:00
Paul Vrugt
76073e81c0 added support for ToPlainText for pipetables extension 2022-06-08 19:47:48 +02:00
Michael Naumov
5b6621d729 Add tests to cover all possible cases of adding Yaml renderers 2022-06-07 15:06:14 -06:00
Michael Naumov
9723eda455 More accurate check for YAML renderers 2022-06-07 14:02:40 -06:00
Alexandre Mutel
7228ad5072 Merge pull request #638 from mnaoumov/issue-579
Add YamlFrontMatterRoundtripRenderer
2022-06-07 21:40:58 +02:00
Alexandre Mutel
96f55d0aa6 Merge pull request #637 from MihaZupan/perf-parse-overhead
Reduce the overhead of Parse calls
2022-06-07 21:38:08 +02:00
Alexandre Mutel
2d69ac4499 Merge pull request #624 from mattj23/docs
Additional parser documentation
2022-06-07 21:37:17 +02:00
Michael Naumov
5e91b9b763 Add YamlFrontMatterRoundtripRenderer 2022-05-17 16:46:24 -06:00
Miha Zupan
e6255de62b Avoid locking overhead in ObjectCache 2022-05-14 19:04:51 +02:00
Miha Zupan
c7b8772669 Avoid SelfPipeline search overhead 2022-05-14 19:03:31 +02:00
Alexandre Mutel
89a10ee76b Remove old benchmark graphs 2022-04-23 17:37:06 +02:00
Alexandre Mutel
7676079b4e Update doc readme with new benchmarks 2022-04-23 17:36:17 +02:00
Alexandre Mutel
495abab743 Update benchmark code and dependencies 2022-04-23 16:51:16 +02:00
Alexandre Mutel
210b39e8fb Improve RenderBase optimization with Type.GetTypeHandle (#632) 2022-04-23 16:21:09 +02:00
Alexandre Mutel
f09d030fd3 Cleanup code after #632 2022-04-23 14:05:26 +02:00
Alexandre Mutel
f2ca6be7a6 Fix name for SpecFileGen 2022-04-23 07:59:18 +02:00
Matt Jarvis
f0c200fc28 Updates to parsing documentation, inlines 2022-04-15 18:16:07 -04:00
Matt Jarvis
70184179b7 Merge remote-tracking branch 'upstream/master' into docs 2022-04-15 15:45:16 -04:00
Matt Jarvis
4169e538af Merge remote-tracking branch 'upstream/master' into docs 2022-04-02 22:22:11 -04:00
32 changed files with 673 additions and 404 deletions

View File

@@ -125,34 +125,3 @@ public class BlinkExtension : IMarkdownExtension
}
}
```
## Parsers
Markdig has two types of parsers, both of which derive from `ParserBase<TProcessor>`.
Block parsers, derived from `BlockParser`, identify block elements from lines in the source text and push them onto the abstract syntax tree. Inline parsers, derived from `InlineParser`, identify inline elements from `LeafBlock` elements and push them into an attached container.
Both inline and block parsers are regex-free, and instead work on finding opening characters and then making fast read-only views into the source text.
### Block Parser
**(The contents of this section I am very unsure of, this is from my reading of the code but I could use some guidance here)**
**(Does `CanInterrupt` specifically refer to interrupting a paragraph block?)**
In order to be added to the parsing pipeline, all block parsers must be derived from `BlockParser`.
Internally, the main parsing algorithm steps through the source text, using the `HasOpeningCharacter(char c)` method of the block parser collection to pre-identify parsers which *could* open a block at a given position in the text based on the active character. Thus any derived implementation needs to set the `char[]? OpeningCharacters` property to the initial characters that might begin the block.
If a parser can potentially open a block at a place in the source text, it should expect to have its `TryOpen(BlockProcessor processor)` method called. This is an abstract method that must be implemented by every derived class. The `BlockProcessor` argument is a reference to an object which stores the current state of parsing and the position in the source.
**(What are the rules concerning how the `BlockState` return type should work for `TryOpen`? I see examples returning `None`, `Continue`, `BreakDiscard`, `ContinueDiscard`. How does the return value change the algorithm behavior?)**
**(Should a new block always be pushed into `processor.NewBlocks` in the `TryOpen` method?)**
As the main parsing algorithm moves forward, it will then call `TryContinue(...)` on blocks that were opened in `TryOpen(...)`.
**(Is this where/how you close a block? Is there anything that needs to be done to perform that beyond `block.UpdateSpanEnd` and returning `BlockState.Break`?)**
### Inline Parser

View File

@@ -71,7 +71,7 @@ var pipeline = new MarkdownPipelineBuilder()
var document = Markdown.Parse(markdownText, pipeline);
```
## The Parser and the Pipeline
## Markdown.Parse and the MarkdownPipeline
As mentioned in the [Introduction](#introduction), Markdig's parsing machinery involves two surface components: the `Markdown.Parse(...)` method, and the `MarkdownPipeline` type. The main parsing algorithm (not to be confused with individual `BlockParser` and `InlineParser` components) lives in the `Markdown.Parse(...)` static method. The `MarkdownPipeline` is responsible for configuring the behavior of the parser.
@@ -117,6 +117,8 @@ This section discusses the pipeline builder and the concept of *extensions* in m
### Extensions (IMarkdownExtension)
***Note**: This section discusses how to consume extensions by adding them to the pipeline. For a discussion on how to implement an extension, refer to the [Extensions/Parsers](parsing-extensions.md) document.*
Extensions are the primary mechanism for modifying the parsers in the pipeline.
An extension is any class which implements the `IMarkdownExtension` interface found in [IMarkdownExtension.cs](https://github.com/xoofx/markdig/blob/master/src/Markdig/IMarkdownExtension.cs). This interface consists solely of two `Setup(...)` overloads, which both take a `MarkdownPipelineBuilder` as the first argument.
@@ -125,8 +127,6 @@ When the `MarkdownPipelineBuilder.Build()` method is invoked as the final stage
Because of this, *some* extensions may need to be ordered in relation to others, for instance if they modify a parser that gets added by a different extension. The `OrderedList<T>` class contains convenience methods to this end, which aid in finding other extensions by type and then adding an item before or after them.
For a discussion on how to implement an extension, refer to the [Extensions/Parsers](parsing-extensions.md) document.
### The MarkdownPipelineBuilder
Because `MarkdownPipeline` is a sealed class with an internal constructor, it cannot be created directly, and client code should not attempt to do so. Rather, the `MarkdownPipelineBuilder` manages the requisite construction of the pipeline after the configuration has been provided by the client code.
@@ -225,3 +225,112 @@ Internally, the fluent interface wraps manual operations on the three primary co
All three collections are `OrderedList<T>`, which is a collection type custom to Markdig which contains special methods for finding and inserting derived types. With the builder created, manual configuration can be performed by accessing these collections and their elements and modifying them as necessary.
***Warning**: it should not be necessary to modify either the `BlockParsers` or the `InlineParsers` collections directly during pipeline configuration. Rather, these can and should be modified whenever possible through the `Setup(...)` method of extensions, which is deferred until the pipeline is actually built and allows ordering so that operations dependent on other operations can be accounted for.*
## Block and Inline Parsers
Let's dive deeper into the parsing system. With a configured pipeline, the `Markdown.Parse` method will run through two conceptual passes to produce the abstract syntax tree.
1. First, `BlockProcessor.ProcessLine` is called on the file's lines, one by one, trying to identify block elements in the source
2. Next, an `InlineProcessor` is created or borrowed and run on each block to identify inline elements.
These two conceptual operations dictate Markdig's two types of parsers, both of which derive from `ParserBase<TProcessor>`.
Block parsers, derived from `BlockParser`, identify block elements from lines in the source text and push them onto the abstract syntax tree. Inline parsers, derived from `InlineParser`, identify inline elements from `LeafBlock` elements and push them into an attached container: the `ContainerInline? LeafBlock.Inline` property.
Both inline and block parsers are regex-free, and instead work on finding opening characters and then making fast read-only views into the source text.
### Block Parser
**(The contents of this section I am very unsure of, this is from my reading of the code but I could use some guidance here)**
**(Does `CanInterrupt` specifically refer to interrupting a paragraph block?)**
In order to be added to the parsing pipeline, all block parsers must be derived from `BlockParser`.
Internally, the main parsing algorithm steps through the source text, using the `HasOpeningCharacter(char c)` method of the block parser collection to pre-identify parsers which *could* open a block at a given position in the text based on the active character. Thus any derived implementation needs to set the `char[]? OpeningCharacters` property to the initial characters that might begin the block.
If a parser can potentially open a block at a place in the source text, it should expect to have its `TryOpen(BlockProcessor processor)` method called. This is an abstract method that must be implemented by every derived class. The `BlockProcessor` argument is a reference to an object which stores the current state of parsing and the position in the source.
**(What are the rules concerning how the `BlockState` return type should work for `TryOpen`? I see examples returning `None`, `Continue`, `BreakDiscard`, `ContinueDiscard`. How does the return value change the algorithm behavior?)**
**(Should a new block always be pushed into `processor.NewBlocks` in the `TryOpen` method?)**
As the main parsing algorithm moves forward, it will then call `TryContinue(...)` on blocks that were opened in `TryOpen(...)`.
**(Is this where/how you close a block? Is there anything that needs to be done to perform that beyond `block.UpdateSpanEnd` and returning `BlockState.Break`?)**
### Inline Parsers
Inline parsers extract inline markdown elements from the source, but their starting point is the text of each individual `LeafBlock` produced by the block parsing process. To understand the role of each inline parser it is necessary to first understand the inline parsing process as a whole.
#### The Inline Parsing Process
After the block parsing process has occurred, the abstract syntax tree of the document has been populated only with block elements, starting from the root `MarkdownDocument` node and ending with the individual `LeafBlock` derived block elements. Most of these will be `ParagraphBlock` instances, but they also include types such as `CodeBlock`, `HeadingBlock`, `FigureCaption`, and so on.
At this point, the parsing machinery will iterate through each `LeafBlock` one by one, assigning an empty `ContainerInline` to its `LeafBlock.Inline` property, and then sweeping through the `LeafBlock`'s text running the inline parsers. This occurs by the following process:
Starting at the first character of the text, it will run through all of its `InlineParser` objects which have that character as a possible opening character for the type of inline they extract. The `Match(...)` method will be called on each candidate parser, in order, until one of them returns `true`. Ordering is the *only* way in which conflicts between parsers are resolved, and is therefore important to the overall behavior of the parsing system.
The `Match(...)` method will be passed a slice of the text beginning at the *specific character* being processed and running until the end of the `LeafBlock`'s complete text. If the parser can create an `Inline` element it will do so and return `true`, otherwise it will return `false`. The parser will store the created `Inline` object in the `InlineProcessor.Inline` property of the processor, which is passed into the `Match(...)` method as an argument. The parser will also advance the start of the working `StringSlice` by the characters consumed in the match.
* If the parser has created an inline element and returned `true`, that element is pushed into the deepest open `ContainerInline`
* If `false` was returned, a default `LiteralInlineParser` will run instead:
  * If the `InlineProcessor.Inline` property already has an existing `LiteralInline` in it, these characters will be added to the existing `LiteralInline`, effectively growing it
  * If no `LiteralInline` exists in the `InlineProcessor.Inline` property, a new one will be created containing the consumed characters and pushed into the deepest open `ContainerInline`
After that, the working text of the `LeafBlock` has been conceptually shortened by the advancing start of the working `StringSlice`, moving the starting character forward. If there is still text remaining, the process repeats from the new starting character until all of the text is consumed.
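The sweep described above can be modeled with a small, self-contained sketch. This is not Markdig's code: `ToyParser`, `ToyMatch`, and `ToyInlineLoop` are hypothetical names, inline elements are represented as plain strings, and the literal fallback is folded into the loop instead of being a parser of its own. It only illustrates the dispatch by opening character, the ordered `Match` attempts, and the growing literal run.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for InlineParser.Match: returns the number of characters
// consumed and the produced element, or 0 when nothing matches at `start`.
public delegate int ToyMatch(string text, int start, out string element);

// Hypothetical stand-in for an inline parser with a single opening character.
public sealed class ToyParser
{
    public ToyParser(char opening, ToyMatch match) { Opening = opening; Match = match; }
    public char Opening { get; }
    public ToyMatch Match { get; }
}

public static class ToyInlineLoop
{
    public static List<string> Run(string text, IReadOnlyList<ToyParser> parsers)
    {
        var output = new List<string>();
        var literal = new StringBuilder(); // pending run of unmatched characters
        int pos = 0;
        while (pos < text.Length)
        {
            bool matched = false;
            foreach (var parser in parsers) // list order is the only conflict resolution
            {
                if (parser.Opening != text[pos]) continue;
                int consumed = parser.Match(text, pos, out string element);
                if (consumed <= 0) continue; // no match: try the next candidate
                Flush(literal, output);
                output.Add(element);
                pos += consumed; // advance past the matched characters
                matched = true;
                break;
            }
            if (!matched)
            {
                literal.Append(text[pos]); // grow the current literal run
                pos++;
            }
        }
        Flush(literal, output);
        return output;
    }

    // Example parser: a backtick-delimited code span.
    public static int CodeSpan(string text, int start, out string element)
    {
        int end = text.IndexOf('`', start + 1);
        if (end < 0) { element = ""; return 0; }
        element = "code:" + text.Substring(start + 1, end - start - 1);
        return end - start + 1;
    }

    private static void Flush(StringBuilder literal, List<string> output)
    {
        if (literal.Length == 0) return;
        output.Add("lit:" + literal.ToString());
        literal.Clear();
    }
}
```

Calling `ToyInlineLoop.Run` with a single `ToyParser('`', ToyInlineLoop.CodeSpan)` on a line containing one code span yields a literal element, a code element, and a trailing literal element, in that order.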
At this point, when all of the source text from the `LeafBlock` has been consumed, a post-processing step occurs. `InlineParser` objects in the pipeline which also implement `IPostInlineProcessor` are invoked on the `LeafBlock`'s root `ContainerInline`. This, for example, is the mechanism by which the unstructured output of the `EmphasisInlineParser` is then restructured into cleanly nested `EmphasisInline` and `LiteralInline` elements.
#### Responsibilities of an Inline Parser
Like the block parsers, an inline parser must provide an array of opening characters with the `char[]? OpeningCharacters` property.
However, inline parsers only require one other method, the `Match(InlineProcessor processor, ref StringSlice slice)` method, which is expected to determine if a match for the related inline is located at the starting character of the slice.
Within the `Match` method a parser should:
1. Determine if a match begins at the starting character of the `slice` argument
2. If no match exists, the method should return `false` and not advance the `Start` property of the `slice` argument
3. If a match does exist, perform the following actions:
   * Instantiate the appropriate `Inline` derived class and assign it to the processor argument with `processor.Inline = myInlineObject`
   * Advance the `Start` property of the `slice` argument by the number of characters contained in the match, for example by using the `NextChar()`, `SkipChar()`, or other helper methods of the `StringSlice` class
   * Return `true`
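As a minimal sketch of the steps above, the following parser recognizes `@name` mentions. `MentionInline`, `MentionInlineParser`, and `MentionDemo` are hypothetical types invented for this illustration and are not part of Markdig; the sketch assumes the Markdig package is referenced and does not set source spans.

```csharp
using Markdig;
using Markdig.Helpers;
using Markdig.Parsers;
using Markdig.Syntax;
using Markdig.Syntax.Inlines;

// Hypothetical inline element produced for "@name".
public sealed class MentionInline : LeafInline
{
    public string Name { get; set; } = "";
}

// Hypothetical parser: '@' followed by one or more letters or digits.
public sealed class MentionInlineParser : InlineParser
{
    public MentionInlineParser()
    {
        OpeningCharacters = new[] { '@' };
    }

    public override bool Match(InlineProcessor processor, ref StringSlice slice)
    {
        // StringSlice is a struct, so this copy lets us scan ahead
        // without moving the caller's slice if the match fails.
        var probe = slice;
        char c = probe.NextChar(); // step past '@'
        int nameStart = probe.Start;
        while (char.IsLetterOrDigit(c))
        {
            c = probe.NextChar();
        }

        int length = probe.Start - nameStart;
        if (length == 0)
        {
            return false; // no match: slice.Start is unchanged
        }

        // Hand the new inline to the processor and consume the matched characters.
        processor.Inline = new MentionInline { Name = probe.Text.Substring(nameStart, length) };
        slice = probe;
        return true;
    }
}

public static class MentionDemo
{
    // Returns the name of the first mention in the text, or null if there is none.
    public static string? FirstMention(string markdown)
    {
        var builder = new MarkdownPipelineBuilder();
        // Registered directly to keep the sketch short; an extension's Setup(...)
        // method is the preferred place for this.
        builder.InlineParsers.Insert(0, new MentionInlineParser());
        var document = Markdown.Parse(markdown, builder.Build());
        foreach (var mention in document.Descendants<MentionInline>())
        {
            return mention.Name;
        }
        return null;
    }
}
```

Scanning on a copy of the slice is one way to satisfy step 2; a parser that advances `slice` directly has to restore `Start` itself before returning `false`.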
While parsing, the `InlineProcessor` performing the processing, which is available to the `Match` function through the `processor` argument, contains a number of properties which can be used to access the current state of parsing. For example, the `processor.Inline` property is the mechanism for returning a new inline element, but before assignment it contains the last created inline, which in turn can be accessed for its parents.
Additionally, in the case of inlines which can be expected to contain other inlines, a possible strategy is to inject an inline element derived from `DelimiterInline` when the opening delimiter is detected, then to replace the opening delimiter with the final desired element when the closing delimiter is found. This is the strategy used by the `LinkInlineParser`, for example. In such cases the tools described in the next section, such as the `ReplaceBy` method, can be used. Note that if this method is used the post-processing should be invoked on the `InlineProcessor` in order to finalize any emphasis elements. For example, in the following code adapted from the `LinkInlineParser`:
```csharp
var parent = processor.Inline?.FirstParentOfType<MyDelimiterInline>();
if (parent is null) return;
var myInline = new MySpecialInline { /* set span and other parameters here */ };
// Replace the delimiter inline with the final inline type, adopting all of its children
parent.ReplaceBy(myInline);
// Notifies processor as we are creating an inline locally
processor.Inline = myInline;
// Process emphasis delimiters
processor.PostProcessInlines(0, myInline, null, false);
```
#### Inline Post-Processing
The purpose of post-processing inlines is typically to re-structure inline elements after the initial parsing is complete and the entire structure of the inline elements within a parent container is now available in a way it was not during the parsing process. Generally this consists of removing, replacing, and re-ordering `Inline` elements.
To this end, the `Inline` abstract base class contains several helper methods intended to allow manipulation of inline elements during the post-processing phase.
|Method|Purpose|
|-|-|
|`InsertAfter(...)`|Takes a new inline as an argument and inserts it into the same parent container after this instance|
|`InsertBefore(...)`|Takes a new inline as an argument and inserts it into the same parent container before this instance|
|`Remove()`|Removes this inline from its parent container|
|`ReplaceBy(...)`|Removes this instance and replaces it with a new inline specified in the argument. Has an option to move all of the original inline's children into the new inline.|
Additionally, the `PreviousSibling` and `NextSibling` properties can be used to determine the siblings of an inline element within its parent container. The `FirstParentOfType<T>()` method can be used to search for a parent element, which is often useful when searching for `DelimiterInline` derived elements, which are implemented as containers.
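The helper methods in the table can be tried outside of a parser by building a small inline tree by hand. The sketch below assumes the Markdig package is referenced; `InlineSurgeryDemo` is a hypothetical name, and the tree it builds is not something a real parse would produce.

```csharp
using Markdig.Syntax.Inlines;

public static class InlineSurgeryDemo
{
    public static ContainerInline Build()
    {
        var container = new ContainerInline();

        var first = new LiteralInline("first");
        container.AppendChild(first);

        // Insert a sibling after an inline that already has a parent.
        var third = new LiteralInline("third");
        first.InsertAfter(third);

        // Insert a sibling in front of an existing one.
        var second = new LiteralInline("second");
        third.InsertBefore(second);

        // Swap one inline for another; "second" is a leaf, so there are no children to move.
        var emphasis = new EmphasisInline();
        second.ReplaceBy(emphasis);

        // Detach an inline from its parent container.
        first.Remove();

        // The container now holds: emphasis, third.
        return container;
    }
}
```

After `Build()` returns, `FirstChild` is the `EmphasisInline`, its `NextSibling` is the "third" literal, and that literal's `PreviousSibling` points back at the emphasis element.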

Binary file not shown. (Before: 9.1 KiB)

Binary file not shown. (Before: 8.6 KiB)

View File

@@ -112,82 +112,34 @@ This software is released under the [BSD-Clause 2 license](https://github.com/lu
## Benchmarking
This is an early preview of the benchmarking against various implementations:
The latest benchmark was collected on April 23 2022, against the following implementations:
**C implementations**:
- [cmark](https://github.com/jgm/cmark) (version: 0.25.0): Reference C implementation of CommonMark, no support for extensions
- [Moonshine](https://github.com/brandonc/moonshine) (version: : popular C Markdown processor
**.NET implementations**:
- [Markdig](https://github.com/lunet-io/markdig) (version: 0.5.x): itself
- [CommonMark.NET(master)](https://github.com/Knagis/CommonMark.NET) (version: 0.11.0): CommonMark implementation for .NET, no support for extensions, port of cmark
- [CommonMark.NET(pipe_tables)](https://github.com/AMDL/CommonMark.NET/tree/pipe-tables): An evolution of CommonMark.NET, supports extensions, not released yet
- [MarkdownDeep](https://github.com/toptensoftware/markdowndeep) (version: 1.5.0): another .NET implementation
- [MarkdownSharp](https://github.com/Kiri-rin/markdownsharp) (version: 1.13.0): Open source C# implementation of Markdown processor, as featured on Stack Overflow, regexp based.
- [Marked.NET](https://github.com/T-Alex/MarkedNet) (version: 1.0.5) port of original [marked.js](https://github.com/chjj/marked) project
- [Microsoft.DocAsCode.MarkdownLite](https://github.com/dotnet/docfx/tree/dev/src/Microsoft.DocAsCode.MarkdownLite) (version: 2.0.1) used by the [docfx](https://github.com/dotnet/docfx) project
### Analysis of the results:
- Markdig is roughly **x100 times faster than MarkdownSharp**, **30x times faster than docfx**
- **Among the best in CPU**, Extremely competitive and often faster than other implementations (not feature wise equivalent)
- **15% to 30% less allocations** and GC pressure
Because Marked.NET, MarkdownSharp and DocAsCode.MarkdownLite are way too slow, they are not included in the following charts:
![BenchMark CPU Time](img/BenchmarkCPU.png)
![BenchMark Memory](img/BenchmarkMemory.png)
### Performance for x86:
- [Markdig](https://github.com/lunet-io/markdig) (version: 0.30.2): itself
- [cmark](https://github.com/commonmark/cmark) (version: 0.30.2): Reference C implementation of CommonMark, no support for extensions
- [CommonMark.NET(master)](https://github.com/Knagis/CommonMark.NET) (version: 0.15.1): CommonMark implementation for .NET, no support for extensions, port of cmark, deprecated.
- [MarkdownSharp](https://github.com/Kiri-rin/markdownsharp) (version: 2.0.5): Open source C# implementation of Markdown processor, as featured previously on Stack Overflow, regexp based.
```
BenchmarkDotNet-Dev=v0.9.7.0+
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4770 CPU 3.40GHz, ProcessorCount=8
Frequency=3319351 ticks, Resolution=301.2637 ns, Timer=TSC
HostCLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
JitModules=clrjit-v4.6.1080.0
// * Summary *
Type=Program Mode=SingleRun LaunchCount=2
WarmupCount=2 TargetCount=10
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK=6.0.202
[Host] : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT
DefaultJob : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT
Method | Median | StdDev |Scaled | Gen 0 | Gen 1| Gen 2|Bytes Allocated/Op |
--------------------------- |------------ |---------- |------ | ------ |------|---------|------------------ |
Markdig | 5.5316 ms | 0.0372 ms | 0.71 | 56.00| 21.00| 49.00| 1,285,917.31 |
CommonMark.NET(master) | 4.7035 ms | 0.0422 ms | 0.60 | 113.00| 7.00| 49.00| 1,502,404.60 |
CommonMark.NET(pipe_tables) | 5.6164 ms | 0.0298 ms | 0.72 | 111.00| 56.00| 49.00| 1,863,128.13 |
MarkdownDeep | 7.8193 ms | 0.0334 ms | 1.00 | 120.00| 56.00| 49.00| 1,884,854.85 |
cmark | 4.2698 ms | 0.1526 ms | 0.55 | -| -| -| NA |
Moonshine | 6.0929 ms | 0.1053 ms | 1.28 | -| -| -| NA |
Marked.NET | 207.3169 ms | 5.2628 ms | 26.51 | 0.00| 0.00| 0.00| 303,125,228.65 |
MarkdownSharp | 675.0185 ms | 2.8447 ms | 86.32 | 40.00| 27.00| 41.00| 2,413,394.17 |
Microsoft DocfxMarkdownLite | 166.3357 ms | 0.4529 ms | 21.27 |4,452.00|948.00|11,167.00| 180,218,359.60 |
| Method | Mean | Error | StdDev |
|------------------ |-----------:|----------:|----------:|
| markdig | 1.979 ms | 0.0221 ms | 0.0185 ms |
| cmark | 2.571 ms | 0.0081 ms | 0.0076 ms |
| CommonMark.NET | 2.016 ms | 0.0169 ms | 0.0158 ms |
| MarkdownSharp | 221.455 ms | 1.4442 ms | 1.3509 ms |
```
### Performance for x64:
- Markdig is roughly **100 times faster than MarkdownSharp**
- **20% faster than the reference cmark C implementation**
```
BenchmarkDotNet-Dev=v0.9.6.0+
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, ProcessorCount=8
Frequency=3319351 ticks, Resolution=301.2637 ns, Timer=TSC
HostCLR=MS.NET 4.0.30319.42000, Arch=64-bit RELEASE [RyuJIT]
JitModules=clrjit-v4.6.1080.0
Type=Program Mode=SingleRun LaunchCount=2
WarmupCount=2 TargetCount=10
Method | Median | StdDev | Gen 0 | Gen 1 | Gen 2 | Bytes Allocated/Op |
--------------------- |---------- |---------- |------- |------- |------ |------------------- |
TestMarkdig | 5.5276 ms | 0.0402 ms | 109.00 | 96.00 | 84.00 | 1,537,027.66 |
TestCommonMarkNet | 4.4661 ms | 0.1190 ms | 157.00 | 96.00 | 84.00 | 1,747,432.06 |
TestCommonMarkNetNew | 5.3151 ms | 0.0815 ms | 229.00 | 168.00 | 84.00 | 2,323,922.97 |
TestMarkdownDeep | 7.4076 ms | 0.0617 ms | 318.00 | 186.00 | 84.00 | 2,576,728.69 |
```
## Donate

View File

@@ -8,29 +8,11 @@
<ItemGroup>
<None Remove="spec.md" />
</ItemGroup>
<ItemGroup>
<Reference Include="CommonMarkNew, Version=0.1.0.0, Culture=neutral, PublicKeyToken=001ef8810438905d, processorArchitecture=MSIL">
<HintPath>lib\CommonMarkNew.dll</HintPath>
<SpecificVersion>False</SpecificVersion>
<Aliases>newcmark</Aliases>
<Private>True</Private>
</Reference>
<Reference Include="MoonShine">
<HintPath>lib\MoonShine.dll</HintPath>
</Reference>
<Reference Include="MarkdownDeep">
<HintPath>lib\MarkdownDeep.dll</HintPath>
</Reference>
</ItemGroup>
<ItemGroup>
<Content Include="cmark.dll">
<HintPath>cmark.dll</HintPath>
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</Content>
<Content Include="libsundown.dll">
<HintPath>libsundown.dll</HintPath>
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</Content>
<Content Include="spec.md">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</Content>
@@ -39,9 +21,9 @@
<PackageReference Include="BenchmarkDotNet" Version="0.13.1" />
<PackageReference Include="BenchmarkDotNet.Diagnostics.Windows" Version="0.13.1" />
<PackageReference Include="CommonMark.NET" Version="0.15.1" />
<PackageReference Include="Markdown" Version="2.2.1" />
<PackageReference Include="MarkdownSharp" Version="2.0.5" />
<PackageReference Include="Microsoft.Diagnostics.Runtime" Version="2.0.226801" />
<PackageReference Include="Microsoft.Diagnostics.Tracing.TraceEvent" Version="2.0.74" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\Markdig\Markdig.csproj" />

View File

@@ -1,9 +1,7 @@
// Copyright (c) Alexandre Mutel. All rights reserved.
// Copyright (c) Alexandre Mutel. All rights reserved.
// This file is licensed under the BSD-Clause 2 license.
// See the license.txt file in the project root for more information.
extern alias newcmark;
using System;
using System.Diagnostics;
using System.IO;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
@@ -25,7 +23,7 @@ namespace Testamina.Markdig.Benchmarks
}
//[Benchmark(Description = "TestMarkdig", OperationsPerInvoke = 4096)]
[Benchmark]
[Benchmark(Description = "markdig")]
public void TestMarkdig()
{
//var reader = new StreamReader(File.Open("spec.md", FileMode.Open));
@@ -33,7 +31,7 @@ namespace Testamina.Markdig.Benchmarks
//File.WriteAllText("spec.html", writer.ToString());
}
[Benchmark]
[Benchmark(Description = "cmark")]
public void TestCommonMarkCpp()
{
//var reader = new StreamReader(File.Open("spec.md", FileMode.Open));
@@ -41,7 +39,7 @@ namespace Testamina.Markdig.Benchmarks
//File.WriteAllText("spec.html", writer.ToString());
}
[Benchmark]
[Benchmark(Description = "CommonMark.NET")]
public void TestCommonMarkNet()
{
////var reader = new StreamReader(File.Open("spec.md", FileMode.Open));
@@ -55,93 +53,25 @@ namespace Testamina.Markdig.Benchmarks
//writer.ToString();
}
[Benchmark]
public void TestCommonMarkNetNew()
{
////var reader = new StreamReader(File.Open("spec.md", FileMode.Open));
// var reader = new StringReader(text);
//CommonMark.CommonMarkConverter.Parse(reader);
//CommonMark.CommonMarkConverter.Parse(reader);
//reader.Dispose();
//var writer = new StringWriter();
newcmark::CommonMark.CommonMarkConverter.Convert(text);
//writer.Flush();
//writer.ToString();
}
[Benchmark]
public void TestMarkdownDeep()
{
new MarkdownDeep.Markdown().Transform(text);
}
[Benchmark]
[Benchmark(Description = "MarkdownSharp")]
public void TestMarkdownSharp()
{
new MarkdownSharp.Markdown().Transform(text);
}
[Benchmark]
public void TestMoonshine()
{
Sundown.MoonShine.Markdownify(text);
}
static void Main(string[] args)
{
bool markdig = args.Length == 0;
bool simpleBench = false;
var config = ManualConfig.Create(DefaultConfig.Instance);
//var gcDiagnoser = new MemoryDiagnoser();
//config.Add(new Job { Mode = Mode.SingleRun, LaunchCount = 2, WarmupCount = 2, IterationTime = 1024, TargetCount = 10 });
//config.Add(new Job { Mode = Mode.Throughput, LaunchCount = 2, WarmupCount = 2, TargetCount = 10 });
//config.Add(gcDiagnoser);
if (simpleBench)
{
var clock = Stopwatch.StartNew();
var program = new Program();
GC.Collect(2, GCCollectionMode.Forced, true);
var gc0 = GC.CollectionCount(0);
var gc1 = GC.CollectionCount(1);
var gc2 = GC.CollectionCount(2);
const int count = 12*64;
for (int i = 0; i < count; i++)
{
if (markdig)
{
program.TestMarkdig();
}
else
{
program.TestCommonMarkNetNew();
}
}
clock.Stop();
Console.WriteLine((markdig ? "MarkDig" : "CommonMark") + $" => time: {(double)clock.ElapsedMilliseconds/count}ms (total {clock.ElapsedMilliseconds}ms)");
DumpGC(gc0, gc1, gc2);
}
else
{
//new TestMatchPerf().TestMatch();
var config = ManualConfig.Create(DefaultConfig.Instance);
//var gcDiagnoser = new MemoryDiagnoser();
//config.Add(new Job { Mode = Mode.SingleRun, LaunchCount = 2, WarmupCount = 2, IterationTime = 1024, TargetCount = 10 });
//config.Add(new Job { Mode = Mode.Throughput, LaunchCount = 2, WarmupCount = 2, TargetCount = 10 });
//config.Add(gcDiagnoser);
//var config = DefaultConfig.Instance;
BenchmarkRunner.Run<Program>(config);
//BenchmarkRunner.Run<TestDictionary>(config);
//BenchmarkRunner.Run<TestMatchPerf>();
//BenchmarkRunner.Run<TestStringPerf>();
}
}
private static void DumpGC(int gc0, int gc1, int gc2)
{
Console.WriteLine($"gc0: {GC.CollectionCount(0)-gc0}");
Console.WriteLine($"gc1: {GC.CollectionCount(1)-gc1}");
Console.WriteLine($"gc2: {GC.CollectionCount(2)-gc2}");
//var config = DefaultConfig.Instance;
BenchmarkRunner.Run<Program>(config);
//BenchmarkRunner.Run<TestDictionary>(config);
//BenchmarkRunner.Run<TestMatchPerf>();
//BenchmarkRunner.Run<TestStringPerf>();
}
}
}

Binary file not shown.

View File

@@ -0,0 +1,25 @@
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Markdig.Renderers.Roundtrip;
using Markdig.Syntax;
using NUnit.Framework;
using static Markdig.Tests.TestRoundtrip;
namespace Markdig.Tests.RoundtripSpecs
{
[TestFixture]
public class TestYamlFrontMatterBlock
{
[TestCase("---\nkey1: value1\nkey2: value2\n---\n\nContent\n")]
[TestCase("No front matter")]
[TestCase("Looks like front matter but actually is not\n---\nkey1: value1\nkey2: value2\n---")]
public void FrontMatterBlockIsPreserved(string value)
{
RoundTrip(value);
}
}
}

View File

@@ -0,0 +1,92 @@
using System.Collections.Generic;
using System.Globalization;
using Markdig.Helpers;
using NUnit.Framework;
namespace Markdig.Tests
{
public class TestCharHelper
{
// An ASCII punctuation character is
// !, ", #, $, %, &, ', (, ), *, +, ,, -, ., / (U+0021–2F),
// :, ;, <, =, >, ?, @ (U+003A–0040),
// [, \, ], ^, _, ` (U+005B–0060),
// {, |, }, or ~ (U+007B–007E).
private static readonly HashSet<char> s_asciiPunctuation = new()
{
'!', '"', '#', '$', '%', '&', '\'', '(', ')', '*', '+', ',', '-', '.', '/',
':', ';', '<', '=', '>', '?', '@',
'[', '\\', ']', '^', '_', '`',
'{', '|', '}', '~'
};
// A Unicode punctuation character is an ASCII punctuation character or anything in the general Unicode categories
// Pc, Pd, Pe, Pf, Pi, Po, or Ps.
private static readonly HashSet<UnicodeCategory> s_punctuationCategories = new()
{
UnicodeCategory.ConnectorPunctuation,
UnicodeCategory.DashPunctuation,
UnicodeCategory.ClosePunctuation,
UnicodeCategory.FinalQuotePunctuation,
UnicodeCategory.InitialQuotePunctuation,
UnicodeCategory.OtherPunctuation,
UnicodeCategory.OpenPunctuation
};
private static bool ExpectedIsPunctuation(char c)
{
return c <= 127
? s_asciiPunctuation.Contains(c)
: s_punctuationCategories.Contains(CharUnicodeInfo.GetUnicodeCategory(c));
}
private static bool ExpectedIsWhitespace(char c)
{
// A Unicode whitespace character is any code point in the Unicode Zs general category,
// or a tab (U+0009), line feed (U+000A), form feed (U+000C), or carriage return (U+000D).
return c == '\t' || c == '\n' || c == '\u000C' || c == '\r' ||
CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.SpaceSeparator;
}
[Test]
public void IsWhitespace()
{
for (int i = char.MinValue; i <= char.MaxValue; i++)
{
char c = (char)i;
Assert.AreEqual(ExpectedIsWhitespace(c), CharHelper.IsWhitespace(c));
}
}
[Test]
public void CheckUnicodeCategory()
{
for (int i = char.MinValue; i <= char.MaxValue; i++)
{
char c = (char)i;
bool expectedSpace = c == 0 || ExpectedIsWhitespace(c);
bool expectedPunctuation = c == 0 || ExpectedIsPunctuation(c);
CharHelper.CheckUnicodeCategory(c, out bool spaceActual, out bool punctuationActual);
Assert.AreEqual(expectedSpace, spaceActual);
Assert.AreEqual(expectedPunctuation, punctuationActual);
}
}
[Test]
public void IsSpaceOrPunctuation()
{
for (int i = char.MinValue; i <= char.MaxValue; i++)
{
char c = (char)i;
bool expected = c == 0 || ExpectedIsWhitespace(c) || ExpectedIsPunctuation(c);
Assert.AreEqual(expected, CharHelper.IsSpaceOrPunctuation(c));
}
}
}
}

View File

@@ -10,9 +10,7 @@ namespace Markdig.Tests
{
[TestCase("| S | T |\r\n|---|---| \r\n| G | H |")]
[TestCase("| S | T |\r\n|---|---|\t\r\n| G | H |")]
[TestCase("| S | T |\r\n|---|---|\v\r\n| G | H |")]
[TestCase("| S | T |\r\n|---|---|\f\r\n| G | H |")]
[TestCase("| S | T |\r\n|---|---|\f\v\t \r\n| G | H |")]
[TestCase("| S | \r\n|---|\r\n| G |\r\n\r\n| D | D |\r\n| ---| ---| \r\n| V | V |", 2)]
public void TestTableBug(string markdown, int tableCount = 1)
{

View File

@@ -15,7 +15,7 @@ namespace Markdig.Tests
[TestCase(/* markdownText: */ "# foo\nbar", /* expected: */ "foo\nbar\n")]
[TestCase(/* markdownText: */ "> foo", /* expected: */ "foo\n")]
[TestCase(/* markdownText: */ "> foo\nbar\n> baz", /* expected: */ "foo\nbar\nbaz\n")]
[TestCase(/* markdownText: */ "`foo`", /* expected: */ "foo\n")]
[TestCase(/* markdownText: */ "`foo`", /* expected: */ "foo\n")]
[TestCase(/* markdownText: */ "`foo\nbar`", /* expected: */ "foo bar\n")] // new line within codespan is treated as whitespace (Example317)
[TestCase(/* markdownText: */ "```\nfoo bar\n```", /* expected: */ "foo bar\n")]
[TestCase(/* markdownText: */ "- foo\n- bar\n- baz", /* expected: */ "foo\nbar\nbaz\n")]
@@ -23,7 +23,7 @@ namespace Markdig.Tests
[TestCase(/* markdownText: */ "- foo&lt;baz", /* expected: */ "foo<baz\n")]
[TestCase(/* markdownText: */ "## foo `bar::baz >`", /* expected: */ "foo bar::baz >\n")]
public void TestPlainEnsureNewLine(string markdownText, string expected)
{
{
var actual = Markdown.ToPlainText(markdownText);
Assert.AreEqual(expected, actual);
}
@@ -31,6 +31,7 @@ namespace Markdig.Tests
[Test]
[TestCase(/* markdownText: */ ":::\nfoo\n:::", /* expected: */ "foo\n", /*extensions*/ "customcontainers|advanced")]
[TestCase(/* markdownText: */ ":::bar\nfoo\n:::", /* expected: */ "foo\n", /*extensions*/ "customcontainers+attributes|advanced")]
[TestCase(/* markdownText: */ "| Header1 | Header2 | Header3 |\n|--|--|--|\nt**es**t|value2|value3", /* expected: */ "Header1 Header2 Header3 test value2 value3","pipetables")]
public void TestPlainWithExtensions(string markdownText, string expected, string extensions)
{
TestParser.TestSpec(markdownText, expected, extensions, plainText: true);

View File

@@ -16,10 +16,12 @@ namespace Markdig.Tests
{
var pipelineBuilder = new MarkdownPipelineBuilder();
pipelineBuilder.EnableTrackTrivia();
pipelineBuilder.UseYamlFrontMatter();
MarkdownPipeline pipeline = pipelineBuilder.Build();
MarkdownDocument markdownDocument = Markdown.Parse(markdown, pipeline);
var sw = new StringWriter();
var nr = new RoundtripRenderer(sw);
pipeline.Setup(nr);
nr.Write(markdownDocument);

View File

@@ -0,0 +1,78 @@
using System;
using System.Collections.Generic;
using System.Linq;
using Markdig.Extensions.Yaml;
using Markdig.Renderers;
using Markdig.Syntax;
using NUnit.Framework;
namespace Markdig.Tests
{
public class TestYamlFrontMatterExtension
{
[TestCaseSource(nameof(TestCases))]
public void ProperYamlFrontMatterRenderersAdded(IMarkdownObjectRenderer[] objectRenderers, bool hasYamlFrontMatterHtmlRenderer, bool hasYamlFrontMatterRoundtripRenderer)
{
var builder = new MarkdownPipelineBuilder();
builder.Extensions.Add(new YamlFrontMatterExtension());
var markdownRenderer = new DummyRenderer();
markdownRenderer.ObjectRenderers.AddRange(objectRenderers);
builder.Build().Setup(markdownRenderer);
Assert.That(markdownRenderer.ObjectRenderers.Contains<YamlFrontMatterHtmlRenderer>(), Is.EqualTo(hasYamlFrontMatterHtmlRenderer));
Assert.That(markdownRenderer.ObjectRenderers.Contains<YamlFrontMatterRoundtripRenderer>(), Is.EqualTo(hasYamlFrontMatterRoundtripRenderer));
}
private static IEnumerable<TestCaseData> TestCases()
{
yield return new TestCaseData(new IMarkdownObjectRenderer[]
{
}, false, false) {TestName = "No ObjectRenderers"};
yield return new TestCaseData(new IMarkdownObjectRenderer[]
{
new Markdig.Renderers.Html.CodeBlockRenderer()
}, true, false) {TestName = "Html CodeBlock"};
yield return new TestCaseData(new IMarkdownObjectRenderer[]
{
new Markdig.Renderers.Roundtrip.CodeBlockRenderer()
}, false, true) {TestName = "Roundtrip CodeBlock"};
yield return new TestCaseData(new IMarkdownObjectRenderer[]
{
new Markdig.Renderers.Html.CodeBlockRenderer(),
new Markdig.Renderers.Roundtrip.CodeBlockRenderer()
}, true, true) {TestName = "Html/Roundtrip CodeBlock"};
yield return new TestCaseData(new IMarkdownObjectRenderer[]
{
new Markdig.Renderers.Html.CodeBlockRenderer(),
new Markdig.Renderers.Roundtrip.CodeBlockRenderer(),
new YamlFrontMatterHtmlRenderer()
}, true, true) {TestName = "Html/Roundtrip CodeBlock, Yaml Html"};
yield return new TestCaseData(new IMarkdownObjectRenderer[]
{
new Markdig.Renderers.Html.CodeBlockRenderer(),
new Markdig.Renderers.Roundtrip.CodeBlockRenderer(),
new YamlFrontMatterRoundtripRenderer()
}, true, true) { TestName = "Html/Roundtrip CodeBlock, Yaml Roundtrip" };
}
private class DummyRenderer : IMarkdownRenderer
{
public DummyRenderer()
{
ObjectRenderers = new ObjectRendererCollection();
}
public event Action<IMarkdownRenderer, MarkdownObject> ObjectWriteBefore;
public event Action<IMarkdownRenderer, MarkdownObject> ObjectWriteAfter;
public ObjectRendererCollection ObjectRenderers { get; }
public object Render(MarkdownObject markdownObject)
{
return null;
}
}
}
}

View File

@@ -17,122 +17,154 @@ namespace Markdig.Extensions.Tables
{
protected override void Write(HtmlRenderer renderer, Table table)
{
renderer.EnsureLine();
renderer.Write("<table").WriteAttributes(table).WriteLine('>');
bool hasBody = false;
bool hasAlreadyHeader = false;
bool isHeaderOpen = false;
bool hasColumnWidth = false;
foreach (var tableColumnDefinition in table.ColumnDefinitions)
if (renderer.EnableHtmlForBlock)
{
if (tableColumnDefinition.Width != 0.0f && tableColumnDefinition.Width != 1.0f)
{
hasColumnWidth = true;
break;
}
}
renderer.EnsureLine();
renderer.Write("<table").WriteAttributes(table).WriteLine('>');
if (hasColumnWidth)
{
bool hasBody = false;
bool hasAlreadyHeader = false;
bool isHeaderOpen = false;
bool hasColumnWidth = false;
foreach (var tableColumnDefinition in table.ColumnDefinitions)
{
var width = Math.Round(tableColumnDefinition.Width*100)/100;
var widthValue = string.Format(CultureInfo.InvariantCulture, "{0:0.##}", width);
renderer.WriteLine($"<col style=\"width:{widthValue}%\" />");
if (tableColumnDefinition.Width != 0.0f && tableColumnDefinition.Width != 1.0f)
{
hasColumnWidth = true;
break;
}
}
}
foreach (var rowObj in table)
{
var row = (TableRow)rowObj;
if (row.IsHeader)
if (hasColumnWidth)
{
// Allow a single thead
if (!hasAlreadyHeader)
foreach (var tableColumnDefinition in table.ColumnDefinitions)
{
renderer.WriteLine("<thead>");
isHeaderOpen = true;
var width = Math.Round(tableColumnDefinition.Width * 100) / 100;
var widthValue = string.Format(CultureInfo.InvariantCulture, "{0:0.##}", width);
renderer.WriteLine($"<col style=\"width:{widthValue}%\" />");
}
hasAlreadyHeader = true;
}
else if (!hasBody)
{
if (isHeaderOpen)
{
renderer.WriteLine("</thead>");
isHeaderOpen = false;
}
renderer.WriteLine("<tbody>");
hasBody = true;
}
renderer.Write("<tr").WriteAttributes(row).WriteLine('>');
for (int i = 0; i < row.Count; i++)
foreach (var rowObj in table)
{
var cellObj = row[i];
var cell = (TableCell)cellObj;
renderer.EnsureLine();
renderer.Write(row.IsHeader ? "<th" : "<td");
if (cell.ColumnSpan != 1)
var row = (TableRow)rowObj;
if (row.IsHeader)
{
renderer.Write($" colspan=\"{cell.ColumnSpan}\"");
}
if (cell.RowSpan != 1)
{
renderer.Write($" rowspan=\"{cell.RowSpan}\"");
}
if (table.ColumnDefinitions.Count > 0)
{
var columnIndex = cell.ColumnIndex < 0 || cell.ColumnIndex >= table.ColumnDefinitions.Count
? i
: cell.ColumnIndex;
columnIndex = columnIndex >= table.ColumnDefinitions.Count ? table.ColumnDefinitions.Count - 1 : columnIndex;
var alignment = table.ColumnDefinitions[columnIndex].Alignment;
if (alignment.HasValue)
// Allow a single thead
if (!hasAlreadyHeader)
{
switch (alignment)
renderer.WriteLine("<thead>");
isHeaderOpen = true;
}
hasAlreadyHeader = true;
}
else if (!hasBody)
{
if (isHeaderOpen)
{
renderer.WriteLine("</thead>");
isHeaderOpen = false;
}
renderer.WriteLine("<tbody>");
hasBody = true;
}
renderer.Write("<tr").WriteAttributes(row).WriteLine('>');
for (int i = 0; i < row.Count; i++)
{
var cellObj = row[i];
var cell = (TableCell)cellObj;
renderer.EnsureLine();
renderer.Write(row.IsHeader ? "<th" : "<td");
if (cell.ColumnSpan != 1)
{
renderer.Write($" colspan=\"{cell.ColumnSpan}\"");
}
if (cell.RowSpan != 1)
{
renderer.Write($" rowspan=\"{cell.RowSpan}\"");
}
if (table.ColumnDefinitions.Count > 0)
{
var columnIndex = cell.ColumnIndex < 0 || cell.ColumnIndex >= table.ColumnDefinitions.Count
? i
: cell.ColumnIndex;
columnIndex = columnIndex >= table.ColumnDefinitions.Count ? table.ColumnDefinitions.Count - 1 : columnIndex;
var alignment = table.ColumnDefinitions[columnIndex].Alignment;
if (alignment.HasValue)
{
case TableColumnAlign.Center:
renderer.Write(" style=\"text-align: center;\"");
break;
case TableColumnAlign.Right:
renderer.Write(" style=\"text-align: right;\"");
break;
case TableColumnAlign.Left:
renderer.Write(" style=\"text-align: left;\"");
break;
switch (alignment)
{
case TableColumnAlign.Center:
renderer.Write(" style=\"text-align: center;\"");
break;
case TableColumnAlign.Right:
renderer.Write(" style=\"text-align: right;\"");
break;
case TableColumnAlign.Left:
renderer.Write(" style=\"text-align: left;\"");
break;
}
}
}
}
renderer.WriteAttributes(cell);
renderer.Write('>');
var previousImplicitParagraph = renderer.ImplicitParagraph;
if (cell.Count == 1)
{
renderer.ImplicitParagraph = true;
}
renderer.Write(cell);
renderer.ImplicitParagraph = previousImplicitParagraph;
renderer.WriteAttributes(cell);
renderer.Write('>');
renderer.WriteLine(row.IsHeader ? "</th>" : "</td>");
var previousImplicitParagraph = renderer.ImplicitParagraph;
if (cell.Count == 1)
{
renderer.ImplicitParagraph = true;
}
renderer.Write(cell);
renderer.ImplicitParagraph = previousImplicitParagraph;
renderer.WriteLine(row.IsHeader ? "</th>" : "</td>");
}
renderer.WriteLine("</tr>");
}
renderer.WriteLine("</tr>");
}
if (hasBody)
{
renderer.WriteLine("</tbody>");
if (hasBody)
{
renderer.WriteLine("</tbody>");
}
else if (isHeaderOpen)
{
renderer.WriteLine("</thead>");
}
renderer.WriteLine("</table>");
}
else if (isHeaderOpen)
else
{
renderer.WriteLine("</thead>");
// No HTML: write only the table contents.
var implicitParagraph = renderer.ImplicitParagraph;
// Enable implicit paragraphs to avoid newlines after each cell.
renderer.ImplicitParagraph = true;
foreach (var rowObj in table)
{
var row = (TableRow)rowObj;
for (int i = 0; i < row.Count; i++)
{
var cellObj = row[i];
var cell = (TableCell)cellObj;
renderer.Write(cell);
// Write a space after each cell so its text does not merge with the next cell.
renderer.Write(' ');
}
}
renderer.ImplicitParagraph = implicitParagraph;
}
renderer.WriteLine("</table>");
}
}
}

View File

@@ -1,10 +1,9 @@
// Copyright (c) Alexandre Mutel. All rights reserved.
// Copyright (c) Alexandre Mutel. All rights reserved.
// This file is licensed under the BSD-Clause 2 license.
// See the license.txt file in the project root for more information.
using Markdig.Parsers;
using Markdig.Renderers;
using Markdig.Renderers.Html;
namespace Markdig.Extensions.Yaml
{
@@ -24,9 +23,14 @@ namespace Markdig.Extensions.Yaml
public void Setup(MarkdownPipeline pipeline, IMarkdownRenderer renderer)
{
if (!renderer.ObjectRenderers.Contains<YamlFrontMatterRenderer>())
if (!renderer.ObjectRenderers.Contains<YamlFrontMatterHtmlRenderer>())
{
renderer.ObjectRenderers.InsertBefore<CodeBlockRenderer>(new YamlFrontMatterRenderer());
renderer.ObjectRenderers.InsertBefore<Renderers.Html.CodeBlockRenderer>(new YamlFrontMatterHtmlRenderer());
}
if (!renderer.ObjectRenderers.Contains<YamlFrontMatterRoundtripRenderer>())
{
renderer.ObjectRenderers.InsertBefore<Renderers.Roundtrip.CodeBlockRenderer>(new YamlFrontMatterRoundtripRenderer());
}
}
}

View File

@@ -11,7 +11,7 @@ namespace Markdig.Extensions.Yaml
/// Empty renderer for a <see cref="YamlFrontMatterBlock"/>
/// </summary>
/// <seealso cref="HtmlObjectRenderer{YamlFrontMatterBlock}" />
public class YamlFrontMatterRenderer : HtmlObjectRenderer<YamlFrontMatterBlock>
public class YamlFrontMatterHtmlRenderer : HtmlObjectRenderer<YamlFrontMatterBlock>
{
protected override void Write(HtmlRenderer renderer, YamlFrontMatterBlock obj)
{

View File

@@ -0,0 +1,27 @@
// Copyright (c) Alexandre Mutel. All rights reserved.
// This file is licensed under the BSD-Clause 2 license.
// See the license.txt file in the project root for more information.
using System;
using Markdig.Renderers;
using Markdig.Renderers.Roundtrip;
namespace Markdig.Extensions.Yaml
{
public class YamlFrontMatterRoundtripRenderer : MarkdownObjectRenderer<RoundtripRenderer, YamlFrontMatterBlock>
{
private readonly CodeBlockRenderer _codeBlockRenderer;
public YamlFrontMatterRoundtripRenderer()
{
_codeBlockRenderer = new CodeBlockRenderer();
}
protected override void Write(RoundtripRenderer renderer, YamlFrontMatterBlock obj)
{
renderer.Writer.WriteLine("---");
_codeBlockRenderer.Write(renderer, obj);
renderer.Writer.WriteLine("---");
}
}
}

View File

@@ -53,7 +53,7 @@ namespace Markdig.Helpers
// A right-flanking delimiter run is a delimiter run that is
// (1) not preceded by Unicode whitespace, and either
// (1a) not preceded by a punctuation character, or
// (2a) not preceded by a punctuation character, or
// (2b) preceded by a punctuation character and followed by Unicode whitespace or a punctuation character.
// For purposes of this definition, the beginning and the end of the line count as Unicode whitespace.
canClose = !prevIsWhiteSpace &&
@@ -144,9 +144,37 @@ namespace Markdig.Helpers
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool IsWhitespace(this char c)
{
// 2.1 Characters and lines
// A whitespace character is a space(U + 0020), tab(U + 0009), newline(U + 000A), line tabulation (U + 000B), form feed (U + 000C), or carriage return (U + 000D).
return c <= ' ' && (c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r');
// 2.1 Characters and lines
// A Unicode whitespace character is any code point in the Unicode Zs general category,
// or a tab (U+0009), line feed (U+000A), form feed (U+000C), or carriage return (U+000D).
if (c <= ' ')
{
const long Mask =
(1L << ' ') |
(1L << '\t') |
(1L << '\n') |
(1L << '\f') |
(1L << '\r');
return (Mask & (1L << c)) != 0;
}
return c >= '\u00A0' && IsWhitespaceRare(c);
static bool IsWhitespaceRare(char c)
{
// Fast path equivalent to: return CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.SpaceSeparator;
if (c < 5760)
{
return c == '\u00A0';
}
else
{
return c <= 12288 &&
(c == 5760 || IsInInclusiveRange(c, 8192, 8202) || c == 8239 || c == 8287 || c == 12288);
}
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
@@ -171,46 +199,47 @@ namespace Markdig.Helpers
// Check if a char is a space or a punctuation
public static void CheckUnicodeCategory(this char c, out bool space, out bool punctuation)
{
// Credits: code from CommonMark.NET
// Copyright (c) 2014, Kārlis Gaņģis All rights reserved.
// See license for details: https://github.com/Knagis/CommonMark.NET/blob/master/LICENSE.md
if (c <= 'ÿ')
if (IsWhitespace(c))
{
space = c == '\0' || c == ' ' || (c >= '\t' && c <= '\r') || c == '\u00a0' || c == '\u0085';
punctuation = c == '\0' || (c >= 33 && c <= 47) || (c >= 58 && c <= 64) || (c >= 91 && c <= 96) || (c >= 123 && c <= 126);
space = true;
punctuation = false;
}
else if (c <= 127)
{
space = c == '\0';
punctuation = c == '\0' || IsAsciiPunctuation(c);
}
else
{
var category = CharUnicodeInfo.GetUnicodeCategory(c);
space = category == UnicodeCategory.SpaceSeparator
|| category == UnicodeCategory.LineSeparator
|| category == UnicodeCategory.ParagraphSeparator;
punctuation = !space &&
(category == UnicodeCategory.ConnectorPunctuation
// A Unicode punctuation character is an ASCII punctuation character
// or anything in the general Unicode categories Pc, Pd, Pe, Pf, Pi, Po, or Ps.
space = false;
UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(c);
punctuation = category == UnicodeCategory.ConnectorPunctuation
|| category == UnicodeCategory.DashPunctuation
|| category == UnicodeCategory.OpenPunctuation
|| category == UnicodeCategory.ClosePunctuation
|| category == UnicodeCategory.InitialQuotePunctuation
|| category == UnicodeCategory.FinalQuotePunctuation
|| category == UnicodeCategory.OtherPunctuation);
|| category == UnicodeCategory.OtherPunctuation;
}
}
// Same as CheckUnicodeCategory
internal static bool IsSpaceOrPunctuation(this char c)
{
if (c <= 'ÿ')
if (IsWhitespace(c))
{
return c == '\0' || c == ' ' || (c >= '\t' && c <= '\r') || c == '\u00a0' || c == '\u0085' ||
(c >= 33 && c <= 47 && c != 38) || (c >= 58 && c <= 64) || (c >= 91 && c <= 96) || (c >= 123 && c <= 126);
return true;
}
else if (c <= 127)
{
return c == '\0' || IsAsciiPunctuation(c);
}
else
{
var category = CharUnicodeInfo.GetUnicodeCategory(c);
return category == UnicodeCategory.SpaceSeparator
|| category == UnicodeCategory.LineSeparator
|| category == UnicodeCategory.ParagraphSeparator
|| category == UnicodeCategory.ConnectorPunctuation
return category == UnicodeCategory.ConnectorPunctuation
|| category == UnicodeCategory.DashPunctuation
|| category == UnicodeCategory.OpenPunctuation
|| category == UnicodeCategory.ClosePunctuation
@@ -289,44 +318,16 @@ namespace Markdig.Helpers
public static bool IsAsciiPunctuation(this char c)
{
// 2.1 Characters and lines
// An ASCII punctuation character is !, ", #, $, %, &, ', (, ), *, +, ,, -, ., /, :, ;, <, =, >, ?, @, [, \, ], ^, _, `, {, |, }, or ~.
switch (c)
{
case '!':
case '"':
case '#':
case '$':
case '%':
case '&':
case '\'':
case '(':
case ')':
case '*':
case '+':
case ',':
case '-':
case '.':
case '/':
case ':':
case ';':
case '<':
case '=':
case '>':
case '?':
case '@':
case '[':
case '\\':
case ']':
case '^':
case '_':
case '`':
case '{':
case '|':
case '}':
case '~':
return true;
}
return false;
// An ASCII punctuation character is
// !, ", #, $, %, &, ', (, ), *, +, ,, -, ., / (U+0021–2F),
// :, ;, <, =, >, ?, @ (U+003A–0040),
// [, \, ], ^, _, ` (U+005B–0060),
// {, |, }, or ~ (U+007B–007E).
return c <= 127 && (
IsInInclusiveRange(c, 33, 47) ||
IsInInclusiveRange(c, 58, 64) ||
IsInInclusiveRange(c, 91, 96) ||
IsInInclusiveRange(c, 123, 126));
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
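
The whitespace check in this file replaces a chain of comparisons with a single bitmask test for characters up to U+0020. The sketch below shows that technique in isolation. The class and constant names are hypothetical, and for simplicity it falls back to `CharUnicodeInfo` for everything above the space character instead of the hand-written range checks used in the diff.

```csharp
using System;
using System.Globalization;

public static class WhitespaceSketch
{
    // Bit n of the mask is set when the character with code n counts as whitespace.
    // Line tabulation ('\v', U+000B) is deliberately left out.
    private const long AsciiWhitespaceMask =
        (1L << ' ') |
        (1L << '\t') |
        (1L << '\n') |
        (1L << '\f') |
        (1L << '\r');

    public static bool IsWhitespace(char c)
    {
        if (c <= ' ')
        {
            // c is at most 32 here, so the shift count fits in a 64-bit long.
            return (AsciiWhitespaceMask & (1L << c)) != 0;
        }

        // Above U+0020, only the Unicode Zs (space separator) category qualifies.
        return CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.SpaceSeparator;
    }
}
```

With this definition, `'\v'`, U+0085, and U+2028 are no longer whitespace, while U+00A0 and the other Zs code points are.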

View File

@@ -4,6 +4,7 @@
using System;
using System.Diagnostics.CodeAnalysis;
using System.Runtime.CompilerServices;
namespace Markdig.Helpers
{
@@ -193,7 +194,7 @@ namespace Markdig.Helpers
{
return false;
}
if (c == ' ' || c == '\n' || c == '"' || c == '\'' || c == '=' || c == '<' || c == '>' || c == '`')
if (IsSpaceOrSpecialHtmlChar(c))
{
break;
}
@@ -202,6 +203,26 @@ namespace Markdig.Helpers
c = text.NextChar();
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool IsSpaceOrSpecialHtmlChar(char c)
{
if (c > '>')
{
return c == '`';
}
const long BitMask =
(1L << ' ')
| (1L << '\n')
| (1L << '"')
| (1L << '\'')
| (1L << '=')
| (1L << '<')
| (1L << '>');
return (BitMask & (1L << c)) != 0;
}
// We need at least one char after '='
if (matchCount == 0)
{
@@ -227,7 +248,7 @@ namespace Markdig.Helpers
while (true)
{
c = text.NextChar();
if (c.IsAlphaNumeric() || c == '_' || c == ':' || c == '.' || c == '-')
if (c.IsAlphaNumeric() || IsCharToAppend(c))
{
builder.Append(c);
}
@@ -235,6 +256,23 @@ namespace Markdig.Helpers
{
break;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool IsCharToAppend(char c)
{
if ((uint)(c - '-') > '_' - '-')
{
return false;
}
const long BitMask =
(1L << '_')
| (1L << ':')
| (1L << '.')
| (1L << '-');
return (BitMask & (1L << c)) != 0;
}
}
hasAttribute = true;
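
The `IsCharToAppend` helper combines an unsigned range check with a bitmask. Because C# uses only the low six bits of the shift count for a `long`, `1L << '_'` there is effectively `1L << 31`, which the preceding range check makes safe. The sketch below is a hypothetical variant, not the code from the diff: it indexes bits relative to `'-'`, so every shift count stays between 0 and 50 and the wrap-around is not needed.

```csharp
using System;

public static class AttributeCharSketch
{
    // Bits are indexed relative to '-' (U+002D), the smallest accepted character.
    private const long Mask =
        (1L << ('_' - '-')) |
        (1L << (':' - '-')) |
        (1L << ('.' - '-')) |
        (1L << ('-' - '-'));

    public static bool IsCharToAppend(char c)
    {
        // A single unsigned comparison covers both bounds:
        // characters below '-' wrap around to very large values.
        uint offset = unchecked((uint)(c - '-'));
        if (offset > '_' - '-')
        {
            return false;
        }

        return (Mask & (1L << (int)offset)) != 0;
    }
}
```

Both forms accept exactly `_`, `:`, `.`, and `-`. The relative-offset form costs one extra subtraction and does not rely on shift-count masking.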

View File

@@ -3,7 +3,7 @@
// See the license.txt file in the project root for more information.
using System;
using System.Collections.Generic;
using System.Collections.Concurrent;
namespace Markdig.Helpers
{
@@ -13,14 +13,14 @@ namespace Markdig.Helpers
/// <typeparam name="T">Type of the object to cache</typeparam>
public abstract class ObjectCache<T> where T : class
{
private readonly Stack<T> builders;
private readonly ConcurrentQueue<T> _builders;
/// <summary>
/// Initializes a new instance of the <see cref="ObjectCache{T}"/> class.
/// </summary>
protected ObjectCache()
{
builders = new Stack<T>(4);
_builders = new ConcurrentQueue<T>();
}
/// <summary>
@@ -28,10 +28,7 @@ namespace Markdig.Helpers
/// </summary>
public void Clear()
{
lock (builders)
{
builders.Clear();
}
_builders.Clear();
}
/// <summary>
@@ -40,12 +37,9 @@ namespace Markdig.Helpers
/// <returns></returns>
public T Get()
{
lock (builders)
if (_builders.TryDequeue(out T instance))
{
if (builders.Count > 0)
{
return builders.Pop();
}
return instance;
}
return NewInstance();
@@ -60,10 +54,7 @@ namespace Markdig.Helpers
{
if (instance is null) ThrowHelper.ArgumentNullException(nameof(instance));
Reset(instance);
lock (builders)
{
builders.Push(instance);
}
_builders.Enqueue(instance);
}
/// <summary>
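
This change replaces a `Stack<T>` guarded by `lock` with a `ConcurrentQueue<T>`. A minimal sketch of the resulting pooling pattern follows. The type names `PoolSketch<T>` and `StringBuilderPool` are hypothetical and only illustrate the shape of the class in the diff.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text;

public abstract class PoolSketch<T> where T : class
{
    // ConcurrentQueue is safe for concurrent producers and consumers without an explicit lock.
    private readonly ConcurrentQueue<T> _items = new ConcurrentQueue<T>();

    public T Get()
    {
        // Reuse a pooled instance when one is available, otherwise create a new one.
        if (_items.TryDequeue(out var item))
        {
            return item;
        }

        return NewInstance();
    }

    public void Release(T instance)
    {
        if (instance is null) throw new ArgumentNullException(nameof(instance));
        Reset(instance);
        _items.Enqueue(instance);
    }

    protected abstract T NewInstance();

    protected abstract void Reset(T instance);
}

public sealed class StringBuilderPool : PoolSketch<StringBuilder>
{
    protected override StringBuilder NewInstance() => new StringBuilder();

    protected override void Reset(StringBuilder instance) => instance.Clear();
}
```

Instances now come back in first-in, first-out order instead of last-in, first-out. For a cache of interchangeable objects that ordering should not matter.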

View File

@@ -55,7 +55,7 @@ namespace Markdig.Helpers
public static void ArgumentOutOfRangeException(string paramName) => throw new ArgumentOutOfRangeException(paramName);
[DoesNotReturn]
public static void ArgumentOutOfRangeException(string message, string paramName) => throw new ArgumentOutOfRangeException(message, paramName);
public static void ArgumentOutOfRangeException(string message, string paramName) => throw new ArgumentOutOfRangeException(paramName, message);
[DoesNotReturn]
public static void ArgumentOutOfRangeException_index() => throw new ArgumentOutOfRangeException("index");
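
The one-line fix above swaps the constructor arguments. `ArgumentOutOfRangeException(string, string)` takes the parameter name first and the message second, the reverse of `ArgumentException(string, string)`. The sketch below uses hypothetical helper names to show the two orders side by side.

```csharp
using System;

public static class ArgumentOrderSketch
{
    // ArgumentOutOfRangeException: parameter name first, then message.
    public static ArgumentOutOfRangeException OutOfRange(string message, string paramName)
        => new ArgumentOutOfRangeException(paramName, message);

    // ArgumentException: message first, then parameter name.
    public static ArgumentException Invalid(string message, string paramName)
        => new ArgumentException(message, paramName);
}
```

Before the fix, the message text would have ended up in `ParamName` and the parameter name in `Message`.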

View File

@@ -6,7 +6,6 @@ using System;
using System.IO;
using System.Linq;
using System.Reflection;
using Markdig.Extensions.SelfPipeline;
using Markdig.Helpers;
using Markdig.Parsers;
using Markdig.Renderers;
@@ -41,11 +40,11 @@ namespace Markdig
return DefaultPipeline;
}
var selfPipeline = pipeline.Extensions.Find<SelfPipelineExtension>();
if (selfPipeline is not null)
if (pipeline.SelfPipeline is not null)
{
return selfPipeline.CreatePipelineFromInput(markdown);
return pipeline.SelfPipeline.CreatePipelineFromInput(markdown);
}
return pipeline;
}

View File

@@ -4,7 +4,7 @@
using System;
using System.IO;
using System.Text;
using Markdig.Extensions.SelfPipeline;
using Markdig.Helpers;
using Markdig.Parsers;
using Markdig.Renderers;
@@ -13,11 +13,10 @@ namespace Markdig
{
/// <summary>
/// This class is the Markdown pipeline built from a <see cref="MarkdownPipelineBuilder"/>.
/// <para>An instance of <see cref="MarkdownPipeline"/> is immutable, thread-safe, and should be reused when parsing multiple inputs.</para>
/// </summary>
public sealed class MarkdownPipeline
{
// This class is immutable
/// <summary>
/// Initializes a new instance of the <see cref="MarkdownPipeline" /> class.
/// </summary>
@@ -36,6 +35,8 @@ namespace Markdig
InlineParsers = inlineParsers;
DebugLog = debugLog;
DocumentProcessed = documentProcessed;
SelfPipeline = Extensions.Find<SelfPipelineExtension>();
}
internal bool PreciseSourceLocation { get; set; }
@@ -54,6 +55,8 @@ namespace Markdig
internal ProcessDocumentDelegate? DocumentProcessed;
internal SelfPipelineExtension? SelfPipeline;
/// <summary>
/// True to parse trivia such as whitespace, extra heading characters and unescaped
/// string values.

View File

@@ -33,7 +33,7 @@ namespace Markdig.Parsers.Inlines
/// <summary>
/// The character of this emphasis.
/// </summary>
public char Character { get; }
public char Character { get; }
/// <summary>
/// The minimum number of characters this emphasis is expected to have (must be >=1)

View File

@@ -0,0 +1,16 @@
// Copyright (c) Alexandre Mutel. All rights reserved.
// This file is licensed under the BSD-Clause 2 license.
// See the license.txt file in the project root for more information.
#if NET452 || NETSTANDARD2_0
namespace System.Collections.Concurrent
{
internal static class ConcurrentQueueExtensions
{
public static void Clear<T>(this ConcurrentQueue<T> queue)
{
while (queue.TryDequeue(out _)) { }
}
}
}
#endif

View File

@@ -4,9 +4,7 @@
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using Markdig.Helpers;
using Markdig.Syntax;
using Markdig.Syntax.Inlines;
@@ -19,7 +17,7 @@ namespace Markdig.Renderers
/// <seealso cref="IMarkdownRenderer" />
public abstract class RendererBase : IMarkdownRenderer
{
private readonly Dictionary<Type, IMarkdownObjectRenderer?> _renderersPerType = new();
private readonly Dictionary<KeyWrapper, IMarkdownObjectRenderer?> _renderersPerType = new();
internal int _childrenDepth = 0;
/// <summary>
@@ -29,11 +27,13 @@ namespace Markdig.Renderers
private IMarkdownObjectRenderer? GetRendererInstance(MarkdownObject obj)
{
var key = obj.GetType();
KeyWrapper key = GetKeyForType(obj);
Type objectType = obj.GetType();
for (int i = 0; i < ObjectRenderers.Count; i++)
{
var renderer = ObjectRenderers[i];
if (renderer.Accept(this, key))
if (renderer.Accept(this, objectType))
{
_renderersPerType[key] = renderer;
return renderer;
@@ -141,7 +141,7 @@ namespace Markdig.Renderers
// Calls before writing an object
ObjectWriteBefore?.Invoke(this, obj);
if (!_renderersPerType.TryGetValue(obj.GetType(), out IMarkdownObjectRenderer? renderer))
if (!_renderersPerType.TryGetValue(GetKeyForType(obj), out IMarkdownObjectRenderer? renderer))
{
renderer = GetRendererInstance(obj);
}
@@ -162,5 +162,25 @@ namespace Markdig.Renderers
// Calls after writing an object
ObjectWriteAfter?.Invoke(this, obj);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static KeyWrapper GetKeyForType(MarkdownObject obj)
{
IntPtr typeHandle = Type.GetTypeHandle(obj).Value;
return new KeyWrapper(typeHandle);
}
private readonly struct KeyWrapper : IEquatable<KeyWrapper>
{
public readonly IntPtr Key;
public KeyWrapper(IntPtr key) => Key = key;
public bool Equals(KeyWrapper other) => Key == other.Key;
public override int GetHashCode() => Key.GetHashCode();
public override bool Equals(object? obj) => throw new NotImplementedException();
}
}
}
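
The renderer cache is now keyed by the raw type handle (`IntPtr`) rather than by `Type`. The sketch below shows the keying idea on its own, with the hypothetical names `TypeKey` and `TypeKeyedCache`. Unlike the `KeyWrapper` in the diff, which throws from `Equals(object)`, this version implements that overload normally.

```csharp
using System;
using System.Collections.Generic;

public readonly struct TypeKey : IEquatable<TypeKey>
{
    private readonly IntPtr _handle;

    // Type.GetTypeHandle(obj).Value is the runtime handle of obj's exact type.
    public TypeKey(object obj) => _handle = Type.GetTypeHandle(obj).Value;

    public bool Equals(TypeKey other) => _handle == other._handle;

    public override bool Equals(object obj) => obj is TypeKey other && Equals(other);

    public override int GetHashCode() => _handle.GetHashCode();
}

public sealed class TypeKeyedCache<TValue>
{
    // Implementing IEquatable<TypeKey> lets the dictionary compare keys without boxing.
    private readonly Dictionary<TypeKey, TValue> _values = new Dictionary<TypeKey, TValue>();

    public bool TryGet(object obj, out TValue value) => _values.TryGetValue(new TypeKey(obj), out value);

    public void Set(object obj, TValue value) => _values[new TypeKey(obj)] = value;
}
```

Two objects map to the same entry only when their runtime types are identical, which matches how the `Type`-keyed dictionary behaved.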

View File

@@ -63,7 +63,7 @@ namespace SpecFileGen
// NOTE: Beware of Copy/Pasting spec files - some characters may change (non-breaking space into space)!
static readonly Spec[] Specs = new[]
{
new Spec("CommonMark v. 0.29", "CommonMark.md", ""),
new Spec("CommonMarkSpecs", "CommonMark.md", ""),
new Spec("Pipe Tables", "PipeTableSpecs.md", "pipetables|advanced"),
new Spec("GFM Pipe Tables", "PipeTableGfmSpecs.md", "gfm-pipetables"),
new Spec("Footnotes", "FootnotesSpecs.md", "footnotes|advanced"),