[PR #305] [MERGED] Emoji and abbreviations parser #933

Open
opened 2026-01-29 14:47:28 +00:00 by claunia · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/xoofx/markdig/pull/305
Author: @MihaZupan
Created: 2/6/2019
Status: Merged
Merged: 2/8/2019
Merged by: @xoofx

Base: master ← Head: emoji-and-abbreviations-parser


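📝 Commits (8)

b15b050 Allow single-char abbreviations
ca38da5 Cross target NetCore 2.1
d854b0b Port CompactPrefixTree to Markdig
325495a Improve EmojiParser memory performance
ef452c2 Fix Abbreviations parser's one-char handling
18e9486 Remove TextMatchHelper
a11676e Add test case for #296
b5293b9 Comment-out TextMatcher test in Benchmarks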

📊 Changes

9 files changed (+1383 additions, -257 deletions)

📝 src/Markdig.Benchmarks/TestMatchPerf.cs (+3 -2)
📝 src/Markdig.Tests/Specs/AbbreviationSpecs.cs (+40 -1)
📝 src/Markdig.Tests/Specs/AbbreviationSpecs.md (+21 -0)
📝 src/Markdig/Extensions/Abbreviations/AbbreviationParser.cs (+38 -54)
📝 src/Markdig/Extensions/Emoji/EmojiParser.cs (+75 -72)
➕ src/Markdig/Helpers/CompactPrefixTree.cs (+1110 -0)
➖ src/Markdig/Helpers/TextMatcher.cs (+0 -127)
➕ src/Markdig/Helpers/ThrowHelper.cs (+90 -0)
📝 src/Markdig/Markdig.csproj (+6 -1)

📄 Description

Fixes #296
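Fewer memory allocations when building a new pipeline.

Before:

| Method                          |        Mean | Gen 0/1k Op | Gen 1/1k Op | Allocated Memory/Op |
|-------------------------------- |------------:|------------:|------------:|--------------------:|
| Markdig                         |    12.05 us |     19.1498 |           - |            14.72 KB |
| Markdig_Advanced                |    42.11 us |     45.1660 |           - |            34.75 KB |
| Markdig_Advanced_Emoji          | 1,490.56 us |    285.1563 |    130.8594 |           1502.8 KB |
| Markdig_Advanced_Emoji_Modified | 1,499.65 us |    298.8281 |    126.9531 |          1502.77 KB |

After (new):

| Method                          |        Mean | Gen 0/1k Op | Gen 1/1k Op | Allocated Memory/Op |
|-------------------------------- |------------:|------------:|------------:|--------------------:|
| Markdig                         |    12.04 us |     19.1498 |           - |            14.72 KB |
| Markdig_Advanced                |    41.97 us |     45.1660 |           - |            34.75 KB |
| Markdig_Advanced_Emoji          |   194.58 us |     76.6602 |     15.3809 |           134.75 KB |
| Markdig_Advanced_Emoji_Modified |   252.09 us |     85.4492 |     18.5547 |           162.05 KB |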

The Modified variant forces the lazy initialization of the dictionary properties.
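To illustrate what forcing a lazy initialization means here, the following is a minimal sketch, not code from this PR. The class `LazyTableOwner`, its `Table` property, and the sample entries are hypothetical names chosen for illustration. The dictionary is only allocated the first time the property is read, so a benchmark that touches the property also pays for building it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a parser that owns a large lookup table.
// The names are illustrative and are not taken from Markdig.
public sealed class LazyTableOwner
{
    // Stays null until the Table property is read for the first time.
    private Dictionary<string, string> table;

    // Counts how many times the dictionary was actually built.
    public int BuildCount { get; private set; }

    public bool IsInitialized => table != null;

    // Reading this property is what "forces" the lazy initialization.
    public Dictionary<string, string> Table
    {
        get
        {
            if (table == null)
            {
                BuildCount++;
                table = new Dictionary<string, string>
                {
                    [":)"] = "smile",
                    [":("] = "frown",
                };
            }
            return table;
        }
    }
}
```

Constructing a `LazyTableOwner` allocates no dictionary. The first read of `Table` builds it once, and later reads return the same instance. That matches the extra time and memory the Modified rows show relative to the unmodified ones.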

Parsing speed for emojis and abbreviations is about the same as before (roughly 10% faster for emojis).
The dataset could be considered a worst case for a prefix tree of this type, because every input starts with the same character.
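The worst-case remark can be illustrated with a plain, uncompacted trie. This is a simplified sketch and not the `CompactPrefixTree` added in this PR. `SimpleTrie`, `MatchLongest`, and `RootFanOut` are hypothetical names, and the sample keys used afterwards are assumptions.

```csharp
using System.Collections.Generic;

// Illustrative plain trie; NOT Markdig's CompactPrefixTree.
public sealed class SimpleTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public string Value; // non-null when a key ends at this node
    }

    private readonly Node root = new Node();

    // Number of distinct first characters among all stored keys.
    public int RootFanOut => root.Children.Count;

    public void Add(string key, string value)
    {
        Node node = root;
        foreach (char c in key)
        {
            if (!node.Children.TryGetValue(c, out Node next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.Value = value;
    }

    // Returns the length of the longest stored key that is a prefix of
    // text starting at 'start', or 0 if none matches. The value of the
    // matched key is returned through 'value'.
    public int MatchLongest(string text, int start, out string value)
    {
        value = null;
        int matched = 0;
        Node node = root;
        for (int i = start; i < text.Length; i++)
        {
            if (!node.Children.TryGetValue(text[i], out Node next))
                break;
            node = next;
            if (node.Value != null)
            {
                value = node.Value;
                matched = i - start + 1;
            }
        }
        return matched;
    }
}
```

With keys such as `":smile:"`, `":smiley:"`, and `":)"`, the root has a single child, so the first lookup step cannot rule out any key. All of the discrimination happens deeper in the tree. A dataset whose keys start with many different characters would let the first step discard most candidates immediately.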


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

claunia added the pull-request label 2026-01-29 14:47:28 +00:00
Reference: starred/markdig#933