[PR #500] [MERGED] Add depth limits to avoid pathological-case parsing times/StackOverflows #1065

Open
opened 2026-01-29 14:49:18 +00:00 by claunia · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/xoofx/markdig/pull/500
Author: @MihaZupan
Created: 1/7/2021
Status: Merged
Merged: 1/14/2021
Merged by: @xoofx

Base: masterHead: depth-limit


📝 Commits (1)

📊 Changes

8 files changed (+66 additions, -4 deletions)

View changed files

📝 src/Markdig.Tests/MiscTests.cs (+23 -0)
📝 src/Markdig/Helpers/ThrowHelper.cs (+23 -0)
📝 src/Markdig/Markdig.targets (+1 -1)
📝 src/Markdig/Parsers/InlineProcessor.cs (+3 -3)
📝 src/Markdig/Parsers/MarkdownParser.cs (+1 -0)
📝 src/Markdig/Renderers/RendererBase.cs (+10 -0)
📝 src/Markdig/Renderers/TextRendererBase.cs (+1 -0)
📝 src/Markdig/Syntax/Block.cs (+4 -0)

📄 Description

Fixes #497

Due to the recursive approach to rendering, a very deep AST would trigger a StackOverflow during rendering.
This PR adds a general depth limit used in the rendering phase to detect deep nesting (currently 128 which can't affect any non-malicious inputs).
This shouldn't be problematic at all, offering a cheap fail-fast and StackOverflow protection.

Some inputs hit trivial O(n^2) cases due to depth - new string('[', 100_000) or new string('>', 100_000).
It also adds a large depth limit used to fail-fast such inputs that would otherwise take minutes to process.
The current limit of 10 * 1024 is rather arbitrary - on my machine it means that such inputs will fail after ~200 ms.
For comparison, 8192 results in ~125 ms and 16384 in ~500 ms.
Note: the problem with this limit is that it limits the maximum size of a pipe table due to the way the parser is implemented (#180) - ~10k table cells.

Should the second part (large limit also affecting tables) be configurable?
We may be able to avoid the table limitation in the future if we revisit the table parsing logic.

CC: @leonyang12

Users can semi-easily avoid the rendering StackOverflow by analyzing the AST after the parsing step (https://gist.github.com/MihaZupan/2e25659bc9af623589df1d1cb1ce712c).

Without a change directly in Markdig, I don't think avoiding the very long Parse times is practical.

This PR does not guarantee that ALL possible malicious-like inputs will fail-fast.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/xoofx/markdig/pull/500 **Author:** [@MihaZupan](https://github.com/MihaZupan) **Created:** 1/7/2021 **Status:** ✅ Merged **Merged:** 1/14/2021 **Merged by:** [@xoofx](https://github.com/xoofx) **Base:** `master` ← **Head:** `depth-limit` --- ### 📝 Commits (1) - [`f6f617a`](https://github.com/xoofx/markdig/commit/f6f617a746906ba0247cc901500ad5e59bda0c88) Add depth limits ### 📊 Changes **8 files changed** (+66 additions, -4 deletions) <details> <summary>View changed files</summary> 📝 `src/Markdig.Tests/MiscTests.cs` (+23 -0) 📝 `src/Markdig/Helpers/ThrowHelper.cs` (+23 -0) 📝 `src/Markdig/Markdig.targets` (+1 -1) 📝 `src/Markdig/Parsers/InlineProcessor.cs` (+3 -3) 📝 `src/Markdig/Parsers/MarkdownParser.cs` (+1 -0) 📝 `src/Markdig/Renderers/RendererBase.cs` (+10 -0) 📝 `src/Markdig/Renderers/TextRendererBase.cs` (+1 -0) 📝 `src/Markdig/Syntax/Block.cs` (+4 -0) </details> ### 📄 Description Fixes #497 Due to the recursive approach to rendering, a very deep AST would trigger a StackOverflow during rendering. This PR adds a general depth limit used in the rendering phase to detect deep nesting (currently 128 which can't affect any non-malicious inputs). This shouldn't be problematic at all, offering a cheap fail-fast and StackOverflow protection. Some inputs hit trivial O(n^2) cases due to depth - `new string('[', 100_000)` or `new string('>', 100_000)`. It also adds a large depth limit used to fail-fast such inputs that would otherwise take ***minutes*** to process. The current limit of `10 * 1024` is rather arbitrary - on my machine it means that such inputs will fail after ~200 ms. For comparison, 8192 results in ~125 ms and 16384 in ~500 ms. **Note: the problem with this limit is that it limits the maximum size of a pipe table due to the way the parser is implemented (#180) - ~10k table cells.** Should the second part (large limit also affecting tables) be configurable? We may be able to avoid the table limitation in the future if we revisit the table parsing logic. CC: @leonyang12 Users can semi-easily avoid the rendering StackOverflow by analyzing the AST after the parsing step (https://gist.github.com/MihaZupan/2e25659bc9af623589df1d1cb1ce712c). Without a change directly in Markdig, I don't think avoiding the very long `Parse` times is practical. This PR does **not guarantee** that ALL possible malicious-like inputs will fail-fast. --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
claunia added the pull-request label 2026-01-29 14:49:18 +00:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/markdig#1065