[Help] Understanding of InlineParser Implementations #638

Closed
opened 2026-01-29 14:41:45 +00:00 by claunia · 2 comments

Originally created by @khalidabuhakmeh on GitHub (Oct 27, 2023).

👋 Hi @xoofx,

Thanks for offering help. My mental model is off somewhere, but I need to figure out where. The functionality of this parser isn't as important as understanding the process around `StringSlice`, which I find confusing when looking at other examples.

Given the following markdown, I expect the following result.

```markdown
this is a link to [github:khalidabuhakmeh]
another [github:maartenba]
```

with the result.

```html
<p>this is a link to <a href="https://github.com/khalidabuhakmeh/>khalidabuhakmeh</a>
another <a href="https://github.com/maartenba/>maartenba</a></p>
```

Instead, I get this. You'll notice the text `this is a link to` and `another` is missing from the final output. I'm not sure what I'm doing wrong. If I had to guess, I'm skipping the text, but I assumed other processors would handle it.

```html
<p><a href="https://github.com/khalidabuhakmeh/>khalidabuhakmeh</a>
<a href="https://github.com/maartenba/>maartenba</a></p>
```

My current implementation is below. No rush. I'm trying to write a blog post about this subject because I find it fascinating, and this is a great library. Cheers :)

```csharp
using System.Text.RegularExpressions;
using Markdig;
using Markdig.Helpers;
using Markdig.Parsers;
using Markdig.Renderers;
using Markdig.Syntax.Inlines;

var pipeline = new MarkdownPipelineBuilder()
    .Use<GitHubUserProfileExtension>()
    .Build();

// Note: The text of "this is a link to" gets swallowed up.
var html = Markdown
    .ToHtml("""
            this is a link to [github:khalidabuhakmeh]
            another [github:maartenba]
            """, pipeline);

Console.WriteLine(html);

public class GitHubUserProfileExtension : IMarkdownExtension
{
    public void Setup(MarkdownPipelineBuilder pipeline)
    {
        if (!pipeline.InlineParsers.Contains<GitHubUserProfileParser>())
        {
            pipeline.InlineParsers.Insert(0, new GitHubUserProfileParser());
        }
    }

    public void Setup(MarkdownPipeline pipeline, IMarkdownRenderer renderer)
    {
    }
}

public class GitHubUserProfileParser : InlineParser
{
    public override bool Match(InlineProcessor processor, ref StringSlice slice)
    {
        var precedingCharacter = slice.PeekCharExtra(-1);
        if (!precedingCharacter.IsWhiteSpaceOrZero())
        {
            return false;
        }

        while (!slice.Match("[github:"))
        {
            // keep skipping
            slice.NextChar();
        }

        var regex = new Regex(@"\[github:(?<username>\w+)]");
        var match = regex.Match(slice.ToString());

        if (!match.Success)
        {
            return false;
        }

        var username = match.Groups["username"].Value;
        var literal = $"<a href=\"https://github.com/{username}/>{username}</a>";

        processor.Inline = new HtmlInline(literal)
        {
            Span =
            {
                Start = processor.GetSourcePosition(slice.Start, out var line, out var column)
            },
            Line = line,
            Column = column,
            IsClosed = true
        };
        processor.Inline.Span.End = processor.Inline.Span.Start + match.Length - 1;
        slice.Start += match.Length;
        return true;
    }
}
```
claunia added the question label 2026-01-29 14:41:45 +00:00

@xoofx commented on GitHub (Oct 27, 2023):

I don't know if it is the only reason, but you are missing the declaration of the opening character for your inline parser (e.g., like [here](https://github.com/xoofx/markdig/blob/7d40bc118bc28095f01699d3d1759ae5a1c18b80/src/Markdig/Parsers/Inlines/HtmlEntityParser.cs#L24)).

Without it, I don't recall the details, but it might not work at all.

@khalidabuhakmeh commented on GitHub (Oct 27, 2023):

🤦 That fixed it. I'm a dummy. I appreciate your help.

Now, the parser only starts when it sees an opening character `[`, which is the expected behavior. Previously, it was handed the entire text block, which is too much.
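
For readers following along, here is a minimal sketch of what the fixed parser might look like. It is a reconstruction under stated assumptions, not the code that was actually committed. The thread only confirms that declaring the opening character fixed the problem. The rest is my own reading of why: with `OpeningCharacters` set, the skip loop is no longer needed, so the sketch drops it and anchors the regex at the start of the slice. It also adds the closing quote after the `href` value, which the original literal lacks.

```csharp
using System.Text.RegularExpressions;
using Markdig.Helpers;
using Markdig.Parsers;
using Markdig.Syntax.Inlines;

public class GitHubUserProfileParser : InlineParser
{
    // Anchored with ^ on the assumption that the slice starts at the '['
    // that triggered this parser.
    private static readonly Regex Pattern = new(@"^\[github:(?<username>\w+)\]");

    public GitHubUserProfileParser()
    {
        // Declare the opening character, as suggested in the thread.
        OpeningCharacters = new[] { '[' };
    }

    public override bool Match(InlineProcessor processor, ref StringSlice slice)
    {
        // Only match at the start of the text or after whitespace.
        if (!slice.PeekCharExtra(-1).IsWhiteSpaceOrZero())
        {
            return false;
        }

        var match = Pattern.Match(slice.ToString());
        if (!match.Success)
        {
            // The slice has not been advanced, so nothing is swallowed.
            return false;
        }

        var username = match.Groups["username"].Value;
        var literal = $"<a href=\"https://github.com/{username}/\">{username}</a>";

        processor.Inline = new HtmlInline(literal)
        {
            Span =
            {
                Start = processor.GetSourcePosition(slice.Start, out var line, out var column)
            },
            Line = line,
            Column = column,
            IsClosed = true
        };
        processor.Inline.Span.End = processor.Inline.Span.Start + match.Length - 1;

        // Consume exactly the matched text and nothing before it.
        slice.Start += match.Length;
        return true;
    }
}
```

The important design point is that `Match` never advances the slice past text it does not turn into an inline. Anything before the `[` is left for the other inline parsers to emit as literal text.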


Reference: starred/markdig#638