Why can't you re-use the same object for processor.Inline inside a parser? #312

Closed
opened 2026-01-29 14:33:33 +00:00 by claunia · 4 comments

Originally created by @deanebarker on GitHub (Jul 14, 2019).

I encountered a really interesting problem. I figured it out, but I'm curious why it is so, and leaving this here so that someone else might find it.

When parsing in an extension, you need to set processor.Inline to a unique object. If you re-use the same object, only the first instance of that object in the entire document will be replaced.

For example, say I have a Widget that can have a Color value. To save coding, I add some static properties with pre-configured objects.

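// Assumed base class: Widget must derive from Markdig's Inline to be assignable to processor.Inline.
public class Widget : LeafInline
{
  public string Color { get; set; }

  // Shared instances: reusing these is what triggers the problem described below.
  public static Widget Black = new Widget() { Color = "Black" };
  public static Widget White = new Widget() { Color = "White" };
}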

So, in my parsing code, I can do this:

processor.Inline = Widget.Black;

This causes the problem. The first Widget.Black will be correctly replaced, but all others will not render anything. It's as if they get removed when that first one is processed. They don't even render the thing they were replacing. They just...don't...render.

I fixed it by just changing my parser code to the longer syntax I was trying to avoid:

processor.Inline = new Widget() { Color = "Black" };

After that, it worked perfectly.

Why would this be?


@MihaZupan commented on GitHub (Jul 14, 2019):

When you set processor.Inline, the processor checks whether the new Inline already has a parent. If it does, the Inline is discarded. So if you reuse the same object, the first assignment gives it a parent, and every later assignment is discarded.

You can keep the nice syntax by changing the field to a get-only, expression-bodied property that returns a new instance on every access.

public static Widget Black = new Widget() { Color = "Black" };
// to
public static Widget Black => new Widget() { Color = "Black" };
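
To make the difference between the two declarations concrete, here is a minimal, self-contained sketch. It does not use Markdig at all: this `Widget` is a plain stand-in, and `SharedBlack` and `FieldVersusPropertyDemo` are hypothetical names introduced only for illustration. It shows that a static field hands out one shared instance, while an expression-bodied property builds a new one on each access.

```c#
using System;

// Plain stand-in for the thread's Widget (not derived from any Markdig type).
public class Widget
{
    public string Color { get; set; } = "";

    // Field initializer: runs once, so every access returns the same instance.
    public static readonly Widget SharedBlack = new Widget { Color = "Black" };

    // Expression-bodied property: the body runs on every access,
    // so each caller receives a fresh instance.
    public static Widget Black => new Widget { Color = "Black" };
}

// Hypothetical helper used only to compare the two declarations.
public static class FieldVersusPropertyDemo
{
    // True: both reads of the field yield the same object.
    public static bool FieldReturnsSameInstance() =>
        ReferenceEquals(Widget.SharedBlack, Widget.SharedBlack);

    // False: each read of the property allocates a new object.
    public static bool PropertyReturnsSameInstance() =>
        ReferenceEquals(Widget.Black, Widget.Black);
}
```

The trade-off is one allocation per access, which is what the parser needs here anyway, since every inline in the tree has to be its own object.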

@deanebarker commented on GitHub (Jul 14, 2019):

@MihaZupan I will close the issue, but can you explain the logic? Why does discarding an object make logical sense? I know there's a reason, I'd just like to understand it.


@MihaZupan commented on GitHub (Jul 14, 2019):

If an object already has a parent, it cannot be reinserted. If you try to manually push a child (Inline or Block) into a container when that child already has a parent, you will get an exception. I believe xoofx said this was done to prevent bugs caused by things being inserted multiple times.

After an inline is added, processor.Inline is NOT reset, because other parsers may still use it. For example, to avoid allocating new objects, the LiteralInlineParser simply corrects the end of the span of processor.Inline (https://github.com/lunet-io/markdig/blob/master/src/Markdig/Parsers/Inlines/LiteralInlineParser.cs#L70-L75) if that is also a LiteralInline and the new text starts where the last one ended.
The InlineProcessor therefore does not attempt to insert such Inlines: they have already been inserted and just happen to still be the last Inline there. See the source (https://github.com/lunet-io/markdig/blob/master/src/Markdig/Parsers/InlineProcessor.cs#L244).

@xoofx commented on GitHub (Jul 15, 2019):

> but can you explain the logic? Why does discarding an object make logical sense? I know there's a reason, I'd just like to understand it.

As @MihaZupan rightly explained, an object set to Inline that doesn't have a parent will be added to the closest container; otherwise, it is considered already added (so it's incorrect to say that it is discarded).
Objects set in Inline are part of an object tree in which a child can have only one parent.
But a parser can also decide not to create a new inline object and instead just extend the current one, so direct access to Processor.Inline allows this without having to rescan the closest container and re-fetch the current inline.
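
The rule described in this thread can be sketched with a toy model. This is a simplified illustration under stated assumptions, not Markdig's actual implementation: `Node`, `Container`, `ToyProcessor`, and `ToyProcessorDemo` are all hypothetical names. It shows the two behaviours mentioned above: a container rejects a child that already has a parent, and the processor silently treats such a child as already added.

```c#
using System;
using System.Collections.Generic;

// Hypothetical, simplified model; these are NOT Markdig's real classes.
public class Node
{
    public Container Parent { get; internal set; }
    public string Name { get; set; } = "";
}

public class Container : Node
{
    private readonly List<Node> _children = new List<Node>();
    public IReadOnlyList<Node> Children => _children;

    // A child may have only one parent, so manual reinsertion is rejected.
    public void Add(Node child)
    {
        if (child.Parent != null)
            throw new ArgumentException("Child already has a parent.");
        child.Parent = this;
        _children.Add(child);
    }
}

public class ToyProcessor
{
    public Container Root { get; } = new Container();
    public Node Current { get; set; }

    // A parentless node is added to the container; a node that already
    // has a parent is treated as already added and left alone.
    public void Commit()
    {
        if (Current != null && Current.Parent == null)
            Root.Add(Current);
    }
}

public static class ToyProcessorDemo
{
    // Commits the same instance repeatedly; returns the resulting child count.
    public static int CommitShared(int times)
    {
        var processor = new ToyProcessor();
        var shared = new Node { Name = "Black" };
        for (int i = 0; i < times; i++)
        {
            processor.Current = shared;
            processor.Commit();
        }
        return processor.Root.Children.Count;
    }

    // Commits a new instance each time; returns the resulting child count.
    public static int CommitFresh(int times)
    {
        var processor = new ToyProcessor();
        for (int i = 0; i < times; i++)
        {
            processor.Current = new Node { Name = "Black" };
            processor.Commit();
        }
        return processor.Root.Children.Count;
    }
}
```

In this model, `ToyProcessorDemo.CommitShared(3)` leaves a single child in the tree, while `ToyProcessorDemo.CommitFresh(3)` leaves three, which matches the symptom in the original report: only the first use of a shared object shows up.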

Reference: starred/markdig#312