[Pipe Tables Extension] Normalization is inserting extra columns #384

Open
opened 2026-01-29 14:35:27 +00:00 by claunia · 7 comments
Owner

Originally created by @hamvocke on GitHub (Jul 30, 2020).

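I'm working with markdig's Pipe Tables Extension and found a case where the [specified behavior around normalization](https://github.com/lunet-io/markdig/blob/master/src/Markdig.Tests/Specs/PipeTableSpecs.md) differs from what [GitHub-flavored Markdown](https://github.github.com/gfm/#example-204) specifies.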

GFM says:

> The remainder of the table’s rows may vary in the number of cells. If there are a number of cells fewer than the number of cells in the header row, empty cells are inserted. If there are greater, the excess is ignored

markdig's pipe tables spec says:

> The tables are normalized to the maximum number of columns found in a table


This means, a table like this:

| abc | def |
| --- | --- |
| bar |
| bar | baz | boo |

is supposed to result in an HTML table that's got 2 columns according to GFM. The first body row gets an extra, empty `<td>` while the second body row doesn't include the `boo` cell.

Running this example through markdig with pipe tables enabled, however, generates a table that's got 3 columns, with the header and the first body row both gaining empty table cells.
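To make the difference concrete, here is a minimal Python sketch of the two normalization rules applied to rows that have already been split into cells. It is not Markdig's code; the function names `normalize_gfm` and `normalize_max` are made up for illustration.

```python
def normalize_gfm(header, rows):
    """GFM rule: pad or truncate every body row to the header's width."""
    width = len(header)
    body = [(row + [""] * width)[:width] for row in rows]
    return header, body


def normalize_max(header, rows):
    """Rule from markdig's pipe table spec: pad all rows to the widest row."""
    width = max(len(r) for r in [header] + rows)

    def pad(row):
        return row + [""] * (width - len(row))

    return pad(header), [pad(row) for row in rows]


header = ["abc", "def"]
rows = [["bar"], ["bar", "baz", "boo"]]

print(normalize_gfm(header, rows))
# (['abc', 'def'], [['bar', ''], ['bar', 'baz']])
print(normalize_max(header, rows))
# (['abc', 'def', ''], [['bar', '', ''], ['bar', 'baz', 'boo']])
```

The first result matches the 2-column table GFM describes; the second matches the 3-column table markdig produces for the same input.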

I'm wondering if this deviation from GFM is intentional. Is there a reason why the two specifications treat normalization differently? And would it be desirable to make the normalization behavior configurable, e.g. by passing an option to the pipe tables plugin?

claunia added the enhancement and PR Welcome! labels 2026-01-29 14:35:27 +00:00

@xoofx commented on GitHub (Jul 30, 2020):

The GFM spec was officially published long after Markdig's pipe tables were developed. Before that, table behavior was never really specified, and the implicit specs were probably more broken than they are today. I never found the time to look at how to bring the behavior closer to GFM. A PR for this work would definitely be welcome.


@hamvocke commented on GitHub (Jul 30, 2020):

That makes perfect sense, thanks for the explanation! Let me dig in and see what I can come up with to help here 👍


@xoofx commented on GitHub (Jul 30, 2020):

> Let me dig in and see what I can come up with to help here 👍

Good! There is a major difference in the way Markdig parses pipe tables (described [here](https://talk.commonmark.org/t/parsing-strategy-for-tables/2027/49)), so the implementation today is a bit convoluted and not easy to work with. Don't be put off by your first impression (e.g. "WTF is this code doing" 😅).


@xoofx commented on GitHub (Jul 30, 2020):

Meaning that, if we wanted to have 100% compatibility with GFM, the implementation would have to be quite different (but probably simpler). It could be done as a new pipe table parser mode, while keeping the existing implementation the default.
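As a rough illustration of the "new mode, existing default" idea, restricted to the normalization question from this issue, an opt-in switch could look like the Python sketch below. Everything here is hypothetical: `NormalizeMode` and `target_width` are invented names, not part of Markdig's API, and a real GFM mode would involve the parser as a whole, not just the column count.

```python
from enum import Enum


class NormalizeMode(Enum):
    MAX_COLUMNS = "max"    # existing behavior, stays the default
    HEADER_WIDTH = "gfm"   # opt-in, GFM-compatible behavior


def target_width(header, rows, mode=NormalizeMode.MAX_COLUMNS):
    """Return the column count every row should be normalized to."""
    if mode is NormalizeMode.HEADER_WIDTH:
        return len(header)
    return max(len(r) for r in [header, *rows])


header = ["abc", "def"]
rows = [["bar"], ["bar", "baz", "boo"]]

print(target_width(header, rows))                              # 3
print(target_width(header, rows, NormalizeMode.HEADER_WIDTH))  # 2
```

Callers that pass no mode keep today's output, so existing documents would render unchanged.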


@xoofx commented on GitHub (Jul 30, 2020):

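(You can see some of the pathological differences in [this example](https://babelmark.github.io/?text=%7C+c+%7C+d+%7C%0A%7C+-+%7C+-+%7C%0A%7C+*a+%7C+b*+%7C%0A%7C+%60e+%7C+f%60+%7C%0A%7C+%5Bg+%7C+h%5D(http%3A%2F%2Fa.com)+%7C).)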
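For context, the GFM spec splits each row into cells at unescaped pipes first and only then parses inline content inside each cell, which is why a pipe inside emphasis, a code span, or a link still ends the cell there. A rough Python sketch of that splitting step follows; the helper `split_row_gfm` is hypothetical and ignores corner cases such as a backslash that is itself escaped.

```python
import re


def split_row_gfm(line):
    """Split a table row into cells on every unescaped pipe,
    before any inline parsing (emphasis, code spans, links) happens."""
    line = line.strip()
    # Drop one optional leading pipe and one optional unescaped trailing pipe.
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|") and not line.endswith("\\|"):
        line = line[:-1]
    # Split on pipes that are not preceded by a backslash.
    cells = re.split(r"(?<!\\)\|", line)
    # Trim whitespace and turn escaped pipes back into literal pipes.
    return [cell.strip().replace("\\|", "|") for cell in cells]


print(split_row_gfm("| `e | f` |"))      # ['`e', 'f`']
print(split_row_gfm(r"| a \| b | c |"))  # ['a | b', 'c']
```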


@hamvocke commented on GitHub (Jul 30, 2020):

I'm trying to fix one problem at a time, and it seems like the normalization behavior is simple enough to understand and tackle 🙂

There are some other GFM incompatibilities that I've encountered but as you outline, other Markdown parsers out there are deviating from the GFM spec as well, often in the same way as Markdig - which happens to be working quite well for us.


@xoofx commented on GitHub (Jul 30, 2020):

> There are some other GFM incompatibilities that I've encountered but as you outline, other Markdown parsers out there are deviating from the GFM spec as well, often in the same way as Markdig - which happens to be working quite well for us.

Indeed, I'm not sure it's worth the time to fix these specific incompatibilities (or to create an entirely new GFM pipe table parser), as these discrepancies are mostly edge cases for which it's barely possible to find an acknowledged, common end-user expectation.

Reference: starred/markdig#384