mirror of
https://github.com/xoofx/markdig.git
synced 2026-02-09 05:49:12 +00:00
Make HeadingBlockParser configurable #264
Originally created by @dustinmoris on GitHub (Jan 27, 2019).
Hi,
First of all this is a great library, I've been using it a lot for my blog and other projects and absolutely love it. I've got a use case where I would like to use Markdig to parse user entered markdown into HTML, but only a limited subset. One of the things which I would like to limit is the header tags (e.g. user should only be able to create h3, h4, h5, and h6, but not h1 and h2).
I would like to propose the following change in the HeadingBlockParser:
Add an optional parameter to the constructor (this is pseudo code):
And then change this if statement:
Before
After
Then I could configure the HeadingBlockParser to limit the allowed headers via the constructor when setting up BlockParsers.
What do you think?
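To make the proposal concrete, here is a minimal sketch of what an "allowed heading levels" constructor parameter could look like. It is not Markdig code: HeadingLevelFilter and TryGetHeadingLevel are hypothetical names invented for illustration, and the check is simplified (it ignores the optional leading indentation that CommonMark permits).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, NOT part of Markdig. It only illustrates the idea of
// passing the allowed heading levels through a constructor.
public sealed class HeadingLevelFilter
{
    private readonly HashSet<int> _allowedLevels;

    // Passing no levels means "allow h1 to h6", mirroring an optional parameter.
    public HeadingLevelFilter(params int[] allowedLevels)
    {
        _allowedLevels = new HashSet<int>(
            allowedLevels.Length == 0 ? new[] { 1, 2, 3, 4, 5, 6 } : allowedLevels);
    }

    // Returns true (and the level) only when the line is an ATX heading
    // whose level is in the allowed set.
    public bool TryGetHeadingLevel(string line, out int level)
    {
        level = 0;

        // Count the leading '#' characters.
        int count = 0;
        while (count < line.Length && line[count] == '#')
        {
            count++;
        }

        // An ATX heading has 1 to 6 '#' followed by a space, a tab, or end of line.
        bool wellFormed = count >= 1 && count <= 6 &&
            (count == line.Length || line[count] == ' ' || line[count] == '\t');

        // The extra set lookup is the only change the proposal asks for.
        if (!wellFormed || !_allowedLevels.Contains(count))
        {
            return false;
        }

        level = count;
        return true;
    }
}
```

With new HeadingLevelFilter(3, 4, 5, 6), a line such as "# Title" is rejected as a heading and would fall through to whatever parser handles it next, typically as a plain paragraph.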
@MihaZupan commented on GitHub (Jan 27, 2019):
I would think that it's not too common to not want h1, but want h3 for example. Therefore I believe that rather than changing the parser, post-processing the document AST to remove said headers would be more appropriate, since you can then decide whether to discard them altogether or replace them with regular text.
@xoofx commented on GitHub (Jan 27, 2019):
I agree, otherwise someone else could come with a different rule here. It is quite straightforward to process the AST document afterwards and adapt it to your specific requirements.
@dustinmoris commented on GitHub (Jan 27, 2019):
I think that is a very common scenario, because h1 has a special meaning in an HTML document or a particular HTML block. For example you shouldn't have more than one h1 in an article or an hgroup. If you want to display user entered markdown in an HTML block where you want to have control that there should be only one h1 then this is a VERY COMMON use case.
But nevertheless...
If this is easy and fast then this would be good enough for me too. Have you got an example somewhere?
Thanks for the fast replies!
@dustinmoris commented on GitHub (Jan 27, 2019):
My proposed suggestion would have given the flexibility to implement a different rule, because anyone could set any combination of headers into the constructor (e.g. 1, 2, 3 vs. 3, 4, 5, 6, etc.), but I get your point, if you say that post processing is already the correct way of applying such rules in a fast and efficient way then I'm happy to do it. An example would be much appreciated!
@MihaZupan commented on GitHub (Jan 27, 2019):
Something like this
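A minimal sketch of the post-processing approach discussed above, not necessarily the snippet originally posted in this comment. It assumes the Markdig NuGet package is referenced. HeadingPostProcessor, RenderWithoutHeadings and minLevel are hypothetical names, and dropping the headings entirely is only one of the options mentioned (replacing them with regular text would be another).

```csharp
using System.IO;
using System.Linq;
using Markdig;
using Markdig.Renderers;
using Markdig.Syntax;

// Hypothetical helper: parse, strip headings above a given level, render.
public static class HeadingPostProcessor
{
    public static string RenderWithoutHeadings(string markdown, int minLevel)
    {
        MarkdownPipeline pipeline = new MarkdownPipelineBuilder().Build();

        // Parse to the AST instead of going straight to HTML.
        MarkdownDocument document = Markdown.Parse(markdown, pipeline);

        // Snapshot the matches first, because removing blocks while
        // enumerating would modify the tree being walked.
        var tooProminent = document.Descendants<HeadingBlock>()
            .Where(heading => heading.Level < minLevel)
            .ToList();

        foreach (HeadingBlock heading in tooProminent)
        {
            // Discard the heading block from its parent container.
            heading.Parent?.Remove(heading);
        }

        // Render the modified AST to HTML with the same pipeline.
        var writer = new StringWriter();
        var renderer = new HtmlRenderer(writer);
        pipeline.Setup(renderer);
        renderer.Render(document);
        writer.Flush();
        return writer.ToString();
    }
}
```

Calling HeadingPostProcessor.RenderWithoutHeadings(input, 3) would keep h3 to h6 and drop h1 and h2 before rendering.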
@dustinmoris commented on GitHub (Jan 27, 2019):
Ok thanks for providing this code! I think this will work for my use case, but from an architectural POV I think that my proposed solution still makes sense. Apologies for trying to convince you, but I just want to throw in a few more thoughts and then you can decide if you agree or disagree :)
I think my proposed solution to include some h-tags but not others sounds weird because you have created one parser for all. In reality these tags should be considered as different elements. For example you treat a pre block differently than a blockquote or p. Why is that? It "feels" different because you know that most pages would want to style them differently and also because you know that they have a different meaning.
Well, with headers it is actually the same. IMHO it would be more correct to have a Header1BlockParser and a Header2BlockParser, etc., but I do get the point how it made more sense to have a single HeadingBlockParser. In reality an h1 tag is a different block than an h2 or h3 and they DO have a different meaning obviously. That's why HTML doesn't have a single header tag (e.g. <h order="1"></h> and <h order="3"></h>), but different ones (h1, h2, etc.).
If you made the architectural decision to wrap them all together in a single HeadingBlockParser class, then I think it would be only fair to expose some additional control via the constructor over which tags should actually get parsed.
I don't think this opens a door for more/different rules, because there are only 6 different headers and if you allow a user to configure in the constructor which elements should get parsed out of the 6 then you have already exhausted all possibilities which one might want to configure.
I personally think there would be value in doing this, but I can also live with the longer workaround of the post processing, but it feels a bit wrong to parse elements into the AST when I know that they should never have been parsed in the first place.
@xoofx commented on GitHub (Jan 27, 2019):
Let's keep the code as it is today. You can either copy/paste the HeadingBlockParser in your own project and modify it, or go to the post-processing route, if you need a special handling. We can revisit this later if we have more people coming here looking for a similar use case.
@dustinmoris commented on GitHub (Jan 28, 2019):
Ok thanks, I went with copy-pasting the current HeadingBlockParser and adding my proposed changes, and it works like a charm! Thanks for considering my proposal and keep up the great work!
@petefox commented on GitHub (Feb 10, 2020):
I would like to give a +1 on @dustinmoris's suggestion.
For use cases like Blogs or Support/Documentation pages, where multiple contributors write content intended for public facing pages, it's important from an SEO point of view, to only have one H1 tag, which describes the title of the document.
This H1 would typically be rendered outside the markdown by the wrapping page, and defined in a separate input field on the editing page.
And yes, I know you could "just tell your users" to only use one H1 tag, and always place it at the top of the page, but that defeats the purpose of using something like Markdown, where you would want to restrict the editing/design options in order to create a more uniform layout and resulting markup.
I think it would make great sense to make an option that restricts the use of certain tags.