Elm developer and educator. Founder of Incremental Elm Consulting.
October 8th, 2019
Extensible Markdown Parsing in Pure Elm
I'm excited to share a new approach to markdown parsing for the Elm ecosystem: dillonkearns/elm-markdown.
As a matter of fact, the blog post you're reading right now is being rendered with it.
Why does Elm need another markdown parser?
I built this tool so that I could:
Render markdown blocks using my preferred UI library (
, in my case, but you could use
or anything else!)
Extend what can be expressed beyond the standard markdown blocks like headings, lists, etc.
Inject data from my Elm model into my markdown
And yet, I wanted to do all of this with the benefits that come from using standard Markdown:
Great editor tooling (I write my blog posts in Ulysses using Markdown, and I have Prettier set up to auto-format Markdown when I'm tweaking it directly in my code projects)
Previews render in GitHub
Easy for others to contribute (for example, to the
So how do you get the best of both worlds? There are three key features that give dillonkearns/elm-markdown rich extensibility without actually adding to the Markdown syntax:
Map HTML to custom Elm rendering functions (for extensible markdown!)
Use custom renderers (for custom rendering with your preferred styles and UI library)
Give users access to the parsed Markdown Blocks before rendering (for inspecting, transforming, or extracting data from the parsed Markdown before passing it to your Markdown Renderer)
Let's explore these three key features in more depth.
⚙️ Map HTML to custom Elm rendering functions
I didn't want to add additional features that weren't already a part of Markdown syntax. Since HTML is already valid Markdown, it seemed ideal to just use declarative HTML tags to express these custom view elements.
dillonkearns/elm-markdown leverages that to give you a declarative Elm syntax for stating explicitly what kind of HTML is accepted (think JSON Decoders) and, given that accepted HTML, how to render it.
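The decoder analogy can be made concrete with a small, self-contained sketch. It does not use the library's actual API: `TagHandler`, `requireAttribute`, `runHandler`, the `bio` tag name, and the `name` and `photo` attributes are all hypothetical names chosen for illustration, and the view is rendered to a plain `String` to keep the example dependency-free.

```elm
module BioTag exposing (Attributes, TagHandler, bioHandler, requireAttribute, runHandler)

import Dict exposing (Dict)


-- Attributes found on an HTML tag in the Markdown source.
type alias Attributes =
    Dict String String


-- Hypothetical handler: the tag name it accepts, plus a function that
-- either renders the tag or explains why it cannot.
type alias TagHandler view =
    { tagName : String
    , render : Attributes -> Result String view
    }


-- Look up a required attribute, failing with a precise message.
requireAttribute : String -> String -> Attributes -> Result String String
requireAttribute tagName key attributes =
    case Dict.get key attributes of
        Just value ->
            Ok value

        Nothing ->
            Err ("<" ++ tagName ++ "> is missing the required attribute \"" ++ key ++ "\"")


-- Accepts <bio name="..." photo="..."> and nothing else.
bioHandler : TagHandler String
bioHandler =
    { tagName = "bio"
    , render =
        \attributes ->
            Result.map2 (\name photo -> name ++ " (" ++ photo ++ ")")
                (requireAttribute "bio" "name" attributes)
                (requireAttribute "bio" "photo" attributes)
    }


-- Run a handler against a tag found in the document.
runHandler : TagHandler view -> String -> Attributes -> Result String view
runHandler handler tagName attributes =
    if tagName == handler.tagName then
        handler.render attributes

    else
        Err ("Unexpected HTML tag <" ++ tagName ++ ">")
```

As with a JSON decoder, the handler describes what it expects up front, so a missing attribute becomes an `Err` with a specific message instead of a silently broken view.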
Markdown Within HTML Tags
What makes this especially useful is that we can render any Markdown content within our HTML tags. So you could have a Markdown file that looks like this.
And here's the output:
This is a nice way to abstract the presentation logic for team members' bios on an
page. We want richer presentation logic than plain markdown provides (for example, showing icons with the right dimensions and displaying them in a row rather than a column). Also, since we're using Elm, we get pretty spoiled by explicit and precise error messages. So we'd like to get an error message if we don't provide a required attribute!
Here's the relevant code for handling the
HTML tag in our Markdown:
If we forget to pass in the required
attribute, we'll get an error message like this:
Avoiding low-level HTML in markdown
If you're familiar with MDX (it's Markdown syntax, but extended with some extra syntax from JSX, including things like JS
s and JSX HTML tags). Guillermo Rauch, the creator of MDX even talks about the benefits that a more declarative approach, like the one
takes, could have over the current MDX approach of using low-level
statements and JSX syntax
in this talk (around 20:36 - 22:30)
Even with this declarative approach to explicitly allowing the HTML tags you want, it's possible to get very low-level and just create mappings to standard HTML tags. I like to treat the HTML tags within these markdown documents like Web Components rather than raw HTML. That means using it as a very high-level way of expressing your custom views. With standard Github-flavored markdown, you'll often see people injecting
tags with styles, or
tags, etc. I consider this too low-level to be injecting into Markdown in most cases. The Markdown document should be more declarative, concerned only with _what_ to render, not _how_ to render it.
🎨 Use custom renderers
Many Markdown libraries just give you the rendered HTML directly. With dillonkearns/elm-markdown, one of the main goals was to give you full control over presentation at the initial render (rather than needing to add CSS rules to apply to your rendered output). I personally like to use
whenever I can, so I wanted to use that directly not just for my navbar, but to style my rendered markdown blocks.
Beyond just rendering directly to your preferred UI library, custom Renderers also open up a number of new potential uses. You can render your Markdown into
s, but you could also render it to any other Elm type. That could be data, or even functions. Why would you render a function? Well, that would allow you to inject dynamic data from your Elm model!
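To illustrate rendering into a function, here is a minimal sketch in which each piece of content renders to a `Model -> String`. The `Inline` type, its `Username` variant, and the `Model` record are hypothetical stand-ins, not types from the library.

```elm
module ModelRenderer exposing (Inline(..), Model, renderInline, renderParagraph)

-- Hypothetical application model holding the dynamic data.
type alias Model =
    { username : String }


-- Hypothetical inline content: static text or a placeholder for model data.
type Inline
    = Text String
    | Username


-- Each inline renders to a function that waits for the model.
renderInline : Inline -> (Model -> String)
renderInline inline =
    case inline of
        Text string ->
            \_ -> string

        Username ->
            \model -> model.username


-- A paragraph is also a function: it passes the model to every inline.
renderParagraph : List Inline -> (Model -> String)
renderParagraph inlines model =
    inlines
        |> List.map (\inline -> renderInline inline model)
        |> String.concat
```

Because the rendered result is a function, the Markdown only needs to be parsed and rendered once; the view can then apply that function to the current model whenever it changes.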
Some other use cases that custom Renderers enable:
Render into ANSI color codes for rich formatting in terminal output
Render into plain text with all formatting stripped out (for search functionality)
Performing validations in Renderers
Another goal with dillonkearns/elm-markdown is to allow early and precise feedback. One of my favorite uses of Custom Renderers is to catch dead links (or images).
will stop the production build when the Renderer fails.
Here's the relevant code
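As a rough illustration of the idea (a sketch, not the original listing), a renderer can return a `Result` so that a dead link surfaces as an error instead of as broken output. The `Link` alias, `renderLink`, `renderLinks`, and the set of known page paths are hypothetical, and treating every destination that starts with `http` as valid is a simplifying assumption.

```elm
module LinkCheck exposing (Link, renderLink, renderLinks)

import Set exposing (Set)


-- Hypothetical shape of a parsed link.
type alias Link =
    { destination : String
    , label : String
    }


-- Render one link, failing if an internal destination is not a known page.
-- Assumption: external links (starting with "http") are not checked.
renderLink : Set String -> Link -> Result String String
renderLink knownPages link =
    if String.startsWith "http" link.destination || Set.member link.destination knownPages then
        Ok ("<a href=\"" ++ link.destination ++ "\">" ++ link.label ++ "</a>")

    else
        Err ("Dead link: no page found at " ++ link.destination)


-- Render every link, reporting the first failure in document order.
renderLinks : Set String -> List Link -> Result String (List String)
renderLinks knownPages links =
    List.foldr
        (\link rest -> Result.map2 (::) (renderLink knownPages link) rest)
        (Ok [])
        links
```

A build step that treats any `Err` as fatal would then stop before a page containing a dead link is published.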
🌳 Give users access to the parsed Markdown Blocks before rendering
Exposing the AST allows for a number of powerful use cases as well. And it does so without requiring you to dig into the internals. You just get access to a nice Elm custom type and you can do what you want with it before passing it on to your Custom Renderer.
Here are some use cases that this feature enables:
Extract metadata before rendering, like building a table of contents data structure with proper links (here's an Ellie demo of that!)
Run a validation and turn it into an
, for example, if there are multiple level 1 headings (having multiple
s on a page causes accessibility problems)
Transform the blocks by applying formatting rules, for example, using a title-casing function on all headings
Transform the AST before rendering it, for example dropping each heading down one level (H1s become H2s, etc.)
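The use cases above can be sketched against a deliberately simplified AST. The `Block` type below is a hypothetical two-variant stand-in, not the library's actual type, and `slugify` is a naive helper written for this example.

```elm
module BlockTransforms exposing (Block(..), demoteHeadings, tableOfContents, validateSingleH1)

-- Hypothetical, simplified AST: a heading with its level, or a paragraph.
type Block
    = Heading Int String
    | Paragraph String


-- Drop each heading down one level (H1s become H2s, and so on).
-- Capping at level 6 is a choice made for this sketch.
demoteHeadings : List Block -> List Block
demoteHeadings blocks =
    List.map
        (\block ->
            case block of
                Heading level title ->
                    Heading (min 6 (level + 1)) title

                Paragraph _ ->
                    block
        )
        blocks


-- Extract a table of contents with anchor links from the headings.
tableOfContents : List Block -> List { level : Int, title : String, anchor : String }
tableOfContents blocks =
    List.filterMap
        (\block ->
            case block of
                Heading level title ->
                    Just { level = level, title = title, anchor = "#" ++ slugify title }

                Paragraph _ ->
                    Nothing
        )
        blocks


-- Naive slug: lowercase the title and join its words with dashes.
slugify : String -> String
slugify title =
    title
        |> String.toLower
        |> String.words
        |> String.join "-"


-- Fail if the document contains more than one level 1 heading.
validateSingleH1 : List Block -> Result String (List Block)
validateSingleH1 blocks =
    let
        h1Count =
            blocks
                |> List.filter (\block -> isLevel 1 block)
                |> List.length
    in
    if h1Count > 1 then
        Err ("Found " ++ String.fromInt h1Count ++ " level 1 headings; expected at most one.")

    else
        Ok blocks


isLevel : Int -> Block -> Bool
isLevel expected block =
    case block of
        Heading level _ ->
            level == expected

        Paragraph _ ->
            False
```

Since each function takes and returns ordinary Elm values, the steps compose: a document can be validated, transformed, and mined for metadata before it is ever handed to a renderer.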
The future of dillonkearns/elm-markdown
I've been really enjoying using this in production for several weeks. But it doesn't yet handle every case in GitHub-flavored Markdown.
I'm running all 1400 end-to-end test cases from the Marked.js test suite (which is what
runs under the hood). And that test suite includes running through every example in the GitHub-flavored Markdown spec. You can see nicely formatted markdown with all of the current failures. It includes all failures from the Marked.js test suite, organized by feature area. I'm working through handling more of these cases to make it more widely useful, but feel free to use it now with that caveat in mind.
Pull requests are very welcome; I would love community contributions on this project! If you're interested in contributing, check out the contributing guide in the GitHub repo.
Fault-Tolerance Versus Helpful Errors
That said, the goal is not to get to 100% compliance with the GitHub-Flavored Markdown Spec. Markdown has a goal of being fault-tolerant, meaning it will always try to "do the best it can" rather than giving an error message when something unexpected happens. That means there's no such thing as "invalid markdown." But there is most certainly "markup that probably doesn't do what you expected."
⚠️ This is technically
It "does the best it can" with the input and renders to a raw string rather than rendering a link. So this is an example that is squarely in the category of markup that "probably doesn't do what you expected."
The goal of dillonkearns/elm-markdown is not fault-tolerance. It prioritizes helpful error messages.
So the rule of thumb for dillonkearns/elm-markdown is:
Follow the Github-Flavored Markdown Spec whenever it doesn't cover up feedback about something that "probably doesn't do what you expected"
Otherwise, break with the Github-Flavored Markdown Spec and instead give a helpful error message
You can follow along with the current GFM Spec Compliance here.
Thanks for reading! If you give this library a try, let me know what you think. I'd love to hear from you!
You can keep the conversation going on the #elm-pages channel on the Elm Slack, or on this Twitter thread 👇