Suppose you have your own website or blog and want to write your articles in markdown to leverage their look and feel. To deliver HTML to the user of your website, you make use of a Python-based conversion pipeline or have implemented your own pipeline using Python-Markdown.
Don't have a conversion pipeline yet? Check out this article, where I show you how to implement one on your own!
What is missing at this point is a way to show the user that a very important piece of information is coming up, or to insert further editorial notes into your article, e.g. links to external resources. While block quotes are a way to handle this, you are (at least out-of-the-box) limited to a single style of block. Furthermore, if you want to use block quotes for what they exist for (inserting quotes from papers, famous people, you name it), they cannot be the solution for your issue.
This is where bootstrap's alert-boxes come into play! Wouldn't it be nice to use these well-designed boxes in your articles as well? Bootstrap is a very popular and widely used CSS framework. You will likely end up with a bootstrap-based website on your own. Consequently, implementing a Python-Markdown extension, which converts a certain piece of "markdown code" into bootstrap alert-boxes, would be a nice feature!
The Python version used in this article is Python 3.9 (CPython).
If you can't wait to use the extension, you can get it from PyPI. Detailed instructions can be found here.
Let us tackle this task by starting to implement a basic Markdown to HTML conversion pipeline [1]. First, open a terminal in your working directory, create a virtual environment and activate it.
$ python -m venv .venv
$ source .venv/bin/activate
Next, we create a directory called resources, where we will store the sample markdown article we want to convert. The resulting HTML file will be stored there as well.
$ mkdir resources
$ touch resources/article.md
The content of article.md is as follows:
# This Is The Title

## Sample Subtitle

- first item
- second item
- third item

:: info
This is an info block.
::
As you can see, we have a title, subtitle, bullet list with three items and an unknown block starting with two colons and ending with two. These colons will be our markers with which we identify the beginning and ending of an alert-box. Let us have a closer look at the future alert-box.
The first line consists of a start token (the two colons) and the word info.
Whatever comes after the colons is used as the alert-class, e.g. alert-info.
With the second line, the content of the alert-box starts.
It can consist of multiple lines.
The end of the alert-box is marked with the two colons in line 11.
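To make the mapping from the start line to the CSS class concrete, here is a minimal sketch. The helper `alert_class()` and the pattern `HEADER_RE` are hypothetical names used only for illustration; they are not part of the extension built later in this article.

```python
import re

# Hypothetical pattern: two colons, one space, then the alert level.
HEADER_RE = re.compile(r"^:: (?P<level>\S+)$")


def alert_class(first_line: str) -> str:
    """Return the Bootstrap CSS class for an alert-box start line."""
    match = HEADER_RE.match(first_line)
    if match is None:
        raise ValueError(f"not an alert-box start line: {first_line!r}")
    return f"alert-{match.group('level')}"


print(alert_class(":: info"))     # alert-info
print(alert_class(":: warning"))  # alert-warning
```

A bare `::` (the end token) does not match this pattern, which is how start and end lines can be told apart in this simplified view.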
With an idea of how the alert-box pattern should look in Markdown, we can start implementing the conversion pipeline for normal Markdown code first. To do so, we need to install the only dependency we have: Python-Markdown [2].
$ python -m pip install Markdown
Our simple pipeline will live in a single file called pipeline.py, which is located in the project's root directory next to the resources directory.
$ touch pipeline.py
The project directory should look like this:
$ tree .
.
├── pipeline.py
└── resources
    └── article.md

1 directory, 2 files
We start our pipeline.py file by importing the argparse module, which we will use later for creating a simple command-line interface (CLI), the Path object used for path representations, and the newly installed Python-Markdown package (markdown).
# pipeline.py
import argparse
from pathlib import Path
import markdown
DOCUMENT = """
<!DOCTYPE html>
<html lang="en">
<head>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous">
</head>
<body>
<div class="container">
{text}
</div>
</body>
</html>
"""
The second part of the code snippet at hand is the template for the resulting HTML document.
The essential parts are the loading of Bootstrap in the head as well as the placeholder {text} inside of the body.
The placeholder will be replaced by the HTML code, which is generated by the pipeline.
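The placeholder mechanism is plain `str.format()`. The following sketch uses a shortened stand-in template (`TEMPLATE` is a hypothetical name, not the DOCUMENT constant above) to show the substitution in isolation.

```python
# A shortened stand-in for the DOCUMENT template; {text} is the only placeholder.
TEMPLATE = """<body>
<div class="container">
{text}
</div>
</body>"""

# str.format() replaces {text} with the generated HTML.
page = TEMPLATE.format(text="<h1>This Is The Title</h1>")
print(page)
```

One thing to keep in mind with this approach: if the template ever contains literal curly braces (inline CSS or JavaScript, for instance), they have to be doubled as `{{` and `}}`, otherwise `str.format()` treats them as placeholders.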
Next, we create a main() function, which accepts two parameters: a path to the source file and a path to the destination file.
# previous code in pipeline.py
def main(source: Path, destination: Path) -> None:
    extensions = []
    md = markdown.Markdown(extensions=extensions)
    with open(source) as f:
        text = f.read()
    html = md.convert(text)
    result = DOCUMENT.format(text=html)
    with open(destination, "w") as f:
        f.write(result)
Within the main() function, we create a list of all the extensions our pipeline should use.
At the moment, this list is empty.
Subsequently, we create an instance of Python-Markdown's Markdown object and assign it to a local variable called md.
The next two lines of code are responsible for loading the markdown content of our source file, followed by the conversion of this content to HTML.
We inject the generated HTML code into the document template and store the result in a local variable called result.
Last but not least, the two remaining lines of code are responsible for writing the final HTML document to its destination.
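If you want to try the read, convert, inject, write flow without Python-Markdown installed, the sketch below may help. It is not the pipeline itself: `to_html()` is a hypothetical stand-in for `md.convert()`, `run_pipeline()` is an illustrative name, and the template is reduced to a bare body tag. It also uses `Path.read_text()` and `Path.write_text()` with an explicit encoding, which is one possible alternative to the `open()` calls above.

```python
import tempfile
from pathlib import Path


def to_html(text: str) -> str:
    """Hypothetical stand-in for md.convert(): wrap each non-empty line in <p>."""
    return "\n".join(f"<p>{line}</p>" for line in text.splitlines() if line.strip())


def run_pipeline(source: Path, destination: Path) -> None:
    template = "<body>\n{text}\n</body>"  # reduced stand-in for DOCUMENT
    text = source.read_text(encoding="utf-8")
    html = to_html(text)
    destination.write_text(template.format(text=html), encoding="utf-8")


# Work in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "article.md"
    dst = Path(tmp) / "article.html"
    src.write_text("Hello\n\nWorld\n", encoding="utf-8")
    run_pipeline(src, dst)
    result = dst.read_text(encoding="utf-8")

print(result)
```

Passing the encoding explicitly avoids depending on the platform's default encoding, which can differ between operating systems.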
Congratulations, you created a simple Markdown to HTML conversion pipeline!
To make it more useful, we implement another function get_parser(), which serves as a helper function to create a simple CLI using the standard library module argparse.
This article is not about argparse specifically, so I will not dive deep into what is happening here.
In essence, the following code snippet shows the definition of a simple argument parser accepting exactly two arguments: the source and destination files.
# previous code in pipeline.py
def get_parser() -> argparse.ArgumentParser:
    _parser = argparse.ArgumentParser(description="Markdown Conversion Pipeline CLI")
    _parser.add_argument(
        "source",
        type=Path,
        help="The path to the source file",
    )
    _parser.add_argument(
        "destination",
        type=Path,
        help="The path to the destination file",
    )
    return _parser


if __name__ == "__main__":
    parser = get_parser()
    args = parser.parse_args()
    main(args.source, args.destination)
The remaining four lines at the end are only executed if the pipeline.py file is called directly from the command-line.
First, the argument parser is created by calling the function get_parser().
Secondly, the arguments supplied are parsed.
Lastly, the main() function is invoked and the two paths are supplied.
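To see what the parser hands over to main(), you can feed it an explicit argument list instead of relying on the real command line. The sketch below rebuilds an equivalent parser inline so that it runs on its own.

```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser(description="Markdown Conversion Pipeline CLI")
parser.add_argument("source", type=Path, help="The path to the source file")
parser.add_argument("destination", type=Path, help="The path to the destination file")

# Passing a list to parse_args() bypasses sys.argv, which is handy for experiments.
args = parser.parse_args(["resources/article.md", "resources/article.html"])

# Thanks to type=Path, both values arrive as Path objects rather than strings.
print(args.source.name, args.destination.suffix)  # article.md .html
```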
We finished the pipeline for now - fantastic! You can check that it is working by running it directly from your shell:
$ python pipeline.py resources/article.md resources/article.html
There should be a new file in your resources directory called article.html. If you open the file in your browser, you should see something like this:
As you can see, the standard Markdown portions of the article.md file are rendered properly but the special alert-box part is not. To get it recognised and rendered, too, we need to create a Markdown extension.
We start by creating a new file in our project directory called alert_extension.py. To get a sense of the structure of the final extension, here is the file's content using placeholders:
import re
from textwrap import dedent

from markdown import Extension
from markdown.preprocessors import Preprocessor

SNIPPET = ""
HEADINGS = {}


class AlertExtension(Extension):
    pass


class AlertBlockProcessor(Preprocessor):
    pass


def makeExtension(**kwargs):
    return AlertExtension(**kwargs)
Besides the import-statements, the file consists of four major parts:
Two global variables, SNIPPET and HEADINGS, which are still empty placeholders.
The class AlertExtension, which represents the alert-box extension.
The class AlertBlockProcessor, which will look for our pattern and replace it with the desired HTML code.
The makeExtension() function, which is used to deliver our extension. At this point, it is the only component that is already complete.
The first global variable is called SNIPPET and will contain an HTML template for our alert-box, so that we can inject the content as well as certain parameters (like the "alert level").
SNIPPET = '''<div class="alert alert-{level}" role="alert">
<h4 class="alert-heading"><strong>{heading}</strong></h4>
{alert}
</div>
'''
Our alert-box pattern will be replaced by the given div-block.
The level (e.g. info), which we extract from the pattern, will be inserted as a part of a CSS-class.
Furthermore, we will try to extract a heading from the pattern.
If none is given, we will fall back on default values stored in the second global variable.
Last but not least, the content of the alert-box pattern will be injected by replacing the {alert} part of the given snippet.
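Filling the template by hand shows what the extension will eventually produce for an info block. The values passed to `format()` here are sample inputs chosen for illustration.

```python
SNIPPET = '''<div class="alert alert-{level}" role="alert">
<h4 class="alert-heading"><strong>{heading}</strong></h4>
{alert}
</div>
'''

# Sample values: the level, a heading, and already-converted HTML content.
rendered = SNIPPET.format(
    level="info",
    heading="Info",
    alert="<p>This is an info block.</p>",
)
print(rendered)
```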
As already mentioned, the second global variable is called HEADINGS and contains default headings per alert level.
The user will later be able to override these values, but we want to have some defaults.
HEADINGS = {
    "primary": "",
    "secondary": "Note",
    "success": "Congratulations!",
    "danger": "Caution!",
    "warning": "Warning",
    "info": "Info",
    "light": "Note",
    "dark": "Note",
}
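The lookup order (a user-supplied heading first, then the per-level default) can be sketched as a small function. `resolve_heading()` is a hypothetical helper for illustration; the processor shown later in this article performs the same steps inline and falls back to "Note" for unknown levels.

```python
from typing import Optional

# Shortened copy of the HEADINGS dictionary from above.
HEADINGS = {"info": "Info", "warning": "Warning", "danger": "Caution!"}


def resolve_heading(level: str, custom: Optional[str] = None) -> str:
    """Prefer a user-supplied heading, then the per-level default, then "Note"."""
    if custom:
        return custom
    return HEADINGS.get(level, "Note")


print(resolve_heading("info"))              # Info
print(resolve_heading("info", "Heads up"))  # Heads up
print(resolve_heading("custom-level"))      # Note
```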
Now that we have defined the global variables, let us continue by implementing the AlertExtension itself.
The class inherits from the Extension class provided by Python-Markdown.
We override two methods: __init__() and extendMarkdown().
class AlertExtension(Extension):
    def __init__(self, **kwargs):
        self.config = {}
        super().__init__(**kwargs)

    def extendMarkdown(self, md):
        md.registerExtension(self)
        md.preprocessors.register(
            AlertBlockProcessor(md, self.getConfigs()), 'alert_block', 25)
The __init__() method is pretty straightforward.
In essence, we only call the parent __init__() method.
The definition of a config attribute is only meant to show that a basic configuration can be provided as part of the extension's implementation.
The extendMarkdown() method takes a Markdown object, registers the extension and signals which processor to call as well as when to do it.
In our case, we want the extension to register a new pre-processor (the AlertBlockProcessor implemented below), give it a name (alert_block) and a priority (25), which determines when it is called relative to the other pre-processors.
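The effect of the priority value can be illustrated without Python-Markdown itself. In the sketch below, only the name alert_block and the value 25 come from this article; the other two entries are made-up neighbours. The ordering rule (higher priority runs first) reflects how Python-Markdown's registry sorts its items.

```python
# Illustrative (name, priority) pairs; only "alert_block" and 25 come from the article.
registered = [
    ("runs_late", 10),
    ("alert_block", 25),
    ("runs_early", 30),
]

# Items are processed from the highest to the lowest priority.
execution_order = [
    name for name, _ in sorted(registered, key=lambda item: item[1], reverse=True)
]
print(execution_order)  # ['runs_early', 'alert_block', 'runs_late']
```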
Now, we are coming to the essential part of the extension: Recognising our pre-defined pattern and processing it. I will show you the result first and explain it in greater detail afterwards.
class AlertBlockProcessor(Preprocessor):
    # dedent() strips the common indentation, leaving a pattern that starts
    # and ends with a newline.
    ALERT_BLOCK_RE = re.compile(
        dedent(r'''
            :: (?P<level>[^\[\s+]+)( heading=["'](?P<heading>[^"']+)["'])?
            (?P<alert>[\s\S]*?)
            ::
        '''),
    )

    def __init__(self, md, config):
        super().__init__(md)
        self.config = config

    def run(self, lines):
        text = "\n".join(lines)
        while True:
            m = self.ALERT_BLOCK_RE.search(text)
            if m:
                level = 'debug'
                if m.group('level'):
                    level = m.group('level')
                # Use the default heading unless the block supplies its own.
                heading = HEADINGS.get(level, "Note")
                if m.group("heading"):
                    heading = m.group("heading")
                # Render the block body as Markdown, then stash the finished HTML.
                alert = self.md.convert(m.group("alert"))
                snippet = SNIPPET.format(level=level, alert=alert, heading=heading)
                placeholder = self.md.htmlStash.store(snippet)
                text = '{}\n{}\n{}'.format(
                    text[:m.start()],
                    placeholder,
                    text[m.end():],
                )
            else:
                break
        return text.split("\n")
In essence, the code snippet at hand consists of three parts: a regular expression [3] stored in the class attribute ALERT_BLOCK_RE, an __init__() method, and a run() method.
The regular expression matches the pattern we defined earlier: it starts with two colons followed by the alert level and an optional heading= part for overriding the default heading.
Subsequently, every line is taken as part of the alert message or message body until the end token (two colons) is reached.
You can use RegEx101 [4] to check your regular expressions.
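You can also check the expression directly in Python. The sketch below copies the pattern from the processor and runs it against a hand-written sample string (the sample text is made up for illustration).

```python
import re
from textwrap import dedent

# Same pattern as in AlertBlockProcessor, reproduced so the example runs on its own.
ALERT_BLOCK_RE = re.compile(
    dedent(r'''
        :: (?P<level>[^\[\s+]+)( heading=["'](?P<heading>[^"']+)["'])?
        (?P<alert>[\s\S]*?)
        ::
    '''),
)

sample = 'Intro\n:: warning heading="Heads up"\nMind the gap.\nSecond line.\n::\nOutro\n'
match = ALERT_BLOCK_RE.search(sample)

print(match.group("level"))    # warning
print(match.group("heading"))  # Heads up
print(match.group("alert"))    # prints the two body lines

# Without heading=, the optional group is simply None.
plain = ALERT_BLOCK_RE.search("Intro\n:: info\nThis is an info block.\n::\n")
print(plain.group("heading"))  # None
```

Because the dedented pattern begins with a newline, a block that starts on the very first line of the text is not matched by this expression as written; that is worth keeping in mind when testing your own articles.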
The __init__() method is again pretty straightforward, as it only invokes the parent's __init__() method and stores the (empty) config in an attribute of the same name.
The run() method is the more complex method but don't worry, we will go through it.
First, all the lines are joined together to get the whole text at once (otherwise the multi-line pattern won't be recognised).
Second, an infinite loop is started, where we search for the pattern in the text and assign the first match to the variable m.
If we indeed found a match, the information is extracted and put into the template stored in the global variable SNIPPET. The resulting HTML is stored in Python-Markdown's HTML stash, and the placeholder returned by the stash replaces the old text (Markdown pattern).
If no new matches are found, we leave the infinite loop.
Last but not least, the text is split again into lines and these lines are returned.
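The search-and-replace loop can be studied in isolation with the standard library only. The following sketch is a simplified stand-in, not the extension's code: `BLOCK_RE` is a reduced pattern without heading support, a plain list plays the role of md.htmlStash, the placeholder format is made up, and the body is not converted from Markdown.

```python
import re

# Reduced pattern: start line with a level, lazy body, end token.
BLOCK_RE = re.compile(r"\n:: (?P<level>\S+)\n(?P<alert>[\s\S]*?)\n::\n")


def replace_blocks(lines):
    """Replace every alert block with a placeholder and collect the HTML."""
    stash = []  # stand-in for md.htmlStash
    text = "\n".join(lines)
    while True:
        m = BLOCK_RE.search(text)
        if m is None:
            break
        stash.append(
            f'<div class="alert alert-{m.group("level")}">{m.group("alert")}</div>'
        )
        placeholder = f"@@STASH:{len(stash) - 1}@@"  # hypothetical placeholder format
        # The placeholder contains no "::" markers, so every pass removes one
        # block and the loop eventually ends.
        text = f"{text[:m.start()]}\n{placeholder}\n{text[m.end():]}"
    return text.split("\n"), stash


lines = ["Intro", ":: info", "First.", "::", "Middle", ":: warning", "Second.", "::", ""]
new_lines, stash = replace_blocks(lines)
print(new_lines)
print(stash)
```

Storing the HTML outside the text and leaving only a placeholder behind keeps the later Markdown processing steps from touching the finished alert-box markup.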
You implemented your first Python-Markdown extension and now have support for Bootstrap alert-boxes!
To use the extension, we need to integrate it in our pipeline. To do so, we need to make two minor adjustments. First, we need to import the extension. This is as easy as adding the following import-statement to our pipeline.py file:
from alert_extension import AlertExtension
Second, we need to add an instance of the extension to our list of extensions, which was previously empty:
extensions = [AlertExtension()]
And that's it! To see the results of our hard work, invoke the pipeline to re-generate the article.html file:
$ python pipeline.py resources/article.md resources/article.html
If you open the re-generated article.html file in a browser, you should see something like this:
As you can see, using the extension not only leads to the rendering of standard Markdown code but also our special alert-box pattern.
Congratulations, you have made it through the article! You learnt how to implement a basic Markdown to HTML conversion pipeline as well as how to extend it by creating your own Python-Markdown extension. This extension enables you to support bootstrap alert-boxes in your Markdown articles as well.
I hope you enjoyed reading the article. Make sure to share it with your friends and colleagues. If you have not already, make sure to follow me on Twitter, where I am @DahlitzF or subscribe to the newsletter.
Stay curious and keep coding!