Using the Markdown Library
The most central part of the Markdown library is the function parse:
-
markdown.Document parse(core.Str str)
Parse a string containing Markdown text. Returns a symbolic representation of the document. Throws a MarkdownError on failure.
As can be seen above, the function returns a markdown.Document on success. This class represents the root of the Markdown document as a sequence of elements (markdown.ElementSeq). Apart from the toS member, which outputs a formatted version of the document as Markdown, the Document has the following members:
-
markdown.ElementSeq elements
Elements in the document.
-
markdown.Html toHtml()
Create an HTML representation of the document.
-
void visit(markdown.Visitor v)
Visit the entire structure using a Visitor. The visitor allows modifying the elements as well.
Elements
The document consists of a list of elements. Each element corresponds to a piece of the document. Some elements, such as lists, may themselves contain other elements, or lists of elements. The document thus follows a tree-like structure.
As noted above, lists of elements are typically stored in the class markdown.ElementSeq internally, in order to share some of the logic related to traversing such lists. The class has a member elements that holds the stored array of elements. The ElementSeq class is not an element itself, and can safely be ignored in most cases.
The class markdown.Element contains the following members, which all other elements need to override:
-
markdown.Element visit(markdown.Visitor v)
Visit the element. The function is expected to run v.visit(this) as the last step. The returned value is the element that replaces this element after the visiting operation. If nothing is changed, the object itself is returned.
-
void toHtml(markdown.Html to)
Create an HTML representation of this object. The Html object is essentially a string stream, but with logic for escaping text in the context of HTML.
The following elements are provided by the library (and generated by the parser):
-
markdown.Heading
Represents a heading in Markdown. The level of the heading is stored in the level member (1 and upward). The contained text is stored in the text member.
Produces <hX> tags in HTML, where X is the level.
-
markdown.Paragraph
A paragraph of text in Markdown. The text is stored in the text member.
Produces a <p> tag around the text in HTML.
-
markdown.List
Represents a list in Markdown. The member ordered determines if the list is ordered or not. The elements in the list are stored in the items array. Each item consists of a sequence of elements (represented by an ElementSeq instance).
The list produces <ol> or <ul> elements in HTML. Each item is then turned into a <li> element. If the first element in the item is a paragraph, the <p> tags around that element are removed.
-
markdown.CodeBlock
Represents a block of code in Markdown. The language is available as the member language. The code is stored as an array of individual lines in the code member.
It produces a <pre> tag in HTML.
-
markdown.CustomHtml
This element is not produced by the Markdown parser. Rather, it can be used by libraries to insert custom HTML code into the document when producing HTML output.
The element has two member variables: code, a string that contains the HTML code to produce, and skipIndent, which determines whether the code inside code should be indented to match the surrounding context. If the HTML code contains, for example, a <pre> tag, this extra indentation might change the appearance of the code, and skipIndent must be set to true.
-
markdown.PackedElements
This element is not produced by the Markdown parser. Rather, it is mainly intended to be used in conjunction with Visitors. Whenever a PackedElements object is inserted into an ElementSeq, the contained elements are inserted one by one. Thus, the PackedElements are essentially flattened immediately.
This is useful when one wishes to replace a single element with multiple elements using a Visitor, for example.
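The flattening behaviour of PackedElements can be illustrated with a small model. The sketch below is written in Python rather than Storm, and none of its classes are the library's own: Element, Paragraph, PackedElements and ElementSeq are simplified stand-ins, and add is an assumed name for whatever operation inserts an element into a sequence. Flattening of nested PackedElements is likewise an assumption of the model.

```python
class Element:
    """Stand-in for markdown.Element (hypothetical, heavily simplified)."""


class Paragraph(Element):
    def __init__(self, text):
        self.text = text


class PackedElements(Element):
    """Carries several elements that should be spliced into a sequence."""

    def __init__(self, elements):
        self.elements = list(elements)


class ElementSeq:
    """Stand-in for markdown.ElementSeq; 'add' is an assumed method name."""

    def __init__(self):
        self.elements = []

    def add(self, element):
        if isinstance(element, PackedElements):
            # Insert the contained elements one by one. Recursing here also
            # flattens nested PackedElements (an assumption of this model).
            for inner in element.elements:
                self.add(inner)
        else:
            self.elements.append(element)


seq = ElementSeq()
seq.add(Paragraph("first"))
seq.add(PackedElements([Paragraph("second"), Paragraph("third")]))

texts = [e.text for e in seq.elements]
print(texts)
```

After the two insertions, the sequence in this model holds three paragraphs and no PackedElements object, which is the property that makes the class convenient as a return value from a visitor.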
Formatted Text
The class markdown.FormattedText represents a piece of text with formatting. Most of the elements use this class to represent their contained text. A FormattedText instance is a list of markdown.TextSpan objects that each represent an individual piece of the text, each with possibly different formatting. As we shall see, some TextSpans contain FormattedText objects themselves. This makes it possible to apply more than one formatting option to a single piece of text.
The FormattedText class has the following members:
-
core.Array<markdown.TextSpan> spans
Text pieces.
-
init(core.Str)
Create formatted text from a string. The result contains a single span.
-
init(markdown.TextSpan)
Create from a single span piece.
-
void add(markdown.TextSpan span)
Add a span.
-
void add(core.Str span)
Add a string as a plain span.
-
void add(markdown.FormattedText other)
Add another FormattedText object.
-
void addLine(markdown.FormattedText other)
Add another FormattedText object as a line of text.
-
void toHtml(markdown.Html to)
Create an HTML representation of the text.
The TextSpan class is similar to the Element class, and has the following members that subclasses need to implement:
-
void toHtml(markdown.Html to)
Create an HTML representation of the text span.
-
markdown.TextSpan visit(markdown.Visitor v)
Visit the element recursively. The function is expected to run v.visit(this) as the last step. The returned value is the element that replaces this span after the visiting operation. If nothing is changed, the object itself is returned.
The following spans are provided by the library:
-
markdown.PlainTextSpan
A span that contains plain text (stored as text). These are essentially the leaf nodes of a tree of formatted text, since most other nodes contain a FormattedText instance themselves.
-
markdown.ItalicText
A span that corresponds to italic text. Contains formatted text stored as text. Produces <em> tags in HTML.
-
markdown.BoldText
A span that corresponds to bold text. Contains formatted text stored as text. Produces <strong> tags in HTML.
-
markdown.InlineCode
A span that corresponds to inline code. The code is stored as the member text, as a string. This means that no additional formatting may be applied to the text inside the code span. Produces <code> tags in HTML.
-
markdown.Link
A span that corresponds to a link. The text of the link is stored as text (as formatted text), and the target of the link is stored as target (as a Str). Produces an <a> tag with the target as the href.
-
markdown.CustomText
A span that corresponds to the appearance of a [] element without parentheses afterwards in the Markdown source. As mentioned previously in the manual, this element is purely used as an extension mechanism. The text inside the brackets is stored as text (as a Str). By default, the span simply outputs text without modification.
-
markdown.CustomInlineHtml
This span is not generated by the parser. It is intended to be used by others to inject custom HTML code in the context of a span. The HTML code is stored as the member text.
-
markdown.PackedText
This element is not generated by the parser. Rather, it is intended to be used in conjunction with the Visitor, to allow replacing a single span with multiple spans. As such, whenever a PackedText span is inserted into a FormattedText object (or a PackedText object), it is flattened into a list.
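To make the nesting of spans concrete, the following sketch models a few of the span classes in Python. It is not the library's Storm code: the class names mirror the ones above, but to_html, render and the TaggedSpan helper are hypothetical, and a plain list combined with Python's html.escape stands in for the escaping logic of the Html object.

```python
import html


class PlainTextSpan:
    """Leaf span: plain text that is escaped on output."""

    def __init__(self, text):
        self.text = text

    def to_html(self, out):
        out.append(html.escape(self.text))


class FormattedText:
    """A list of spans, rendered in order."""

    def __init__(self, *spans):
        self.spans = list(spans)

    def to_html(self, out):
        for span in self.spans:
            span.to_html(out)


class TaggedSpan:
    """Hypothetical helper: wraps a nested FormattedText in one HTML tag."""

    tag = ""

    def __init__(self, text):
        self.text = text  # a FormattedText, so formatting can nest

    def to_html(self, out):
        out.append(f"<{self.tag}>")
        self.text.to_html(out)
        out.append(f"</{self.tag}>")


class ItalicText(TaggedSpan):
    tag = "em"


class BoldText(TaggedSpan):
    tag = "strong"


def render(text):
    """Collect the output pieces of a FormattedText into one string."""
    out = []
    text.to_html(out)
    return "".join(out)


# Bold text that contains an italic word: two formats on one piece of text.
text = FormattedText(
    PlainTextSpan("a < b is "),
    BoldText(FormattedText(
        PlainTextSpan("very "),
        ItalicText(FormattedText(PlainTextSpan("important"))),
    )),
)
result = render(text)
print(result)  # a &lt; b is <strong>very <em>important</em></strong>
```

Because only the leaf spans hold raw strings, escaping happens in exactly one place in this model, while the formatting spans merely wrap whatever their nested text produces.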
Visitors
The classes above all have a visit function to allow traversing the hierarchy conveniently. The visit functions take a Visitor as a parameter, and call one of its member functions for each element. Each of the member functions returns an object of the same type as the one it received. The visit function uses the returned object to replace the original one. In this way, it is also possible to replace elements in the hierarchy, regardless of what element they are located inside.
If it is necessary to replace one element with multiple elements, the classes markdown.PackedElements and markdown.PackedText can be used.
The Visitor class has the following members. The default implementation of all of them is to simply return the object passed as a parameter, so that they implement a no-op by default:
-
markdown.Element visit(markdown.Element element)
Called for all elements.
-
markdown.ElementSeq visit(markdown.ElementSeq seq)
Called for all ElementSeq objects. Usually it is enough to consider only Elements.
-
markdown.TextSpan visit(markdown.TextSpan span)
Called for all TextSpan objects in the hierarchy.
-
markdown.FormattedText visit(markdown.FormattedText text)
Called for all FormattedText objects in the hierarchy. Usually it is enough to consider only TextSpans.
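The replacement mechanism can be sketched as follows. This is a Python model under stated assumptions, not the library's Storm implementation: Python has no overloading, so the overloaded visit members are modelled as visit_element and visit_seq, the classes are simplified stand-ins, AddNote is a hypothetical visitor, and the way ElementSeq.visit splices in PackedElements is inferred from the description above.

```python
class Visitor:
    """Default visitor: every hook returns its argument unchanged (no-op)."""

    def visit_element(self, element):
        return element

    def visit_seq(self, seq):
        return seq


class Element:
    def visit(self, v):
        # Leaf elements have no children to traverse first, so calling the
        # visitor is the only (and therefore the last) step.
        return v.visit_element(self)


class Heading(Element):
    def __init__(self, level, text):
        self.level = level
        self.text = text


class Paragraph(Element):
    def __init__(self, text):
        self.text = text


class PackedElements(Element):
    def __init__(self, elements):
        self.elements = list(elements)


class ElementSeq:
    def __init__(self, elements):
        self.elements = list(elements)

    def visit(self, v):
        result = []
        for element in self.elements:
            replacement = element.visit(v)
            if isinstance(replacement, PackedElements):
                # One element was replaced by several: splice them in.
                result.extend(replacement.elements)
            else:
                result.append(replacement)
        self.elements = result
        # The sequence itself is passed to the visitor as the last step.
        return v.visit_seq(self)


class AddNote(Visitor):
    """Hypothetical visitor: follow each top-level heading with a note."""

    def visit_element(self, element):
        if isinstance(element, Heading) and element.level == 1:
            return PackedElements([element, Paragraph("(generated note)")])
        return element


seq = ElementSeq([Heading(1, "Title"), Paragraph("Body"), Heading(2, "Sub")])
seq = seq.visit(AddNote())
summary = [(type(e).__name__, e.text) for e in seq.elements]
print(summary)
```

In this model the visitor only overrides the hook it cares about. The inherited no-op defaults leave everything else untouched, which matches the default behaviour described for the Visitor class.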