The Treeside Markup Language

Git repository — Latest released version 0.1.1

Treeside is a lightweight, extensible, and general markup syntax. It combines the generality of XML with significant indentation and S-expression syntax to make documents that are easy to read, write, and process. Compare a Treeside document with the equivalent version in XML:

ebnf|
  r-name| expression
  r-prod| {nt expression}
          {nt binary operation}
          {nt expression}
  r-prod| {nt number}

  r-name| binary operation
  r-prod| [+-*/]

  r-name| number
  r-prod| {gplus {nt numeral}}

  r-name| numeral
  r-prod| [0123456789]

<ebnf>
<r-name>expression</r-name>
<r-prod><nt>expression</nt><nt>binary operation</nt><nt>expression</nt></r-prod>
<r-prod><nt>number</nt></r-prod>

<r-name>binary operation</r-name>
<r-prod>[+-*/]</r-prod>

<r-name>number</r-name>
<r-prod><gplus><nt>numeral</nt></gplus></r-prod>

<r-name>numeral</r-name>
<r-prod>[0123456789]</r-prod>
</ebnf>

Treeside is written in Scheme and comes with a standard library of macros for generating tables of contents, indices, automatically formatted code documentation, HTML, and more. Writing your own macros is easy.

Treeside is written for the R7RS and works on CHICKEN 5, Chibi Scheme, and Gauche 0.9.15. Ports to other Schemes and to the R6RS will come later. Treeside is currently in the early stages of development. Things may, and probably will, break, and much of the documentation is likely to be outdated.

  1. The Treeside Markup Language
    1. Installing
    2. Quickstart: Your First Treeside Document
    3. The Syntax of Treeside
      1. The Rules for Block Syntax
      2. The Rules for Inline Syntax

Installing

Download Treeside from the Git repository.

Chibi Scheme and Gauche can run Treeside out of the box: just add the lib folder to your include path.

Treeside can also be compiled using CHICKEN. In the root of the repository, run

chicken-install treeside

Treeside's test suite depends on cuprate, which in turn depends on SRFI 225. See the cuprate repository for details.

Quickstart: Your First Treeside Document

(The following is written for Treeside 0.1.)

The following is a minimal standalone Treeside document showing off many of the features of the library:

head| link| @| href| public/treeside-simple.css
               rel| stylesheet
               type| text/css

nh1| My First Document

the-table-of-contents|

Hello, world!

This line is on a separate paragraph. {b HTML markup works as you would
expect it to.}

Here is an example of {href `https://floride.moe/treeside` Treeside}
specific markup.

nh2| Other types of markup

li| Here is a list.
li| The encompassing unordered list element is autoinferred.
    This block also shows off the significant whitespace markup of
    Treeside.

Treeside has a {label `treeside label` labeling} system, like HTML
id attributes.

{index {l `example of subindexing` `label`} {ref `treeside label` mentioned}}
It also has an indexing system that can automatically generate alphabetic
indexes.

pre<pre-block
If you have a large block of text,  like
a code listing or  a  particularly  long
section, you can use a delimited  block.
A  delimited  block looks  like a  block
with  significant  indentation,  but the
tag name  ends with a <.  The text after
the tag name is the delimiter.

b<bold-block
Delimited blocks  nest. They are  closed
in order, like XML tags.
bold-block>
pre-block>

nh3| EBNF Example

indexed-ebnf|
  r-name| s expression
  r-prod| `(` {gstar {ntref `s expression`}} `)`
  r-prod| {ntref `atom`}
  r-name| atom
  r-prod| {non-terminal a number}
  r-prod| {non-terminal a symbol}

nh3| Documentation Example

feature|
  library| (scheme base)
  procedure| name| call-with-current-continuation
             args| name| proc
  procedure| name| call/cc
             args| name| proc

  The procedure {var call-with-current-continuation} (or its
  equivalent abbreviation {var call/cc}) packages the current
  continuation (see the rationale below) as an “escape procedure”
  and passes it as an argument to {var proc}.

feature|
  syntax| {name let} {ntref bindings} {non-terminal body}
  ebnf|
    r-name| bindings
    r-prod| ({gstar {ntref binding}})
    r-name| binding
    r-prod| ({non-terminal identifier} {non-terminal expression})

  Binding form.

nh2| Index
the-index|

Paste this into a file named my-first-document.tsm in the Treeside root directory. Then run one of the following commands, depending on which Scheme implementation you use:

chibi-scheme -I lib ./bin/markup-simple.scm < my-first-document.tsm \
                                            > my-first-document.html
gosh -I lib ./bin/markup-simple.scm < my-first-document.tsm \
                                    > my-first-document.html
csi ./bin/markup-simple.scm < my-first-document.tsm \
                            > my-first-document.html

Then open the resulting HTML document in a web browser. You should see the nicely formatted Treeside document you just wrote. (The default stylesheet uses a lot of CSS3, so it may not work on old browsers.)

The Syntax of Treeside

Treeside is made up of two orthogonal markup syntaxes: the “block” syntax and the “inline” syntax, called iqexpr for “implicitly-quoted expression”. Either one could be used without the other, but they are designed to complement each other.

The Rules for Block Syntax

The block syntax uses significant indentation to determine which tags go where. The rules are:

  1. Each block has a tag. A tag starts with an ASCII alphanumeric character or @, and is followed by a sequence of zero or more ASCII alphanumeric characters, _, @, -, ., or :.

    Examples of valid tags: p, @, 1, my-custom-tag.

    Examples of invalid tags: call/cc, lámbda, %.

  2. An indented block is opened with one or more tags, each of which ends with | and is separated from the next by at least one space.

    Examples: p| , div| , code| pre|.

  3. If there is non-tag text in the line after the tags, then that text is placed in a node with those tags.

    For example, code| pre| asdf corresponds to <code><pre>asdf</pre></code>.

  4. If there was non-tag text in the line after the tags, then further text is added to the node by indenting it at least as far as that text.

    For example,

    b| This text
       is bold.

    Corresponds to the XML <b>This text&#x0A;is bold.</b>

  5. Text indented to the level one past the | of a previous tag closes the tags that follow that tag, but the text itself stays inside that tag.

    For example:

    b| i| This text is bold
          and italic.
    b| i| This text is bold and italic,
       but this text is just bold.

    Corresponds to the XML

    <b><i>This text is bold&#x0A;and italic.</i></b>
    <b><i>This text is bold and italic,</i>&#x0A;but this text is just bold.</b>

  6. A line that opens blocks but has no text after the tags infers its indent from the next line. If the next line is indented at least one column further than the first character of the first tag, then all text with at least that indent is part of the blocks that were opened. The blocks are all closed once text that is not indented that far is encountered.

    For example:

    ul|
     li| This is a list item.
     li| The list items are under the <ul> node.
     li|
       code| pre|
         Here is some preformatted code
         in the list element.
       This text is a part of the list element, not the code block.

    Is equivalent to the XML:

    <ul>
     <li>This is a list item.</li>
     <li>The list items are under the &lt;ul&gt; node.</li>
     <li><code><pre>Here is some preformatted code
     in the list element.</pre></code>
     This text is a part of the list element, not the code block.</li></ul>

  7. A line that opens blocks whose last tag ends with < starts a delimited block. The delimiter that follows < is any sequence that does not contain ASCII whitespace, and only whitespace may appear between the delimiter and the end of the line. The block is closed by repeating the delimiter followed by >, which closes all indented blocks enclosing the delimited block.

    The text in a delimited block does not have to be indented. Example:

    code<my-code-block
    This code does not have to
    be indented.
    my-code-block>

    Corresponds to the XML:

    <code>This code does not have to&#x0A;be indented.</code>

  8. The special tag verb does not interpret tags inside of it.

    Example:

    code| verb|
      p| This is an example of Treeside syntax.
      This is not actually interpreted as it.

    Corresponds to the XML:

    <code><verb>p| This is an example of Treeside syntax.
    This is not actually interpreted as it.</verb></code> 
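
Rules 1 to 3 can be illustrated with a short sketch. It is written in Python purely for illustration and is not Treeside's own (Scheme) implementation; the names TAG, is_valid_tag, split_block_line, and wrap_xml are hypothetical. It only handles the tags and text on a single line, and ignores indentation, delimited blocks, verb, and XML escaping.

```python
import re

# Rule 1: an ASCII alphanumeric or @, then zero or more of
# ASCII alphanumerics, _, @, -, ., or :  (hypothetical helper names).
TAG = r"[A-Za-z0-9@][A-Za-z0-9_@.:-]*"
TAG_RE = re.compile(TAG)
# Rule 2: a tag ends with | and is followed by spaces or the end of line.
HEAD_RE = re.compile(rf"({TAG})\|(?: +|$)")


def is_valid_tag(name):
    """Return True if the whole string is a tag according to rule 1."""
    return TAG_RE.fullmatch(name) is not None


def split_block_line(line):
    """Split one line into (list of opening tags, remaining text)."""
    rest = line.lstrip(" ")
    tags = []
    while True:
        m = HEAD_RE.match(rest)
        if m is None:
            break
        tags.append(m.group(1))
        rest = rest[m.end():]
    return tags, rest


def wrap_xml(tags, text):
    """Rule 3: nest the text inside the tags, outermost tag first."""
    out = text
    for tag in reversed(tags):
        out = f"<{tag}>{out}</{tag}>"
    return out


tags, text = split_block_line("code| pre| asdf")
print(wrap_xml(tags, text))
```

A line without any leading tags, such as ordinary paragraph text, comes back from split_block_line with an empty tag list and the text unchanged.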

The Rules for Inline Syntax

The inline syntax is very similar to S-expressions. The major difference between S-expressions and inline syntax (iqexprs) is that iqexprs keep almost all whitespace as “atmospheric” whitespace. Since iqexprs correspond to XML elements, they can only have the equivalent of symbols as their head. Hence they are more similar to SXML markup than to general S-expressions.

  1. A node is started using { and closed using }.

    Examples: {b This text is bold.} corresponds to <b>This text is bold.</b>, and {br} corresponds to <br />.

  2. A tag has the same lexical syntax as in the block syntax. The tag must be followed by one or more spaces, which are discarded by the parser.

    Example: {b bold} and {b    bold} parse the same.

  3. An inline verbatim node is started using one or more `, and closed using the same number of `. No parsing happens in between. The number of tick marks is preserved in the node.

    Example: `verbatim` parses to <verbatim><n>1</n>verbatim</verbatim>, and ``verbatim`` parses to <verbatim><n>2</n>verbatim</verbatim>.

  4. To prevent a character from being interpreted as the start of a node or the start of an inline verbatim, prefix it with \.

    Example: \{b bold\} does not make a node.

  5. Spaces before {, before and after }, and before and after the starting/ending characters of an inline verbatim block are marked up as atmospheric spaces.

    Example: This is {b bold text } and ` verbatim ` text. becomes (with newlines for clarity)

    This is<atmospheric> </atmospheric><b>bold text<atmospheric> </atmospheric></b>
    and<atmospheric> </atmospheric>
    <verbatim><n>1</n><atmospheric> </atmospheric>verbatim
    <atmospheric> </atmospheric></verbatim>
    <atmospheric> </atmospheric>text.

    The atmospheric spaces are removed in code sections, such as the attributes of a node, and are preserved everywhere else.
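
Rules 1 to 4 can be sketched as a small recursive parser. This is an illustration in Python under stated assumptions, not Treeside's actual parser: the function names parse_inline and _parse are hypothetical, atmospheric spaces (rule 5) are not modelled and whitespace is passed through as plain text, and attributes are not handled.

```python
import re
from xml.sax.saxutils import escape

# Same lexical syntax for tags as in the block syntax.
TAG_RE = re.compile(r"[A-Za-z0-9@][A-Za-z0-9_@.:-]*")
TICKS_RE = re.compile(r"`+")


def parse_inline(src):
    """Translate inline syntax to an XML string (hypothetical helper)."""
    xml, _ = _parse(src, 0, top=True)
    return xml


def _parse(src, pos, top):
    out = []
    while pos < len(src):
        ch = src[pos]
        if ch == "\\" and pos + 1 < len(src):
            # Rule 4: a backslash makes the next character literal.
            out.append(escape(src[pos + 1]))
            pos += 2
        elif ch == "{":
            # Rule 1: { starts a node; rule 2: a tag, then discarded spaces.
            m = TAG_RE.match(src, pos + 1)
            if m is None:
                raise ValueError(f"expected a tag at offset {pos + 1}")
            pos = m.end()
            while pos < len(src) and src[pos] == " ":
                pos += 1
            body, pos = _parse(src, pos, top=False)
            tag = m.group()
            out.append(f"<{tag}>{body}</{tag}>" if body else f"<{tag} />")
        elif ch == "}":
            if top:
                raise ValueError(f"unmatched }} at offset {pos}")
            return "".join(out), pos + 1
        elif ch == "`":
            # Rule 3: n backticks open a verbatim node closed by n backticks.
            ticks = TICKS_RE.match(src, pos).group()
            end = src.find(ticks, pos + len(ticks))
            if end == -1:
                raise ValueError("unclosed inline verbatim")
            content = src[pos + len(ticks):end]
            out.append(f"<verbatim><n>{len(ticks)}</n>{escape(content)}</verbatim>")
            pos = end + len(ticks)
        else:
            out.append(escape(ch))
            pos += 1
    if not top:
        raise ValueError("unclosed {")
    return "".join(out), pos


print(parse_inline("{b This text is bold.} and `verbatim`"))
```

Recursing on { and returning on } keeps the nesting of nodes explicit, which mirrors how S-expression readers are usually written.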