prettypr

A generic pretty printer library. This module uses a strict-style, context-passing implementation of John Hughes's algorithm, described in "The Design of a Pretty-printing Library". The paragraph-style formatting, empty documents, floating documents, and null strings are my own additions to the algorithm.

To get started, you should read about the document() data type; the main constructor functions: text/1, above/2, beside/2, nest/2, sep/1, and par/2; and the main layout function format/3.

If you simply want to format a paragraph of plain text, you probably want to use the text_par/2 function, as in the following example:

  prettypr:format(prettypr:text_par("Lorem ipsum dolor sit amet"), 20)

Types


document() =
            null |
            #text{s = undefined | deep_string()} |
            #nest{n = undefined | integer(), d = undefined | document()} |
            #beside{d1 = undefined | document(),
                    d2 = undefined | document()} |
            #above{d1 = undefined | document(),
                   d2 = undefined | document()} |
            #sep{ds = undefined | [document()],
                 i = integer(),
                 p = boolean()} |
            #float{d = undefined | document(),
                   h = undefined | integer(),
                   v = undefined | integer()} |
            #union{d1 = undefined | document(),
                   d2 = undefined | document()} |
            #fit{d = undefined | document()}

Functions


above(D1::document(), D2::document()) -> document()

Concatenates documents vertically. Returns a document representing the concatenation of the documents D1 and D2 such that the first line of D2 follows directly below the last line of D1, and the first character of D2 is in the same horizontal column as the first character of D1, in all possible layouts.

Examples:

     ab  cd  =>  ab
                 cd
 
                    abc
     abc   fgh  =>   de
      de    ij      fgh
                     ij
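
As a minimal sketch, the two diagrams above can be written with text/1, nest/2 and above/2 and rendered with format/1. The module name above_demo and its function names are hypothetical, chosen only for this illustration.

```erlang
-module(above_demo).
-export([two_lines/0, four_lines/0]).

%% First diagram: two single-line texts stacked vertically.
two_lines() ->
    D = prettypr:above(prettypr:text("ab"), prettypr:text("cd")),
    prettypr:format(D).

%% Second diagram: both operands are two lines tall, and the second
%% line of each is indented by one column using nest/2.
four_lines() ->
    Left = prettypr:above(prettypr:text("abc"),
                          prettypr:nest(1, prettypr:text("de"))),
    Right = prettypr:above(prettypr:text("fgh"),
                           prettypr:nest(1, prettypr:text("ij"))),
    prettypr:format(prettypr:above(Left, Right)).
```

Note that the string returned by format/1 separates lines with newline characters and has no trailing newline here.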

beside(D1::document(), D2::document()) -> document()

Concatenates documents horizontally. Returns a document representing the concatenation of the documents D1 and D2 such that the last character of D1 is horizontally adjacent to the first character of D2, in all possible layouts. (Note: any indentation of D2 is lost.)

Examples:

     ab  cd  =>  abcd
 
     ab  ef      ab
     cd  gh  =>  cdef
                   gh
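
The same two diagrams can be sketched in code as follows. The module name beside_demo and its function names are hypothetical; only text/1, above/2, beside/2 and format/1 come from this module.

```erlang
-module(beside_demo).
-export([one_line/0, two_lines/0]).

%% First diagram: two single-line texts joined on one line.
one_line() ->
    D = prettypr:beside(prettypr:text("ab"), prettypr:text("cd")),
    prettypr:format(D).

%% Second diagram: the first line of the right operand is attached to
%% the last line of the left operand; the remaining lines of the right
%% operand keep their position relative to its first line.
two_lines() ->
    Left = prettypr:above(prettypr:text("ab"), prettypr:text("cd")),
    Right = prettypr:above(prettypr:text("ef"), prettypr:text("gh")),
    prettypr:format(prettypr:beside(Left, Right)).
```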

best(D::document(), PaperWidth::integer(), LineWidth::integer()) -> empty | document()

Selects a "best" layout for a document, creating a corresponding fixed-layout document. If no layout could be produced, the atom empty is returned instead. For details about PaperWidth and LineWidth, see format/3. The function is idempotent.

One possible use of this function is to compute a fixed layout for a document, which can then be included as part of a larger document. For example:

     above(text("Example:"), nest(8, best(D, W - 12, L - 6)))

will format D as a displayed-text example indented by 8, whose right margin is indented by 4 relative to the paper width W of the surrounding document, and whose maximum individual line length is shorter by 6 than the line length L of the surrounding document.

This function is used by the format/3 function to prepare a document before being laid out as text.
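
The following sketch illustrates the idea of fixing a layout at one width and embedding it in a document that is formatted at another. The module name best_demo and the function embedded/0 are hypothetical, and the widths are arbitrary values picked for the illustration. Because best/3 may return the atom empty, that case is handled explicitly.

```erlang
-module(best_demo).
-export([embedded/0]).

embedded() ->
    Inner = prettypr:sep([prettypr:text("ab"), prettypr:text("cd")]),
    %% Paper width 3 is too narrow for "ab cd", so the fixed layout
    %% chosen here should be the vertical one.
    case prettypr:best(Inner, 3, 3) of
        empty ->
            {error, no_layout};
        Fixed ->
            %% The fixed layout is embedded in a larger document, which
            %% is then formatted at a much wider paper width.
            Outer = prettypr:above(prettypr:text("Example:"),
                                   prettypr:nest(2, Fixed)),
            {ok, prettypr:format(Outer, 80)}
    end.
```

If the layout was fixed as intended, the inner document stays vertical even though the surrounding document has room for the horizontal form.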

break(D::document()) -> document()

Forces a line break at the end of the given document. This is a utility function; see empty/0 for details.

empty() -> document()

Yields the empty document, which has neither height nor width. (empty is thus different from an empty text string, which has zero width but height 1.)

Empty documents are occasionally useful. In particular, above(X, empty()) forces a new line after X without leaving an empty line below it. Since this is a common idiom, the utility function break/1 places a given document in such a context.

See also: text/1.
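
A small sketch of the idiom described above, comparing break/1 with the explicit above(X, empty()) form. The module name break_demo and its function names are hypothetical.

```erlang
-module(break_demo).
-export([with_break/0, with_empty/0]).

%% Uses the utility function break/1.
with_break() ->
    prettypr:format(prettypr:break(prettypr:text("ab"))).

%% Spells out the same context by hand.
with_empty() ->
    D = prettypr:above(prettypr:text("ab"), prettypr:empty()),
    prettypr:format(D).
```

Both functions are expected to return the same string, since break/1 is described as placing its argument in exactly this context.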

floating(D::document()) -> document()

Equivalent to floating(D, 0, 0).

floating(D::document(), Hp::integer(), Vp::integer()) -> document()

Creates a "floating" document. The result represents the same set of layouts as D; however, a floating document may be moved relative to other floating documents immediately beside or above it, according to their relative horizontal and vertical priorities. These priorities are set with the Hp and Vp parameters; if omitted, both default to zero.

Notes: Floating documents appear to work well, but are currently less general than you might wish, losing effect when embedded in certain contexts. It is possible to nest floating-operators (even with different priorities), but the effects may be difficult to predict. In any case, note that the way the algorithm reorders floating documents amounts to a "bubblesort", so don't expect it to be able to sort large sequences of floating documents quickly.

follow(D1::document(), D2::document()) -> document()

Equivalent to follow(D1, D2, 0).

follow(D1::document(), D2::document(), Offset::integer()) -> document()

Separates two documents by either a single space, or a line break and indentation. In other words, one of the layouts

     abc def

or

     abc
      def

will be generated, using the optional offset in the latter case. This is often useful for typesetting programming language constructs.

This is a utility function; see par/2 for further details.

See also: follow/2.
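
As a rough illustration, the sketch below formats the same follow/3 document at a wide and at a narrow paper width. The module name follow_demo, its function names, and the widths 80 and 5 are assumptions made for the example.

```erlang
-module(follow_demo).
-export([wide/0, narrow/0]).

%% "abc" followed by "def", with an offset of 1 for the broken layout.
doc() ->
    prettypr:follow(prettypr:text("abc"), prettypr:text("def"), 1).

%% At paper width 80 both texts fit on one line.
wide() ->
    prettypr:format(doc(), 80).

%% At paper width 5 the single-line form "abc def" does not fit, so the
%% layout with a line break is the one that should be selected.
narrow() ->
    prettypr:format(doc(), 5).
```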

format(D::document()) -> string()

Equivalent to format(D, 80).

format(D::document(), PaperWidth::integer()) -> string()

Equivalent to format(D, PaperWidth, 65).

format(D::document(), PaperWidth::integer(), LineWidth::integer()) -> string()

Computes a layout for a document and returns the corresponding text. See document() for further information. Throws no_layout if no layout could be selected.

PaperWidth specifies the total width (in character positions) of the field for which the text is to be laid out. LineWidth specifies the desired maximum width (in number of characters) of the text printed on any single line, disregarding leading and trailing white space. These parameters need to be properly balanced in order to produce good layouts. By default, PaperWidth is 80 and LineWidth is 65.

See also: best/3.
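
The sketch below shows the relationship between the three arities and one way of handling the no_layout exception. The module name format_demo and the helper safe_format/3 are hypothetical and not part of this module.

```erlang
-module(format_demo).
-export([defaults_agree/0, safe_format/3]).

%% format/1 and format/2 fill in the default PaperWidth (80) and
%% LineWidth (65), so all three calls should return the same text.
defaults_agree() ->
    D = prettypr:text_par("Lorem ipsum dolor sit amet"),
    A = prettypr:format(D),
    A =:= prettypr:format(D, 80) andalso A =:= prettypr:format(D, 80, 65).

%% Hypothetical wrapper that turns the thrown no_layout into a tuple.
safe_format(D, PaperWidth, LineWidth) ->
    try prettypr:format(D, PaperWidth, LineWidth) of
        Text -> {ok, Text}
    catch
        throw:no_layout -> {error, no_layout}
    end.
```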

nest(N::integer(), D::document()) -> document()

Indents a document a number of character positions to the right. Note that N may be negative, shifting the text to the left, or zero, in which case D is returned unchanged.
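
A short sketch of positive, negative and zero indentation. The module name nest_demo and its function names are hypothetical.

```erlang
-module(nest_demo).
-export([indented/0, outdented/0, zero_is_identity/0]).

%% The second line is shifted two columns to the right.
indented() ->
    D = prettypr:above(prettypr:text("a"),
                       prettypr:nest(2, prettypr:text("b"))),
    prettypr:format(D).

%% A negative N shifts "b" two columns back to the left, relative to
%% the surrounding indentation of 4.
outdented() ->
    Inner = prettypr:above(prettypr:text("a"),
                           prettypr:nest(-2, prettypr:text("b"))),
    prettypr:format(prettypr:nest(4, Inner)).

%% With N = 0 the document is returned unchanged.
zero_is_identity() ->
    D = prettypr:text("a"),
    prettypr:nest(0, D) =:= D.
```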

null_text(Characters::string()) -> document()

Similar to text/1, but the result is treated as having zero width, regardless of the actual length of the string. Null text is typically used for markup, which is supposed to have no effect on the actual layout.

The standard example is when formatting source code as HTML to be placed within <pre>...</pre> markup, and using e.g. <i> and <b> to make parts of the source code stand out. In this case, the markup does not add to the width of the text when viewed in an HTML browser, so the layout engine should simply pretend that the markup has zero width.

See also: empty/0, text/1.
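
The HTML scenario above might be sketched as follows. The module name null_text_demo, the helper bold/1 and the paper width 6 are assumptions made for the illustration.

```erlang
-module(null_text_demo).
-export([bold/1, plain/0, marked_up/0]).

%% Hypothetical helper: wraps a document in zero-width <b> markup.
bold(D) ->
    prettypr:beside(prettypr:null_text("<b>"),
                    prettypr:beside(D, prettypr:null_text("</b>"))).

%% The markup characters still appear in the output text.
plain() ->
    prettypr:format(bold(prettypr:text("x"))).

%% The visible text "ab cd" is 5 columns wide. Since the markup is
%% treated as having zero width, the horizontal layout should still be
%% selected at paper width 6, although the raw string is longer.
marked_up() ->
    D = prettypr:sep([bold(prettypr:text("ab")), prettypr:text("cd")]),
    prettypr:format(D, 6).
```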

par(Docs::[document()]) -> document()

Equivalent to par(Docs, 0).

par(Docs::[document()], Offset::integer()) -> document()

Arranges documents in a paragraph-like layout. Returns a document representing all possible left-aligned paragraph-like layouts of the (nonempty) sequence Docs of documents. Elements in Docs are separated horizontally by a single space character and vertically with a single line break. All lines following the first (if any) are indented to the same left column, whose indentation is specified by the optional Offset parameter relative to the position of the first element in Docs. For example, with an offset of -4, the following layout can be produced, for a list of documents representing the numbers 0 to 15:

         0 1 2 3
     4 5 6 7 8 9
     10 11 12 13
     14 15

or with an offset of +2:

     0 1 2 3 4 5 6
       7 8 9 10 11
       12 13 14 15

The utility function text_par/2 can be used to easily transform a string of text into a par representation by splitting it into words.

Note that whenever a document in Docs contains a line break, it will be placed on a separate line. Thus, neither a layout such as

     ab cd
        ef

nor

     ab
     cd ef

will be generated. However, a useful idiom for making the former variant possible (when wanted) is beside(par([D1, text("")], N), D2) for two documents D1 and D2. This will break the line between D1 and D2 if D1 contains a line break (or if otherwise necessary), and optionally further indent D2 by N character positions. The utility function follow/3 creates this context for two documents D1 and D2, and an optional integer N.

See also: par/1, text_par/2.
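
The number example above can be reproduced with a sketch along these lines. The module name par_demo and the function numbers/1 are hypothetical; the paper width is left as a parameter so that different layouts can be tried.

```erlang
-module(par_demo).
-export([numbers/1]).

%% Builds a paragraph of the numbers 0 to 15 with an offset of +2 and
%% formats it at the given paper width.
numbers(PaperWidth) ->
    Docs = [prettypr:text(integer_to_list(N)) || N <- lists:seq(0, 15)],
    prettypr:format(prettypr:par(Docs, 2), PaperWidth).
```

With a wide paper width such as 80, all elements fit on a single line separated by spaces. With a narrow width such as 13, the paragraph has to be broken over several lines, with the lines after the first indented by the offset.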

sep(Docs::[document()]) -> document()

Arranges documents horizontally or vertically, separated by whitespace. Returns a document representing two alternative layouts of the (nonempty) sequence Docs of documents, such that either all elements in Docs are concatenated horizontally, and separated by a space character, or all elements are concatenated vertically (without extra separation).

Note: If some document in Docs contains a line break, the vertical layout will always be selected.

Examples:

                                  ab
     ab  cd  ef  =>  ab cd ef  |  cd
                                  ef
 
     ab           ab
     cd  ef  =>   cd
                  ef

See also: par/2.
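
A sketch of the two alternatives and of the note about embedded line breaks. The module name sep_demo, its function names, and the paper widths are assumptions made for the example.

```erlang
-module(sep_demo).
-export([horizontal/0, vertical/0, forced_vertical/0]).

docs() ->
    [prettypr:text("ab"), prettypr:text("cd"), prettypr:text("ef")].

%% Enough room: elements are placed on one line, separated by spaces.
horizontal() ->
    prettypr:format(prettypr:sep(docs()), 80).

%% Paper width 5 is too narrow for "ab cd ef", so the elements are
%% stacked vertically instead.
vertical() ->
    prettypr:format(prettypr:sep(docs()), 5).

%% The first element contains a line break, so the vertical layout is
%% used even though the paper is wide.
forced_vertical() ->
    First = prettypr:above(prettypr:text("ab"), prettypr:text("cd")),
    prettypr:format(prettypr:sep([First, prettypr:text("ef")]), 80).
```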

text(Characters::string()) -> document()

Yields a document representing a fixed, unbreakable sequence of characters. The string should contain only printable characters (tabs allowed but not recommended), and not newline, line feed, vertical tab, etc. A tab character (\t) is interpreted as padding of 1-8 space characters to the next column of 8 characters within the string.

See also: empty/0, null_text/1, text_par/2.

text_par(Text::string()) -> document()

Equivalent to text_par(Text, 0).

text_par(Text::string(), Indentation::integer()) -> document()

Yields a document representing paragraph-formatted plain text. The optional Indentation parameter specifies the extra indentation of the first line of the paragraph. For example, text_par("Lorem ipsum dolor sit amet", N) could represent

     Lorem ipsum dolor
     sit amet

if N = 0, or

       Lorem ipsum
     dolor sit amet

if N = 2, or

     Lorem ipsum dolor
       sit amet

if N = -2.

(The sign of the indentation is thus reversed compared to the par/2 function, and the behaviour varies slightly depending on the sign in order to match the expected layout of a paragraph of text.)

Note that this is just a utility function, which does all the work of splitting the given string into words separated by whitespace and setting up a par with the proper indentation, containing a list of text elements.

See also: par/2, text/1, text_par/1.
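
To experiment with the three cases above, a sketch like the following can be used. The module name text_par_demo and the function lorem/1 are hypothetical, and the paper width of 20 is taken from the example in the module overview.

```erlang
-module(text_par_demo).
-export([lorem/1]).

%% Formats the sample sentence at paper width 20 with the given
%% first-line indentation N (for example 0, 2 or -2).
lorem(N) ->
    D = prettypr:text_par("Lorem ipsum dolor sit amet", N),
    prettypr:format(D, 20).
```

Printing the result with io:put_chars/1 for different values of N shows how the sign of the indentation affects the first line versus the remaining lines. The exact line breaks depend on the paper width chosen.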

Richard Carlsson carlsson.richard@gmail.com