Demonstration of DocOnce support for LaTeX code block environments

Hans Petter Langtangen [1, 2]

[1] Center for Biomedical Computing, Simula Research Laboratory
[2] Department of Informatics, University of Oslo

Apr 8, 2016


Summary. This note demonstrates the DocOnce capabilities for generating LaTeX code for verbatim blocks of computer code. These new capabilities replace the need for both the stand-alone program ptex2tex and the simplified doconce ptex2tex utility. In fact, the new capabilities are more flexible than ptex2tex and result in much cleaner LaTeX code (especially for verbatim blocks with background color).

Blocks of computer code in LaTeX

History

Originally, DocOnce generated code for ptex2tex rather than plain LaTeX, because ptex2tex offered about 40 different styles for typesetting verbatim blocks of code. The ptex2tex utility relies on a comprehensive configuration file for setting the style of every code environment (pycod, fpro, sys, etc.). A simpler and quicker alternative, doconce ptex2tex, was developed later so that DocOnce did not depend on the comprehensive ptex2tex setup.

In 2015, a new implementation in DocOnce replaced both ptex2tex and doconce ptex2tex by generating LaTeX code directly. Rather than offering a range of packages for typesetting computer code, as the ptex2tex tool does, the implementation offers only three choices: fancyvrb, minted, or listings. These choices, with all their parameters, nevertheless span a much richer set of typesetting options than ptex2tex offers.

Quick overview of the functionality

When running doconce format latex mydoc or doconce format pdflatex mydoc, the command-line argument --latex_code_style=... specifies the typesetting of blocks of computer code. The result of the doconce format command is then a LaTeX file mydoc.tex, which can be processed by latex, pdflatex, or xelatex.

We use the term code environment for the DocOnce environments in which blocks of code are typeset. For example, the pycod environment is surrounded by the !bc pycod and !ec directives. DocOnce supports many such environments: pycod for code snippets in Python, pypro for complete executable Python programs, fcod and fpro for snippets and complete programs in Fortran, cppcod and cpppro for the C++ counterparts, and mcod and mpro for Matlab, to mention some.
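To make the !bc/!ec convention concrete, here is a minimal Python sketch that locates code environments in DocOnce source text. The helper code_blocks and the regular expression are illustrative assumptions and are not part of DocOnce; the sketch only assumes that !bc and !ec start at the beginning of a line.

```python
import re

# Hypothetical helper (not part of DocOnce): match "!bc <envir>" ... "!ec",
# where both directives are assumed to start at the beginning of a line.
BLOCK = re.compile(r"^!bc[ \t]*(\w*)[ \t]*\n(.*?)^!ec[ \t]*$",
                   re.MULTILINE | re.DOTALL)

def code_blocks(text):
    """Return a list of (environment name, code) pairs found in text."""
    return [(m.group(1), m.group(2)) for m in BLOCK.finditer(text)]

sample = """Some running text.

!bc pycod
x = 1
!ec

!bc sys
Terminal> python prog.py
!ec
"""

for envir, code in code_blocks(sample):
    print(envir, repr(code))
```

The environment name captured after !bc (pycod, sys, ...) is the name used when specifying typesetting for individual code environments.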

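The user can choose between three well-known packages for typesetting computer code in LaTeX:

  * vrb: the Verbatim environment from the fancyvrb package
  * pyg: the minted environment, based on Pygments
  * lst: the lstlisting environment from the listingsutf8 package

The names vrb, pyg, and lst select these packages/environments on the command line when running doconce format. Any other name is taken as the name of a LaTeX environment to use (see the section Specifying other packages).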

In addition, the user can specify a possibly colored background for the blocks of computer code and also set the parameters in the various environments. This information can be specified for each DocOnce code environment (pycod, sys, etc.) independently, including a common default choice for the code environments that are not specified.

The simplest choice: a single LaTeX environment for all blocks

The following command specifies the Verbatim (vrb) environment for all code blocks:

Terminal> doconce format pdflatex mydoc --latex_code_style=vrb

A DocOnce demo document has been made to illustrate what the various typesettings look like. The document contains a data file in the code environment dat, a complete executable Python program in the code environment pypro, and a terminal session in the code environment sys. The --latex_code_style=vrb option gives the plainest, most standard way of typesetting verbatim code blocks in LaTeX, see the result. More precisely, DocOnce generates a Verbatim environment with several parameters set:

\begin{Verbatim}[numbers=none,fontsize=\fontsize{9pt}{9pt},%
                 baselinestretch=0.95,xleftmargin=2mm]
...
\end{Verbatim}

This results in a slightly smaller font and slightly squeezed lines in the code block, which matches running text in a 10pt font well.

The xleftmargin=2mm parameter can be set explicitly to another value on the command line, e.g., --latex_code_leftmargin=7. The number is measured in mm. (Using square brackets, as shown below, it can also be set individually for different code environments.)

Using the minted (Pygments) tool

Replacing vrb by pyg switches the LaTeX environments to minted:

Terminal> doconce format pdflatex mydoc --latex_code_style=pyg

Now, the resulting PDF file has typesetting of computer code that depends on the programming language. For example, the Python program leads to

\begin{minted}[%
   fontsize=\fontsize{9pt}{9pt},linenos=false,%
   baselinestretch=1.0,fontfamily=tt,xleftmargin=2mm]{python}
...
\end{minted}

Remember -shell-escape when compiling minted (Pygments) code! The minted LaTeX environment requires latex or pdflatex to be run with the -shell-escape option:

Terminal> pdflatex -shell-escape mydoc

The minted style to be used can be specified by the --minted_latex_style= option, e.g.,

Terminal> doconce format pdflatex mydoc --latex_code_style=pyg --minted_latex_style=perldoc

The perldoc choice changes the colors from the default (Pygments default) choice generated previously.

Using the lstlisting tool

The third package for typesetting verbatim blocks of code is listingsutf8 with its lstlisting LaTeX environment, which in its plainest form gives a look not much different from the Verbatim environment:

Terminal> doconce format pdflatex mydoc --latex_code_style=lst

The resulting LaTeX code is:

\begin{lstlisting}[language=Python,style=simple,xleftmargin=2mm]
...
\end{lstlisting}

However, all possible lstlisting options can be set, as will be shown later.

Specifying other packages

Say you have a LaTeX package compcode with the environment ultimate that you want to use. This can be specified by

Terminal> doconce format pdflatex mydoc \
          --latex_code_style=ultimate --latex_packages=compcode

The resulting LaTeX code becomes

\begin{ultimate}
...
\end{ultimate}

Adding a colored background

In the previous example, we can add one of the predefined backgrounds in DocOnce:

Terminal> doconce format pdflatex doc --latex_code_style=lst-yellow2

The yellow2 background is light yellow. With colored backgrounds, note that the pro code environments (as in pypro) for complete executable programs get a 1mm wide, slightly darker bar at the left side of the code block. This almost invisible color change tells the reader that the code can be copied and run as it stands. (The cod code environments are used for snippets that will not normally run unless some additional statements are supplied.)

The general specification of a background is pkg-bg, where pkg is the package specification (vrb, pyg, or lst) and bg is the DocOnce name of a background. Background names used in this note include yellow1, yellow2, gray, and blue1.

If you want to tailor the background color, say change the yellow1 color to have RGB values (0.95, 0.95, 0.8) rather than (0.98, 0.98, 0.8), autoedit the .tex file with a regular expression:

Terminal> doconce subst 'yellow1\}\{rgb.+' \
          'yellow1}{rgb}{0.95, 0.95, 0.8}' mydoc.tex

or just replace the exact text:

Terminal> doconce replace \
          'cbg_yellow1}{rgb}{.98, .98, 0.8}' \
          'cbg_yellow1}{rgb}{.95, .95, 0.8}' mydoc.tex
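The same substitution can be sketched in Python with the re module, for example as part of a build script. The function set_background_rgb is a hypothetical helper, and the sample line assumes that the color is defined through a \definecolor command in the .tex file; only the text cbg_yellow1}{rgb}{...} is taken from the example above.

```python
import re

def set_background_rgb(tex, color, rgb):
    """Replace the RGB triple that follows '<color>}{rgb}' in LaTeX text."""
    pattern = re.escape(color) + r"\}\{rgb\}\{[^}]*\}"
    new = "%s}{rgb}{%s}" % (color, ", ".join(str(v) for v in rgb))
    # A function as replacement avoids any backslash interpretation of `new`
    return re.sub(pattern, lambda m: new, tex)

# Assumed form of the color definition in the generated .tex file
line = r"\definecolor{cbg_yellow1}{rgb}{.98, .98, 0.8}"
print(set_background_rgb(line, "cbg_yellow1", (0.95, 0.95, 0.8)))
```

Matching up to the closing brace, rather than to the end of the line as .+ does, leaves any text after the color definition untouched.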

Setting LaTeX environment parameters

It is easy to specify parameters to the lstlisting or the two other LaTeX environments:

Terminal> doconce format pdflatex doc \
"--latex_code_style=lst-yellow2[numbers=left,
numberstyle=\\tiny,numbersep=15pt,breaklines=true]"

(but with no linebreaks in the --latex_code_style argument!). These parameters turn on line numbers in the code blocks as well as wrapping of too long lines (breaklines=true). Note that any backslash in a LaTeX command must be a double backslash on the command line!
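Since the argument must be a single line, it can be convenient to keep a readable multi-line version in a script and join it before calling doconce. The following Python sketch shows one way to do that; one_line is a hypothetical helper, not a DocOnce function, and the example deliberately avoids parameters with backslashes so that no shell quoting issues arise.

```python
def one_line(spec):
    """Join a specification written over several lines into one line.

    Leading and trailing whitespace on each line is dropped; spaces
    inside a line are kept.
    """
    return "".join(part.strip() for part in spec.splitlines())

readable = """
lst-yellow2[numbers=left,
numbersep=15pt,breaklines=true]
"""

arg = "--latex_code_style=" + one_line(readable)
print(arg)
```

The printed string can be passed as one argument to doconce format, for instance as one element of the argument list given to subprocess.run.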

Here is an example where we use the fictitious code environment ultimate from the package compcode with a yellow background and with some environment parameters arg1 and arg2:

Terminal> doconce format pdflatex mydoc --latex_packages=compcode \
         '--latex_code_style=ultimate-yellow2[arg1=val1,arg2=val2]'

The resulting LaTeX code becomes

\begin{cod}{cbg_yellow2}\begin{ultimate}[arg1=val1,arg2=val2]
...
\end{ultimate}\end{cod}

In this way, you can use any environment in any package for typesetting code.

Specifying individual code environments

The colored background might be appropriate for computer code in the previous example, but maybe not so appropriate for the terminal session. Let us typeset the terminal session using the Verbatim environment, but rely on lst without line numbers as above for the other code environments:

Terminal> doconce format pdflatex doc \
          "--latex_code_style=default:lst-yellow2@sys:vrb"

That is, we specify the default choice (default) and the sys environment. The specifications are separated by @. One can add parameters to the LaTeX environments, e.g.,

Terminal> doconce format pdflatex doc \
  "--latex_code_style=default:lst-yellow2[numbers=left]@sys:vrb"

Here is a more fancy typesetting of sys environments with lines above and below and a title Terminal inside a box:

Terminal> doconce format pdflatex doc \
"--latex_code_style=default:lst-yellow2@
sys:vrb[frame=lines,label=\\fbox{{\\tiny Terminal}},framesep=2.5mm,
framerule=0.7pt,fontsize=\fontsize{9pt}{9pt}]"

(with no linebreaks though!). Here is the result.
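When many code environments get their own specification, the @-separated value can be assembled from a table of choices. This Python sketch assumes nothing about DocOnce beyond the envir:spec@envir:spec form described above; code_style is a hypothetical helper.

```python
def code_style(specs):
    """Join {envir: specification} pairs into one --latex_code_style value."""
    return "@".join("%s:%s" % (envir, spec) for envir, spec in specs.items())

specs = {
    "default": "lst-yellow2[numbers=left]",
    "sys": "vrb",
}

print("--latex_code_style=" + code_style(specs))
```

Dictionaries keep insertion order in Python 3.7 and later, so the specifications appear in the order they are written.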

Specifying the lst style

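DocOnce comes with some predefined styles for the lstlisting LaTeX environment. The style names that appear in this note are simple, gray, graycolor, yellow2_fb, blue1, blue1bar, blue1_bluegreen, and blue1bar_bluegreen.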

The styles with a colored background (yellow2_fb, gray, graycolor, blue1) should of course not be combined with another colored background (skip backgroundcolor specification or set it to white).

The user can also define any number of additional styles and put them in a file, say .mylststyles, and give them to doconce format through the command-line option --latex_code_lststyles=.mylststyles. Just include \lstdefinestyle{name}{...} commands in the file.
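As an illustration of what such a file may contain, the following Python sketch generates a \lstdefinestyle command from keyword arguments. The style name mystyle and the option values are made up for the example; the keys used (basicstyle, frame, numbers, numberstyle) are standard options of the listings package. The helper lst_style is hypothetical.

```python
def lst_style(name, **options):
    """Return a \\lstdefinestyle command with the given key=value options."""
    body = ",\n  ".join("%s=%s" % (key, value) for key, value in options.items())
    return "\\lstdefinestyle{%s}{\n  %s\n}\n" % (name, body)

# "mystyle" and its option values are illustrative only
text = lst_style(
    "mystyle",
    basicstyle=r"\small\ttfamily",
    frame="single",
    numbers="left",
    numberstyle=r"\tiny",
)
print(text)
# To use it: save `text` in .mylststyles and pass
# --latex_code_lststyles=.mylststyles to doconce format.
```

By analogy with the predefined styles, such a style would presumably be selected with a specification like lst[style=mystyle].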

Here is an example of specifying the yellow2_fb style with yellow background, coloring of code, and a frame around all code blocks (as made famous in the FEniCS book):

Terminal> doconce format pdflatex doc \
  "--latex_code_style=default:lst[style=yellow2_fb]"

You may check out the corresponding result.

Blue background and plain verbatim non-colored code (as made famous in the book A Primer on Scientific Programming with Python) results from

Terminal> doconce format pdflatex doc \
"--latex_code_style=default:lst[style=blue1]@
pypro:lst[style=blue1bar]@dat:lst[style=gray]@
sys:vrb[frame=lines,label=\\fbox{{\\tiny Terminal}},
framesep=2.5mm,framerule=0.7pt,fontsize=\fontsize{9pt}{9pt}]"

(but with no linebreaks as here). Note that this style does not offer the thin vertical darker-colored bar for pypro environments (indicating complete programs) that ptex2tex offers through the BlueBar environment. Instead, there is a darker-colored top+bottom frame around pypro code blocks. (The old vertical bar is enabled by a pypro:vrb-blue1 specification, but then with a colored background that is significantly larger than the computer code block.)

The 5th edition of the mentioned "Primer book" features syntax highlighting, offered by the blue1_bluegreen and blue1bar_bluegreen styles:

Terminal> doconce format pdflatex doc \
"--latex_code_style=default:lst[style=blue1_bluegreen]@
pypro:lst[style=blue1bar_bluegreen]@dat:lst[style=gray]@
sys:vrb[frame=lines,label=\\fbox{{\\tiny Terminal}},
framesep=2.5mm,framerule=0.7pt,fontsize=\fontsize{9pt}{9pt}]"

(but no linebreaks as here). Look at the result.

General syntax for --latex_code_style=

The --latex_code_style= option can take a set of code environment specifications separated by @. Each specification is of the form envir:pkg-bg[prms], where envir is the code environment name (pypro, sys, etc., or default), pkg is the package name (vrb, pyg, lst), bg is the DocOnce name of a potential background (can be omitted), and prms is a list of parameters for the LaTeX environment.

As an example, we may specify a default typesetting with lst and the blue1 background, using the greenblue style and numbering of lines; then we let the dat environment be typeset with the Verbatim environment with a light gray background; and finally we let sys also use Verbatim, but with many parameters for more fancy layout. The value of --latex_code_style= is then (split over several lines for increased readability - it must be one line as a terminal command!):

"--latex_code_style=default:lst-blue1[style=greenblue,
numbers=left,numberstyle=\\tiny,stepnumber=3,
numbersep=15pt,xleftmargin=1mm]@dat:vrb-gray@
sys:vrb[frame=lines,label=\\fbox{{\\tiny Terminal}},
framesep=2.5mm,framerule=0.7pt,fontsize=\fontsize{9pt}{9pt}]"

Here is the result of this detailed specification.
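The envir:pkg-bg[prms] syntax can be illustrated by a small Python parser. This is only a sketch of the grammar as described in this section, not DocOnce's own parsing code. It assumes that @ does not occur inside parameters and that a specification without an envir: prefix applies to the default.

```python
import re

# Hypothetical parser for the value of --latex_code_style (not DocOnce code)
SPEC = re.compile(
    r"^(?:(?P<envir>\w+):)?"     # optional code environment name
    r"(?P<pkg>\w+)"              # vrb, pyg, lst, or another environment
    r"(?:-(?P<bg>\w+))?"         # optional background name
    r"(?:\[(?P<prms>.*)\])?$",   # optional parameters in square brackets
    re.DOTALL)

def parse_code_style(value):
    """Return {envir: (pkg, bg, prms)}; missing parts are None."""
    result = {}
    for spec in value.split("@"):
        m = SPEC.match(spec.strip())
        if m is None:
            raise ValueError("cannot parse %r" % spec)
        envir = m.group("envir") or "default"
        result[envir] = (m.group("pkg"), m.group("bg"), m.group("prms"))
    return result

parsed = parse_code_style("default:lst-yellow2[numbers=left]@sys:vrb")
for envir, parts in parsed.items():
    print(envir, parts)
```

Parameters are kept as one string because their values may themselves contain commas, braces, and brackets.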

Size of colored background. The default colored background, as specified by (e.g.) vrb-blue, is significantly larger than the verbatim text. The surrounding space can be reduced by setting \fboxsep to a negative value, see the line right before the definition of the cod environment in the .tex file. However, if a !bbox is used in the document, \fboxsep cannot be set to a negative value without destroying those boxes. If one really wants the colored background to be about as large as the text, there are at present two options:
  1. use lst with a style (gray, yellow2_fb, blue1)
  2. use ptex2tex with the Blue and similar environments in the .ptex2tex.cfg file.
The results are very similar, so there is no need for ptex2tex.

Wrapping code environments with user-customized environments

Suppose you want to typeset your code with an environment that is not supported by DocOnce. For example, in LaTeX we want a headline before each code snippet with a section-based counter, the filename, and some title for the code. We also want double lines before and after the code snippet:





This is fairly easy: we define a user-defined environment in DocOnce, here called code, and provide a Python module that translates this environment into proper format-specific code. Details are provided in the DocOnce manual, see the section User-Defined Environments. In the present case, the syntax in a DocOnce file is to wrap every @@@CODE line or !bc environment between the !bu-code and !eu-code directives. The code in the figure above is produced by

!bu-code file=process.f Return `a` multiplied by `c`
!bc fcod
       subroutine process(a, n, c, r)
C      Return array r = c*a
       integer n
       real*8 a(n), c, r(n)
       integer i
       do i = 1,n
          r(i) = c*a(i)
       end do
       return
       end
!ec
!eu-code

The LaTeX code we want is

\begin{pycode}

\Verb!process.f!:
Return \Verb!a! multiplied by \Verb!c!

\hrule
\vspace{1mm}
\hrule

\begin{lstlisting}[language=Fortran,style=simple,xleftmargin=2mm]
       subroutine process(a, n, c, r)
C      Return array r = c*a
       integer n
       real*8 a(n), c, r(n)
       integer i
       do i = 1,n
          r(i) = c*a(i)
       end do
       return
       end
\end{lstlisting}

\hrule
\vspace{1mm}
\hrule
\end{pycode}

We have created a special LaTeX environment ("theorem") called pycode that LaTeX will give a section-based number (just like theorems, examples, etc).

To this end, we must write a file userdef_environments.py where we define the code environment for LaTeX and other formats (usually some tailored LaTeX code and just some reasonable common DocOnce code for other formats). The userdef_environments.py file looks like this in the current example:

import re

def latex_code(text, titleline, counter, format):
    # file=myprog.py label=my:label Some title...
    label, titleline = get_label(titleline, 'label')
    filename, titleline = get_label(titleline, 'file')
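    # Must be able to handle empty label and/or filename
    # (recognized by '')
    if label:
        # DocOnce label syntax, translated to \label{...} in LaTeX output
        label = 'label{%s}' % label
    if filename:
        # Backticks are DocOnce inline verbatim (\Verb!...! in LaTeX output)
        filename = '`%s`: ' % filename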
    s = r"""

\begin{pycode}
%s
%s
%s

\hrule
\vspace{1mm}
\hrule

%s

\hrule
\vspace{1mm}
\hrule
\end{pycode}
""" % (label, filename, titleline, text)
    return s

def do_code(text, titleline, counter, format):
    # file=myprog.py label=my:label Some title...
    label, titleline = get_label(titleline, 'label')
    filename, titleline = get_label(titleline, 'file')
    s = r"""

_Python code %d_: `%s`. %s

%s
""" % (counter, filename, titleline, text)
    return s


def get_label(titleline, label_text='label'):
    label = ''
    if label_text in titleline:
        pattern = r'%s=([^\s]+)' % label_text
        m = re.search(pattern, titleline)
        if m:
            label = m.group(1)
            titleline = re.sub(pattern, '', titleline).strip()
    return label, titleline

envir2format = {
    'intro': {
        'latex': r"""
\usepackage{amsthm}
\theoremstyle{definition}
\newtheorem{pycode}{Python code}[section]
""",},
    'code': {
        'latex': latex_code,
        'do': do_code,
        },
}

We grab information on the !bu-code line for a potential filename, label, and header for the code snippet.
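To see how the !bu-code line is taken apart, the get_label function from the listing above can be run on the title line of the Fortran example. The function body is copied from userdef_environments.py, with only a docstring added; the calls mirror those in latex_code.

```python
import re

def get_label(titleline, label_text='label'):
    """Extract 'label_text=value' from titleline; return (value, rest)."""
    label = ''
    if label_text in titleline:
        pattern = r'%s=([^\s]+)' % label_text
        m = re.search(pattern, titleline)
        if m:
            label = m.group(1)
            titleline = re.sub(pattern, '', titleline).strip()
    return label, titleline

titleline = 'file=process.f Return `a` multiplied by `c`'
label, titleline = get_label(titleline, 'label')
filename, titleline = get_label(titleline, 'file')

print(repr(label))      # '' since no label= is given
print(repr(filename))   # 'process.f'
print(repr(titleline))  # 'Return `a` multiplied by `c`'
```

The empty label is why latex_code must handle '' for the label and filename.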

When compiling to pdflatex, we simply use --latex_code_style=lst, but we could use more fancy styles if desired. You can inspect the PDF file for this example with a tailored, user-defined code environment.