Sample problem
The ascii notebook format
Cell delimiter lines
Include lines
Mako processing
Example on syntax
The compiled file
The generator code
Summary. This note explains how to write your own notebook generator in Python such that you can write a notebook in plain ascii in your favorite editor and also use handy tools such as preprocessors to introduce variables and other programming constructs into the text as well as to run computations.
Ascii input is particularly useful if you have LaTeX code that you want to make use of in notebooks. Then you must translate the LaTeX code to the syntax described here and run the compiler to be described.
The notebook generator will be demonstrated through a specific example: writing a little report where we 1) present a differential equation, 2) solve it by SymPy, and 3) show Python code for the solution and some computations. We show how the SymPy calculations can be done on the fly while compiling the document: results in Python variables from the SymPy calculations are magically propagated into the text. (This functionality is quite similar to PythonTeX, but based on a standard template language, Mako, instead of quite comprehensive LaTeX code.)
We go for a very simple format: lines starting with ----- are delimiter lines between cells.
Cells are written in either plain Markdown or as a set of statements in
a programming language, depending on whether the cell is a text or
code cell.
Delimiter lines with an extension text x, as in -----x, indicate a code cell in language x, where x is a short name for the language, typically the file extension: py for Python, f for Fortran, c for C, cpp for C++, js for JavaScript, sh for Bash or another Unix shell, sys for the console (terminal window), java for Java, tex for LaTeX or TeX, html for HTML, etc.
If x is followed by -t, as in -----py-t, the cell is not a code cell, but a standard static Markdown code block typeset within triple backticks as usual in Markdown. (Sometimes one wants to show code, but it is not intended to be executable.)
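The delimiter rules above can be illustrated with a small, stand-alone sketch. It is not the generator's own parser: the function name classify_delimiter and the reduced LANGUAGES table are assumptions made for illustration only.

```python
import re

# Hypothetical, reduced table of short names (illustration only)
LANGUAGES = {'py': 'Python', 'f': 'Fortran', 'c': 'C', 'sys': 'Bash'}

def classify_delimiter(line):
    """Return (cell_kind, language) for a delimiter line, else None."""
    m = re.match(r'-----([a-z0-9]+)?(-t)?\s*$', line)
    if m is None:
        return None                        # not a delimiter line
    shortname, static = m.group(1), m.group(2)
    if shortname is None:
        return ('markdown', None)          # plain text cell
    language = LANGUAGES.get(shortname, shortname)
    if static:
        return ('static code', language)   # shown within triple backticks
    return ('code', language)              # executable code cell

for line in ['-----', '-----py', '-----sys-t', 'y(0)']:
    print(repr(line), '->', classify_delimiter(line))
```

A line that is not a delimiter gives None, so the same function can be used to decide whether a line starts a new cell or belongs to the current one.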
It is handy to include other files in a document, so we invent the syntax #include "filename" at the beginning of a line to include a file with name filename.
A natural extension is to include only a part of a file, delimited by two regular expressions:
#include "myprog.py" fromto: from_regex@to_regex
However, this extension is not incorporated in the first version of the notebook generator. We just mention the possibility.
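Since the fromto extension is not part of the generator, the following is only a sketch of how it could work. The function name extract_fromto and the choice to stop before the line matching to_regex are assumptions, not the author's design.

```python
import re

def extract_fromto(text, from_regex, to_regex):
    """Return the lines from the first match of from_regex up to,
    but not including, the next line matching to_regex."""
    selected = []
    copying = False
    for line in text.splitlines():
        if not copying and re.search(from_regex, line):
            copying = True
        elif copying and re.search(to_regex, line):
            break
        if copying:
            selected.append(line)
    return '\n'.join(selected)

source = """import sympy as sym

def solve():
    return 42

if __name__ == '__main__':
    solve()
"""
# Hypothetical specification as it could appear after "fromto:"
spec = r"def solve@if __name__"
from_regex, to_regex = spec.split('@')
print(extract_fromto(source, from_regex, to_regex))
```

With this convention, the test block at the end of an included Python file can be left out of the document by ending the extraction at the line with if __name__.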
It is also very handy to run the text through a preprocessor that is a full-fledged template programming language of the type that is popular in the web world. Here we have chosen Mako. Running the text through Mako enables the use of variables, if-tests, and loops, to mention the most usual constructs. Pure Python functions can be defined inside <% and %> tags and called in the text. Mako applies the syntax ${var} for variables and ${myfunc(arg1, arg2=None)} for function calls.
Writing much Python code directly inside <% and %> Mako tags is not recommended, as debugging can be a nightmare. Instead, put the code in a file, say myprog.py, and just include it:
<%
#include "myprog.py"
%>
Then you can debug myprog.py as standard Python code, but call up its functions and use its global variables in the document's text (!).
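The Mako constructs mentioned above can be tried in isolation. The sketch below assumes the third-party mako package is installed (it is skipped otherwise); the function double and the sample text are hypothetical and only serve to show a variable, a function call, and a comment line.

```python
try:
    from mako.template import Template
except ImportError:            # Mako is a third-party package
    Template = None

text = """<%
def double(x):
    return 2*x
%>
## This comment line is removed by Mako
The initial condition is y(0)=${IC}, and twice that is ${double(IC)}.
"""

rendered = None
if Template is not None:
    # Keyword arguments to render play the role of the variables
    # given on the command line
    rendered = Template(text=text, strict_undefined=True).render(IC=2)
    print(rendered)
```

With strict_undefined=True, a variable that is neither given to render nor defined in the text should lead to an error instead of silently producing empty output, which is usually what one wants when compiling a document.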
Let us show a very simple document with some code, some math, and
use of include and Mako. The task is to solve a differential equation
by SymPy on the fly in the code and use SymPy output directly
in the text. For this goal, we write the SymPy code in a separate
file where a dump function can be used for extensive printing of intermediate results, but a global variable allow_printing determines whether printing is turned on or off: we want it on when debugging, but off when compiling our document.
The document starts with an author, the author's address, and the date, where author and address are Mako variables we can specify on the command line when compiling the document. This text is a Markdown cell and therefore starts with -----:
-----
## Test of Jupyter Notebook generator
**${NAME}**, ${ADDRESS}
**May 14, 2015**
Note that a line starting with a double ## is a Mako comment line, and it will not be a part of the final output from Mako. IC is another variable that must be specified on the command line (and fed to Mako) for the initial condition of the differential equation.
Next follows the math part, where we include SymPy code that solves the math problem. The SymPy code is in the file .solve_dyeqy.py:
# Solve y'=y, y(0)=2 by sympy
# This file is intended for being included via mako
# in a document, but it is much easier to debug the
# python code in a separate standard .py file.
# Then we just include this file in the document inside
# <% ... %> mako directives and set allow_printing=False.
def dump(var):
    if allow_printing:
        print(var)

def solve():
    """Solve y'=y, y(0)=2."""
    import sympy as sym
    t = sym.symbols('t', real=True, positive=True)
    y = sym.symbols('y', cls=sym.Function)

    # Solve differential equation using dsolve
    eq = sym.diff(y(t), t) - y(t)
    dump(eq)
    sol = sym.dsolve(eq)
    dump(sol)
    y = sol.rhs  # grab right-hand side of equation

    # Determine integration constant C1 from initial condition
    C1 = sym.symbols('C1')
    y0 = 2
    eq = y.subs(t, 0) - y0  # equation for initial condition
    dump(eq)
    sol = sym.solve(eq, C1)  # solve wrt C1
    dump(sol)
    y = y.subs(C1, sol[0])  # insert C1=2 in solution
    dump(y)
    y_func = sym.lambdify([t], y, modules='numpy')
    return sym.latex(y), sym.latex(sol[0]), y_func

if __name__ == '__main__':
    allow_printing = True
    solve()
This is just standard Python code. The __name__ variable equals __builtin__ when we run this code inside Mako, so then the test block is inactive. Instead, we can define allow_printing = False, call solve(), and store its output in variables such that we can access them in the running text. Here is the syntax:
## Math
This is a test notebook where we solve the following math
problem:
$$
y' = y,\quad y(0)=${IC}
$$
## Solve the problem by SymPy
<%
## Make sure to test the Python file first!
#include ".solve_dyeqy.py"
allow_printing = False
y_expr, C_expr, y_func = solve()
%>
The equation is separable, and we find by standard methods
that
$$
y(t) = ${y_expr}.
$$
The integration constant is found from the initial condition
$y(0)=${IC}$ and equals in this case $${C_expr}$.
Note how we in the middle of math expressions use Mako variables taken from both the command line, such as ${IC}, and from the Python code, such as ${y_expr} and ${C_expr}!
We can now define some code cells for execution. We want to create Python code for the solution, using the SymPy variable y_expr and SymPy's ability to write the expression for a numerical Python function, here called y(t). Note that the delimiter for a Python code cell is -----py.
## Code
We implement the evaulation of $y(t)$ in Python:
<%
## We use sympy to convert y_expr to a string to
## be returned as Python code
import sympy
## Note that we have y_func which is a real Python
## function, but here we make a similar one: y(t)
## so the user can see it.
%>
-----py
from numpy import exp

def y(t):
    return ${sympy.printing.lambdarepr.lambdarepr(y_expr)}

# Try values
y(0)
-----py
y(1), 2*exp(1)
-----py
y(2), 2*exp(2)
If a code cell is only to be shown as static, non-executable Markdown code, use the delimiter -----py-t instead of -----py.
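The solve function listed earlier returns LaTeX strings, which suit text cells but are not Python code. One possible way to obtain both forms, sketched below under the assumption that the SymPy expression itself is available, is to use sym.latex for the text and the plain string form str(...) for the code. The names y_sym, text_version, and code_version are hypothetical, and the plain string printer is used here instead of lambdarepr as a simplification.

```python
import math
import sympy as sym

t = sym.symbols('t', real=True, positive=True)
y_sym = 2*sym.exp(t)                 # solution of y'=y, y(0)=2

text_version = sym.latex(y_sym)      # for math in text cells
code_version = str(y_sym)            # for Python code cells

# Generate source for y(t) and define the function from that source
source = "def y(t):\n    return %s\n" % code_version
namespace = {'exp': math.exp}
exec(source, namespace)

print(text_version)
print(source)
print(namespace['y'](1.0), 2*math.exp(1.0))
```

The generated function relies on a name exp being available where it is executed, which is why the code cell in the example document starts by importing exp.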
Finally, we show how to compile the ascii file into a Jupyter Notebook, using a console cell that is to be shown as plain Markdown Bash code.
## Compilation
-----
This is how we run the notebook generator on files with
extension `.aipynb`:
## console cell, but typeset as pure code (-t extension)
-----sys-t
Terminal> ipynb_generator.py myfile.aipynb MYVAR=4 GRADE='excellent'
What we have not shown here is the ability to call Python functions in the text. We could, if it were sensible, call the solve function in the text, e.g., as in
...and the solution becomes ${solve()[0]}
We compile our example file by the following command:
Terminal> python ipynb_generator.py .test1.aipynb NAME="Core Dump" \
          ADDRESS="Seg. Fault Ltd and Univ. of C. Space" \
          IC=2
Note that some Mako variables are supposed to be given on the command line, here three, while others are defined in Python code within <% and %> tags in the document.
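How KEY=value arguments on the command line can be turned into Mako variables is sketched below. The function name parse_variables is an assumption for illustration; splitting at the first = only is a choice made here so that values may themselves contain the = character.

```python
def parse_variables(args):
    """Turn ['NAME=Core Dump', 'IC=2'] into a dict of Mako variables."""
    variables = {}
    for arg in args:
        if '=' not in arg:
            raise ValueError('expected KEY=value, got %r' % arg)
        key, value = arg.split('=', 1)   # split at the first = only
        variables[key] = value           # values remain strings
    return variables

args = ['NAME=Core Dump',
        'ADDRESS=Seg. Fault Ltd and Univ. of C. Space',
        'IC=2']
print(parse_variables(args))
```

Note that all values arrive as strings, so a variable like IC must be converted explicitly if it is to be used as a number in Python code inside the document.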
The output of the ipynb_generator.py command above is a notebook file .test1.ipynb. The file looks like this:
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "# Test of Jupyter Notebook generator\n",
    "\n",
    "**Core Dump**, Seg. Fault Ltd and Univ. of C. Space\n",
    "\n",
    "**May 14, 2015**\n",
    "\n",
    "\n",
    "This is a test notebook where we solve the following math\n",
    "problem:\n",
    "\n",
    "$$\n",
    "y' = y,\\quad y(0)=2\n",
    "$$\n",
    "\n",
    "\n",
    "\n",
    "The equation is separable, and we find by standard methods\n",
    "that\n",
    "\n",
    "$$\n",
    "y(t) = 2 e^{t}.\n",
    "$$\n",
    "The integration constant is found from the initial condition\n",
    "$y(0)=2$ and equals in this case $2$.\n",
    "\n",
    "\n",
    "We implement the evaulation of $y(t)$ in Python:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from numpy import exp\n",
    "\n",
    "def y(t):\n",
    "    return 2*exp(t)\n",
    "\n",
    "# Try values\n",
    "y(0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "y(1), 2*exp(1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "y(2), 2*exp(2)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "This is how we run the notebook generator on files with\n",
    "extension `.aipynb`:\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "```Python\n",
    "\n",
    "Terminal> ipynb_generator.py myfile.aipynb\n",
    "```\n"
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 0
}
The notebook file resides in GitHub and can be automatically rendered there.
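Because a notebook file is plain JSON, its structure can be inspected with the standard json module. The sketch below uses a shortened, hypothetical stand-in for the file contents rather than the real .test1.ipynb file.

```python
import json

# Shortened stand-in for the contents of a notebook file
notebook_text = """
{
 "cells": [
  {"cell_type": "markdown", "metadata": {},
   "source": ["# Test of Jupyter Notebook generator\\n"]},
  {"cell_type": "code", "execution_count": null, "metadata": {},
   "outputs": [], "source": ["y(1), 2*exp(1)"]}
 ],
 "metadata": {}, "nbformat": 4, "nbformat_minor": 0
}
"""
nb = json.loads(notebook_text)
for cell in nb['cells']:
    # The source of a cell is stored as a list of lines
    first_line = ''.join(cell['source']).strip().splitlines()[0]
    print('%-8s | %s' % (cell['cell_type'], first_line))
```

Such a quick listing of cell types and first lines is a cheap check that the generator produced the cells one expected.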
We shall now list the code that translates the ascii input, with the special syntax explained, into a notebook. The algorithmic steps are
1) Read the text into a cells list of all cells, i.e., detect the beginning of a new cell by the delimiter line -----. The cells list consists of 3-lists, where each 3-list has the cell type, a description, and all the lines of the cell as its three elements.
2) For each cell in cells, see if the delimiter line has a language specification and therefore is a code cell, or if it is a plain Markdown code cell.
3) Run through cells and join separate lines in each cell into a string.
4) Use IPython.nbformat.v4 for translating the information in the cells list into a cell list nb_cells suitable for the notebook.
5) Write the nb_cells list to JSON format.
The read function looks as follows.
import os
import re
import sys

def read(text, argv=sys.argv[2:]):
    lines = text.splitlines()
    # First read all include statements
    for i in range(len(lines)):
        if lines[i].startswith('#include "'):
            filename = lines[i].split('"')[1]
            with open(filename, 'r') as f:
                include_text = f.read()
            lines[i] = include_text
    text = '\n'.join(lines)

    # Run Mako
    mako_kwargs = {}
    for arg in argv:
        key, value = arg.split('=', 1)  # the value may contain =
        mako_kwargs[key] = value

    encoding = 'utf-8'
    try:
        import mako
        has_mako = True
    except ImportError:
        print('Cannot import mako - mako is not run')
        has_mako = False

    if has_mako:
        from mako.template import Template
        from mako.lookup import TemplateLookup
        lookup = TemplateLookup(directories=[os.curdir])
        if isinstance(text, bytes):
            # Python 2: turn the byte string into unicode
            text = text.decode(encoding)
        temp = Template(text=text, lookup=lookup,
                        strict_undefined=True)
        text = temp.render(**mako_kwargs)

    # Parse the cells
    lines = text.splitlines()
    cells = []
    inside = None    # indicates which type of cell we are inside
    fullname = None  # full language name in code cells
    for line in lines:
        if line.startswith('-----'):
            # New cell, what type?
            m = re.search(r'-----([a-z0-9-]+)?', line)
            if m:
                shortname = m.group(1)
                if shortname:
                    # Check if code is to be typeset as static
                    # Markdown code (e.g., shortname=py-t)
                    astext = shortname[-2:] == '-t'
                    if astext:
                        shortname = shortname[:-2]
                    # Find the full language name before it is used
                    # (fall back on the short name if it is unknown)
                    if shortname in shortname2language:
                        fullname = shortname2language[shortname]
                    else:
                        fullname = shortname
                    if astext:
                        # Markdown
                        inside = 'markdown'
                        cells.append(['markdown', 'code', ['\n']])
                        cells[-1][2].append('```%s\n' % fullname)
                    else:
                        # Code cell
                        inside = 'codecell'
                        cells.append(['codecell', fullname, []])
                else:
                    # Markdown cell
                    inside = 'markdown'
                    cells.append(['markdown', 'text', ['\n']])
            else:
                raise SyntaxError(
                    'Wrong syntax of cell delimiter:\n%s'
                    % repr(line))
        else:
            # Ordinary line in a cell
            if inside in ('markdown', 'codecell'):
                cells[-1][2].append(line)
            else:
                raise SyntaxError(
                    'line\n  %s\nhas no beginning cell delimiter'
                    % line)

    # Merge the lines in each cell to a string
    for i in range(len(cells)):
        if cells[i][0] == 'markdown' and cells[i][1] == 'code':
            # Add an ending ``` of code
            cells[i][2].append('```\n')
        cells[i][2] = '\n'.join(cells[i][2])
    return cells
The line fullname = shortname2language[shortname] is not easy to understand unless we have the definition of the dictionary
# Mapping of shortnames like py to full language
# name like python used by markdown/pandoc
shortname2language = dict(
    py='Python', ipy='Python', pyshell='Python', cy='Python',
    c='C', cpp='Cpp', f='Fortran', f95='Fortran95',
    rb='Ruby', pl='Perl', sh='Shell', js='JavaScript', html='HTML',
    tex='Tex', sys='Bash',
    )
The translation from a cells list to the similar list needed by the IPython notebook writing functions is taken care of in the following function:
def write(cells):
    """Turn cells list into valid IPython notebook code."""
    # Use IPython.nbformat functionality for writing the notebook
    # (newer Jupyter installations provide the same functions
    # in the separate nbformat package)
    from IPython.nbformat.v4 import (
        new_code_cell, new_markdown_cell, new_notebook)
    nb_cells = []
    for cell_tp, language, block in cells:
        if cell_tp == 'markdown':
            nb_cells.append(new_markdown_cell(source=block))
        elif cell_tp == 'codecell':
            nb_cells.append(new_code_cell(source=block))
    nb = new_notebook(cells=nb_cells)
    from IPython.nbformat import writes
    filestr = writes(nb, version=4)
    return filestr
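To see what the writing step amounts to, the same kind of JSON can be produced with the standard json module alone. This is a dependency-free sketch, not the generator's code: the function name write_plain is hypothetical, and the dictionary keys are taken from the compiled file shown earlier in this note.

```python
import json

def write_plain(cells):
    """Build notebook JSON from [type, description, source] cells."""
    nb_cells = []
    for cell_tp, description, block in cells:
        source = block.splitlines(True)   # list of lines, endings kept
        if cell_tp == 'markdown':
            nb_cells.append({'cell_type': 'markdown', 'metadata': {},
                             'source': source})
        elif cell_tp == 'codecell':
            nb_cells.append({'cell_type': 'code',
                             'execution_count': None,
                             'metadata': {}, 'outputs': [],
                             'source': source})
    nb = {'cells': nb_cells, 'metadata': {},
          'nbformat': 4, 'nbformat_minor': 0}
    return json.dumps(nb, indent=1)

cells = [['markdown', 'text', '\n# A title\n'],
         ['codecell', 'Python', 'y(1), 2*exp(1)']]
filestr = write_plain(cells)
print(filestr)
```

The notebook functions are still preferable in practice, since they know the details of each notebook format version, while a hand-written dictionary like this one must be kept in step with the format manually.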
A driver or main program is needed:
def driver():
    """Compile a document and its variables."""
    try:
        filename = sys.argv[1]
        with open(filename, 'r') as f:
            text = f.read()
    except (IndexError, IOError) as e:
        print('Usage: %s filename' % (sys.argv[0]))
        print(e)
        sys.exit(1)
    cells = read(text, argv=sys.argv[2:])
    filestr = write(cells)
    # Replace the extension (e.g., .aipynb) by .ipynb
    filename = os.path.splitext(filename)[0] + '.ipynb'
    with open(filename, 'w') as f:
        f.write(filestr)
The true file has support for notebook format version 3 and 4 and also contains a lot of logging statements to aid debugging.
The IPython.nbformat functions can be used for writing notebooks.