$$ \hat{A}{\bf x}={\bf b},$$
where \( \hat{A} \) is a matrix and \( {\bf x} \) and \( {\bf b} \) are vectors. The vector \( {\bf x} \) is
the unknown.
It is an iterative scheme where we start with a guess for the unknown, and after \( k+1 \) iterations we have
$$ {\bf x}^{(k+1)}= \hat{D}^{-1}({\bf b}-(\hat{L}+\hat{U}){\bf x}^{(k)}),$$
with \( \hat{A}=\hat{D}+\hat{U}+\hat{L} \) and
\( \hat{D} \) being a diagonal matrix, \( \hat{U} \) an upper triangular matrix and \( \hat{L} \) a lower triangular
matrix.
If the matrix \( \hat{A} \) is strictly diagonally dominant, one can show that this method will always converge to the exact solution.
$$
\begin{align}
x_1^{(1)} =&(b_1-a_{12}x_2^{(0)} -a_{13}x_3^{(0)} - a_{14}x_4^{(0)})/a_{11} \nonumber \\
x_2^{(1)} =&(b_2-a_{21}x_1^{(0)} - a_{23}x_3^{(0)} - a_{24}x_4^{(0)})/a_{22} \nonumber \\
x_3^{(1)} =&(b_3- a_{31}x_1^{(0)} -a_{32}x_2^{(0)} -a_{34}x_4^{(0)})/a_{33} \nonumber \\
x_4^{(1)}=&(b_4-a_{41}x_1^{(0)} -a_{42}x_2^{(0)} - a_{43}x_3^{(0)})/a_{44}, \nonumber
\end{align}
$$
which after \( k+1 \) iterations reads
$$
\begin{align}
x_1^{(k+1)} =&(b_1-a_{12}x_2^{(k)} -a_{13}x_3^{(k)} - a_{14}x_4^{(k)})/a_{11} \nonumber \\
x_2^{(k+1)} =&(b_2-a_{21}x_1^{(k)} - a_{23}x_3^{(k)} - a_{24}x_4^{(k)})/a_{22} \nonumber \\
x_3^{(k+1)} =&(b_3- a_{31}x_1^{(k)} -a_{32}x_2^{(k)} -a_{34}x_4^{(k)})/a_{33} \nonumber \\
x_4^{(k+1)}=&(b_4-a_{41}x_1^{(k)} -a_{42}x_2^{(k)} - a_{43}x_3^{(k)})/a_{44}, \nonumber
\end{align}
$$
In a more general and compact form we have
$$
x_i^{(k+1)}=(b_i-\sum_{j=1, j\ne i}^{n}a_{ij}x_j^{(k)})/a_{ii}
$$
or in an even more compact form as
$$ {\bf x}^{(k+1)}= \hat{D}^{-1}({\bf b}-(\hat{L}+\hat{U}){\bf x}^{(k)}),$$
with \( \hat{A}=\hat{D}+\hat{U}+\hat{L} \) and
\( \hat{D} \) being a diagonal matrix, \( \hat{U} \) an upper triangular matrix and \( \hat{L} \) a lower triangular
matrix.
$$
\begin{align}
x_1^{(k+1)} =&(b_1-a_{12}x_2^{(k)} -a_{13}x_3^{(k)} - a_{14}x_4^{(k)})/a_{11} \nonumber \\
x_2^{(k+1)} =&(b_2-a_{21}x_1^{(k)} - a_{23}x_3^{(k)} - a_{24}x_4^{(k)})/a_{22} \nonumber \\
x_3^{(k+1)} =&(b_3- a_{31}x_1^{(k)} -a_{32}x_2^{(k)} -a_{34}x_4^{(k)})/a_{33} \nonumber \\
x_4^{(k+1)}=&(b_4-a_{41}x_1^{(k)} -a_{42}x_2^{(k)} - a_{43}x_3^{(k)})/a_{44}, \nonumber
\end{align}
$$
can be rewritten as
$$
\begin{align}
x_1^{(k+1)} =&(b_1-a_{12}x_2^{(k)} -a_{13}x_3^{(k)} - a_{14}x_4^{(k)})/a_{11} \nonumber \\
x_2^{(k+1)} =&(b_2-a_{21}x_1^{(k+1)} - a_{23}x_3^{(k)} - a_{24}x_4^{(k)})/a_{22} \nonumber \\
x_3^{(k+1)} =&(b_3- a_{31}x_1^{(k+1)} -a_{32}x_2^{(k+1)} -a_{34}x_4^{(k)})/a_{33} \nonumber \\
x_4^{(k+1)}=&(b_4-a_{41}x_1^{(k+1)} -a_{42}x_2^{(k+1)} - a_{43}x_3^{(k+1)})/a_{44}, \nonumber
\end{align}
$$
which allows us to utilize the preceding solution (forward substitution). This normally improves the convergence
behavior and leads to the Gauss-Seidel method!
In general, the Gauss-Seidel iteration can be written as
$$
x^{(k+1)}_i = \frac{1}{a_{ii}} \left(
b_i - \sum_{j > i} a_{ij}x^{(k)}_j - \sum_{j < i}a_{ij}x^{(k+1)}_j
\right), \quad i=1,2,\ldots,n.
$$
The procedure is generally continued until the changes made by an iteration are below some tolerance.
The convergence properties of the Jacobi method and the Gauss-Seidel method are dependent on the matrix \( \hat{A} \). These methods converge when the matrix is symmetric positive-definite, or is strictly or irreducibly diagonally dominant. Both methods sometimes converge even if these conditions are not satisfied.
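To make the two schemes concrete, here is a minimal C++ sketch (our own illustration, not a library routine) of one Jacobi sweep and one Gauss-Seidel sweep for a dense matrix, following the componentwise formulas above.

#include <vector>
using std::vector;

// One Jacobi sweep: xnew_i = (b_i - sum_{j != i} a_ij x_j)/a_ii, all sums use the old x
void jacobi_sweep(const vector<vector<double>> &A, const vector<double> &b,
                  const vector<double> &x, vector<double> &xnew)
{
  int n = b.size();
  for (int i = 0; i < n; i++) {
    double sum = 0.0;
    for (int j = 0; j < n; j++)
      if (j != i) sum += A[i][j]*x[j];
    xnew[i] = (b[i] - sum)/A[i][i];
  }
}

// One Gauss-Seidel sweep: x is updated in place, so the components with j < i
// entering the sum are already the new ones (forward substitution)
void gauss_seidel_sweep(const vector<vector<double>> &A, const vector<double> &b,
                        vector<double> &x)
{
  int n = b.size();
  for (int i = 0; i < n; i++) {
    double sum = 0.0;
    for (int j = 0; j < n; j++)
      if (j != i) sum += A[i][j]*x[j];
    x[i] = (b[i] - sum)/A[i][i];
  }
}

In a full solver one repeats such sweeps, swapping x and xnew in the Jacobi case, until the largest change in the components of \( {\bf x} \) is below a chosen tolerance.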
$$
\hat{A}\mathbf x = \mathbf b
$$
where:
$$
\hat{A}=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \qquad \mathbf{x} = \begin{bmatrix} x_{1} \\ x_2 \\ \vdots \\ x_n \end{bmatrix} , \qquad \mathbf{b} = \begin{bmatrix} b_{1} \\ b_2 \\ \vdots \\ b_n \end{bmatrix}.
$$
$$
\hat{A} =\hat{D} + \hat{L} + \hat{U},
$$
where
$$
D = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\0 & 0 & \cdots & a_{nn} \end{bmatrix}, \quad L = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ a_{21} & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\a_{n1} & a_{n2} & \cdots & 0 \end{bmatrix}, \quad U = \begin{bmatrix} 0 & a_{12} & \cdots & a_{1n} \\ 0 & 0 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\0 & 0 & \cdots & 0 \end{bmatrix}.
$$
The system of linear equations may be rewritten as:
$$
(D+\omega L) \mathbf{x} = \omega \mathbf{b} - [\omega U + (\omega-1) D ] \mathbf{x}
$$
for a constant \( \omega > 1 \).
$$
\mathbf{x}^{(k+1)} = (D+\omega L)^{-1} \big(\omega \mathbf{b} - [\omega U + (\omega-1) D ] \mathbf{x}^{(k)}\big).
$$
However, by taking advantage of the triangular form of \( (D+\omega L) \), the elements of \( x^{(k+1)} \) can be computed sequentially using forward substitution:
$$
x^{(k+1)}_i = (1-\omega)x^{(k)}_i + \frac{\omega}{a_{ii}}
\left(b_i - \sum_{j>i} a_{ij}x^{(k)}_j -
\sum_{j < i} a_{ij}x^{(k+1)}_j \right),\quad i=1,2,\ldots,n.
$$
The choice of relaxation factor is not necessarily easy, and depends upon the properties of the coefficient matrix. For symmetric, positive-definite matrices it can be proven that \( 0 < \omega < 2 \) will lead to convergence, but we are generally interested in faster convergence rather than just convergence.
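As a hedged sketch, one SOR sweep following the componentwise formula above could look as follows for a densely stored matrix; the function name and data layout are our own choices.

#include <vector>
using std::vector;

// One SOR sweep: x_i <- (1-omega)*x_i + (omega/a_ii)*(b_i - sum_{j != i} a_ij x_j),
// where the components with j < i have already been updated in this sweep
void sor_sweep(const vector<vector<double>> &A, const vector<double> &b,
               vector<double> &x, double omega)
{
  int n = b.size();
  for (int i = 0; i < n; i++) {
    double sum = 0.0;
    for (int j = 0; j < n; j++)
      if (j != i) sum += A[i][j]*x[j];
    x[i] = (1.0 - omega)*x[i] + omega*(b[i] - sum)/A[i][i];
  }
}

Setting \( \omega=1 \) recovers the Gauss-Seidel sweep.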
A spline function consists of polynomial pieces defined on subintervals. The different subintervals are connected via various continuity relations.
Assume we have at our disposal \( n+1 \) points \( x_0, x_1, \dots x_n \) arranged so that \( x_0 < x_1 < x_2 < \dots x_{n-1} < x_n \) (such points are called knots). A spline function \( s \) of degree \( k \) with \( n+1 \) knots is defined as follows
$$
s(x)=\left\{\begin{array}{cc} s_0(x)=a_0x+b_0 & x\in [x_0, x_1) \\
s_1(x)=a_1x+b_1 & x\in [x_1, x_2) \\
\dots & \dots \\
s_{n-1}(x)=a_{n-1}x+b_{n-1} & x\in
[x_{n-1}, x_n] \end{array}\right.
$$
In this case the polynomial consists of a series of straight lines connected to each other at every endpoint. The number of continuous derivatives is then \( k-1=0 \), as expected when we deal with straight lines. Such a polynomial is quite easy to construct given \( n+1 \) points \( x_0, x_1, \dots x_n \) and their corresponding function values.
$$
s_{i-1}(x_i)= y_i = s_i(x_i),
$$
with \( 1 \le i \le n-1 \). In total we have \( n \) polynomials of the
type
$$
s_i(x)=a_{i0}+a_{i1}x+a_{i2}x^2+a_{i3}x^3,
$$
yielding \( 4n \) coefficients to determine.
$$
y_i = s(x_i),
$$
and
$$
s(x_{i+1})= y_{i+1},
$$
to be fulfilled. If we also assume that \( s' \) and \( s'' \) are continuous,
then
$$
s'_{i-1}(x_i)= s'_i(x_i),
$$
yields \( n-1 \) conditions. Similarly,
$$
s''_{i-1}(x_i)= s''_i(x_i),
$$
results in additional \( n-1 \) conditions. In total we have \( 4n \) coefficients
and \( 4n-2 \) equations to determine them, leaving us with \( 2 \) degrees of
freedom to be determined.
$$
s''_{i}(x_i)= f_i,
$$
and
$$
s''_{i}(x_{i+1})= f_{i+1},
$$
and setting up a straight line between \( f_i \) and \( f_{i+1} \) we have
$$
s_i''(x) = \frac{f_i}{x_{i+1}-x_i}(x_{i+1}-x)+
\frac{f_{i+1}}{x_{i+1}-x_i}(x-x_i),
$$
and integrating twice one obtains
$$
s_i(x) = \frac{f_i}{6(x_{i+1}-x_i)}(x_{i+1}-x)^3+
\frac{f_{i+1}}{6(x_{i+1}-x_i)}(x-x_i)^3
+c(x-x_i)+d(x_{i+1}-x).
$$
$$
\begin{align}
s_i(x) =&\frac{f_i}{6(x_{i+1}-x_i)}(x_{i+1}-x)^3+
\frac{f_{i+1}}{6(x_{i+1}-x_i)}(x-x_i)^3 \nonumber \\
+&(\frac{y_{i+1}}{x_{i+1}-x_i}-\frac{f_{i+1}(x_{i+1}-x_i)}{6})
(x-x_i)+
(\frac{y_{i}}{x_{i+1}-x_i}-\frac{f_{i}(x_{i+1}-x_i)}{6})
(x_{i+1}-x).
\end{align}
$$
$$
s'_{i-1}(x_i)= s'_i(x_i),
$$
and set \( x=x_i \). Defining \( h_i=x_{i+1}-x_i \) we obtain finally
the following expression
$$
h_{i-1}f_{i-1}+2(h_{i}+h_{i-1})f_i+h_if_{i+1}=
\frac{6}{h_i}(y_{i+1}-y_i)-\frac{6}{h_{i-1}}(y_{i}-y_{i-1}),
$$
and introducing the shorthands \( u_i=2(h_{i}+h_{i-1}) \),
\( v_i=\frac{6}{h_i}(y_{i+1}-y_i)-\frac{6}{h_{i-1}}(y_{i}-y_{i-1}) \),
we can reformulate the problem as a set of linear equations to be
solved through e.g., Gaussian elimination
$$
\left[\begin{array}{cccccccc} u_1 & h_1 &0 &\dots & & & & \\
h_1 & u_2 & h_2 &0 &\dots & & & \\
0 & h_2 & u_3 & h_3 &0 &\dots & & \\
\dots& & \dots &\dots &\dots &\dots &\dots & \\
&\dots & & &0 &h_{n-3} &u_{n-2} &h_{n-2} \\
& && & &0 &h_{n-2} &u_{n-1} \end{array}\right]
\left[\begin{array}{c} f_1 \\
f_2 \\
f_3\\
\dots \\
f_{n-2} \\
f_{n-1} \end{array} \right] =
\left[\begin{array}{c} v_1 \\
v_2 \\
v_3\\
\dots \\
v_{n-2}\\
v_{n-1} \end{array} \right].
$$
Note that this is a set of tridiagonal equations and can be solved
through only \( O(n) \) operations.
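As a minimal sketch of such an \( O(n) \) solution, the following routine (our own illustration, assuming the natural-spline boundary conditions \( f_0=f_n=0 \) used in the derivation above) performs standard forward elimination and back substitution on the tridiagonal system with diagonal \( u_i \), off-diagonals \( h_i \) and right-hand side \( v_i \).

#include <vector>
using std::vector;

// Solve the tridiagonal spline system: u holds the diagonal u_1 ... u_{n-1},
// h the off-diagonals (h[i] couples rows i and i+1, zero-based) and v the
// right-hand side. The solution f contains the second derivatives f_1 ... f_{n-1}.
void spline_tridiag(vector<double> u, const vector<double> &h,
                    vector<double> v, vector<double> &f)
{
  int m = u.size();              // number of unknowns
  f.assign(m, 0.0);
  // forward elimination
  for (int i = 1; i < m; i++) {
    double factor = h[i-1]/u[i-1];
    u[i] -= factor*h[i-1];
    v[i] -= factor*v[i-1];
  }
  // back substitution
  f[m-1] = v[m-1]/u[m-1];
  for (int i = m-2; i >= 0; i--)
    f[i] = (v[i] - h[i]*f[i+1])/u[i];
}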
spline(double x[], double y[], int n, double yp1,
double yp2, double y2[])
This function takes as input \( x[0,..,n - 1] \) and \( y[0,..,n - 1] \) containing a tabulation \( y_i = f(x_i) \) with \( x_0 < x_1 < .. < x_{n - 1} \) together with the first derivatives of \( f(x) \) at \( x_0 \) and \( x_{n-1} \), respectively. Then the function returns \( y2[0,..,n-1] \) which contains the second derivatives of \( f(x_i) \) at each point \( x_i \). \( n \) is the number of points. This function provides the cubic spline interpolation for all subintervals and is called only once.
splint(double x[], double y[], double y2a[], int n,
double x, double *y)
which takes as input the tabulated values \( x[0,..,n - 1] \) and \( y[0,..,n - 1] \) and the output y2a[0,..,n - 1] from \( spline \). It returns the value \( y \) corresponding to the point \( x \).
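A short usage sketch, assuming the two routines are available with the signatures listed above (they are not part of the standard library and must be linked in from the course codes; the array names xa and ya in the prototype are our own to avoid clashing with the interpolation point x):

#include <cmath>

// assumed prototypes for the course library routines discussed above
void spline(double x[], double y[], int n, double yp1, double yp2, double y2[]);
void splint(double xa[], double ya[], double y2a[], int n, double x, double *y);

int main()
{
  const int n = 10;
  double x[n], y[n], y2[n];
  for (int i = 0; i < n; i++) {   // tabulate f(x) = sin(x) on [0,3]
    x[i] = 3.0*i/(n-1);
    y[i] = std::sin(x[i]);
  }
  // second derivatives, with the first derivatives cos(x_0) and cos(x_{n-1}) at the end points
  spline(x, y, n, std::cos(x[0]), std::cos(x[n-1]), y2);
  double yval;
  splint(x, y, y2, n, 1.234, &yval);   // interpolated value at x = 1.234
  return 0;
}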
$$
{\bf A}{\bf x^{(\nu)}} = \lambda^{(\nu)}{\bf x^{(\nu)}},
$$
where \( \lambda^{(\nu)} \) are the eigenvalues and \( {\bf x^{(\nu)}} \) the corresponding eigenvectors. Unless otherwise stated, when we use the wording eigenvector we mean the right eigenvector. The left eigenvector is defined as
$$
{\bf x^{(\nu)}}_L{\bf A} = \lambda^{(\nu)}{\bf x^{(\nu)}}_L
$$
The above right eigenvector problem is equivalent to a set of \( n \) equations with \( n \) unknowns
\( x_i \).
$$
\left( {\bf A}-\lambda^{(\nu)} I \right) {\bf x^{(\nu)}} = 0,
$$
with \( I \) being the unity matrix. This equation provides a solution to the problem if and only if the determinant is zero, namely
$$
\left| {\bf A}-\lambda^{(\nu)}{\bf I}\right| = 0,
$$
which in turn means that the determinant is a polynomial of degree \( n \) in \( \lambda \) and in general we will have \( n \) distinct zeros.
$$
P(\lambda) = det(\lambda{\bf I}-{\bf A}),
$$
or
$$
P(\lambda)= \prod_{i=1}^{n}\left(\lambda_i-\lambda\right).
$$
The set of these roots is called the spectrum and is denoted as
\( \lambda({\bf A}) \).
If \( \lambda({\bf A})=\left\{\lambda_1,\lambda_2,\dots ,\lambda_n\right\} \) then we have
$$
det({\bf A})= \lambda_1\lambda_2\dots\lambda_n,
$$
and if we define the trace of \( {\bf A} \) as
$$
Tr({\bf A})=\sum_{i=1}^n a_{ii}$$
then
\( Tr({\bf A})=\lambda_1+\lambda_2+\dots+\lambda_n \).
The content of this theorem is frequently misunderstood. It does not assert that higher-degree polynomial equations are unsolvable. In fact, if the polynomial has real or complex coefficients, and we allow complex solutions, then every polynomial equation has solutions; this is the fundamental theorem of algebra. Although these solutions cannot always be computed exactly with radicals, they can be computed to any desired degree of accuracy using numerical methods such as the Newton-Raphson method or Laguerre method, and in this way they are no different from solutions to polynomial equations of the second, third, or fourth degrees.
The theorem only concerns the form that such a solution must take. The content of the theorem is that the solution of a higher-degree equation cannot in all cases be expressed in terms of the polynomial coefficients with a finite number of operations of addition, subtraction, multiplication, division and root extraction. Some polynomials of arbitrary degree, of which the simplest nontrivial example is the monomial equation \( ax^n = b \), are always solvable with a radical.
Today, in the modern algebraic context, we say that second, third and fourth degree polynomial equations can always be solved by radicals because the symmetric groups \( S_2, S_3 \) and \( S_4 \) are solvable groups, whereas \( S_n \) is not solvable for \( n \ge 5 \).
$$
{\bf D}= \left( \begin{array}{ccccccc} \lambda_1 & 0 & 0 & 0 & \dots &0 & 0 \\
0 & \lambda_2 & 0 & 0 & \dots &0 &0 \\
0 & 0 & \lambda_3 & 0 &0 &\dots & 0\\
\dots & \dots & \dots & \dots &\dots &\dots & \dots\\
0 & \dots & \dots & \dots &\dots &\lambda_{n-1} & 0\\
0 & \dots & \dots & \dots &\dots &0 & \lambda_n
\end{array} \right).
$$
If \( {\bf A} \) is real and symmetric then there exists a real orthogonal matrix \( {\bf S} \) such that
$$
{\bf S}^T {\bf A}{\bf S}= \mbox{diag}(\lambda_1,\lambda_2,\dots ,\lambda_n),
$$
and for \( j=1:n \) we have \( {\bf A}{\bf S}(:,j) = \lambda_j {\bf S}(:,j) \).
We say that a matrix \( {\bf B} \) is a similarity transform of \( {\bf A} \) if
$$
{\bf B}= {\bf S}^T {\bf A}{\bf S}, \quad \mbox{where}\quad
{\bf S}^T{\bf S}={\bf S}^{-1}{\bf S} ={\bf I}.
$$
The importance of a similarity transformation lies in the fact that the resulting matrix has the same eigenvalues, but the eigenvectors are in general different.
$$
{\bf Ax}=\lambda{\bf x} \mbox{ and }
{\bf B}= {\bf S}^T {\bf A}{\bf S}.
$$
We multiply the first equation on the left by \( {\bf S}^T \) and insert \( {\bf S}^{T}{\bf S} ={\bf I} \) between \( {\bf A} \) and \( {\bf x} \). Then we get
$$
\begin{equation}
({\bf S^TAS})({\bf S^Tx})=\lambda{\bf S^Tx} ,
\end{equation}
$$
which is the same as
$$
{\bf B} \left ( {\bf S^Tx} \right ) = \lambda \left ({\bf S^Tx}
\right ).
$$
The variable \( \lambda \) is an eigenvalue of \( {\bf B} \) as well, but with eigenvector \( {\bf S^Tx} \).
either apply subsequent similarity transformations (direct method) so that
$$
\begin{equation}
{\bf S_N^T\dots S_1^TAS_1\dots S_N }={\bf D} ,
\end{equation}
$$
or apply subsequent similarity transformations so that \( \bf A \) becomes tridiagonal (Householder) or upper/lower triangular (\( \bf QR \) method). Thereafter, techniques for obtaining eigenvalues from tridiagonal matrices can be used.
or use so-called power methods
or use iterative methods (Krylov, Lanczos, Arnoldi). These methods are popular for huge matrix problems.
$$
-\frac{u_{i+1} -2u_i +u_{i-1}}{h^2}=f(x_i,u(x_i)),
$$
with \( i=1,2,\dots, n \). We need to add to this system the two boundary conditions \( u(a) =u_0 \) and \( u(b) = u_{n+1} \).
If we define a matrix
$$
{\bf A} = \frac{1}{h^2}\left(\begin{array}{cccccc}
2 & -1 & & & & \\
-1 & 2 & -1 & & & \\
& -1 & 2 & -1 & & \\
& \dots & \dots &\dots &\dots & \dots \\
& & &-1 &2& -1 \\
& & & &-1 & 2 \\
\end{array} \right)
$$
and the corresponding vectors \( {\bf u} = (u_1, u_2, \dots,u_n)^T \) and
\( {\bf f}({\bf u}) = (f(x_1,u_1), f(x_2,u_2), \dots, f(x_n,u_n))^T \) we can rewrite the differential equation
including the boundary conditions as a system of linear equations with a large number of unknowns
$$
{\bf A}{\bf u} = {\bf f}({\bf u}).
$$
$$
-\frac{\hbar^2}{2 m} \left ( \frac{1}{r^2} \frac{d}{dr} r^2
\frac{d}{dr} - \frac{l (l + 1)}{r^2} \right )R(r)
+ V(r) R(r) = E R(r).
$$
In our case \( V(r) \) is the harmonic oscillator potential \( (1/2)kr^2 \) with
\( k=m\omega^2 \) and \( E \) is
the energy of the harmonic oscillator in three dimensions.
The oscillator frequency is \( \omega \) and the energies are
$$
E_{nl}= \hbar \omega \left(2n+l+\frac{3}{2}\right),
$$
with \( n=0,1,2,\dots \) and \( l=0,1,2,\dots \).
Since we have made a transformation to spherical coordinates it means that \( r\in [0,\infty) \). The quantum number \( l \) is the orbital momentum of the electron.
Then we substitute \( R(r) = (1/r) u(r) \) and obtain
$$
-\frac{\hbar^2}{2 m} \frac{d^2}{dr^2} u(r)
+ \left ( V(r) + \frac{l (l + 1)}{r^2}\frac{\hbar^2}{2 m}
\right ) u(r) = E u(r) .
$$
The boundary conditions are \( u(0)=0 \) and \( u(\infty)=0 \).
We introduce a dimensionless variable \( \rho = r/\alpha \), where \( \alpha \) is a constant with dimension length, and get
$$
-\frac{\hbar^2}{2 m \alpha^2} \frac{d^2}{d\rho^2} u(\rho)
+ \left ( V(\rho) + \frac{l (l + 1)}{\rho^2}
\frac{\hbar^2}{2 m\alpha^2} \right ) u(\rho) = E u(\rho) .
$$
In project 2 we set \( l=0 \). Inserting \( V(\rho) = (1/2) k \alpha^2\rho^2 \) we end up with
$$
-\frac{\hbar^2}{2 m \alpha^2} \frac{d^2}{d\rho^2} u(\rho)
+ \frac{k}{2} \alpha^2\rho^2u(\rho) = E u(\rho) .
$$
We multiply thereafter with \( 2m\alpha^2/\hbar^2 \) on both sides and obtain
$$
-\frac{d^2}{d\rho^2} u(\rho)
+ \frac{mk}{\hbar^2} \alpha^4\rho^2u(\rho) = \frac{2m\alpha^2}{\hbar^2}E u(\rho) .
$$
The constant \( \alpha \) can now be fixed
so that
$$
\frac{mk}{\hbar^2} \alpha^4 = 1,
$$
or
$$
\alpha = \left(\frac{\hbar^2}{mk}\right)^{1/4}.
$$
Defining
$$
\lambda = \frac{2m\alpha^2}{\hbar^2}E,
$$
we can rewrite Schr\"odinger's equation as
$$
-\frac{d^2}{d\rho^2} u(\rho) + \rho^2u(\rho) = \lambda u(\rho) .
$$
This is the first equation to solve numerically. In three dimensions
the eigenvalues for \( l=0 \) are
\( \lambda_0=3,\lambda_1=7,\lambda_2=11,\dots . \)
$$
\begin{equation}
u''=\frac{u(\rho+h) -2u(\rho) +u(\rho-h)}{h^2} +O(h^2),
\tag{1}
\end{equation}
$$
where \( h \) is our step.
Next we define minimum and maximum values for the variable \( \rho \),
\( \rho_{\mbox{min}}=0 \) and \( \rho_{\mbox{max}} \), respectively.
You need to check your results for the energies against different values of
\( \rho_{\mbox{max}} \), since we cannot set
\( \rho_{\mbox{max}}=\infty \).
$$
h=\frac{\rho_{\mbox{max}}-\rho_{\mbox{min}} }{n_{\mbox{step}}}.
$$
Defining an arbitrary value of \( \rho \) as
$$
\rho_i= \rho_{\mbox{min}} + ih\quad i=0,1,2,\dots , n_{\mbox{step}}
$$
we can rewrite the Schr\"odinger equation for \( \rho_i \) as
$$
-\frac{u(\rho_i+h) -2u(\rho_i) +u(\rho_i-h)}{h^2}+\rho_i^2u(\rho_i) = \lambda u(\rho_i),
$$
or in a more compact way
$$
-\frac{u_{i+1} -2u_i +u_{i-1}}{h^2}+\rho_i^2u_i=-\frac{u_{i+1} -2u_i +u_{i-1} }{h^2}+V_iu_i = \lambda u_i,
$$
where \( V_i=\rho_i^2 \) is the harmonic oscillator potential.
We define the diagonal matrix element
$$
d_i=\frac{2}{h^2}+V_i,
$$
and the non-diagonal matrix element
$$
e_i=-\frac{1}{h^2}.
$$
In this case the non-diagonal matrix elements are given by a mere constant.
All non-diagonal matrix elements are equal.
With these definitions the Schr\"odinger equation takes the following form
$$
d_iu_i+e_{i-1}u_{i-1}+e_{i+1}u_{i+1} = \lambda u_i,
$$
where \( u_i \) is unknown. We can write the
latter equation as a matrix eigenvalue problem
$$
\begin{equation}
\left( \begin{array}{ccccccc} d_1 & e_1 & 0 & 0 & \dots &0 & 0 \\
e_1 & d_2 & e_2 & 0 & \dots &0 &0 \\
0 & e_2 & d_3 & e_3 &0 &\dots & 0\\
\dots & \dots & \dots & \dots &\dots &\dots & \dots\\
0 & \dots & \dots & \dots &\dots &d_{n_{\mbox{step}}-2} & e_{n_{\mbox{step}}-1}\\
0 & \dots & \dots & \dots &\dots &e_{n_{\mbox{step}}-1} & d_{n_{\mbox{step}}-1}
\end{array} \right) \left( \begin{array}{c} u_{1} \\
u_{2} \\
\dots\\ \dots\\ \dots\\
u_{n_{\mbox{step}}-1}
\end{array} \right)=\lambda \left( \begin{array}{c} u_{1} \\
u_{2} \\
\dots\\ \dots\\ \dots\\
u_{n_{\mbox{step}}-1}
\end{array} \right)
\tag{2}
\end{equation}
$$
$$
\begin{equation}
\left( \begin{array}{ccccccc} \frac{2}{h^2}+V_1 & -\frac{1}{h^2} & 0 & 0 & \dots &0 & 0 \\
-\frac{1}{h^2} & \frac{2}{h^2}+V_2 & -\frac{1}{h^2} & 0 & \dots &0 &0 \\
0 & -\frac{1}{h^2} & \frac{2}{h^2}+V_3 & -\frac{1}{h^2} &0 &\dots & 0\\
\dots & \dots & \dots & \dots &\dots &\dots & \dots\\
0 & \dots & \dots & \dots &\dots &\frac{2}{h^2}+V_{n_{\mbox{step}}-2} & -\frac{1}{h^2}\\
0 & \dots & \dots & \dots &\dots &-\frac{1}{h^2} & \frac{2}{h^2}+V_{n_{\mbox{step}}-1}
\end{array} \right)
\tag{3}
\end{equation}
$$
Recall that the solutions are known via the boundary conditions at \( i=n_{\mbox{step}} \) and at the other end point, that is for \( \rho_0 \). The solution is zero in both cases.
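To make the setup concrete, here is a brief sketch (our own, with hypothetical names) that fills the matrix of Eq. (3) with the diagonal elements \( d_i=2/h^2+\rho_i^2 \) and the constant off-diagonal element \( e=-1/h^2 \); the resulting matrix can then be handed to an eigenvalue solver such as the Jacobi rotation algorithm discussed later.

#include <vector>
using std::vector;

// Set up the matrix of Eq. (3) for the one-electron harmonic oscillator.
// n_step grid points on [0, rho_max] give n_step-1 interior points rho_1 ... rho_{n_step-1}.
void setup_oscillator_matrix(int n_step, double rho_max, vector<vector<double>> &A)
{
  double h = rho_max/n_step;
  double e = -1.0/(h*h);                  // constant off-diagonal element
  int n = n_step - 1;                     // matrix dimension
  A.assign(n, vector<double>(n, 0.0));
  for (int i = 0; i < n; i++) {
    double rho = (i+1)*h;                 // rho_i with rho_min = 0
    A[i][i] = 2.0/(h*h) + rho*rho;        // d_i = 2/h^2 + V_i
    if (i > 0)   A[i][i-1] = e;
    if (i < n-1) A[i][i+1] = e;
  }
}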
$$
-\frac{\hbar^2}{2 m} \frac{d^2}{dr^2} u(r)
+ \frac{1}{2}k r^2u(r) = E^{(1)} u(r),
$$
where \( E^{(1)} \) stands for the energy with one electron only.
For two electrons with no repulsive Coulomb interaction, we have the following
Schr\"odinger equation
$$
\left( -\frac{\hbar^2}{2 m} \frac{d^2}{dr_1^2} -\frac{\hbar^2}{2 m} \frac{d^2}{dr_2^2}+ \frac{1}{2}k r_1^2+ \frac{1}{2}k r_2^2\right)u(r_1,r_2) = E^{(2)} u(r_1,r_2) .
$$
With no interaction this can be written out as the product of two single-electron wave functions, that is, we have a solution in closed form.
We introduce the relative coordinate \( {\bf r} = {\bf r}_1-{\bf r}_2 \) and the center-of-mass coordinate \( {\bf R} = 1/2({\bf r}_1+{\bf r}_2) \). With these new coordinates, the radial Schr\"odinger equation reads
$$
\left( -\frac{\hbar^2}{m} \frac{d^2}{dr^2} -\frac{\hbar^2}{4 m} \frac{d^2}{dR^2}+ \frac{1}{4} k r^2+ kR^2\right)u(r,R) = E^{(2)} u(r,R).
$$
The equations for \( r \) and \( R \) can be separated and the total energy is given by the sum
$$
E^{(2)}=E_r+E_R.
$$
We add then the repulsive Coulomb interaction between two electrons, namely a term
$$
V(r_1,r_2) = \frac{\beta e^2}{|{\bf r}_1-{\bf r}_2|}=\frac{\beta e^2}{r},
$$
with \( \beta e^2=1.44 \) eV nm.
The \( r \)-dependent Schr\"odinger equation then becomes
$$
\left( -\frac{\hbar^2}{m} \frac{d^2}{dr^2}+ \frac{1}{4}k r^2+\frac{\beta e^2}{r}\right)\psi(r) = E_r \psi(r).
$$
This equation is similar to the one we had previously in parts (a) and (b)
and we introduce
again a dimensionless variable \( \rho = r/\alpha \). Repeating the same
steps, we arrive at
$$
-\frac{d^2}{d\rho^2} \psi(\rho)
+ \frac{mk}{4\hbar^2} \alpha^4\rho^2\psi(\rho)+\frac{m\alpha \beta e^2}{\rho\hbar^2}\psi(\rho) =
\frac{m\alpha^2}{\hbar^2}E_r \psi(\rho) .
$$
We define a new frequency
$$
\omega_r^2=\frac{1}{4}\frac{mk}{\hbar^2} \alpha^4,
$$
and fix the constant \( \alpha \) by requiring
$$
\frac{m\alpha \beta e^2}{\hbar^2}=1
$$
or
$$
\alpha = \frac{\hbar^2}{m\beta e^2}.
$$
Defining
$$
\lambda = \frac{m\alpha^2}{\hbar^2}E,
$$
we can rewrite Schr\"odinger's equation as
$$
-\frac{d^2}{d\rho^2} \psi(\rho) + \omega_r^2\rho^2\psi(\rho) +\frac{1}{\rho}\psi(\rho) = \lambda \psi(\rho).
$$
Here we will study the cases \( \omega_r = 0.01 \), \( \omega_r = 0.5 \), \( \omega_r =1 \), and \( \omega_r = 5 \) for the ground state only, that is the lowest-lying state.
With no repulsive Coulomb interaction you should get a result which corresponds to the relative energy of a non-interacting system. Make sure your results are stable as functions of \( \rho_{\mbox{max}} \) and the number of steps.
We are only interested in the ground state with \( l=0 \). We omit the center-of-mass energy.
For specific oscillator frequencies, the above equation has closed-form solutions, see the article by M. Taut, Phys. Rev. A 48, 3561 - 3566 (1993). The article can be retrieved from the following web address http://prola.aps.org/abstract/PRA/v48/i5/p3561_1.
The first is the formal method, involving determinants and the \textit{characteristic polynomial}. This proves how many eigenvalues there are, and is the way most of you learned about how to solve the eigenvalue problem, but for matrices of dimensions greater than 2 or 3, it is rather impractical.
The other general approach is to use similarity or unitary transformations to reduce a matrix to diagonal form. Almost always this is done in two steps: first reduce to for example a \textit{tridiagonal} form, and then to diagonal form. The main algorithms we will discuss in detail, Jacobi's and Householder's (so-called direct methods) and the Lanczos algorithm (an iterative method), follow this methodology.
...require for matrices of dimensionality \( n\times n \) typically \( O(n^3) \) operations. These methods are normally called standard methods and are used for dimensionalities \( n \sim 10^5 \) or smaller. A brief historical overview
Year | \( n \) | Library/Authors |
---|---|---|
1950 | \( n=20 \) | (Wilkinson) |
1965 | \( n=200 \) | (Forsythe et al.) |
1980 | \( n=2000 \) | Linpack |
1995 | \( n=20000 \) | Lapack |
2012 | \( n\sim 10^5 \) | Lapack |
shows that in the course of 60 years the dimension that direct diagonalization methods can handle has increased by almost a factor of \( 10^4 \). However, it pales beside the progress achieved by computer hardware, from flops to petaflops, a factor of almost \( 10^{15} \). We see clearly played out in history the \( O(n^3) \) bottleneck of direct matrix algorithms. Sloppily speaking, when \( n\sim 10^4 \) is cubed we have \( O(10^{12}) \) operations, which is smaller than the \( 10^{15} \) increase in flops.
If the matrix to diagonalize is large and sparse, direct methods simply become impractical, also because many of the direct methods tend to destroy sparsity. As a result large dense matrices may arise during the diagonalization procedure. The idea behind iterative methods is to project the $n-$dimensional problem onto smaller, so-called Krylov subspaces. Given a matrix \( \hat{A} \) and a vector \( \hat{v} \), the associated Krylov sequence of vectors (and thereby subspaces) \( \hat{v} \), \( \hat{A}\hat{v} \), \( \hat{A}^2\hat{v} \), \( \hat{A}^3\hat{v},\dots \), represents successively larger Krylov subspaces.
Matrix | \( \hat{A}\hat{x}=\hat{b} \) | \( \hat{A}\hat{x}=\lambda\hat{x} \) |
---|---|---|
\( \hat{A}=\hat{A}^* \) | Conjugate gradient | Lanczos |
\( \hat{A}\ne \hat{A}^* \) | GMRES etc | Arnoldi |
$$
{\bf S}=
\left(
\begin{array}{cccccccc}
1 & 0 & \dots & 0 & 0 & \dots & 0 & 0 \\
0 & 1 & \dots & 0 & 0 & \dots & 0 & 0 \\
\dots & \dots & \dots & \dots & \dots & \dots & 0 & \dots \\
0 & 0 & \dots & \cos\theta & 0 & \dots & 0 & \sin\theta \\
0 & 0 & \dots & 0 & 1 & \dots & 0 & 0 \\
\dots & \dots & \dots & \dots & \dots & \dots & 0 & \dots \\
0 & 0 & \dots & -\sin\theta & 0 & \dots & 0 & \cos\theta \\
0 & 0 & \dots & 0 & \dots & \dots & 0 & 1
\end{array}
\right)
$$
with property \( {\bf S^{T}} = {\bf S^{-1}} \). It performs a plane rotation around an angle \( \theta \) in the Euclidean $n-$dimensional space.
$$
s_{kk}= s_{ll}=\cos\theta,
s_{kl}=-s_{lk}= -\sin\theta,
s_{ii}=1\quad i\ne k \quad i \ne l,
$$
A similarity transformation
$$
{\bf B}= {\bf S}^T {\bf A}{\bf S},
$$
results in
$$
\begin{align*}
b_{ik} &= a_{ik}\cos\theta - a_{il}\sin\theta , i \ne k, i \ne l \\
b_{il} &= a_{il}\cos\theta + a_{ik}\sin\theta , i \ne k, i \ne l \nonumber\\
b_{kk} &= a_{kk}\cos^2\theta - 2a_{kl}\cos\theta \sin\theta +a_{ll}\sin^2\theta\nonumber\\
b_{ll} &= a_{ll}\cos^2\theta +2a_{kl}\cos\theta \sin\theta +a_{kk}\sin^2\theta\nonumber\\
b_{kl} &= (a_{kk}-a_{ll})\cos\theta \sin\theta +a_{kl}(\cos^2\theta-\sin^2\theta)\nonumber
\end{align*}
$$
The angle \( \theta \) is arbitrary. The recipe is to choose \( \theta \) so that all non-diagonal matrix elements \( b_{kl} \) become zero.
As a measure of the size of the off-diagonal matrix elements we define
$$
\mbox{off}({\bf A}) = \sqrt{\sum_{i=1}^n\sum_{j=1,j\ne i}^n a_{ij}^2}.
$$
To demonstrate the algorithm, we consider the simple \( 2\times 2 \) similarity transformation
of the full matrix. The matrix is symmetric; we single out \( 1\le k < l \le n \) and
use the abbreviations \( c=\cos\theta \) and \( s=\sin\theta \) to obtain
$$
\left( \begin{array}{cc} b_{kk} & 0 \\
0 & b_{ll} \\\end{array} \right) = \left( \begin{array}{cc} c & -s \\
s &c \\\end{array} \right) \left( \begin{array}{cc} a_{kk} & a_{kl} \\
a_{lk} &a_{ll} \\\end{array} \right) \left( \begin{array}{cc} c & s \\
-s & c \\\end{array} \right).
$$
The requirement that the new off-diagonal element \( b_{kl} \) vanishes gives
$$
a_{kl}(c^2-s^2)+(a_{kk}-a_{ll})cs = b_{kl} = 0.
$$
If \( a_{kl}=0 \) one sees immediately that \( \cos\theta = 1 \) and \( \sin\theta=0 \).
The Frobenius norm of a matrix is preserved under an orthogonal transformation. The Frobenius norm is defined as
$$
||{\bf A}||_F = \sqrt{\sum_{i=1}^n\sum_{j=1}^n |a_{ij}|^2}.
$$
This means that for our \( 2\times 2 \) case we have
$$
2a_{kl}^2+a_{kk}^2+a_{ll}^2 = b_{kk}^2+b_{ll}^2,
$$
which leads to
$$
\mbox{off}({\bf B})^2 = ||{\bf B}||_F^2-\sum_{i=1}^nb_{ii}^2=\mbox{off}({\bf A})^2-2a_{kl}^2,
$$
since
$$
||{\bf B}||_F^2-\sum_{i=1}^nb_{ii}^2=||{\bf A}||_F^2-\sum_{i=1}^na_{ii}^2+(a_{kk}^2+a_{ll}^2 -b_{kk}^2-b_{ll}^2).
$$
This result means that the matrix \( {\bf A} \) moves closer to diagonal form for each transformation.
$$\cot 2\theta=\tau = \frac{a_{ll}-a_{kk}}{2a_{kl}},
$$
we obtain the quadratic equation (using \( \cot 2\theta=1/2(\cot \theta-\tan\theta) \)) in \( t=\tan\theta \)
$$
t^2+2\tau t-1= 0,
$$
resulting in
$$
t = -\tau \pm \sqrt{1+\tau^2},
$$
and \( c \) and \( s \) are easily obtained via
$$
c = \frac{1}{\sqrt{1+t^2}},
$$
and \( s=tc \). Choosing \( t \) to be the smaller of the roots ensures that \( |\theta| \le \pi/4 \) and has the
effect of minimizing the difference between the matrices \( {\bf B} \) and \( {\bf A} \) since
$$
||{\bf B}-{\bf A}||_F^2=4(1-c)\sum_{i=1,i\ne k,l}^n(a_{ik}^2+a_{il}^2) +\frac{2a_{kl}^2}{c^2}.
$$
Step 2. Set up a while-test comparing the norm of the newly computed off-diagonal matrix elements with a chosen tolerance \( \epsilon \),
$$ \mbox{off}({\bf A}) = \sqrt{\sum_{i=1}^n\sum_{j=1,j\ne i}^n a_{ij}^2} > \epsilon.
$$
Step 3. Choose the matrix element \( a_{kl} \) with the largest absolute value, that is \( |a_{kl}|=\mbox{max}_{i\ne j} |a_{ij}| \).
Step 4. Compute thereafter \( \tau = (a_{ll}-a_{kk})/(2a_{kl}) \), \( \tan\theta \), \( \cos\theta \) and \( \sin\theta \).
Step 5. Compute thereafter the similarity transformation for this set of values \( (k,l) \), obtaining the new matrix \( {\bf B}= {\bf S}(k,l,\theta)^T {\bf A}{\bf S}(k,l,\theta) \).
Step 6. Compute the new norm of the off-diagonal matrix elements and continue until \( \mbox{off}({\bf B}) \le \epsilon \) is satisfied.
The convergence rate of the Jacobi method is however poor. One needs typically \( 3n^2 \) to \( 5n^2 \) rotations, and each rotation requires \( 4n \) operations, resulting in a total of \( 12n^3 \) to \( 20n^3 \) operations in order to zero out the non-diagonal matrix elements.
We illustrate the method with a simple \( 3\times 3 \) example,
$$
{\bf B} =
\left( \begin{array}{ccc}
1 & 0 & 0 \\
0 & c & -s \\
0 & s & c
\end{array} \right)\left( \begin{array}{ccc}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{array} \right)
\left( \begin{array}{ccc}
1 & 0 & 0 \\
0 & c & s \\
0 & -s & c
\end{array} \right).
$$
We will choose the angle \( \theta \) in order to have \( a_{23}=a_{32}=0 \).
We get (symmetric matrix)
$$
{\bf B} =\left( \begin{array}{ccc}
a_{11} & a_{12}c -a_{13}s& a_{12}s+a_{13}c \\
a_{12}c -a_{13}s & a_{22}c^2+a_{33}s^2 -2a_{23}sc& (a_{22}-a_{33})sc +a_{23}(c^2-s^2) \\
a_{12}s+a_{13}c & (a_{22}-a_{33})sc +a_{23}(c^2-s^2) & a_{22}s^2+a_{33}c^2 +2a_{23}sc
\end{array} \right).
$$
Note that \( a_{11} \) is unchanged, as it should be. In terms of the matrix elements the transformation reads
$$
\begin{align*}
b_{11} &= a_{11} \\
b_{12} &= a_{12}\cos\theta - a_{13}\sin\theta \\
b_{13} &= a_{13}\cos\theta + a_{12}\sin\theta \nonumber\\
b_{22} &= a_{22}\cos^2\theta - 2a_{23}\cos\theta \sin\theta +a_{33}\sin^2\theta\nonumber\\
b_{33} &= a_{33}\cos^2\theta +2a_{23}\cos\theta \sin\theta +a_{22}\sin^2\theta\nonumber\\
b_{23} &= (a_{22}-a_{33})\cos\theta \sin\theta +a_{23}(\cos^2\theta-\sin^2\theta)\nonumber
\end{align*}
$$
We will fix the angle \( \theta \) so that \( b_{23}=0 \).
$$
{\bf B} =\left( \begin{array}{ccc}
b_{11} & b_{12}& b_{13} \\
b_{12}& b_{22}& 0 \\
b_{13}& 0& b_{33}
\end{array} \right).
$$
We repeat then assuming that \( b_{12} \) is the largest non-diagonal matrix element and get a
new matrix
$$
{\bf C} =
\left( \begin{array}{ccc}
c & -s & 0 \\
s & c & 0 \\
0 & 0 & 1
\end{array} \right)\left( \begin{array}{ccc}
b_{11} & b_{12} & b_{13} \\
b_{12} & b_{22} & 0 \\
b_{13} & 0 & b_{33}
\end{array} \right)
\left( \begin{array}{ccc}
c & s & 0 \\
-s & c & 0 \\
0 & 0 & 1
\end{array} \right).
$$
We continue this process till all non-diagonal matrix elements are zero (ideally).
You will notice when performing the above operations that the matrix element
\( b_{23} \), which was previously zero, becomes different from zero. This is one of the problems which slows
down the Jacobi procedure.
The general expressions for the new matrix elements are
$$
\begin{align*}
b_{ii} &= a_{ii}, i \ne k, i \ne l \\
b_{ik} &= a_{ik}\cos\theta - a_{il}\sin\theta , i \ne k, i \ne l \\
b_{il} &= a_{il}\cos\theta + a_{ik}\sin\theta , i \ne k, i \ne l \nonumber\\
b_{kk} &= a_{kk}\cos^2\theta - 2a_{kl}\cos\theta \sin\theta +a_{ll}\sin^2\theta\nonumber\\
b_{ll} &= a_{ll}\cos^2\theta +2a_{kl}\cos\theta \sin\theta +a_{kk}\sin^2\theta\nonumber\\
b_{kl} &= (a_{kk}-a_{ll})\cos\theta \sin\theta +a_{kl}(\cos^2\theta-\sin^2\theta)\nonumber
\end{align*}
$$
This is what we will need to code.
// we have defined a matrix A and a matrix R for the eigenvectors, both of dim n x n
// The final matrix R has the eigenvectors as its column elements; it is set to one
// for the diagonal elements in the beginning, zero else.
....
double tolerance = 1.0E-10;
int iterations = 0;
double maxnondiag = 2.0*tolerance;   // any value larger than tolerance to enter the loop
while ( maxnondiag > tolerance && iterations <= maxiter)
{
   int p, q;
   maxnondiag = offdiag(A, p, q, n);
   Jacobi_rotate(A, R, p, q, n);
   iterations++;
}
...
// the offdiag function: returns the largest off-diagonal element (in absolute value)
// and its indices p and q, which therefore must be passed by reference
double offdiag(double **A, int &p, int &q, int n)
{
   double max = 0.0;
   for (int i = 0; i < n; ++i)
   {
      for ( int j = i+1; j < n; ++j)
      {
         double aij = fabs(A[i][j]);
         if ( aij > max)
         {
            max = aij; p = i; q = j;
         }
      }
   }
   return max;
}
void Jacobi_rotate ( double ** A, double ** R, int k, int l, int n )
{
  double s, c;
  if ( A[k][l] != 0.0 ) {
    double t, tau;
    tau = (A[l][l] - A[k][k])/(2*A[k][l]);
    if ( tau >= 0 ) {
      t = 1.0/(tau + sqrt(1.0 + tau*tau));
    } else {
      t = -1.0/(-tau +sqrt(1.0 + tau*tau));
    }
    c = 1/sqrt(1+t*t);
    s = c*t;
  } else {
    c = 1.0;
    s = 0.0;
  }
  double a_kk, a_ll, a_ik, a_il, r_ik, r_il;
  a_kk = A[k][k];
  a_ll = A[l][l];
  A[k][k] = c*c*a_kk - 2.0*c*s*A[k][l] + s*s*a_ll;
  A[l][l] = s*s*a_kk + 2.0*c*s*A[k][l] + c*c*a_ll;
  A[k][l] = 0.0;  // hard-coding non-diagonal elements by hand
  A[l][k] = 0.0;  // same here
  for ( int i = 0; i < n; i++ ) {
    if ( i != k && i != l ) {
      a_ik = A[i][k];
      a_il = A[i][l];
      A[i][k] = c*a_ik - s*a_il;
      A[k][i] = A[i][k];
      A[i][l] = c*a_il + s*a_ik;
      A[l][i] = A[i][l];
    }
    //  and finally the new eigenvectors
    r_ik = R[i][k];
    r_il = R[i][l];
    R[i][k] = c*r_ik - s*r_il;
    R[i][l] = c*r_il + s*r_ik;
  }
  return;
} // end of function jacobi_rotate
In the Householder method the tridiagonalization is achieved through an orthogonal matrix which is a product of \( n-2 \) transformations,
$$
{\bf S}={\bf S}_1{\bf S}_2\dots{\bf S}_{n-2},
$$
each of which successively transforms one row and one column of \( {\bf A} \) into the required tridiagonal form. Only \( n-2 \) transformations are required, since the last two elements are already in tridiagonal form. In order to determine each \( {\bf S_i} \) let us see what happens after the first multiplication, namely,
$$
{\bf S}_1^T{\bf A}{\bf S}_1= \left( \begin{array}{ccccccc} a_{11} & e_1 & 0 & 0 & \dots &0 & 0 \\
e_1 & a'_{22} &a'_{23} & \dots & \dots &\dots &a'_{2n} \\
0 & a'_{32} &a'_{33} & \dots & \dots &\dots &a'_{3n} \\
0 & \dots &\dots & \dots & \dots &\dots & \\
0 & a'_{n2} &a'_{n3} & \dots & \dots &\dots &a'_{nn} \\
\end{array} \right)
$$
where the primed quantities represent a matrix \( {\bf A}' \) of dimension \( n-1 \) which will subsequently be transformed by \( {\bf S_2} \).
$$
\left ({\bf S}_{1}{\bf S}_{2} \right )^{T} {\bf A}{\bf S}_{1} {\bf S}_{2}
= \left( \begin{array}{ccccccc} a_{11} & e_1 & 0 & 0 & \dots &0 & 0 \\
e_1 & a'_{22} &e_2 & 0 & \dots &\dots &0 \\
0 & e_2 &a''_{33} & \dots & \dots &\dots &a''_{3n} \\
0 & \dots &\dots & \dots & \dots &\dots & \\
0 & 0 &a''_{n3} & \dots & \dots &\dots &a''_{nn} \\
\end{array} \right)
$$
{\bf Note that the effective size of the matrix on which we apply the transformation reduces for every new step. In the previous Jacobi method each similarity transformation is in principle performed on the full size of the original matrix.}
After \( n-2 \) such transformations we are left with a tridiagonal matrix with diagonal matrix elements
$$
a_{11}, a'_{22}, a''_{33}\dots a^{(n-1)}_{nn},
$$
and off-diagonal matrix elements
$$
e_1, e_2,e_3, \dots, e_{n-1}.
$$
The resulting matrix reads
$$
{\bf S}^{T} {\bf A} {\bf S} =
\left( \begin{array}{ccccccc} a_{11} & e_1 & 0 & 0 & \dots &0 & 0 \\
e_1 & a'_{22} & e_2 & 0 & \dots &0 &0 \\
0 & e_2 & a''_{33} & e_3 &0 &\dots & 0\\
\dots & \dots & \dots & \dots &\dots &\dots & \dots\\
0 & \dots & \dots & \dots &\dots &a^{(n-1)}_{n-2} & e_{n-1}\\
0 & \dots & \dots & \dots &\dots &e_{n-1} & a^{(n-1)}_{n-1}
\end{array} \right) .
$$
It remains to find a recipe for determining the transformations. We illustrate the method for \( {\bf S}_1 \), which we write as
$$
{\bf S_{1}} = \left( \begin{array}{cc} 1 & {\bf 0^{T}} \\
{\bf 0}& {\bf P} \end{array} \right),
$$
with \( {\bf 0^{T}} \) being a zero row vector, \( {\bf 0^{T}} = \{0,0,\cdots\} \) of dimension \( (n-1) \). The matrix \( {\bf P} \) is symmetric with dimension (\( (n-1) \times (n-1) \)) satisfying \( {\bf P}^2={\bf I} \) and \( {\bf P}^T={\bf P} \). A possible choice which fulfils the latter two requirements is
$$
{\bf P}={\bf I}-2{\bf u}{\bf u}^T,
$$
where \( {\bf I} \) is the \( (n-1)\times (n-1) \) unity matrix and \( {\bf u} \) is an \( (n-1) \)-dimensional column vector with unit norm, that is \( {\bf u}^T{\bf u}=1 \).
$$
P_{ij}=\delta_{ij}-2u_iu_j,
$$
where \( i \) and \( j \) range from \( 1 \) to \( n-1 \). Applying the transformation \( {\bf S}_1 \) results in
$$
{\bf S}_1^T{\bf A}{\bf S}_1 = \left( \begin{array}{cc} a_{11} & ({\bf Pv})^T \\
{\bf Pv}& {\bf A}' \end{array} \right) ,
$$
where \( {\bf v^{T}} = \{a_{21}, a_{31},\cdots, a_{n1}\} \) and \( {\bf P} \) must satisfy \( ({\bf Pv})^{T} = \{k, 0, 0,\cdots \} \). Then
$$
\begin{equation}
{\bf Pv} = {\bf v} -2{\bf u}( {\bf u}^T{\bf v})= k {\bf e},
\tag{4}
\end{equation}
$$
with \( {\bf e^{T}} = \{1,0,0,\dots 0\} \).
$$
({\bf Pv})^T{\bf Pv} = k^{2} = {\bf v}^T{\bf v}=
|v|^2 = \sum_{i=2}^{n}a_{i1}^2,
$$
which determines the constant $ k = \pm v$.
$$
{\bf v} - k{\bf e} = 2{\bf u}( {\bf u}^T{\bf v}),
$$
Taking the scalar product of this equation with itself, we obtain
$$
\begin{equation}
2( {\bf u}^T{\bf v})^2=(v^2\pm a_{21}v),
\tag{5}
\end{equation}
$$
which finally determines
$$
{\bf u}=\frac{{\bf v}-k{\bf e}}{2( {\bf u}^T{\bf v})}.
$$
In solving Eq. (5) great care has to be exercised so as to choose
those values which make the right-hand side largest in order to avoid loss of numerical
precision.
The above steps are then repeated for every transformation until we have a
tridiagonal matrix suitable for obtaining the eigenvalues.
$$
{\bf A} =
\left( \begin{array}{cccc}
d_{1} & e_{1} & 0 & 0 \\
e_{1} & d_{2} & e_{2} & 0 \\
0 & e_{2} & d_{3} & e_{3} \\
0 & 0 & e_{3} & d_{4}
\end{array} \right).
$$
As a first observation, if any of the elements \( e_{i} \) are zero the matrix can be separated into smaller pieces before diagonalization. Specifically, if \( e_{1} = 0 \) then \( d_{1} \) is an eigenvalue.
Consider the plane rotation
$$
{\bf S_{1}} =
\left( \begin{array}{cccc}
\cos \theta & 0 & 0 & \sin \theta\\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-\sin \theta & 0 & 0 & \cos \theta
\end{array} \right)
$$
$$
{\bf S_{1}^{T} A S_{1}} = {\bf A'} =
\left( \begin{array}{cccc}
d'_{1} & e'_{1} & 0 & 0 \\
e'_{1} & d_{2} & e_{2} & 0 \\
0 & e_{2} & d_{3} & e'_{3} \\
0 & 0 & e'_{3} & d'_{4}
\end{array} \right)
$$
produces a matrix where the primed elements in \( {\bf A'} \) have been changed by the transformation whereas the unprimed elements are unchanged. If we now choose \( \theta \) to give the element \( a_{21}^{'} = e_{1}^{'}= 0 \) then we have the first eigenvalue \( = a_{11}^{'} = d_{1}^{'} \). (This is actually what you are doing in project 2!!)
What we see here is just a special case of the more general procedure developed by Francis in two articles in 1961 and 1962. Using Jacobi's method is not very efficient either.
The algorithm is based on the so-called {\bf QR} method (or just {\bf QR}-algorithm). It follows from a theorem by Schur which states that any square matrix can be written out in terms of an orthogonal matrix \( \hat{Q} \) and an upper triangular matrix \( \hat{U} \). Historically \( R \) was used instead of \( U \) since the wording right triangular matrix was first used.
Schur's theorem
$$
\hat{A} = \hat{Q}\hat{U},
$$
is used to rewrite any square matrix into a unitary matrix times an upper triangular matrix.
We say that a square matrix is similar to a triangular matrix.
Householder's algorithm which we have derived is just a special case of the general Householder algorithm. For a symmetric square matrix we obtain a tridiagonal matrix.
There is a corollary to Schur's theorem which states that every Hermitian matrix is unitarily similar to a diagonal matrix.
$$
\hat{A}\hat{Q} = \hat{Q}\hat{U}\hat{Q},
$$
and multiply from the left with \( \hat{Q}^{-1} \) we get
$$
\hat{Q}^{-1}\hat{A}\hat{Q} = \hat{B}=\hat{U}\hat{Q},
$$
where the matrix \( \hat{B} \) is a similarity transformation of \( \hat{A} \) and has the same eigenvalues
as \( \hat{A} \).
$$
\hat{A} = \hat{Q}\hat{U},
$$
and multiply from the left with \( \hat{Q}^{-1} \) resulting in
$$
\hat{Q}^{-1}\hat{A} = \hat{U}.
$$
Suppose that \( \hat{Q} \) consists of a series of planar Jacobi like rotations acting on sub blocks
of \( \hat{A} \) so that all elements below the diagonal are zeroed out
$$
\hat{Q}=\hat{R}_{12}\hat{R}_{23}\dots\hat{R}_{n-1,n}.
$$
$$
\hat{R}_{12} =
\left( \begin{array}{ccccccccc}
c&s &0 &0 &0 & \dots &0 & 0 & 0\\
-s&c &0 &0 &0 & \dots &0 & 0 & 0\\
0&0 &1 &0 &0 & \dots &0 & 0 & 0\\
\dots&\dots &\dots &\dots &\dots &\dots \\
0&0 &0 & 0 & 0 & \dots &1 &0 &0 \\
0&0 &0 & 0 & 0 & \dots &0 &1 &0 \\
0&0 &0 & 0 & 0 & \dots &0 &0 & 1
\end{array} \right)
$$
$$
\hat{U} =
\left( \begin{array}{ccccccccc}
x&x &x &0 &0 & \dots &0 & 0 & 0\\
0&x &x &x &0 & \dots &0 & 0 & 0\\
0&0 &x &x &x & \dots &0 & 0 & 0\\
\dots&\dots &\dots &\dots &\dots &\dots \\
0&0 &0 & 0 & 0 & \dots &x &x &x \\
0&0 &0 & 0 & 0 & \dots &0 &x &x \\
0&0 &0 & 0 & 0 & \dots &0 &0 & x
\end{array} \right)
$$
which has a second superdiagonal.
$$
\hat{Q}^{-1}\hat{A}\hat{Q} = \hat{B},
$$
from Schur's theorem the matrix \( \hat{B} \) is triangular and the eigenvalues the same as those of
\( \hat{A} \) and are given by the diagonal matrix elements of
\( \hat{B} \). Why?
Our matrix is \( \hat{B}=\hat{U}\hat{Q} \).
The eigenvalues of a matrix can be obtained using the characteristic polynomial
$$
P(\lambda) = det(\lambda{\bf I}-{\bf A})= \prod_{i=1}^{n}\left(\lambda_i-\lambda\right),
$$
which written out for our tridiagonal matrix reads
$$
P(\lambda)= \mbox{det}\left( \begin{array}{ccccccc} d_1-\lambda & e_1 & 0 & 0 & \dots &0 & 0 \\
e_1 & d_2-\lambda & e_2 & 0 & \dots &0 &0 \\
0 & e_2 & d_3-\lambda & e_3 &0 &\dots & 0\\
\dots & \dots & \dots & \dots &\dots &\dots & \dots\\
0 & \dots & \dots & \dots &\dots &d_{N_{\mbox{step}}-2}-\lambda & e_{N_{\mbox{step}}-1}\\
0 & \dots & \dots & \dots &\dots &e_{N_{\mbox{step}}-1} & d_{N_{\mbox{step}}-1}-\lambda
\end{array} \right)
$$
The determinant of such a tridiagonal matrix can be computed through the recursion relation
$$
P_k(\lambda)=(d_k-\lambda)P_{k-1}(\lambda)-e_{k-1}^2P_{k-2}(\lambda).
$$
Together with the starting values \( P_1(\lambda) \) and \( P_2(\lambda) \) and good root searching methods
we arrive at an efficient computational scheme for finding the roots of \( P_n(\lambda) \).
However, for large matrices this algorithm is rather inefficient and time-consuming.
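For illustration, a small sketch (our own) which evaluates \( P_n(\lambda) \) through the recursion relation; combined with a root-searching method such as bisection in \( \lambda \) it can be used to locate individual eigenvalues of the tridiagonal matrix.

#include <vector>
using std::vector;

// Evaluate the characteristic polynomial P_n(lambda) of a symmetric tridiagonal
// matrix with diagonal d[0..n-1] and off-diagonal e[0..n-2] via the recursion
// P_k = (d_k - lambda) P_{k-1} - e_{k-1}^2 P_{k-2}, with P_0 = 1.
double char_poly(const vector<double> &d, const vector<double> &e, double lambda)
{
  double p_prev = 1.0;                 // P_0
  double p = d[0] - lambda;            // P_1
  for (size_t k = 1; k < d.size(); k++) {
    double p_new = (d[k] - lambda)*p - e[k-1]*e[k-1]*p_prev;
    p_prev = p;
    p = p_new;
  }
  return p;
}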
C++:
void tred2(double **a, int n, double d[], double e[])
void tqli(double d[], double e[], int n, double **z)
Fortran:
CALL tred2(a, n, d, e)
CALL tqli(d, e, n, z)
#include <iostream>
#define MAX 10
using namespace std;

int main(){
  // Values needed for dgesv
  int n;
  int nrhs = 1;
  double a[MAX][MAX];
  double b[1][MAX];
  int lda = MAX;
  int ldb = MAX;
  int ipiv[MAX];
  int info;
  // Other values
  int i,j;
  // Read the values of the matrix
  cout << "Enter n \n";
  cin >> n;
  cout << "On each line type a row of the matrix A followed by one element of b:\n";
  for(i = 0; i < n; i++){
    cout << "row " << i << " ";
    for(j = 0; j < n; j++) cin >> a[j][i];   // stored transposed, i.e. column-major for LAPACK
    cin >> b[0][i];
  }
  // Solve the linear system; this call assumes the C/C++ interface to the LAPACK
  // routine dgesv provided with the course library
  dgesv(n, nrhs, &a[0][0], lda, ipiv, &b[0][0], ldb, &info);
  // Check for success
  if(info == 0)
  {
    // Write the answer
    cout << "The answer is\n";
    for(i = 0; i < n; i++)
      cout << "b[" << i << "]\t" << b[0][i] << "\n";
  }
  else
  {
    // Write an error message
    cerr << "dgesv returned error " << info << "\n";
    return info;
  }
  return 0;
}
Lanczos' algorithm generates a sequence of real tridiagonal matrices \( T_k \) of dimension \( k\times k \) with \( k\le n \), with the property that the extremal eigenvalues of \( T_k \) are progressively better estimates of \( \hat{A} \)'s extremal eigenvalues.
The method converges to the extremal eigenvalues.
The similarity transformation is
$$
\hat{T}= \hat{Q}^{T}\hat{A}\hat{Q},
$$
with the first vector \( \hat{Q}\hat{e}_1=\hat{q}_1 \).
We can write out the matrix \( \hat{Q} \) in terms of its column vectors
$$
\hat{Q}=\left[\hat{q}_1\hat{q}_2\dots\hat{q}_n\right].
$$
$$
\hat{T}= \hat{Q}^{T}\hat{A}\hat{Q},
$$
can be written as
$$
\hat{T} = \left(\begin{array}{cccccc}
\alpha_1& \beta_1 & 0 &\dots & \dots &0 \\
\beta_1 & \alpha_2 & \beta_2 &0 &\dots &0 \\
0& \beta_2 & \alpha_3 & \beta_3 & \dots &0 \\
\dots& \dots & \dots &\dots &\dots & 0 \\
\dots& & &\beta_{n-2} &\alpha_{n-1}& \beta_{n-1} \\
0& \dots &\dots &0 &\beta_{n-1} & \alpha_{n} \\
\end{array} \right)
$$
If we rewrite the similarity transformation
$$
\hat{T}= \hat{Q}^{T}\hat{A}\hat{Q},
$$
as
$$
\hat{Q}\hat{T}= \hat{A}\hat{Q},
$$
and if we equate columns (recall from the previous slide)
$$
\hat{T} = \left(\begin{array}{cccccc}
\alpha_1& \beta_1 & 0 &\dots & \dots &0 \\
\beta_1 & \alpha_2 & \beta_2 &0 &\dots &0 \\
0& \beta_2 & \alpha_3 & \beta_3 & \dots &0 \\
\dots& \dots & \dots &\dots &\dots & 0 \\
\dots& & &\beta_{n-2} &\alpha_{n-1}& \beta_{n-1} \\
0& \dots &\dots &0 &\beta_{n-1} & \alpha_{n} \\
\end{array} \right)
$$
we obtain
$$
\hat{A}\hat{q}_k=\beta_{k-1}\hat{q}_{k-1}+\alpha_k\hat{q}_k+\beta_k\hat{q}_{k+1}.
$$
$$
\hat{A}\hat{q}_k=\beta_{k-1}\hat{q}_{k-1}+\alpha_k\hat{q}_k+\beta_k\hat{q}_{k+1},
$$
with \( \beta_0\hat{q}_0=0 \) for \( k=1:n-1 \).
Remember that the vectors \( \hat{q}_k \) are orthonormal and this implies
$$
\alpha_k=\hat{q}_k^T\hat{A}\hat{q}_k,
$$
and these vectors are called Lanczos vectors.
$$
\hat{A}\hat{q}_k=\beta_{k-1}\hat{q}_{k-1}+\alpha_k\hat{q}_k+\beta_k\hat{q}_{k+1},
$$
with \( \beta_0\hat{q}_0=0 \) for \( k=1:n-1 \) and
$$
\alpha_k=\hat{q}_k^T\hat{A}\hat{q}_k.
$$
If
$$
\hat{r}_k=(\hat{A}-\alpha_k\hat{I})\hat{q}_k-\beta_{k-1}\hat{q}_{k-1},
$$
is non-zero, then
$$
\hat{q}_{k+1}=\hat{r}_{k}/\beta_k,
$$
with \( \beta_k=\pm ||\hat{r}_{k}||_2 \).
r_0 = q_1; beta_0=1; q_0=0; int k = 0;
while (beta_k != 0)
q_{k+1} = r_k/beta_k
k = k+1
alpha_k = q_k^T A q_k
r_k = (A-alpha_k I) q_k -beta_{k-1}q_{k-1}
beta_k = || r_k||_2
end while
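A compact C++ sketch of the same iteration for a dense symmetric matrix is given below; the function and variable names are our own, and no reorthogonalization is performed, so in finite precision the Lanczos vectors gradually lose orthogonality.

#include <vector>
#include <cmath>
using std::vector;

// m Lanczos steps on the symmetric matrix A, starting from the unit vector q.
// On return alpha and beta hold the diagonal and off-diagonal elements of T_m.
void lanczos(const vector<vector<double>> &A, vector<double> q, int m,
             vector<double> &alpha, vector<double> &beta)
{
  int n = q.size();
  vector<double> q_prev(n, 0.0), r(n);
  double beta_prev = 0.0;
  alpha.clear(); beta.clear();
  for (int k = 0; k < m; k++) {
    for (int i = 0; i < n; i++) {        // r = A q
      r[i] = 0.0;
      for (int j = 0; j < n; j++) r[i] += A[i][j]*q[j];
    }
    double a = 0.0;                      // alpha_k = q^T A q
    for (int i = 0; i < n; i++) a += q[i]*r[i];
    alpha.push_back(a);
    // r = (A - alpha_k I) q - beta_{k-1} q_{k-1}
    for (int i = 0; i < n; i++) r[i] -= a*q[i] + beta_prev*q_prev[i];
    double b = 0.0;                      // beta_k = ||r||_2
    for (int i = 0; i < n; i++) b += r[i]*r[i];
    b = std::sqrt(b);
    beta.push_back(b);
    if (b == 0.0) break;                 // an invariant subspace has been found
    q_prev = q;
    for (int i = 0; i < n; i++) q[i] = r[i]/b;   // q_{k+1}
    beta_prev = b;
  }
}

The extremal eigenvalues of the small tridiagonal matrix defined by alpha and beta can then be obtained with any of the tridiagonal eigenvalue solvers discussed earlier.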
$$
\begin{equation}
\frac{dy}{dt}=f(t,y).
\end{equation}
$$
This equation is of first order and \( f \) is an arbitrary function.
A second-order equation goes typically like
$$
\begin{equation}
\frac{d^2y}{dt^2}=f(t,\frac{dy}{dt},y).
\end{equation}
$$
A well-known second-order equation is Newton's second law
$$
\begin{equation}
m\frac{d^2x}{dt^2}=-kx,
\tag{6}
\end{equation}
$$
where \( k \) is the force constant. Ordinary differential equations depend only on one
variable, whereas partial differential equations like the time-dependent Schr\"odinger equation
$$
\begin{equation}
i\hbar\frac{\partial \psi({\bf x},t)}{\partial t}=
-\frac{\hbar^2}{2m}\left( \frac{\partial^2 \psi({\bf r},t)}{\partial x^2} +
\frac{\partial^2 \psi({\bf r},t)}{\partial y^2}+
\frac{\partial^2 \psi({\bf r},t)}{\partial z^2}\right) + V({\bf x})\psi({\bf x},t),
\end{equation}
$$
may depend on several variables. In certain cases, like the above
equation, the wave function can be factorized in functions of the separate
variables, so that the Schr\"odinger equation
can be rewritten in terms of sets of ordinary differential equations.
These equations are discussed in chapter 10 and involve boundary conditions in addition to initial conditions.
$$
\begin{equation}
\frac{dy}{dt}=g^3(t)y(t),
\end{equation}
$$
is an example of a linear equation, while
$$
\begin{equation}
\frac{dy}{dt}=g^3(t)y(t)-g(t)y^2(t),
\end{equation}
$$
is a non-linear ODE.
$$
\frac{dm}{dr}=4\pi r^{2}\rho (r)/c^2,
$$
and
$$
\frac{dP}{dr}=-\frac{Gm(r)}{r^{2}}\rho (r)/c^2,
$$
where \( \rho \) is the mass-energy density.
The initial conditions are dictated by the mass being
zero at the center of the star, i.e., when \( r=0 \),
yielding \( m(r=0)=0 \). The other condition is that
the pressure vanishes at the surface of the star.
In the solution of the Schr\"odinger equation for a particle in a potential, we may need to apply boundary conditions as well, such as demanding continuity of the wave function and its derivative.
$$
\begin{equation}
\frac{dy^{(1)}(t)}{dt}=\frac{dx(t)}{dt}=y^{(2)}(t),
\end{equation}
$$
we can rewrite Newton's second law as two coupled first-order
differential equations
$$
\begin{equation}
m\frac{dy^{(2)}(t)}{dt}=-kx(t)=-ky^{(1)}(t),
\tag{7}
\end{equation}
$$
and
$$
\begin{equation}
\frac{dy^{(1)}(t)}{dt}=y^{(2)}(t).
\tag{8}
\end{equation}
$$
$$
\begin{equation}
y_0=y(t=t_0).
\end{equation}
$$
We are interested in solving a differential equation in a region
in space \( [a,b] \). We define a step \( h \) by splitting the interval
into \( N \) subintervals, so that we have
$$
\begin{equation}
h=\frac{b-a}{N}.
\end{equation}
$$
With this step and the derivative of \( y \) we can construct the
next value of the function \( y \) at
$$
\begin{equation}
y_1=y(t_1=t_0+h),
\end{equation}
$$
and so forth.
$$
\begin{equation}
y_{i+1}=y(t=t_i+h)=y(t_i) + h\Delta(t_i,y_i(t_i)) + O(h^{p+1}),
\end{equation}
$$
where \( O(h^{p+1}) \) represents the truncation error. To determine
\( \Delta \), we Taylor expand our function \( y \)
$$
\begin{equation}
y_{i+1}=y(t=t_i+h)=y(t_i) + h(y'(t_i)+\dots +y^{(p)}(t_i)\frac{h^{p-1}}{p!}) + O(h^{p+1}),
\tag{9}
\end{equation}
$$
where we will associate the derivatives in the parenthesis with
$$
\begin{equation}
\Delta(t_i,y_i(t_i))=(y'(t_i)+\dots +y^{(p)}(t_i)\frac{h^{p-1}}{p!}).
\tag{10}
\end{equation}
$$
$$
\begin{equation}
y'(t_i)=f(t_i,y_i)
\end{equation}
$$
and if we truncate \( \Delta \) at the first derivative, we have
$$
\begin{equation}
y_{i+1}=y(t_i) + hf(t_i,y_i) + O(h^2),
\tag{11}
\end{equation}
$$
which when complemented with \( t_{i+1}=t_i+h \) forms
the algorithm for the well-known Euler method.
Note that at every step we make an approximation error
of the order of \( O(h^2) \), however the total error is the sum over all
steps \( N=(b-a)/h \), yielding thus a global error which goes like
\( NO(h^2)\approx O(h) \).
$$
f'_{2c}(x)= \frac{f(x+h)-f(x)}{h}+O(h),
$$
we can enter into roundoff error problems when we subtract
two almost equal numbers \( f(x+h)-f(x)\approx 0 \).
Euler's method is not recommended for precision calculation,
although it is handy to use in order to get a first
view on how a solution may look like. As an example,
consider Newton's equation rewritten in Eqs.
(7) and (8). We define \( y_0=y^{(1)}(t=0) \)
and \( v_0=y^{(2)}(t=0) \). The first steps in Newton's equations
are then
$$
\begin{equation}
y^{(1)}_1=y_0+hv_0+O(h^2)
\end{equation}
$$
and
$$
\begin{equation}
y^{(2)}_1=v_0-hy_0k/m+O(h^2).
\end{equation}
$$
$$
\begin{equation}
y^{(1)}_{n+1}=y^{(1)}_{n}+h y^{(2)}_{n+1}+O(h^2)
\end{equation}
$$
and
$$
\begin{equation}
y^{(2)}_{n+1}=y^{(2)}_{n}+h a_{n}+O(h^2).
\end{equation}
$$
The acceleration \( a_n \) is a function of \( a_n(y^{(1)}_{n}, y^{(2)}_{n},t) \) and needs to be evaluated
as well. This is the Euler-Cromer method.
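As a minimal sketch (our own variable names, with \( k=m=1 \) as in the oscillator example later in this chapter), one Euler-Cromer step can be coded as

// One Euler-Cromer step for the harmonic oscillator with k/m = 1:
// the velocity is updated first and the *new* velocity is used for the position.
void euler_cromer_step(double &x, double &v, double h)
{
  v -= h*x;    // v_{n+1} = v_n + h*a_n with a_n = -x_n
  x += h*v;    // x_{n+1} = x_n + h*v_{n+1}
}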
$$
\begin{equation}
\Delta(t_i,y_i(t_i))=f(t_i)+\frac{h}{2}\frac{df(t_i,y_i)}{dt}+O(h^3).
\end{equation}
$$
The second derivative can be rewritten as
$$
\begin{equation}
y''=f'=\frac{df}{dt}=\frac{\partial f}{\partial t}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial t}=\frac{\partial f}{\partial t}+\frac{\partial f}{\partial y}f
\tag{12}
\end{equation}
$$
and we can rewrite Eq. (9) as
$$
\begin{equation}
y_{i+1}=y(t=t_i+h)=y(t_i) +hf(t_i,y_i)+
\frac{h^2}{2}\left(\frac{\partial f}{\partial t}+\frac{\partial f}{\partial y}f\right) + O(h^{3 }),
\end{equation}
$$
which has a local approximation error \( O(h^{3 }) \) and a global
error \( O(h^{2}) \).
$$
\begin{equation}
y_{i+1}=y(t=t_i+h)=y(t_i) + h(f(t_i,y_i)+\dots f^{(p-1)}(t_i,y_i)
\frac{h^{p-1}}{p!}) + O(h^{p+1}).
\end{equation}
$$
These methods, based on higher-order derivatives, are in general not used
in numerical computation, since they rely on evaluating
derivatives several times. Unless one has analytical expressions
for these, the risk of roundoff errors is large.
$$
\begin{equation}
y^{(1)}_{n+1}=y^{(1)}_{n}+\frac{h}{2}\left(y^{(2)}_{n+1}+y^{(2)}_{n}\right)+O(h^2)
\end{equation}
$$
and
$$
\begin{equation}
y^{(2)}_{n+1}=y^{(2)}_{n}+h a_{n}+O(h^2),
\end{equation}
$$
yielding
$$
\begin{equation}
y^{(1)}_{n+1}=y^{(1)}_{n}+hy^{(2)}_{n}+\frac{h^2}{2}a_n+O(h^3)
\end{equation}
$$
implying that the local truncation error in the position is now \( O(h^3) \), whereas Euler's or Euler-Cromer's
methods have a local error of \( O(h^2) \).
One method that avoids this is the so-called half-step method. Here we define
$$
\begin{equation}
y^{(2)}_{n+1/2}=y^{(2)}_{n-1/2}+h a_{n}+O(h^2),
\end{equation}
$$
and
$$
\begin{equation}
y^{(1)}_{n+1}=y^{(1)}_{n}+hy^{(2)}_{n+1/2} +O(h^2).
\end{equation}
$$
Note that this method needs the calculation of \( y^{(2)}_{1/2} \). This is done using
e.g., Euler's method
$$
\begin{equation}
y^{(2)}_{1/2}=y^{(2)}_{0}+h a_{0}+O(h^2).
\end{equation}
$$
As this method is numerically stable, it is often used instead of Euler's method.
$$
\begin{equation}
y^{(2)}_{n+1}=y^{(2)}_{n}+h a_{n+1/2}+O(h^2),
\tag{13}
\end{equation}
$$
and
$$
\begin{equation}
\tag{14}
y^{(1)}_{n+1}=y^{(1)}_{n}+hy^{(2)}_{n+1/2} +O(h^2).
\end{equation}
$$
The program program2.cpp includes all of the above methods.
To see this, consider first the following definitions
$$
\begin{equation}
\frac{dy}{dt}=f(t,y),
\end{equation}
$$
and
$$
\begin{equation}
y(t)=\int f(t,y) dt,
\end{equation}
$$
and
$$
\begin{equation}
y_{i+1}=y_i+ \int_{t_i}^{t_{i+1}} f(t,y) dt.
\end{equation}
$$
$$
\begin{equation}
\int_{t_i}^{t_{i+1}} f(t,y) dt \approx hf(t_{i+1/2},y_{i+1/2}) +O(h^3).
\end{equation}
$$
This means in turn that we have
$$
\begin{equation}
y_{i+1}=y_i + hf(t_{i+1/2},y_{i+1/2}) +O(h^3).
\end{equation}
$$
$$
\begin{equation}
y_{(i+1/2)}=y_i + \frac{h}{2}\frac{dy}{dt} =
y(t_i) + \frac{h}{2}f(t_i,y_i).
\end{equation}
$$
This means that we can define the following algorithm for
the second-order Runge-Kutta method, RK2.
$$
\begin{equation}
k_1=hf(t_i,y_i),
\end{equation}
$$
$$
\begin{equation}
k_2=hf(t_{i+1/2},y_i+k_1/2),
\end{equation}
$$
with the final value
$$
\begin{equation}
y_{i+1}\approx y_i + k_2 +O(h^3).
\end{equation}
$$
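A sketch of one RK2 step for a scalar equation \( dy/dt=f(t,y) \), with hypothetical names and written in the same spirit as the runge_kutta_4 function listed later in this chapter:

// One second-order Runge-Kutta (midpoint) step:
// k1 = h f(t_i, y_i), k2 = h f(t_i + h/2, y_i + k1/2), y_{i+1} = y_i + k2
double rk2_step(double t, double y, double h, double (*f)(double, double))
{
  double k1 = h*f(t, y);
  double k2 = h*f(t + 0.5*h, y + 0.5*k1);
  return y + k2;
}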
$$
\begin{equation}
k_1=hf(t_i,y_i),
\end{equation}
$$
$$
\begin{equation}
k_2=hf(t_i+h/2,y_i+k_1/2),
\end{equation}
$$
$$
\begin{equation}
k_3=hf(t_i+h/2,y_i+k_2/2)
\end{equation}
$$
$$
\begin{equation}
k_4=hf(t_i+h,y_i+k_3)
\end{equation}
$$
with the final value
$$
\begin{equation}
y_{i+1}=y_i +\frac{1}{6}\left( k_1 +2k_2+2k_3+k_4\right).
\end{equation}
$$
Thus, the algorithm consists in first calculating \( k_1 \)
with \( t_i \), \( y_i \) and \( f \) as inputs. Thereafter, we increase \( t_i \)
by \( h/2 \) and calculate \( k_2 \), then \( k_3 \) and finally \( k_4 \). The global error
goes as \( O(h^4) \).
$$
F=-kx.
$$
$$
m\frac{d^2x}{dt^2}=-kx,
$$
or we could rephrase it as
$$
\frac{d^2x}{dt^2}=-\frac{k}{m}x=-\omega_0^2x,
\tag{15}
$$
with the angular frequency \( \omega_0^2=k/m \).
The above differential equation has the advantage that it can be solved analytically with solutions on the form
$$
x(t)=A\cos(\omega_0t+\nu),
$$
where \( A \) is the amplitude and \( \nu \) the phase constant.
This provides in turn an important test for the numerical
solution and the development of a program for more complicated cases
which cannot be solved analytically.
$$
\frac{dx(t)}{dt}=v(t),
$$
and
$$
\frac{dv(t)}{dt}=-\omega_0^2x(t).
$$
We are now going to solve these equations using the Runge-Kutta method to fourth order discussed previously.
Since functions like \( \cos \) are periodic with a period \( 2\pi \), the solution \( x(t) \) also has to be periodic. This means that
$$
x(t+T)=x(t),
$$
with \( T \) the period defined as
$$
T=\frac{2\pi}{\omega_0}=\frac{2\pi}{\sqrt{k/m}}.
$$
Observe that \( T \) depends only on \( k/m \) and not on the amplitude of the solution.
Suppose we choose the initial conditions
$$
x(t=0)=1\quad \mbox{m}\quad v(t=0)=0\quad\mbox{m/s},
$$
meaning that the block is at rest at \( t=0 \) but with a potential energy
$$
E_0=\frac{1}{2}kx(t=0)^2=\frac{1}{2}k.
$$
The total energy at any time \( t \) has however to be conserved, meaning
that our solution has to fulfil the condition
$$
E_0=\frac{1}{2}kx(t)^2+\frac{1}{2}mv(t)^2.
$$
y[0] = initial_x;   //  initial position
y[1] = initial_v;   //  initial velocity
t = 0.;             //  initial time
E0 = 0.5*y[0]*y[0]+0.5*y[1]*y[1];   //  the initial total energy (k = m = 1)
// now we start solving the differential
// equations using the RK4 method
while (t <= tmax){
  derivatives(t, y, dydt);   // derivatives at the start of the interval
  runge_kutta_4(y, dydt, n, t, h, yout, derivatives);
  for (i = 0; i < n; i++) {
    y[i] = yout[i];
  }
  t += h;
  output(t, y, E0);   // write to file
}
// this function sets up the derivatives for this special case
void derivatives(double t, double *y, double *dydt)
{
dydt[0]=y[1]; // derivative of x
dydt[1]=-y[0]; // derivative of v
} // end of function derivatives
void runge_kutta_4(double *y, double *dydx, int n,
                   double x, double h,
                   double *yout, void (*derivs)(double, double *, double *))
{
  int i;
  double xh, hh, h6;
  double *dym, *dyt, *yt;
  // allocate space for local vectors
  dym = new double [n];
  dyt = new double [n];
  yt = new double [n];
  hh = h*0.5;
  h6 = h/6.;
  xh = x+hh;
  for (i = 0; i < n; i++) {
    yt[i] = y[i]+hh*dydx[i];
  }
  (*derivs)(xh, yt, dyt);     // computation of k2
  for (i = 0; i < n; i++) {
    yt[i] = y[i]+hh*dyt[i];
  }
  (*derivs)(xh, yt, dym);     // computation of k3
  for (i = 0; i < n; i++) {
    yt[i] = y[i]+h*dym[i];
    dym[i] += dyt[i];
  }
  (*derivs)(x+h, yt, dyt);    // computation of k4
  // now we update y in the array yout
  for (i = 0; i < n; i++){
    yout[i] = y[i]+h6*(dydx[i]+dyt[i]+2.0*dym[i]);
  }
  delete [] dym;
  delete [] dyt;
  delete [] yt;
} // end of function runge_kutta_4
Our next example is the simple pendulum. Newton's second law for a pendulum of mass \( m \) and length \( l \) gives, in the absence of friction,
$$
\begin{equation}
ml\frac{d^2\theta}{dt^2}+mg\sin(\theta)=0,
\end{equation}
$$
with an angular velocity and acceleration given by
$$
\begin{equation}
v=l\frac{d\theta}{dt},
\end{equation}
$$
and
$$
\begin{equation}
a=l\frac{d^2\theta}{dt^2}.
\end{equation}
$$
If we add a force describing the drag from the surrounding medium, the equation of motion becomes
$$
\begin{equation}
ml\frac{d^2\theta}{dt^2}+\nu\frac{d\theta}{dt} +mg\sin(\theta)=0,
\tag{16}
\end{equation}
$$
where \( \nu \) is now a positive constant parameterizing the viscosity
of the medium in question. In order to maintain the motion against
viscosity, it is necessary to add some external driving force.
We choose here a periodic driving force. The last equation becomes then
$$
\begin{equation}
ml\frac{d^2\theta}{dt^2}+\nu\frac{d\theta}{dt} +mg\sin(\theta)=A\sin(\omega t),
\tag{17}
\end{equation}
$$
with \( A \) and \( \omega \) two constants representing the amplitude and
the angular frequency respectively. The latter is called the driving frequency.
We introduce the so-called natural frequency
$$
\omega_0=\sqrt{g/l},
$$
and the new dimensionless time
$$
\hat{t}=\omega_0t,
$$
with the dimensionless driving frequency
$$
\hat{\omega}=\frac{\omega}{\omega_0},
$$
and introducing the quantity \( Q \), called the quality factor,
$$
Q=\frac{mg}{\omega_0\nu},
$$
and the dimensionless amplitude
$$
\hat{A}=\frac{A}{mg}
$$
we can rewrite equation (17) as
$$
\frac{d^2\theta}{d\hat{t}^2}+\frac{1}{Q}\frac{d\theta}{d\hat{t}}
+\sin(\theta)=\hat{A}\cos(\hat{\omega}\hat{t}).
$$
This equation can in turn be recast in terms of two coupled first-order differential equations as follows
$$
\frac{d\theta}{d\hat{t}}=\hat{v},
$$
and
$$
\frac{d\hat{v}}{d\hat{t}}=-\frac{\hat{v}}{Q}-\sin(\theta)+\hat{A}\cos(\hat{\omega}\hat{t}).
$$
These are the equations to be solved. The factor \( Q \) represents the number of oscillations of the undriven system that must occur before its energy is significantly reduced due to the viscous drag. The amplitude \( \hat{A} \) is measured in units of the maximum possible gravitational torque while \( \hat{\omega} \) is the angular frequency of the external torque measured in units of the pendulum's natural frequency.
In program2.cpp of chapter 8 we have implemented the following methods:
void euler();
void euler_cromer();
void midpoint();
void euler_richardson();
void half_step();
void rk2(); //runge-kutta-second-order
void rk4_step(double,double*,double*,double); // we need it in function rk4() and asc()
void rk4(); //runge-kutta-fourth-order
void asc(); //runge-kutta-fourth-order with adaptive stepsize control
class pendulum
{
private:
double Q, A_roof, omega_0, omega_roof,g; //
double y[2]; //for the initial-values of phi and v
int n; // how many steps
double delta_t,delta_t_roof;
public:
void derivatives(double,double*,double*);
void initialise();
void euler();
void euler_cromer();
void midpoint();
void euler_richardson();
void half_step();
void rk2(); //runge-kutta-second-order
void rk4_step(double,double*,double*,double); // we need it in function rk4() and asc()
void rk4(); //runge-kutta-fourth-order
void asc(); //runge-kutta-fourth-order with adaptive stepsize control
};
void pendulum::derivatives(double t, double* in, double* out)
{ /* Here we are calculating the derivatives at (dimensionless) time t.
     'in' are the values of phi and v, which are used for the calculation.
     The results are given to 'out'. */
  out[0] = in[1];    // out[0] = (phi)' = v
  if (Q)
    out[1] = -in[1]/((double) Q) - sin(in[0]) + A_roof*cos(omega_roof*t);  // out[1] = (phi)''
  else
    out[1] = -sin(in[0]) + A_roof*cos(omega_roof*t);  // out[1] = (phi)''
} // end of function derivatives
int main()
{
pendulum testcase;
testcase.initialise();
testcase.euler();
testcase.euler_cromer();
testcase.midpoint();
testcase.euler_richardson();
testcase.half_step();
testcase.rk2();
testcase.rk4();
return 0;
} // end of main function
MODULE pendulum
USE CONSTANTS
IMPLICIT NONE
REAL(DP), PRIVATE :: Q, A_roof, omega_0, omega_roof,g
REAL(DP), PRIVATE :: y(2) ! for the initial-values of phi and v
INTEGER, PRIVATE :: n ! how many steps
REAL(DP), PRIVATE :: delta_t,delta_t_roof
CONTAINS
SUBROUTINE derivatives(..)
SUBROUTINE initialise(..)
SUBROUTINE euler(..)
SUBROUTINE euler_cromer(..)
SUBROUTINE midpoint(..)
etc
END MODULE pendulum
Adaptive step size methods are based on computing the solution over a given interval with two different step sizes. Assume that, with a method of order \( M \), a single step of size \( h \) yields the result \( x_1 \), while two steps of size \( h/2 \) yield \( x_2 \). The exact value \( \tilde{x} \) can then be written as
$$
\tilde{x}=x_1+Ch^{M+1}+O(h^{M+2}),
$$
and
$$
\tilde{x}=x_2+2C(h/2)^{M+1}+O(h^{M+2}),
$$
with \( C \) a constant. Note that we calculate two halves in the last equation. We get then
$$
|x_1-x_2| = Ch^{M+1}(1-\frac{1}{2^M}).
$$
yielding
$$
C=\frac{|x_1-x_2|}{(1-2^{-M})h^{M+1}}.
$$
We rewrite
$$
\tilde{x}=x_2+\epsilon+O((h)^{M+2}),
$$
with
$$
\epsilon = \frac{|x_1-x_2|}{2^M-1}.
$$
$$
\tilde{x}=x_2+\epsilon+O((h)^{6}),
$$
with
$$
\epsilon = \frac{|x_1-x_2|}{15}.
$$
The estimate is one order higher than the original RK4, but this method is normally rather inefficient since it requires a lot of extra computations: we typically solve the equation three times at each time step.
However, we can compare the estimate \( \epsilon \) with a given accuracy \( \xi \) that we specify.
We can then ask: for given \( x_j \) and \( t_j \), what is the largest possible step size \( \tilde{h} \) that leads to a truncation error below \( \xi \)?
We want
$$
C\tilde{h}^{M+1} \le \xi,
$$
which leads to
$$
\left(\frac{\tilde{h}}{h}\right)^{M+1}\frac{|x_1-x_2|}{(1-2^{-M})}\le \xi,
$$
meaning that
$$
\tilde{h}=h\left(\frac{\xi}{\epsilon}\right)^{1/(M+1)}.
$$
With this prescription for adjusting the step size we can design the following algorithm, known as the Runge-Kutta-Fehlberg method:
$$
k_1 = h f (t_k , y_k ),
$$
$$
k_2 = h f (t_k + \frac{1}{4}h, y_k + \frac{1}{4}k_1) ,
$$
$$
k_3 = h f (t_k + \frac{3}{8}h, y_k + \frac{3}{32}k_1 + \frac{9}{32}k_2) ,
$$
$$
k_4 = h f (t_k + \frac{12}{13}h, y_k + \frac{1932}{2197}k_1 + \frac{7200}{2197}k_2+\frac{7296}{2197}k_3),
$$
$$
k_5 = h f (t_k + h, y_k + \frac{439}{216}k_1 -8k_2+ \frac{3680}{513}k_3+\frac{845}{4104}k_4),
$$
$$
k_6 = h f (t_k + \frac{1}{2}h, y_k - \frac{8}{27}k_1 + 2k_2-\frac{3544}{2565}k_3+\frac{1859}{4104}k_4-\frac{11}{40}k_5).
$$
The solution is then advanced with a Runge-Kutta method of order 4,
$$
y_{k+1} = y_k + \frac{25}{216}k_1+\frac{1408}{2565}k_3 +\frac{2197}{4104}k_4-\frac{1}{5}k_5,
$$
where the four function values \( k_1 \) , \( k_3 \) , \( k_4 \) , and \( k_5 \) are used. Notice that \( k_2 \) is not used here.
A better value for the solution is determined using a Runge-Kutta
method of order 5:
$$
z_{k+1} = y_k + \frac{16}{135}k_1+\frac{6656}{12825}k_3 +\frac{28561}{56430}k_4-\frac{9}{50}k_5+\frac{2}{55}k_6.
$$
The optimal time step \( \alpha h \) is then determined by
$$
\alpha = \left( \frac{\xi h}{2|z_{k+1}-y_{k+1}|}\right)^{1/4},
$$
with \( \xi \) our defined tolerance.
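As a minimal sketch of the simpler step-halving strategy discussed above (not of the embedded Fehlberg pair), the following function advances a scalar equation one step with RK4 (\( M=4 \)) and returns a suggested new step size; all names and the scalar signature are illustrative assumptions.
// Sketch: RK4 with step-halving error estimate for dy/dt = f(t,y); illustrative only.
#include <cmath>

double rk4_single_step(double t, double y, double h, double (*f)(double, double))
{
  double k1 = h*f(t, y);
  double k2 = h*f(t + 0.5*h, y + 0.5*k1);
  double k3 = h*f(t + 0.5*h, y + 0.5*k2);
  double k4 = h*f(t + h, y + k3);
  return y + (k1 + 2.0*k2 + 2.0*k3 + k4)/6.0;
}

// Advances (t,y) one step of size h and returns a suggested new step size.
// A full implementation would also reject the step if epsilon exceeds xi.
double adaptive_rk4_step(double &t, double &y, double h, double xi,
                         double (*f)(double, double))
{
  double x1 = rk4_single_step(t, y, h, f);          // one full step of size h
  double x2 = rk4_single_step(t, y, 0.5*h, f);      // two half steps of size h/2
  x2 = rk4_single_step(t + 0.5*h, x2, 0.5*h, f);
  double epsilon = std::fabs(x1 - x2)/15.0;         // error estimate for M = 4
  t += h;
  y  = x2 + (x2 - x1)/15.0;                         // extrapolated higher-order estimate
  return h*std::pow(xi/(epsilon + 1.0e-30), 0.2);   // h*(xi/epsilon)^{1/(M+1)}
}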
We now turn to partial differential equations (PDEs). A general second-order PDE in two variables reads
$$
A(x,y)\frac{\partial^2 U}{\partial x^2}+B(x,y)\frac{\partial^2 U}{\partial x\partial y}
+C(x,y)\frac{\partial^2 U}{\partial y^2}=F(x,y,U,\frac{\partial U}{\partial x}, \frac{\partial U}{\partial y})
$$
Examples: setting
$$
B=C=0,
$$
gives, for example, the 1+1-dimensional diffusion equation
$$
A\frac{\partial^2 U}{\partial x^2}=\frac{\partial U}{\partial t},
$$
which is an example of a parabolic PDE. The wave equation
$$
A\frac{\partial^2 U}{\partial x^2}+C\frac{\partial^2 U}{\partial y^2}=\frac{\partial^2 U}{\partial t^2}
$$
is an example of a hyperbolic PDE, while Poisson's equation (Laplace's equation for \( \rho =0 \)),
$$
\nabla^2 U({\bf x})=-4\pi \rho({\bf x}),
$$
is an elliptic PDE.
A typical diffusion problem is the heat equation for the temperature \( T({\bf x},t) \),
$$
\frac{\kappa}{C\rho}\nabla^2 T({\bf x},t) =\frac{\partial T({\bf x},t)}{\partial t},
$$
which, if the density varies in space and time, becomes
$$
\frac{\kappa}{C\rho({\bf x},t)}\nabla^2 T({\bf x},t) =\frac{\partial T({\bf x},t)}{\partial t}.
$$
In the following we consider the one-dimensional diffusion equation with the diffusion constant set to one,
$$
\nabla^2 u(x,t) =\frac{\partial u(x,t)}{\partial t},
$$
or
$$
u_{xx} = u_t,
$$
with initial conditions, i.e., the conditions at \( t=0 \),
$$
u(x,0)= g(x) \quad 0 \le x \le L
$$
with \( L=1 \) the length of the \( x \)-region of interest. The
boundary conditions are
$$
u(0,t)= a(t) \quad t \ge 0,
$$
and
$$
u(L,t)= b(t) \quad t \ge 0,
$$
where \( a(t) \) and \( b(t) \) are two functions which depend on time only, while
\( g(x) \) depends only on the position \( x \).
$$
u_t\approx \frac{u_{i,j+1}-u_{i,j}}{\Delta t},
$$
and
$$
u_{xx}\approx \frac{u_{i+1,j}-2u_{i,j}+u_{i-1,j}}{\Delta x^2}.
$$
The one-dimensional diffusion equation can then be rewritten in its
discretized version as
$$
\frac{u_{i,j+1}-u_{i,j}}{\Delta t}=\frac{u_{i+1,j}-2u_{i,j}+u_{i-1,j}}{\Delta x^2}.
$$
Defining \( \alpha = \Delta t/\Delta x^2 \) results in the explicit scheme
$$
\tag{18}
u_{i,j+1}= \alpha u_{i-1,j}+(1-2\alpha)u_{i,j}+\alpha u_{i+1,j}.
$$
In matrix-vector form the scheme reads
$$
V_{j+1} = AV_{j},
$$
where \( V_j=\left(u_{1,j},u_{2,j},\dots\right)^T \) collects the interior grid values at time step \( j \), and
$$
A=\left(\begin{array}{cccc}1-2\alpha&\alpha&0& 0\dots\\
\alpha&1-2\alpha&\alpha & 0\dots \\
\dots & \dots & \dots & \dots \\
0\dots & 0\dots &\alpha& 1-2\alpha\end{array}
\right)
$$
yielding
$$
V_{j+1} = AV_{j}=\dots = A^jV_0
$$
The explicit scheme, although rather simple to implement, is only conditionally stable, with the
stability condition given by
$$
\Delta t/\Delta x^2 \le 1/2
$$
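A minimal sketch of the explicit scheme (18), assuming homogeneous boundary conditions \( u(0,t)=u(L,t)=0 \) and an illustrative initial profile \( g(x)=\sin(\pi x) \); the function and variable names are assumptions for this sketch and not taken from the course programs.
// Sketch of the explicit scheme (18) for the 1+1-dimensional diffusion equation.
#include <cmath>
#include <vector>

void explicit_diffusion(int n, double dx, double dt, int tsteps, std::vector<double> &u)
{
  // u must have size n+1 on input
  const double pi = std::acos(-1.0);
  double alpha = dt/(dx*dx);              // stability requires alpha <= 1/2
  std::vector<double> unew(n+1, 0.0);
  for (int i = 0; i <= n; i++) u[i] = std::sin(pi*i*dx);   // initial condition g(x)
  u[0] = u[n] = 0.0;                      // boundary conditions
  for (int j = 0; j < tsteps; j++) {
    for (int i = 1; i < n; i++) {
      unew[i] = alpha*u[i-1] + (1.0 - 2.0*alpha)*u[i] + alpha*u[i+1];
    }
    for (int i = 1; i < n; i++) u[i] = unew[i];
  }
}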
In the implicit (backward Euler) scheme we use instead the backward approximation for the time derivative,
$$
u_t\approx \frac{u(x_i,t_j)-u(x_i,t_j-k)}{k}
$$
and
$$
u_{xx}\approx \frac{u(x_i+h,t_j)-2u(x_i,t_j)+u(x_i-h,t_j)}{h^2}
$$
Defining \( \alpha = k/h^2 \) gives
$$
u_{i,j-1}= -\alpha u_{i-1,j}+(1+2\alpha)u_{i,j}-\alpha u_{i+1,j}
$$
Here \( u_{i,j-1} \) is the only known quantity in each equation; the values at time step \( j \) are the unknowns and must be found by solving a linear system. In matrix-vector form we have
$$
AV_{j} = V_{j-1}
$$
with
$$
A=\left(\begin{array}{cccc}1+2\alpha&-\alpha&0& 0\dots\\
-\alpha&1+2\alpha&-\alpha & 0\dots \\
\dots & \dots & \dots & \dots \\
0\dots & 0\dots &-\alpha& 1+2\alpha\end{array}
\right)
$$
which gives
$$
V_{j} = A^{-1}V_{j-1}=\dots = A^{-j}V_0
$$
We then only need to invert a matrix, as in the following Fortran snippet:
! now invert the matrix
CALL matinv( a, ndim, det)
DO i = 1, m
DO l=1, ndim
u(l) = DOT_PRODUCT(a(l,:),v(:))
ENDDO
v = u
t = i*k
DO j=1, ndim
WRITE(6,*) t, j*h, v(j)
ENDDO
ENDDO
To summarize, the explicit (forward Euler) scheme uses the forward approximation for the time derivative,
$$
u_t\approx \frac{u(x,t+\Delta t)-u(x,t)}{\Delta t}=\frac{u(x_i,t_j+\Delta t)-u(x_i,t_j)}{\Delta t},
$$
while the implicit (backward Euler) scheme uses the backward approximation,
$$
u_t\approx \frac{u(x,t)-u(x,t-\Delta t)}{\Delta t}=\frac{u(x_i,t_j)-u(x_i,t_j-\Delta t)}{\Delta t}.
$$
The implicit method can be applied in a brute force way (inverting the matrix) as long as the elements of the matrix are constants.
However, it is more efficient to use a linear algebra solver for tridiagonal matrices.
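As a minimal sketch of such a tridiagonal solver (the Thomas algorithm), here assuming constant sub-, main- and super-diagonal values a, b and c as they appear in the implicit scheme; the names and the interface are illustrative and not the course's own routines.
// Sketch: Thomas algorithm for a tridiagonal system with constant diagonals.
#include <vector>

void tridiagonal_solve(double a, double b, double c,
                       const std::vector<double> &rhs, std::vector<double> &u)
{
  int n = rhs.size();
  std::vector<double> btemp(n), rtemp(n);
  btemp[0] = b;
  rtemp[0] = rhs[0];
  // forward substitution
  for (int i = 1; i < n; i++) {
    double factor = a/btemp[i-1];
    btemp[i] = b - factor*c;
    rtemp[i] = rhs[i] - factor*rtemp[i-1];
  }
  // backward substitution
  u[n-1] = rtemp[n-1]/btemp[n-1];
  for (int i = n-2; i >= 0; i--) {
    u[i] = (rtemp[i] - c*u[i+1])/btemp[i];
  }
}
For the implicit scheme above one would call this at every time step with a = c = \( -\alpha \) and b = \( 1+2\alpha \).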
The explicit and implicit schemes can be summarized in the so-called \( \theta \)-rule,
$$
\frac{\theta}{\Delta x^2}\left(u_{i-1,j}-2u_{i,j}+u_{i+1,j}\right)+
\frac{1-\theta}{\Delta x^2}\left(u_{i+1,j-1}-2u_{i,j-1}+u_{i-1,j-1}\right)=
\frac{1}{\Delta t}\left(u_{i,j}-u_{i,j-1}\right),
$$
which for \( \theta=0 \) yields the forward formula for the first derivative and
the explicit scheme, while \( \theta=1 \) yields the backward formula and the implicit
scheme. These two schemes are called the forward and backward Euler schemes, respectively.
For \( \theta = 1/2 \) we obtain a new scheme named after its inventors, Crank and Nicolson.
$$
-\alpha u_{i-1,j}+\left(2+2\alpha\right)u_{i,j}-\alpha u_{i+1,j}=
\alpha u_{i-1,j-1}+\left(2-2\alpha\right)u_{i,j-1}+\alpha u_{i+1,j-1},
$$
or in matrix-vector form as
$$
\left(2\hat{I}+\alpha\hat{B}\right)V_{j}=
\left(2\hat{I}-\alpha\hat{B}\right)V_{j-1},
$$
where the vector \( V_{j} \) is the same as defined in the implicit case while the matrix
\( \hat{B} \) is
$$
\hat{B}=\left(\begin{array}{cccc}2&-1&0& 0\dots\\
-1&2&-1 & 0\dots \\
\dots & \dots & \dots & \dots \\
0\dots & 0\dots &-1& 2\end{array}
\right)
$$
$$
\begin{align}
u(x+\Delta x,t)&=u(x,t)+\frac{\partial u(x,t)}{\partial x} \Delta x+\frac{\partial^2 u(x,t)}{2\partial x^2}\Delta x^2+\mathcal{O}(\Delta x^3), \nonumber\\
u(x-\Delta x,t)&=u(x,t)-\frac{\partial u(x,t)}{\partial x}\Delta x+\frac{\partial^2 u(x,t)}{2\partial x^2} \Delta x^2+\mathcal{O}(\Delta x^3), \nonumber\\
u(x,t+\Delta t)&=u(x,t)+\frac{\partial u(x,t)}{\partial t}\Delta t+ \mathcal{O}(\Delta t^2).
\tag{19}
\end{align}
$$
$$
\begin{align}
&\left[\frac{\partial u(x,t)}{\partial t}\right]_{\text{approx}}
=\frac{\partial u(x,t)}{\partial t}+\mathcal{O}(\Delta t) , \nonumber\\
&\left[\frac{\partial^2 u(x,t)}{\partial x^2}\right]_{\text{approx}}
=\frac{\partial^2 u(x,t)}{\partial x^2}+\mathcal{O}(\Delta x^2).
\tag{20}
\end{align}
$$
It is easy to convince oneself that the backward Euler method must have the same truncation errors as the forward Euler scheme.
For the Crank-Nicolson scheme we Taylor expand around \( t'=t+\Delta t/2 \) and obtain
$$
\begin{align}
u(x+\Delta x, t+\Delta t)&=u(x,t')+\frac{\partial u(x,t')}{\partial x}\Delta x+\frac{\partial u(x,t')}{\partial t} \frac{\Delta t}{2} +\frac{\partial^2 u(x,t')}{2\partial x^2}\Delta x^2+\frac{\partial^2 u(x,t')}{2\partial t^2}\frac{\Delta t^2}{4} +\notag \\ \nonumber
&\frac{\partial^2 u(x,t')}{\partial x\partial t}\frac{\Delta t}{2} \Delta x+ \mathcal{O}(\Delta t^3)\\ \nonumber
u(x-\Delta x, t+\Delta t)&=u(x,t')-\frac{\partial u(x,t')}{\partial x}\Delta x+\frac{\partial u(x,t')}{\partial t} \frac{\Delta t}{2} +\frac{\partial^2 u(x,t')}{2\partial x^2}\Delta x^2+\frac{\partial^2 u(x,t')}{2\partial t^2}\frac{\Delta t^2}{4} - \notag\\ \nonumber
&\frac{\partial^2 u(x,t')}{\partial x\partial t}\frac{\Delta t}{2} \Delta x+ \mathcal{O}(\Delta t^3)\\
u(x+\Delta x,t)&=u(x,t')+\frac{\partial u(x,t')}{\partial x}\Delta x-\frac{\partial u(x,t')}{\partial t} \frac{\Delta t}{2} +\frac{\partial^2 u(x,t')}{2\partial x^2}\Delta x^2+\frac{\partial^2 u(x,t')}{2\partial t^2}\frac{\Delta t^2}{4} -\notag \\ \nonumber
&\frac{\partial^2 u(x,t')}{\partial x\partial t}\frac{\Delta t}{2} \Delta x+ \mathcal{O}(\Delta t^3)\\ \nonumber
u(x-\Delta x,t)&=u(x,t')-\frac{\partial u(x,t')}{\partial x}\Delta x-\frac{\partial u(x,t')}{\partial t} \frac{\Delta t}{2} +\frac{\partial^2 u(x,t')}{2\partial x^2}\Delta x^2+\frac{\partial^2 u(x,t')}{2\partial t^2}\frac{\Delta t^2}{4} +\notag \\ \nonumber
&\frac{\partial^2 u(x,t')}{\partial x\partial t}\frac{\Delta t}{2} \Delta x+ \mathcal{O}(\Delta t^3)\\ \nonumber
u(x,t+\Delta t)&=u(x,t')+\frac{\partial u(x,t')}{\partial t}\frac{\Delta t}{2} +\frac{\partial ^2 u(x,t')}{2\partial t^2}\frac{\Delta t^2}{4} + \mathcal{O}(\Delta t^3) \nonumber\\
u(x,t)&=u(x,t')-\frac{\partial u(x,t')}{\partial t}\frac{\Delta t}{2}+\frac{\partial ^2 u(x,t')}{2\partial t^2}\frac{\Delta t^2}{4} + \mathcal{O}(\Delta t^3)
\tag{21}
\end{align}
$$
$$
\begin{align}
&\left[\frac{\partial u(x,t')}{\partial t}\right]_{\text{approx}} =\frac{\partial u(x,t')}{\partial t}+\mathcal{O}(\Delta t^2) , \\ \nonumber
&\left[\frac{\partial^2 u(x,t')}{\partial x^2}\right]_{\text{approx}}=\frac{\partial^2 u(x,t')}{\partial x^2}+\mathcal{O}(\Delta x^2).
\end{align}
$$
Scheme: | Truncation Error: | Stability requirements: |
---|---|---|
Crank-Nicolson | \( \mathcal{O}(\Delta x^2,\Delta t^2) \) | Stable for all \( \Delta t \) and \( \Delta x \) |
Backward Euler | \( \mathcal{O}(\Delta x^2,\Delta t) \) | Stable for all \( \Delta t \) and \( \Delta x \) |
Forward Euler | \( \mathcal{O}(\Delta x^2,\Delta t) \) | \( \Delta t\leq \frac{1}{2}\Delta x^2 \) |
Let us finally look at the closed-form solution of the one-dimensional diffusion equation,
$$
\nabla^2 u(x,t) =\frac{\partial u(x,t)}{\partial t},
$$
with initial conditions
$$
u(x,0)= g(x) \quad 0 < x < L.
$$
and boundary conditions
$$
u(0,t)= 0 \quad t \ge 0, \quad u(L,t)= 0 \quad t \ge 0.
$$
We assume that we have solutions of the form (separation of variable)
$$
\begin{equation}
u(x,t)=F(x)G(t),
\end{equation}
$$
which inserted in the partial differential equation results in
$$
\begin{equation}
\frac{F''}{F}=\frac{G'}{G},
\end{equation}
$$
where the derivative is with respect to \( x \) on the left hand side and with respect to \( t \) on the right hand side.
This equation should hold for all \( x \) and \( t \); we must therefore require the rhs and the lhs to be equal to a constant, which we set to \( -\lambda^2 \), resulting in
$$
\begin{equation}
F''+\lambda^2F=0; \quad G'=-\lambda^2G,
\end{equation}
$$
with general solutions
$$
\begin{equation}
F(x)=A\sin(\lambda x)+B\cos(\lambda x); \quad G(t)=Ce^{-\lambda^2t}.
\end{equation}
$$
The boundary conditions \( u(0,t)=u(L,t)=0 \) yield \( B=0 \) and \( \lambda=n\pi/L \) with \( n \) an integer, giving solutions of the form
$$
\begin{equation}
u(x,t)=A_n\sin(n\pi x/L)e^{-n^2\pi^2 t/L^2}.
\end{equation}
$$
But there are infinitely many possible \( n \) values (infinite number of solutions). Moreover,
the diffusion equation is linear and because of this we know that a superposition of solutions
will also be a solution of the equation. We may therefore write
$$
\begin{equation}
u(x,t)=\sum_{n=1}^{\infty} A_n \sin(n\pi x/L) e^{-n^2\pi^2 t/L^2}.
\end{equation}
$$
$$
\begin{equation}
u(x,0)=g(x)=\sum_{n=1}^{\infty} A_n \sin(n\pi x/L).
\end{equation}
$$
The coefficients \( A_n \) are the Fourier coefficients of the function \( g(x) \). Because of this, \( A_n \) is given by (from the theory of Fourier series)
$$
\begin{equation}
A_n=\frac{2}{L}\int_0^L g(x)\sin(n\pi x/L) \mbox{d}x.
\end{equation}
$$
Different \( g(x) \) functions will obviously result in different results for \( A_n \).
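For example, if \( g(x)=\sin(\pi x/L) \), the orthogonality of the sine functions gives \( A_1=1 \) and \( A_n=0 \) for \( n>1 \), and the solution is simply \( u(x,t)=\sin(\pi x/L)e^{-\pi^2 t/L^2} \).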
We now turn to Laplace's equation in two dimensions,
$$
\nabla^2 u({\bf x})=u_{xx}+u_{yy}=0.
$$
with possible boundary conditions
\( u(x,y) = g(x,y) \) on the border. There is no time-dependence.
Choosing equally many steps in both directions we have a quadratic or rectangular
grid, depending on whether we choose equal steps lengths or not in the \( x \) and
the \( y \) directions. Here we set \( \Delta x = \Delta y = h \) and obtain
a discretized version
$$
u_{xx}\approx \frac{u(x+h,y)-2u(x,y)+u(x-h,y)}{h^2},
$$
and
$$
u_{yy}\approx \frac{u(x,y+h)-2u(x,y)+u(x,y-h)}{h^2},
$$
$$
u_{xx}\approx \frac{u_{i+1,j}-2u_{i,j}+u_{i-1,j}}{h^2},
$$
and
$$
u_{yy}\approx \frac{u_{i,j+1}-2u_{i,j}+u_{i,j-1}}{h^2},
$$
which gives when inserted in Laplace's equation
$$
\begin{equation}
\tag{22}
u_{i,j}= \frac{1}{4}\left[u_{i,j+1}+u_{i,j-1}+u_{i+1,j}+u_{i-1,j}\right].
\end{equation}
$$
This is our final numerical scheme for solving Laplace's equation.
Poisson's equation adds only a minor complication
to the above equation since in this case we have
$$
u_{xx}+u_{yy}=-\rho({\bf x}),
$$
and we need only to add a discretized version of \( \rho({\bf x}) \),
resulting in
$$
\begin{equation}
\tag{23}
u_{i,j}= \frac{1}{4}\left[u_{i,j+1}+u_{i,j-1}+u_{i+1,j}+u_{i-1,j}+h^2\rho_{i,j}\right].
\end{equation}
$$
int DiffusionJacobi(int N, double dx, double dt,
                    double **A, double **q, double abstol)
{
  int i, j, k;
  int maxit = 100000;
  double sum;
  double ** Aold = CreateMatrix(N,N);
  double D = dt/(dx*dx);
  for(i=1; i<N-1; i++)
    for(j=1; j<N-1; j++)
      Aold[i][j] = 1.0;
  /* Boundary Conditions -- all zeros */
  for(i=0; i<N; i++){
    A[0][i] = 0.0;
    A[N-1][i] = 0.0;
    A[i][0] = 0.0;
    A[i][N-1] = 0.0;
  }
  for(k=0; k<maxit; k++){
    // update the interior points from the old solution
    for(i = 1; i<N-1; i++){
      for(j=1; j<N-1; j++){
        A[i][j] = dt*q[i][j] + Aold[i][j] +
          D*(Aold[i+1][j] + Aold[i][j+1] - 4.0*Aold[i][j] +
             Aold[i-1][j] + Aold[i][j-1]);
      }
    }
    // check for convergence and copy the new solution into Aold
    sum = 0.0;
    for(i=0; i<N; i++){
      for(j=0; j<N; j++){
        sum += (Aold[i][j]-A[i][j])*(Aold[i][j]-A[i][j]);
        Aold[i][j] = A[i][j];
      }
    }
    if(sqrt(sum) < abstol){
      DestroyMatrix(Aold, N, N);
      return k;
    }
  }
  DestroyMatrix(Aold, N, N);
  return maxit;
}
$$
\left\{\begin{array}{cc} \lambda\left(\frac{\partial^2u}{\partial x^2}+\frac{\partial^2u}{\partial y^2}\right) = \frac{\partial^2u}{\partial t^2}& x,y\in[0,1], t \ge 0 \\
u(x,y,0) = \sin(\pi x)\sin(2\pi y)& x,y\in (0,1) \\
u = 0 \quad \mbox{boundary} & t \ge 0\\
\partial u/\partial t|_{t=0}=0 & x,y\in (0,1)\\
\end{array}\right. .
$$
The boundary is defined by \( x=0 \), \( x=1 \), \( y=0 \) and \( y=1 \).
Here we set \( \lambda = 1 \).
$$
\left\{\begin{array}{cc} t_l=l\Delta t& l \ge 0 \\
x_i=i\Delta x& 0 \le i \le n_x\\
y_j=j\Delta y& 0 \le j \le n_y\end{array}\right. ,
$$
and we will let \( \Delta x=\Delta y = h \) and \( n_x=n_y \) for the sake of
simplicity.
We have now the following discretized partial derivatives
$$
u_{xx}\approx \frac{u_{i+1,j}^l-2u_{i,j}^l+u_{i-1,j}^l}{h^2},
$$
and
$$
u_{yy}\approx \frac{u_{i,j+1}^l-2u_{i,j}^l+u_{i,j-1}^l}{h^2},
$$
and
$$
u_{tt}\approx \frac{u_{i,j}^{l+1}-2u_{i,j}^{l}+u_{i,j}^{l-1}}{\Delta t^2}.
$$
$$
\tag{24}
u_{i,j}^{l+1}
=2u_{i,j}^{l}-u_{i,j}^{l-1}+\frac{\Delta t^2}{h^2}\left(u_{i+1,j}^l-4u_{i,j}^l+u_{i-1,j}^l+u_{i,j+1}^l+u_{i,j-1}^l\right),
$$
where again we have an explicit scheme with \( u_{i,j}^{l+1} \) as the only
unknown quantity.
It is easy to account for different step lengths for \( x \) and \( y \).
The initial condition on the time derivative is treated in much the same way
as in the one-dimensional case, except that we now have an additional
index due to the extra spatial dimension: we need to compute
\( u_{i,j}^{-1} \) through
$$
u_{i,j}^{-1}=u_{i,j}^0+\frac{\Delta t^2}{2h^2}\left(u_{i+1,j}^0-4u_{i,j}^0+u_{i-1,j}^0+u_{i,j+1}^0+u_{i,j-1}^0\right),
$$
in our setup of the initial conditions.
// After initializations and declaration of variables
for ( int i = 0; i < n; i++ ) {
  u[i] = new double [n];
  uLast[i] = new double [n];
  uNext[i] = new double [n];
  x[i] = i*h;
  y[i] = x[i];
}
// initializing
for ( int i = 0; i < n; i++ ) {        // setting initial step
  for ( int j = 0; j < n; j++ ) {
    uLast[i][j] = sin(PI*x[i])*sin(2*PI*y[j]);
  }
}
for ( int i = 1; i < (n-1); i++ ) {    // setting first step using the initial derivative
  for ( int j = 1; j < (n-1); j++ ) {
    u[i][j] = uLast[i][j] - ((tStep*tStep)/(2.0*h*h))*
      (4*uLast[i][j] - uLast[i+1][j] - uLast[i-1][j] - uLast[i][j+1] - uLast[i][j-1]);
  }
}
for ( int i = 0; i < n; i++ ) {        // setting boundaries once and for all
  u[i][0] = 0;
  u[i][n-1] = 0;
  u[0][i] = 0;
  u[n-1][i] = 0;
  uNext[i][0] = 0;
  uNext[i][n-1] = 0;
  uNext[0][i] = 0;
  uNext[n-1][i] = 0;
}
// iterating in time
double t = 0.0 + tStep;
int iter = 0;
while ( t < tFinal ) {
  iter++;
  t = t + tStep;
  for ( int i = 1; i < (n-1); i++ ) {  // computing next step
    for ( int j = 1; j < (n-1); j++ ) {
      uNext[i][j] = 2*u[i][j] - uLast[i][j] - ((tStep*tStep)/(h*h))*
        (4*u[i][j] - u[i+1][j] - u[i-1][j] - u[i][j+1] - u[i][j-1]);
    }
  }
  for ( int i = 1; i < (n-1); i++ ) {  // shifting results down
    for ( int j = 1; j < (n-1); j++ ) {
      uLast[i][j] = u[i][j];
      u[i][j] = uNext[i][j];
    }
  }
}
$$
\left\{\begin{array}{cc} c^2(u_{xx}+u_{yy}) = u_{tt}& x,y\in(0,L), t>0 \\
u(x,y,0) = f(x,y) & x,y\in (0,L) \\
u(0,0,t)=u(L,L,t)=0 & t > 0\\
\partial u/\partial t|_{t=0}= g(x,y) & x,y\in (0,L)\\
\end{array}\right. .
$$
$$
u(x,y,t) = F(x,y) G(t),
$$
resulting in the equation
$$
FG_{tt}= c^2(F_{xx}G+F_{yy}G),
$$
or
$$
\frac{G_{tt}}{c^2G} = \frac{1}{F}(F_{xx}+F_{yy}) = -\nu^2.
$$
The lhs and rhs are independent of each other and we obtain two differential equations
$$
F_{xx}+F_{yy}+F\nu^2=0,
$$
and
$$
G_{tt} + Gc^2\nu^2 = G_{tt} + G\lambda^2 = 0,
$$
with \( \lambda = c\nu \).
$$
F(x,y) = H(x)Q(y),
$$
which results in
$$
\frac{1}{H}H_{xx} = -\frac{1}{Q}(Q_{yy}+Q\nu^2)= -\kappa^2.
$$
Since the lhs and rhs are again independent of each other, we can separate the latter equation into two independent
equations, one for \( x \) and one for \( y \), namely
$$
H_{xx} + \kappa^2H = 0,
$$
and
$$
Q_{yy} + \rho^2Q = 0,
$$
with \( \rho^2= \nu^2-\kappa^2 \).
$$
H(x) = A\cos(\kappa x)+B\sin(\kappa x),
$$
and
$$
Q(y) = C\cos(\rho y)+D\sin(\rho y).
$$
The boundary conditions require that \( F(x,y) = H(x)Q(y) \) is zero at the boundaries, meaning that
\( H(0)=H(L)=Q(0)=Q(L)=0 \). This yields the solutions
$$
H_m(x) = \sin(\frac{m\pi x}{L}) \quad Q_n(y) = \sin(\frac{n\pi y}{L}),
$$
or
$$
F_{mn}(x,y) = \sin(\frac{m\pi x}{L})\sin(\frac{n\pi y}{L}).
$$
$$
G_{mn}(t) = B_{mn}\cos(\lambda_{mn} t)+D_{mn}\sin(\lambda_{mn} t),
$$
with \( \lambda_{mn}=c\nu_{mn}=c\pi\sqrt{m^2+n^2}/L \), and the general solution is of the form
$$
u(x,y,t) = \sum_{mn=1}^{\infty} u_{mn}(x,y,t) = \sum_{mn=1}^{\infty}F_{mn}(x,y)G_{mn}(t).
$$
$$
B_{mn} = \frac{2}{L}\int_0^L\int_0^L dxdy f(x,y) \sin(\frac{m\pi x}{L})\sin(\frac{n\pi y}{L}),
$$
and
$$
D_{mn} = \frac{2}{L}\int_0^L\int_0^L dxdy g(x,y) \sin(\frac{m\pi x}{L})\sin(\frac{n\pi y}{L}).
$$
Inserting the particular functional forms of \( f(x,y) \) and \( g(x,y) \) one obtains the final analytic expressions.
$$
\Delta t \le \frac{1}{\sqrt{\lambda}}\left(\frac{1}{\Delta x^2}+\frac{1}{\Delta y^2}\right)^{-1/2}
$$
where \( \Delta t \), \( \Delta x \) and \( \Delta y \) are the chosen step lengths. In our case
\( \Delta x=\Delta y=h \). How do we find this condition? In one dimension we can proceed
as we did for the diffusion equation.
$$
u(x,y,t) = A \exp{ (i(k_xx+k_yy-\omega t))}
$$
Then from
$$
u_{xx}\approx \frac{u_{i+1,j}^l-2u_{i,j}^l+u_{i-1,j}^l}{\Delta x^2},
$$
we get, with \( u_i=\exp{ikx_i} \)
$$
u_{xx}\approx \frac{u_i}{\Delta x^2}\left(\exp{ik\Delta x}-2+\exp{(-ik\Delta x)}\right),
$$
or
$$
u_{xx}\approx 2\frac{u_i}{\Delta x^2}\left(\cos(k\Delta x)-1\right)=
-4\frac{u_i}{\Delta x^2}\sin^2(k\Delta x/2).
$$
We get similar results for \( t \) and \( y \).
$$
\lambda\left(\frac{\partial^2u}{\partial x^2}+\frac{\partial^2u}{\partial y^2}\right) = \frac{\partial^2u}{\partial t^2},
$$
resulting in
$$
\lambda\left(-4\frac{u_{ij}^l}{\Delta x^2}\sin^2{(k_x\Delta x/2)}-4\frac{u_{ij}^l}{\Delta y^2}\sin^2{(k_y\Delta y/2)}\right)=-4\frac{u_{ij}^l}{\Delta t^2}\sin^2{(\omega\Delta t/2)},
$$
resulting in
$$
\sin{(\omega\Delta t/2)}=\pm \sqrt{\lambda}\Delta t\left(\frac{1}{\Delta x^2}\sin^2{(k_x\Delta x/2)}+\frac{1}{\Delta y^2}\sin^2{(k_y\Delta y/2)}\right)^{1/2}.
$$
The squared sine functions can at most be unity, and the frequency \( \omega \) is real, so our wave is neither damped
nor amplified. The numerical \( \omega \) must also be real, which is the case when
\( \sin{(\omega\Delta t/2)} \) is less than or equal to unity, meaning that
$$
\Delta t \le \frac{1}{\sqrt{\lambda}}\left(\frac{1}{\Delta x^2}+\frac{1}{\Delta y^2}\right)^{-1/2}.
$$
$$
\frac{\partial^2 u}{\partial t^2} = \nabla\cdot (\lambda(x,y) \nabla u).
$$
If \( \lambda \) is constant, we obtain the standard wave equation discussed in the two previous points.
The solution \( u(x,y,t) \) could represent a model for water waves. It represents then the surface elevation from still water.
We can model \( \lambda \) as
$$
\lambda = gH(x,y),
$$
with \( g \) being the acceleration of gravity and \( H(x,y) \) is the still water depth.
The function \( H(x,y) \) simulates the water depth using, for example, measurements of still water depths in, say, a fjord or the North Sea. The boundary conditions are then determined by the coast lines as discussed in point d) below. We have assumed that the vertical motion is negligible and that we deal with long wavelengths \( \tilde{\lambda} \) compared with the depth of the sea \( H \), that is \( \tilde{\lambda}/H \gg 1 \). We normally neglect Coriolis effects in such calculations.
We need a discrete version of
$$
\nabla \cdot (\lambda(x,y) \nabla u)= \frac{\partial }{\partial x}\left(\lambda(x,y)\frac{\partial u}{\partial x}\right)+
\frac{\partial }{\partial y}\left(\lambda(x,y)\frac{\partial u}{\partial y}\right),
$$
as follows using again a quadratic domain for \( x \) and \( y \):
$$
\frac{\partial }{\partial x}\left(\lambda(x,y)\frac{\partial u}{\partial x}\right)\approx
\frac{1}{\Delta x} \left(\lambda_{i+1/2,j}\left[\frac{u_{i+1,j}^l-u_{i,j}^l}{\Delta x}\right]
-\lambda_{i-1/2,j}\left[\frac{u_{i,j}^l-u_{i-1,j}^l}{\Delta x}\right]\right),
$$
and
$$
\frac{\partial }{\partial y}\left(\lambda(x,y)\frac{\partial u}{\partial y}\right)\approx
\frac{1}{\Delta y} \left(\lambda_{i,j+1/2}\left[\frac{u_{i,j+1}^l-u_{i,j}^l}{\Delta y}\right]
-\lambda_{i,j-1/2}\left[\frac{u_{i,j}^l-u_{i,j-1}^l}{\Delta y}\right]\right).
$$
$$
\frac{d}{dx}\left(\lambda(x)\frac{du}{dx}\right)|_{x=x_i} \approx
\frac{1}{\Delta x}\left(\lambda\frac{du}{dx}|_{x=x_{i+1/2}}-\lambda\frac{du}{dx}|_{x=x_{i-1/2}}\right),
$$
where we approximated it at the midpoint by going half a step to the right and half a step to
the left. Then we approximate
$$
\lambda\frac{du}{dx}|_{x=x_{i+1/2}}\approx \lambda_{x_{i+1/2}}\frac{u_{i+1}-u_i}{\Delta x},
$$
and similarly for \( x = x_{i-1/2} \).
We now turn to numerical integration. Choose a step size
$$
h=\frac{b-a}{N}
$$
where \( N \) is the number of steps and \( a \) and \( b \) the lower and upper limits
of integration.
Then choose to truncate the Taylor expansion of the function \( f(x) \) at a certain derivative, and
with this approximation to \( f(x) \) perform the integration.
$$
\int_a^bf(x) dx= \int_a^{a+2h}f(x)dx + \int_{a+2h}^{a+4h}f(x)dx+\dots \int_{b-2h}^{b}f(x)dx.
$$
The strategy then is to find a reliable Taylor expansion for \( f(x) \) in the smaller
subintervals. Consider, e.g., evaluating \( \int_{-h}^{+h}f(x)dx \), with the linear Taylor expansions
$$
f(x)=f_0 + \frac{f_h-f_0}{h}x+O(x^2),
$$
for \( x=x_0 \) to \( x=x_0+h \) and
$$
f(x)=f_0 + \frac{f_0-f_{-h}}{h}x+O(x^2),
$$
for \( x=x_0-h \) to \( x=x_0 \). The error goes like \( O(x^2) \).
If we then evaluate the integral we obtain
$$
\int_{-h}^{+h}f(x)dx=\frac{h}{2}\left(f_h + 2f_0 + f_{-h}\right)+O(h^3),
$$
which is the well-known trapezoidal rule. The local error goes like
\( O(h^3)=O((b-a)^3/N^3) \), while the global error goes like \( O(h^2) \).
Easy to implement numerically through the following simple algorithm
double trapezoidal_rule(double a, double b, int n,
                        double (*func)(double))
{
  double trapez_sum;
  double fa, fb, x, step;
  int j;
  step = (b-a)/((double) n);
  fa = (*func)(a)/2. ;
  fb = (*func)(b)/2. ;
  trapez_sum = 0.;
  for (j=1; j <= n-1; j++){
    x = j*step+a;
    trapez_sum += (*func)(x);
  }
  trapez_sum = (trapez_sum+fb+fa)*step;
  return trapez_sum;
} // end function for trapezoidal rule
double trapezoidal_rule(double a, double b, int n,
double (*func)(double))
We call this function simply as
integral = trapezoidal_rule(a, b, n, mysuperduperfunction);
The first and second derivatives are given by
$$
\frac{f_h-f_{-h}}{2h}=f'_0+\sum_{j=1}^{\infty}\frac{f_0^{(2j+1)}}{(2j+1)!}h^{2j},
$$
and
$$
\frac{ f_h -2f_0 +f_{-h}}{h^2}=f_0''+2\sum_{j=1}^{\infty}\frac{f_0^{(2j+2)}}{(2j+2)!}h^{2j},
$$
and together they give the second-order approximation
\( f(x)=f_0 + \frac{f_h-f_{-h}}{2h}x + \frac{ f_h -2f_0 +f_{-h}}{2h^2}x^2 +O(x^3) \).
Inserting this approximation in the integral gives
$$
\int_{-h}^{+h}f(x)dx=\frac{h}{3}\left(f_h + 4f_0 + f_{-h}\right)+O(h^5),
$$
which is Simpson's rule.
Note that the improved accuracy in the evaluation of the derivatives gives a better error approximation, \( O(h^5) \) vs. \( O(h^3) \) . But this is just the local error approximation. Using Simpson's rule we arrive at the composite rule
$$
I=\int_a^bf(x) dx=\frac{h}{3}\left(f(a) + 4f(a+h) +2f(a+2h)+
\dots +4f(b-h)+ f_{b}\right),
\tag{25}
$$
with a global error which goes like \( O(h^4) \).
In general, an integration algorithm can be written as
$$
I=\int_a^bf(x)dx \approx \sum_{i=1}^N\omega_if(x_i),
$$
where \( \omega \) and \( x \) are the weights and the chosen mesh points, respectively.
Simpson's rule gives
$$
\omega : \left\{h/3,4h/3,2h/3,4h/3,\dots,4h/3,h/3\right\},
$$
for the weights, while the trapezoidal rule resulted in
$$
\omega : \left\{h/2,h,h,\dots,h,h/2\right\}.
$$
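As a sketch, a composite Simpson's rule with the same calling convention as the trapezoidal_rule function above could be written as follows (assuming \( n \) is even); this is an illustration, not one of the course programs.
// Sketch of the composite Simpson's rule (25); n is assumed to be even.
double simpson_rule(double a, double b, int n, double (*func)(double))
{
  double h = (b-a)/((double) n);
  double sum = (*func)(a) + (*func)(b);
  for (int j = 1; j < n; j++) {
    double x = a + j*h;
    sum += (j % 2 == 1) ? 4.0*(*func)(x) : 2.0*(*func)(x);  // weights 4h/3 and 2h/3
  }
  return sum*h/3.0;
}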
In general, an integration formula which is based on a Taylor series using \( N \) points
will integrate exactly a polynomial \( P \) of degree \( N-1 \). That is, the \( N \) weights
\( \omega_n \) can be chosen to satisfy \( N \) linear equations. Such formulas are based on the
interpolation polynomial through the mesh points, which satisfies
$$
p_n(x_j) = y_j\quad j=0,\dots,n.
$$
In the Lagrange representation this interpolation polynomial is given by
$$
p_n = \sum_{k=0}^nl_ky_k,
$$
with the Lagrange factors
$$
l_k(x) = \prod_{\begin{array}{c}i=0 \\ i\ne k\end{array}}^n\frac{x-x_i}{x_k-x_i}\quad k=0,\dots,n
$$
Example: \( n=1 \)
$$
p_1(x) = y_0\frac{x-x_1}{x_0-x_1}+y_1\frac{x-x_0}{x_1-x_0}=\frac{y_1-y_0}{x_1-x_0}x-\frac{y_1x_0+y_0x_1}{x_1-x_0},
$$
which we recognize as the equation for a straight line.
$$
\int_a^bf(x)dx \approx \int_a^bp_n(x)dx = \sum_{k=0}^nw_kf(x_k)
$$
with
$$
w_k = h\frac{(-1)^{n-k}}{k!(n-k)!}\int_0^n\prod_{\begin{array}{c}j=0 \\ j\ne k\end{array}}^n(z-j)dz,
$$
for \( k=0,\dots,n \).
$$
\int_a^bf(x)dx -\frac{b-a}{2}\left[f(a)+f(b)\right]=-\frac{h^3}{12}f^{(2)}(\xi),
$$
and the global error (composite formula)
$$
\int_a^bf(x)dx -T_h(f)=-\frac{b-a}{12}h^2f^{(2)}(\xi).
$$
For Simpson's rule we have
$$
\int_a^bf(x)dx -\frac{b-a}{6}\left[f(a)+4f((a+b)/2)+f(b)\right]=-\frac{h^5}{90}f^{(4)}(\xi),
$$
and the global error
$$
\int_a^bf(x)dx -S_h(f)=-\frac{b-a}{180}h^4f^{(4)}(\xi).
$$
with \( \xi\in[a,b] \).
With the methods above, approximating
$$
f(x)\approx P_{n}(x),
$$
with \( n+1 \) mesh points, we should be able to integrate exactly the
polynomial \( P_{n} \).
Gaussian quadrature methods promise more than this. We can get a better polynomial approximation with order greater than \( n+1 \) to \( f(x) \) and still get away with only \( n+1 \) mesh points. More precisely, we approximate
$$
f(x) \approx P_{2n+1}(x),
$$
and with only \( n+1 \) mesh points these methods promise that
$$
\int f(x)dx \approx \int P_{2n+1}(x)dx=\sum_{i=0}^{n} P_{2n+1}(x_i)\omega_i,
$$
A greater precision for a given amount of numerical work can be achieved if we are willing to give up the requirement of equally spaced integration points. In Gaussian quadrature (hereafter GQ), both the mesh points and the weights are to be determined. The points will not be equally spaced. The theory behind GQ is to obtain an arbitrary weight \( \omega \) through the use of so-called orthogonal polynomials. These polynomials are orthogonal in some interval, say e.g. [-1,1]. Our points \( x_i \) are chosen in some optimal sense subject only to the constraint that they should lie in this interval. Together with the weights we have then \( 2(n+1) \) (\( n+1 \) being the number of points) parameters at our disposal.
$$
I=\int_a^bf(x)dx =\int_a^bW(x)g(x)dx\approx \sum_{i=0}^n\omega_ig(x_i),
$$
where \( g \) is smooth and \( W \) is the weight function, which is to be associated with a given
orthogonal polynomial.
The weight function \( W \) is non-negative in the integration interval \( x\in [a,b] \), and such that \( \int_a^b |x|^n W(x) dx \) is finite for any \( n \ge 0 \). The name weight function arises from the fact that it may be used to give more emphasis to one part of the interval than another.
Weight function | Interval | Polynomial |
---|---|---|
\( W(x)=1 \) | \( x\in [a,b] \) | Legendre |
\( W(x)=e^{-x^2} \) | \( -\infty \le x \le \infty \) | Hermite |
\( W(x)=e^{-x} \) | \( 0 \le x \le \infty \) | Laguerre |
\( W(x)=1/(\sqrt{1-x^2}) \) | \( -1 \le x \le 1 \) | Chebyshev |
For integrals of the form
$$
I=\int_{-1}^{1}f(x)dx,
$$
we use Gauss-Legendre quadrature, based on the Legendre polynomials. These appear as solutions of the differential equation
$$
C(1-x^2)P-m_l^2P+(1-x^2)\frac{d}{dx}\left((1-x^2)\frac{dP}{dx}\right)=0.
$$
\( C \) is a constant. For \( m_l=0 \) we obtain the Legendre polynomials
as solutions, whereas \( m_l \ne 0 \) yields the so-called associated Legendre
polynomials.
The corresponding polynomials \( P \) are
$$
L_k(x)=\frac{1}{2^kk!}\frac{d^k}{dx^k}(x^2-1)^k \quad k=0,1,2,\dots,
$$
which, up to a factor, are the Legendre polynomials \( L_k \).
The latter fulfil the orthogonality relation
$$
\int_{-1}^1L_i(x)L_j(x)dx=\frac{2}{2i+1}\delta_{ij},
$$
and the recursion relation
$$
(j+1)L_{j+1}(x)+jL_{j-1}(x)-(2j+1)xL_j(x)=0.
$$
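As an illustration of how the recursion relation can be used in practice, a small function evaluating \( L_n(x) \) could look as follows; the name legendre is an assumption for this sketch.
// Sketch: evaluate the Legendre polynomial L_n(x) with the recursion relation above.
double legendre(int n, double x)
{
  if (n == 0) return 1.0;
  if (n == 1) return x;
  double Lm1 = 1.0, L = x;   // L_0 and L_1
  for (int j = 1; j < n; j++) {
    // (j+1) L_{j+1} = (2j+1) x L_j - j L_{j-1}
    double Lp1 = ((2.0*j + 1.0)*x*L - (double) j*Lm1)/(j + 1.0);
    Lm1 = L;
    L = Lp1;
  }
  return L;
}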
For integrals of the form
$$
I=\int_0^{\infty}f(x)dx =\int_0^{\infty}x^{\alpha}e^{-x}g(x)dx,
$$
we use Gauss-Laguerre quadrature. The Laguerre polynomials arise from the solution of the differential
equation
$$
\left(\frac{d^2 }{dx^2}-\frac{d }{dx}+\frac{\lambda}{x}-\frac{l(l+1)}{x^2}\right){\cal L}(x)=0,
$$
where \( l \) is an integer \( l\ge 0 \) and \( \lambda \) a constant.
They fulfil the orthogonality relation
$$
\int_{0}^{\infty}e^{-x}{\cal L}_n(x)^2dx=1,
$$
and the recursion relation
$$
(n+1){\cal L}_{n+1}(x)=(2n+1-x){\cal L}_{n}(x)-n{\cal L}_{n-1}(x).
$$
For integrals of the form
$$
I=\int_{-\infty}^{\infty}f(x)dx =\int_{-\infty}^{\infty}e^{-x^2}g(x)dx,
$$
we could use the Hermite polynomials in order to extract weights and mesh points.
The Hermite polynomials are the solutions of the following differential
equation
$$
\frac{d^2H(x)}{dx^2}-2x\frac{dH(x)}{dx}+
(\lambda-1)H(x)=0.
\tag{26}
$$
They fulfil the orthogonality relation
$$
\int_{-\infty}^{\infty}e^{-x^2}H_n(x)^2dx=2^nn!\sqrt{\pi},
$$
and the recursion relation
$$
H_{n+1}(x)=2xH_{n}(x)-2nH_{n-1}(x).
$$
$$ \int_a^bW(x)f(x)dx \approx \sum_{i=0}^n\omega_if(x_i), $$
with \( n+1 \) distinct quadrature points (mesh points) is called a Gaussian quadrature
formula if it integrates all polynomials \( p\in P_{2n+1} \) exactly, that is
$$ \int_a^bW(x)p(x)dx =\sum_{i=0}^n\omega_ip(x_i). $$
It is assumed that \( W(x) \) is continuous and positive and that the integral
$$ \int_a^bW(x)dx , $$
exists. Note that the replacement of \( f\rightarrow Wg \) is normally a better approximation
due to the fact that we may isolate possible singularities of \( W \) and its
derivatives at the endpoints of the interval.
As an example, consider the integral
$$
I= \int_0^{\infty} x\exp{(-x)}\sin{x}\,dx=\frac{1}{2},
$$
which we will evaluate using the brute force trapezoidal rule, Simpson's rule, Gauss-Legendre, Gauss-Laguerre and Gauss-Legendre again but with a smarter mapping.
For the trapezoidal and Simpson's rules the infinite upper limit must be replaced by a finite cutoff \( \Lambda \),
$$ \int_0^{\infty}f(x)dx \approx \int_0^{\Lambda}f(x)dx.
$$
int n;
double a, b, alf, xx;
cout << "Read in the number of integration points" << endl;
cin >> n;
cout << "Read in integration limits" << endl;
cin >> a >> b;
// reserve space in memory for vectors containing the mesh points,
// weights and function values for the use of the Gauss-Legendre method
double *x = new double [n];
double *w = new double [n];
// arrays for the mapped mesh points and weights
double *r = new double [n];
double *s = new double [n];
// Gauss-Laguerre is an old-fashioned translation of F77 --> C++
// arrays start at 1 and end at n
double *xgl = new double [n+1];
double *wgl = new double [n+1];
// set up the mesh points and weights for Gauss-Legendre
gauleg(a, b, x, w, n);
// set up the mesh points and weights for Gauss-Laguerre
alf = 1.0; // <--- Note alf, the power of x in the weight function
gauss_laguerre(xgl, wgl, n, alf);
// evaluate the integral with the Gauss-Legendre method
// Note that we initialize the sum. Here brute force gauleg
double int_gauss = 0.;
for ( int i = 0; i < n; i++){
  int_gauss += w[i]*int_function(x[i]);
}
// evaluate the integral with the Gauss-Laguerre method
// Note that we initialize the sum
double int_gausslag = 0.;
for ( int i = 1; i <= n; i++){
  int_gausslag += wgl[i]*sin(xgl[i]);
}
// evaluate the integral with the Gauss-Legendre method again,
// but now with a mapping of the mesh points to [0,infinity)
// Need to call gauleg from -1 to +1
gauleg(-1.0, 1.0, x, w, n);
double pi_4 = acos(-1.0)*0.25;
for ( int i = 0; i < n; i++){
  xx = pi_4*(x[i]+1.0);
  r[i] = tan(xx);
  s[i] = pi_4/(cos(xx)*cos(xx))*w[i];
}
double int_gausslegimproved = 0.;
for ( int i = 0; i < n; i++){
  int_gausslegimproved += s[i]*int_function(r[i]);
}
As a more realistic example, consider the helium atom with the ansatz for the two-electron wave function
$$
\Psi({\bf r}_1,{\bf r}_2) = e^{-\alpha (r_1+r_2)}.
$$
Note that it is not possible to find a closed-form solution to Schrödinger's equation for
two interacting electrons in the helium atom.
The integral we need to solve is the quantum mechanical expectation value of the correlation energy between two electrons, namely
$$
\begin{equation}\label{eq:correlationenergy}
\langle \frac{1}{|{\bf r}_1-{\bf r}_2|} \rangle =
\int_{-\infty}^{\infty} d{\bf r}_1d{\bf r}_2 e^{-2\alpha (r_1+r_2)}\frac{1}{|{\bf r}_1-{\bf r}_2|}=\frac{5\pi^2}{16^2}=0.192765711.
\end{equation}
$$
Note that our wave function is not normalized. There is a normalization factor missing.
double *x = new double [N];
double *w = new double [N];
// set up the mesh points and weights
gauleg(a,b,x,w, N);
// evaluate the integral with the Gauss-Legendre method
// Note that we initialize the sum
double int_gauss = 0.;
for (int i=0;i<N;i++){
for (int j = 0;j<N;j++){
for (int k = 0;k<N;k++){
for (int l = 0;l<N;l++){
for (int m = 0;m<N;m++){
for (int n = 0;n<N;n++){
int_gauss+=w[i]*w[j]*w[k]*w[l]*w[m]*w[n]
*int_function(x[i],x[j],x[k],x[l],x[m],x[n]);
}}}}}
}
// this function defines the function to integrate
double int_function(double x1, double y1, double z1, double x2, double y2, double z2)
{
double alpha = 2.;
// evaluate the different terms of the exponential
double exp1=-2*alpha*sqrt(x1*x1+y1*y1+z1*z1);
double exp2=-2*alpha*sqrt(x2*x2+y2*y2+z2*z2);
double deno=sqrt(pow((x1-x2),2)+pow((y1-y2),2)+pow((z1-z2),2));
if(deno <pow(10.,-6.)) { return 0;}
else return exp(exp1+exp2)/deno;
} // end of function to evaluate
A more efficient approach is to change to spherical coordinates, with the volume element
$$
d{\bf r}_1d{\bf r}_2 = r_1^2dr_1 r_2^2dr_2 d\cos(\theta_1)d\cos(\theta_2)d\phi_1d\phi_2,
$$
and
$$
\frac{1}{r_{12}}= \frac{1}{\sqrt{r_1^2+r_2^2-2r_1r_2\cos(\beta)}}
$$
with
$$
\cos(\beta) = \cos(\theta_1)\cos(\theta_2)+\sin(\theta_1)\sin(\theta_2)\cos(\phi_1-\phi_2)
$$
The integral then becomes
$$
\int_0^{\infty} r_1^2dr_1 \int_0^{\infty}r_2^2dr_2 \int_0^{\pi}d\cos(\theta_1)\int_0^{\pi}d\cos(\theta_2)\int_0^{2\pi}d\phi_1\int_0^{2\pi}d\phi_2 \times
$$
$$
\frac{e^{-2\alpha (r_1+r_2)}}{\sqrt{r_1^2+r_2^2-2r_1r_2\left(\cos(\theta_1)\cos(\theta_2)+\sin(\theta_1)\sin(\theta_2)\cos(\phi_1-\phi_2)\right)}}
$$
The above methods are all based on a defined step length, normally provided by the user, dividing the integration domain into a fixed number of subintervals. This is rather simple to implement but may be inefficient, in particular if the integrand varies considerably in certain areas of the integration domain. In these areas the number of fixed integration points may not be adequate. In other regions, the integrand may vary slowly and fewer integration points may be needed.
Step 1.
We compute our first approximation by computing the integral for the full domain. We label this as \( I^{(0)} \). It is obtained by calling our previously discussed function trapezoidal_rule
as
I0 = trapezoidal_rule(a, b, n, function);
Step 2. We split the integration in two, with \( c= (a+b)/2 \). We compute then the two integrals \( I^{(1L)} \) and \( I^{(1R)} \)
I1L = trapezoidal_rule(a, c, n, function);
and
I1R = trapezoidal_rule(c, b, n, function);
For a given tolerance, a small number that we provide, we check whether \( |I^{(1L)}+I^{(1R)}-I^{(0)}| < \mbox{tolerance} \). If this test is satisfied, our first approximation is satisfactory. If not, we can set up a recursive procedure where the integral is split into subsequent subintervals until our tolerance is satisfied.
// Simple recursive function that implements the
// adaptive integration using the trapezoidal rule
#include <cmath>
#include <iostream>
using namespace std;

const int maxrecursions = 50;
const double tolerance = 1.0E-10;
// Takes as input the integration limits, a pointer to the accumulated integral,
// the number of points, the number of recursive steps taken so far and the function to integrate
void adaptive_integration(double a, double b, double *Integral, int n, int steps, double (*func)(double))
{
  if ( steps > maxrecursions){
    cout << "Too many recursive steps, the function varies too much" << endl;
    return;
  }
  double c = (a+b)*0.5;
  // the whole integral
  double I0 = trapezoidal_rule(a, b, n, func);
  // the left half
  double I1L = trapezoidal_rule(a, c, n, func);
  // the right half
  double I1R = trapezoidal_rule(c, b, n, func);
  if (fabs(I1L+I1R-I0) < tolerance ) *Integral += I0;
  else
  {
    adaptive_integration(a, c, Integral, n, steps+1, func);
    adaptive_integration(c, b, Integral, n, steps+1, func);
  }
} // end function adaptive_integration
The variables Integral and steps should be initialized to zero by the function that calls the adaptive procedure.
Task parallelism: the work of a global problem can be divided into a number of independent tasks, which rarely need to synchronize. Monte Carlo simulations or integrations are examples of this. It is almost embarrassingly trivial to parallelize Monte Carlo and numerical integration codes.
We will use MPI (Message Passing Interface). MPI is a message-passing library where all the routines have a corresponding C/C++ binding
MPI_Command_name
and Fortran-binding (routine names are in uppercase, but can also be in lower case)
MPI_COMMAND_NAME
MPI is a specification, not a particular implementation. MPI programs should be able to run on all possible machines and with all MPI implementations without change.
An MPI computation is a collection of processes communicating with messages.
See chapter 5.5 of lecture notes for more details.
The development of the capacity of single-processor computers can hardly keep up with the pace of scientific computing.
The discussion in these slides focuses on the C++ binding.
The most important MPI routines are:
* MPI_Init - initiate an MPI computation
* MPI_Finalize - terminate the MPI computation and clean up
* MPI_Comm_size - how many processes participate in a given MPI communicator?
* MPI_Comm_rank - which one am I? (A number between 0 and size-1.)
* MPI_Reduce (or MPI_Allreduce) - collect data from all processes and either sum them up in one process or in all (Allreduce); useful for numerical integration
* MPI_Send - send a message to a particular process within an MPI communicator
* MPI_Recv - receive a message from a particular process within an MPI communicator
The default communicator MPI_COMM_WORLD contains all the MPI processes.
#include <mpi.h>
#include <iostream>
using namespace std;

int main (int nargs, char* args[])
{
  int numprocs, my_rank;
  // MPI initializations
  MPI_Init (&nargs, &args);
  MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
  MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
  cout << "Hello world, I have rank " << my_rank << " out of "
       << numprocs << endl;
  // End MPI
  MPI_Finalize ();
  return 0;
}
PROGRAM hello
INCLUDE "mpif.h"
INTEGER:: size, my_rank, ierr
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
WRITE(*,*)"Hello world, I've rank ",my_rank," out of ",size
CALL MPI_FINALIZE(ierr)
END PROGRAM hello
The next example orders the output with the help of MPI_Barrier:
int main (int nargs, char* args[])
{
  int numprocs, my_rank, i;
  MPI_Init (&nargs, &args);
  MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
  MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
  for (i = 0; i < numprocs; i++) {
    MPI_Barrier (MPI_COMM_WORLD);
    if (i == my_rank) {
      cout << "Hello world, I have rank " << my_rank <<
        " out of " << numprocs << endl;
    }
  }
  MPI_Finalize ();
  return 0;
}
We use the MPI_Barrier function to ensure that every process has completed its set of instructions in a particular order.
A barrier is a special collective operation that does not allow the processes to continue
until all processes in the communicator (here MPI_COMM_WORLD) have called MPI_Barrier.
The barriers make sure that all processes have reached the same point in the code. Many of the collective operations,
like MPI_ALLREDUCE to be discussed later, have the same property; viz., no process can exit the operation
until all processes have started.
However, this is slightly more time-consuming since the processes synchronize between themselves as many times as there
are processes. In the next Hello World example we use the send and receive functions in order to have a synchronized
action.
Step 1. Compile with mpicxx or mpic++
Step 2. Set up collaboration between processes and run
mpd --ncpus=4 & # run code with mpiexec -n 4 ./nameofprog
Here we declare that we will use 4 processes via the -ncpus
option and via -n 4
when running.
Step 3. End with
mpdallexit
The next example makes use of the functions MPI_Recv and MPI_Send:
.....
int numprocs, my_rank, flag;
MPI_Status status;
MPI_Init (&nargs, &args);
MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
if (my_rank > 0)
MPI_Recv (&flag, 1, MPI_INT, my_rank-1, 100,
MPI_COMM_WORLD, &status);
cout << "Hello world, I have rank " << my_rank << " out of "
<< numprocs << endl;
if (my_rank < numprocs-1)
MPI_Send (&my_rank, 1, MPI_INT, my_rank+1,
100, MPI_COMM_WORLD);
MPI_Finalize ();
The basic sending of messages is given by the function MPI_Send, which in C/C++
is defined as
int MPI_Send(void *buf, int count,
             MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
This single command allows the passing of any kind of variable, even a large array, to any group of tasks. The variable buf is the variable we wish to send while count is the number of variables we are passing. If we are passing only a single value, this should be 1. If we transfer an array, it is the overall size of the array. For example, if we want to send a 10 by 10 array, count would be \( 10\times 10=100 \) since we are actually passing 100 values.
The function MPI_Recv is similar to the send call.
int MPI_Recv( void *buf, int count, MPI_Datatype datatype,
              int source,
              int tag, MPI_Comm comm, MPI_Status *status )
The arguments that are different from those in MPI_Send are
buf, which is the name of the variable where you will be storing the received data, and
source, which replaces the destination in the send command; this is the rank of the sender.
Finally, we have used MPI_Status status, where one can check if the receive was completed.
The output of this code is the same as the previous example, but now process 0 sends a message to process 1, which forwards it further to process 2, and so forth.
Armed with this wisdom, and having performed all Hello World greetings, we are now ready for serious work.
Examples: the composite trapezoidal rule reads
$$
I=\int_a^bf(x) dx=h\left(f(a)/2 + f(a+h) +f(a+2h)+
\dots +f(b-h)+ f_{b}/2\right).
$$
Another very simple approach is the so-called midpoint or rectangle method. In this case the integration area is split into a given number of rectangles with length \( h \) and height given by the midpoint value of the function. This gives the following simple rule for approximating an integral
$$
I=\int_a^bf(x) dx \approx h\sum_{i=1}^N f(x_{i-1/2}),
$$
where \( f(x_{i-1/2}) \) is the midpoint value of \( f \) for a given rectangle. This is used in program5.cpp.
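A minimal sketch of such a midpoint rule function, with the same calling convention as trapezoidal_rule (not necessarily identical to program5.cpp), could be:
// Sketch of the midpoint (rectangle) rule.
double midpoint_rule(double a, double b, int n, double (*func)(double))
{
  double h = (b-a)/((double) n);
  double sum = 0.;
  for (int i = 1; i <= n; i++) {
    sum += (*func)(a + (i - 0.5)*h);   // midpoint of rectangle number i
  }
  return sum*h;
}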
In order to collect the contributions from all processes we can use either MPI_Reduce or MPI_Allreduce.
The first function takes information from all processes and sends the result of the MPI operation to one process only,
typically the master node. If we use MPI_Allreduce, the result is sent back to all processes, a feature which is
useful when all nodes need the value of a joint operation. We limit ourselves to MPI_Reduce since it is only one
process which will print out the final number of our calculation. The arguments to MPI_Allreduce are the same.
The function MPI_Reduce has the call syntax
MPI_Reduce( void *senddata, void* resultdata, int count,
            MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
The two variables senddata and resultdata are obvious, besides the fact that one sends the address
of the variable or the first element of an array. If they are arrays they need to have the same size.
The variable count represents the total dimensionality, 1 in the case of just one variable, while
MPI_Datatype defines the type of variable which is sent and received.
The new feature is MPI_Op. It defines the type of operation we want to perform. In our case, since we are summing
the rectangle contributions from every process, we define MPI_Op = MPI_SUM.
If we have an array or matrix we can search for the largest or smallest element by sending either MPI_MAX or
MPI_MIN. If we want the location as well (which array element) we simply transfer
MPI_MAXLOC or MPI_MINLOC. If we want the product we write MPI_PROD.
MPI_Allreduce is defined as
MPI_Allreduce( void *senddata, void* resultdata, int count,
               MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
// Trapezoidal rule and numerical integration using MPI, example program6.cpp
using namespace std;
#include <mpi.h>
#include <iostream>
// Here we define various functions called by the main program
double int_function(double );
double trapezoidal_rule(double , double , int , double (*)(double));
// Main function begins here
int main (int nargs, char* args[])
{
int n, local_n, numprocs, my_rank;
double a, b, h, local_a, local_b, total_sum, local_sum;
double time_start, time_end, total_time;
// MPI initializations
MPI_Init (&nargs, &args);
MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
time_start = MPI_Wtime();
// Fixed values for a, b and n
a = 0.0 ; b = 1.0; n = 1000;
h = (b-a)/n; // h is the same for all processes
local_n = n/numprocs;
// make sure n > numprocs, else integer division gives zero
// Length of each process' interval of
// integration = local_n*h.
local_a = a + my_rank*local_n*h;
local_b = local_a + local_n*h;
total_sum = 0.0;
local_sum = trapezoidal_rule(local_a, local_b, local_n,
&int_function);
MPI_Reduce(&local_sum, &total_sum, 1, MPI_DOUBLE,
MPI_SUM, 0, MPI_COMM_WORLD);
time_end = MPI_Wtime();
total_time = time_end-time_start;
if ( my_rank == 0) {
  cout << "Trapezoidal rule = " << total_sum << endl;
  cout << "Time = " << total_time
       << " on number of processors: " << numprocs << endl;
}
// End MPI
MPI_Finalize ();
return 0;
} // end of main program
Note the use of MPI_Reduce to collect data from each process, and of the function MPI_Wtime to time the calculation. The final functions are
// this function defines the function to integrate
double int_function(double x)
{
double value = 4./(1.+x*x);
return value;
} // end of function to evaluate
// this function defines the trapezoidal rule
double trapezoidal_rule(double a, double b, int n,
                        double (*func)(double))
{
  double trapez_sum;
  double fa, fb, x, step;
  int j;
  step = (b-a)/((double) n);
  fa = (*func)(a)/2. ;
  fb = (*func)(b)/2. ;
  trapez_sum = 0.;
  for (j=1; j <= n-1; j++){
    x = j*step+a;
    trapez_sum += (*func)(x);
  }
  trapez_sum = (trapez_sum+fb+fa)*step;
  return trapez_sum;
} // end trapezoidal_rule
We now need some elements of probability theory. For a discrete PDF the expectation value of \( x^k \) is
$$
E[x^k]= \langle x^k\rangle=\frac{1}{N}\sum_{i=1}^{N}x_i^kp(x_i),
$$
provided that the sum (or integral) converges absolutely (viz.,
\( \sum_{i=1}^{N}|x_i^kp(x_i)| \) converges). For a continuous PDF we have correspondingly
$$
E[x^k]=\langle x^k\rangle=\int_a^b x^kp(x)dx,
$$
Function \( f(x) \)
$$
E[f^k]=\langle f^k\rangle=\int_a^b f^kp(x)dx,
$$
Variance
$$
\sigma^2_f=E[f^2]-(E[f])^2=\langle f^2\rangle-\langle f\rangle^2
$$
An important example of a PDF is the uniform distribution
$$
p(x)=\frac{1}{b-a}\Theta(x-a)\Theta(b-x),
$$
which gives for \( a=0,b=1 \) \( p(x)=1 \) for \( x\in [0,1] \) and zero otherwise.
Another is the exponential distribution
$$
p(x)=\alpha e^{-\alpha x},
$$
with probability different from zero in \( [0,\infty) \),
and the normal distribution (Gaussian)
$$
p(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp{\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)},
$$
with probability different from zero in \( (-\infty,\infty) \).
All random number generators use the uniform distribution for \( x\in[0,1] \).
As a motivation for Monte Carlo integration, consider the quantum mechanical expectation value of the Hamiltonian for a system of \( N \) particles,
$$
\langle H \rangle =
\frac{\int d{\bf R}_1d{\bf R}_2\dots d{\bf R}_N
\Psi^{\ast}({\bf R_1},{\bf R}_2,\dots,{\bf R}_N)
H({\bf R_1},{\bf R}_2,\dots,{\bf R}_N)
\Psi({\bf R_1},{\bf R}_2,\dots,{\bf R}_N)}
{\int d{\bf R}_1d{\bf R}_2\dots d{\bf R}_N
\Psi^{\ast}({\bf R_1},{\bf R}_2,\dots,{\bf R}_N)
\Psi({\bf R_1},{\bf R}_2,\dots,{\bf R}_N)},
$$
which is in general an intractable problem.
This integral is actually the starting point of a Variational Monte Carlo calculation. Gaussian quadrature: forget it! Given 10 particles and 10 mesh points for each degree of freedom and an ideal 1 petaflops machine (all operations take the same time), how long will it take to compute the above integral? For comparison, the lifetime of the universe is \( T\approx 4.7 \times 10^{17} \) s.
As another example, consider Schrödinger's equation for a nucleus,
$$
\hat{H}\Psi({\bf r}_1,..,{\bf r}_A,\alpha_1,..,\alpha_A)=E\Psi({\bf r}_1,..,{
\bf r}_A,\alpha_1,..,\alpha_A)
$$
where
$$
{\bf r}_1,..,{\bf r}_A,
$$
are the coordinates and
$$
\alpha_1,..,\alpha_A,
$$
are sets of relevant quantum numbers such as spin and isospin for a system of
\( A \) nucleons (\( A=N+Z \), \( N \) being the number of neutrons and \( Z \) the number of protons).
We would then need to solve
$$
2^A\times \left(\begin{array}{c} A\\ Z\end{array}\right)
$$
coupled second-order differential equations in \( 3A \) dimensions.
For a nucleus like $^{10}$Be this number is 215040. This is a truly challenging many-body problem.
$$
I=\int_0^1 f(x)dx\approx \frac{1}{N}\sum_{i=1}^Nf(x_i),
$$
can be rewritten using the concept of the average of the function \( f \) for a given PDF \( p(x) \) as
$$
E[f]=\langle f \rangle = \frac{1}{N}\sum_{i=1}^Nf(x_i)p(x_i),
$$
and identify \( p(x) \) with the uniform distribution, viz
$ p(x)=1$ when \( x\in [0,1] \) and zero for all other values of \( x \).
The integral
is then the average of \( f \) over the interval \( x \in [0,1] \)
$$
I=\int_0^1 f(x)dx\approx E[f]=\langle f \rangle.
$$
$$
\sigma^2_f=\frac{1}{N}\sum_{i=1}^N(f(x_i)-\langle f\rangle)^2p(x_i),
$$
and inserting the uniform distribution this yields
$$
\sigma^2_f=\frac{1}{N}\sum_{i=1}^Nf(x_i)^2-
\left(\frac{1}{N}\sum_{i=1}^Nf(x_i)\right)^2,
$$
or
$$
\sigma^2_f=E[f^2]-(E[f])^2=\left(\langle f^2\rangle -
\langle f \rangle^2\right).
$$
This is nothing but a measure of the extent to
which \( f \) deviates from its average over the region of integration.
The standard deviation is defined as the square root of the variance.
$$
\langle I \rangle_M=\frac{1}{M}\sum_{l=1}^{M}\langle f\rangle_l.
$$
If we can consider the probability of
correlated events to be zero, we can rewrite
the variance of these series of measurements as (equating \( M=N \))
$$
\sigma^2_N\approx \frac{1}{N}\left(\langle f^2\rangle -
\langle f \rangle^2\right)=\frac{\sigma^2_f}{N}.
$$
We note that the standard deviation is proportional with the inverse square root of the number of measurements
$$
\sigma_N \sim \frac{1}{\sqrt{N}}.
$$
The aim in Monte Carlo calculations is to have \( \sigma_N \) as small as possible after \( N \) samples.
The result from one sample represents,
since we are using concepts from statistics,
a 'measurement'.
* Comparing this with traditional methods shows that Monte Carlo integration is more efficient than a method of order \( k \) when the dimension \( d > 2k \).
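A minimal sketch of brute force Monte Carlo integration over \( [0,1] \), using the C++ standard library generator instead of the ran0 function used elsewhere in these notes; the names and the fixed seed are illustrative only.
// Sketch: brute force Monte Carlo integration of f over [0,1].
#include <cmath>
#include <random>

void mc_integrate(double (*f)(double), int N, double &integral, double &sigma)
{
  std::mt19937_64 gen(12345);
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  double sum = 0.0, sum2 = 0.0;
  for (int i = 0; i < N; i++) {
    double fx = f(uniform(gen));
    sum  += fx;
    sum2 += fx*fx;
  }
  double mean  = sum/N;                        // <f>
  double mean2 = sum2/N;                       // <f^2>
  integral = mean;
  sigma = std::sqrt((mean2 - mean*mean)/N);    // sigma_N = sigma_f/sqrt(N)
}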
As a simple physics example, consider \( N \) particles in a box divided into two halves by a wall with a small hole, with all particles initially in the left half. After some time the system reaches its equilibrium state with equally many particles in both halves, \( N/2 \). Instead of determining complicated initial conditions for a system of \( N \) particles, we model the system by a simple statistical model. In order to simulate this system, which may consist of \( N \gg 1 \) particles, we assume that all particles in the left half have equal probabilities of going to the right half.
// setup of initial conditions
nleft = initial_n_particles;
max_time = 10*initial_n_particles;
idum = -1;
// sampling over number of particles
for ( time = 0; time <= max_time; time++){
  // pick a particle at random; if it sits in the left half, move it right
  random_n = ((int) (initial_n_particles*ran0(&idum)));
  if ( random_n <= nleft){
    nleft -= 1;
  }
  else{
    nleft += 1;
  }
  ofile << setiosflags(ios::showpoint | ios::uppercase);
  ofile << setw(15) << time;
  ofile << setw(15) << nleft << endl;
} // end of loop over time steps
$$
dN(t)=-\omega N(t)dt,
$$
whose solution is
$$
N(t)=N(0)e^{-\omega t},
$$
where we have defined the mean lifetime \( \tau \) of \( X \) as
$$
\tau =\frac{1}{\omega}.
$$
If a nucleus \( X \) decays to a daughter nucleus \( Y \) which also can decay, we get the following coupled equations
$$
\frac{dN_X(t)}{dt}=-\omega_XN_X(t),
$$
and
$$
\frac{dN_Y(t)}{dt}=-\omega_YN_Y(t)+\omega_XN_X(t).
$$
$$
\frac{\Delta N(t)}{N(t)\Delta t}= -\lambda
$$
where \( \lambda \) is inversely proportional to the lifetime.
idum=-1; // initialise random number generator
// loop over monte carlo cycles
// One monte carlo loop is one sample
for (cycles = 1; cycles <= number_cycles; cycles++){
  n_unstable = initial_n_particles;
  // accumulate the number of particles per time step per trial
  ncumulative[0] += initial_n_particles;
  // loop over each time step
  for (time=1; time <= max_time; time++){
    // for each time step, we check each particle
    particle_limit = n_unstable;
    for ( np = 1; np <= particle_limit; np++) {
      if ( ran0(&idum) <= decay_probability) {
        n_unstable = n_unstable - 1;
      }
    } // end of loop over particles
    ncumulative[time] += n_unstable;
  } // end of loop over time steps
} // end of loop over MC trials
} // end mc_sampling function
$$
p(x) \le M \quad x\in [a,b].
$$
Then we generate a random number \( x \) from the uniform distribution
for \( x\in [a,b] \) and a corresponding number \( s \) for the uniform
distribution between \( [0,M] \).
If
$$
p(x) \ge s,
$$
we accept the new value of \( x \), else we generate
again two new random numbers \( x \) and \( s \) and perform the test
in the latter equation again.
$$
I=\int_0^3\exp{(x)}dx.
$$
Obviously, deriving this integral analytically is much easier; however, other integrands may pose more
difficult challenges. The aim here is simply to show how to implement the acceptance-rejection algorithm.
The integral is the area below the curve \( f(x)=\exp{(x)} \). If we uniformly fill the rectangle
spanned by \( x\in [0,3] \) and \( y\in [0,\exp{(3)}] \), the fraction of points below the curve, obtained from a uniform distribution and
multiplied by the area of the rectangle, should approximate the chosen integral. It is rather
easy to implement this numerically, as shown in the following code.
// Loop over Monte Carlo trials n
integral = 0.;  s = 0.;
for ( int i = 1; i <= n; i++){
  // Finds a random value for x in the interval [0,3]
  x = 3*ran0(&idum);
  // Finds y-value between [0,exp(3)]
  y = exp(3.0)*ran0(&idum);
  // if the value of y is below the curve exp(x), we accept
  // THIS IS OUR SAMPLING RULE
  if ( y < exp(x)) s += 1.0;
}
// The integral is the area enclosed below the line f(x)=exp(x):
// we multiply the acceptance fraction s/n with the area of the rectangle
integral = 3.*exp(3.)*s/n;
$$
I=\int_0^1 f(x)dx\approx \frac{1}{N}\sum_{i=1}^Nf(x_i),
$$
$$
I=\int_0^1 f(x)dx\approx E[f]=\langle f \rangle.
$$
$$
\sigma^2_f=\frac{1}{N}\sum_{i=1}^Nf(x_i)^2-
\left(\frac{1}{N}\sum_{i=1}^Nf(x_i)\right)^2,
$$
or
$$
\sigma^2_f=E[f^2]-(E[f])^2=\left(\langle f^2\rangle -
\langle f \rangle^2\right).
$$
// crude mc function to calculate pi
int i, n;
long idum;
double crude_mc, x, sum_sigma, fx, variance;
cout << "Read in the number of Monte-Carlo samples" << endl;
cin >> n;
crude_mc = sum_sigma=0. ; idum=-1 ;
// evaluate the integral with a crude Monte-Carlo method
for ( i = 1; i <= n; i++){
  x = ran0(&idum);
  fx = func(x);
  crude_mc += fx;
  sum_sigma += fx*fx;
}
crude_mc = crude_mc/((double) n );
sum_sigma = sum_sigma/((double) n );
variance=sum_sigma-crude_mc*crude_mc;
// crude mc function to calculate pi
int main()
{
const int n = 1000000;
double x, fx, pi, invers_period, pi2;
int i;
invers_period = 1./RAND_MAX;
srand(time(NULL));
pi = pi2 = 0.;
  for (i = 0; i < n; i++)
  {
    x = double(rand())*invers_period;
    // This is our sampling rule, all points accepted
    fx = 4./(1+x*x);
    pi += fx;
    pi2 += fx*fx;
  }
  pi /= n; pi2 = pi2/n - pi*pi;
  cout << "pi=" << pi << " sigma^2=" << pi2 << endl;
  return 0;
} // end main
long idum;
idum=-1 ;
.....
x=ran0(&idum);
....
or
...
invers_period = 1./RAND_MAX;
srand(time(NULL));
...
x = double(rand())*invers_period;
$$
p(x)dx=\left\{\begin{array}{cc} dx & 0 \le x \le 1\\
0 & else\end{array}\right.
$$
with \( p(x)=1 \) and
satisfying
$$
\int_{-\infty}^{\infty}p(x)dx=1.
$$
All random number generators provided in the program library
generate numbers in this domain.
When we attempt a transformation to a new variable \( x\rightarrow y \) we have to conserve the probability
$$
p(y)dy=p(x)dx,
$$
which for the uniform distribution implies
$$
p(y)dy=dx.
$$
$$
x(y)=\int_0^y p(y')dy',
$$
which is nothing but the cumulative distribution of \( p(y) \), i.e.,
$$
x(y)=P(y)=\int_0^y p(y')dy'.
$$
This is an important result which has consequences for eventual improvements over the brute force Monte Carlo.
$$
p(y)dy=\left\{\begin{array}{cc} \frac{dy}{b-a} & a \le y \le b\\
0 & else\end{array}\right.
$$
If we wish to relate this distribution to the one in the interval
\( x \in [0,1] \)
we have
$$
p(y)dy=\frac{dy}{b-a}=dx,
$$
and integrating we obtain the cumulative function
$$
x(y)=\int_a^y \frac{dy'}{b-a},
$$
yielding
$$
y=a+(b-a)x,
$$
a well-known result!
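As a minimal, self-contained sketch of this mapping (using the standard C++ <random> library for the uniform deviates instead of the ran0 function used elsewhere in these notes, and with purely illustrative values of \( a \) and \( b \)), one could write:
// Sketch: map uniform deviates x in [0,1] to y = a + (b-a)x in [a,b];
// the sample mean should then approach (a+b)/2.
#include <iostream>
#include <random>
int main()
{
  const double a = -2.0, b = 3.0;
  const int n = 1000000;
  std::mt19937 gen(10);
  std::uniform_real_distribution<double> uniform01(0.0, 1.0);
  double sum = 0.;
  for (int i = 0; i < n; i++) sum += a + (b-a)*uniform01(gen);
  std::cout << "sample mean = " << sum/n << "  exact = " << 0.5*(a+b) << std::endl;
  return 0;
}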
$$
p(y)=e^{-y},
$$
which is the exponential distribution, important for the analysis
of e.g., radioactive decay. Again,
\( p(x) \) is given by the uniform distribution with
\( x \in [0,1] \), and
with the assumption that the probability is conserved we have
$$
p(y)dy=e^{-y}dy=dx,
$$
which yields after integration
$$
x(y)=P(y)=\int_0^y \exp{(-y')}dy'=1-\exp{(-y)},
$$
or
$$
y(x)=-ln(1-x).
$$
This gives us the new random variable \( y \) in the domain
\( y \in [0,\infty) \)
determined through the random variable \( x \in [0,1] \) generated by
our favorite random generator.
$$
I=\int_0^{\infty}F(y)dy=\int_0^{\infty}\exp{(-y)}G(y)dy
$$
which we rewrite as
$$
\int_0^{\infty}\exp{(-y)}G(y)dy=
\int_0^{\infty}\frac{dx}{dy}G(y)dy\approx
\frac{1}{N}\sum_{i=1}^NG(y(x_i)),
$$
where \( x_i \) is a random number in the interval
[0,1].
Note that in practical implementations, our random number generators for the uniform distribution never return exactly 0 or 1, but we may come very close. We should thus in principle set \( x\in (0,1) \).
.....
idum=-1;
x=ran0(&idum);
y=-log(1.-x);
.....
$$
p(y)dy=\frac{dy}{(a+by)^n},
$$
with \( n > 1 \). It is normalizable, positive definite, analytically
integrable and the integral is invertible, allowing thereby
the expression of a new variable in terms of the old one.
The integral
$$
\int_0^{\infty} \frac{dy}{(a+by)^n}=\frac{1}{(n-1)ba^{n-1}},
$$
gives
$$
p(y)dy=\frac{(n-1)ba^{n-1}}{(a+by)^n}dy,
$$
which in turn gives the cumulative function
$$
x(y)=P(y)=\int_0^y \frac{(n-1)ba^{n-1}}{(a+by')^n}dy'=1-\left(\frac{a}{a+by}\right)^{n-1},
$$
resulting in
$$
y=\frac{a}{b}\left((1-x)^{-1/(n-1)}-1\right).
$$
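A small sketch of this inversion is given below, with the illustrative (hypothetical) choice \( a=b=1 \) and \( n=3 \), for which the exact mean of the distribution is \( a/(b(n-2))=1 \); the standard C++ <random> library is used for the uniform deviates.
// Sketch: inverse-transform sampling of p(y) = (n-1)*b*a^(n-1)/(a+b*y)^n
#include <cmath>
#include <iostream>
#include <random>
int main()
{
  const double a = 1.0, b = 1.0, npow = 3.0;   // illustrative parameters
  std::mt19937 gen(2024);
  std::uniform_real_distribution<double> uniform01(0.0, 1.0);
  double sum = 0.;
  const int samples = 1000000;
  for (int i = 0; i < samples; i++) {
    double x = uniform01(gen);                                   // uniform deviate
    double y = (a/b)*(std::pow(1.0 - x, -1.0/(npow-1.0)) - 1.0); // inverted CDF
    sum += y;
  }
  std::cout << "sample mean = " << sum/samples << " (exact 1 for n=3)" << std::endl;
  return 0;
}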
$$
g(x,y)=\exp{(-(x^2+y^2)/2)}dxdy.
$$
It is rather difficult to find an inverse since the cumulative
distribution is given by the error function \( \mathrm{erf}(x) \).
If we however switch to polar coordinates, we have for \( x \) and \( y \)
$$
r=\left(x^2+y^2\right)^{1/2} \quad
\theta =\tan^{-1}\frac{y}{x},
$$
resulting in
$$
g(r,\theta)=r\exp{(-r^2/2)}drd\theta,
$$
where the angle \( \theta \) could be given by a uniform
distribution in the region \( [0,2\pi] \).
Following example 1 above, this implies simply
multiplying random numbers
\( x\in [0,1] \) by \( 2\pi \).
$$
u=\frac{1}{2}r^2,
$$
and define a PDF
$$
\exp{(-u)}du,
$$
with \( u\in [0,\infty) \).
Using the results from example 2, we have that
$$
u=-ln(1-x'),
$$
where \( x' \) is a random number generated for \( x'\in [0,1] \).
With
$$
x=r\cos(\theta)=\sqrt{2u}\cos(\theta),
$$
and
$$
y=r\sin(\theta)=\sqrt{2u}\sin(\theta),
$$
we can obtain new random numbers \( x,y \) through
$$
x=\sqrt{-2ln(1-x')}\cos(\theta),
$$
and
$$
y=\sqrt{-2ln(1-x')}\sin(\theta),
$$
with \( x'\in [0,1] \) and \( \theta \in 2\pi [0,1] \).
.....
idum=-1;
radius = sqrt(-2*log(1.-ran0(&idum)));
theta = 2*pi*ran0(&idum);
x=radius*cos(theta);
y=radius*sin(theta);
.....
// random numbers with gaussian distribution
double gaussian_deviate(long * idum)
{
  static int iset = 0;
  static double gset;
  double fac, rsq, v1, v2;
  if ( *idum < 0) iset = 0;
  if (iset == 0) {
    do {
      v1 = 2.*ran0(idum) - 1.0;
      v2 = 2.*ran0(idum) - 1.0;
      rsq = v1*v1 + v2*v2;
    } while (rsq >= 1.0 || rsq == 0.);
    fac = sqrt(-2.*log(rsq)/rsq);
    gset = v1*fac;
    iset = 1;
    return v2*fac;
  } else {
    iset = 0;
    return gset;
  }
} // end function gaussian_deviate
Let us assume that \( p(y) \) is a PDF whose behavior resembles that of a function \( F \) defined in a certain interval \( [a,b] \). The normalization condition is
$$
\int_a^bp(y)dy=1.
$$
We can rewrite our integral as
$$
\begin{equation}
I=\int_a^b F(y) dy =\int_a^b p(y)\frac{F(y)}{p(y)} dy.
\tag{27}
\end{equation}
$$
$$
x(y)=\int_a^y p(y')dy',
$$
where we used
$$
p(x)dx=dx=p(y)dy.
$$
If we can invert \( x(y) \), we find \( y(x) \) as well.
$$
I=\int_a^b p(y)\frac{F(y)}{p(y)} dy=\int_a^b\frac{F(y(x))}{p(y(x))} dx,
$$
meaning that a Monte Carlo evaluation of the above integral gives
$$
\int_a^b\frac{F(y(x))}{p(y(x))} dx=
\frac{1}{N}\sum_{i=1}^N\frac{F(y(x_i))}{p(y(x_i))}.
$$
The advantage of such a change of variables in case \( p(y) \) follows
closely \( F \) is that the integrand becomes smooth and we can sample
over relevant values for the integrand. It is however not trivial
to find such a function \( p \).
The conditions on \( p \) which allow us to perform these transformations
are that it is normalizable and positive definite, that it is analytically integrable and that the integral is invertible. The algorithm is then as follows:
Use the uniform distribution to find the random variable \( y \) in the interval [0,1], where \( p(x) \) is a user provided PDF.
Evaluate thereafter
$$
I=\int_a^b F(x) dx =\int_a^b p(x)\frac{F(x)}{p(x)} dx,
$$
by rewriting
$$
\int_a^b p(x)\frac{F(x)}{p(x)} dx =
\int_a^b\frac{F(x(y))}{p(x(y))} dy,
$$
since
$$
\frac{dy}{dx}=p(x).
$$
Perform then a Monte Carlo sampling for
$$
\int_a^b\frac{F(x(y))}{p(x(y))} dy,\approx
\frac{1}{N}\sum_{i=1}^N\frac{F(x(y_i))}{p(x(y_i))},
$$
with \( y_i\in [0,1] \).
Finally, evaluate the variance.
$$
I=\int_0^1 F(x) dx = \int_0^1 \frac{1}{1+x^2} dx = \frac{\pi}{4}.
$$
We choose the following PDF (which follows closely the function to integrate)
$$
p(x)=\frac{1}{3}\left(4-2x\right) \quad \int_0^1p(x)dx=1,
$$
resulting in
$$
\frac{F(0)}{p(0)}=\frac{F(1)}{p(1)}=\frac{3}{4}.
$$
Check that it fulfils the requirements of a PDF.
We perform then the change of variables (via the Cumulative function)
$$
y(x)=\int_0^x p(x')dx'=\frac{1}{3}x\left(4-x\right),
$$
or
$$
x=2-\left(4-3y\right)^{1/2}
$$
We have that when \( y=0 \) then \( x=0 \) and when \( y=1 \) we have \( x=1 \).
// evaluate the integral with importance sampling
for ( int i = 1; i <= n; i++){
  x = ran0(&idum);        // random numbers in [0,1]
  y = 2 - sqrt(4-3*x);    // new random numbers
  fy = 3*func(y)/(4-2*y); // weighted function
  int_mc += fy;
  sum_sigma += fy*fy;
}
int_mc = int_mc/((double) n );
sum_sigma = sum_sigma/((double) n );
variance = (sum_sigma-int_mc*int_mc);
\begin{tabular}{rllll}
\hline
$N$ & $I_{cr}$ & $\sigma_{cr}$ & $I_{is}$ & $\sigma_{is}$\\
\hline
10000    & 3.13395E+00 & 4.22881E-01 & 3.14163E+00 & 6.49921E-03 \\
100000   & 3.14195E+00 & 4.11195E-01 & 3.14163E+00 & 6.36837E-03 \\
1000000  & 3.14003E+00 & 4.14114E-01 & 3.14128E+00 & 6.39217E-03 \\
10000000 & 3.14213E+00 & 4.13838E-01 & 3.14160E+00 & 6.40784E-03 \\
\hline
\end{tabular}
However, it is unfair to study one-dimensional integrals with MC methods!
$$
I=\int_0^1dx_1\int_0^1dx_2\dots \int_0^1dx_d g(x_1,\dots,x_d),
$$
with
\( x_i \) defined in the interval \( [a_i,b_i] \) we would typically
need a transformation
of variables of the form
$$
x_i=a_i+(b_i-a_i)t_i,
$$
if we were to use the uniform distribution on the interval \( [0,1] \).
In this case, we need a
Jacobi determinant
$$
\prod_{i=1}^d (b_i-a_i),
$$
and to convert the function \( g(x_1,\dots,x_d) \) to
$$
g(x_1,\dots,x_d)\rightarrow
g(a_1+(b_1-a_1)t_1,\dots,a_d+(b_d-a_d)t_d).
$$
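The following minimal sketch illustrates this change of variables and the Jacobi determinant for an assumed test integrand \( g=x_1x_2x_3 \) in \( d=3 \) dimensions (the exact value is 4.5); it uses the standard C++ <random> library rather than the ran0 function of these notes.
// Sketch: brute force Monte Carlo in d dimensions with x_i = a_i + (b_i-a_i)t_i
// and the Jacobi determinant prod_i (b_i - a_i). The integrand is illustrative only.
#include <iostream>
#include <random>
int main()
{
  const int d = 3, n = 1000000;
  double aa[d] = {0., 0., 0.}, bb[d] = {1., 2., 3.};
  std::mt19937 gen(1);
  std::uniform_real_distribution<double> uniform01(0.0, 1.0);
  double jacobian = 1.;
  for (int i = 0; i < d; i++) jacobian *= (bb[i]-aa[i]);
  double sum = 0.;
  for (int k = 0; k < n; k++) {
    double g = 1.;
    for (int i = 0; i < d; i++) {
      double t = uniform01(gen);
      double x = aa[i] + (bb[i]-aa[i])*t;  // mapped variable
      g *= x;                              // integrand g = x_1 x_2 x_3
    }
    sum += g;
  }
  // exact value: prod_i (b_i^2 - a_i^2)/2 = (1/2)(2)(9/2) = 4.5
  std::cout << "I approx " << jacobian*sum/n << std::endl;
  return 0;
}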
$$
\int_{-\infty}^{\infty}{\bf dxdy}g({\bf x, y}),
$$
where
$$
g({\bf x, y})=\exp{(-{\bf x}^2-{\bf y}^2-({\bf x-y})^2/2)},
$$
with \( d=6 \).
$$
\frac{1}{\sqrt{\pi}}\exp{(-x^2)},
$$
and through
$$
\pi^3\int\prod_{i=1}^6\left(
\frac{1}{\sqrt{\pi}}\exp{(-x_i^2)}\right)
\exp{(-({\bf x-y})^2/2)}dx_1\dots dx_6,
$$
we can rewrite our integral as
$$
\int f(x_1,\dots,x_d)F(x_1,\dots,x_d)\prod_{i=1}^6dx_i,
$$
where \( f \) is the gaussian distribution.
.....
// evaluate the integral without importance sampling
// Loop over Monte Carlo Cycles
for ( int i = 1; i <= n; i++){
  // x[] contains the random numbers for all dimensions
  for (int j = 0; j < 6; j++) {
    x[j] = -length+2*length*ran0(&idum);
  }
  fx = brute_force_MC(x);
  int_mc += fx;
  sum_sigma += fx*fx;
}
int_mc = int_mc/((double) n );
sum_sigma = sum_sigma/((double) n );
variance = sum_sigma-int_mc*int_mc;
......
double brute_force_MC(double *x)
{
double a = 1.; double b = 0.5;
// evaluate the different terms of the exponential
double xx=x[0]*x[0]+x[1]*x[1]+x[2]*x[2];
double yy=x[3]*x[3]+x[4]*x[4]+x[5]*x[5];
double xy=pow((x[0]-x[3]),2)+pow((x[1]-x[4]),2)+pow((x[2]-x[5]),2);
return exp(-a*xx-a*yy-b*xy);
} // end function for the integrand
..........
// evaluate the integral with importance sampling
for ( int i = 1; i <= n; i++){
  // x[] contains the random numbers for all dimensions
  for (int j = 0; j < 6; j++) {
    x[j] = gaussian_deviate(&idum)*sqrt2;
  }
  fx = gaussian_MC(x);
  int_mc += fx;
  sum_sigma += fx*fx;
}
int_mc = int_mc/((double) n );
sum_sigma = sum_sigma/((double) n );
variance = sum_sigma-int_mc*int_mc;
.............
// this function defines the integrand to integrate
double gaussian_MC(double *x)
{
double a = 0.5;
// evaluate the different terms of the exponential
double xy=pow((x[0]-x[3]),2)+pow((x[1]-x[4]),2)+pow((x[2]-x[5]),2);
return exp(-a*xy);
} // end function for the integrand
\begin{tabular}{rll}
\hline
$N$ & $I_{cr}$ & $I_{gd}$\\
\hline
10000    & 1.15247E+01 & 1.09128E+01 \\
100000   & 1.29650E+01 & 1.09522E+01 \\
1000000  & 1.18226E+01 & 1.09673E+01 \\
10000000 & 1.04925E+01 & 1.09612E+01 \\
\hline
\end{tabular}
$$
{\bf r}_i = x_i {\bf e}_x + y_i {\bf e}_y +z_i {\bf e}_z ,
$$
as
$$
\psi_{1s}({\bf r}_i) = e^{-\alpha r_i},
$$
where \( \alpha \) is a parameter and
$$
r_i = \sqrt{x_i^2+y_i^2+z_i^2}.
$$
We will fix \( \alpha=2 \), which should correspond to the charge of the helium atom \( Z=2 \).
$$
d{\bf r}_1d{\bf r}_2 = r_1^2dr_1 r_2^2dr_2 dcos(\theta_1)dcos(\theta_2)d\phi_1d\phi_2,
$$
and
$$
\frac{1}{r_{12}}= \frac{1}{\sqrt{r_1^2+r_2^2-2r_1r_2cos(\beta)}}
$$
with
$$
\cos(\beta) = \cos(\theta_1)\cos(\theta_2)+\sin(\theta_1)\sin(\theta_2)\cos(\phi_1-\phi_2)
$$
\( \theta_{1,2} \in [0,\pi] \), use mapping \( \theta_{1,2}=\pi*ran \) with \( ran \in [0,1] \) a uniform distribution point.
\( \phi_{1,2} \in [0,2\pi] \), use mapping \( \phi_{1,2}=2\pi*ran \) with \( ran \in [0,1] \) a uniform distribution point.
Be careful with the integrand
$$
\frac{\exp{(-4(r_1+r_2))}r_1^2dr_1 r_2^2dr_2 dcos(\theta_1)dcos(\theta_2)d\phi_1d\phi_2}{\sqrt{r_1^2+r_2^2-2r_1r_2cos(\beta)}}
$$
$$
\Psi({\bf r}_1,{\bf r}_2) = e^{-\alpha (r_1+r_2)}.
$$
Note that it is not possible to find an analytic solution to Schr\"odinger's equation for
two interacting electrons in the helium atom.
The integral we need to solve is the quantum mechanical expectation value of the correlation energy between two electrons, namely
$$
\begin{equation} \label{eq:correlationenergy}
\langle \frac{1}{|{\bf r}_1-{\bf r}_2|} \rangle =
\int_{-\infty}^{\infty} d{\bf r}_1d{\bf r}_2 e^{-2\alpha (r_1+r_2)}\frac{1}{|{\bf r}_1-{\bf r}_2|}=\frac{5\pi^2}{16^2}=0.192765711.
\end{equation}
$$
Note that our wave function is not normalized. There is a normalization factor missing, but for this project
we don't need to worry about that.
for each variable \( x_1,y_1,z_1,x_2,y_2,z_2 \) from \( -\infty \) to \( \infty \). How many mesh points do you need before the result converges at the level of the fourth leading digit? Hint: the single-particle wave function \( e^{-\alpha r_i} \) is more or less zero at \( r_i \approx ? \). You can therefore replace the integration limits \( -\infty \) and \( \infty \) with \( -\Lambda \) and \( \Lambda \), respectively. You need to check that this approximation is satisfactory.
and compare your results with those from the previous point. Discuss the differences. With brute force we mean that you should use the uniform distribution.
Does the variance decrease? Does the CPU time used, compared with the brute force Monte Carlo, decrease in order to achieve the same accuracy? Comment on your results.
double *x = new double [N];
double *w = new double [N];
// set up the mesh points and weights
gauleg(a,b,x,w, N);
// evaluate the integral with the Gauss-Legendre method
// Note that we initialize the sum
double int_gauss = 0.;
for (int i=0;i<N;i++){
for (int j = 0;j<N;j++){
for (int k = 0;k<N;k++){
for (int l = 0;l<N;l++){
for (int m = 0;m<N;m++){
for (int n = 0;n<N;n++){
int_gauss+=w[i]*w[j]*w[k]*w[l]*w[m]*w[n]
*int_function(x[i],x[j],x[k],x[l],x[m],x[n]);
}}}}}
}
// this function defines the function to integrate
double int_function(double x1, double y1, double z1,
double x2, double y2, double z2)
{
double alpha = 2.;
// evaluate the different terms of the exponential
double exp1=-2*alpha*sqrt(x1*x1+y1*y1+z1*z1);
double exp2=-2*alpha*sqrt(x2*x2+y2*y2+z2*z2);
double deno=sqrt(pow((x1-x2),2)+pow((y1-y2),2)+pow((z1-z2),2));
if(deno <pow(10.,-6.)) { return 0;}
else return exp(exp1+exp2)/deno;
} // end of function to evaluate
double int_mc = 0.; double variance = 0.;
double sum_sigma= 0. ; long idum=-1 ;
double length=1.5; // we fix the max size of the box to L=3
double volume=pow((2*length),6.);
// evaluate the integral with the brute force Monte-Carlo method
for ( int i = 1; i <= n; i++){
  // x[] contains the random numbers for all dimensions
  for (int j = 0; j < 6; j++) {
    // Maps U[0,1] to U[-L,L]
    x[j] = -length+2*length*ran0(&idum);
  }
  fx = brute_force_MC(x);
  int_mc += fx;
  sum_sigma += fx*fx;
}
int_mc = int_mc/((double) n );
sum_sigma = sum_sigma/((double) n );
variance = sum_sigma-int_mc*int_mc;
....
double brute_force_MC(double *x)
{
double alpha = 2.;
// evaluate the different terms of the exponential
double exp1=-2*alpha*sqrt(x[0]*x[0]+x[1]*x[1]+x[2]*x[2]);
double exp2=-2*alpha*sqrt(x[3]*x[3]+x[4]*x[4]+x[5]*x[5]);
double deno=sqrt(pow((x[0]-x[3]),2)
+pow((x[1]-x[4]),2)+pow((x[2]-x[5]),2));
double value=exp(exp1+exp2)/deno;
return value;
} // end function for the integrand
double int_mc = 0.; double variance = 0.;
double sum_sigma= 0. ; long idum=-1 ;
// The 'volume' contains 4 jacobideterminants(pi,pi,2pi,2pi)
// and a scaling factor 1/16
double volume=4*pow(acos(-1.),4.)*1./16;
// evaluate the integral with importance sampling
for ( int i = 1; i <= n; i++){
for (int j = 0; j < 2; j++) {
y=ran0(&idum);
x[j]=-0.25*log(1.-y);
}
for (int j = 2; j < 4; j++) {
x[j] = 2*acos(-1.)*ran0(&idum);
}
for (int j = 4; j < 6; j++) {
x[j] = acos(-1.)*ran0(&idum);
}
fx=integrand_MC(x);
....
// this function defines the integrand to integrate
double integrand_MC(double *x)
{
double num=x[0]*x[0]*x[1]*x[1]*sin(x[4])*sin(x[5]);
double deno=sqrt(x[0]*x[0]+x[1]*x[1]-2*x[0]*x[1]*
(sin(x[4])*sin(x[5])*cos(x[2]-x[3])+cos(x[4])*cos(x[5])));
return num/deno;
} // end function for the integrand
\begin{tabular}{rllllll}
\hline
$N$ & $I_{br}$ & $\sigma_{br}$ & time(s) & $I_{is}$ & $\sigma_{is}$ & time(s)\\
\hline
1E6    & 0.19238 & 3.85124E-4 & 0.6 & 0.19176  & 1.01515E-4 & 1.4 \\
10E6   & 0.18607 & 1.18053E-4 & 6   & 0.192254 & 1.22430E-4 & 14 \\
100E6  & 0.18846 & 4.37163E-4 & 57  & 0.192720 & 1.03346E-4 & 138 \\
1000E6 & 0.18843 & 1.35879E-4 & 581 & 0.192789 & 3.28795E-5 & 1372 \\
\hline
\end{tabular}
Gauss-Legendre results:
\begin{tabular}{rlll}
\hline
\( N \) & time(s) & $I_n$ & $|I_n-I|$ \\
\hline
20 & 31   & 0.18047 & 1.123E-2 \\
30 & 354  & 0.18501 & 7.76E-3 \\
40 & 1999 & 0.18653 & 6.24E-3 \\
50 & 7578 & 0.18722 & 5.54E-3 \\
\hline
\end{tabular}
$$
\{x_1, x_2,\dots\,x_k,\dots\}.
$$
We will call these
values our measurements and the entire set as our measured
sample. The action of measuring all the elements of a sample
we will call a stochastic experiment (since, operationally,
they are often associated with results of empirical observation of
some physical or mathematical phenomena; precisely an experiment). We
assume that these values are distributed according to some
PDF \( p_X^{\phantom X}(x) \), where \( X \) is just the formal symbol for the
stochastic variable whose PDF is \( p_X^{\phantom X}(x) \). Instead of
trying to determine the full distribution \( p \) we are often only
interested in finding the few lowest moments, like the mean
\( \mu_X^{\phantom X} \) and the variance \( \sigma_X^{\phantom X} \).
$$
p(x) = \prob(X=x)
$$
In the continuous case, the PDF does not directly depict the
actual probability. Instead we define the probability for the
stochastic variable to assume any value on an infinitesimal interval
around \( x \) to be \( p(x)dx \). The continuous function \( p(x) \) then gives us
the density of the probability rather than the probability
itself. The probability for a stochastic variable to assume any value
on a non-infinitesimal interval \( [a,\,b] \) is then just the integral:
$$
\prob(a\leq X\leq b) = \int_a^b p(x)dx
$$
Qualitatively speaking, a stochastic variable represents the values of
numbers chosen as if by chance from some specified PDF so that the
selection of a large set of these numbers reproduces this PDF.
$$
P(x)=\mbox{Prob(}X\leq x\mbox{)} =
\int_{-\infty}^x p(x^{\prime})dx^{\prime}
$$
The relation between a CDF and its corresponding PDF is then:
$$
p(x) = \frac{d}{dx}P(x)
$$
$$
\mean{x^n} \equiv \int\! x^n p(x)\,dx
$$
The zero-th moment \( \mean{1} \) is just the normalization condition of
\( p \). The first moment, \( \mean{x} \), is called the mean of \( p \)
and often denoted by the letter \( \mu \):
$$
\mean{x} = \mu \equiv \int\! x p(x)\,dx
$$
$$
\mean{(x-\mean{x})^n} \equiv \int\! (x-\mean{x})^n p(x)\,dx
$$
The zero-th and first central moments are both trivial, equal \( 1 \) and
\( 0 \), respectively. But the second central moment, known as the
variance of \( p \), is of particular interest. For the stochastic
variable \( X \), the variance is denoted as \( \sigma^2_X \) or \( \var(X) \):
$$
\begin{align*}
\sigma^2_X\ \ =\ \ \var(X) &= \mean{(x-\mean{x})^2} =
\int\! (x-\mean{x})^2 p(x)\,dx\\
&= \int\! \left(x^2 - 2 x \mean{x}^{\phantom{2}} +
\mean{x}^2\right)p(x)\,dx\\
&= \mean{x^2} - 2 \mean{x}\mean{x} + \mean{x}^2\\
&= \mean{x^2} - \mean{x}^2
\end{align*}
$$
The square root of the variance, \( \sigma = \sqrt{\mean{(x-\mean{x})^2}} \) is called the standard deviation of \( p \). It is clearly just the RMS (root-mean-square) value of the deviation of the PDF from its mean value, interpreted qualitatively as the "spread" of \( p \) around its mean.
$$
\begin{align}
\cov(X_i,\,X_j) &\equiv \meanb{(x_i-\mean{x_i})(x_j-\mean{x_j})}
\nonumber\\
&=
\int\!\cdots\!\int\!(x_i-\mean{x_i})(x_j-\mean{x_j})\,
P(x_1,\dots,x_n)\,dx_1\dots dx_n
\tag{28}
\end{align}
$$
with
$$
\mean{x_i} =
\int\!\cdots\!\int\!x_i\,P(x_1,\dots,x_n)\,dx_1\dots dx_n
$$
$$
\begin{align*}
\cov(X_i,\,X_j) &= \meanb{(x_i-\mean{x_i})(x_j-\mean{x_j})}\\
&=\mean{x_i x_j - x_i\mean{x_j} - \mean{x_i}x_j + \mean{x_i}\mean{x_j}}\\
&=\mean{x_i x_j} - \mean{x_i\mean{x_j}} - \mean{\mean{x_i}x_j} +
\mean{\mean{x_i}\mean{x_j}}\\
&=\mean{x_i x_j} - \mean{x_i}\mean{x_j} - \mean{x_i}\mean{x_j} +
\mean{x_i}\mean{x_j}\\
&=\mean{x_i x_j} - \mean{x_i}\mean{x_j}
\end{align*}
$$
Also useful for us is the covariance of linear combinations of stochastic variables. Let \( \{X_i\} \) and \( \{Y_i\} \) be two sets of stochastic variables. Let also \( \{a_i\} \) and \( \{b_i\} \) be two sets of scalars. Consider the linear combination:
$$
U = \sum_i a_i X_i \qquad V = \sum_j b_j Y_j
$$
By the linearity of the expectation value
$$
\cov(U, V) = \sum_{i,j}a_i b_j \cov(X_i, Y_j)
$$
$$
\var(U) = \sum_{i,j}a_i a_j \cov(X_i, X_j)
\tag{29}
$$
And in the special case when the stochastic variables are
uncorrelated, the off-diagonal elements of the covariance are, as we
know, zero, resulting in:
$$
\var(U) = \sum_i a_i^2 \cov(X_i, X_i) = \sum_i a_i^2 \var(X_i)
$$
$$
\var(\sum_i a_i X_i) = \sum_i a_i^2 \var(X_i)
$$
which will become very useful in our study of the error in the mean
value of a set of measurements.
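A minimal numerical check of this relation, using two independent uniform variables and the purely illustrative coefficients \( a_1=2 \), \( a_2=3 \), could look like the following sketch (the exact variance of the uniform distribution on [0,1] is \( 1/12 \), so the result should approach \( 13/12 \)).
// Sketch: check var(a1*X1 + a2*X2) = a1^2 var(X1) + a2^2 var(X2) for
// two independent (hence uncorrelated) uniform variables on [0,1].
#include <iostream>
#include <random>
int main()
{
  const int n = 1000000;
  const double a1 = 2.0, a2 = 3.0;
  std::mt19937 gen(7);
  std::uniform_real_distribution<double> uniform01(0.0, 1.0);
  double sum = 0., sum2 = 0.;
  for (int i = 0; i < n; i++) {
    double u = a1*uniform01(gen) + a2*uniform01(gen);  // linear combination
    sum += u;  sum2 += u*u;
  }
  double mean = sum/n;
  std::cout << "sample variance = " << sum2/n - mean*mean
            << "  exact = " << 13.0/12.0 << std::endl;
  return 0;
}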
$$
\bar x_n \equiv \frac{1}{n}\sum_{k=1}^n x_k
$$
The sample variance is:
$$
\var(x) \equiv \frac{1}{n}\sum_{k=1}^n (x_k - \bar x_n)^2
$$
its square root being the standard deviation of the sample. The
sample covariance is:
$$
\cov(x)\equiv\frac{1}{n}\sum_{kl}(x_k - \bar x_n)(x_l - \bar x_n)
$$
These quantities, being known experimental values, differ significantly from and must not be confused with the similarly named quantities for stochastic variables, mean \( \mu_X \), variance \( \var(X) \) and covariance \( \cov(X,Y) \).
$$
\lim_{n\to\infty}\bar x_n = \mu_X^{\phantom X}
$$
The sample mean \( \bar x_n \) works therefore as an estimate of the true
mean \( \mu_X^{\phantom X} \).
What we need to find out is how good an approximation \( \bar x_n \) is to \( \mu_X^{\phantom X} \). In any stochastic measurement, an estimated mean is of no use to us without a measure of its error; a quantity that tells us how well we can reproduce it in another experiment. We are therefore interested in the PDF of the sample mean itself. Its standard deviation will be a measure of the spread of sample means, and we will simply call it the error of the sample mean, or just sample error, and denote it by \( \mbox{err}_X^{\phantom X} \). In practice, we will only be able to produce an estimate of the sample error since the exact value would require the knowledge of the true PDFs behind, which we usually do not have.
$$
\overline X_n = \frac{1}{n}\sum_{i=1}^n X_i
$$
All the coefficients are just equal to \( 1/n \). The PDF of \( \overline X_n \),
denoted by \( p_{\overline X_n}(x) \) is the desired PDF of the sample
means.
$$
p_{\overline X_n}(x) = \int p_X^{\phantom X}(x_1)\cdots
\int p_X^{\phantom X}(x_n)\
\delta\!\left(x - \frac{x_1+x_2+\dots+x_n}{n}\right)dx_n \cdots dx_1
$$
And in particular we are interested in its variance \( \var(\overline
X_n) \).
$$
\lim_{n\to\infty} p_{\overline X_n}(x) =
\left(\frac{n}{2\pi\var(X)}\right)^{1/2}
e^{-\frac{n(x-\bar x_n)^2}{2\var(X)}}
\tag{30}
$$
$$
\begin{equation}
\mbox{err}_X^2 = \var(\overline X_n) = \frac{1}{n^2}
\sum_{ij} \cov(X_i, X_j)
\tag{31}
\end{equation}
$$
We see now that in order to calculate the exact error of the sample
with the above expression, we would need the true means
\( \mu_{X_i}^{\phantom X} \) of the stochastic variables \( X_i \). To
calculate these requires that we know the true multivariate PDF of all
the \( X_i \). But this PDF is unknown to us, we have only got the measurements of
one sample. The best we can do is to let the sample itself be an
estimate of the PDF of each of the \( X_i \), estimating all properties of
\( X_i \) through the measurements of the sample.
$$
\mu_{X_i}^{\phantom X} = \mean{x_i} \approx \frac{1}{n}\sum_{k=1}^n
x_k = \bar x
$$
Using \( \bar x \) in place of \( \mu_{X_i}^{\phantom X} \) we can give an
estimate of the covariance in eq. (31):
$$
\begin{align*}
\cov(X_i, X_j) &= \mean{(x_i-\mean{x_i})(x_j-\mean{x_j})}
\approx\mean{(x_i - \bar x)(x_j - \bar{x})}\\
&\approx\frac{1}{n} \sum_{l}^n \left(\frac{1}{n}\sum_{k}^n (x_k -
\bar x_n)(x_l - \bar x_n)\right)
=\frac{1}{n}\frac{1}{n} \sum_{kl} (x_k -
\bar x_n)(x_l - \bar x_n)\\
&=\frac{1}{n}\cov(x)
\end{align*}
$$
$$
\begin{align}
\var(X_i)
&=\mean{(x_i - \mean{x_i})^2} \approx \mean{(x_i - \bar x_n)^2}\nonumber\\
&\approx\frac{1}{n}\sum_{k=1}^n (x_k - \bar x_n)^2\nonumber\\
&=\var(x)
\tag{32}
\end{align}
$$
Now we can calculate an estimate of the error \( \mbox{err}_X^{\phantom X} \) of the sample mean \( \bar x_n \):
$$
\begin{align}
\mbox{err}_X^2
&=\frac{1}{n^2}\sum_{ij} \cov(X_i, X_j) \nonumber \\
&\approx\frac{1}{n^2}\sum_{ij}\frac{1}{n}\cov(x) =
\frac{1}{n^2}n^2\frac{1}{n}\cov(x)\nonumber\\
&=\frac{1}{n}\cov(x)
\tag{33}
\end{align}
$$
which is nothing but the sample covariance divided by the number of
measurements in the sample.
$$
\begin{align}
\mbox{err}_X^2 &= \frac{1}{n^2}\sum_{ij} \cov(X_i, X_j) =
\frac{1}{n^2} \sum_i \var(X_i)\nonumber\\
&\approx
\frac{1}{n^2} \sum_i \var(x)\nonumber\\ &= \frac{1}{n}\var(x)
\tag{34}
\end{align}
$$
where in the second step we have used eq. (32).
The error of the sample is then just its standard deviation divided by
the square root of the number of measurements the sample contains.
This is a very useful formula which is easy to compute. It acts as a
first approximation to the error, but in numerical experiments we
cannot overlook the correlations that are always present.
$$
\begin{align}
\mbox{err}_X^2 &=
\frac{1}{n}\var(x) + \frac{1}{n}(\cov(x)-\var(x))\nonumber\\&=
\frac{1}{n^2}\sum_{k=1}^n (x_k - \bar x_n)^2 +
\frac{2}{n^2}\sum_{k < l} (x_k - \bar x_n)(x_l - \bar x_n)
\tag{35}
\end{align}
$$
The first term is the same as the error in the uncorrelated case,
Eq. (34). This means that the second
term accounts for the error correction due to correlation between the
measurements. For uncorrelated measurements this second term is zero.
$$
\var(x) = \frac{1}{n}\sum_{k=1}^n (x_k - \bar x_n)^2 =
\left(\frac{1}{n}\sum_{k=1}^n x_k^2\right) - \bar x_n^2
$$
We just accumulate separately the values \( x^2 \) and \( x \) for every measurement \( x \) we receive. The correlation term, though, has to be calculated at the end of the experiment since we need all the measurements to calculate the cross terms. Therefore, all measurements have to be stored throughout the experiment.
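A minimal sketch of such a bookkeeping scheme is given below; the 'measurements' are simply uniform deviates from the standard C++ <random> library, used here only as a stand-in for real data, and the stored vector is what would later be needed for the cross terms of eq. (35).
// Sketch: accumulate x and x^2 on the fly for the variance, while all samples
// are kept in a vector so the correlation (cross) terms can be computed later.
#include <cmath>
#include <iostream>
#include <random>
#include <vector>
int main()
{
  const int n = 100000;
  std::mt19937 gen(42);
  std::uniform_real_distribution<double> uniform01(0.0, 1.0);
  std::vector<double> data(n);
  double sum = 0., sum2 = 0.;
  for (int k = 0; k < n; k++) {
    double x = uniform01(gen);   // one 'measurement'
    data[k] = x;                 // stored for the correlation term
    sum += x;  sum2 += x*x;      // accumulated for the variance
  }
  double mean = sum/n;
  double var  = sum2/n - mean*mean;
  // error estimate ignoring correlations, eq. (34): err^2 = var(x)/n
  std::cout << "mean = " << mean << "  err = " << std::sqrt(var/n) << std::endl;
  return 0;
}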
$$
N_i=(aN_{i-1}+c) \mbox{MOD} (M),
$$
and to find a number in \( x\in [0,1] \)
$$
x_i=N_i/M.
$$
\( M \) is called the modulus; the period of the sequence can be at most \( M \) and should therefore be as big as possible.
The start value is \( N_0 \) and is called the seed.
Consider the following example
$$
N_i=(6N_{i-1}+7) \mbox{MOD} (5),
$$
with a seed \( N_0=2 \). This generator produces the sequence
\( 4,1,3,0,2,4,1,3,0,2,\dots \), i.e., a sequence with period \( 5 \).
However, increasing \( M \) may not guarantee a larger period as the following
example shows
$$
N_i=(27N_{i-1}+11) \mbox{MOD} (54),
$$
which still, with \( N_0=2 \), results in \( 11,38,11,38,11,38,\dots \), a period of
just \( 2 \).
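The two small generators above are easy to try out directly; the following sketch prints the first ten numbers of each sequence so that the periods 5 and 2 can be read off.
// Sketch: the linear congruential generator N_i = (a*N_{i-1} + c) mod M
// for the two examples above, starting from the seed N_0 = 2.
#include <iostream>
int main()
{
  long N = 2;                                   // seed N_0 = 2
  std::cout << "a=6,  c=7,  M=5:  ";
  for (int i = 1; i <= 10; i++) {
    N = (6*N + 7) % 5;
    std::cout << N << " ";
  }
  std::cout << std::endl << "a=27, c=11, M=54: ";
  N = 2;
  for (int i = 1; i <= 10; i++) {
    N = (27*N + 11) % 54;
    std::cout << N << " ";
  }
  std::cout << std::endl;
  return 0;
}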
$$
N_l=(aN_{l-i}+cN_{l-j})\mbox{MOD}(M).
$$
Such a generator again produces a sequence of pseudorandom numbers
but this time with a period much larger than \( M \).
It is also possible to construct more elaborate algorithms by including
more than two past terms in the sum of each iteration.
One example is the generator of Marsaglia and Zaman (Computers in Physics {\bf 8} (1994) 117)
which consists of two congruential relations
$$
N_l=(N_{l-3}-N_{l-1})\mbox{MOD}(2^{31}-69),
\tag{36}
$$
followed by
$$
N_l=(69069N_{l-1}+1013904243)\mbox{MOD}(2^{32}),
\tag{37}
$$
which according to the authors has a period larger than \( 2^{94} \).
$$
N_l=(N_{l-i})\oplus (N_{l-j})
$$
where the bitwise action of \( \oplus \) means that if \( N_{l-i}=N_{l-j} \) the result is
\( 0 \) whereas if \( N_{l-i}\ne N_{l-j} \) the result is
\( 1 \). As an example, consider the case where \( N_{l-i}=6 \) and \( N_{l-j}=11 \). The first
one has a bit representation (using 4 bits only) which reads \( 0110 \) whereas the
second number is \( 1011 \). Employing the \( \oplus \) operator yields
\( 1101 \), or \( 2^3+2^2+2^0=13 \).
In Fortran90, the bitwise \( \oplus \) operation is coded through the intrinsic function \( \mbox{IEOR}(m,n) \) where \( m \) and \( n \) are the input numbers, while in \( C \) it is given by \( m\wedge n \).
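In C++ the example above is simply:
// Sketch: bitwise XOR of 6 = 0110 and 11 = 1011
#include <iostream>
int main()
{
  int m = 6, n = 11;
  std::cout << (m ^ n) << std::endl;   // prints 13, i.e. 1101 in binary
  return 0;
}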
$$
N_i=(aN_{i-1}) \mbox{MOD} (M).
$$
Note that \( c=0 \) and that it cannot be initialized with \( N_0=0 \).
However, since \( a \) and \( N_{i-1} \) are integers and their multiplication
could become greater than the standard 32 bit integer, there is a trick via
Schrage's algorithm which approximates the multiplication
of large integers through the factorization
$$
M=aq+r,
$$
where we have defined
$$
q=[M/a],
$$
and
$$
r = M\quad\mbox{MOD} \quad a.
$$
where the brackets denote integer division. In the code below the numbers
\( q \) and \( r \) are chosen so that \( r < q \).
$$
(aN_{i-1}) \mbox{MOD} (M)= (aN_{i-1}-[N_{i-1}/q]M)\mbox{MOD} (M),
\tag{38}
$$
since we can add or subtract any integer multiple of \( M \) from \( aN_{i-1} \).
The last term $ [N_{i-1}/q]M \mbox{MOD} (M)$ is zero since the integer division
\( [N_{i-1}/q] \) just yields a constant which is multiplied with \( M \).
Rewrite as
$$
(aN_{i-1}) \mbox{MOD} (M)= (aN_{i-1}-[N_{i-1}/q](aq+r))\mbox{MOD} (M),
\tag{39}
$$
$$
(aN_{i-1}) \mbox{MOD} (M)= \left(a(N_{i-1}-[N_{i-1}/q]q)-[N_{i-1}/q]r\right)\mbox{MOD} (M),
\tag{40}
$$
yielding
$$
(aN_{i-1}) \mbox{MOD} (M)= \left(a(N_{i-1}\mbox{MOD} (q)) -[N_{i-1}/q]r\right)\mbox{MOD} (M).
\tag{41}
$$
/* ran0() is a "Minimal" random number generator of Park and Miller.
** Set or reset the input value
** idum to any integer value (except the unlikely value MASK)
** to initialize the sequence; idum must not be altered between
** calls for successive deviates in a sequence.
** The function returns a uniform deviate between 0.0 and 1.0.
*/
double ran0(long *idum)
{
  const int a = 16807, m = 2147483647, q = 127773;
  const int r = 2836, MASK = 123459876;
  const double am = 1./m;
  long k;
  double ans;
  *idum ^= MASK;
  k = (*idum)/q;
  *idum = a*(*idum - k*q) - r*k;
  // add m if negative difference
  if (*idum < 0) *idum += m;
  ans = am*(*idum);
  *idum ^= MASK;
  return ans;
} // End: function ran0()
For the uniform distribution we have
$$
\langle x^k\rangle=\int_0^1dxp(x)x^k=\int_0^1dxx^k=\frac{1}{k+1},
$$
since \( p(x)=1 \).
The mean value \( \mu \) is then
$$
\mu=\langle x\rangle=\frac{1}{2}
$$
while the standard deviation is
$$
\sigma=\sqrt{\langle x^2\rangle-\mu^2}=\frac{1}{\sqrt{12}}=0.2886.
$$
This mimics the way a real system reaches its most likely state at a given temperature of the surroundings.
This sequence of steps forms a {\bf chain}.
Neuroscience, see for example Lipinski, Physics in Medicine and Biology {\bf 35}, page 441 (1990) or Farnell and Gibson, Journal of Computational Physics {\bf 208}, page 253 (2005).
There are tons of applications in physics, chemistry, biology and medicine.
In finance, the Nobel Memorial Prize in Economics was awarded for the Black and Scholes option pricing model
$$
\frac{\partial V}{\partial t}+\frac{1}{2}\sigma^{2}S^{2}\frac{\partial^{2} V}{\partial S^{2}}+rS\frac{\partial V}{\partial S}-rV=0.
$$
The
Black and Scholes equation is a partial differential equation, which describes the price
of the option over time. The derivation is based on Brownian motion (Langevin and Fokker-Planck, 12.6)
...the list is almost infinite
Consider a system whose energy is defined by the orientation of single spins. Consider the state \( i \), with given energy \( E_i \) represented by the following \( N \) spins
$$
\begin{array}{cccccccccc}
\uparrow&\uparrow&\uparrow&\dots&\uparrow&\downarrow&\uparrow&\dots&\uparrow&\downarrow\\
1&2&3&\dots& k-1&k&k+1&\dots&N-1&N\end{array}
$$
We may be interested in the transition with one single spinflip to a new state \( j \) with energy \( E_j \)
$$
\begin{array}{cccccccccc}
\uparrow&\uparrow&\uparrow&\dots&\uparrow&\uparrow&\uparrow&\dots&\uparrow&\downarrow\\
1&2&3&\dots& k-1&k&k+1&\dots&N-1&N\end{array}
$$
This change from one microstate \( i \) (or spin configuration) to another microstate \( j \) is the
{\bf configuration space} analogue to a random walk on a lattice. Instead of jumping from
one place to another in space, we 'jump' from one microstate to another.
$$
\begin{equation}
j(x,t) = -D\frac{\partial w(x,t)}{\partial x},
\end{equation}
$$
where \( D \) is the so-called diffusion constant, with dimensionality length$^2$ per time.
If the number of particles is conserved, we have the continuity equation
$$
\begin{equation}
\frac{\partial j(x,t)}{\partial x} = -\frac{\partial w(x,t)}{\partial t},
\end{equation}
$$
which leads to
$$
\begin{equation}
\frac{\partial w(x,t)}{\partial t} =
D\frac{\partial^2w(x,t)}{\partial x^2},
\tag{42}
\end{equation}
$$
which is the diffusion equation in one dimension. Solved as a partial differential equation
in chapter 10.
$$
\begin{equation}
\langle x(t)\rangle = \int_{-\infty}^{\infty}xw(x,t)dx,
\end{equation}
$$
or
$$
\begin{equation}
\langle x^2(t)\rangle = \int_{-\infty}^{\infty}x^2w(x,t)dx,
\end{equation}
$$
which allows for the computation of the variance
\( \sigma^2=\langle x^2(t)\rangle-\langle x(t)\rangle^2 \). Note well that
these expectation values are time-dependent. In a similar way we can also
define expectation values of functions \( f(x,t) \) as
$$
\begin{equation}
\langle f(x,t)\rangle = \int_{-\infty}^{\infty}f(x,t)w(x,t)dx.
\end{equation}
$$
$$
\begin{equation}
\int_{-\infty}^{\infty}w(x,t)dx=1
\end{equation}
$$
imposes significant constraints on \( w(x,t) \). These are
$$
\begin{equation}
w(x=\pm \infty,t)=0 \quad
\frac{\partial^{n}w(x,t)}{\partial x^n}|_{x=\pm\infty} = 0,
\end{equation}
$$
implying that when we study the time-derivative
\( {\partial\langle x(t)\rangle}/{\partial t} \), we obtain after integration by parts and using
Eq. (42)
$$
\begin{equation}
\frac{\partial \langle x\rangle}{\partial t} =
\int_{-\infty}^{\infty}x\frac{\partial w(x,t)}{\partial t}dx=
D\int_{-\infty}^{\infty}x\frac{\partial^2w(x,t)}{\partial x^2}dx,
\end{equation}
$$
leading to
$$
\begin{equation}
\frac{\partial \langle x\rangle}{\partial t} =
Dx\frac{\partial w(x,t)}{\partial x}|_{x=\pm\infty}-
D\int_{-\infty}^{\infty}\frac{\partial w(x,t)}{\partial x}dx,
\end{equation}
$$
implying that
$$
\begin{equation}
\frac{\partial \langle x\rangle}{\partial t} = 0.
\end{equation}
$$
$$
\begin{equation}
\frac{\partial \langle x^2\rangle}{\partial t} =
Dx^2\frac{\partial w(x,t)}{\partial x}|_{x=\pm\infty}-
2D\int_{-\infty}^{\infty}x\frac{\partial w(x,t)}{\partial x}dx,
\end{equation}
$$
where we have performed an integration by parts as we did
for \( \frac{\partial \langle x\rangle}{\partial t} \). A further integration by parts
results in
$$
\begin{equation}
\frac{\partial \langle x^2\rangle}{\partial t} =
-2Dxw(x,t)|_{x=\pm\infty}+
2D\int_{-\infty}^{\infty}w(x,t)dx=2D,
\end{equation}
$$
leading to
$$
\begin{equation}
\langle x^2\rangle = 2Dt,
\end{equation}
$$
and the variance as
$$
\begin{equation} \label{eq:variancediffeq}
\langle x^2\rangle-\langle x\rangle^2 = 2Dt.
\end{equation}
$$
The root mean square displacement after a time \( t \) is then
$$
\begin{equation}
\sqrt{\langle x^2\rangle-\langle x\rangle^2} = \sqrt{2Dt}.
\end{equation}
$$
$$
\langle x(n)\rangle = \sum_{i}^{n} \Delta x_i = 0 \quad \Delta x_i=\pm l,
$$
since we have an equal probability of jumping either to the left or to right.
The value of \( \langle x(n)^2\rangle \) is
$$
\langle x(n)^2\rangle = \left(\sum_{i}^{n} \Delta x_i\right)^2=\sum_{i}^{n} \Delta x_i^2+
\sum_{i\ne j}^{n} \Delta x_i\Delta x_j=l^2n.
$$
For sufficiently many steps the non-diagonal contribution is
$$
\sum_{i\ne j}^{N} \Delta x_i\Delta x_j=0,
$$
since \( \Delta x_{i,j} = \pm l \).
$$
\langle x(n)^2\rangle - \langle x(n)\rangle^2 = l^2n.
\tag{43}
$$
It is also rather straightforward to compute the variance for \( L\ne R \). The result is
$$
\langle x(n)^2\rangle - \langle x(n)\rangle^2 = 4LRl^2n.
$$
The variable \( n \) represents the number of time
steps. If we define \( n=t/\Delta t \), we can then couple the variance result
from a random walk
in one dimension with the variance from diffusion
by defining the diffusion constant as
$$
D = \frac{l^2}{2\Delta t}.
$$
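A minimal sketch of such a walk, checking that the sample variance grows as \( l^2n \) (with \( l=1 \) and purely illustrative values for the number of walkers and steps, using the standard C++ <random> library), is given below.
// Sketch: one-dimensional random walk with equal probabilities left/right;
// the variance of the displacement should grow as l^2*n, i.e. as 2Dt.
#include <iostream>
#include <random>
int main()
{
  const int walkers = 100000, nsteps = 100;
  std::mt19937 gen(3);
  std::uniform_real_distribution<double> uniform01(0.0, 1.0);
  double sum = 0., sum2 = 0.;
  for (int w = 0; w < walkers; w++) {
    int x = 0;
    for (int step = 0; step < nsteps; step++) {
      if (uniform01(gen) <= 0.5) x += 1; else x -= 1;
    }
    sum += x;  sum2 += (double) x*x;
  }
  double mean = sum/walkers;
  std::cout << "variance = " << sum2/walkers - mean*mean
            << "  expected l^2*n = " << nsteps << std::endl;
  return 0;
}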
$$
\begin{equation}
\frac{\partial w(x,t)}{\partial t} \approx
\frac{w(i,n+1)-w(i,n)}{\Delta t},
\end{equation}
$$
whereas the second derivative is approximated as
$$
\begin{equation}
D\frac{\partial^2w(x,t)}{\partial x^2}\approx
D\frac{w(i+1,n)+w(i-1,n)-2w(i,n)}{(\Delta x)^2},
\end{equation}
$$
resulting in the discretized diffusion equation
$$
\frac{w(i,n+1)-w(i,n)}{\Delta t}=D\frac{w(i+1,n)+w(i-1,n)-2w(i,n)}{(\Delta x)^2},
$$
where \( n \) represents a given time step and \( i \) a step in the \( x \)-direction.
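Solving this equation amounts to the explicit (forward Euler) scheme \( w(i,n+1)=w(i,n)+\alpha\left(w(i+1,n)+w(i-1,n)-2w(i,n)\right) \) with \( \alpha=D\Delta t/(\Delta x)^2 \). The following sketch, with purely illustrative parameters and \( \alpha<1/2 \) for stability, shows the scheme in practice.
// Sketch: explicit scheme for the discretized diffusion equation, with all
// walkers initially at the midpoint and w = 0 kept fixed at the boundaries.
#include <iostream>
#include <vector>
int main()
{
  const int nx = 101, nsteps = 500;
  const double D = 1.0, dx = 0.01, dt = 0.4*dx*dx/D;  // gives alpha = 0.4
  const double alpha = D*dt/(dx*dx);
  std::vector<double> w(nx, 0.0), wnew(nx, 0.0);
  w[nx/2] = 1.0;                       // initial condition: delta spike at the midpoint
  for (int n = 0; n < nsteps; n++) {
    for (int i = 1; i < nx-1; i++) {
      wnew[i] = w[i] + alpha*(w[i+1] + w[i-1] - 2.0*w[i]);
    }
    w = wnew;                          // boundary values remain zero
  }
  std::cout << "w at the midpoint after " << nsteps << " steps: "
            << w[nx/2] << std::endl;
  return 0;
}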
$$
W_{ij}(\epsilon)=W(il-jl,\epsilon)=\left\{\begin{array}{cc}\frac{1}{2} & |i-j| = 1\\
0 & \mbox{else} \end{array} \right.
$$
We call \( W_{ij} \) for the transition probability and we can represent it, see below,
as a matrix. Our new PDF \( w_i(t=\epsilon) \) is now related to the PDF at
\( t=0 \) through the relation
$$
w_i(t=\epsilon) = W(j\rightarrow i)w_j(t=0).
$$
$$
\sum_i w_i(t) = 1,
$$
and
$$
\sum_i W(j\rightarrow i) = 1.
$$
The further constraints are
\( 0 \le W_{ij} \le 1 \) and \( 0 \le w_{j} \le 1 \).
Note that the probability for remaining at the same place is in general
not necessarily equal zero. In our Markov process we allow only for jumps to the left or to
the right.
$$
w_i(t_n) = \sum_jW_{ij}(t_n)w_j(0),
$$
and defining
$$
W(il-jl,n\epsilon)=(W^n(\epsilon))_{ij}
$$
we obtain
$$
w_i(n\epsilon) = \sum_j(W^n(\epsilon))_{ij}w_j(0),
$$
or in matrix form
$$
\begin{equation}
\tag{44}
\hat{w}(n\epsilon) = \hat{W}^n(\epsilon)\hat{w}(0).
\end{equation}
$$
$$
w_i(t=\epsilon) = W(j\rightarrow i)w_j(t=0).
$$
Normally we don't know the form of \( W \)!!
This equation represents the discretized time-development of an original
PDF. We can rewrite this in terms of the transition matrix elements \( W_{ij} \) as
$$
w_i(t=\epsilon) = W_{ij}w_j(t=0).
$$
with the transition matrix \( W \) for a random walk left or right (cannot stay in the same position) given by
$$
W_{ij}(\epsilon)=W(il-jl,\epsilon)=\left\{\begin{array}{cc}\frac{1}{2} & |i-j| = 1\\
0 & \mbox{else} \end{array} \right.
$$
We call \( W_{ij} \) for the transition probability and we represent it
as a matrix.
$$
\sum_i w_i(t) = 1,
$$
and
$$
\sum_i W(j\rightarrow i) = 1.
$$
Further constraints are
\( 0 \le W_{ij} \le 1 \) and \( 0 \le w_{j} \le 1 \).
We can thus write the action of \( W \) as
$$
w_i(t+1) = \sum_jW_{ij}w_j(t),
$$
or as vector-matrix relation
$$
{\bf \hat{w}}(t+1) = {\bf \hat{W}\hat{w}}(t),
$$
and if we have that \( ||{\bf \hat{w}}(t+1)-{\bf \hat{w}}(t)||\rightarrow 0 \), we say that
we have reached the most likely state of the system, the so-called steady state or equilibrium state.
Another way of phrasing this is
$$ {\bf w}(t=\infty) = {\bf Ww}(t=\infty). $$
$$
\hat{W} = \left(\begin{array}{ccc} 1/4 & 1/8 & 2/3\\
3/4 & 5/8 & 0\\
0 & 1/4 & 1/3\\ \end{array} \right),
$$
and we choose our initial state as
$$
\hat{w}(t=0)= \left(\begin{array}{c} 1\\
0\\
0 \end{array} \right).
$$
The first iteration is
$$
w_i(t=\epsilon) = W(j\rightarrow i)w_j(t=0),
$$
resulting in
$$
\hat{w}(t=\epsilon)= \left(\begin{array}{c} 1/4\\
3/4 \\
0 \end{array} \right).
$$
$$
w_i(t=2\epsilon) = W(j\rightarrow i)w_j(t=\epsilon),
$$
resulting in
$$
\hat{w}(t=2\epsilon)= \left(\begin{array}{c} 5/32\\
21/32 \\
6/32 \end{array} \right).
$$
Note that the vector \( \hat{w} \) is always normalized to \( 1 \). We find the steady state of the system by solving the linear set of equations
$$ {\bf w}(t=\infty) = {\bf Ww}(t=\infty). $$
$$
\begin{align}
W_{11}w_1(t=\infty) +W_{12}w_2(t=\infty) +W_{13}w_3(t=\infty)=&w_1(t=\infty) \nonumber \\
W_{21}w_1(t=\infty) + W_{22}w_2(t=\infty) + W_{23}w_3(t=\infty)=&w_2(t=\infty) \nonumber \\
W_{31}w_1(t=\infty) + W_{32}w_2(t=\infty) + W_{33}w_3(t=\infty)=&w_3(t=\infty) \nonumber \\
\end{align}
$$
with the constraint that
$$
\sum_i w_i(t=\infty) = 1,
$$
yielding as solution
$$
\hat{w}(t=\infty)= \left(\begin{array}{c} 4/15\\
8/15 \\
3/15 \end{array} \right).
$$
\begin{tabular}{rlll}
\hline
Iteration & $w_1$ & $w_2$ & $w_3$\\
\hline
0 & 1.00000 & 0.00000 & 0.00000 \\
1 & 0.25000 & 0.75000 & 0.00000 \\
2 & 0.15625 & 0.65625 & 0.18750 \\
3 & 0.24609 & 0.52734 & 0.22656 \\
4 & 0.27848 & 0.51416 & 0.20736 \\
5 & 0.27213 & 0.53021 & 0.19766 \\
6 & 0.26608 & 0.53548 & 0.19844 \\
7 & 0.26575 & 0.53424 & 0.20002 \\
8 & 0.26656 & 0.53321 & 0.20023 \\
9 & 0.26678 & 0.53318 & 0.20005 \\
10 & 0.26671 & 0.53332 & 0.19998 \\
11 & 0.26666 & 0.53335 & 0.20000 \\
12 & 0.26666 & 0.53334 & 0.20000 \\
13 & 0.26667 & 0.53333 & 0.20000 \\
\( \hat{w}(t=\infty) \) & 0.26667 & 0.53333 & 0.20000 \\
\hline
\end{tabular}
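The iterations in the table are easy to reproduce with a few lines of code; the following sketch performs the matrix-vector multiplications \( \hat{w}(t+1)=\hat{W}\hat{w}(t) \) for the matrix above and converges to the steady state \( (4/15, 8/15, 3/15) \).
// Sketch: brute force iteration of w(t+1) = W*w(t) for the 3x3 transition matrix
#include <iostream>
int main()
{
  double W[3][3] = { {1./4., 1./8., 2./3.},
                     {3./4., 5./8., 0.   },
                     {0.,    1./4., 1./3.} };
  double w[3] = {1., 0., 0.};               // initial state
  for (int t = 1; t <= 13; t++) {
    double wnew[3] = {0., 0., 0.};
    for (int i = 0; i < 3; i++)
      for (int j = 0; j < 3; j++)
        wnew[i] += W[i][j]*w[j];
    for (int i = 0; i < 3; i++) w[i] = wnew[i];
    std::cout << t << "  " << w[0] << "  " << w[1] << "  " << w[2] << std::endl;
  }
  return 0;
}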
{\bf In a Markov chain Monte Carlo \( w \) is normally given, we need to find \( W \)!}
$$
{\bf \hat{w}}(t) = {\bf \hat{W}^t\hat{w}}(0),
$$
with \( {\bf \hat{w}}(0) \) the distribution at \( t=0 \) and \( {\bf \hat{W}} \) representing the
transition probability matrix.
We can always expand \( {\bf \hat{w}}(0) \) in terms of the right eigenvectors
\( {\bf \hat{v}} \) of \( {\bf \hat{W}} \) as
$$
{\bf \hat{w}}(0) = \sum_i\alpha_i{\bf \hat{v}}_i,
$$
resulting in
$$
{\bf \hat{w}}(t) = {\bf \hat{W}}^t{\bf \hat{w}}(0)={\bf \hat{W}}^t\sum_i\alpha_i{\bf \hat{v}}_i=
\sum_i\lambda_i^t\alpha_i{\bf \hat{v}}_i,
$$
with \( \lambda_i \) the \( i^{\mbox{th}} \) eigenvalue corresponding to
the eigenvector \( {\bf \hat{v}}_i \).
In our discussion below in connection with the entropy of a system and later statistical physics and quantum physics applications, we will relate these properties to correlation functions such as the time-correlation function.
That will allow us to define the so-called equilibration time, viz. the time needed for the system to reach its most likely state. From that state on we can compute contributions to various statistical variables.
We can relate this property to an observable like the mean magnetization of say a magnetic material. With the probabilty \( {\bf \hat{w}}(t) \) we can write the mean magnetization as
$$
\langle {\cal M}(t) \rangle = \sum_{\mu} {\bf \hat{w}}(t)_{\mu}{\cal M}_{\mu},
$$
or as the scalar of a vector product
$$
\langle {\cal M}(t) \rangle = {\bf \hat{w}}(t){\bf m},
$$
with \( {\bf m} \) being the vector whose elements are the values of \( {\cal M}_{\mu} \) in its
various microstates \( \mu \).
Recall our definition of an expectation value with a discrete PDF \( p(x_i) \):
$$
E[x^k]= \langle x^k\rangle=\frac{1}{N}\sum_{i=1}^{N}x_i^kp(x_i),
$$
provided that the sum $ \sum_{i=1}^{N}x_i^kp(x_i)$ converges absolutely (viz.,
$ \sum_{i=1}^{N}|x_i^kp(x_i)|$ converges).
$$
\langle {\cal M}(t) \rangle = {\bf \hat{w}}(t){\bf m}=\sum_i\lambda_i^t\alpha_i{\bf \hat{v}}_i{\bf m}.
$$
If we define \( m_i={\bf \hat{v}}_i{\bf m} \) as the expectation value of
\( {\cal M} \) in the \( i^{\mbox{th}} \) eigenstate we can rewrite the last equation as
$$
\langle {\cal M}(t) \rangle = \sum_i\lambda_i^t\alpha_im_i.
$$
Since we have that in the limit \( t\rightarrow \infty \) the mean magnetization is dominated by the
largest eigenvalue \( \lambda_0 \), we can rewrite the last equation as
$$
\langle {\cal M}(t) \rangle = \langle {\cal M}(\infty) \rangle+\sum_{i\ne 0}\lambda_i^t\alpha_im_i.
$$
$$
\tau_i=-\frac{1}{\ln\lambda_i},
$$
and rewrite the last expectation value as
$$
\langle {\cal M}(t) \rangle = \langle {\cal M}(\infty) \rangle+\sum_{i\ne 0}\alpha_im_ie^{-t/\tau_i}.
$$
The quantities \( \tau_i \) are the correlation times for the system. They control also the time-correlation functions.
The longest correlation time, \( \tau_1 \), corresponds to the second largest eigenvalue \( \lambda_1 \) and normally defines the correlation time discussed above. For large times, this is the only correlation time that survives. If higher eigenvalues of the transition matrix are well separated from \( \lambda_1 \) and we simulate long enough, \( \tau_1 \) may well define the correlation time. In other cases we may not be able to extract a reliable result for \( \tau_1 \).
$$
S = -\sum_i w_i\ln{(w_i)},
$$
where \( w_i \) is the probability of finding our system in a state \( i \). For our one-dimensional random walk
it represents the probability for being at position \( x_i=i\Delta x \) after a given number of time steps.
Assume now that we have \( N \) random walkers at
\( i=0 \) and \( t=0 \) and let these random walkers diffuse as function of time.
// loop over all time steps
for (int step=1; step <= time_steps; step++){
  // move all walkers with periodic boundary conditions
  for (int walks = 1; walks <= walkers; walks++){
    if (ran0(&idum) <= move_probability) {
      if ( x[walks] +1 > length) {
        x[walks] = -length;
      }
      else{
        x[walks] += 1;
      }
    }
    else {
      if ( x[walks] -1 < -length) {
        x[walks] = length;
      }
      else{
        x[walks] -= 1;
      }
    }
  } // end of loop over walks
} // end of loop over trials
// at the final time step we compute the probability
// by counting the number of walkers at every position
for ( int i = -length; i <= length; i++){
  int count = 0;
  for( int j = 1; j <= walkers; j++){
    if ( x[j] == i ) {
      count += 1;
    }
  }
  probability[i+length] = count;
}
// Writes the results to screen
void output(int length, int time_steps, int walkers, int *probability)
{
double entropy, histogram;
// find norm of probability
double norm = 1.0/walkers;
// compute the entropy
entropy = 0.; histogram = 0.;
for( int i = -length; i <= length; i++){
  histogram = (double) probability[i+length]*norm;
  if ( histogram > 0.0) {
    entropy -= histogram*log(histogram);
  }
} // end of loop over positions
// then write entropy to file
Markov process with transition probability from a state \( j \) to another state \( i \)
$$ \sum_i W(j\rightarrow i) = 1 $$
Note that the probability for remaining at the same place is not necessarily equal zero.
PDF \( w_i \) at time \( t=n\epsilon \)
$$ w_i(t) = \sum_j W(j\rightarrow i)^nw_j(t=0) $$
$$ \sum_i w_i(t) = 1 $$
$$ \sum_i W(j\rightarrow i)w_j= \sum_i W(i\rightarrow j)w_i $$
Ensures that it is the correct distribution which is achieved when equilibrium
is reached.
When a Markov process reaches equilibrium we have
$$ {\bf w}(t=\infty) = {\bf Ww}(t=\infty) $$
General condition at equilibrium
$$ W(j\rightarrow i)w_j= W(i\rightarrow j)w_i $$
which is the detailed balance condition. Proof is simple.
$$
\frac{d w_i(t)}{dt} = \sum_j\left[ W(j\rightarrow i)w_j-W(i\rightarrow j)w_i\right],
$$
which simply states that the rate at which the systems moves from a state \( j \)
to a final state \( i \) (the first term on the right-hand side of the last equation) is balanced by the rate at which the systems undergoes transitions from the state \( i \) to a state \( j \) (the second term). If we have reached the so-called steady state, then the temporal dependence is zero. This means that in equilibrium we have
$$
\frac{d w_i(t)}{dt} = 0.
$$
Any state in a Boltzmann distribution has a probability different from zero and if such a state cannot be reached from a given starting point, then the system is not ergodic.
$$ \frac{W(j\rightarrow i)}{W(i\rightarrow j)}=\frac{w_i}{w_j} $$
Boltzmann distribution
$$ \frac{w_i}{w_j}= \exp{(-\beta(E_i-E_j))} $$
$$ W(i\rightarrow j)=g(i\rightarrow j)A(i\rightarrow j) $$
where \( g \) is a selection probability while \( A \) is the probability for accepting a
move. It is also called the acceptance ratio.
With detailed balance this gives
$$ \frac{g(j\rightarrow i)A(j\rightarrow i)}{g(i\rightarrow j)A(i\rightarrow j)}= \exp{(-\beta(E_i-E_j))} $$
$$ A(j\rightarrow i)=\left\{\begin{array}{cc} \exp{(-\beta(E_i-E_j))} & E_i-E_j > 0 \\ 1 & else \end{array} \right. $$
This algorithm satisfies the condition for detailed balance and ergodicity.
$$
P(\beta)=\frac{e^{-\beta E}}{Z},
$$
with \( \beta=1/kT \) being the inverse temperature, \( E \) is the energy of
the system and
\( Z \) is the partition function. The only functions you will need are those
to generate random numbers.
We are going to study one single particle in equilibrium with its surroundings, the latter modeled via a large heat bath with temperature \( T \).
The model used to describe this particle is that of an ideal gas in {\bf one} dimension and with velocity \( -v \) or \( v \). We are interested in finding \( P(v)dv \), which expresses the probability for finding the system with a given velocity \( v\in [v,v+dv] \). The energy for this one-dimensional system is
$$
E=\frac{1}{2}kT=\frac{1}{2}v^2,
$$
with mass \( m=1 \).
$$
\begin{equation}
Z = \int_{-\infty}^{+\infty} e^{-\beta v^2 /2} dv = \sqrt{2\pi} \beta^{-1/2} \nonumber
\end{equation}
$$
The mean velocity
$$
\begin{equation}
\langle v \rangle = \frac{1}{Z}\int_{-\infty}^{+\infty} v e^{-\beta v^2 /2} dv = 0 \nonumber
\end{equation}
$$
The expressions for \( \langle E \rangle \) and \( \sigma^2_E \) assume the following form:
$$
\begin{equation}
\langle E \rangle = \frac{1}{Z}\int_{-\infty}^{+\infty} \frac{v^2}{2}\, e^{-\beta v^2 /2} dv = -\frac{1}{Z} \frac{\partial Z}{\partial \beta} =
\frac{1}{2} \beta^{-1} = \frac{1}{2} T \nonumber
\end{equation}
$$
$$
\begin{equation}
\langle E^2 \rangle = \frac{1}{Z}\int_{-\infty}^{+\infty} \frac{v^4}{4}\, e^{-\beta v^2 /2} dv = \frac{1}{Z} \frac{\partial^2 Z}{\partial \beta^2} =
\frac{3}{4} \beta^{-2} = \frac{3}{4} T^2 \nonumber
\end{equation}
$$
and
$$
\begin{equation}
\sigma^2_E = \langle E^2 \rangle - \langle E \rangle^2 = \frac{1}{2} T^2 \nonumber
\end{equation}
$$
for ( montecarlo_cycles = 1; montecarlo_cycles <= Max_cycles; montecarlo_cycles++) {
  ...
  // change speed as function of delta v
  v_change = (2*ran1(&idum) - 1)*delta_v;
  v_new = v_old + v_change;
  // energy change
  delta_E = 0.5*(v_new*v_new - v_old*v_old) ;
  ......
  // Metropolis algorithm begins here
  if ( ran1(&idum) <= exp(-beta*delta_E) ) {
    accept_step = accept_step + 1 ;
    v_old = v_new ;
  }
  // thereafter we must fill in P[N] as a function of
  // the new speed
  // upgrade mean velocity, energy and variance
}
v_current = v0;
// start simulation
ofile.open("evsmc.dat");
for (tries = 1; tries <= MC; tries++){
  v_change = (2.*ran0(&idum) - 1.) * dv;
  v_trial = v_current + v_change;
  // evaluate dE
  delta_E = 0.5 * ( v_trial * v_trial - v_current * v_current );
  // Metropolis test
  if (delta_E <= 0) {
    acceptance++; v_current = v_trial;
  }
  else if (ran0(&idum) <= exp( -beta * delta_E )){
    acceptance++; v_current = v_trial;
  }
  // check if velocity value lies within given limits
  if (fabs(v_current) > v_max) {
    cout << "Velocity out of range."; exit(1);
  }
  // save event in P array
  address = (int) floor( v_current / dv ) + N/2 + 1;
  P[address]++;
  // update mean velocity, mean energy and energy variance values
  mean_v += v_current;
  mean_E += 0.5 * v_current * v_current;
  E_variance += 0.25 * v_current * v_current * v_current * v_current;
} // end of loop over Monte Carlo trials
// initialize model parameters
beta = 1./T; v_max = 10. * sqrt (T);
// calculate amount of P-array elements
N = 2 * (int)(v_max/dv) + 1;
// initialize P-array
P = new int [N];
for (int i=0; i < N; i++) P[i] = 0;
mean_v = 0.; mean_E = 0.; E_variance = 0.;
acceptance = 0;
}// initialize
$$
-\frac{\hbar^2}{2m}\nabla^2\Psi(x,t)+
V(x,t)\Psi(x,t)=
\imath\hbar\frac{\partial \Psi(x,t)}{\partial t},
$$
$$
P(x,t)=\Psi(x,t)^*\Psi(x,t)
$$
$$
P(x,t)dx=\Psi(x,t)^*\Psi(x,t)dx
$$
Interpretation: probability of finding the system in a region between
\( x \) and \( x+dx \).
The probability density is always real. Writing
$$
\Psi(x,t)=R(x,t)+\imath I(x,t)
$$
yields
$$
\Psi(x,t)^*\Psi(x,t)=(R-\imath I)(R+\imath I)=R^2+I^2
$$
Variational Monte Carlo uses only \( P(x,t) \)!!
Choose \( \tau = it/\hbar \).
The time-dependent (1-dim) Schr\"odinger equation then becomes
$$
\frac{\partial \Psi(x,\tau)}{\partial \tau} =
\frac{\hbar^2}{2m} \frac{\partial^2\Psi(x,\tau)}{\partial x^2}
-V(x,\tau)\Psi(x,\tau).
$$
With \( V=0 \) we have a diffusion equation in complex time with
diffusion constant
$$
D= \frac{\hbar^2}{2m}.
$$
Used in diffusion Monte Carlo calculations. Topic for FYS4411, Computational Physics II
Normalization
$$ \int_{-\infty}^{\infty}P(x,t)dx=
\int_{-\infty}^{\infty}\Psi(x,t)^*\Psi(x,t)dx=1 $$
which requires a square-integrable wave function,
$$ \int_{-\infty}^{\infty}\Psi(x,t)^*\Psi(x,t)dx < \infty $$
The quantum mechanical operator corresponding to a classical quantity \( A \) is obtained by the substitution
$$
\OP{A} = A(\vec{r},-i\hbar \vec{\bigtriangledown}).
$$
\begin{tabular}{|l|l|l|} \hline
Quantity & Classical definition & QM operator\\
\hline
Position & \( \vec{r} \) & $\OP{\vec{r}} = \vec{r}$\\
Momentum & \( \vec{p} \)
& $\OP{\vec{p}} = -i \hbar \vec{\bigtriangledown}$\\
Orbital momentum & \( \vec{L} = \vec{r} \times \vec{p} \)
& $\OP{\vec{L}} = \vec{r} \times (-i\hbar \vec{\bigtriangledown})$\\
Kinetic energy & \( T = (\vec{p})^2 / 2 m \)
& $\OP{T} = - (\hbar^2 / 2 m) (\vec{\bigtriangledown})^2$\\
Total energy & \( H = (p^2 / 2 m) + V(\vec{r}) \)
& $\OP{H} = - ( \hbar^2 / 2 m )(\vec{\bigtriangledown})^2
+ V(\vec{r})$\\
\hline
\end{tabular}
$$
\OP{A} \psi_{\nu}
= a_{\nu} \psi_{\nu},
$$
resulting in the eigenvalues $ a_1, a_2, a_3,\cdots$ as the only outcomes of a measurement. The corresponding eigenstates $ \psi_1, \psi_2, \psi_3 \cdots$ contain all relevant information about the system.
$$
\Phi = c_1 \psi_1 + c_2 \psi_2 + \cdots
= \sum_{\nu} c_{\nu} \psi_{\nu}.
$$
The eigenfunctions are orthogonal and we get
$$
c_{\nu} = \int (\Phi)^{\ast} \psi_{\nu} d\tau.
$$
From this we can formulate the third postulate: {\sl When the wave function of the system is \( \Phi \), the probability of obtaining the value \( a_{\nu} \) as the outcome of a measurement of the physical quantity \( A \) is given by \( |c_{\nu}|^2 \), where \( \psi_{\nu} \) is an eigenfunction of \( \OP{A} \) with eigenvalue \( a_{\nu} \).}
$$
\langle A \rangle
= \int (\Phi)^{\ast} \OP{A}(\vec{r}, -i \hbar\vec{\bigtriangledown})
\Phi d\tau.
$$
We have assumed that
\( \Phi \) has been normalized, viz., \( \int (\Phi)^{\ast} \Phi d\tau = 1 \).
Otherwise
$$
\langle A \rangle = \frac{\int (\Phi)^{\ast} \OP{A} \Phi d\tau}
{\int (\Phi)^{\ast} \Phi d\tau}.
$$
$$
i \hbar \frac{\partial \Psi}{\partial t} = \OP{H} \Psi,
$$
with \( \OP{H} \) the quantal Hamiltonian operator for the system.
$$
\langle H \rangle =
\frac{\int d{\bf R}_1d{\bf R}_2\dots d{\bf R}_N
\Psi^{\ast}({\bf R}_1,{\bf R}_2,\dots,{\bf R}_N)
H({\bf R}_1,{\bf R}_2,\dots,{\bf R}_N)
\Psi({\bf R}_1,{\bf R}_2,\dots,{\bf R}_N)}
{\int d{\bf R}_1d{\bf R}_2\dots d{\bf R}_N
\Psi^{\ast}({\bf R}_1,{\bf R}_2,\dots,{\bf R}_N)
\Psi({\bf R}_1,{\bf R}_2,\dots,{\bf R}_N)},
$$
which is, in general, an intractable problem.
$$
E[H]= \langle H \rangle =
\frac{\int d{\bf R}\Psi^{\ast}_T({\bf R})H({\bf R})\Psi_T({\bf R})}
{\int d{\bf R}\Psi^{\ast}_T({\bf R})\Psi_T({\bf R})},
$$
is an upper bound to the ground state energy \( E_0 \) of the hamiltonian \( H \), that
is
$$
E_0 \le \langle H \rangle .
$$
In general, the integrals involved in the calculation of various expectation
values are multi-dimensional ones. Traditional integration methods
such as Gauss-Legendre quadrature will not be adequate for, say, the
computation of the energy of a many-body system.
$$
\Psi_T({\bf R})=\sum_i a_i\Psi_i({\bf R}),
$$
and assuming the set of eigenfunctions to be normalized one obtains
$$
\frac{\sum_{nm}a^*_ma_n \int d{\bf R}\Psi^{\ast}_m({\bf R})H({\bf R})\Psi_n({\bf R})}
{\sum_{nm}a^*_ma_n \int d{\bf R}\Psi^{\ast}_m({\bf R})\Psi_n({\bf R})} =\frac{\sum_{n}|a_n|^2 E_n}
{\sum_{n}|a_n|^2} \ge E_0,
$$
where we used that \( H({\bf R})\Psi_n({\bf R})=E_n\Psi_n({\bf R}) \).
In general, the integrals involved in the calculation of various expectation
values are multi-dimensional ones.
The variational principle yields the lowest state of a given symmetry.
Then we evaluate the expectation value of the hamiltonian \( H \)
$$
E[H]=\langle H \rangle =
\frac{\int d{\bf R}\Psi^{\ast}_{T_{\alpha}}({\bf R})H({\bf R})
\Psi_{T_{\alpha}}({\bf R})}
{\int d{\bf R}\Psi^{\ast}_{T_{\alpha}}({\bf R})\Psi_{T_{\alpha}}({\bf R})}.
$$
Thereafter we vary \( \alpha \) according to some minimization algorithm and return to the first step.
$$
P({\bf R})= \frac{\left|\psi_T({\bf R})\right|^2}{\int \left|\psi_T({\bf R})\right|^2d{\bf R}}.
$$
This is our new probability distribution function (PDF).
The approximation to the expectation value of the Hamiltonian is now
$$
E[H]\approx
\frac{\int d{\bf R}\Psi^{\ast}_T({\bf R})H({\bf R})\Psi_T({\bf R})}
{\int d{\bf R}\Psi^{\ast}_T({\bf R})\Psi_T({\bf R})}.
$$
Define a new quantity
$$
E_L({\bf R})=\frac{1}{\psi_T({\bf R})}H\psi_T({\bf R}),
\tag{45}
$$
called the local energy, which, together with our trial PDF yields
$$
E[H]=\langle H \rangle \approx \int P({\bf R})E_L({\bf R}) d{\bf R}\approx \frac{1}{N}\sum_{i=1}^NE_L({\bf R}_i),
\tag{46}
$$
with \( N \) being the number of Monte Carlo samples.
Observe that the jumping in space is governed by the variable \( step \). This is called brute-force sampling; we need importance sampling to obtain a more relevant sampling.
$$
-\frac{\hbar^2}{2m}\frac{\partial^2 u(r)}{\partial r^2}-
\left(\frac{ke^2}{r}-\frac{\hbar^2l(l+1)}{2mr^2}\right)u(r)=Eu(r),
$$
or with dimensionless variables
$$
-\frac{1}{2}\frac{\partial^2 u(\rho)}{\partial \rho^2}-
\frac{u(\rho)}{\rho}+\frac{l(l+1)}{2\rho^2}u(\rho)-\lambda u(\rho)=0,
\tag{47}
$$
with the hamiltonian
$$
H=-\frac{1}{2}\frac{\partial^2 }{\partial \rho^2}-
\frac{1}{\rho}+\frac{l(l+1)}{2\rho^2}.
$$
Use variational parameter \( \alpha \) in the trial
wave function
$$
u_T^{\alpha}(\rho)=\alpha\rho e^{-\alpha\rho}.
\tag{48}
$$
$$
E_L(\rho)=-\frac{1}{\rho}-
\frac{\alpha}{2}\left(\alpha-\frac{2}{\rho}\right).
$$
\begin{tabular}{rrrr}
\hline
$\alpha$ & $\langle H \rangle $ & $\sigma^2$ & $\sigma/\sqrt{N}$ \\
\hline
7.00000E-01 & -4.57759E-01 & 4.51201E-02 & 6.71715E-04 \\
8.00000E-01 & -4.81461E-01 & 3.05736E-02 & 5.52934E-04 \\
9.00000E-01 & -4.95899E-01 & 8.20497E-03 & 2.86443E-04 \\
1.00000E-00 & -5.00000E-01 & 0.00000E+00 & 0.00000E+00 \\
1.10000E+00 & -4.93738E-01 & 1.16989E-02 & 3.42036E-04 \\
1.20000E+00 & -4.75563E-01 & 8.85899E-02 & 9.41222E-04 \\
1.30000E+00 & -4.54341E-01 & 1.45171E-01 & 1.20487E-03 \\
\hline
\end{tabular}
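A minimal sketch of how numbers like those in the table can be produced (this is not the course's full program; names, the step length and the number of cycles are illustrative choices): brute-force Metropolis sampling of \( |u_T^{\alpha}(\rho)|^2 \) combined with the analytical local energy \( E_L(\rho) \) given above.
// Brute-force VMC sketch for the trial function u_T(rho) = alpha*rho*exp(-alpha*rho),
// sampling P(rho) ~ |u_T(rho)|^2 and averaging E_L(rho) = -1/rho - 0.5*alpha*(alpha - 2/rho)
#include <cmath>
#include <cstdlib>
#include <iostream>
using namespace std;

double u_trial(double rho, double alpha) { return alpha*rho*exp(-alpha*rho); }
double local_energy(double rho, double alpha) { return -1.0/rho - 0.5*alpha*(alpha - 2.0/rho); }

int main()
{
  srand(12345);                       // simple seed; the course codes use ran0/ran1/ran2 instead
  const int mcs = 1000000;            // number of Monte Carlo samples
  const double step = 1.0;            // brute-force step length
  for (int ia = 0; ia <= 6; ia++) {
    double alpha = 0.7 + 0.1*ia;
    double rho = 1.0, e_sum = 0.0, e2_sum = 0.0;
    double w_old = u_trial(rho, alpha)*u_trial(rho, alpha);
    for (int cycle = 1; cycle <= mcs; cycle++) {
      double rho_new = rho + step*(2.0*rand()/RAND_MAX - 1.0);
      if (rho_new > 0.0) {            // the radial coordinate must stay positive
        double w_new = u_trial(rho_new, alpha)*u_trial(rho_new, alpha);
        if (1.0*rand()/RAND_MAX <= w_new/w_old) { rho = rho_new; w_old = w_new; }
      }
      double e = local_energy(rho, alpha);
      e_sum += e;  e2_sum += e*e;
    }
    double e_mean = e_sum/mcs;
    double variance = e2_sum/mcs - e_mean*e_mean;
    cout << "alpha = " << alpha << "  <H> = " << e_mean
         << "  sigma^2 = " << variance << endl;
  }
  return 0;
}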
$$
H\psi = \mbox{constant}\times \psi,
$$
yields just a constant. The integral which defines various
expectation values involving moments of the hamiltonian becomes then
$$
\langle H^n \rangle =
\frac{\int d{\bf R}\Psi^{\ast}_T({\bf R})H^n({\bf R})\Psi_T({\bf R})}
{\int d{\bf R}\Psi^{\ast}_T({\bf R})\Psi_T({\bf R})}=
\mbox{constant}\times\frac{\int d{\bf R}\Psi^{\ast}_T({\bf R})\Psi_T({\bf R})}
{\int d{\bf R}\Psi^{\ast}_T({\bf R})\Psi_T({\bf R})}=
\mbox{constant}.
$$
{\bf This gives us important information: the exact wave function leads to zero variance!}
Variation is then performed by minimizing both the energy and the variance.
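As a simple check of this statement, insert the exact choice \( \alpha = 1 \) in the local energy for the hydrogen-like trial function above,
$$
E_L(\rho)=-\frac{1}{\rho}-\frac{1}{2}\left(1-\frac{2}{\rho}\right)=-\frac{1}{2},
$$
which is constant for all \( \rho \); this is precisely the zero variance seen in the table for \( \alpha=1 \).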
$$
-\frac{2ke^2}{r_1}-\frac{2ke^2}{r_2},
$$
and if we add the repulsion arising from the two
interacting electrons, we obtain the potential energy
$$
V(r_1, r_2)=-\frac{2ke^2}{r_1}-\frac{2ke^2}{r_2}+
\frac{ke^2}{r_{12}},
$$
with the electrons separated at a distance
\( r_{12}=|{\bf r}_1-{\bf r}_2| \).
$$
\OP{H}=-\frac{\hbar^2\nabla_1^2}{2m}-\frac{\hbar^2\nabla_2^2}{2m}
-\frac{2ke^2}{r_1}-\frac{2ke^2}{r_2}+
\frac{ke^2}{r_{12}},
$$
and Schr\"odinger's equation reads
$$
\OP{H}\psi=E\psi.
$$
All observables are evaluated with respect to the probability distribution
$$
P({\bf R})= \frac{\left|\psi_T({\bf R})\right|^2}{\int \left|\psi_T({\bf R})\right|^2d{\bf R}}.
$$
generated by the trial wave function.
The trial wave function must approximate an exact
eigenstate in order to obtain accurate results.
Improved trial
wave functions also improve the importance sampling,
reducing the cost of obtaining a certain statistical accuracy.
$$
E_L({\bf R})=\frac{1}{\psi_T({\bf R})}H\psi_T({\bf R})=
\frac{1}{\psi_T({\bf R})}\left(-\frac{1}{2}\nabla^2_1
-\frac{Z}{r_1}\right)\psi_T({\bf R}) + \mbox{finite terms}.
$$
$$
E_L(R)=
\frac{1}{{\cal R}_T(r_1)}\left(-\frac{1}{2}\frac{d^2}{dr_1^2}-
\frac{1}{r_1}\frac{d}{dr_1}
-\frac{Z}{r_1}\right){\cal R}_T(r_1) + \mbox{finite terms}
$$
For small values of \( r_1 \), the terms which dominate are
$$
\lim_{r_1 \rightarrow 0}E_L(R)=
\frac{1}{{\cal R}_T(r_1)}\left(-
\frac{1}{r_1}\frac{d}{dr_1}
-\frac{Z}{r_1}\right){\cal R}_T(r_1),
$$
since the second derivative does not diverge due to the finiteness of
\( \Psi \) at the origin.
$$
\frac{1}{{\cal R}_T(r_1)}\frac{d {\cal R}_T(r_1)}{dr_1}=-Z,
$$
and
$$
{\cal R}_T(r_1)\propto e^{-Zr_1}.
$$
A similar condition applies to electron 2 as well.
For orbital momenta \( l > 0 \) we have
$$
\frac{1}{{\cal R}_T(r)}\frac{d {\cal R}_T(r)}{dr}=-\frac{Z}{l+1}.
$$
Similarly, studying the case \( r_{12}\rightarrow 0 \) we can write
a possible trial wave function as
$$
\psi_T({\bf R})=e^{-\alpha(r_1+r_2)}e^{\beta r_{12}}.
\tag{49}
$$
The last equation can be generalized to
$$
\psi_T({\bf R})=\phi({\bf r}_1)\phi({\bf r}_2)\dots\phi({\bf r}_N)
\prod_{i < j}f(r_{ij}),
$$
for a system with \( N \) electrons or particles.
vmc_para.cpp
// Here we define global variables used in various functions
// These can be changed by reading from file the different parameters
int dimension = 3; // three-dimensional system
int charge = 2; // we fix the charge to be that of the helium atom
int my_rank, numprocs; // these are the parameters used by MPI to
// define which node and how many
double step_length = 1.0; // we fix the brute force jump to 1 Bohr radius
int number_particles = 2; // we fix also the number of electrons to be 2
vmc_para.cpp, main part
// MPI initializations
MPI_Init (&argc, &argv);
MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
time_start = MPI_Wtime();
if (my_rank == 0 && argc <= 2) {
cout << "Bad Usage: " << argv[0] <<
" read also output file on same line" << endl;
exit(1);
if (my_rank == 0 && argc > 2) {
outfilename=argv[1];
ofile.open(outfilename);
vmc_para.cpp, main part
// Setting output file name for this rank:
ostringstream ost;
ost << "blocks_rank" << my_rank << ".dat";
// Open file for writing:
blockofile.open(ost.str().c_str(), ios::out | ios::binary);
total_cumulative_e = new double[max_variations+1];
total_cumulative_e2 = new double[max_variations+1];
cumulative_e = new double[max_variations+1];
cumulative_e2 = new double[max_variations+1];
// initialize the arrays by zeroing them
for( i=1; i <= max_variations; i++){
cumulative_e[i] = cumulative_e2[i] = 0.0;
total_cumulative_e[i] = total_cumulative_e2[i] = 0.0;
vmc_para.cpp, main part
// broadcast the total number of variations
MPI_Bcast (&max_variations, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast (&number_cycles, 1, MPI_INT, 0, MPI_COMM_WORLD);
total_number_cycles = number_cycles*numprocs;
// array to store all energies for last variation of alpha
all_energies = new double[number_cycles+1];
// Do the mc sampling and accumulate data with MPI_Reduce
mc_sampling(max_variations, number_cycles, cumulative_e,
cumulative_e2, all_energies);
// Collect data in total averages
for( i=1; i <= max_variations; i++){
MPI_Reduce(&cumulative_e[i], &total_cumulative_e[i], 1, MPI_DOUBLE,
MPI_SUM, 0, MPI_COMM_WORLD);
MPI_Reduce(&cumulative_e2[i], &total_cumulative_e2[i], 1, MPI_DOUBLE,
MPI_SUM, 0, MPI_COMM_WORLD);
vmc_para.cpp, main part
blockofile.write((char*)(all_energies+1),
number_cycles*sizeof(double));
blockofile.close();
delete [] total_cumulative_e; delete [] total_cumulative_e2;
delete [] cumulative_e; delete [] cumulative_e2; delete [] all_energies;
// End MPI
MPI_Finalize ();
return 0;
} // end of main function
vmc_para.cpp, sampling
alpha = 0.5*charge;
// every node has its own seed for the random numbers
idum = -1-my_rank;
// allocate matrices which contain the position of the particles
r_old =(double **)matrix(number_particles,dimension,sizeof(double));
r_new =(double **)matrix(number_particles,dimension,sizeof(double));
for (i = 0; i < number_particles; i++) {
for ( j=0; j < dimension; j++) {
r_old[i][j] = r_new[i][j] = 0;
// loop over variational parameters
vmc_para.cpp, sampling
for (variate=1; variate <= max_variations; variate++){
// initialisations of variational parameters and energies
alpha += 0.1;
energy = energy2 = 0; accept =0; delta_e=0;
// initial trial position, note calling with alpha
for (i = 0; i < number_particles; i++) {
for ( j=0; j < dimension; j++) {
r_old[i][j] = step_length*(ran2(&idum)-0.5);
wfold = wave_function(r_old, alpha);
vmc_para.cpp, sampling
// loop over monte carlo cycles
for (cycles = 1; cycles <= number_cycles; cycles++){
// new position
for (i = 0; i < number_particles; i++) {
for ( j=0; j < dimension; j++) {
r_new[i][j] = r_old[i][j]+step_length*(ran2(&idum)-0.5);
// for the other particles we need to set the position to the old position since
// we move only one particle at the time
for (k = 0; k < number_particles; k++) {
if ( k != i) {
for ( j=0; j < dimension; j++) {
r_new[k][j] = r_old[k][j];
}
}
vmc_para.cpp, sampling
wfnew = wave_function(r_new, alpha);
// The Metropolis test is performed by moving one particle at the time
if(ran2(&idum) <= wfnew*wfnew/wfold/wfold ) {
for ( j=0; j < dimension; j++) {
r_old[i][j]=r_new[i][j];
}
wfold = wfnew;
}
} // end of loop over particles
vmc_para.cpp, sampling
// compute local energy
delta_e = local_energy(r_old, alpha, wfold);
// save all energies on last variate
if(variate==max_variations){
all_energies[cycles] = delta_e;
// update energies
energy += delta_e;
energy2 += delta_e*delta_e;
} // end of loop over MC trials
// update the energy average and its squared
cumulative_e[variate] = energy;
cumulative_e2[variate] = energy2;
} // end of loop over variational steps
vmc_para.cpp, wave function
// Function to compute the trial wave function, simplest form
double wave_function(double **r, double alpha)
{
int i, j, k;
double wf, argument, r_single_particle, r_12;
argument = wf = 0;
for (i = 0; i < number_particles; i++) {
r_single_particle = 0;
for (j = 0; j < dimension; j++) {
r_single_particle += r[i][j]*r[i][j];
argument += sqrt(r_single_particle);
wf = exp(-argument*alpha) ;
return wf;
vmc_para.cpp, local energy
// Function to calculate the local energy with a numerical derivative
double local_energy(double **r, double alpha, double wfold)
{
int i, j , k;
double e_local, wfminus, wfplus, e_kinetic, e_potential, r_12,
r_single_particle;
double **r_plus, **r_minus;
vmc_para.cpp, local energy
// allocate matrices which contain the position of the particles
// the function matrix is defined in the program library
r_plus =(double **)matrix(number_particles,dimension,sizeof(double));
r_minus =(double **)matrix(number_particles,dimension,sizeof(double));
for (i = 0; i < number_particles; i++) {
for ( j=0; j < dimension; j++) {
r_plus[i][j] = r_minus[i][j] = r[i][j];
vmc_para.cpp, local energy
// compute the kinetic energy
e_kinetic = 0;
for (i = 0; i < number_particles; i++) {
for (j = 0; j < dimension; j++) {
r_plus[i][j] = r[i][j]+h;
r_minus[i][j] = r[i][j]-h;
wfminus = wave_function(r_minus, alpha);
wfplus = wave_function(r_plus, alpha);
e_kinetic -= (wfminus+wfplus-2*wfold);
r_plus[i][j] = r[i][j];
r_minus[i][j] = r[i][j];
// include electron mass and hbar squared and divide by wave function
e_kinetic = 0.5*h2*e_kinetic/wfold;
vmc_para.cpp, local energy
// compute the potential energy
e_potential = 0;
// contribution from electron-proton potential
for (i = 0; i < number_particles; i++) {
r_single_particle = 0;
for (j = 0; j < dimension; j++) {
r_single_particle += r[i][j]*r[i][j];
e_potential -= charge/sqrt(r_single_particle);
vmc_para.cpp, local energy
// contribution from electron-electron potential
for (i = 0; i < number_particles-1; i++) {
for (j = i+1; j < number_particles; j++) {
r_12 = 0;
for (k = 0; k < dimension; k++) {
r_12 += (r[i][k]-r[j][k])*(r[i][k]-r[j][k]);
e_potential += 1/sqrt(r_12);
$$
\Psi_T(\mathbf{r}_1,\mathbf{r}_2) = e^{-\alpha(r_1+r_2)}
$$
The local energy is for this case
$$
E_{L1} = \left(\alpha-Z\right)\left(\frac{1}{r_1}+\frac{1}{r_2}\right)+\frac{1}{r_{12}}-\alpha^2
$$
which gives an expectation value for the local energy given by
$$
\langle E_{L1} \rangle = \alpha^2-2\alpha\left(Z-\frac{5}{16}\right)
$$
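Minimizing this expectation value with respect to \( \alpha \) gives the optimal variational parameter,
$$
\frac{d\langle E_{L1}\rangle}{d\alpha}=2\alpha-2\left(Z-\frac{5}{16}\right)=0
\quad\Longrightarrow\quad \alpha=Z-\frac{5}{16},
$$
with \( \langle E_{L1}\rangle_{\mathrm{min}}=-\left(Z-\frac{5}{16}\right)^2\approx -2.85 \) atomic units for helium (\( Z=2 \)), to be compared with the experimental ground state energy of about \( -2.904 \) atomic units.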
$$
\Psi_C= \exp{\left\{\sum_{i < j}\frac{ar_{ij}}{1+\beta r_{ij}}\right\}},
$$
which means that the gradient needed for the local energy
can be calculated analytically.
This will speed up your code, since the computations of the correlation part and of the Slater determinant are the most
time-consuming parts.
We will refer to this correlation function as \( \Psi_C \) or the linear Pad\'e-Jastrow.
$$
\psi_{T}({\bf r}_1,{\bf r}_2) =
\exp{\left(-\alpha(r_1+r_2)\right)}
\exp{\left(\frac{r_{12}}{2(1+\beta r_{12})}\right)},
$$
with \( \alpha \) and \( \beta \) as variational parameters.
The local energy is for this case
$$
E_{L2} = E_{L1}+\frac{1}{2(1+\beta r_{12})^2}\left\{\frac{\alpha(r_1+r_2)}{r_{12}}\left(1-\frac{\mathbf{r}_1\cdot\mathbf{r}_2}{r_1r_2}\right)-\frac{1}{2(1+\beta r_{12})^2}-\frac{2}{r_{12}}+\frac{2\beta}{1+\beta r_{12}}\right\}
$$
Getting a closed form expression for the local energy means that you don't need to
compute derivatives numerically.
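As a simple illustration (not one of the course programs), the closed-form expression for \( E_{L2} \) could be transcribed directly into a small function; the argument list and variable names below are illustrative assumptions.
// Analytical local energy for the trial wave function psi_T above.
// r1 and r2 hold the Cartesian coordinates of the two electrons (length dim).
#include <cmath>

double local_energy_analytic(const double *r1, const double *r2, int dim,
                             double alpha, double beta, double Z)
{
  double r1n = 0, r2n = 0, r12 = 0, dot = 0;
  for (int k = 0; k < dim; k++) {
    r1n += r1[k]*r1[k];
    r2n += r2[k]*r2[k];
    r12 += (r1[k]-r2[k])*(r1[k]-r2[k]);
    dot += r1[k]*r2[k];
  }
  r1n = sqrt(r1n); r2n = sqrt(r2n); r12 = sqrt(r12);
  // E_L1 = (alpha - Z)(1/r1 + 1/r2) + 1/r12 - alpha^2
  double e_l1 = (alpha - Z)*(1.0/r1n + 1.0/r2n) + 1.0/r12 - alpha*alpha;
  // correction from the Pade-Jastrow factor, cf. the expression for E_L2
  double denom = 1.0 + beta*r12;
  return e_l1 + 1.0/(2.0*denom*denom)*
         ( alpha*(r1n + r2n)/r12*(1.0 - dot/(r1n*r2n))
           - 1.0/(2.0*denom*denom) - 2.0/r12 + 2.0*beta/denom );
}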
You don't need to answer all questions in chronological order. When you write the introduction you could focus on the following aspects.
Go to http://www.uio.no/studier/emner/matnat/fys/FYS3150/h13/ and click on syllabus.
For the written exam, you can bring with you 2 pages (written on both sides) with own notes. See also the exams from last years.
sudo apt-get install git-core
This applies to Dropbox as well.
git init
This creates a subdirectory .git that contains all of your necessary repository files. At this point nothing in your project is tracked yet. To start tracking files, there are a few commands you need. As an example
git add *.cpp *.hpp
git add ANOTHERFILE
git commit -m 'This is first setup of my superduper project'
The command
git status
tells you about the modifications made. You can also use standard Unix commands like \( mv \), \( rm \), etc., through git, for example
git mv file_from file_to
git clone
If you are familiar with other VCS systems such as Subversion, you will notice that the command is clone and not checkout. This is an important distinction. Git receives a copy of nearly all data that the server has. Every version of every file for the history of the project is pulled down when you run git clone. In fact, if your server disk gets corrupted, you can use any of the clones on any client to set the server back to the state it was in when it was cloned.
c++ -c mycode.cpp
c++ -o mycode.exe mycode.o
This is what we call a flat compiler option and should be used when we develop the code. It normally produces a very large and slow code when translated to machine instructions. We use this option for debugging and for establishing the correct program output because every operation is done precisely as the user specified it.
It is instructive to look up the compiler manual for further instructions
man c++ > out_to_file
c++ -O3 -c mycode.cpp
c++ -O3 -o mycode.exe mycode.o
This is the recommended option. {\bf But you must check that you get the same results as previously}.
c++ -pg -O3 -c mycode.cpp
c++ -pg -O3 -o mycode.exe mycode.o
After you have run the code you can obtain the profiling information via
gprof mycode.exe > out_to_profile
When you have properly profiled your code, remove this option again, since it increases the CPU time.
Bad code
for i = 1:n
   a(i) = b(i) + c*d
   e = g(k)
end
Better code
temp = c*d
for i = 1:n
   a(i) = b(i) + temp
end
e = g(k)
#include <cstdlib>
#include <ctime>
#include <ios>
#include <iostream>
#include <armadillo>
using namespace std;
using namespace arma;
/*Because fortran files don't have any header files,
*we need to declare the functions ourself.*/
extern "C"
{
void dgemm_(char*, char*, int*, int*, int*, double*,
double*, int*, double*, int*, double*, double*, int*);
}
int main(int argc, char** argv)
{
//Dimensions
int n = atoi(argv[1]);
int m = n;
int p = m;
/*Create random matrices
* (note that older versions of armadillo uses "rand" instead of "randu") */
srand(time(NULL));
mat A(n, p);
A.randu();
// Pretty print, and pretty save, are as easy as the two following lines.
// cout << A << endl;
// A.save("A.mat", raw_ascii);
mat A_trans = trans(A);
mat B(p, m);
B.randu();
mat C(n, m);
// cout << B << endl;
// B.save("B.mat", raw_ascii);
// ARMADILLO TEST
cout << "Starting armadillo multiplication\n";
//Simple wall_clock timer is a part of armadillo.
wall_clock timer;
timer.tic();
C = A*B;
double num_sec = timer.toc();
cout << "-- Finished in " << num_sec << " seconds.\n\n";
C = zeros<mat> (n, m);
cout << "Starting blas multiplication.\n";
char trans = 'N';
double alpha = 1.0;
double beta = 0.0;
int _numRowA = A.n_rows;
int _numColA = A.n_cols;
int _numRowB = B.n_rows;
int _numColB = B.n_cols;
int _numRowC = C.n_rows;
int _numColC = C.n_cols;
int lda = (A.n_rows >= A.n_cols) ? A.n_rows : A.n_cols;
int ldb = (B.n_rows >= B.n_cols) ? B.n_rows : B.n_cols;
int ldc = (C.n_rows >= C.n_cols) ? C.n_rows : C.n_cols;
dgemm_(&trans, &trans, &_numRowA, &_numColB, &_numColA, &alpha,
A.memptr(), &lda, B.memptr(), &ldb, &beta, C.memptr(), &ldc);
$$
\begin{equation}S=k_{B}\ln{\Omega}\end{equation}
$$
$$
\begin{equation}dS=\frac{1}{T}dE+\frac{p}{T}dV-\frac{\mu}{T}dN\end{equation}
$$
Temperature
$$
\begin{equation}\frac{1}{k_{B}T}=\left(\frac{\partial \ln{\Omega}}{\partial E}\right)_{N, V}\end{equation}
$$
Pressure
$$
\begin{equation}\frac{p}{k_{B}T}=\left(\frac{\partial \ln{\Omega}}{\partial V}\right)_{N, E}\end{equation}
$$
Chemical potential
$$
\begin{equation}\frac{\mu}{k_{B}T}=-\left(\frac{\partial \ln{\Omega}}{\partial N}\right)_{V, E}\end{equation}
$$
$$
\begin{equation}F=-k_{B}T\ln{Z}\end{equation}
$$
$$
\begin{equation}dF=-SdT-pdV+\mu dN\end{equation}
$$
Entropy
$$
\begin{equation}S =k_{B}\ln{Z}
+k_{B}T\left(\frac{\partial \ln{Z}}{\partial T}\right)_{N, V}\end{equation}
$$
Pressure
$$
\begin{equation}p=k_{B}T\left(\frac{\partial \ln{Z}}{\partial V}\right)_{N, T}\end{equation}
$$
Chemical Potential
$$
\begin{equation}\mu =-k_{B}T\left(\frac{\partial \ln{Z}}{\partial N}\right)_{V, T}\end{equation}
$$
Energy (internal only)
$$
\begin{equation}E =k_{B}T^{2}\left(\frac{\partial \ln{Z}}{\partial T}\right)_{V, N}\end{equation}
$$
$$
\begin{equation}pV=k_{B}T\ln{\Xi}\end{equation}
$$
$$
\begin{equation}d(pV)=SdT+Nd\mu +pdV\end{equation}
$$
Entropy
$$
\begin{equation}S =k_{B}\ln{\Xi}
+k_{B}T\left(\frac{\partial \ln{\Xi}}{\partial T}\right)_{V, \mu}\end{equation}
$$
Particles
$$
\begin{equation}N =k_{B}T\left(\frac{\partial \ln{\Xi}}{\partial \mu}\right)_{V, T}\end{equation}
$$
Pressure
$$
\begin{equation}p =k_{B}T\left(\frac{\partial \ln{\Xi}}{\partial V}\right)_{\mu, T}\end{equation}
$$
$$
\begin{equation}G=-k_{B}T\ln{\Delta}\end{equation}
$$
$$
\begin{equation}dG=-SdT+Vdp+\mu dN\end{equation}
$$
Entropy
$$
\begin{equation}S =k_{B}\ln{\Delta}
+k_{B}T\left(\frac{\partial \ln{\Delta}}{\partial T}\right)_{p, N}\end{equation}
$$
Volume
$$
\begin{equation}V =-k_{B}T\left(\frac{\partial \ln{\Delta}}{\partial p}\right)_{N, T}\end{equation}
$$
Chemical potential
$$
\begin{equation}\mu =-k_{B}T\left(\frac{\partial \ln{\Delta}}{\partial N}\right)_{p, T}\end{equation}
$$
$$
P_i(\beta) = \frac{e^{-\beta E_i}}{Z}
$$
with \( \beta=1/kT \) being the inverse temperature, \( k \) the
Boltzmann constant, \( E_i \) is the energy of a state \( i \) while
\( Z \) is the partition function for the canonical ensemble
defined as
$$
Z=\sum_{i=1}^{M}e^{-\beta E_i},
$$
where the sum extends over all states
\( M \).
\( P_i \) expresses the probability of finding the system in a given
configuration \( i \).
$$
\langle E \rangle = \sum_{i=1}^M E_i P_i(\beta)=
\frac{1}{Z}\sum_{i=1}^M E_ie^{-\beta E_i},
$$
with a corresponding variance defined as
$$
\sigma_E^2=\langle E^2 \rangle-\langle E \rangle^2=
\frac{1}{Z}\sum_{i=1}^M E_i^2e^{-\beta E_i}-
\left(\frac{1}{Z}\sum_{i=1}^M E_ie^{-\beta E_i}\right)^2.
$$
If we divide the latter quantity with
\( kT^2 \) we obtain the specific heat at constant volume
$$
C_V= \frac{1}{kT^2}\left(\langle E^2 \rangle-\langle E \rangle^2\right).
$$
$$
\langle E \rangle=-\frac{\partial \ln{Z}}{\partial \beta}.
$$
The specific heat is
$$
C_V=\frac{1}{kT^2}\frac{\partial^2 \ln{Z}}{\partial\beta^2}
$$
These expressions link a physical quantity (in thermodynamics) with the microphysics
given by the partition function. Statistical physics is the field where one relates
microscopic quantities to observables at finite temperature.
$$
\langle {\cal M} \rangle = \sum_i^M {\cal M}_i P_i(\beta)=
\frac{1}{Z}\sum_i^M {\cal M}_ie^{-\beta E_i},
$$
and the corresponding variance
$$
\sigma_{{\cal M}}^2=\langle {\cal M}^2 \rangle-\langle {\cal M} \rangle^2=
\frac{1}{Z}\sum_{i=1}^M {\cal M}_i^2e^{-\beta E_i}-
\left(\frac{1}{Z}\sum_{i=1}^M {\cal M}_ie^{-\beta E_i}\right)^2.
$$
This quantity defines also the susceptibility
\( \chi \)
$$
\chi=\frac{1}{kT}\left(\langle {\cal M}^2 \rangle-\langle {\cal M} \rangle^2\right).
$$
$$
F= \langle E\rangle -TS = -kT\ln{Z}
$$
meaning $ \ln{Z} = -F/kT = -F\beta$
and
$$
\langle E \rangle=-\frac{\partial \ln{Z}}{\partial \beta} =\frac{\partial (\beta F)}{\partial \beta}.
$$
and
$$
C_V=-\frac{1}{kT^2}\frac{\partial^2 (\beta F)}{\partial\beta^2}.
$$
We can relate observables to various derivatives of the partition
function and the free energy. When a given derivative of the free energy or the partition function is discontinuous
or diverges (a logarithmic divergence for the heat capacity in the case of the Ising model), we speak of a phase transition
of the same order as that derivative.
$$
E=-J\sum_{< kl >}^{N}s_ks_l-B\sum_k^Ns_k,
$$
with \( s_k=\pm 1 \), \( N \) is the total number of spins,
\( J \) is a coupling constant expressing the strength of the interaction
between neighboring spins and
\( B \) is an external magnetic field interacting with the magnetic
moment set up by the spins.
The symbol \( < kl > \) indicates that we sum over nearest
neighbors only.
$$
1= \uparrow\uparrow\quad
2= \uparrow\downarrow\quad
3= \downarrow\uparrow\quad
4=\downarrow\downarrow
$$
What is the energy of each of these configurations?
For small systems, the way we treat the ends matters. In the first case we employ what is called free ends. For the one-dimensional case, the energy is then written as a sum over a single index
$$
E_i =-J\sum_{j=1}^{N-1}s_js_{j+1},
$$
If we label the first spin as \( s_1 \) and the second as \( s_2 \)
we obtain the following
expression for the energy
$$
E=-Js_1s_2.
$$
for ( j=1; j < N; j++) {
energy += spin[j]*spin[j+1];
}
where the vector \( spin[] \) contains the spin value \( s_k=\pm 1 \). For the specific state \( E_1 \), we have chosen all spins up. The energy of this configuration becomes then
$$
E_1=E_{\uparrow\uparrow}=-J.
$$
The other configurations give
$$
E_2=E_{\uparrow\downarrow}=+J,
$$
$$
E_3=E_{\downarrow\uparrow}=+J,
$$
and
$$
E_4=E_{\downarrow\downarrow}=-J.
$$
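With these four energies the partition function for the two-spin system with free ends follows directly,
$$
Z=\sum_{i=1}^4 e^{-\beta E_i}=2e^{\beta J}+2e^{-\beta J}=4\cosh{(\beta J)},
$$
and the mean energy becomes \( \langle E\rangle=-\partial \ln{Z}/\partial\beta=-J\tanh{(\beta J)} \).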
$$
E_i =-J\sum_{j=1}^{N}s_js_{j+1},
$$
and we obtain the following expression for the two-spin case
$$
E=-J(s_1s_2+s_2s_1).
$$
If we choose to use periodic boundary conditions we can code the above
expression as
jm=N;
for ( j=1; j <=N ; j++) {
energy += spin[j]*spin[jm];
jm = j ;
}
$$
{\cal M}_i=\sum_{j=1}^N s_j,
$$
where we sum over all spins for a given configuration \( i \).
$$
E=-8J\quad\begin{array}{cc}\uparrow & \uparrow \\
\uparrow & \uparrow\end{array}
\quad
E=0\quad\begin{array}{cc}\uparrow & \uparrow \\
\uparrow & \downarrow\end{array}
\quad
E=0\quad\begin{array}{cc}\downarrow & \downarrow \\
\uparrow & \downarrow\end{array}
$$
$$
E_i =-J\sum_{j=1}^{N}s_js_{j+1},
$$
In our case we have as the Monte Carlo sampling function the probability
for finding the system in a state \( s \) given by
$$
P_s=\frac{e^{-(\beta E_s)}}{Z},
$$
with energy \( E_s \), \( \beta=1/kT \) and \( Z \) is a normalization constant which
defines the partition function in the canonical ensemble
$$
Z(\beta)=\sum_se^{-(\beta E_s)}
$$
This is difficult to compute since we need all states. In a calculation
of the Ising model in two dimensions, the number of configurations is
given by \( 2^N \) with \( N=L\times L \) the number of spins for a lattice
of length \( L \). Fortunately, the Metropolis algorithm considers
only ratios between probabilities and we do not need to
compute the partition function at all.
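To see explicitly why the partition function drops out, consider the ratio between the probability of a trial state \( s' \) and that of the present state \( s \),
$$
\frac{P_{s'}}{P_s}=\frac{e^{-\beta E_{s'}}/Z}{e^{-\beta E_s}/Z}=e^{-\beta(E_{s'}-E_s)}=e^{-\beta\Delta E},
$$
which depends only on the energy difference \( \Delta E \).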
$$
E=-4J\quad\begin{array}{ccc} & \uparrow & \\
\uparrow & \uparrow & \uparrow\\
& \uparrow & \end{array}
\quad\Longrightarrow\quad
E=4J\quad\begin{array}{ccc} & \uparrow & \\
\uparrow & \downarrow & \uparrow\\
& \uparrow & \end{array}
$$
$$
E=-2J\quad\begin{array}{ccc} & \uparrow & \\
\downarrow & \uparrow & \uparrow\\
& \uparrow & \end{array}
\quad\Longrightarrow\quad
E=2J\quad\begin{array}{ccc} & \uparrow & \\
\downarrow & \downarrow & \uparrow\\
& \uparrow & \end{array}
$$
with \( \Delta E=4J \),
$$
E=0\quad\begin{array}{ccc} & \uparrow & \\
\downarrow & \uparrow & \uparrow\\
& \downarrow & \end{array}
\quad\Longrightarrow\quad
E=0\quad\begin{array}{ccc} & \uparrow & \\
\downarrow & \downarrow & \uparrow\\
& \downarrow & \end{array}
$$
with \( \Delta E=0 \)
$$
E=2J\quad\begin{array}{ccc} & \downarrow & \\
\downarrow & \uparrow & \uparrow\\
& \downarrow & \end{array}
\quad\Longrightarrow\quad
E=-2J\quad\begin{array}{ccc} & \downarrow & \\
\downarrow & \downarrow & \uparrow\\
& \downarrow & \end{array}
$$
with \( \Delta E=-4J \) and finally
$$
E=4J\quad\begin{array}{ccc} & \downarrow & \\
\downarrow & \uparrow & \downarrow\\
& \downarrow & \end{array}
\quad\Longrightarrow\quad
E=-4J\quad\begin{array}{ccc} & \downarrow & \\
\downarrow & \downarrow & \downarrow\\
& \downarrow & \end{array}
$$
with \( \Delta E=-8J \).
This means in turn that we could construct an array which contains all values
of \( e^{-\beta \Delta E} \) before doing the Metropolis sampling. Otherwise, we
would have to evaluate the exponential at each Monte Carlo step.
for ( double temp = initial_temp; temp <= final_temp; temp+=temp_step){
// initialise energy and magnetization
E = M = 0.;
// setup array for possible energy changes
for( int de =-8; de <= 8; de++) w[de+8] = 0;
for( int de =-8; de <= 8; de+=4) w[de+8] = exp(-de/temp);
// initialise array for expectation values
for( int i = 0; i < 5; i++) average[i] = 0.;
initialize(n_spins, temp, spin_matrix, E, M);
// start Monte Carlo computation
for (int cycles = 1; cycles <= mcs; cycles++){
Metropolis(n_spins, idum, spin_matrix, E, M, w);
// update expectation values
average[0] += E; average[1] += E*E;
average[2] += M; average[3] += M*M; average[4] += fabs(M);
// print results
output(n_spins, mcs, temp, average);
void initialize(int n_spins, double temp, int **spin_matrix,
double& E, double& M)
{
// setup spin matrix and intial magnetization
for(int y =0; y < n_spins; y++) {
for (int x= 0; x < n_spins; x++){
spin_matrix[y][x] = 1; // spin orientation for the ground state
M += (double) spin_matrix[y][x];
// setup initial energy
for(int y =0; y < n_spins; y++) {
for (int x= 0; x < n_spins; x++){
E -= (double) spin_matrix[y][x]*
(spin_matrix[periodic(y,n_spins,-1)][x] +
spin_matrix[y][periodic(x,n_spins,-1)]);
}// end function initialise
// inline function for periodic boundary conditions
inline int periodic(int i, int limit, int add) {
return (i+limit+add) % (limit);
}
with the following example from the function initialise
E -= (double) spin_matrix[y][x]*
(spin_matrix[periodic(y,n_spins,-1)][x] +
spin_matrix[y][periodic(x,n_spins,-1)]);
DO y = 1,lattice_y
DO x = 1,lattice_x
right = x+1 ; IF(x == lattice_x ) right = 1
left = x-1 ; IF(x == 1 ) left = lattice_x
up = y+1 ; IF(y == lattice_y ) up = 1
down = y-1 ; IF(y == 1 ) down = lattice_y
energy=energy - spin_matrix(x,y)*(spin_matrix(right,y)+&
spin_matrix(left,y)+spin_matrix(x,up)+ &
spin_matrix(x,down) )
magnetization = magnetization + spin_matrix(x,y)
ENDDO
ENDDO
energy = energy*0.5
$$
\Delta E = E_2-E_1 =J\sum_{< kl >}^{N}s_k^1s_{l}^1-J\sum_{< kl >}^{N}s_k^2s_{l}^2,
$$
which we can rewrite as
$$
\Delta E = -J \sum_{< kl >}^{N}s_k^2(s_l^2-s_{l}^1),
$$
where the sum now runs only over the nearest neighbors \( k \) of the spin.
Since the spin to be flipped takes only two values, \( s_l^1=\pm 1 \) and \( s_l^2=\pm 1 \), it means that if
\( s_l^1= 1 \), then \( s_l^2=-1 \) and if \( s_l^1= -1 \), then \( s_l^2=1 \).
The other spins keep their values, meaning that
\( s_k^1=s_k^2 \).
If \( s_l^1= 1 \) we must have \( s_l^1-s_{l}^2=2 \), and
if \( s_l^1= -1 \) we must have \( s_l^1-s_{l}^2=-2 \). From these results we see that the energy difference
can be coded efficiently as
$$
\Delta E = 2Js_l^1\sum_{< k >}^{N}s_k,
$$
where the sum runs only over the nearest neighbors \( k \) of spin \( l \).
The difference in magnetisation is given by the difference \( s_l^1-s_{l}^2=\pm 2 \), or in a more compact way as
$$
M_2 = M_1+2s_l^2,
$$
where \( M_1 \) and \( M_2 \) are the magnetizations before and after the spin flip, respectively.
// loop over all spins
for(int y =0; y < n_spins; y++) {
for (int x= 0; x < n_spins; x++){
int ix = (int) (ran1(&idum)*(double)n_spins); // RANDOM SPIN
int iy = (int) (ran1(&idum)*(double)n_spins); // RANDOM SPIN
int deltaE = 2*spin_matrix[iy][ix]*
(spin_matrix[iy][periodic(ix,n_spins,-1)]+
spin_matrix[periodic(iy,n_spins,-1)][ix] +
spin_matrix[iy][periodic(ix,n_spins,1)] +
spin_matrix[periodic(iy,n_spins,1)][ix]);
if ( ran1(&idum) <= w[deltaE+8] ) {
spin_matrix[iy][ix] *= -1; // flip one spin and accept new spin config
M += (double) 2*spin_matrix[iy][ix];
E += (double) deltaE;
double norm = 1/((double) (mcs));// divided by total number of cycles
double Eaverage = average[0]*norm;
double E2average = average[1]*norm;
double Maverage = average[2]*norm;
double M2average = average[3]*norm;
double Mabsaverage = average[4]*norm;
// all expectation values are per spin, divide by n_spins*n_spins
double Evariance = (E2average- Eaverage*Eaverage)/n_spins/n_spins;
double Mvariance = (M2average - Mabsaverage*Mabsaverage)/n_spins/n_spins;
ofile << setiosflags(ios::showpoint | ios::uppercase);
ofile << setw(15) << setprecision(8) << temp;
ofile << setw(15) << setprecision(8) << Eaverage/n_spins/n_spins;
ofile << setw(15) << setprecision(8) << Evariance/temp/temp;
ofile << setw(15) << setprecision(8) << Maverage/n_spins/n_spins;
ofile << setw(15) << setprecision(8) << Mvariance/temp;
ofile << setw(15) << setprecision(8) << Mabsaverage/n_spins/n_spins << endl;
int Jacobi_P(int mynode, int numnodes, int N, double **A, double *x, double *b, double abstol){
int i,j,k,i_global;
int maxit = 100000;
int rows_local,local_offset,last_rows_local,*count,*displacements;
double sum1,sum2,*xold;
double error_sum_local, error_sum_global;
MPI_Status status;
rows_local = (int) floor((double)N/numnodes);
local_offset = mynode*rows_local;
if(mynode == (numnodes-1))
rows_local = N - rows_local*(numnodes-1);
/*Distribute the Matrix and R.H.S. among the processors */
if(mynode == 0){
for(i=1;i<numnodes-1;i++){
for(j=0;j<rows_local;j++)
MPI_Send(A[i*rows_local+j],N,MPI_DOUBLE,i,j,MPI_COMM_WORLD);
MPI_Send(b+i*rows_local,rows_local,MPI_DOUBLE,i,rows_local,
MPI_COMM_WORLD);
last_rows_local = N-rows_local*(numnodes-1);
for(j=0;j<last_rows_local;j++)
MPI_Send(A[(numnodes-1)*rows_local+j],N,MPI_DOUBLE,numnodes-1,j,
MPI_COMM_WORLD);
MPI_Send(b+(numnodes-1)*rows_local,last_rows_local,MPI_DOUBLE,numnodes-1,
last_rows_local,MPI_COMM_WORLD);
else{
A = CreateMatrix(rows_local,N);
x = new double[rows_local];
b = new double[rows_local];
for(i=0;i<rows_local;i++)
MPI_Recv(A[i],N,MPI_DOUBLE,0,i,MPI_COMM_WORLD,&status);
MPI_Recv(b,rows_local,MPI_DOUBLE,0,rows_local,MPI_COMM_WORLD,&status);
xold = new double[N];
count = new int[numnodes];
displacements = new int[numnodes];
//set initial guess to all 1.0
for(i=0; i<N; i++){
xold[i] = 1.0;
for(i=0;i<numnodes;i++){
count[i] = (int) floor((double)N/numnodes);
displacements[i] = i*count[i];
count[numnodes-1] = N - ((int)floor((double)N/numnodes))*(numnodes-1);
for(k=0; k<maxit; k++){
error_sum_local = 0.0;
for(i = 0; i<rows_local; i++){
i_global = local_offset+i;
sum1 = 0.0; sum2 = 0.0;
for(j=0; j < i_global; j++)
sum1 = sum1 + A[i][j]*xold[j];
for(j=i_global+1; j < N; j++)
sum2 = sum2 + A[i][j]*xold[j];
x[i] = (-sum1 - sum2 + b[i])/A[i][i_global];
error_sum_local += (x[i]-xold[i_global])*(x[i]-xold[i_global]);
MPI_Allreduce(&error_sum_local,&error_sum_global,1,MPI_DOUBLE,
MPI_SUM,MPI_COMM_WORLD);
MPI_Allgatherv(x,rows_local,MPI_DOUBLE,xold,count,displacements,
MPI_DOUBLE,MPI_COMM_WORLD);
if(sqrt(error_sum_global)<abstol){
if(mynode == 0){
for(i=0;i<N;i++)
x[i] = xold[i];
else{
DestroyMatrix(A,rows_local,N);
delete[] x;
delete[] b;
delete[] xold;
delete[] count;
delete[] displacements;
return k;
cerr << "Jacobi: Maximum Number of Iterations Reached Without Convergence\n";
if(mynode == 0){
for(i=0;i<N;i++)
x[i] = xold[i];
else{
DestroyMatrix(A,rows_local,N);
delete[] x;
delete[] b;
delete[] xold;
delete[] count;
delete[] displacements;
return maxit;
$$
K_i = -\frac{1}{2}\frac{\nabla_{i}^{2} \Psi}{\Psi}.
$$
$$
\begin{align}
\frac{\nabla^2 \Psi}{\Psi}
&= \frac{\nabla^2 \Psi_{D}}{\Psi_{D}} + \frac{\nabla^2 \Psi_J}{ \Psi_J} + 2 \frac{\nabla \Psi_{D}}{\Psi_{D}}\cdot\frac{\nabla \Psi_J}{ \Psi_J}
\end{align}
$$
$$
\Psi_J=\prod_{i < j}g(r_{ij})=\prod_{i < j}^Ng(r_{ij})= \prod_{i=1}^N\prod_{j=i+1}^Ng(r_{ij}),
$$
with
\( r_{ij}=|{\bf r}_i-{\bf r}_j|=\sqrt{(x_i-x_j)^2+(y_i-y_j)^2+(z_i-z_j)^2} \) for three dimensions and
\( r_{ij}=|{\bf r}_i-{\bf r}_j|=\sqrt{(x_i-x_j)^2+(y_i-y_j)^2} \) for two dimensions.
In our particular case we have
$$
\Psi_J=\prod_{i < j}g(r_{ij})=\exp{\left\{\sum_{i < j}f(r_{ij})\right\}}=
\exp{\left\{\sum_{i < j}\frac{ar_{ij}}{1+\beta r_{ij}}\right\}},
$$
$$
\left[\frac{\nabla^2 \Psi_J}{\Psi_J}\right]_x =\
2\sum_{k=1}^{N}
\sum_{i=1}^{k-1}\frac{\partial^2 g_{ik}}{\partial x_k^2}\ +\
\sum_{k=1}^N
\left(
\sum_{i=1}^{k-1}\frac{\partial g_{ik}}{\partial x_k} -
\sum_{i=k+1}^{N}\frac{\partial g_{ki}}{\partial x_i}
\right)^2
$$
But we have a simple form for the function, namely
$$
\Psi_{J}=\prod_{i < j}\exp{f(r_{ij})}= \exp{\left\{\sum_{i < j}\frac{ar_{ij}}{1+\beta r_{ij}}\right\}},
$$
and it is easy to see that for particle \( k \)
we have
$$
\frac{\nabla^2_k \Psi_J}{\Psi_J }=
\sum_{ij\ne k}\frac{({\bf r}_k-{\bf r}_i)\cdot({\bf r}_k-{\bf r}_j)}{r_{ki}r_{kj}}f'(r_{ki})f'(r_{kj})+
\sum_{j\ne k}\left( f''(r_{kj})+\frac{2}{r_{kj}}f'(r_{kj})\right)
$$
$$
\Psi_J=\prod_{i < j}g(r_{ij})= \exp{\left\{\sum_{i < j}\frac{ar_{ij}}{1+\beta r_{ij}}\right\}},
$$
the gradient needed for the local energy is easy to compute.
We get for particle \( k \)
$$
\frac{ \nabla_k \Psi_J}{ \Psi_J }= \sum_{j\ne k}\frac{{\bf r}_{kj}}{r_{kj}}\frac{a}{(1+\beta r_{kj})^2},
$$
which is rather easy to code. Remember to sum over all particles when you compute the local energy.
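A small sketch of how this gradient could be coded for particle \( k \), assuming the positions are stored in a matrix r[particle][dimension] as in the programs above (the function name and argument list are illustrative):
// Gradient of the Pade-Jastrow factor for particle k, divided by Psi_J:
// (grad_k Psi_J)/Psi_J = sum_{j != k} (r_k - r_j)/r_kj * a/(1+beta*r_kj)^2
#include <cmath>

void jastrow_gradient(double **r, int number_particles, int dimension,
                      int k, double a, double beta, double *grad)
{
  for (int d = 0; d < dimension; d++) grad[d] = 0.0;
  for (int j = 0; j < number_particles; j++) {
    if (j == k) continue;
    double r_kj = 0.0;
    for (int d = 0; d < dimension; d++)
      r_kj += (r[k][d]-r[j][d])*(r[k][d]-r[j][d]);
    r_kj = sqrt(r_kj);
    double factor = a/((1.0 + beta*r_kj)*(1.0 + beta*r_kj))/r_kj;
    for (int d = 0; d < dimension; d++)
      grad[d] += factor*(r[k][d]-r[j][d]);
  }
}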
$$
\frac{\partial u}{\partial n} = \nabla u\cdot {\bf n} = 0,
$$
where \( \partial u/\partial n \) is the derivative in the direction normal to the boundary.
Here you must pay particular attention to the endpoints.
$$
u(x,y,0) = f(x,y)\quad x,y\in (0,L),
$$
and
$$
\partial u/\partial t|_{t=0}=0 \quad x,y\in (0,L).
$$
We will approximate the initial elevation with the function
$$
f(x,y) = A_0 \exp{\left(-\left[\frac{x-x_c}{\sigma_x}\right]^2-\left[\frac{y-y_c}{\sigma_y}\right]^2\right)},
$$
where \( A_0 \) is the elevation of the surface and is typically \( 1-2 \) m. The variables \( \sigma_x \) and
\( \sigma_y \) represent the extensions of the surface elevation. Here we let \( \sigma_x=80 \) km
and \( \sigma_y=200 \) km. The 2004 tsunami had extensions of approximately 200 and 1000 km, respectively.
The variables \( x_c \) and \( y_c \) represent the epicentre of the earthquake.
In mean field theory the local magnetization is treated as a constant; all effects from fluctuations are neglected. A way to achieve this is to rewrite the product of neighbouring spins by adding and subtracting the mean magnetization \( \langle s \rangle \)
$$
s_is_j=(s_i-\langle s \rangle+\langle s \rangle)(s_j-\langle s \rangle+\langle s \rangle)
\approx \langle s \rangle^2+\langle s \rangle(s_i-\langle s \rangle)+\langle s \rangle(s_j-\langle s \rangle),
$$
where we have ignored terms of the order
\( (s_i-\langle s \rangle)(s_j-\langle s \rangle) \), which lead to correlations between neighbouring
spins. In mean field theory we ignore correlations. We can then rewrite the Ising energy
$$
E=-J\sum_{< ij >}^{N}s_is_j-B\sum_i^Ns_i,
$$
as
$$
E=-J\sum_{< ij >}\langle s \rangle^2+\langle s \rangle(s_i-\langle s \rangle)+\langle s \rangle(s_j-\langle s \rangle) -B\sum_i s_i,
$$
resulting in
$$
E=-(B+zJ\langle s \rangle) \sum_i s_i +\frac{NzJ}{2}\langle s \rangle^2,
$$
with \( z \) the number of nearest neighbours for a given site \( i \).
We can define an effective field which all spins see, namely
$$
B_{\mbox{eff}}=(B+zJ\langle s \rangle).
$$
Here we use the canonical ensemble. The partition function reads in this case
$$
Z=e^{-NzJ\langle s \rangle^2/2kT}\left(2\cosh{(B_{\mbox{eff}}/kT)}\right)^N,
$$
with a free energy
$$
F=-kT\ln{Z}=-NkT\ln{(2)}+\frac{NzJ}{2}\langle s \rangle^2-NkT\ln{\left(\cosh{(B_{\mbox{eff}}/kT)}\right)}
$$
and minimizing \( F \) with respect to \( \langle s \rangle \) we arrive at the self-consistency relation
$$
\langle s \rangle = \tanh{\left(B_{\mbox{eff}}/kT\right)}.
$$
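The self-consistency relation can be solved numerically by a simple fixed-point iteration. The following is a minimal sketch with illustrative parameters (\( J=k_B=1 \), \( B=0 \), a square lattice with \( z=4 \)); it shows how \( \langle s \rangle \) drops to zero above the mean-field critical temperature \( k_BT_C=zJ \).
// Fixed-point iteration for the mean-field relation <s> = tanh((B + z*J*<s>)/kT)
#include <cmath>
#include <iostream>
using namespace std;

int main()
{
  const double J = 1.0, B = 0.0;
  const int z = 4;                     // number of nearest neighbours on a square lattice
  for (double T = 0.5; T <= 6.0; T += 0.5) {
    double s = 1.0;                    // start from the fully ordered value
    for (int iter = 0; iter < 100000; iter++) {
      double s_new = tanh((B + z*J*s)/T);
      if (fabs(s_new - s) < 1.0e-10) { s = s_new; break; }
      s = s_new;                       // convergence is slow close to T_C = z*J
    }
    cout << "T = " << T << "  <s> = " << s << endl;
  }
  return 0;
}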
$$
F=-NkTln(2)+NzJ\langle s \rangle^2-NkT-BN\langle s \rangle+NkT\left(\frac{1}{2}\langle s \rangle^2+
\frac{1}{12}\langle s \rangle^4+\dots\right),
$$
and using \( \langle M\rangle = N\langle s \rangle \) we can rewrite as
$$
F=F_0-B\langle M\rangle +\frac{1}{2}a\langle M \rangle^2+
\frac{1}{4}b\langle M \rangle^4+\dots
$$
$$
F=F_0+\frac{1}{2}am^2+
\frac{1}{4}bm^4+\frac{1}{6}cm^6
$$
\( F \) has a minimum at equilibrium \( F'(m) =0 \) and \( F''(m) > 0 \)
$$
F'(m)=0=m(a+bm^2+cm^4),
$$
and if we assume that \( m \) is real we have two solutions
$$
m=0,
$$
or
$$
m^2 = \frac{b}{2c}\left(-1\pm\sqrt{1-4ac/b^2}\right)
$$
$$
m^2 = \frac{b}{2c}\left(-1\pm\sqrt{1-4ac/b^2}\right)\approx -a/b
$$
Assume that
$$ a(T) = \alpha (T-T_C), $$
with \( \alpha > 0 \) and \( T_C \) the critical temperature where the magnetization vanishes. If \( a \) is negative we have two solutions
$$ m = \pm \sqrt{-a/b} = \pm \sqrt{\frac{\alpha (T_C-T)}{b}}$$
\( m \) evolves continuously to the critical temperature where \( F=0 \) for \( T \le T_C \) (see separate graph).
$$ S= -\left(\frac{\partial F}{\partial T}\right)$$
For \( T \ge T_C \) we have \( m=0 \) and
$$ S= -\left(\frac{\partial F_0}{\partial T}\right)$$
and for \( T \le T_C \)
$$ S= -\left(\frac{\partial F_0}{\partial T}\right)-\alpha^2(T_C-T)/2b,$$
and we see that there is a smooth crossover at \( T_C \).
$$ C_V= T\left(\frac{\partial S}{\partial T}\right)$$
and at \( T_C \) we get a discontinuity of
$$\Delta C_V = -\alpha^2/2b,$$
signalling a second-order phase transition.
Landau theory gives, irrespective of dimension, the critical exponents
$$ m \sim (T_C-T)^{\beta},$$
and
$$ C_V \sim \left|T_C-T\right|^{-\alpha},$$
with \( \beta =1/2 \) and \( \alpha = 0 \) (a finite discontinuity in \( C_V \)).
It also predicts a phase transition in one dimension,
whereas for the Ising model there is no phase transition for \( d=1 \).
In two dimensions the exact results are \( \beta=1/8 \) and \( \alpha =0 \).
$$
\begin{equation}
\langle {\cal M}(T) \rangle \sim \left(T_C-T\right)^{\beta},
\end{equation}
$$
where \( \beta \) is a so-called critical exponent. A similar relation
applies to the heat capacity
$$
\begin{equation}
C_V(T) \sim \left|T_C-T\right|^{-\alpha},
\end{equation}
$$
the susceptibility
$$
\begin{equation}
\chi(T) \sim \left|T_C-T\right|^{-\gamma},
\end{equation}
$$
and the correlation length
$$
\begin{equation}
\xi(T) \sim \left|T_C-T\right|^{-\nu}.
\end{equation}
$$
\( \alpha = 0 \), \( \beta = 1/8 \), \( \gamma = 7/4 \) and \( \nu = 1 \). Below we will derive these coefficients
from finite size scaling theories.
$$
\begin{equation}
T_C(L)-T_C(L=\infty) \sim aL^{-1/\nu},
\tag{50}
\end{equation}
$$
$$
\begin{equation}
\langle {\cal M}(T) \rangle \sim \left(T_C-T\right)^{\beta}
\rightarrow L^{-\beta/\nu},
\tag{51}
\end{equation}
$$
$$
\begin{equation}
C_V(T) \sim \left|T_C-T\right|^{-\alpha} \rightarrow L^{\alpha/\nu},
\tag{52}
\end{equation}
$$
and
$$
\begin{equation}
\chi(T) \sim \left|T_C-T\right|^{-\gamma} \rightarrow L^{\gamma/\nu}.
\tag{53}
\end{equation}
$$
We can compute the slopes of the curves for \( M \), \( C_V \) and \( \chi \) as functions of the lattice size and extract
the exponent \( \nu \).
$$
Z_N=\left[2\cosh{(\beta J)}e^I\right]^N,
$$
with
$$
I=\frac{1}{2\pi}\int_0^{\pi}d\phi \,\ln{
\left[\frac{1}{2}\left(1+(1-\kappa^2\sin^2{\phi})^{1/2}\right)\right]},
$$
and
$$
\kappa=2\sinh{(2\beta J)}/\cosh^2{(2\beta J)}.
$$
$$
\langle E\rangle = -J\coth{(2\beta J)}\left[1+\frac{2}{\pi}\left(2\tanh^2{(2\beta J)}-1\right)K_1(q)\right],
$$
with
\( q=2\sinh{(2\beta J)}/\cosh^2{(2\beta J)} \) and the complete elliptic integral of the first kind
$$
K_1(q) = \int_0^{\pi/2}\frac{d\phi}{\sqrt{1-q^2\sin^2{\phi}}}.
$$
$$
C_V=\frac{4k_B}{\pi}\left(\beta J\coth{(2\beta J)}\right)^2
\left\{K_1(q)-K_2(q)-\left(1-\tanh^2{(2\beta J)}\right)\left[\frac{\pi}{2}+\left(2\tanh^2{(2\beta J)}-1\right)K_1(q)\right]\right\},
$$
where
$$
K_2(q) = \int_0^{\pi/2}d\phi\,\sqrt{1-q^2\sin^2{\phi}}
$$
is the complete elliptic integral of the second kind.
Near the critical temperature \( T_C \) the specific heat behaves as
$$
C_V \approx -\frac{2}{k_B\pi}\left(\frac{J}{T_C}\right)^2\ln{\left|1-\frac{T}{T_C}\right|}+\mbox{const}.
$$
$$
C_V\sim \left|1-\frac{T}{T_C}\right|^{-\alpha},
$$
and Onsager's result is a special case of this power law behavior. The limiting form of the function
$$
\lim_{\alpha\rightarrow 0} \frac{1}{\alpha}\left(Y^{-\alpha}-1\right)=-\ln{Y},
$$
meaning that the analytic result is a special case of the power law singularity with
\( \alpha =0 \).
$$
M(T) = \left[1-\frac{\left(1-\tanh^2{(\beta J)}\right)^4}{16\tanh^4{(\beta J)}}\right]^{1/8}
$$
for \( T < T_C \) and \( 0 \) for \( T> T_C \). The behavior is thus as \( T\rightarrow T_C \) from below
$$
M(T) \sim (T_C-T)^{1/8}
$$
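A small sketch (with \( k_B=J=1 \)) that evaluates this exact expression numerically and brackets the temperature where the magnetization vanishes; the result reproduces \( k_BT_C/J\approx 2.269 \).
// Exact (Onsager/Yang) magnetization of the 2D Ising model for T < T_C, zero above T_C
#include <cmath>
#include <iostream>
using namespace std;

double magnetization(double T)
{
  double t = tanh(1.0/T);
  double arg = 1.0 - pow(1.0 - t*t, 4)/(16.0*pow(t, 4));
  return (arg > 0.0) ? pow(arg, 0.125) : 0.0;
}

int main()
{
  for (double T = 1.0; T <= 2.4; T += 0.2)
    cout << "T = " << T << "  M = " << magnetization(T) << endl;
  // bisection for the temperature where M vanishes
  double low = 2.0, high = 2.5;
  while (high - low > 1.0e-8) {
    double mid = 0.5*(low + high);
    if (magnetization(mid) > 0.0) low = mid; else high = mid;
  }
  cout << "Estimated T_C = " << 0.5*(low + high) << endl;   // approx 2.269
  return 0;
}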
The susceptibility behaves as
$$
\chi(T) \sim |T_C-T|^{-7/4}
$$
$$
\phi(t) = \int dt' \left[{\cal M}(t')-\langle {\cal M} \rangle\right]\left[{\cal M}(t'+t)-\langle {\cal M} \rangle\right],
$$
which can be rewritten as
$$
\phi(t) = \int dt' \left[{\cal M}(t'){\cal M}(t'+t)-\langle {\cal M} \rangle^2\right],
$$
where \( \langle {\cal M} \rangle \) is the average value of the magnetization and
\( {\cal M}(t) \) its instantaneous value. We can discretize this function as follows, where we used our
set of computed values \( {\cal M}(t) \) for a set of discretized times (our Monte Carlo cycles corresponding
to a sweep over the lattice)
$$
\phi(t) = \frac{1}{t_{\mbox{max}}-t}\sum_{t'=0}^{t_{\mbox{max}}-t}{\cal M}(t'){\cal M}(t'+t)
-\frac{1}{t_{\mbox{max}}-t}\sum_{t'=0}^{t_{\mbox{max}}-t}{\cal M}(t')\times
\frac{1}{t_{\mbox{max}}-t}\sum_{t'=0}^{t_{\mbox{max}}-t}{\cal M}(t'+t).
$$
// define m-value at each cycle within loop over cycles
m_matrix[cycles] = Maverage/(n_spins*n_spins)/cycles;
// Then compute phi(i) as (in pseudocode)
for i = 1, mcs
r = 1.0/(mcs-i)
s = 0.0; v = 0.0; p= 0.0
for j = 1, mcs-i
p = p+ m_matrix(j)*m_matrix(j+i)
s = s+ m_matrix(j)
v = v+ m_matrix(j+i)
end for
phi(i) = r*p-r*r*s*v
end for
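A direct C++ transcription of this pseudocode could look as follows; the array m[1..mcs] is assumed to hold the magnetization per spin for each Monte Carlo cycle, the test series in main is only a synthetic illustration, and the last lines estimate the integrated correlation time discussed further below.
// Autocorrelation function phi(t) computed from stored values m[1..mcs]
#include <cmath>
#include <cstdlib>
#include <iostream>
using namespace std;

void autocorrelation(const double *m, int mcs, double *phi)
{
  for (int t = 0; t < mcs; t++) {
    double r = 1.0/(mcs - t);
    double p = 0.0, s = 0.0, v = 0.0;
    for (int j = 1; j <= mcs - t; j++) {
      p += m[j]*m[j+t];
      s += m[j];
      v += m[j+t];
    }
    phi[t] = r*p - r*s*r*v;
  }
}

int main()
{
  const int mcs = 10000;
  double *m   = new double[mcs+1];
  double *phi = new double[mcs];
  // synthetic correlated test series; in a real run m[] comes from the Ising code
  srand(10);
  m[0] = 0.0;
  for (int j = 1; j <= mcs; j++)
    m[j] = 0.9*m[j-1] + 0.1*(2.0*rand()/RAND_MAX - 1.0);
  autocorrelation(m, mcs, phi);
  // estimate of the integrated correlation time, summed up to a cutoff t << mcs
  double tau_int = 0.0;
  for (int t = 1; t <= 100; t++) tau_int += phi[t]/phi[0];
  cout << "tau_int approx " << tau_int << endl;
  delete [] m; delete [] phi;
  return 0;
}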
One should therefore choose \( t \ll t_{\mbox{max}} \).
Note also that we could replace the magnetization with the mean energy, or any other expectation values of interest.
The time-correlation function for the magnetization gives a measure of the correlation between the magnetization at a time \( t' \) and a time \( t'+t \). If we multiply the magnetizations at these two different times, we will get a positive contribution if the magnetizations are fluctuating in the same direction, or a negative value if they fluctuate in the opposite direction. If we then integrate over time, or use the discretized version given above, the time-correlation function \( \phi(t) \) should take a non-zero value if the fluctuations are correlated, else it should gradually go to zero. For times far apart the magnetizations are most likely uncorrelated and \( \phi(t) \) should be zero.
$$
{\bf \hat{w}}(t) = {\bf \hat{W}^t\hat{w}}(0),
$$
with \( {\bf \hat{w}}(0) \) the distribution at \( t=0 \) and \( {\bf \hat{W}} \) representing the
transition probability matrix.
We can always expand \( {\bf \hat{w}}(0) \) in terms of the right eigenvectors
\( {\bf \hat{v}}_i \) of \( {\bf \hat{W}} \) as
$$
{\bf \hat{w}}(0) = \sum_i\alpha_i{\bf \hat{v}}_i,
$$
resulting in
$$
{\bf \hat{w}}(t) = {\bf \hat{W}}^t{\bf \hat{w}}(0)={\bf \hat{W}}^t\sum_i\alpha_i{\bf \hat{v}}_i=
\sum_i\lambda_i^t\alpha_i{\bf \hat{v}}_i,
$$
with \( \lambda_i \) the \( i^{\mbox{th}} \) eigenvalue corresponding to
the eigenvector \( {\bf \hat{v}}_i \).
We can relate this property to an observable like the mean magnetization. With the probabilty \( {\bf \hat{w}}(t) \) (which in our case is the Boltzmann distribution) we can write the mean magnetization as
$$
\langle {\cal M}(t) \rangle = \sum_{\mu} {\bf \hat{w}}(t)_{\mu}{\cal M}_{\mu},
$$
or as a scalar product of vectors
$$
\langle {\cal M}(t) \rangle = {\bf \hat{w}}(t){\bf m},
$$
with \( {\bf m} \) being the vector whose elements are the values of \( {\cal M}_{\mu} \) in its
various microstates \( \mu \).
$$
\langle {\cal M}(t) \rangle = {\bf \hat{w}}(t){\bf m}=\sum_i\lambda_i^t\alpha_i{\bf \hat{v}}_i{\bf m}.
$$
If we define \( m_i={\bf \hat{v}}_i{\bf m} \) as the expectation value of
\( {\cal M} \) in the \( i^{\mbox{th}} \) eigenstate we can rewrite the last equation as
$$
\langle {\cal M}(t) \rangle = \sum_i\lambda_i^t\alpha_im_i.
$$
Since we have that in the limit \( t\rightarrow \infty \) the mean magnetization is dominated by
the largest eigenvalue \( \lambda_0 \), we can rewrite the last equation as
$$
\langle {\cal M}(t) \rangle = \langle {\cal M}(\infty) \rangle+\sum_{i\ne 0}\lambda_i^t\alpha_im_i.
$$
We define the quantity
$$
\tau_i=-\frac{1}{\ln{\lambda_i}},
$$
and rewrite the last expectation value as
$$
\langle {\cal M}(t) \rangle = \langle {\cal M}(\infty) \rangle+\sum_{i\ne 0}\alpha_im_ie^{-t/\tau_i}.
\tag{54}
$$
$$
\phi(t) =\langle {\cal M}(0)-{\cal M}(\infty)\rangle \langle {\cal M}(t)-{\cal M}(\infty)\rangle,
$$
resulting in
$$
\phi(t) =\sum_{i,j\ne 0}m_i\alpha_im_j\alpha_je^{-t/\tau_i},
$$
which is appropriate for all times.
If the correlation function decays exponentially,
$$ \phi (t) \sim \exp{(-t/\tau)}$$
then the exponential correlation time can be computed as the average
$$ \tau_{\mbox{exp}} = -\left\langle \frac{t}{\ln{\left|\frac{\phi(t)}{\phi(0)}\right|}} \right\rangle. $$
If the decay is exponential, then
$$ \int_0^{\infty} dt \phi(t) = \int_0^{\infty} dt \phi(0)\exp{(-t/\tau)} = \tau \phi(0),$$
which suggests another measure of correlation
$$ \tau_{\mbox{int}} = \sum_k \frac{\phi(k)}{\phi(0)}, $$
called the integrated correlation time.
$$
\tau \sim L^{d+z},
$$
with \( d \) the dimensionality of the system.
For the Metropolis algorithm based on a single spin-flip process, Nightingale and Bl\"ote obtained
\( z=2.1665\pm 0.0012 \). This is a rather high value, meaning that our algorithm is not the best
choice when studying properties of the Ising model near \( T_C \).
We can understand this behavior by studying the development of the two-dimensional Ising model as function of temperature.
$$
E=-4J\quad\begin{array}{ccc} & \uparrow & \\
\uparrow & \uparrow & \uparrow\\
& \uparrow & \end{array}
\quad\Longrightarrow\quad
E=4J\quad\begin{array}{ccc} & \uparrow & \\
\uparrow & \downarrow & \uparrow\\
& \uparrow & \end{array}
$$
leads to an energy difference of \( \Delta E = 8J \). Using the exact critical temperature \( k_BT_C/J\approx 2.269 \),
we obtain an acceptance probability \( \exp{(-8/2.269)}=0.029429 \), which is rather small.
The increase in large correlation times due to increasing lattices can be diminished by using so-called
cluster algorithms, such as that
introduced by Ulli Wolff in 1989 and the Swendsen-Wang algorithm from 1987.
The two-dimensional Ising model with the Wolff or Swendsen-Wang algorithms exhibits a much smaller
correlation time, with \( z=0.25\pm 0.01 \). Here, instead of flipping a single spin, one flips an entire cluster
of spins pointing in the same direction.
$$ \tau \sim \xi^z, $$
with \( z\approx 2.1 \) for the Metropolis algorithm.
The exponent \( z \) is called the dynamic critical exponent.
The maximum possible value for \( \xi \) in a finite system of \( N = L \times L \) spins is \( \xi\sim L \), because \( \xi \)
cannot be larger than the
lattice size!
This implies that \( \tau\sim L^z \approx N \). This makes simulations difficult because the Metropolis algorithm time
scales like \( N \), so the time to generate independent Metropolis configurations scales like
$$ N\tau\sim N^2 = L^4.$$
If the lattice
size
$$
L \rightarrow \sqrt{10}L\approx 3.2L,
$$
the simulation time increases by a factor of \( 100 \).