Full configuration interaction theory

3. Full configuration interaction theory#

3.1. Slater determinants as basis states#

The simplest possible choice for a many-body wavefunction is a product wavefunction, that is

\[ \Psi(x_1, x_2, x_3, \ldots, x_A) \approx \phi_1(x_1) \phi_2(x_2) \phi_3(x_3) \ldots \]

because we are really only good at thinking about one particle at a time. Such product wavefunctions, without correlations, are easy to work with; for example, if the single-particle states $\phi_i(x)$ are orthonormal, then the product wavefunctions are easy to orthonormalize.

Similarly, computing matrix elements of operators is relatively easy, because the integrals factorize.

The price we pay is the lack of correlations, which we must build up by using many, many product wavefunctions. (Thus we have a trade-off: compact representation of correlations but difficult integrals versus easy integrals but many states required.)

Because we have fermions, we are required to have antisymmetric wavefunctions, e.g.

\[ \Psi(x_1, x_2, x_3, \ldots, x_A) = - \Psi(x_2, x_1, x_3, \ldots, x_A) \]

etc. This is accomplished formally by using the determinantal formalism

\[\begin{split} \Psi(x_1, x_2, \ldots, x_A) = \frac{1}{\sqrt{A!}} \det \left | \begin{array}{cccc} \phi_1(x_1) & \phi_1(x_2) & \ldots & \phi_1(x_A) \\ \phi_2(x_1) & \phi_2(x_2) & \ldots & \phi_2(x_A) \\ \vdots & & & \\ \phi_A(x_1) & \phi_A(x_2) & \ldots & \phi_A(x_A) \end{array} \right | \end{split}\]

Product wavefunction + antisymmetry = Slater determinant.

Properties of the determinant (interchange of any two rows or any two columns yields a change in sign; thus no two rows and no two columns can be the same) lead to the Pauli principle:

No two particles can be at the same place (two columns the same); and
No two particles can be in the same state (two rows the same).
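
The two properties above can be checked numerically. The following is a minimal sketch, assuming a hypothetical set of single-particle functions $\phi_i(x)=x^i e^{-x^2/2}$ that are linearly independent but not orthonormalized, so the prefactor $1/\sqrt{A!}$ does not normalize the state here. The antisymmetry does not depend on that choice.

```python
import math
import numpy as np

def phi(i, x):
    """Hypothetical single-particle function x**i * exp(-x**2 / 2) (not normalized)."""
    return x**i * np.exp(-0.5 * x**2)

def slater(xs):
    """Value of the A-particle Slater determinant at the coordinates xs."""
    A = len(xs)
    # Row i holds the state phi_i, column j holds the particle coordinate x_j.
    M = np.array([[phi(i, x) for x in xs] for i in range(A)])
    return np.linalg.det(M) / math.sqrt(math.factorial(A))

x = [0.3, -1.1, 0.7]
psi = slater(x)
psi_swapped = slater([x[1], x[0], x[2]])   # exchange particles 1 and 2
psi_same = slater([0.3, 0.3, 0.7])         # two particles at the same place

print(psi, psi_swapped, psi_same)
```

Exchanging two coordinates swaps two columns and flips the sign, while two equal coordinates give two equal columns and a vanishing wavefunction.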

As a practical matter, however, Slater determinants for more than about four particles quickly become unwieldy. Thus we turn to the occupation representation, also known as second quantization, to simplify calculations.

The occupation representation or number representation, using fermion creation and annihilation operators, is compact and efficient. It is also abstract and, at first encounter, not easy to internalize. It is inspired by other operator formalisms, such as the ladder operators for the harmonic oscillator or for angular momentum, but unlike those cases, the operators do not have coordinate space representations.

Instead, one can think of fermion creation/annihilation operators as a game of symbols that compactly reproduces what one would do, albeit clumsily, with full coordinate-space Slater determinants.

We start with a set of orthonormal single-particle states $\{ \phi_i(x) \}$. (Note: this requirement, and others, can be relaxed, but doing so leads to a more involved formalism.) Any orthonormal set will do.

To each single-particle state $\phi_i(x)$ we associate a creation operator $\hat{a}^\dagger_i$ and an annihilation operator $\hat{a}_i$.

When acting on the vacuum state $| 0 \rangle$, the creation operator $\hat{a}^\dagger_i$ causes a particle to occupy the single-particle state $\phi_i(x)$:

\[ \phi_i(x) \rightarrow \hat{a}^\dagger_i |0 \rangle \]

But with multiple creation operators we can occupy multiple states:

\[ \phi_i(x) \phi_j(x^\prime) \phi_k(x^{\prime \prime}) \rightarrow \hat{a}^\dagger_i \hat{a}^\dagger_j \hat{a}^\dagger_k |0 \rangle. \]

Now we impose antisymmetry, by having the fermion operators satisfy anticommutation relations:

\[ \hat{a}^\dagger_i \hat{a}^\dagger_j + \hat{a}^\dagger_j \hat{a}^\dagger_i = [ \hat{a}^\dagger_i ,\hat{a}^\dagger_j ]_+ = \{ \hat{a}^\dagger_i ,\hat{a}^\dagger_j \} = 0 \]

so that

\[ \hat{a}^\dagger_i \hat{a}^\dagger_j = - \hat{a}^\dagger_j \hat{a}^\dagger_i \]

Because of this property, automatically $\hat{a}^\dagger_i \hat{a}^\dagger_i = 0$, enforcing the Pauli exclusion principle. Thus when writing a Slater determinant using creation operators,

\[ \hat{a}^\dagger_i \hat{a}^\dagger_j \hat{a}^\dagger_k \ldots |0 \rangle \]

each index $i,j,k, \ldots$ must be unique.
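
The "game of symbols" can be made concrete with a few lines of code. This is a minimal sketch, assuming a hypothetical representation in which a state is a pair `(sign, indices)`: `indices` lists the creation operators in the order they stand in front of the vacuum, and bringing them to ascending order costs a factor $-1$ per transposition.

```python
def create(i, state):
    """Apply a_i^dagger to state = (sign, indices of creation operators, left to right)."""
    sign, occ = state
    if sign == 0 or i in occ:
        return (0, ())             # Pauli principle: a_i^dagger a_i^dagger = 0
    return (sign, (i,) + occ)      # the new operator stands leftmost

def normal_order(state):
    """Sort the indices in ascending order; each adjacent swap flips the sign."""
    sign, occ = state
    occ = list(occ)
    for end in range(len(occ) - 1, 0, -1):
        for k in range(end):
            if occ[k] > occ[k + 1]:
                occ[k], occ[k + 1] = occ[k + 1], occ[k]
                sign = -sign
    return (sign, tuple(occ))

vacuum = (1, ())
ij = normal_order(create(1, create(2, vacuum)))   # a_1^dagger a_2^dagger |0>
ji = normal_order(create(2, create(1, vacuum)))   # a_2^dagger a_1^dagger |0>
ii = create(1, create(1, vacuum))                 # a_1^dagger a_1^dagger |0>

print(ij, ji, ii)   # (1, (1, 2)) (-1, (1, 2)) (0, ())
```

The two orderings give the same determinant with opposite signs, and a repeated index gives zero.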

We have defined the ansatz for the ground state as

\[ |\Phi_0\rangle = \left(\prod_{i\le F}\hat{a}_{i}^{\dagger}\right)|0\rangle, \]

where the index $i$ defines different single-particle states up to the Fermi level. We have assumed that we have $N$ fermions. A given one-particle-one-hole ($1p1h$) state can be written as

\[ |\Phi_i^a\rangle = \hat{a}_{a}^{\dagger}\hat{a}_i|\Phi_0\rangle, \]

while a $2p2h$ state can be written as

\[ |\Phi_{ij}^{ab}\rangle = \hat{a}_{a}^{\dagger}\hat{a}_{b}^{\dagger}\hat{a}_j\hat{a}_i|\Phi_0\rangle, \]

and a general $NpNh$ state as

\[ |\Phi_{ijk\dots}^{abc\dots}\rangle = \hat{a}_{a}^{\dagger}\hat{a}_{b}^{\dagger}\hat{a}_{c}^{\dagger}\dots\hat{a}_k\hat{a}_j\hat{a}_i|\Phi_0\rangle. \]
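
To get a feeling for how many such states exist, one can enumerate the index sets explicitly. The sketch below assumes a hypothetical toy space of six single-particle states with three of them below the Fermi level, and uses ordered index sets ($i<j<\dots$, $a<b<\dots$) so that each determinant is counted once.

```python
from itertools import combinations

n_states = 6                      # hypothetical number of single-particle states
holes = range(0, 3)               # states occupied in the reference (below the Fermi level)
particles = range(3, n_states)    # states unoccupied in the reference (above the Fermi level)

def excitations(rank):
    """All (hole indices, particle indices) labels of the determinants with rank particle-hole pairs."""
    return [(h, p)
            for h in combinations(holes, rank)
            for p in combinations(particles, rank)]

for rank in range(4):
    print(f"{rank}p{rank}h: {len(excitations(rank))} determinants")
```

For this toy space the counts are 1, 9, 9 and 1, which add up to the $\binom{6}{3}=20$ determinants of the full space.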

We can then expand our exact state function for the ground state as

\[ |\Psi_0\rangle=C_0|\Phi_0\rangle+\sum_{ai}C_i^a|\Phi_i^a\rangle+\sum_{abij}C_{ij}^{ab}|\Phi_{ij}^{ab}\rangle+\dots =(C_0+\hat{C})|\Phi_0\rangle, \]

where we have introduced the so-called correlation operator

\[ \hat{C}=\sum_{ai}C_i^a\hat{a}_{a}^{\dagger}\hat{a}_i +\sum_{abij}C_{ij}^{ab}\hat{a}_{a}^{\dagger}\hat{a}_{b}^{\dagger}\hat{a}_j\hat{a}_i+\dots \]

Since the normalization of $\Psi_0$ is at our disposal and since $C_0$ is by hypothesis non-zero, we may arbitrarily set $C_0=1$ with corresponding proportional changes in all other coefficients. Using this so-called intermediate normalization we have

\[ \langle \Psi_0 | \Phi_0 \rangle = \langle \Phi_0 | \Phi_0 \rangle = 1, \]

resulting in

\[ |\Psi_0\rangle=(1+\hat{C})|\Phi_0\rangle. \]

We rewrite

\[ |\Psi_0\rangle=C_0|\Phi_0\rangle+\sum_{ai}C_i^a|\Phi_i^a\rangle+\sum_{abij}C_{ij}^{ab}|\Phi_{ij}^{ab}\rangle+\dots, \]

in a more compact form as

\[ |\Psi_0\rangle=\sum_{PH}C_H^P|\Phi_H^P\rangle=\left(\sum_{PH}C_H^P\hat{A}_H^P\right)|\Phi_0\rangle, \]

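where $H$ stands for $0,1,\dots,n$ hole states, $P$ for $0,1,\dots,n$ particle states, and $\hat{A}_H^P$ denotes the corresponding string of creation and annihilation operators. If we now require unit normalization (instead of the intermediate normalization used above), we have

\[ \langle \Psi_0 | \Psi_0 \rangle = \sum_{PH}|C_H^P|^2= 1, \]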

and the energy can be written as

\[ E= \langle \Psi_0 | \hat{H} |\Psi_0 \rangle= \sum_{PP'HH'}C_H^{*P}\langle \Phi_H^P | \hat{H} |\Phi_{H'}^{P'} \rangle C_{H'}^{P'}. \]

Normally the energy is obtained by diagonalization: one sets up the Hamiltonian matrix in the basis of all possible Slater determinants and solves the resulting eigenvalue problem. A diagonalization is equivalent to finding the variational minimum of

\[ \langle \Psi_0 | \hat{H} |\Psi_0 \rangle-\lambda \langle \Psi_0 |\Psi_0 \rangle, \]

where $\lambda$ is a Lagrange multiplier, to be identified with the energy of the system. The minimization process results in

\[ \delta\left[ \langle \Psi_0 | \hat{H} |\Psi_0 \rangle-\lambda \langle \Psi_0 |\Psi_0 \rangle\right]= \]

\[ \sum_{PP'HH'}\left\{\delta[C_H^{*P}]\langle \Phi_H^P | \hat{H} |\Phi_{H'}^{P'} \rangle C_{H'}^{P'}+ C_H^{*P}\langle \Phi_H^P | \hat{H} |\Phi_{H'}^{P'} \rangle \delta[C_{H'}^{P'}]\right\}- \lambda\sum_{PH}\left( \delta[C_H^{*P}]C_{H}^{P}+C_H^{*P}\delta[C_{H}^{P}]\right) = 0. \]

Since the variations $\delta[C_H^{*P}]$ and $\delta[C_{H}^{P}]$ are complex conjugates of each other, it is necessary and sufficient to require the quantities that multiply $\delta[C_H^{*P}]$ to vanish.

This leads to

\[ \sum_{P'H'}\langle \Phi_H^P | \hat{H} |\Phi_{H'}^{P'} \rangle C_{H'}^{P'}-\lambda C_H^{P}=0, \]

for all sets of $P$ and $H$.

If we then multiply by the corresponding $C_H^{*P}$ and sum over $PH$ we obtain

\[ \sum_{PP'HH'}C_H^{*P}\langle \Phi_H^P | \hat{H} |\Phi_{H'}^{P'} \rangle C_{H'}^{P'}-\lambda\sum_{PH}|C_H^P|^2=0, \]

leading to the identification $\lambda = E$. This means that we have for all $PH$ sets

\[ \begin{equation} \sum_{P'H'}\langle \Phi_H^P | \hat{H} -E|\Phi_{H'}^{P'} \rangle C_{H'}^{P'} = 0. \label{eq:fullci} \tag{1} \end{equation} \]

An alternative way to derive the last equation is to start from

\[ (\hat{H} -E)|\Psi_0\rangle = (\hat{H} -E)\sum_{P'H'}C_{H'}^{P'}|\Phi_{H'}^{P'} \rangle=0, \]

and if this equation is successively projected against all $\Phi_H^P$ in the expansion of $\Psi_0$, then Eq. (1) results. As stated previously, one normally solves this equation by diagonalization. If we are able to solve this equation exactly (that is, numerically exactly) in a large Hilbert space (it will be truncated in terms of the number of single-particle states included in the definition of Slater determinants), it can then serve as a benchmark for other many-body methods which approximate the correlation operator $\hat{C}$.
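
The equivalence between diagonalization and the variational minimum can be illustrated on a small matrix. This is a minimal sketch, assuming a hypothetical random real symmetric matrix in place of the Hamiltonian matrix $\langle \Phi_H^P | \hat{H} |\Phi_{H'}^{P'} \rangle$; the index of the vector `C0` plays the role of the combined label $PH$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6
A = rng.normal(size=(d, d))
H = 0.5 * (A + A.T)            # hypothetical real symmetric "Hamiltonian matrix"

E, C = np.linalg.eigh(H)       # eigenvalues in ascending order, eigenvectors as columns
E0, C0 = E[0], C[:, 0]

residual = H @ C0 - E0 * C0    # left-hand side of the projected eigenvalue equations
norm = C0 @ C0                 # sum of |C|^2, equal to one for eigh eigenvectors
rayleigh = C0 @ H @ C0         # energy expectation value with the coefficients C0

# Any other normalized trial vector gives an energy that is not below E0.
trial = rng.normal(size=d)
trial /= np.linalg.norm(trial)
trial_energy = trial @ H @ trial

print(E0, rayleigh, trial_energy)
```

The residual vanishes to machine precision, the expectation value reproduces the lowest eigenvalue, and the trial energy lies above it.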

3.2. Example of a Hamiltonian matrix#

Suppose, as an example, that we have six fermions below the Fermi level. This means that we can make at most $6p-6h$ excitations. If we have an infinity of single-particle states above the Fermi level, we will obviously have an infinity of, say, $2p-2h$ excitations. Each such way to configure the particles is called a configuration. We will always have to truncate the basis of single-particle states, which gives us a finite number of possible Slater determinants. Our Hamiltonian matrix would then look like (where each block can have a large dimensionality):

	$0p-0h$	$1p-1h$	$2p-2h$	$3p-3h$	$4p-4h$	$5p-5h$	$6p-6h$
$0p-0h$	x	x	x	0	0	0	0
$1p-1h$	x	x	x	x	0	0	0
$2p-2h$	x	x	x	x	x	0	0
$3p-3h$	0	x	x	x	x	x	0
$4p-4h$	0	0	x	x	x	x	x
$5p-5h$	0	0	0	x	x	x	x
$6p-6h$	0	0	0	0	x	x	x

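with a two-body force. Why are only some blocks non-zero? A two-body operator can change the single-particle states of at most two particles, so matrix elements between Slater determinants that differ by more than two particle-hole excitations vanish.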
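
The banded block pattern can be reproduced by counting in how many single-particle states two determinants differ. The sketch below assumes a hypothetical small space (four particles in eight single-particle states) and marks a block with `x` if at least one pair of determinants in it differs in at most two occupied states, which is the only case in which a two-body operator can connect them.

```python
from itertools import combinations

n_states, n_particles = 8, 4                  # hypothetical toy model space
reference = frozenset(range(n_particles))     # states below the Fermi level

dets = [frozenset(c) for c in combinations(range(n_states), n_particles)]

def rank(det):
    """Number of particle-hole excitations relative to the reference."""
    return len(det - reference)

def n_diff(d1, d2):
    """Number of occupied single-particle states of d1 that are replaced in d2."""
    return len(d1 - d2)

max_rank = min(n_particles, n_states - n_particles)
allowed = [[False] * (max_rank + 1) for _ in range(max_rank + 1)]
for d1 in dets:
    for d2 in dets:
        if n_diff(d1, d2) <= 2:               # a two-body operator can connect d1 and d2
            allowed[rank(d1)][rank(d2)] = True

for row in allowed:
    print(" ".join("x" if a else "0" for a in row))
```

The printed pattern is banded: blocks whose excitation ranks differ by more than two are empty. Whether an `x` block is actually non-zero still depends on the interaction.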

If we use a Hartree-Fock basis, this corresponds to a particular unitary transformation where matrix elements of the type $\langle 0p-0h \vert \hat{H} \vert 1p-1h\rangle =\langle \Phi_0 | \hat{H}|\Phi_{i}^{a}\rangle=0$ and our Hamiltonian matrix becomes

	$0p-0h$	$1p-1h$	$2p-2h$	$3p-3h$	$4p-4h$	$5p-5h$	$6p-6h$
$0p-0h$	$\tilde{x}$	0	$\tilde{x}$	0	0	0	0
$1p-1h$	0	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	0	0	0
$2p-2h$	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	0	0
$3p-3h$	0	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	0
$4p-4h$	0	0	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$
$5p-5h$	0	0	0	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$
$6p-6h$	0	0	0	0	$\tilde{x}$	$\tilde{x}$	$\tilde{x}$

3.3. FCI and the exponential growth#

Full configuration interaction calculations provide in principle, if we can diagonalize numerically, all states of interest. The dimensionality of the problem, however, grows very quickly.

The total number of Slater determinants which can be built with say $N$ neutrons distributed among $n$ single particle states is

\[\begin{split} \left (\begin{array}{c} n \\ N\end{array} \right) =\frac{n!}{(n-N)!N!}. \end{split}\]

For a model space which comprises the first four major shells only, $0s$, $0p$, $1s0d$ and $1p0f$, we have $40$ single-particle states each for neutrons and protons. For the eight neutrons of oxygen-16 we would then have

\[\begin{split} \left (\begin{array}{c} 40 \\ 8\end{array} \right) =\frac{40!}{(32)!8!}=76\,904\,685\approx 7.7\times 10^{7}, \end{split}\]

and multiplying this by the number of proton Slater determinants we end up with a dimensionality $d$ of approximately $d\approx 6\times 10^{15}$.
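
The counting can be checked by evaluating the binomial coefficients exactly. This short sketch uses only the numbers quoted above; the variable names are illustrative.

```python
from math import comb

n_sp = 40          # single-particle states in the 0s, 0p, 1s0d and 1p0f shells
n_neutrons = 8     # neutrons in oxygen-16
n_protons = 8      # protons in oxygen-16

neutron_dets = comb(n_sp, n_neutrons)
proton_dets = comb(n_sp, n_protons)
d = neutron_dets * proton_dets

print(neutron_dets)    # 76904685
print(f"{d:.2e}")      # 5.91e+15
```

Exact evaluation gives about $7.7\times 10^{7}$ neutron determinants and a total dimensionality of about $6\times 10^{15}$ before any symmetry is used.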

This number can be reduced if we look at specific symmetries only. However, the dimensionality explodes quickly!

For Hamiltonian matrices of dimensionalities smaller than $d\sim 10^5$, we would use so-called direct methods for diagonalizing the Hamiltonian matrix.
For larger dimensionalities, iterative eigenvalue solvers like Lanczos’ method are used. The most efficient codes at present can handle matrices of $d\sim 10^{10}$.
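
The difference between the two strategies can be sketched on a small example. The code below is a minimal illustration, not a production solver: the test matrix is a hypothetical one, the Lanczos routine uses full reorthogonalization for numerical safety (large-scale codes use cheaper strategies), and only the lowest eigenvalue is extracted.

```python
import numpy as np

def lanczos_lowest(H, n_iter):
    """Estimate the lowest eigenvalue of a symmetric matrix H with n_iter Lanczos steps."""
    d = H.shape[0]
    V = [np.ones(d) / np.sqrt(d)]        # normalized start vector
    alphas, betas = [], []
    for k in range(n_iter):
        w = H @ V[k]                     # H enters only through matrix-vector products
        alphas.append(V[k] @ w)
        for u in V:                      # Gram-Schmidt against all Lanczos vectors
            w -= (u @ w) * u
        beta = np.linalg.norm(w)
        if k == n_iter - 1 or beta < 1e-12:
            break
        betas.append(beta)
        V.append(w / beta)
    # Tridiagonal matrix in the Lanczos basis; its lowest eigenvalue approximates that of H.
    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    return np.linalg.eigvalsh(T)[0]

rng = np.random.default_rng(0)
d = 200
A = 0.01 * rng.normal(size=(d, d))
H = np.diag(np.arange(d, dtype=float)) + 0.5 * (A + A.T)   # hypothetical test matrix

E_direct = np.linalg.eigvalsh(H)[0]          # direct method: full diagonalization
E_lanczos = lanczos_lowest(H, n_iter=100)    # iterative method: 100 matrix-vector products

print(E_direct, E_lanczos)
```

The iterative estimate approaches the lowest eigenvalue from above while only ever applying the matrix to a vector, which is why such methods scale to dimensions where storing and fully diagonalizing the matrix is impossible.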

3.4. A non-practical way of solving the eigenvalue problem#

Rather than diagonalizing, we could try to solve Eq. (1) one projection at a time. We start by looking at the contributions arising from

\[ \langle \Phi_H^P | = \langle \Phi_0| \]

in Eq. (1), that is we multiply with $\langle \Phi_0 |$ from the left in

\[ (\hat{H} -E)\sum_{P'H'}C_{H'}^{P'}|\Phi_{H'}^{P'} \rangle=0. \]

If we assume that we have at most a two-body operator, the Slater-Condon rules then give an equation for the correlation energy in terms of $C_i^a$ and $C_{ij}^{ab}$ only. With the intermediate normalization $C_0=1$ we get

\[ \langle \Phi_0 | \hat{H} -E| \Phi_0\rangle + \sum_{ai}\langle \Phi_0 | \hat{H} -E|\Phi_{i}^{a} \rangle C_{i}^{a}+ \sum_{abij}\langle \Phi_0 | \hat{H} -E|\Phi_{ij}^{ab} \rangle C_{ij}^{ab}=0, \]

or

\[ E-E_0 =\Delta E=\sum_{ai}\langle \Phi_0 | \hat{H}|\Phi_{i}^{a} \rangle C_{i}^{a}+ \sum_{abij}\langle \Phi_0 | \hat{H}|\Phi_{ij}^{ab} \rangle C_{ij}^{ab}, \]

where the energy $E_0$ is the reference energy and $\Delta E$ defines the so-called correlation energy. The single-particle basis functions could be the results of a Hartree-Fock calculation or just the eigenstates of the non-interacting part of the Hamiltonian.

In our notes on Hartree-Fock calculations, we have already computed the matrix elements $\langle \Phi_0 | \hat{H}|\Phi_{i}^{a}\rangle $ and $\langle \Phi_0 | \hat{H}|\Phi_{ij}^{ab}\rangle$. If we are using a Hartree-Fock basis, then the matrix elements $\langle \Phi_0 | \hat{H}|\Phi_{i}^{a}\rangle=0$ and we are left with a correlation energy given by

\[ E-E_0 =\Delta E^{HF}=\sum_{abij}\langle \Phi_0 | \hat{H}|\Phi_{ij}^{ab} \rangle C_{ij}^{ab}. \]

Inserting the various matrix elements we can rewrite the previous equation as

\[ \Delta E=\sum_{ai}\langle i| \hat{f}|a \rangle C_{i}^{a}+ \sum_{abij}\langle ij | \hat{v}| ab \rangle C_{ij}^{ab}. \]

This equation determines the correlation energy but not the coefficients $C$.
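
The structure of this equation can be verified on a toy matrix. The sketch below assumes a hypothetical symmetric matrix in which index 0 plays the role of $\Phi_0$ and the remaining indices stand for excited determinants; unlike a real two-body Hamiltonian, all of them are allowed to couple to the reference. Here the coefficients are taken from a diagonalization, which is exactly the information the energy equation alone does not provide.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
# Hypothetical Hamiltonian matrix: index 0 is the reference determinant,
# indices 1..d-1 stand for excited determinants.
H = np.diag(np.arange(d, dtype=float)) + 0.1 * rng.normal(size=(d, d))
H = 0.5 * (H + H.T)

E, vecs = np.linalg.eigh(H)
C = vecs[:, 0] / vecs[0, 0]      # intermediate normalization, C_0 = 1

E_ref = H[0, 0]                  # reference energy <Phi_0|H|Phi_0>
delta_E = H[0, 1:] @ C[1:]       # correlation energy from the first row of (H - E) C = 0

print(E[0], E_ref + delta_E)
```

The two printed numbers agree: the exact ground-state energy is the reference energy plus the coupling of the reference to the excited determinants, weighted by their coefficients.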

We need more equations. Our next step is to set up

\[ \langle \Phi_i^a | \hat{H} -E| \Phi_0\rangle + \sum_{bj}\langle \Phi_i^a | \hat{H} -E|\Phi_{j}^{b} \rangle C_{j}^{b}+ \sum_{bcjk}\langle \Phi_i^a | \hat{H} -E|\Phi_{jk}^{bc} \rangle C_{jk}^{bc}+ \sum_{bcdjkl}\langle \Phi_i^a | \hat{H} -E|\Phi_{jkl}^{bcd} \rangle C_{jkl}^{bcd}=0, \]

as this equation will allow us to find an expression for the coefficients $C_i^a$ since we can rewrite this equation as

\[ \langle i | \hat{f}| a\rangle +\langle \Phi_i^a | \hat{H}|\Phi_{i}^{a} \rangle C_{i}^{a}+ \sum_{bj\ne ai}\langle \Phi_i^a | \hat{H}|\Phi_{j}^{b} \rangle C_{j}^{b}+ \sum_{bcjk}\langle \Phi_i^a | \hat{H}|\Phi_{jk}^{bc} \rangle C_{jk}^{bc}+ \sum_{bcdjkl}\langle \Phi_i^a | \hat{H}|\Phi_{jkl}^{bcd} \rangle C_{jkl}^{bcd}=EC_i^a. \]

We see that on the right-hand side we have the energy $E$. This leads to a non-linear equation in the unknown coefficients. These equations are normally solved iteratively (that is, we can start with a guess for the coefficients $C_i^a$). A common choice is to use perturbation theory for the first guess, setting thereby

\[ C_{i}^{a}=\frac{\langle i | \hat{f}| a\rangle}{\epsilon_i-\epsilon_a}. \]

The observant reader will however see that we need an equation for $C_{jk}^{bc}$ and $C_{jkl}^{bcd}$ as well. To find equations for these coefficients we need then to continue our multiplications from the left with the various $\Phi_{H}^P$ terms.

For $C_{jk}^{bc}$ we need then

\[ \langle \Phi_{ij}^{ab} | \hat{H} -E| \Phi_0\rangle + \sum_{kc}\langle \Phi_{ij}^{ab} | \hat{H} -E|\Phi_{k}^{c} \rangle C_{k}^{c}+ \]

\[ \sum_{cdkl}\langle \Phi_{ij}^{ab} | \hat{H} -E|\Phi_{kl}^{cd} \rangle C_{kl}^{cd}+\sum_{cdeklm}\langle \Phi_{ij}^{ab} | \hat{H} -E|\Phi_{klm}^{cde} \rangle C_{klm}^{cde}+\sum_{cdefklmn}\langle \Phi_{ij}^{ab} | \hat{H} -E|\Phi_{klmn}^{cdef} \rangle C_{klmn}^{cdef}=0, \]

and we can isolate the coefficients $C_{kl}^{cd}$ in a similar way as we did for the coefficients $C_{i}^{a}$.

A standard choice for the first iteration is to set

\[ C_{ij}^{ab} =\frac{\langle ij \vert \hat{v} \vert ab \rangle}{\epsilon_i+\epsilon_j-\epsilon_a-\epsilon_b}. \]

At the end we can rewrite our solution of the Schroedinger equation in terms of $n$ coupled equations for the coefficients $C_H^P$. This is a very cumbersome way of solving the equation. However, by using this iterative scheme we can illustrate how we can compute the various terms in the wave operator or correlation operator $\hat{C}$. We will later identify the calculation of the various terms $C_H^P$ as parts of different many-body approximations to full CI. In particular, we can relate this non-linear scheme with Coupled Cluster theory and many-body perturbation theory.
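
The iterative scheme can be sketched on a toy problem. The code below assumes a hypothetical weakly coupled symmetric matrix, with index 0 as the reference determinant and a single index $k$ standing for each excited determinant. It starts from the perturbative guess, then alternates between updating the energy and solving each projected equation for its own coefficient.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 6
# Hypothetical weakly coupled Hamiltonian matrix; index 0 is the reference determinant.
H = np.diag(np.arange(d, dtype=float)) + 0.1 * rng.normal(size=(d, d))
H = 0.5 * (H + H.T)

diag = np.diag(H)
C = np.zeros(d)
C[0] = 1.0                                    # intermediate normalization
C[1:] = H[1:, 0] / (diag[0] - diag[1:])       # perturbative first guess

for iteration in range(200):
    E = H[0] @ C                              # E = H_00 + sum_k H_0k C_k
    C_new = C.copy()
    for k in range(1, d):
        # sum over l != k of H_kl C_l (includes H_k0 because C_0 = 1)
        coupling = H[k] @ C - H[k, k] * C[k]
        C_new[k] = coupling / (E - H[k, k])
    converged = np.max(np.abs(C_new - C)) < 1e-13
    C = C_new
    if converged:
        break

E = H[0] @ C
E_exact = np.linalg.eigvalsh(H)[0]
print(E, E_exact)
```

For this weakly coupled example the fixed-point iteration converges to the lowest eigenvalue obtained by direct diagonalization. Convergence is not guaranteed in general, for instance when the energy denominators become small.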

3.5. Summarizing FCI and bringing in approximative methods#

If we can diagonalize large matrices, FCI is the method of choice since:

It gives all eigenvalues, ground state and excited states
The eigenvectors are obtained directly from the coefficients $C_H^P$ which result from the diagonalization
We can easily compute expectation values of other operators, as well as transition probabilities
Correlations are easy to understand in terms of contributions to a given operator beyond the Hartree-Fock contribution. This is the standard approach in many-body theory.

3.6. Definition of the correlation energy#

The correlation energy is defined as, with a two-body Hamiltonian,

\[ \Delta E=\sum_{ai}\langle i| \hat{f}|a \rangle C_{i}^{a}+ \sum_{abij}\langle ij | \hat{v}| ab \rangle C_{ij}^{ab}. \]

The coefficients $C$ result from the solution of the eigenvalue problem. The energy of say the ground state is then

\[ E=E_{ref}+\Delta E, \]

where the so-called reference energy is the energy we obtain from a Hartree-Fock calculation, that is

\[ E_{ref}=\langle \Phi_0 \vert \hat{H} \vert \Phi_0 \rangle. \]

However, as we have seen, even for a small case like the first four major shells and a nucleus like oxygen-16, the dimensionality quickly becomes intractable. If we wish to include single-particle states that reflect weakly bound systems, we need a much larger single-particle basis. We thus need approximate methods that sum specific correlations to infinite order.

Popular methods are

Many-body perturbation theory (in essence a Taylor expansion)
Coupled cluster theory (coupled non-linear equations)
Green’s function approaches (matrix inversion)
Similarity renormalization group methods (coupled ordinary differential equations)

All these methods normally start from a Hartree-Fock basis as the computational basis.

3.7. Algorithms for solving eigenvalue problems#

Standard algorithms and Eigenvector continuation