Hartree–Fock theory#

With the state of the system described by a single Slater determinant, the Hartree–Fock wave function is the one that minimizes the electronic energy, in the variational sense, with respect to variations in the spin orbitals. It is a cornerstone of quantum chemistry, providing total electronic energies that are within 1% of the exact results and a wide range of molecular properties to within 5–10% accuracy. Moreover, the Hartree–Fock method serves as the starting point for the formulation of many other, more accurate, wave function methods as well as for the Kohn–Sham formulation of density functional theory.

Hartree–Fock equation#

In the Hartree–Fock approximation, the many-electron wave function takes the form of a Slater determinant

\[\begin{split} | \Phi \rangle = \frac{1}{\sqrt{N!}} \begin{vmatrix} \psi_{1}(\mathbf{r}_1) & \cdots & \psi_{N}(\mathbf{r}_1) \\ \vdots & \ddots & \vdots \\ \psi_{1}(\mathbf{r}_N) & \cdots & \psi_{N}(\mathbf{r}_N) \\ \end{vmatrix} \end{split}\]

where \(\psi_i\) are the single-electron wave functions known as spin orbitals. The Hartree–Fock energy and the associated state are found by minimizing the energy functional

\[ E_\mathrm{HF} = \min_{\psi} E[\psi] \]

under the constraint that the spin orbitals remain orthonormal. Here, \(\psi\) collectively refers to the entire set of \(N\) spin orbitals. Such a constrained minimization is conveniently performed by means of the technique of Lagrange multipliers.
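As a small numerical illustration of the determinant form above, the following sketch evaluates a two-electron Slater determinant built from two hypothetical one-dimensional orbitals. Spin is ignored and the orbital shapes (`psi_1`, `psi_2`) are assumptions chosen purely for illustration. The sketch checks that the wave function changes sign when the two electron coordinates are exchanged and that it vanishes when both electrons share the same coordinate.

```python
import math
import numpy as np

# Hypothetical one-dimensional orbitals (illustration only, not normalized)
def psi_1(x):
    return np.exp(-x**2)

def psi_2(x):
    return x * np.exp(-x**2)

def slater_determinant(orbitals, coords):
    """Evaluate the Slater determinant for the given electron coordinates."""
    n = len(orbitals)
    # rows run over electron coordinates, columns over orbitals
    mat = np.array([[orb(x) for orb in orbitals] for x in coords])
    return np.linalg.det(mat) / math.sqrt(math.factorial(n))

orbitals = [psi_1, psi_2]
phi_12 = slater_determinant(orbitals, [0.3, -0.7])
phi_21 = slater_determinant(orbitals, [-0.7, 0.3])   # electrons exchanged
phi_same = slater_determinant(orbitals, [0.3, 0.3])  # coinciding electrons

print('Phi(x1, x2) =', phi_12)
print('Phi(x2, x1) =', phi_21)
print('Phi(x1, x1) =', phi_same)
```

The sign change and the vanishing value follow directly from the properties of a determinant under row exchange and for identical rows.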


In Hartree–Fock theory, we introduce the real-valued Lagrangian

\[ L[\psi] = E[\psi] - \sum_{i,j=1}^N \varepsilon_{ji} \big( \langle \psi_i | \psi_j \rangle - \delta_{ij} \big) \]

and search for the set of spin orbitals, \(\psi\), that results in a first variation that vanishes

\[ \delta L = 0 \]

Expressing the energy as the expectation value of the electronic Hamiltonian with respect to a Slater determinant and using the general expressions for matrix elements, we arrive at

\[\begin{align*} \delta L & = \sum_{i=1}^N \langle \delta \psi_i | \hat{h} | \psi_i \rangle + \sum_{i,j=1}^N \big( \langle \delta \psi_i \psi_j | \hat{g} | \psi_i \psi_j\rangle - \langle \delta \psi_i \psi_j | \hat{g} | \psi_j \psi_i\rangle - \varepsilon_{ji} \langle \delta \psi_i | \psi_j \rangle \big) + \mbox{complex conjugate} \\ &= \sum_{i=1}^N \langle \delta \psi_i | \big( \hat{f} | \psi_i \rangle - \sum_{j=1}^N \varepsilon_{ji} | \psi_j \rangle \big) + \mbox{complex conjugate} \end{align*}\]

where we have introduced the one-electron Fock operator

\[ \hat{f} = \hat{h} + \sum_{j=1}^N \big( \hat{J}_j - \hat{K}_j \big) \]


with the Coulomb and exchange operators, \(\hat{J}_j\) and \(\hat{K}_j\), defined by their action on a spin orbital

\[\begin{align*} \hat{J}_j | \psi_i \rangle & = \Big[ \int \frac{e^2 |\psi_j(\mathbf{r}')|^2}{4\pi\varepsilon_0 |\mathbf{r} - \mathbf{r}'|} d^3\mathbf{r}' \Big] | \psi_i \rangle \\ \hat{K}_j | \psi_i \rangle & = \Big[ \int \frac{e^2 \psi_j^\dagger(\mathbf{r}')\psi_i(\mathbf{r}')}{4\pi\varepsilon_0 |\mathbf{r} - \mathbf{r}'|} d^3\mathbf{r}' \Big] | \psi_j \rangle \end{align*}\]

Since the first-order variation in the Lagrangian is required to vanish for general variations in the spin orbitals, we have shown that the Hartree–Fock solution is given by

\[ \hat{f} | \psi_i \rangle - \sum_{j=1}^N \varepsilon_{ji} | \psi_j \rangle = 0 \]

This equation is known as the Hartree–Fock equation and it is to be solved for the spin orbitals and the associated Lagrange multipliers. Projecting onto \(\langle \psi_k |\) and using the orthonormality of the spin orbitals, we note that the matrix elements of the Fock operator equal the multipliers

\[ f_{ki} = \langle \psi_k | \hat{f} | \psi_i \rangle = \sum_{j=1}^N \varepsilon_{ji} \langle \psi_k | \psi_j \rangle = \varepsilon_{ki} \]

Canonical form#

Apart from a trivial overall phase factor, unitary transformations among the occupied orbitals can be shown to leave the Hartree–Fock wave function unchanged. We can therefore introduce a unitary transformation that diagonalizes the Hermitian Fock matrix

\[ f' = \langle \overline{\psi}' | \hat{f} | \overline{\psi}' \rangle = U^\dagger \langle \overline{\psi} | \hat{f} | \overline{\psi} \rangle U = U^\dagger f U \]

We have here adopted the compact overline notation of orbitals. In this basis of canonical spin orbitals, the Hartree–Fock equation takes the form

\[ \hat{f} | \psi_i \rangle = \varepsilon_{i} | \psi_i \rangle \]

which we recognize as an eigenvalue equation introducing the orbital energies, \(\varepsilon_{i}\), as the eigenvalues of the Fock operator. With an infinite number of solutions to the Hartree–Fock equation, the Hartree–Fock ground state is given by employing the \(N\) spin orbitals with lowest orbital energies in the Slater determinant.
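The transformation to canonical orbitals can be illustrated with a toy calculation. In the sketch below, an arbitrary symmetric matrix `F_op` stands in for the Fock operator in an orthonormal basis and the columns of `C` are an arbitrary orthonormal set of occupied orbitals; both are assumptions made for illustration and are not solutions of an actual Hartree–Fock problem. The rotation `U` diagonalizes the matrix of multipliers while leaving the projector onto the occupied space unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n_basis, n_occ = 5, 3

# Stand-in for the Fock operator in an orthonormal basis (arbitrary symmetric matrix)
A = rng.standard_normal((n_basis, n_basis))
F_op = 0.5 * (A + A.T)

# Arbitrary orthonormal set of occupied orbitals (columns of C)
C, _ = np.linalg.qr(rng.standard_normal((n_basis, n_occ)))

# Matrix of Lagrange multipliers, f_ki = <psi_k| f |psi_i>
f = C.T @ F_op @ C

# Unitary (here real orthogonal) transformation that diagonalizes f
eps, U = np.linalg.eigh(f)
C_can = C @ U

# In the rotated orbitals the matrix of multipliers is diagonal
f_can = C_can.T @ F_op @ C_can
print(np.round(f_can, 10))

# The projector onto the occupied space is unchanged by the rotation
print(np.allclose(C_can @ C_can.T, C @ C.T))
```

Since the determinant of `U` has unit modulus, the Slater determinant built from the rotated orbitals differs from the original one by at most a phase factor.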

In AO basis#

The spatial parts of the spin orbitals, or molecular orbitals (MOs), are expanded as linear combinations of atomic orbitals (LCAO). In the basis of spin atomic orbitals, the Fock matrix becomes block diagonal

\[\begin{split} \mathbf{F} = \begin{pmatrix} \mathbf{F}^{\alpha\alpha} & \mathbf{0} \\ \mathbf{0} & \mathbf{F}^{\beta\beta} \end{pmatrix} \end{split}\]

Using the bar notation to distinguish \(\alpha\)- and \(\beta\)-spin atomic orbitals, we get

\[\begin{align*} F_{\mu\nu} & = F^{\alpha\alpha}_{\mu\nu} = h_{\mu\nu} + \sum_{\gamma\delta} \Big( D_{\gamma\delta}(\mu\nu|\gamma\delta) - D^\alpha_{\gamma\delta}(\mu\delta|\gamma\nu) \Big) \\ F_{\bar{\mu}\bar{\nu}} & = F^{\beta\beta}_{\mu\nu} = h_{\mu\nu} + \sum_{\gamma\delta} \Big( D_{\gamma\delta}(\mu\nu|\gamma\delta) - D^\beta_{\gamma\delta}(\mu\delta|\gamma\nu) \Big) \\ F_{\mu\bar{\nu}} & = F_{\bar{\mu}\nu} = 0 \end{align*}\]


\[\begin{align*} D_{\gamma\delta} &= D^\alpha_{\gamma\delta} + D^\beta_{\gamma\delta} \\ D^\alpha_{\gamma\delta}& = \sum_{j=1}^{N_\alpha} \big[c_{\gamma j}^\alpha\big]^* c_{\delta j}^\alpha ; \quad D^\beta_{\gamma\delta} = \sum_{j=1}^{N_\beta} \big[c_{\gamma j}^\beta\big]^* c_{\delta j}^\beta \\ \end{align*}\]

The canonical Hartree–Fock equation thereby takes the form

\[ \mathbf{F C} = \mathbf{S C} \boldsymbol{\varepsilon} \, , \]

where \(\mathbf{S}\) is the overlap matrix and \(\boldsymbol{\varepsilon}\) is a diagonal matrix collecting the orbital energies.
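The expressions for the spin blocks of the Fock matrix translate almost literally into `np.einsum` contractions. The sketch below uses synthetic integrals (random symmetric `h` and a two-electron tensor `g` with the usual permutational symmetry) rather than real molecular integrals, and the helper names `density` and `fock_blocks` are hypothetical. It builds \(\mathbf{F}^{\alpha\alpha}\) and \(\mathbf{F}^{\beta\beta}\) for an open-shell occupation and then checks the closed-shell limit, in which both blocks coincide.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4  # number of spatial atomic orbitals in the toy model

# Synthetic integrals (illustration only): g[m, n, c, d] = (mn|cd)
A = rng.standard_normal((n, n))
h = 0.5 * (A + A.T)
L = rng.standard_normal((6, n, n))
L = 0.5 * (L + L.transpose(0, 2, 1))
g = np.einsum('Pmn,Pcd->mncd', L, L)

def density(C_occ):
    # D_{gamma delta} = sum_j c_{gamma j}^* c_{delta j}
    return np.einsum('gj,dj->gd', C_occ.conj(), C_occ)

def fock_blocks(h, g, D_alpha, D_beta):
    D = D_alpha + D_beta
    J = np.einsum('mngd,gd->mn', g, D)              # Coulomb, total density
    K_alpha = np.einsum('mdgn,gd->mn', g, D_alpha)  # exchange, alpha spin only
    K_beta = np.einsum('mdgn,gd->mn', g, D_beta)    # exchange, beta spin only
    return h + J - K_alpha, h + J - K_beta

# Arbitrary orthonormal coefficients; two alpha and one beta electron
C, _ = np.linalg.qr(rng.standard_normal((n, n)))
D_a, D_b = density(C[:, :2]), density(C[:, :1])
F_a, F_b = fock_blocks(h, g, D_a, D_b)

# Block-diagonal Fock matrix in the spin atomic orbital basis
zero = np.zeros((n, n))
F_spin = np.block([[F_a, zero], [zero, F_b]])

# Closed-shell limit: identical alpha and beta densities
F_cs_a, F_cs_b = fock_blocks(h, g, D_a, D_a)
print(np.allclose(F_cs_a, F_cs_b))
```

The two blocks share the same Coulomb term and differ only through the exchange term, which involves the density of like spin.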

Hartree–Fock energy#

For a given density \(\mathbf{D}\), the Hartree–Fock energy becomes equal to

\[ E_\mathrm{HF} = \frac{1}{2} \mathrm{tr} \big[ (\mathbf{h} + \mathbf{F}) \mathbf{D} \big] + V^\mathrm{n-n} \, , \]

where \(V^\mathrm{n-n}\) is the nuclear repulsion energy.

Koopmans’ theorem#

The orbital energies of occupied and unoccupied orbitals, respectively, equal

\[\begin{align*} \varepsilon_i & = \langle \psi_i |\hat{f} | \psi_i \rangle = \langle \psi_i |\hat{h} | \psi_i \rangle + \sum_{j\neq i}^N \big( \langle \psi_i | \hat{J}_j | \psi_i \rangle - \langle \psi_i | \hat{K}_j | \psi_i \rangle \big) \\ \varepsilon_a & = \langle \psi_a |\hat{f} | \psi_a \rangle = \langle \psi_a |\hat{h} | \psi_a \rangle + \sum_{j=1}^N \big( \langle \psi_a | \hat{J}_j | \psi_a \rangle - \langle \psi_a | \hat{K}_j | \psi_a \rangle \big) \end{align*}\]

where the cancellation between Coulomb and exchange terms for \(j=i\) has been used in the former case. It thus appears as if \(\varepsilon_i\) relates to the energy of an electron interacting with \((N-1)\) other electrons, whereas \(\varepsilon_a\) relates to the energy of an electron interacting with \(N\) other electrons. In accordance with these observations, it is readily shown from the expressions for matrix elements that

\[\begin{align*} \mathrm{IP} &= E_i^{N-1} - E_\mathrm{HF}^N = - \varepsilon_i \qquad (\textsf{ionization potential}) \\ \mathrm{EA} &= E_\mathrm{HF}^N - E_a^{N+1} = - \varepsilon_a \qquad (\textsf{electron affinity}) \end{align*}\]

where, in the frozen orbital approximation, \(E_i^{N-1}\) is the energy of the system after the removal of the electron in spin orbital \(i\) and \(E_a^{N+1}\) is the energy of the system after the addition of an electron in spin orbital \(a\).
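Koopmans' theorem can be checked numerically with a toy model. The sketch below uses synthetic integrals in an orthonormal spin-orbital basis (random numbers with the correct permutational symmetry, not molecular data), and the function names `energy` and `fock_diagonal` are hypothetical. The orbitals of the toy model are not canonical, so the diagonal Fock matrix elements \(f_{pp}\) take the role of the orbital energies; in the canonical basis the two coincide.

```python
import numpy as np

rng = np.random.default_rng(2)
n_so = 6          # number of orthonormal spin orbitals in the toy model
occ = [0, 1, 2]   # occupied spin orbitals (N = 3)

# Synthetic integrals (illustration only)
A = rng.standard_normal((n_so, n_so))
h = 0.5 * (A + A.T)
L = rng.standard_normal((8, n_so, n_so))
L = 0.5 * (L + L.transpose(0, 2, 1))
# <pq|rs> = (pr|qs) in terms of charge-distribution notation
eri = np.einsum('Ppr,Pqs->pqrs', L, L)
# antisymmetrized integrals <pq||rs> = <pq|rs> - <pq|sr>
asym = eri - eri.transpose(0, 1, 3, 2)

def energy(occupied):
    """Energy of a single determinant built from the listed spin orbitals."""
    e1 = sum(h[i, i] for i in occupied)
    e2 = 0.5 * sum(asym[i, j, i, j] for i in occupied for j in occupied)
    return e1 + e2

def fock_diagonal(p, occupied):
    """Diagonal Fock matrix element f_pp for the N-electron reference."""
    return h[p, p] + sum(asym[p, j, p, j] for j in occupied)

E_N = energy(occ)
i, a = 2, 4  # orbital to ionize and orbital to attach an electron to

IP = energy([k for k in occ if k != i]) - E_N  # frozen-orbital ionization
EA = E_N - energy(occ + [a])                   # frozen-orbital attachment

print(np.isclose(IP, -fock_diagonal(i, occ)))
print(np.isclose(EA, -fock_diagonal(a, occ)))
```

The check relies only on the frozen orbital approximation: the same orbitals are used for the \(N\)-, \((N-1)\)-, and \((N+1)\)-electron determinants.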

Brillouin’s theorem#

Based on the expressions for matrix elements, we find

\[ \langle \Phi_\mathrm{HF} | \hat{H} | \Phi_i^a \rangle = \langle \psi_i | \hat{f} | \psi_a \rangle = 0 \]

which shows that there is no coupling between the Hartree–Fock ground state and singly excited determinants. This result is known as the Brillouin theorem.

SCF procedure#

Due to the summation over occupied spin orbitals that expresses the effective electron interactions, the Fock operator depends on its eigenfunctions and the canonical Hartree–Fock equation is therefore solved iteratively by means of a self-consistent field (SCF) procedure as illustrated in Fig. 17.


Fig. 17 Self-consistent field solution of the Hartree–Fock equation by means of the Roothaan–Hall algorithm.#

Roothaan–Hall scheme#

In the following, we will consider the spin-restricted formulation where \(\alpha\)- and \(\beta\)-spin orbitals have identical spatial parts. We also restrict the situation to the common case of a closed-shell system such that

\[\begin{align*} N_\alpha & = N_\beta = \frac{1}{2} N \\ D^\alpha_{\gamma\delta} & = D^\beta_{\gamma\delta} = \frac{1}{2} D_{\gamma\delta} = \sum_{j=1}^{N/2} c_{\gamma j}^* c_{\delta j} \end{align*}\]


When referring to closed-shell systems it is customary to refer to the density matrix as that for either of the spin components.
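As a short worked step under this convention (the Coulomb and exchange matrices \(\mathbf{J}\) and \(\mathbf{K}\) are symbols introduced here for convenience), inserting \(D_{\gamma\delta} = 2 D^\alpha_{\gamma\delta}\) into the general AO expressions for the Fock matrix and the energy gives

\[\begin{align*} F_{\mu\nu} & = h_{\mu\nu} + \sum_{\gamma\delta} D^\alpha_{\gamma\delta} \Big( 2 (\mu\nu|\gamma\delta) - (\mu\delta|\gamma\nu) \Big) = h_{\mu\nu} + 2 J_{\mu\nu} - K_{\mu\nu} \\ E_\mathrm{HF} & = \frac{1}{2} \Big( \mathrm{tr} \big[ (\mathbf{h} + \mathbf{F}^{\alpha\alpha}) \mathbf{D}^\alpha \big] + \mathrm{tr} \big[ (\mathbf{h} + \mathbf{F}^{\beta\beta}) \mathbf{D}^\beta \big] \Big) + V^\mathrm{n-n} = \mathrm{tr} \big[ (\mathbf{h} + \mathbf{F}) \mathbf{D}^\alpha \big] + V^\mathrm{n-n} \end{align*}\]

with \(J_{\mu\nu} = \sum_{\gamma\delta} D^\alpha_{\gamma\delta} (\mu\nu|\gamma\delta)\) and \(K_{\mu\nu} = \sum_{\gamma\delta} D^\alpha_{\gamma\delta} (\mu\delta|\gamma\nu)\). These are the forms that apply when the density matrix refers to a single spin component.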

import veloxchem as vlx
import numpy as np

Writing your own SCF program#

Let us implement the presented Hartree–Fock SCF procedure. We will use the water molecule as an example and, as reference, we will first compute the Hartree–Fock energy using the built-in compute method in VeloxChem.

Setting up the system#

mol_str = """
O    0.000000000000        0.000000000000        0.000000000000
H    0.000000000000        0.740848095288        0.582094932012
H    0.000000000000       -0.740848095288        0.582094932012
"""

molecule = vlx.Molecule.read_str(mol_str, units='angstrom')
basis = vlx.MolecularBasis.read(molecule, 'cc-pVDZ')

norb = vlx.MolecularBasis.get_dimensions_of_basis(basis, molecule)
nocc = molecule.number_of_electrons() // 2
V_nuc = molecule.nuclear_repulsion_energy()

print('Number of contracted basis functions:', norb)
print('Number of doubly occupied molecular orbitals:', nocc)
print(f'Nuclear repulsion energy (in a.u.): {V_nuc : 14.12f}')
Number of contracted basis functions: 24
Number of doubly occupied molecular orbitals: 5
Nuclear repulsion energy (in a.u.):  9.343638157670

Reference calculation#

Let us first perform a reference calculation using the restricted closed-shell SCF driver in VeloxChem.

scf_drv = vlx.ScfRestrictedDriver()
scf_drv.compute(molecule, basis)
print(f'Final HF energy: {scf_drv.get_scf_energy() : 12.8f} Hartree')
Final HF energy: -76.02698419 Hartree

Getting integrals in the AO basis#

# overlap matrix
overlap_drv = vlx.OverlapIntegralsDriver()
S = overlap_drv.compute(molecule, basis).to_numpy()

# one-electron Hamiltonian
kinetic_drv = vlx.KineticEnergyIntegralsDriver()
T = kinetic_drv.compute(molecule, basis).to_numpy()

nucpot_drv = vlx.NuclearPotentialIntegralsDriver()
V = -nucpot_drv.compute(molecule, basis).to_numpy()

h = T + V 

# two-electron Hamiltonian
eri_drv = vlx.ElectronRepulsionIntegralsDriver()
g = np.zeros((norb, norb, norb, norb))
eri_drv.compute_in_mem(molecule, basis, g)

Orthogonalization of the AO basis#

In order to use the np.linalg.eigh function from NumPy to solve the Hartree–Fock equation, we first retrieve an orthogonal AO (OAO) basis by means of a non-unitary transformation matrix \(\mathbf{X}\) such that

\[ | \overline{\chi^\mathrm{OAO}} \rangle = | \overline{\chi} \rangle \mathbf{X} \]


\[ \mathbf{S}^\mathrm{OAO} = \mathbf{X}^\dagger \mathbf{S X} = \mathbf{I} \]

The overlap matrix is symmetric and, in the absence of linear dependencies in the AO basis, positive definite. It can therefore first be diagonalized by a unitary transformation

\[ \mathbf{U}^{\dagger} \mathbf{S U} = \boldsymbol{\sigma} \]

where \(\boldsymbol{\sigma}\) is a diagonal matrix collecting the eigenvalues. It is then straightforward to construct explicit forms of \(\mathbf{X}\). There exist two common choices

\[\begin{align*} \mathbf{X} = \mathbf{U} \boldsymbol{\sigma}^{-\frac{1}{2}} \mathbf{U}^\dagger \qquad & \textsf{symmetric, or Löwdin} \\ \mathbf{X} = \mathbf{U} \boldsymbol{\sigma}^{-\frac{1}{2}} \qquad & \textsf{canonical} \end{align*}\]

We can readily verify that both expressions satisfy \(\mathbf{X}^\dagger \mathbf{S X} = \mathbf{I}\). The expression for the associated transformation of MO coefficients is determined from the relation

\[ | \overline{\phi} \rangle = | \overline{\chi} \rangle \mathbf{C} = | \overline{\chi^\mathrm{OAO}} \rangle \mathbf{X}^{-1} \mathbf{C} \]

We identify

\[ \mathbf{C}^\mathrm{OAO} = \mathbf{X}^{-1} \mathbf{C} \]


\[ \mathbf{C} = \mathbf{X C}^\mathrm{OAO} \]
# symmetric transformation
sigma, U = np.linalg.eigh(S)
X = np.einsum('ik,k,jk->ij', U, 1/np.sqrt(sigma), U)
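For comparison, the sketch below constructs both the symmetric and the canonical transformation matrices and verifies that each satisfies \(\mathbf{X}^\dagger \mathbf{S X} = \mathbf{I}\). To keep it self-contained, it uses a synthetic overlap matrix (the Gram matrix of a few normalized random vectors) in place of the molecular overlap matrix; this stand-in is an assumption made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic overlap matrix (illustration only): Gram matrix of normalized,
# non-orthogonal vectors, hence symmetric and positive definite
vecs = rng.standard_normal((6, 4))
vecs /= np.linalg.norm(vecs, axis=0)
S = vecs.T @ vecs

sigma, U = np.linalg.eigh(S)

X_sym = U @ np.diag(sigma**-0.5) @ U.T  # symmetric (Löwdin)
X_can = U @ np.diag(sigma**-0.5)        # canonical

for name, X in [('symmetric', X_sym), ('canonical', X_can)]:
    err = np.linalg.norm(X.T @ S @ X - np.eye(4))
    print(f'{name:>9s}: ||X^T S X - I|| = {err:.1e}')
```

The symmetric choice yields a symmetric \(\mathbf{X}\), whereas the canonical choice does not. The canonical form is convenient when near-linear dependencies are present, since columns associated with very small eigenvalues can then simply be discarded.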

Solving the Hartree–Fock equation#

For a given Fock matrix, we solve the Hartree–Fock equation by the following steps:

  1. transform the Fock matrix to the OAO basis

  2. diagonalize the Fock matrix

  3. transform the MO coefficients back to AO basis

def get_MO_coeff(F):

    F_OAO = np.einsum('ki,kl,lj->ij', X, F, X)
    epsilon, C_OAO = np.linalg.eigh(F_OAO)    
    C = np.einsum("ik,kj->ij", X, C_OAO)
    return C
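The three steps can be checked on a small synthetic problem. The sketch below is a self-contained variant of the routine that takes the overlap matrix as an explicit argument (the function name `solve_roothaan_hall` and the random test matrices are assumptions made for illustration). It verifies that the resulting coefficients satisfy the generalized eigenvalue equation \(\mathbf{F C} = \mathbf{S C} \boldsymbol{\varepsilon}\) and that the molecular orbitals are orthonormal with respect to \(\mathbf{S}\).

```python
import numpy as np

def solve_roothaan_hall(F, S):
    """Solve F C = S C eps via symmetric orthogonalization (sketch)."""
    sigma, U = np.linalg.eigh(S)
    X = U @ np.diag(1.0 / np.sqrt(sigma)) @ U.T
    F_oao = X.T @ F @ X                 # 1. transform to the OAO basis
    eps, C_oao = np.linalg.eigh(F_oao)  # 2. diagonalize
    return eps, X @ C_oao               # 3. back-transform to the AO basis

rng = np.random.default_rng(4)
n = 5
# Synthetic symmetric "Fock" matrix and positive-definite "overlap" matrix
A = rng.standard_normal((n, n))
F = 0.5 * (A + A.T)
B = rng.standard_normal((n, n))
S = B @ B.T + n * np.eye(n)

eps, C = solve_roothaan_hall(F, S)

print(np.allclose(F @ C, S @ C @ np.diag(eps)))  # generalized eigenproblem
print(np.allclose(C.T @ S @ C, np.eye(n)))       # S-orthonormal orbitals
```

Since `np.linalg.eigh` returns eigenvalues in ascending order, the first columns of `C` correspond to the orbitals with the lowest orbital energies.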

SCF iterations#

We form an initial guess for the density based on the core Hamiltonian and thereafter enter the SCF iterations. As a measure of convergence, we adopt the norm of the occupied–virtual block of the Fock matrix in the MO basis. It is convenient to flatten the elements of this block into a vector, which we refer to as the error vector. During the course of the SCF iterations, we form a sequence of error vectors, \(\mathbf{e}_i\).

max_iter = 50
conv_thresh = 1e-4

# initial guess from core Hamiltonian
C = get_MO_coeff(h)

print("iter      SCF energy    Error norm")

for iter in range(max_iter):
    D = np.einsum('ik,jk->ij', C[:, :nocc], C[:, :nocc])
    J = np.einsum('ijkl,kl->ij', g, D)
    K = np.einsum('ilkj,kl->ij', g, D)
    F = h + 2*J - K
    E = np.einsum('ij,ij->', h + F, D) + V_nuc

    # compute convergence metric
    F_MO = np.einsum('ki,kl,lj->ij', C, F, C)
    e_vec = np.reshape(F_MO[:nocc, nocc:], -1)
    error = np.linalg.norm(e_vec)

    print(f'{iter:>2d}  {E:16.8f}  {error:10.2e}')

    if error < conv_thresh:
        print('SCF iterations converged!')
        break

    C = get_MO_coeff(F)
iter      SCF energy    Error norm
 0      -68.84975229    2.23e+00
 1      -69.95937641    1.79e+00
 2      -73.34743276    1.74e+00
 3      -73.46688910    1.36e+00
 4      -74.74058933    1.29e+00
 5      -75.55859127    7.91e-01
 6      -75.86908635    4.86e-01
 7      -75.97444165    2.74e-01
 8      -76.00992921    1.60e-01
 9      -76.02143957    8.99e-02
10      -76.02519173    5.15e-02
11      -76.02640379    2.92e-02
12      -76.02679653    1.67e-02
13      -76.02692347    9.45e-03
14      -76.02696455    5.38e-03
15      -76.02697784    3.06e-03
16      -76.02698213    1.74e-03
17      -76.02698352    9.89e-04
18      -76.02698397    5.63e-04
19      -76.02698412    3.20e-04
20      -76.02698416    1.82e-04
21      -76.02698418    1.04e-04
22      -76.02698418    5.89e-05
SCF iterations converged!
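The error norms in the table above shrink by a nearly constant factor from one iteration to the next, which is the signature of linear convergence. As a quick check, the snippet below computes the ratios of successive error norms, using values copied from the last iterations of the table.

```python
# error norms copied from iterations 17-22 of the table above
errors = [9.89e-04, 5.63e-04, 3.20e-04, 1.82e-04, 1.04e-04, 5.89e-05]

# ratio of each error norm to the preceding one
ratios = [b / a for a, b in zip(errors, errors[1:])]
print([f'{r:.2f}' for r in ratios])
```

With a reduction of somewhat less than a factor of two per iteration, each additional digit of accuracy in the error norm costs about four iterations.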

Convergence acceleration#

Direct inversion iterative subspace#

The Roothaan–Hall scheme suffers from poor numerical convergence and in practice some form of convergence acceleration is adopted. In the method of direct inversion of the iterative subspace (DIIS) [Pul80, Pul82, Sel93], information from not only the present but also previous iterations is used to form an averaged effective one-electron Hamiltonian, or Fock matrix, according to

\[ \mathbf{F}_n^\mathrm{DIIS} = \sum_{i=1}^n w_i \mathbf{F}_i \]

where \(\mathbf{F}_i\) is the Fock matrix generated in SCF iteration \(i\), \(n\) is the present iteration, and the weights, \(w_i\), are to be determined. To guarantee that the one-electron Hamiltonian is preserved in the effective Fock operator, we impose the condition

\[ \sum_{i=1}^n w_i = 1 \]

Under the assumption of a strict linearity between Fock matrices and error vectors, the averaged error vector would read

\[ \mathbf{e}_n^\mathrm{DIIS} = \sum_{i=1}^n w_i \mathbf{e}_i \]

As the molecular orbitals change from one iteration to the next, this is not strictly the case but it is a good approximation. We can then determine the weights by minimizing the squared norm of the averaged error vector under the imposed constraint. The squared norm becomes equal to

\[ \| \mathbf{e}_n^\mathrm{DIIS} \|^2 = \sum_{i,j=1}^n w_i B_{ij} w_j ; \quad B_{ij} = \langle \mathbf{e}_i | \mathbf{e}_j \rangle \]

and the constrained minimization is achieved by introducing a Lagrangian

\[ L = \| \mathbf{e}_n^\mathrm{DIIS} \|^2 - 2\lambda \Big( \sum_{i=1}^n w_i - 1 \Big) \]

where the factor of \(-2\) multiplying the Lagrange multiplier \(\lambda\) is merely a convention, chosen so as to arrive at an explicit matrix equation of the form

\[\begin{split} \begin{pmatrix} B_{11} & \cdots & B_{1n} & -1 \\ \vdots & \ddots & \vdots & \vdots \\ B_{n1} & \cdots & B_{nn} & -1 \\ -1 & \cdots & -1 & 0 \end{pmatrix} \begin{pmatrix} w_1 \\ \vdots \\ w_n \\ \lambda \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ -1 \end{pmatrix} \end{split}\]

We solve this equation for the weights, \(w_i\), and then determine the averaged Fock matrix, \(\mathbf{F}_n^\mathrm{DIIS}\), and error vector, \(\mathbf{e}_n^\mathrm{DIIS}\), which are both kept in storage for subsequent SCF iterations.
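A minimal worked example may help to make the DIIS equations concrete. The sketch below sets up and solves the matrix equation for two hypothetical error vectors that point in opposite directions; the function name `diis_weights` and the vectors are assumptions made for illustration. The optimal combination cancels the error exactly while the weights sum to one.

```python
import numpy as np

def diis_weights(e_vecs):
    """Solve the DIIS equations for the weights and the multiplier."""
    n = len(e_vecs)
    # bordered matrix: B_ij = <e_i|e_j>, last row and column equal to -1
    B = -np.ones((n + 1, n + 1))
    B[n, n] = 0.0
    for i in range(n):
        for j in range(n):
            B[i, j] = np.dot(e_vecs[i], e_vecs[j])
    rhs = np.zeros(n + 1)
    rhs[n] = -1.0
    solution = np.linalg.solve(B, rhs)
    return solution[:n], solution[n]

# Two hypothetical error vectors pointing in opposite directions
e_vecs = [np.array([2.0, 0.0]), np.array([-1.0, 0.0])]
w, lam = diis_weights(e_vecs)

# averaged error vector
e_diis = sum(w_i * e_i for w_i, e_i in zip(w, e_vecs))

print('weights:', w)
print('multiplier:', lam)
print('averaged error vector:', e_diis)
```

The weights come out as 1/3 and 2/3. Multiplying the first \(n\) rows of the matrix equation by \(w_i\) and summing shows that \(\lambda\) equals the minimized squared norm of the averaged error vector, which is zero in this example.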

def get_DIIS_fock(F_mats, e_vecs):
    n = len(e_vecs)
    # build DIIS matrix
    B = -np.ones((n + 1, n + 1))
    B[n, n] = 0

    for i in range(n):
        for j in range(n):
            B[i,j] = np.dot(e_vecs[i], e_vecs[j])
    b = np.zeros(n + 1)
    b[n] = -1
    # solve the DIIS equations for the weights and the Lagrange multiplier
    w = np.linalg.solve(B, b)

    F_ave = np.zeros((norb, norb))
    evec_ave = np.zeros(nocc * (norb - nocc))
    for i in range(n):
        F_ave += w[i] * F_mats[i]
        evec_ave += w[i] * e_vecs[i]
    F_mats[-1] = F_ave
    e_vecs[-1] = evec_ave
    return F_ave

SCF iterations#

In principle, the only modification of the SCF module needed to implement the DIIS scheme is to replace the original Fock matrix with its weighted average before determining the new MO coefficients. It is, however, also necessary to save the Fock matrices and error vectors from previous SCF iterations. In practice, this extra storage requirement does not severely hamper applications, in particular because the DIIS scheme is found to be most stable when information is used from only a limited number (about 10) of the previous iterations.
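Restricting the DIIS subspace to a limited number of iterations can be sketched with a small helper that discards the oldest entries. The function name `trim_history` and the parameter `max_vecs` are hypothetical. In an SCF loop, such a helper would be called after appending the newest Fock matrix and error vector and before forming the averaged Fock matrix.

```python
def trim_history(F_mats, e_vecs, max_vecs=10):
    """Discard the oldest entries so that at most max_vecs remain (in place)."""
    while len(e_vecs) > max_vecs:
        F_mats.pop(0)
        e_vecs.pop(0)

# usage with placeholder entries standing in for matrices and vectors
F_mats = [f'F{i}' for i in range(13)]
e_vecs = [f'e{i}' for i in range(13)]
trim_history(F_mats, e_vecs, max_vecs=10)

print(len(F_mats), F_mats[0], F_mats[-1])
```

The two lists are trimmed together so that Fock matrices and error vectors remain paired by index.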

e_vecs = []
F_mats = []

# initial guess from core Hamiltonian
C = get_MO_coeff(h)

print("iter      SCF energy    Error norm")

for iter in range(max_iter):
    D = np.einsum('ik,jk->ij', C[:, :nocc], C[:, :nocc])
    J = np.einsum('ijkl,kl->ij', g, D)
    K = np.einsum('ilkj,kl->ij', g, D)
    F = h + 2*J - K
    E = np.einsum('ij,ij->', h + F, D) + V_nuc

    # compute convergence metric
    F_MO = np.einsum('ki,kl,lj->ij', C, F, C)
    e_vecs.append(np.reshape(F_MO[:nocc, nocc:], -1))
    F_mats.append(F)
    error = np.linalg.norm(e_vecs[-1])

    print(f'{iter:>2d}  {E:16.8f}  {error:10.2e}')

    if error < conv_thresh:
        print('SCF iterations converged!')
        break

    F = get_DIIS_fock(F_mats, e_vecs)
    C = get_MO_coeff(F)
iter      SCF energy    Error norm
 0      -68.84975229    2.23e+00
 1      -69.95937641    1.79e+00
 2      -75.55244086    7.92e-01
 3      -75.98441968    2.55e-01
 4      -76.01359974    1.37e-01
 5      -76.02627504    3.17e-02
 6      -76.02602152    3.75e-02
 7      -76.02686952    1.26e-02
 8      -76.02697700    3.01e-03
 9      -76.02698332    8.67e-04
10      -76.02698394    5.41e-04
11      -76.02698416    1.39e-04
12      -76.02698418    5.89e-05
SCF iterations converged!