# Energy gradients
(sec:energy-gradients)=

In order to perform a geometry optimization and find a minimum energy structure or a transition state,
one needs to calculate the derivative of the energy $E$ with respect to the **nuclear coordinates**.
This is usually referred to as the *molecular gradient* and it needs to vanish at any stationary geometry.
The second derivatives of the energy with respect to nuclear displacement (the molecular **Hessian**) can be used to characterize
the stationary structure, i.e., to confirm whether a true minimum or a transition state has been found.
Molecular gradients and Hessians are not the only properties that can be calculated as energy derivatives.
Derivatives with respect to an external **electromagnetic field** yield permanent and induced (dipole) moments, polarizabilities, and magnetizabilities, while NMR and EPR parameters are obtained when **nuclear magnetic moments** are additionally involved {cite}`jensen2006`.

The energy derivatives themselves can either be calculated **numerically** by using finite differences
or **analytically**. The former is usually simple to implement, but suffers from difficulties
in numerical accuracy and computational efficiency.
For the latter, considerable programming effort is required, but it has the advantages of greater speed, precision, and convenience.

(sec:numerical-gradients)=
## Numerical Gradients

The simplest method to calculate the derivative of the energy $E(x)$ with respect to some parameter $x$ is to use finite difference approximations.
Choosing a small change $x_0$ in $x$, a two-point estimation is given by
```{math}
%:label: eq:divided_difference
\frac{\mathrm{d} E}{\mathrm{d} x} \approx \frac{E(x + x_0) - E(x)}{x_0} 
```
which is known as a first-order **divided difference** and its error to the true derivative is approximately proportional to $x_0$.
The true derivative of $E$ at $x$ is given by the limit
```{math}
%:label: eq:true_derivative
\frac{\mathrm{d} E}{\mathrm{d} x} = \lim_{x_0 \to 0} \frac{E(x + x_0) - E(x)}{x_0} 
```

(eq:symmetric-difference-quotient)=
Another two-point formula is the **symmetric difference quotient** given by
```{math}
%:label: eq:symmetric_difference_quotient
\frac{\mathrm{d} E}{\mathrm{d} x} \approx \frac{E(x + x_0) - E(x - x_0)}{2x_0} 
```
where it can be shown that the first-order error cancels, so it is approximately proportional to $x_0^2$,
which means that for small $x_0$ this is a more accurate approximation to the true derivative
and it is commonly used in numerical derivative codes. VeloxChem uses a default step size of $x_0 = 0.001~a_0$.
However, in both cases two calculations of the energy $E(x)$ need to be performed in order to obtain
the derivative with respect to one variable $x$.
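
The error scaling of the two formulas can be checked numerically. The following is a minimal sketch, assuming a toy Morse potential with illustrative parameters (`d_e`, `a`, `r_e` are not taken from any real molecule) whose derivative is known in closed form.

```python
import numpy as np

def morse(r, d_e=0.17, a=1.0, r_e=1.4):
    """Toy Morse potential; the parameters are illustrative assumptions."""
    return d_e * (1.0 - np.exp(-a * (r - r_e))) ** 2

def morse_gradient(r, d_e=0.17, a=1.0, r_e=1.4):
    """Closed-form derivative of the Morse potential, used as reference."""
    e = np.exp(-a * (r - r_e))
    return 2.0 * d_e * a * e * (1.0 - e)

def forward_difference(f, x, x0):
    """First-order divided difference (error roughly proportional to x0)."""
    return (f(x + x0) - f(x)) / x0

def symmetric_difference(f, x, x0):
    """Symmetric difference quotient (error roughly proportional to x0**2)."""
    return (f(x + x0) - f(x - x0)) / (2.0 * x0)

r = 1.8
exact = morse_gradient(r)
for x0 in (1e-1, 1e-2, 1e-3):
    err_fwd = abs(forward_difference(morse, r, x0) - exact)
    err_sym = abs(symmetric_difference(morse, r, x0) - exact)
    print(f"x0 = {x0:.0e}  forward error = {err_fwd:.2e}  symmetric error = {err_sym:.2e}")
```

Reducing $x_0$ by a factor of ten should reduce the forward-difference error by roughly ten and the symmetric-difference error by roughly one hundred, until round-off errors start to dominate.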

There are also higher-order methods for approximating the first derivative, such as the "five-point method" given by
```{math}
%:label: eq:five_point_method
\frac{\mathrm{d} E}{\mathrm{d} x} \approx \frac{E(x - 2 x_0) - 8 E(x - x_0) + 8 E(x + x_0) - E(x + 2 x_0)}{12x_0} 
```
where the error is approximately proportional to $x_0^4$ and it hence gives a very accurate approximation to the gradient.
However, since four individual energy calculations need to be performed for each perturbation,
this expression is usually only employed for debugging of the analytical derivative expressions.
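
To make the cost argument concrete, here is a small sketch of a five-point gradient over several coordinates. The function `toy_energy` is a hypothetical stand-in for an electronic-structure energy evaluation (a Lennard-Jones-like pair energy on a line), and the counter simply records how many energy calls are made.

```python
import numpy as np

def toy_energy(coords):
    """Hypothetical stand-in for an energy calculation:
    Lennard-Jones-like pair energy of particles on a line."""
    e = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = abs(coords[i] - coords[j])
            e += r ** -12 - 2.0 * r ** -6
    return e

def five_point_gradient(energy, coords, x0=1e-3):
    """Five-point estimate of dE/dx for every coordinate (4 energy calls each)."""
    coords = np.asarray(coords, dtype=float)
    grad = np.zeros_like(coords)
    n_calls = 0
    for k in range(coords.size):
        step = np.zeros_like(coords)
        step[k] = x0
        grad[k] = (energy(coords - 2 * step) - 8 * energy(coords - step)
                   + 8 * energy(coords + step) - energy(coords + 2 * step)) / (12 * x0)
        n_calls += 4
    return grad, n_calls

coords = np.array([0.0, 1.1, 2.3])
grad, n_calls = five_point_gradient(toy_energy, coords)
print("gradient:", grad)
print("energy evaluations:", n_calls)
```

For $N$ coordinates this requires $4N$ energy evaluations, which is why such stencils are mostly of interest as a reference when validating analytical gradients.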





(sec:analytical-gradients)=
## Analytical Gradients
%To add: Introduction, approaches to derive analytical gradients; Lagrangian approach.

To determine the energy gradient analytically, we first need to identify the non-variational components of the energy functional. In the case of self-consistent field (SCF)
methods, i.e., HF and Kohn--Sham DFT, these are the molecular-orbital (MO) coefficients, which exhibit an implicit dependence on the nuclear coordinates when atom-centered basis functions such as Gaussian or Slater-type atomic orbitals (AOs) are used. Considering this implicit dependence, the total energy derivative with respect to a particular nuclear coordinate $x$ is obtained via the chain rule {cite}`Rehn2015,Levchenko2005`
%
```{math}
%:label: eq:energy_functional_general
\frac{\mathrm{d}E}{\mathrm{d}x}=\frac{\partial E}{\partial x}+\frac{\partial E}{\partial\mathbf{C}}\frac{\mathrm{d}\mathbf{C}}{\mathrm{d}x} 
```
%
Here, $\mathbf{C}$ is the MO coefficient matrix, which transforms from a set of AOs $\{\chi_\mu\}$ to a set of MOs $\{\phi_p\}$ via

$$
\phi_p=\sum_\mu C_{\mu p}\chi_\mu 
$$

The first term, $\partial E/\partial x$, is the Hellmann--Feynman contribution which describes the explicit dependence of the energy on the nuclear coordinate $x$ through the nuclear-electron and nuclear-nuclear interaction terms of the Hamiltonian {cite}`Levchenko2005_thesis, Helgaker1988_analytical`. The second term stems from the implicit dependence of the energy on $x$ due to the fact that the molecular orbitals are expanded in a finite atom-centered basis set {cite}`Helgaker1988_analytical`.
%This term would vanish if a basis set independent on atomic positions, for example a plane wave basis, were used.

It may seem at first surprising that the derivative ${\partial E}/{\partial\mathbf{C}}$ has to be computed. If the MO coefficients are obtained variationally for a specified molecular geometry, how is it that this derivative is not zero? The key to this conundrum lies in the phrase "for a specified molecular geometry". Since the SCF energy and density are constructed using a constrained LCAO parametrization, if we perform a nuclear displacement, the "old" MO coefficients no longer correspond to the minimum energy and must be re-optimized. Thus the partial derivative with respect to the MO coefficients, as well as the derivative of the MO coefficients with respect to $x$ are required. The explicit computation of the latter is complicated, but can be avoided by making use of a new functional, the Lagrangian, for which the partial derivative $\partial L / \partial \mathbf{C}$ is zero and constrained to the HF/DFT configuration space by construction {cite}`Levchenko2005, Helgaker2014`
%
```{math}
%:label: eq:Lagrangian_general
L(\mathbf{C},\boldsymbol{\Lambda})=E(\mathbf{C})+\boldsymbol{\Lambda}f_c(\mathbf{C}) 
```
%
where $\boldsymbol{\Lambda}$ are a set of undetermined Lagrange multipliers and $f_c(\mathbf{C})=0$ define the constraints for the non-variational parameters $\mathbf{C}$. These constraints ensure that we are moving only in the HF/DFT configuration space, rather than in the infinite space of all orthogonally equivalent combinations of orbital bases {cite}`Helgaker1988_analytical`.
%From Helgaker and Jorgensen:"At each geometry we have a set of AOs  from  which an infinite set of orthogonally equivalent orbital bases can be constructed. As the geometry changes we must pick out exactly one of these orbital bases at each geometry $\mathbf{X}$. In this way an orthogonal  orbital connection is established. (A connection is called orthogonal if it  preserves orthonormality between the orbitals.) We further  require that the connection is continuous and differentiable. One may also wish to impose an additional requirement on the connection, namely that it  is translationally and rotationally  invariant. This may seem to be a trivial requirement. However, a connection is conveniently defined in terms of atomic Cartesian displacements rather than in terms of a set of nonredundant internal coordinates. This implies that each molecular geometry may be described in an infinite number of translationally and rotationally equivalent ways"

By using the Lagrangian, we have shifted the difficult problem of computing ${\mathrm{d} \mathbf{C}}/{\mathrm{d}x}$ to the much simpler problem of determining the unknown Lagrange multipliers which satisfy $\partial L / \partial \mathbf{C}=0$. Equations for these are derived by imposing that the explicit form of $\partial L / \partial \mathbf{C}$ is zero {cite}`Helgaker2014`. Once the Lagrange multipliers have been obtained, the total derivative of the energy functional with respect to the nuclear coordinate $x$ can be computed as
\begin{equation*}
\frac{\mathrm{d}E}{\mathrm{d}x}=\frac{\mathrm{d}L}{\mathrm{d}x}=\frac{\partial L}{\partial x}
\end{equation*}
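
The idea can be illustrated on a much smaller problem than HF. The sketch below assumes a hypothetical $2\times 2$ model with a "Hamiltonian" $\mathbf{H}(x)$ and an "overlap" $\mathbf{S}(x)$ that both depend on a parameter $x$. The energy is the minimum of $\mathbf{c}^\mathrm{T}\mathbf{H}\mathbf{c}$ under the constraint $\mathbf{c}^\mathrm{T}\mathbf{S}\mathbf{c}=1$, the Lagrange multiplier turns out to be $-E$, and the gradient follows from explicit derivatives only. All matrices and numbers are illustrative assumptions.

```python
import numpy as np

def matrices(x):
    """Hypothetical 2x2 model: H(x) and S(x) in a non-orthogonal basis."""
    s = np.exp(-0.5 * x ** 2)                      # overlap decays with x
    H = np.array([[-1.0, -0.8 * s], [-0.8 * s, -0.6 + 0.1 * x]])
    S = np.array([[1.0, s], [s, 1.0]])
    return H, S

def matrix_derivatives(x):
    """Explicit derivatives of H and S with respect to x (fixed coefficients)."""
    ds = -x * np.exp(-0.5 * x ** 2)
    dH = np.array([[0.0, -0.8 * ds], [-0.8 * ds, 0.1]])
    dS = np.array([[0.0, ds], [ds, 0.0]])
    return dH, dS

def ground_state(x):
    """Minimise c^T H c subject to c^T S c = 1 via symmetric orthogonalisation."""
    H, S = matrices(x)
    s_val, s_vec = np.linalg.eigh(S)
    X = s_vec @ np.diag(s_val ** -0.5) @ s_vec.T   # S^(-1/2)
    e, u = np.linalg.eigh(X @ H @ X)
    return e[0], X @ u[:, 0]

def lagrangian_gradient(x):
    """dE/dx = partial L / partial x = c^T (H^x - E S^x) c; dc/dx is never needed."""
    energy, c = ground_state(x)
    dH, dS = matrix_derivatives(x)
    return c @ (dH - energy * dS) @ c

x, x0 = 0.9, 1e-4
numerical = (ground_state(x + x0)[0] - ground_state(x - x0)[0]) / (2 * x0)
print(lagrangian_gradient(x), numerical)
```

The two printed numbers should agree to within the finite-difference error, even though the coefficients $\mathbf{c}$ change with $x$.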




### Ground state

(sec:hf-gradients)=
#### HF
As an example, we will derive in the following the analytical expression for the Hartree--Fock (HF) energy gradient.
The electronic ground-state energy described at the HF level of theory can be written as
(eq:HF_energy_fct)=
```{math}
%:label: eq:HF_energy_fct
E_\mathrm{HF}=\sum_{i}F_{ii}-\frac{1}{2}\sum_{ij} \langle ij || ij \rangle
```
where $F_{pq}$ are Fock matrix elements, $\langle pq || rs \rangle$ are anti-symmetrized two-electron integrals in physicists' ("1212") notation {cite}`Szabo2012`. As usual, the indices $i,j,\ldots$ denote occupied molecular orbitals, $a,b,\ldots$ denote unoccupied (virtual) ones, and $p,q,\ldots$ stand for both occupied and virtual, while Greek letters $\mu,\nu,\ldots$ are used for AO indices.

The Lagrangian for this energy functional is constructed as follows

(eq:hf-lagrangian)=
```{math}
%:label: eq:HF_Lagrangian
L(\mathbf{C},\boldsymbol{\Lambda},\boldsymbol{\Omega})=E_\mathrm{HF}+\sum_{p,q}\lambda_{pq}\left(F_{pq}-\delta_{pq}\epsilon_p\right)+\sum_{p,q}\omega_{pq}\left(S_{pq}-\delta_{pq}\right) 
```
with $\boldsymbol{\Lambda}=\{\lambda_{pq}\}$ and $\boldsymbol{\Omega}=\{\omega_{pq}\}$ being the Lagrange multipliers, $\mathbf{F}=\{F_{pq}\}$ the Fock matrix, $\{\epsilon_p\}$ the HF orbital energies, and $\mathbf{S}=\{S_{pq}\}$ the overlap matrix.
Here, the constraints associated with the multipliers $\{\lambda_{pq}\}$ ensure that the Fock matrix is diagonal, while those associated with $\{\omega_{pq}\}$ ensure that the MO overlap matrix is the identity for the HF state.

To calculate the total derivative of the energy with respect to $x$ we must now only calculate the partial derivative of the Lagrangian with respect to the same variable
(eq:partial_L)=
```{math}
%:label: eq:partial_L
\frac{\partial L}{\partial x}=\frac{\partial E_\mathrm{HF}}{\partial x}+\sum_{p,q}\lambda_{pq}\frac{\partial F_{pq}}{\partial x}+\sum_{p,q}\omega_{pq}\frac{\partial S_{pq}}{\partial x}
```
We want to re-write the above derivative of the Lagrangian in terms of effective density matrices.
For this purpose, we express the energy in terms of the one- and two-particle density matrices, $\boldsymbol{\gamma} = \{\gamma_{pq}\}$
and $\boldsymbol{\Gamma} = \{\Gamma_{pqrs}\}$, respectively
(eq:EHF_DM)=
```{math}
%:label: eq:EHF_DM
E_\mathrm{HF}=\sum_{p,q}F_{pq}\gamma_{pq}+\frac{1}{4}\sum_{pqrs}\Gamma_{pqrs}\langle pq || rs \rangle
```

With this definition, the [above equation](eq:partial_L) becomes
(eq:partial_L_final)=
```{math}
%:label: eq:partial_L_final
\frac{\partial L}{\partial x}&=\sum_{p,q}(\lambda_{pq}+\gamma_{pq}) F^{(x)}_{pq}+\frac{1}{4}\sum_{pqrs}\Gamma_{pqrs}\braket{pq||rs}^{(x)}+\sum_{p,q}\omega_{pq}S^{(x)}_{pq}\nonumber\\
&=\sum_{p,q}(\lambda_{pq}+\gamma_{pq}) h^{(x)}_{pq}+\sum_{p,q}(\lambda_{pq}+\gamma_{pq})\sum_{i}\braket{pi||qi}^{(x)}+\frac{1}{4}\sum_{pqrs}\Gamma_{pqrs}\braket{pq||rs}^{(x)}+\sum_{p,q}\omega_{pq}S^{(x)}_{pq}
```
where $h_{pq}$ represents a matrix element of the core-Hamiltonian operator. The superscript $(x)$ indicates a partial derivative with respect to variable $x$,
i.e., with fixed MO coefficients $\mathbf{C}$. Explicitly, they are given by {cite}`Levchenko2005`

(eq:hpq_Spq)=
```{math}
%:label: eq:hpq
h_{pq}^{(x)}&=\frac{\partial h_{pq}}{\partial x}=\sum_{\mu,\nu}C_{\mu p}h^{x}_{\mu\nu}C_{\nu q} \\
h^{x}_{\mu\nu}&=\bra{\chi_\mu}\frac{\partial \hat{h}}{\partial x}\ket{\chi_\nu}+\bra{\frac{\partial\chi_\mu}{\partial x}}\hat{h}\ket{\chi_\nu}+\bra{\chi_\mu}\hat{h}\ket{\frac{\partial\chi_\nu}{\partial x}} \\
\braket{pq||rs}^{(x)}&=\frac{\partial \braket{pq||rs}}{\partial x}=\sum_{\mu,\nu,\theta,\sigma}C_{\mu p}C_{\nu q}\braket{\chi_{\mu}\chi_{\nu}||\chi_{\theta}\chi_{\sigma}}^{x}C_{\theta r}C_{\sigma s} \\
\braket{\chi_{\mu}\chi_{\nu}||\chi_{\theta}\chi_{\sigma}}^{x}&=\braket{\frac{\partial\chi_{\mu}}{\partial x}\chi_{\nu}||\chi_{\theta}\chi_{\sigma}}+\braket{\chi_{\mu}\frac{\partial\chi_{\nu}}{\partial x}||\chi_{\theta}\chi_{\sigma}}+\braket{\chi_{\mu}\chi_{\nu}||\frac{\partial\chi_{\theta}}{\partial x}\chi_{\sigma}} + \braket{\chi_{\mu}\chi_{\nu}||\chi_{\theta}\frac{\partial\chi_{\sigma}}{\partial x}} \\
S_{pq}^{(x)} &=\frac{\partial S_{pq}}{\partial x}=\sum_{\mu,\nu}C_{\mu p}S_{\mu\nu}^x C_{\nu q} \\
S_{\mu\nu}^x &=\braket{\frac{\partial\chi_\mu}{\partial x}|\chi_\nu}+\braket{\chi_\mu|\frac{\partial\chi_\nu}{\partial x}} 
```

We also made use of the definition of the Fock matrix {cite}`Szabo2012` 
```{math}
%:label: eq:FockMatEl
F_{pq}=h_{pq}+\sum_i\braket{pi||qi}
```
[This equation](eq:partial_L_final) is the working equation to compute the partial derivative of the Lagrangian with respect to variable $x$, and implicitly the equation for the total derivative of the energy with respect to the same variable.
Two ingredients are required to calculate $\partial L/\partial x$: (1) the derivatives of the core-Hamiltonian matrix, of the anti-symmetrized two-electron integrals, and of the overlap matrix,
and (2) finding the density matrices $\boldsymbol{\gamma}$ and $\boldsymbol{\Gamma}$, as well as the Lagrange multipliers $\boldsymbol{\Lambda}$ and $\boldsymbol{\Omega}$.

By comparing [this equation](eq:EHF_DM) to [the HF energy expression](eq:HF_energy_fct), we can immediately identify the non-vanishing blocks of the density matrices
```{math}
%:label: eq:HF_gamma_ij
\gamma_{ij} &= \delta_{ij} \\
\Gamma_{ijkl} &= -2\delta_{ik}\delta_{jl}
```
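
As a quick consistency check, the two energy expressions can be compared numerically. The sketch below uses random numbers in place of real MO-basis integrals (an assumption made purely for illustration) and evaluates the HF energy once from the explicit expression and once from $\boldsymbol{\gamma}$ and $\boldsymbol{\Gamma}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_orb, n_occ = 6, 3

# Hypothetical random spin-orbital integrals (not from a real molecule).
h = rng.normal(size=(n_orb, n_orb))
h = 0.5 * (h + h.T)
g = rng.normal(size=(n_orb,) * 4)
g = g + g.transpose(1, 0, 3, 2) + g.transpose(2, 3, 0, 1) + g.transpose(3, 2, 1, 0)
asym = g - g.transpose(0, 1, 3, 2)               # <pq||rs> = <pq|rs> - <pq|sr>

o = slice(0, n_occ)
fock = h + np.einsum("piqi->pq", asym[:, o, :, o])  # F_pq = h_pq + sum_i <pi||qi>

# HF energy from the explicit expression
e_explicit = np.trace(fock[o, o]) - 0.5 * np.einsum("ijij->", asym[o, o, o, o])

# Same energy from the one- and two-particle density matrices
gamma = np.zeros((n_orb, n_orb))
gamma[o, o] = np.eye(n_occ)
Gamma = np.zeros((n_orb,) * 4)
Gamma[o, o, o, o] = -2.0 * np.einsum("ik,jl->ijkl", np.eye(n_occ), np.eye(n_occ))
e_density = np.einsum("pq,pq->", gamma, fock) + 0.25 * np.einsum("pqrs,pqrs->", Gamma, asym)

print(e_explicit, e_density)
```

Both numbers should coincide, which confirms that only the occupied-occupied blocks of the density matrices are non-zero at the HF level.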
Equations for the $\{\lambda_{pq}\}$ and $\{\omega_{pq}\}$ multipliers are obtained by requiring the Lagrangian to be stationary with respect to the orbital transformation matrix $\{C_{\mu t}\}$
```{math}
%:label: eq:OrbitalRspCond
\frac{\partial L}{\partial C_{\mu t}}=0 
```
or, in a form more convenient for implementation {cite}`Levchenko2005,Rehn2019`,
```{math}
%:label: eq:OrbitalRspCond_program
\sum_{\mu}C_{\mu u}\frac{\partial L}{\partial C_{\mu t}}=0
```
To calculate the partial derivative of the Lagrangian with respect to $C_{\mu t}$, we will need the following three expressions
```{math}
%:label: eq:derivFock
\sum_{\mu}C_{\mu u}\frac{\partial F_{pq}}{\partial C_{\mu t}}&=\sum_{\mu}C_{\mu u}\left(\frac{\partial h_{pq}}{\partial C_{\mu t}}+\sum_i\frac{\partial \braket{pi||qi}}{\partial C_{\mu t}}\right)\nonumber\\
&=h_{uq}\delta_{pt}+h_{pu}\delta_{qt}+\sum_i\braket{ui||qi}\delta_{pt}+\sum_i\braket{pi||ui}\delta_{qt}\nonumber\\
&\quad + \sum_i\braket{pu||qi}\delta_{it}+\sum_i\braket{pi||qu}\delta_{it}\nonumber \\
&=F_{uq}\delta_{pt}+F_{pu}\delta_{qt}+\braket{pu||qt}\delta_{t\epsilon_o}+\braket{pt||qu}\delta_{t\epsilon_o} \label{eq:derivFock} \\
\sum_{\mu}C_{\mu u}\frac{\partial \braket{pq||rs}}{\partial C_{\mu t}} &= \braket{uq||rs}\delta_{pt}+\braket{pu||rs}\delta_{qt}+\braket{pq||us}\delta_{rt}+\braket{pq||ru}\delta_{st} \\
\sum_{\mu}C_{\mu u}\frac{\partial S_{pq}}{\partial C_{\mu t}} &= S_{uq}\delta_{pt}+S_{pu}\delta_{qt} \label{eq:derivOverlap}
```
where we have used the definitions of the Fock matrix, two-electron integrals and overlap matrix in terms of the orbital transformation matrix $\{C_{\mu p}\}$ -- see [here](eq:hpq_Spq). The Kronecker delta $\delta_{t\epsilon_\mathrm{o}}$ is equal to one if $t$ is an occupied orbital and is zero otherwise.

Using the Lagrangian expressed in terms of the density matrices:
```{math}
%:label: eq:L_DMs
L=\sum_{p,q}\gamma_{pq}F_{pq}+\sum_{p,q}\lambda_{pq}(F_{pq}-\delta_{pq}\epsilon_p)+\frac{1}{4}\sum_{pqrs}\Gamma_{pqrs}\braket{pq||rs}+\sum_{p,q}\omega_{pq}(S_{pq}-\delta_{pq})
```

the partial derivative of the Lagrangian with respect to $\mathbf{C}$ can be written as:
```{math}
%:label: eq:OrbRsp1
\sum_{\mu}C_{\mu u}\frac{\partial L}{\partial C_{\mu t}}&=\sum_{p,q}\left(\gamma_{pq}+\lambda_{pq}\right)\left[F_{uq}\delta_{pt}+F_{pu}\delta_{qt}+\braket{pu||qt}\delta_{t\epsilon_o}+\braket{pt||qu}\delta_{t\epsilon_o}\right]\nonumber \\
&+\frac{1}{4}\sum_{p,q,r,s}\Gamma_{pqrs}\left[\braket{uq||rs}\delta_{pt}+\braket{pu||rs}\delta_{qt}+\braket{pq||us}\delta_{rt}+\braket{pq||ru}\delta_{st}\right]\nonumber\\
&+\sum_{p,q}\omega_{pq}\left(S_{uq}\delta_{pt}+S_{pu}\delta_{qt}\right)
```
By using the conditions $F_{pq}=\epsilon_p\delta_{pq}$ and $S_{pq}=\delta_{pq}$, the above equation becomes:
```{math}
%:label: eq:OrbRsp2
\sum_{\mu}C_{\mu u}\frac{\partial L}{\partial C_{\mu t}}&=\sum_{p,q}\left(\gamma_{pq}+\lambda_{pq}\right)\left[\epsilon_u\delta_{uq}\delta_{pt}+\epsilon_u\delta_{pu}\delta_{qt}+\braket{pu||qt}\delta_{t\epsilon_o}+\braket{pt||qu}\delta_{t\epsilon_o}\right]\nonumber \\
&+\frac{1}{4}\sum_{p,q,r,s}\Gamma_{pqrs}\left[\braket{uq||rs}\delta_{pt}+\braket{pu||rs}\delta_{qt}+\braket{pq||us}\delta_{rt}+\braket{pq||ru}\delta_{st}\right]\nonumber\\
&+\sum_{p,q}\omega_{pq}\left(\delta_{uq}\delta_{pt}+\delta_{pu}\delta_{qt}\right) \nonumber \\
&=2\left(\gamma_{ut}+\lambda_{ut}\right)\epsilon_u+2\sum_{p,q}\left(\gamma_{pq}+\lambda_{pq}\right)\braket{pu||qt}\delta_{t\epsilon_o}+\sum_{p,q,r}\Gamma_{tpqr}\braket{up||qr}+2\omega_{ut}
```
where we have used $\gamma_{pq}=\gamma_{qp}$, $\braket{pu||qt}=\braket{qt||pu}$, $\Gamma_{pqrs}=\Gamma_{qpsr}=\Gamma_{rspq}$ (real orbitals), and we have imposed that $\lambda_{pq}=\lambda_{qp}$ and $\omega_{pq}=\omega_{qp}$ (symmetric representation). Some of the indices have been renamed.

To obtain equations for the orbital response Lagrange multipliers, we first have to decouple $\boldsymbol{\Lambda}$ from $\boldsymbol{\Omega}$ by taking the difference
```{math}
%:label: eq:decouple
\sum_{\mu}C_{\mu u}\frac{\partial L}{\partial C_{\mu t}}-\sum_{\mu}C_{\mu t}\frac{\partial L}{\partial C_{\mu u}}&=2(\gamma_{ut}+\lambda_{ut})(\epsilon_u-\epsilon_t)\nonumber\\&+2\sum_{p,q}(\gamma_{pq}+\lambda_{pq})\left(\braket{pu||qt}\delta_{t\epsilon_o}-\braket{pt||qu}\delta_{u\epsilon_o}\right)\nonumber\\
&+\,\,\,\,\sum_{p,q,r}\left(\Gamma_{tpqr}\braket{up||qr}-\Gamma_{upqr}\braket{tp||qr}\right)
```
The system of equations for $\boldsymbol{\Lambda}$ is then obtained by choosing $u$ and $t$ from different orbital spaces in the following equation
(eq:OrbRspEq)=
```{math}
%:label: eq:OrbRspEq
&2(\gamma_{ut}+\lambda_{ut})(\epsilon_u-\epsilon_t)+2\sum_{p,q}(\gamma_{pq}+\lambda_{pq})\left(\braket{pu||qt}\delta_{t\epsilon_o}-\braket{pt||qu}\delta_{u\epsilon_o}\right)\nonumber\\
&+\sum_{p,q,r}\left(\Gamma_{tpqr}\braket{up||qr}-\Gamma_{upqr}\braket{tp||qr}\right)=0
```
Once $\boldsymbol{\Lambda}$ is determined, $\boldsymbol{\Omega}$ can be calculated in a similar way, using the following equation

(eq:omega)=
```{math}
%:label: eq:omega
2\left(\gamma_{ut}+\lambda_{ut}\right)\epsilon_u+2\sum_{p,q}\left(\gamma_{pq}+\lambda_{pq}\right)\braket{pu||qt}\delta_{t\epsilon_o}+\sum_{p,q,r}\Gamma_{tpqr}\braket{up||qr}+2\omega_{ut}=0
```

If we explicitly write the equations for different blocks of $\boldsymbol{\Lambda}$, we find that they are all zero. This simplifies the equation for $\boldsymbol{\Omega}$ to
```{math}
%:label: eq:omega_HF
2\gamma_{ut}\epsilon_u+2\sum_{p,q}\gamma_{pq}\braket{pu||qt}\delta_{t\epsilon_o}+\sum_{p,q,r}\Gamma_{tpqr}\braket{up||qr}+2\omega_{ut}=0
```
The only non-zero block of $\boldsymbol{\Omega}$ is the occupied-occupied block
```{math}
%:label: eq:omega_hf_oo
&u=i,\, t=j \nonumber\\
&2\gamma_{ij}\,\epsilon_i+2\sum_{p,q}\gamma_{pq}\braket{pi||qj}\delta_{j\epsilon_o}+\sum_{p,q,r}\Gamma_{jpqr}\braket{ip||qr}+2\omega_{ij} = 0 \nonumber \\
&\Leftrightarrow
2\delta_{ij}\epsilon_i+2\sum_{k,l}\delta_{kl}\braket{ki||lj}+\sum_{klm}(-2\delta_{jl}\delta_{km})\braket{ik||lm}+2\omega_{ij} =0\nonumber \\
&\Leftrightarrow
\omega_{ij}=-\epsilon_i\delta_{ij}
```

(eq:HF_grad_final)=
Using the expressions for the density matrices and the non-zero Lagrange multipliers, we get the following expression for the electronic HF energy derivative:
```{math}
%:label: eq:HF_grad_final
\frac{\mathrm{d}E_{\text{HF}}}{\mathrm{d} x}&=\sum_{i,j}\delta_{ij} h^{(x)}_{ij}+\sum_{i,j}\delta_{ij}\sum_{k}\braket{ik||jk}^{(x)} - \frac{1}{2}\sum_{i,j,k,l}\delta_{ik}\delta_{jl}\braket{ij||kl}^{(x)}-\sum_{i,j}\epsilon_{i}\delta_{ij}S^{(x)}_{ij}\nonumber \\
&=\sum_{i}h^{(x)}_{ii}+\frac{1}{2}\sum_{i,j}\braket{ij||ij}^{(x)}-\sum_{i}\epsilon_{i}S^{(x)}_{ii}
```
Finally, the derivative of the total energy is obtained by adding the derivative of the nuclear repulsion energy, $\mathrm{d} V_{nn}/\mathrm{d} x$, which is trivial to evaluate {cite}`Szabo2012`.
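
In practice, the derivative integrals are available in the AO basis, so the final HF expression is often contracted directly with AO density matrices. The sketch below shows one possible way to do this, using the density matrix $D_{\mu\nu}=\sum_i C_{\mu i}C_{\nu i}$ and an energy-weighted density matrix $W_{\mu\nu}=\sum_i \epsilon_i C_{\mu i}C_{\nu i}$. The latter is not introduced in the text and is used here as a common reformulation. All arrays are random placeholders rather than real integrals.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ao, n_occ = 5, 2

# Hypothetical inputs (random numbers): spin-orbital MO coefficients,
# orbital energies, and AO derivative integrals for one coordinate x.
C = rng.normal(size=(n_ao, n_ao))
eps = np.sort(rng.normal(size=n_ao))
h_x = rng.normal(size=(n_ao, n_ao))
h_x = 0.5 * (h_x + h_x.T)
S_x = rng.normal(size=(n_ao, n_ao))
S_x = 0.5 * (S_x + S_x.T)
g_x = rng.normal(size=(n_ao,) * 4)   # stands in for <mu nu||theta sigma>^x

C_occ = C[:, :n_occ]
D = C_occ @ C_occ.T                          # density matrix
W = C_occ @ np.diag(eps[:n_occ]) @ C_occ.T   # energy-weighted density matrix

# MO-basis evaluation: sum_i h_ii + 1/2 sum_ij <ij||ij> - sum_i eps_i S_ii
h_mo = C_occ.T @ h_x @ C_occ
S_mo = C_occ.T @ S_x @ C_occ
g_mo = np.einsum("mi,nj,mnts,tk,sl->ijkl", C_occ, C_occ, g_x, C_occ, C_occ)
grad_mo = (np.trace(h_mo) + 0.5 * np.einsum("ijij->", g_mo)
           - np.sum(eps[:n_occ] * np.diag(S_mo)))

# Equivalent AO-basis contraction with D and W
grad_ao = (np.sum(D * h_x) + 0.5 * np.einsum("mt,ns,mnts->", D, D, g_x)
           - np.sum(W * S_x))

print(grad_mo, grad_ao)
```

The AO form avoids transforming the derivative integrals to the MO basis for every nuclear coordinate.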



(sec:dft-gradients)=
#### DFT

The [DFT](kohn-sham-equation) gradient can be derived in a similar way, (partially) replacing the exact exchange integrals with the corresponding exchange-correlation (xc) functional contributions.
Here, we only give equations for the simplest case of the [local density approximation](lda) (LDA),
but the approach is easily generalizable to other types of xc functionals.
Instead of using the Kohn--Sham (KS) matrix in the energy expression, we employ the core Hamiltonian

$$
E_{\text{DFT}} = \sum_{i} h_{ii} + \frac{1}{2} \sum_{i,j} \big( \langle ij | ij \rangle - c_{\text{x}} \langle ij | ji \rangle \big) + E_{\text{xc}} 
$$

where $0 \leq c_{\text{x}} < 1$ is the fraction of exact exchange in [hybrid functionals](sec:hybrid-functionals) and $E_{\text{xc}}$ is the exchange-correlation energy contribution, which is often written in the form

(eq:xc-energy)=

$$
E_{\text{xc}} = \int e_{\text{xc}}[n(\mathbf{r})] \mathrm{d} \mathbf{r} 
$$

Setting up the Lagrangian in the same way as for [HF](eq:hf-lagrangian) with $E_{\text{HF}}$ replaced by $E_{\text{DFT}}$ yields the same results for the Lagrange multipliers.
Thus, the only additional consideration we need to take into account is the partial derivative of $E_{\text{xc}}$.
For this, we consider the [KS electron density](eq:ks-density)

$$
n(\mathbf{r}) = \sum_{i} | \phi_i(\mathbf{r}) |^2 = \sum_{\mu \nu} D_{\mu \nu} \chi_{\mu}(\mathbf{r}) \chi_{\nu}(\mathbf{r}) 
$$

where $\mathbf{D}$ is the AO density matrix given in terms of the (real) MO coefficients as $D_{\mu \nu} = \sum_{i} C_{\mu i} C_{\nu i}$.
The partial derivative of $E_{\text{xc}}$ with respect to $x$ can thus be written as

$$
E_{\text{xc}}^{(x)} = \frac{\partial E_{\text{xc}}}{\partial x} = \int \frac{\partial e_{\text{xc}}}{\partial n(\mathbf{r})} n^{(x)}(\mathbf{r}) \mathrm{d} \mathbf{r} 
$$

where the partial derivative of the density is given by

(eq:density-partial-deriv)=
$$
n^{(x)} = \frac{\partial n(\mathbf{r})}{\partial x} = \sum_{\mu \nu} D_{\mu \nu} \frac{\partial (\chi_{\mu} \chi_{\nu})}{\partial x} 
$$
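
The two expressions above can be tried out on a toy model. The following sketch assumes one-dimensional Gaussian "AOs", a fixed AO density matrix, an LDA-like energy density $e_{\text{xc}} = -c\, n^{4/3}$ with an illustrative constant $c$, and a fixed uniform grid whose weights do not depend on the atomic positions. None of these choices come from the text; they only serve to compare the analytical partial derivative with a finite-difference estimate at fixed $\mathbf{D}$.

```python
import numpy as np

def basis(r, centers, alpha=0.8):
    """1D Gaussian 'AOs' chi_mu(r) = exp(-alpha (r - A_mu)^2) on the grid."""
    return np.exp(-alpha * (r[:, None] - centers[None, :]) ** 2)

def density(r, centers, D):
    chi = basis(r, centers)
    return np.einsum("gm,mn,gn->g", chi, D, chi)

def density_derivative(r, centers, D, k, alpha=0.8):
    """n^(x)(r) = sum D_{mu nu} d(chi_mu chi_nu)/dx for x = position of centre k."""
    chi = basis(r, centers, alpha)
    dchi = np.zeros_like(chi)
    dchi[:, k] = 2.0 * alpha * (r - centers[k]) * chi[:, k]
    return (np.einsum("gm,mn,gn->g", dchi, D, chi)
            + np.einsum("gm,mn,gn->g", chi, D, dchi))

def exc(r, centers, D, c=0.7386):
    """Toy E_xc = -c * integral of n^(4/3), summed on a fixed uniform grid."""
    return -c * np.sum(density(r, centers, D) ** (4.0 / 3.0)) * (r[1] - r[0])

def exc_derivative(r, centers, D, k, c=0.7386):
    """E_xc^(x) = integral of (d e_xc / d n) * n^(x) on the same grid."""
    de_dn = -(4.0 / 3.0) * c * density(r, centers, D) ** (1.0 / 3.0)
    return np.sum(de_dn * density_derivative(r, centers, D, k)) * (r[1] - r[0])

r = np.linspace(-8.0, 10.0, 2001)
centers = np.array([0.0, 1.5])
D = np.array([[0.6, 0.2], [0.2, 0.6]])
x0 = 1e-4
step = np.array([0.0, x0])
numerical = (exc(r, centers + step, D) - exc(r, centers - step, D)) / (2 * x0)
print(exc_derivative(r, centers, D, k=1), numerical)
```

Because the grid is held fixed, no weight derivatives enter in this toy setting.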

It should be noted that the exchange-correlation functional contribution to the DFT energy and its molecular gradient is evaluated via [numerical integration](sec:kernel-integration).
Thus, the molecular gradient includes grid point weight contributions, which arise from the explicit dependence of the grid partitioning function on the molecular geometry.
Neglecting these contributions to the molecular gradient leads to the breakdown of rotation-translation invariance of the molecular gradient.
Despite this, if a fine integration grid is used in practical calculations, the grid point weight contribution to the molecular gradient can safely be neglected.

% Could numerical examples be a good addition here?


(sec:mp2-gradients)=
#### MP2
In the case of Møller--Plesset (MP) perturbation theory as well as coupled-cluster theory, the energy functional has additional non-variational parameters that have to be considered when computing the gradient.
These are the so-called $t$-amplitudes $\mathbf{T} = \{ t_{ijab} \}$, so the corresponding term which has to be determined is called amplitude response.
```{math}
%:label: eq:energy_functional_MP
\frac{\mathrm{d}E}{\mathrm{d} x}=\frac{\partial E}{\partial x}+\frac{\partial E}{\partial\mathbf{C}}\frac{\mathrm{d}\mathbf{C}}{\mathrm{d}x}+\frac{\partial E}{\partial\mathbf{T}}\frac{\mathrm{d}\mathbf{T}}{\mathrm{d}x} 
```
The analytic expression for the MP energy gradient is obtained in a very similar way as we have done for the SCF ground state. The difference is that the Lagrangian contains additional Lagrange multipliers and constraints for the $t$-amplitudes. After obtaining the corresponding amplitude response Lagrange multipliers, these additional contributions will be written in terms of one- and two-particle density matrices, exactly as the total energy. Let's illustrate the procedure for the second-order [MP perturbation theory](sec:mp2) (MP2).
At this level of theory, the total energy functional can be [written as](https://kthpanor.github.io/echem/docs/elec_struct/mp2.html)

(eq:MP2_energy_fct)=
```{math}
%:label: eq:MP2_energy_fct
E_\mathrm{MP2} = E_\mathrm{HF}+E_0^{(2)} = \sum_{i}F_{ii}-\frac{1}{2}\sum_{ij}\braket{ij||ij}-\frac{1}{4}\sum_{i,j,a,b}\braket{ij||ab}t_{ijab} \label{eq:MP2_energy_fct}
```
where
```{math}
%:label: eq:MP2_tamplitudes
t_{ijab} = \frac{\braket{ij||ab}}{\epsilon_a+\epsilon_b-\epsilon_i-\epsilon_j} \label{eq:MP2_tamplitudes}
```

are the MP2 $t$-amplitudes.
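
A minimal sketch of how the amplitudes and the second-order energy correction can be evaluated with array operations is given below. The integrals $\braket{ij||ab}$ and orbital energies are random or made-up placeholders, so the numbers have no physical meaning.

```python
import numpy as np

rng = np.random.default_rng(2)
n_occ, n_vir = 3, 4

# Hypothetical data: random antisymmetrised integrals <ij||ab>
v = rng.normal(size=(n_occ, n_occ, n_vir, n_vir))
v = v - v.transpose(1, 0, 2, 3)      # antisymmetric in i, j
v = v - v.transpose(0, 1, 3, 2)      # antisymmetric in a, b
eps_occ = np.array([-1.2, -0.8, -0.5])
eps_vir = np.array([0.3, 0.6, 0.9, 1.4])

# Orbital-energy denominator eps_a + eps_b - eps_i - eps_j
denom = (eps_vir[None, None, :, None] + eps_vir[None, None, None, :]
         - eps_occ[:, None, None, None] - eps_occ[None, :, None, None])

t = v / denom                                        # t_ijab
e_mp2_corr = -0.25 * np.einsum("ijab,ijab->", v, t)  # second-order correction
print("E(2) =", e_mp2_corr)
```

Since all denominators are positive for a ground state with a finite occupied-virtual gap, the correction is negative, and the amplitudes inherit the antisymmetry of the integrals.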

The Lagrangian corresponding to this energy functional is

(eq:MP2_Lagrangian)=
```{math}
%:label: eq:MP2_Lagrangian
L(\mathbf{C}, \mathbf{T},\boldsymbol{\Lambda},\boldsymbol{\Omega},\mathbf{\tilde{T}})=E_\mathrm{MP2}+\sum_{p,q}\lambda_{pq}\left(F_{pq}-\delta_{pq}\epsilon_p\right)+\sum_{p,q}\omega_{pq}\left(S_{pq}-\delta_{pq}\right)+\sum_{i,j,a,b}\tilde{t}_{ijab}f_t(t_{ijab})
```
Here, $\tilde{\mathbf{T}}=\{\tilde{t}_{ijab}\}$ are the amplitude response Lagrange multipliers and $f_t(\mathbf{T})=0$ is the constraint. For MP2, this is
```{math}
%:label: eq:t_condition
f_t(\mathbf{T})=t_{ijab}\left(\epsilon_a+\epsilon_b-\epsilon_i-\epsilon_j\right)-\braket{ij||ab}
```

The amplitude response Lagrange multipliers are determined by requiring the Lagrangian to be stationary with respect to the $t$-amplitudes.
```{math}
%:label: eq:amplitude_rsp_general
\frac{\partial L}{\partial \mathbf{T}}=0
```
Replacing $L$ and $\mathbf{T}$ in the equation above with the corresponding MP2 expressions, we get
```{math}
%:label: eq:t_cond_explicit
\frac{\partial L}{\partial t_{ijab}}&=\frac{\partial E_\mathrm{MP2}}{\partial t_{ijab}}+\sum_{k,l,c,d}\frac{\partial \tilde{t}_{klcd}f_t(t_{klcd})}{\partial t_{ijab}}\nonumber\\
&=-\frac{1}{4}\sum_{k,l,c,d}\braket{kl||cd}\frac{\partial t_{klcd}}{\partial t_{ijab}}+\sum_{k,l,c,d}\tilde{t}_{klcd}\frac{\partial t_{klcd}}{\partial t_{ijab}}\left(\epsilon_c+\epsilon_d-\epsilon_k-\epsilon_l\right)\nonumber\\
&=-\braket{ij||ab}+4\tilde{t}_{ijab}\left(\epsilon_a+\epsilon_b-\epsilon_i-\epsilon_j\right) \label{eq:t_cond_explicit}
```
From the two equations above it follows that
(eq:ampl_rsp_multipliers)=
```{math}
%:label: eq:ampl_rsp_multipliers
\tilde{t}_{ijab} = \frac{1}{4}\frac{\braket{ij||ab}}{\epsilon_a+\epsilon_b-\epsilon_i-\epsilon_j}=\frac{1}{4}t_{ijab}
```
From here, we can follow the same procedure as we did for the SCF gradient. We first rewrite the Lagrangian in terms of one- and two-particle density matrices
```{math}
%:label: eq:L_DMs_MP
L&=\sum_{p,q}\gamma_{pq}F_{pq}+\sum_{p,q}\lambda_{pq}(F_{pq}-\delta_{pq}\epsilon_p)+\frac{1}{4}\sum_{pqrs}\Gamma_{pqrs}\braket{pq||rs}\nonumber\\
&+\sum_{p,q}\omega_{pq}(S_{pq}-\delta_{pq})+\sum_{i,j,a,b}\tilde{t}_{ijab}\left[t_{ijab}\left(\epsilon_a+\epsilon_b-\epsilon_i-\epsilon_j\right)-\braket{ij||ab}\right]\nonumber\\
&=\sum_{p,q}\gamma_{pq}F_{pq}+\sum_{p,q}\lambda_{pq}(F_{pq}-\delta_{pq}\epsilon_p)+\frac{1}{4}\sum_{pqrs}\Gamma_{pqrs}\braket{pq||rs}\nonumber\\
&+\sum_{p,q}\omega_{pq}(S_{pq}-\delta_{pq})+\sum_{p,q}\gamma^\mathrm{A}_{pq}F_{pq}+\frac{1}{4}\sum_{pqrs}\Gamma^\mathrm{A}_{pqrs}\braket{pq||rs}
```
where we have written also the amplitude contribution in terms of one- and two-particle density matrices, $\gamma^\mathrm{A}_{pq}$ and $\Gamma^\mathrm{A}_{pqrs}$ respectively.
Denoting $\boldsymbol{\gamma}'=\boldsymbol{\gamma}+\boldsymbol{\gamma}^\mathrm{A}$ and $\boldsymbol{\Gamma}'=\boldsymbol{\Gamma}+\boldsymbol{\Gamma}^\mathrm{A}$, the Lagrangian becomes
```{math}
%:label: eq:final_L_DMs_MP
L=\sum_{p,q}\gamma'_{pq}F_{pq}+\sum_{p,q}\lambda_{pq}(F_{pq}-\delta_{pq}\epsilon_p)+\frac{1}{4}\sum_{pqrs}\Gamma'_{pqrs}\braket{pq||rs}+\sum_{p,q}\omega_{pq}(S_{pq}-\delta_{pq})
```
To be able to obtain the gradient, we now must identify the density matrices and then solve the orbital response equations. The density matrices corresponding to the HF contribution are the same as in the previous section. There are additional contributions from the MP2 energy correction, as well as the amplitude response terms. The MP2 energy contribution can be easily identified from the last term of the corresponding [explicit expression](eq:MP2_energy_fct) and gives rise to the following two-particle density matrix: 
```{math}
%:label: eq:MP2_2pdm_oovv
\Gamma_{ijab} = -\frac{1}{2} t_{ijab} 
```
The amplitude response ($R^\mathrm{A}_\mathrm{MP2}$) contributions are also reasonably easy to identify
```{math}
%:label: eq:RA_explicit
R^\mathrm{A}_\mathrm{MP2}&=\sum_{i,j,a,b}\tilde{t}_{ijab}\left[t_{ijab}(\epsilon_a+\epsilon_b-\epsilon_i-\epsilon_j)-\braket{ij||ab}\right]\nonumber\\
&=-\sum_{i,j,a,b}\braket{ij||ab}\tilde{t}_{ijab}+\sum_{i,j,a,b}\tilde{t}_{ijab}\left[\sum_{c}\left(\epsilon_{a}\delta_{ac}+\epsilon_b\delta_{bc}\right)t_{ijab}-\sum_k\left(\epsilon_{i}\delta_{ik}+\epsilon_j\delta_{jk}\right)t_{ijab}\right]\nonumber\\
&=-\sum_{i,j,a,b}\braket{ij||ab}\tilde{t}_{ijab}+\sum_{i,j,a,b}\tilde{t}_{ijab}\left(\sum_c F_{ac}t_{ijcb}+\sum_c F_{bc}t_{ijac}-\sum_k F_{ik}t_{kjab}-\sum_k F_{jk}t_{ikab}\right)\nonumber \\
\nonumber \\
&\Downarrow \mathrm{renaming\,\, indices} \nonumber \\
R^\mathrm{A}_\mathrm{MP2}&=-\sum_{i,j,a,b}\braket{ij||ab}\tilde{t}_{ijab}+\sum_{a,b}F_{ab}\sum_{i,j,c}\left(\tilde{t}_{ijac}t_{ijbc}+\tilde{t}_{ijbc}t_{ijac}\right)\nonumber\\
&-\sum_{i,j}F_{ij}\sum_{k,a,b}\left(\tilde{t}_{ikab}t_{jkab}+\tilde{t}_{jkab}t_{ikab}\right)\,,\label{eq:RA_explicit}
```
resulting in the following density matrices
```{math}
%:label: eq:gammaA_ij
\gamma_{ij}^\mathrm{A}=-\sum_{k,a,b}\left(\tilde{t}_{ikab}t_{jkab}+\tilde{t}_{jkab}t_{ikab}\right)
```
```{math}
%:label: eq:gammaA_ab
\gamma_{ab}^\mathrm{A}=\sum_{i,j,c}\left(\tilde{t}_{ijac}t_{ijbc}+\tilde{t}_{ijbc}t_{ijac}\right)
```
```{math}
%:label: eq:GammaA_ijab
\Gamma_{ijab}^\mathrm{A}=-2\,\tilde{t}_{ijab} 
```
Combining all density matrices together and replacing the amplitude response Lagrange multipliers with the corresponding [explicit expression](eq:ampl_rsp_multipliers), we have
```{math}
%:label: eq:mp2_gamma_ij
\gamma'_{ij}=\gamma_{ij}+\gamma_{ij}^\mathrm{A}=\delta_{ij}-\sum_{k,a,b}\left(\tilde{t}_{ikab}t_{jkab}+\tilde{t}_{jkab}t_{ikab}\right)=\delta_{ij}-\frac{1}{2}\sum_{k,a,b}t_{ikab}t_{jkab}  
```
```{math}
%:label: eq:mp2_gamma_ab
\gamma'_{ab}=\gamma_{ab}^\mathrm{A}=\sum_{i,j,c}\left(\tilde{t}_{ijac}t_{ijbc}+\tilde{t}_{ijbc}t_{ijac}\right)=\frac{1}{2}\sum_{i,j,c}t_{ijac}t_{ijbc}
```
```{math}
%:label: eq:mp2_Gamma_ijkl
\Gamma'_{ijkl} = \Gamma_{ijkl} = -2\delta_{ik}\delta_{jl}
```
```{math}
%:label: eq:mp2_Gamma_ijab
\Gamma'_{ijab} = \Gamma_{ijab}+\Gamma_{ijab}^\mathrm{A}=-\frac{1}{2}t_{ijab}-2\,\tilde{t}_{ijab}=-t_{ijab} 
```
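
The MP2 density matrices can be assembled with a few tensor contractions. The sketch below uses random antisymmetric amplitudes as a placeholder (an assumption for illustration only) and checks two simple properties: the closed forms in terms of $t_{ijab}$ alone, and the fact that the amplitude contributions do not change the trace of the one-particle density matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
n_occ, n_vir = 3, 4

# Hypothetical antisymmetric amplitudes t_ijab (random placeholder values)
t = 0.05 * rng.normal(size=(n_occ, n_occ, n_vir, n_vir))
t = t - t.transpose(1, 0, 2, 3)
t = t - t.transpose(0, 1, 3, 2)
t_tilde = 0.25 * t                   # amplitude response multipliers

# Amplitude-response one-particle density matrices
gamma_oo_A = -(np.einsum("ikab,jkab->ij", t_tilde, t)
               + np.einsum("jkab,ikab->ij", t_tilde, t))
gamma_vv_A = (np.einsum("ijac,ijbc->ab", t_tilde, t)
              + np.einsum("ijbc,ijac->ab", t_tilde, t))

# Total density matrices (HF part plus MP2 and amplitude-response parts)
gamma_oo = np.eye(n_occ) + gamma_oo_A
gamma_vv = gamma_vv_A
Gamma_oovv = -0.5 * t - 2.0 * t_tilde

print("trace of one-particle density:", np.trace(gamma_oo) + np.trace(gamma_vv))
```

The printed trace should equal the number of occupied spin orbitals, since the occupied-occupied and virtual-virtual corrections cancel in the trace.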
Finally, to determine the orbital response Lagrange multipliers $\boldsymbol{\Lambda}$, we insert these density matrices into the [orbital response equation](eq:OrbRspEq).
The only non-zero block of $\boldsymbol\Lambda$ is the occupied-virtual block and the HF density matrices cancel out, so the orbital response equation is
```{math}
%:label: eq:lambda_mp2_ov
&u=i,\, t=a \nonumber\\
&2\lambda_{ia}(\epsilon_i-\epsilon_a)-2\sum_{p,q}(\gamma'_{pq}+\lambda_{pq})\braket{pa||qi}\delta_{i\epsilon_o}\nonumber+\sum_{p,q,r}\left(\Gamma'_{apqr}\braket{ip||qr}-\Gamma'_{ipqr}\braket{ap||qr}\right)=0\nonumber \\
&{^{(1)}}\Leftrightarrow 2\lambda_{ia}(\epsilon_i-\epsilon_a)-2\sum_{j,k}\gamma^\mathrm{A}_{jk}\braket{ja||ki}-2\sum_{b,c}\gamma^\mathrm{A}_{bc}\braket{ba||ci}-2\sum_{j,b}\lambda_{jb}\left(\braket{ja||bi}+\braket{ba||ji}\right)\nonumber\\
&+\sum_{b,j,k}\Gamma'_{abjk}\braket{ib||jk}-\sum_{j,b,c}\Gamma'_{ijbc}\braket{aj||bc}=0\nonumber \\
&{^{(2)}}\Leftrightarrow
\lambda_{ia}(\epsilon_i-\epsilon_a)-\sum_{j,b}\lambda_{jb}\left(\braket{ja||bi}+\braket{ba||ji}\right)=\sum_{j,k}\gamma^\mathrm{A}_{jk}\braket{ki||ja}-\sum_{b,c}\gamma^\mathrm{A}_{bc}\braket{ic||ba}\nonumber \\
&-\frac{1}{2}\sum_{b,j,k}\Gamma'_{abjk}\braket{ib||jk}+\frac{1}{2}\sum_{j,b,c}\Gamma'_{ijbc}\braket{aj||bc}\nonumber \\
&{^{(3)}}\Leftrightarrow \lambda_{ia}(\epsilon_i-\epsilon_a)+\sum_{j,b}\lambda_{jb}\left(\braket{ib||ja}-\braket{ij||ab}\right)= \nonumber \\ 
&=\sum_{j,k}\gamma^{\mathrm{A}}_{jk}\braket{ki||ja}+\sum_{b,c}\gamma^\mathrm{A}_{bc}\braket{ic||ab}-\frac{1}{2}\sum_{b,j,k}\Gamma'_{jkab}\braket{jk||ib}-\frac{1}{2}\sum_{j,b,c}\Gamma'_{ijbc}\braket{ja||bc}
```
Once the $\boldsymbol\Lambda$ multipliers are determined using an iterative technique, such as the conjugate gradient algorithm, the different blocks of the $\boldsymbol\Omega$ multipliers can be computed using [this equation](eq:omega). Explicitly,
```{math}
%:label: eq:omega_mp2_oo
\omega_{ij} = &-\delta_{ij}\epsilon_i - \gamma_{ij}^\mathrm{A}\epsilon_i - \sum_{k,l} \gamma_{kl}^\mathrm{A}\braket{ki||lj} - \sum_{k,a}\lambda_{ka}\left(\braket{ik||ja}+\braket{jk||ia}\right)\nonumber\\
&-\sum_{ab}\gamma_{ab}^\mathrm{A}\braket{ia||jb}-\frac{1}{2}\sum_{k,a,b}\Gamma_{jkab}^\mathrm{A}\braket{ik||ab}\\
```
```{math}
%:label: eq:omega_mp2_ov
\omega_{ia} = -\lambda_{ia}\epsilon_i-\frac{1}{2}\sum_{j,k,c}\Gamma^\mathrm{A}_{jkac}\braket{jk||ic} 
```
```{math}
%:label: eq:omega_mp2_vv
\omega_{ab} = -\gamma_{ab}^\mathrm{A}\epsilon_a-\frac{1}{2}\sum_{i,j,c}\Gamma^\mathrm{A}_{ijbc}\braket{ij||ac}
```
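The orbital response equation is linear in $\lambda_{ia}$, which is why an iterative solver such as conjugate gradient applies. The sketch below illustrates the idea on a toy problem only: the orbital energies, the coupling tensor `K` (standing in for $\braket{ib||ja}-\braket{ij||ab}$), and the right-hand side `R` are made-up numbers, and the hand-written `conjugate_gradient` helper is a textbook version rather than the solver of any particular program.

```python
import numpy as np


def conjugate_gradient(apply_op, rhs, tol=1e-10, max_iter=200):
    """Solve apply_op(x) = rhs for a symmetric positive-definite operator."""
    x = np.zeros_like(rhs)
    r = rhs - apply_op(x)
    p = r.copy()
    rs_old = np.vdot(r, r)
    for _ in range(max_iter):
        Ap = apply_op(p)
        alpha = rs_old / np.vdot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r)
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


rng = np.random.default_rng(0)
eps_o = np.array([-1.0, -0.6, -0.5])      # toy occupied orbital energies
eps_v = np.array([0.3, 0.5, 0.9, 1.2])    # toy virtual orbital energies
nocc, nvir = eps_o.size, eps_v.size

# Hypothetical symmetric coupling tensor K[i, a, j, b] and right-hand side R[i, a]
K = 0.02 * rng.standard_normal((nocc, nvir, nocc, nvir))
K = 0.5 * (K + K.transpose(2, 3, 0, 1))
R = rng.standard_normal((nocc, nvir))


def apply_response(lam):
    """Left-hand side: lambda_ia (eps_i - eps_a) + sum_jb K_iajb lambda_jb."""
    return (eps_o[:, None] - eps_v[None, :]) * lam + np.einsum("iajb,jb->ia", K, lam)


# With eps_i < eps_a the toy operator is negative definite, so flip the sign
lam = conjugate_gradient(lambda v: -apply_response(v), -R)
residual = np.linalg.norm(apply_response(lam) - R)
```

Only the action of the operator on a trial vector is needed, so the full coupling matrix never has to be stored in a real calculation.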




### Excited states
The derivation of excited state gradients follows the same procedure as illustrated above for the ground state:

1. Identify the one- and two-particle density matrices that contribute to the energy,
2. Construct the Lagrangian with appropriate constraints,
3. If required by the theory level, determine the amplitude response Lagrange multipliers and construct the corresponding density matrices,
4. Set up and solve the $\boldsymbol{\Lambda}$ orbital response equations,
5. Determine the $\boldsymbol{\Omega}$ Lagrange multipliers,
6. Determine the energy gradient.

(cis:label)=
#### CIS

To illustrate this procedure, we will take the configuration interaction singles (CIS) method {cite}`Foresman1992`
as an example, which is equivalent to linear-response time-dependent Hartree--Fock (TDHF) theory within the Tamm--Dancoff approximation {cite}`Dreuw2005`.
Note that for excitation energies and excited-state properties, CIS also yields the same results as the ADC(1) scheme {cite}`Dreuw2015`.
The approach is then easily generalizable to TDHF and TDDFT.
%Furthermore, excited-state gradients of linear-response time-dependent density functional theory (TDDFT) within the TDA %{cite}`hirata99`
%can be derived in a completely analogous way, only the additional exchange-correlation terms need to be taken into account {cite}`Furche2002`.
%%%To illustrate this procedure, we will refer to ADC(1) (add link). Note that ADC(1) is equivalent to time-dependent Hartree-Fock (TDHF) in the Tamm--Dancoff approximation (TDA) (add link) and configuration interaction singles (CIS) (add link).

In the CIS scheme, a Hermitian eigenvalue equation of the following form is solved
```{math}
%:label: eq:cis_eigenvalue_eq
  \mathbf{AX}_n = \omega_n \mathbf{X}_n 
```
where $\omega_n$ is the excitation energy for excited state $n$ with corresponding eigenvector $\mathbf{X}_n$
(normalized according to $\mathbf{X}_n^\dagger \mathbf{X}_n = 1$),
and $\mathbf{A}$ is the CIS matrix given by the elements

(eq:cis-matrix-elemts)=
$$
A_{ia,jb} = (\epsilon_a - \epsilon_i) \delta_{ij} \delta_{ab} - \braket{ja||ib} 
$$
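
As a small illustration of the eigenvalue problem, the sketch below builds a toy CIS matrix and diagonalizes it with `numpy.linalg.eigh`. The orbital energies and the tensor `g`, which stands in for the antisymmetrized integrals $\braket{ja||ib}$, are invented numbers; the diagonal uses the virtual-minus-occupied energy difference so that the excitation energies come out positive.

```python
import numpy as np

rng = np.random.default_rng(1)
eps_o = np.array([-0.9, -0.5])        # toy occupied orbital energies
eps_v = np.array([0.2, 0.6, 1.1])     # toy virtual orbital energies
nocc, nvir = eps_o.size, eps_v.size

# Hypothetical stand-in for <ja||ib>, stored as g[j, a, i, b] and symmetrized
# under (j, a) <-> (i, b) so that the resulting matrix is Hermitian
g = 0.05 * rng.standard_normal((nocc, nvir, nocc, nvir))
g = 0.5 * (g + g.transpose(2, 3, 0, 1))

# A[ia, jb] = (eps_a - eps_i) delta_ij delta_ab - <ja||ib>
diag = (eps_v[None, :] - eps_o[:, None]).reshape(-1)
A = np.diag(diag) - g.transpose(2, 3, 0, 1).reshape(nocc * nvir, nocc * nvir)

# eigh returns ascending eigenvalues and orthonormal eigenvectors as columns
omega, X = np.linalg.eigh(A)
x_lowest = X[:, 0].reshape(nocc, nvir)   # elements x_ia of the lowest state
```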

(The CIS matrix elements correspond to the sum of [the zeroth- and first-order terms](adcmat_phph) of the ADC(1) matrix elements.)
Besides the density matrices required for the HF reference state derived [above](sec:hf-gradients),
we need to identify additional one- and two-particle density matrices for the excitation energy.
We therefore formally represent the excitation energy $\omega_n = \mathbf{X}_n^\dagger\mathbf{A}\mathbf{X}_n$
in terms of one- and two-particle density matrices:
```{math}
%:label: eq:excitation_energy_adc1
\mathbf{X}_n^\dagger\mathbf{A}\mathbf{X}_n = \sum_{p,q}\gamma^{(n)}_{pq}F_{pq} + \frac{1}{4}\sum_{p,q,r,s}\Gamma^{(n)}_{pqrs}\braket{pq||rs}
```

where the superscript $(n)$ indicates that we are referring to the difference density matrices for the $n$-th excited state.
To identify these density matrices, we explicitly carry out the matrix-vector multiplication on the left-hand side,
```{math}
%:label: eq:adc1_identify_dms
\mathbf{X}_n^\dagger\mathbf{A}\mathbf{X}_n &= \sum_{i,a,j,b} x_{jb} \left[(\epsilon_a - \epsilon_i)\delta_{ab}\delta_{ij}-\braket{ja||ib}\right] x_{ia}\nonumber\\
&=\sum_{i,a,j,b} \left(x_{jb}x_{ia}f_{ab}\delta_{ij} - x_{jb}x_{ia}f_{ij}\delta_{ab}-x_{jb}x_{ia}\braket{ja||ib}\right)
```
where we have used the explicit form of the CIS matrix elements, and $x_{ia}$ are the elements of a specific eigenvector $\mathbf{X}_n$.

(eq:cis-densities)=
From the above equation, we identify the excited-state density matrix contributions:
```{math}
%:label: eq:adc1_gamma_oo
\gamma^{(n)}_{ij} = - \sum_{a}x_{ja}x_{ia}\label{eq:adc1_gamma_oo}
```
```{math}
%:label: eq:adc1_gamma_vv
\gamma^{(n)}_{ab} = \sum_{i} x_{ib}x_{ia}\label{eq:adc1_gamma_vv}
```
```{math}
%:label: eq:adc1_Gamma_ovov
\Gamma^{(n)}_{iajb} = -x_{ib}x_{ja}\label{eq:adc1_Gamma_ovov}
```
where the indices of the vectors in the two-particle density matrix have been renamed.
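
The identification can be checked numerically. The sketch below uses an arbitrary normalized vector `x`, canonical orbitals (so $f_{ij}=\epsilon_i\delta_{ij}$ and $f_{ab}=\epsilon_a\delta_{ab}$), and a made-up tensor `g[i, a, j, b]` in place of $\braket{ia||jb}$. It also assumes that the four symmetry-equivalent occupied-virtual blocks of the two-particle density matrix absorb the prefactor $1/4$, so that a single sum over $\Gamma^{(n)}_{iajb}\braket{ia||jb}$ remains.

```python
import numpy as np

rng = np.random.default_rng(2)
eps_o = np.array([-0.9, -0.5])
eps_v = np.array([0.2, 0.6, 1.1])
nocc, nvir = eps_o.size, eps_v.size

# Hypothetical antisymmetrized integrals <ia||jb>, symmetric under (ia) <-> (jb)
g = rng.standard_normal((nocc, nvir, nocc, nvir))
g = 0.5 * (g + g.transpose(2, 3, 0, 1))

# Any normalized vector works for this identity, not only an eigenvector
x = rng.standard_normal((nocc, nvir))
x /= np.linalg.norm(x)

gamma_oo = -np.einsum("ja,ia->ij", x, x)        # gamma^(n)_ij
gamma_vv = np.einsum("ib,ia->ab", x, x)         # gamma^(n)_ab
Gamma_ovov = -np.einsum("ib,ja->iajb", x, x)    # Gamma^(n)_iajb

# Direct evaluation of X^T A X
direct = (np.einsum("ia,ia->", x**2, eps_v[None, :] - eps_o[:, None])
          - np.einsum("jb,jaib,ia->", x, g, x))
# Evaluation through the density matrices
via_dms = (np.einsum("ii,i->", gamma_oo, eps_o)
           + np.einsum("aa,a->", gamma_vv, eps_v)
           + np.einsum("iajb,iajb->", Gamma_ovov, g))
```

The traces of the two one-particle blocks are $-1$ and $+1$, reflecting that one electron is moved from the occupied to the virtual space.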

To obtain the molecular gradient of the excited state ${n}$, the density matrices of the ground state must also be included.
For CIS, this is the [HF reference state](sec:hf-gradients), so the density matrices for the excited state are

(eq:cis_density_matrices)=
$$
\gamma'_{ij} &= \gamma_{ij}+ \gamma^{(n)}_{ij}= \delta_{ij} - \sum_{a}x_{ja}x_{ia} \\
\gamma'_{ab} &= \gamma^{(n)}_{ab} =\sum_{i} x_{ib}x_{ia}\\
\Gamma'_{ijkl} &= \Gamma_{ijkl} = -2\delta_{ik}\delta_{jl} \\
\Gamma'_{iajb} &= \Gamma^{(n)}_{iajb} = -x_{ib}x_{ja}
$$

Since, when written in terms of one- and two-particle density matrices,
the excited state Lagrangian is virtually identical to the Lagrangians written for [HF](sec:hf-gradients) and [MP2](sec:mp2-gradients), we leave the exercise of plugging in the expressions of the DMs to the reader.
The orbital response equations are also straightforward to derive by inserting the above [density matrices](eq:cis_density_matrices) into the general [orbital response](eq:OrbRspEq) [equations](eq:omega).
We leave the step-by-step derivation to the reader, noting that care must be taken with the symmetry of the two-particle density matrix $\Gamma_{pqrs}$.
Here, we provide the final expressions for $\boldsymbol{\Lambda}$ (to be determined iteratively)
```{math}
%:label: eq:lambda_adc1_ov
(\epsilon_i-\epsilon_a)\lambda_{ia} - \sum_{j,b}\lambda_{jb}(\braket{ji||ba} - \braket{ja||ib}) &= \sum_{j,k}\gamma^{(n)}_{jk}\braket{ji||ka} - \sum_{b,c}\gamma^{(n)}_{bc}\braket{ib||ca} \nonumber\\
&+ \sum_{j,k,b}\Gamma^{(n)}_{jakb}\braket{ij||kb} + \sum_{b,k,c}\Gamma_{ibkc}^{(n)}\braket{kc||ab}  \label{eq:lambda_adc1_ov}
```
and $\boldsymbol{\Omega}$

$$
\omega_{ij} &= -(\delta_{ij} +\gamma^{(n)}_{ij})\epsilon_{i}-\sum_{k,l}\gamma^{(n)}_{kl}\braket{ki||lj}+\sum_{k,a}\lambda_{ka}\left(\braket{ki||ja}+\braket{kj||ia}\right) \nonumber \\
&\quad-\sum_{a,b}\gamma^{(n)}_{ab}\braket{ia||jb} - \sum_{a,k,b} \Gamma^{(n)}_{jakb}\braket{ia||kb}\label{eq:omega_adc1_oo} \\
\omega_{ia} &= - \lambda_{ia}\epsilon_{i} + \sum_{k,b,j}\Gamma^{(n)}_{kajb}\braket{ik||jb}\label{eq:omega_adc1_ov} \\
\omega_{ab} &=-\gamma^{(n)}_{ab}\epsilon_a - \sum_{k,c,j}\Gamma^{(n)}_{kajc}\braket{kb||jc}\label{eq:omega_adc1_vv} 
$$

Using the density matrices and Lagrange multipliers, the analytical CIS gradient can now be determined from the partial derivative of the Lagrangian with respect to $x$.


(sec:tdhf-gradients)=
#### TDHF

In linear-response time-dependent Hartree--Fock (TDHF), also known as the _random phase approximation_ (RPA) {cite}`Dreuw2005`, one solves a pseudo-eigenvalue equation of the form

$$
\begin{pmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}^* & \mathbf{A}^* \end{pmatrix} \begin{pmatrix} \mathbf{X}_n \\ \mathbf{Y}_n \end{pmatrix} = \omega_n \begin{pmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & -\mathbf{1} \end{pmatrix} \begin{pmatrix} \mathbf{X}_n \\ \mathbf{Y}_n \end{pmatrix} 
$$

where $\mathbf{X}_n$ and $\mathbf{Y}_n$ are referred to as the _excitation_ and _de-excitation_ parts of the response vectors, respectively, the matrix $\mathbf{A}$ corresponds to the [one](eq:cis-matrix-elemts) from CIS,
and the $\mathbf{B}$ matrix is given by the elements $B_{ia,jb} = -\braket{ab||ij}$.
The vectors are usually normalized according to $\mathbf{X}_n^\dagger \mathbf{X}_n - \mathbf{Y}_n^\dagger \mathbf{Y}_n = 1$.
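
For real matrices, one common way to solve this pseudo-eigenvalue problem is to reduce it to a symmetric eigenvalue problem for $\omega_n^2$ in terms of $\mathbf{X}_n \pm \mathbf{Y}_n$. The sketch below does this for made-up matrices `A` and `B`; it assumes that both $\mathbf{A}+\mathbf{B}$ and $\mathbf{A}-\mathbf{B}$ are positive definite, and it is meant as an illustration rather than as the algorithm used in any specific program.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4  # number of occupied-virtual pairs in this toy example

# Hypothetical real symmetric A (diagonally dominant) and small symmetric B
K = 0.05 * rng.standard_normal((n, n))
A = np.diag([0.7, 0.9, 1.2, 1.5]) + 0.5 * (K + K.T)
B = 0.05 * rng.standard_normal((n, n))
B = 0.5 * (B + B.T)

# Square root of (A - B) and its inverse from a spectral decomposition
w, U = np.linalg.eigh(A - B)
S = U @ np.diag(np.sqrt(w)) @ U.T
S_inv = U @ np.diag(1.0 / np.sqrt(w)) @ U.T

# Symmetric problem: S (A + B) S T = omega^2 T
omega2, T = np.linalg.eigh(S @ (A + B) @ S)
omega = np.sqrt(omega2)

# Back-transform; column n of each array belongs to excitation energy omega[n]
x_plus_y = (S @ T) / np.sqrt(omega)
x_minus_y = (S_inv @ T) * np.sqrt(omega)
X = 0.5 * (x_plus_y + x_minus_y)
Y = 0.5 * (x_plus_y - x_minus_y)

# Normalization X^T X - Y^T Y for every state
norms = np.sum(X * X, axis=0) - np.sum(Y * Y, axis=0)
```

With this scaling of $\mathbf{X}_n \pm \mathbf{Y}_n$, the normalization condition quoted above is fulfilled automatically.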

The subsequent procedure is analogous to the CIS case, and only a few changes are required in the definition of the one- and two-particle density matrices. The non-vanishing contributions to $\boldsymbol{\gamma}^{(n)}$ and $\boldsymbol{\Gamma}^{(n)}$ for TDHF are given below in terms of the elements $x_{ia}$ and $y_{ia}$ of $\mathbf{X}_n$ and $\mathbf{Y}_n$,
respectively, or rather their linear combinations $\mathbf{X}_n \pm \mathbf{Y}_n$:

$$
    \gamma_{ij}^{(n)} &= -\frac12 \sum_{a} \Big[ (x_{ia} + y_{ia})(x_{ja} + y_{ja})
        + (x_{ia} - y_{ia})(x_{ja} - y_{ja}) \Big] \\
    \gamma_{ab}^{(n)} &= +\frac12 \sum_{i} \Big[ (x_{ia} + y_{ia})(x_{ib} + y_{ib})
        + (x_{ia} - y_{ia})(x_{ib} - y_{ib}) \Big] \\
    \Gamma_{iajb}^{(n)} &= -\frac{1}{2} \big[ (x_{ib}+y_{ib})(x_{ja}+y_{ja})+(x_{ib}-y_{ib})(x_{ja}-y_{ja}) \big]\\
    \Gamma_{ijab}^{(n)} &= - \big[ (x_{jb}+y_{jb})(x_{ia}+y_{ia})-(x_{jb}-y_{jb})(x_{ia}-y_{ia}) \big] 
$$

Note that there is an additional non-vanishing block in the two-particle density matrix as compared to CIS,
namely $\Gamma_{ijab}^{(n)}$, that enters both the right-hand side of the $\boldsymbol{\Lambda}$ orbital response equations and the $\boldsymbol{\Omega}$ multipliers.
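
The four expressions above translate into a few `numpy.einsum` calls. In this sketch the function name and the array layout `x[i, a]`, `y[i, a]` are illustrative assumptions, and the vectors are random numbers rather than solutions of the TDHF equations.

```python
import numpy as np


def tdhf_difference_densities(x, y):
    """Return gamma_oo, gamma_vv, Gamma_ovov, Gamma_oovv from x[i, a] and y[i, a]."""
    xpy, xmy = x + y, x - y
    gamma_oo = -0.5 * (np.einsum("ia,ja->ij", xpy, xpy)
                       + np.einsum("ia,ja->ij", xmy, xmy))
    gamma_vv = 0.5 * (np.einsum("ia,ib->ab", xpy, xpy)
                      + np.einsum("ia,ib->ab", xmy, xmy))
    Gamma_ovov = -0.5 * (np.einsum("ib,ja->iajb", xpy, xpy)
                         + np.einsum("ib,ja->iajb", xmy, xmy))
    Gamma_oovv = -(np.einsum("jb,ia->ijab", xpy, xpy)
                   - np.einsum("jb,ia->ijab", xmy, xmy))
    return gamma_oo, gamma_vv, Gamma_ovov, Gamma_oovv


rng = np.random.default_rng(5)
x = rng.standard_normal((2, 3))
y = 0.1 * rng.standard_normal((2, 3))
full = tdhf_difference_densities(x, y)

# Setting y = 0 recovers the CIS expressions; the extra block then vanishes
tda = tdhf_difference_densities(x, np.zeros_like(x))
```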

(sec:tddft-gradients)=
#### TDDFT

Analogous to the [ground state](sec:dft-gradients), analytical gradients in the case of linear-response [time-dependent density functional theory](sec:tddft) (TDDFT) are virtually identical to the TDHF ones {cite}`Furche2002`. Only the exchange-correlation terms need to be considered additionally, meaning the matrix elements of $\mathbf{A}$ and $\mathbf{B}$ need to be modified accordingly.

The excitation energy $\omega_n$ for a general hybrid functional can be written as

$$
\omega_n = \sum_{pq}\gamma_{pq}^{(n)} \, F_{pq} + \frac{1}{4}\sum_{pqrs}\Gamma_{pqrs}^{(n)} \, \big( \langle pq | rs \rangle - c_{\text{x}} \langle pq | sr \rangle + f^{\text{xc}}_{pqrs} \big) 
$$

where the KS matrix $\mathbf{F}$ is given by

$$
F_{pq} = h_{pq} + \sum_{i} \big( \langle pi | qi \rangle - c_{\text{x}} \langle pi | iq \rangle \big) + v_{pq}^{\text{xc}} 
$$

and the xc contributions $v_{pq}^{\text{xc}}$ and $f^{\text{xc}}_{pqrs}$ introduced above, sometimes referred to as the xc potential and xc kernel, respectively {cite}`Dreuw2005`, are given in the LDA
in a real MO basis as

(eq:xc-potential-kernel)=
$$
v_{pq}^{\text{xc}} &= \int \frac{\partial e_{\text{xc}}}{\partial n(\mathbf{r})} \phi_{p} \phi_{q} \mathrm{d} \mathbf{r} \\
f_{pqrs}^{\text{xc}} &= \int \frac{\partial^2 e_{\text{xc}}}{\partial n(\mathbf{r}) \partial n'(\mathbf{r})} \phi_p \phi_q \phi_r \phi_s \mathrm{d}\mathbf{r} 
$$

For the orbital response, derivatives of these two quantities with respect to the MO coefficients $\mathbf{C}$ are required.
Derivatives of the orbitals $\phi_p$ are handled exactly as for the ordinary integrals given above; however, additional terms occur since $e_{\text{xc}}$ depends on $\mathbf{C}$ through the density.
The derivative of $v_{pq}^{\text{xc}}$ thus gives a term analogous to $f_{pqrs}^{\text{xc}}$,
while the derivative of the latter gives a term with a third-order functional derivative {cite}`Furche2002`,

$$
g_{pqrstu}^{\text{xc}} = \int \frac{\partial^3 e_{\text{xc}}}{\partial n(\mathbf{r}) \partial n'(\mathbf{r}) \partial n''(\mathbf{r})} \phi_p \phi_q \phi_r \phi_s \phi_t \phi_u \mathrm{d} \mathbf{r} 
$$
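
To make the three density derivatives concrete, the sketch below takes spin-unpolarized Slater (LDA) exchange, $e_{\text{x}} = -C_{\text{x}}\, n^{4/3}$ with $C_{\text{x}} = \tfrac{3}{4}(3/\pi)^{1/3}$, as an example functional and checks the analytical first, second, and third derivatives against central finite differences. Correlation is left out, and the function names are illustrative choices.

```python
import numpy as np

C_X = 0.75 * (3.0 / np.pi) ** (1.0 / 3.0)  # Slater exchange prefactor


def e_x(n):
    """Exchange energy density per unit volume."""
    return -C_X * n ** (4.0 / 3.0)


def v_x(n):
    """First derivative with respect to the density (enters v_xc)."""
    return -(4.0 / 3.0) * C_X * n ** (1.0 / 3.0)


def f_x(n):
    """Second derivative (enters f_xc)."""
    return -(4.0 / 9.0) * C_X * n ** (-2.0 / 3.0)


def g_x(n):
    """Third derivative (enters g_xc)."""
    return (8.0 / 27.0) * C_X * n ** (-5.0 / 3.0)


def central_difference(func, n, h=1e-5):
    """Symmetric difference quotient of func at n."""
    return (func(n + h) - func(n - h)) / (2.0 * h)


n = np.array([0.1, 0.5, 2.0])  # sample density values
v_numeric = central_difference(e_x, n)
f_numeric = central_difference(v_x, n)
g_numeric = central_difference(f_x, n)
```

In an actual calculation these local quantities are evaluated on a numerical integration grid and contracted with products of orbitals, as in the integrals above.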

For the nuclear gradient of the TDDFT excitation energy we thus need the partial derivatives of those two terms.
(eq:vxc-deriv)=
For the xc potential, this is given by:

$$
v_{pq}^{\text{xc}(x)} = \frac{\partial}{\partial x} \int \frac{\partial e_{\text{xc}}}{\partial n(\mathbf{r})} \phi_{p} \phi_{q} \mathrm{d} \mathbf{r}
= \int \frac{\partial e_{\text{xc}}}{\partial n(\mathbf{r})} \frac{\partial (\phi_{p} \phi_{q})}{\partial x} \mathrm{d} \mathbf{r}
+ \int \frac{\partial^2 e_{\text{xc}}}{\partial n(\mathbf{r}) \partial n'(\mathbf{r})} n(\mathbf{r})^{(x)} \phi_p \phi_q \mathrm{d} \mathbf{r} 
$$

where the first term is completely analogous to the ground-state contribution, except that it gets contracted with the relaxed one-particle density matrix $\boldsymbol{\gamma} + \boldsymbol{\Lambda}$ (instead of the ground-state density matrix $\mathbf{D}$), and the second term corresponds to an $f^{\text{xc}}$-like term with the partial derivative of the [ground-state density](eq:density-partial-deriv). The partial derivative of the xc kernel with respect to $x$ gives

$$
f_{pqrs}^{\text{xc}(x)} &= \frac{\partial}{\partial x} \int \frac{\partial^2 e_{\text{xc}}}{\partial n(\mathbf{r}) \partial n'(\mathbf{r})} \phi_p \phi_q \phi_r \phi_s \mathrm{d}\mathbf{r}
= \int \frac{\partial^2 e_{\text{xc}}}{\partial n(\mathbf{r}) \partial n'(\mathbf{r})} \frac{\partial(\phi_p \phi_q \phi_r \phi_s)}{\partial x} \mathrm{d}\mathbf{r}\\
&+ \int \frac{\partial^3 e_{\text{xc}}}{\partial n(\mathbf{r}) \partial n'(\mathbf{r}) \partial n''(\mathbf{r})} n(\mathbf{r})^{(x)} \phi_p \phi_q \phi_r \phi_s \mathrm{d} \mathbf{r} 
$$

where the first term is again $f^\text{xc}$-like with an orbital derivative, and the second term is analogous to $g^\text{xc}$ from the orbital response contributions, including again the partial derivative of the ground-state density. This term eventually gets contracted with the two-particle density matrix $\boldsymbol{\Gamma}$, so with two excitation or response vectors.

TDDFT within the Tamm--Dancoff approximation (TDA) {cite}`hirata99` is obtained by setting $\mathbf{B} = \mathbf{0}$ (or $\mathbf{Y}_n = \mathbf{0}$). All the above considerations are equally valid for TDDFT/TDA, with the one- and two-particle density matrices simplifying to the ones from [CIS](eq:cis-densities). Note that TDHF (and thus CIS, which is TDHF within the TDA) can be considered a special case of general hybrid TDDFT with $c_{\text{x}} = 1$ and $e_{\text{xc}} = 0$ {cite}`Furche2002`.

(sec:first-order-prop)=
### First-order properties

Not only nuclear gradients, but many other time-independent (or "static") molecular properties can be calculated as derivatives of the energy {cite}`jensen2006`. For instance, the electric dipole moment can be calculated as the derivative of the energy with respect to an external electric field, and the magnetic dipole moment with respect to an external magnetic field. Second derivatives with respect to the external field give the electric polarizability and magnetizability, respectively. Properties that can be calculated as first derivatives of the energy are referred to as __first-order properties__.

For exact wave functions $\ket{\Psi}$ and energies $E = \langle \Psi | \hat{H} | \Psi \rangle$, the **Hellmann--Feynman theorem** holds {cite}`jensen2006`

(eq:hellmann_feynman)=
```{math}
  \frac{\mathrm{d} E}{\mathrm{d} \xi} = \frac{\mathrm{d}}{\mathrm{d} \xi} \langle \Psi | \hat{H} | \Psi \rangle
  = \langle \Psi | \frac{\mathrm{d} \hat{H}}{\mathrm{d} \xi} | \Psi \rangle 
```
stating that the derivative of the energy with respect to an external perturbation $\xi$ is identical to the expectation value of the derivative of the Hamiltonian, evaluated with the unperturbed wave function. The above derivative is to be taken at zero perturbation strength, $\xi = 0$. If the basis functions do not depend on the perturbation, the [above equation](eq:hellmann_feynman) also holds for fully variational methods like the SCF and MCSCF schemes.
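
The theorem is easy to verify for an exactly diagonalized model. The sketch below uses a made-up two-level Hamiltonian $\hat{H}(\xi) = \hat{H}_0 + \xi \hat{V}$ and compares a symmetric difference quotient of the ground-state energy with the expectation value of $\hat{V}$; the matrices `H0` and `V` are arbitrary illustrative numbers.

```python
import numpy as np

H0 = np.array([[0.0, 0.1],
               [0.1, 1.0]])    # toy unperturbed Hamiltonian
V = np.array([[0.3, 0.5],
              [0.5, -0.2]])    # toy perturbation operator, dH/dxi


def ground_state(xi):
    """Lowest eigenvalue and eigenvector of H0 + xi * V."""
    evals, evecs = np.linalg.eigh(H0 + xi * V)
    return evals[0], evecs[:, 0]


# Numerical derivative at xi = 0 from a symmetric difference quotient
xi0 = 1e-4
E_plus, _ = ground_state(+xi0)
E_minus, _ = ground_state(-xi0)
dE_numeric = (E_plus - E_minus) / (2.0 * xi0)

# Hellmann-Feynman: expectation value with the unperturbed wave function
_, psi = ground_state(0.0)
dE_expect = psi @ V @ psi
```

The two numbers agree up to the finite-difference error, which scales with the square of the step `xi0`.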

Consider the electric dipole moment $\boldsymbol{\mu}$ as an example. Numerically, each component of $\boldsymbol{\mu}$ can be obtained by calculating the energy $E$ in the presence of a static electric field $\mathcal{\boldsymbol{F}}$, once with a positive and once with a negative sign in one of its components, and then applying the formula of the [symmetric difference quotient](eq:symmetric-difference-quotient).
Analytically, dipole moments are obtained by starting from a Lagrangian $L$ of [this form](eq:MP2_Lagrangian), inserting the perturbed Hamiltonian $\hat{H}_{\mathcal{\boldsymbol{F}}} = \hat{H} + \mathcal{\boldsymbol{F}} \hat{\mu}$,
where $\hat{\mu}$ is the dipole operator, and differentiating with respect to the field,
(eq:dipmom_derivative)=
```{math}
%:label:
  \boldsymbol{\mu} = \frac{\partial L}{\partial \mathcal{\boldsymbol{F}}} = \frac{\partial E}{\partial \mathcal{\boldsymbol{F}}}
  + \boldsymbol{\Lambda} \frac{\partial f_c(\mathbf{C})}{\partial \mathcal{\boldsymbol{F}}}
  + \mathbf{\tilde{T}} \frac{\partial f_t(\mathbf{T})}{\partial \mathcal{\boldsymbol{F}}} 
```

The determination of the Lagrange multipliers $\boldsymbol{\Lambda}$ and $\mathbf{\tilde{T}}$ is identical to the procedures described above.
The dipole moment as a derivative of the energy can then be calculated as:
```{math}
%:label: eq:dipmom_explicit
  \boldsymbol{\mu} = \sum_{pq} ( \gamma_{pq}' + \lambda_{pq} ) \mu_{qp} 
```
where $\boldsymbol{\gamma}' = \boldsymbol{\gamma} + \boldsymbol{\gamma}^{\text{A}}$ is the (orbital) __unrelaxed__ one-particle density matrix, which includes contributions from the amplitude response, $\boldsymbol{\gamma}' + \boldsymbol{\Lambda}$ is referred to as the (orbital) __relaxed density matrix__, and $\mu_{qp}$ are elements of the dipole operator in the MO basis.
Taking all terms of the [above equation](eq:dipmom_derivative) into account yields the so-called _relaxed_ dipole moment, whereas neglecting $\boldsymbol{\Lambda}$ yields the _unrelaxed_ dipole moment. The latter does not correspond to a proper energy derivative, but its computation is somewhat simpler since the iterative solution of the orbital response equations is avoided.
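
In code, both variants are a single contraction of a density matrix with the dipole integrals. The sketch below uses random placeholder arrays: `gamma_prime` for $\boldsymbol{\gamma}'$, `lam` for $\boldsymbol{\Lambda}$ (filled only in the occupied-virtual block), and `mu_mo[x, q, p]` for the three Cartesian components of the dipole integrals in the MO basis. None of these come from an actual calculation.

```python
import numpy as np

rng = np.random.default_rng(4)
nocc, nvir = 2, 3
n = nocc + nvir

# Hypothetical unrelaxed density: HF occupation plus a small symmetric correction
corr = 0.01 * rng.standard_normal((n, n))
gamma_prime = np.diag([1.0] * nocc + [0.0] * nvir) + 0.5 * (corr + corr.T)

# Hypothetical orbital response multipliers, non-zero only in the ov block
lam = np.zeros((n, n))
lam[:nocc, nocc:] = 0.01 * rng.standard_normal((nocc, nvir))

# Hypothetical symmetric dipole integrals for the x, y, z components
mu_mo = rng.standard_normal((3, n, n))
mu_mo = 0.5 * (mu_mo + mu_mo.transpose(0, 2, 1))

# mu = sum_pq (density)_pq mu_qp for each Cartesian component
mu_unrelaxed = np.einsum("pq,xqp->x", gamma_prime, mu_mo)
mu_relaxed = np.einsum("pq,xqp->x", gamma_prime + lam, mu_mo)

# The difference is the orbital-relaxation contribution from lambda alone
relaxation = mu_relaxed - mu_unrelaxed
```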

Another approach to one-electron properties such as dipole moments is to set out from the expectation value
of the corresponding operator with the wave function, such as $\boldsymbol{\mu} = \langle \Psi | \hat{\mu} | \Psi \rangle$, see the [Hellmann--Feynman theorem](eq:hellmann_feynman) above. Depending on the wave-function model, this approach can be equivalent to the orbital-unrelaxed approach, as in configuration interaction; for schemes based on perturbation theory in particular, however, all three approaches to dipole moments differ {cite}`Hodecker2019`.
