# Spectrum analysis#

For the analysis of obtained spectra we will focus on the carbon X-ray absorption spectrum of vinylfluoride, with emphasis on how the obtained spectra can be decomposed in terms of the polarization of incoming radiation, and the atomic origin of the features. Additional analysis in terms of MOs, different visualization schemes and descriptors thereof, will be added at a later date, as well as more details on assignment. Analysis of features in X-ray emission spectra tends to be easier to interpret (fewer states, probing occupied states of relatively large energy shifts, etc.), and focus will thus here be on XAS.

Vinylfluride (C\(_2\)H\(_3\)F) is a fluorine-substituted ethene derivate, which possess large shifts in features for the -**C**HF site due to interaction with the very electronegative fluorine. This *chemical shift* is significant and a strong indicator of the local chemical environment. This can be seen by comparing the experimental spectra of ethene, vinylfluoride, and 1,1-difluoroethene, as digitized from experimental measurements. We here mark position (and number of contributing carbon sites) of strong \(1s \rightarrow \pi^{\ast}\) transitions.

We see that a single substitution yields a shift of about 2 eV, and a double substitution (at the same site) of about 4 eV. The behaviour of the chemical shifts for different spectroscopies will be further deliberated on in exercises.

Calculating the vinylfluoride spectrum using CVS-ADC(2)-x and CPP-DFT (with the B3LYP xc-functional):

```
vinyl_mol_str = """
C 0.000000 -0.246412 -1.271068
C 0.000000 0.457081 -0.154735
F 0.000000 -0.119195 1.052878
H 0.000000 0.272328 -2.210194
H 0.000000 -1.319906 -1.249847
H 0.000000 1.530323 -0.095954
"""
# Construct structure and basis objects
struct = gator.get_molecule(vinyl_mol_str)
basis = gator.get_molecular_basis(struct, "6-31G")
```

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:

`jupyter labextension install jupyterlab_3dmol`

```
# Perform SCF calculation
scf_gs = gator.run_scf(struct, basis)
# Calculate the 10 lowest eigenstates with CVS restriction to MOs #1-#3 (fluorine and carbon 1s)
adc_xas = gator.run_adc(
struct, basis, scf_gs, method="cvs-adc2x", singlets=10, core_orbitals=[1, 2, 3]
)
```

Note

If you have a recent version of adcc, there may be convergence issues due to a mismatch between conda and adcc. This is being looked into, and can for the moment typically be avoided by increasing the basis set size.

```
# Prepare molecule and basis objects
molecule = vlx.Molecule.read_molecule_string(vinyl_mol_str)
basis = vlx.MolecularBasis.read(molecule, "6-31G")
# SCF settings and calculation
scf_drv = vlx.ScfRestrictedDriver()
scf_settings = {"conv_thresh": 1.0e-6}
method_settings = {"xcfun": "b3lyp"}
scf_drv.update_settings(scf_settings, method_settings)
scf_results = scf_drv.compute(molecule, basis)
# Define spectrum region to be resolved
freqs = np.arange(272.0, 285.0, 0.25) / au2ev
freqs_str = [str(x) for x in freqs]
# Calculate the response
cpp_prop = LinearAbsorptionCrossSection(
{"frequencies": ",".join(freqs_str), "damping": 0.4 / au2ev}, method_settings
)
cpp_prop.init_driver()
cpp_prop.compute(molecule, basis, scf_results)
# Extract the imaginary part of the complex response function and convert to absorption cross section
sigma = []
for w in freqs:
axx = -cpp_prop.rsp_property["response_functions"][("x", "x", w)].imag
ayy = -cpp_prop.rsp_property["response_functions"][("y", "y", w)].imag
azz = -cpp_prop.rsp_property["response_functions"][("z", "z", w)].imag
alpha_bar = (axx + ayy + azz) / 3.0
sigma.append(4.0 * np.pi * w * alpha_bar / 137.035999)
```

The resulting spectra are shown below, and look very similar (although with a large difference in absolute energy):

```
plt.figure(figsize=(8, 5))
# CPP-DFT results
plt.subplot(211)
plt.title("CPP-DFT (B3LYP)")
x = np.arange(min(au2ev * freqs), max(au2ev * freqs), 0.01)
y = interp1d(au2ev * freqs, sigma, kind="cubic")
plt.plot(x, y(x), "b")
plt.xlim(272, 284)
# CVS-ADC(2)-x results
plt.subplot(212)
plt.title("CVS-ADC(2)-x")
x, y = au2ev * adc_xas.excitation_energy, adc_xas.oscillator_strength
xi, yi = lorentzian(x, y, 280, 300, 0.01, 0.8)
plt.plot(xi, yi, "b")
plt.xlim(284, 296)
plt.tight_layout()
plt.show()
```

## Looking at solution vectors#

The solution vectors, *i.e.* the eigenvectors for eigenstate calculations and the response vectors for CPP calculations, can be plotted in order to see involved occupied and virtual MOs. This can give some insight into the different features, but easily becomes cumbersome when the density of states is high, or when the final state consists of a mix of many contributions.

### Eigenvectors#

From the ADC calculations there is the function `state.describe_amplitudes`

, which print the amplitudes using one of three formats, controlled by `index_format`

. The amplitudes are then printed when above a `tolerance`

threshold. Splitting the returning string according to `state`

and printing only the first two states, we get:

```
print("From 'hf' format, which yield MO number:")
des_amp = adc_xas.describe_amplitudes(tolerance=0.1, index_format="hf")
split_line = "\n\n"
tmp_amp = des_amp.split(split_line)
print(tmp_amp[1])
hf_exc1 = tmp_amp[1]
print(tmp_amp[2])
hf_exc2 = tmp_amp[1]
print()
print("\nFirst transition from 'adcc' format, with indexing as used in adcc:")
des_amp = adc_xas.describe_amplitudes(tolerance=0.1, index_format="adcc")
split_line = "\n\n"
tmp_amp = des_amp.split(split_line)
print(tmp_amp[1])
adcc_exc1 = tmp_amp[1]
print()
print("\nFirst transition from 'homolumo' format, which use HOMO/LUMO/HOCO format:")
des_amp = adc_xas.describe_amplitudes(tolerance=0.1, index_format="homolumo")
split_line = "\n\n"
tmp_amp = des_amp.split(split_line)
homo_exc1 = tmp_amp[1]
print(tmp_amp[1])
```

```
From 'hf' format, which yield MO number:
+---------------------------------------+
| State 1 , 10.63675 au |
+---------------------------------------+
| 1 -> 12 a ->a -0.601 |
| 1 -> 22 a ->a -0.105 |
| 11 1 -> 12 12 ab->ab +0.149 |
+---------------------------------------+
| State 1 , 10.63675 au |
+---------------------------------------+
| 1 -> 12 a ->a -0.601 |
| 1 -> 22 a ->a -0.105 |
| 11 1 -> 12 12 ab->ab +0.149 |
First transition from 'adcc' format, with indexing as used in adcc:
+-------------------------------------------------------+
| State 1 , 10.63675 au, 289.4407 eV |
+-------------------------------------------------------+
| (o2 1) -> (v1 0) a ->a -0.601 |
| (o2 1) -> (v1 10) a ->a -0.105 |
| (o1 8) (o2 1) -> (v1 0) (v1 0) ab->ab +0.149 |
First transition from 'homolumo' format, which use HOMO/LUMO/HOCO format:
+-------------------------------------------------------+
| State 1 , 10.63675 au, 289.4407 eV |
+-------------------------------------------------------+
| HOCO -1 -> LUMO a ->a -0.601 |
| HOCO -1 -> LUMO+10 a ->a -0.105 |
| HOMO HOCO -1 -> LUMO LUMO ab->ab +0.149 |
```

From this, it is clear that the two first transitions originate from two different occupied MOs, which correspond to the -**C**H\(_2\) and -**C**HF atoms, respectively. The transitions are dominated by single-excitations to the LUMO and LUMO+1, with largest relaxation contributions coming from the HOMO.

## Decomposition of spectra#

For a more visual illustration, we can plot the spectra as decomposed according to atomic contributions and polarization of absorbed radiation.

### Atomic contributions#

Decomposing the spectra according to atomic contributions provides information on which core orbitals contribute to the various features. For vinylfluoride, this can be done by performing CVS calculations with tailored CVS spaces for each carbon site:

```
# Construct structure and basis objects
struct = gator.get_molecule(vinyl_mol_str)
basis = gator.get_molecular_basis(struct, "6-31G")
# Perform SCF calculation
scf_gs = gator.run_scf(struct, basis)
# Calculate the 6 lowest eigenstates with CVS restriction to MOs #2 and #3 (two carbons)
adc_chf = gator.run_adc(
struct, basis, scf_gs, method="cvs-adc2x", singlets=6, core_orbitals=[2]
)
adc_ch2 = gator.run_adc(
struct, basis, scf_gs, method="cvs-adc2x", singlets=6, core_orbitals=[3]
)
```

From which the contributions can be plotted:

```
plt.figure(figsize=(7, 4))
# Spectrum from -CHF
x1, y1 = au2ev * adc_chf.excitation_energy, adc_chf.oscillator_strength
x1i, y1i = lorentzian(x1, y1, 280, 300, 0.01, 0.8)
# Spectrum from -CH2
x2, y2 = au2ev * adc_ch2.excitation_energy, adc_ch2.oscillator_strength
x2i, y2i = lorentzian(x2, y2, 280, 300, 0.01, 0.8)
# Plotting line spectra for full, area spectra for contributions
plt.plot(x2i, y1i + y2i, "k-", linewidth=2.0)
plt.fill_between(x2i, 0, y1i, alpha=0.6)
plt.fill_between(x2i, y1i, y1i + y2i, alpha=0.6)
plt.legend(("Full", "-CHF", "-CH2"), loc="upper left")
plt.xlim(284, 296)
plt.tight_layout()
plt.show()
```

We here clearly see the chemical shift due to fluorine-substitution.

This manner of decomposing the atomic contributions should be taken with some care, as it involves explicit calculation for each atomic site in turn, which:

Can require a larger number of calculations, and care should be taken so that the studied region does contain all states of one atomic site, but not another.

This approach does not strictly fit with the CVS philosophy, which uses the large separation in energy and spatial extent as a basis for separation. Energy separation can now be small, and for some systems degeneracies and near-degeneracies may lead to delocalized core orbitals (see here and here). If so, all delocalized sites must be included in the same CVS space.

Nevertheless, this approach has some advantages in terms of ease of analysis, as well as exhibiting lower computational cost for each calculation (albeit here requiring two calculations instead of just one).

An alternative to this is to perform the decomposition on the full solution vectors, as will here be done using CPP-DFT response vectors:

```
# Number of occupied and unoccupied (alpha) MOs
nocc = molecule.number_of_alpha_electrons()
nvirt = scf_drv.mol_orbs.number_mos() - nocc
# Extract solution vectors
solution_vecs = cpp_prop.get_property("solutions")
x_solution, y_solution, z_solution = [], [], []
for w in freqs:
x_solution.append(cpp_prop.get_full_solution_vector(solution_vecs[("x", w)]))
y_solution.append(cpp_prop.get_full_solution_vector(solution_vecs[("y", w)]))
z_solution.append(cpp_prop.get_full_solution_vector(solution_vecs[("z", w)]))
x_solution, y_solution, z_solution = (
np.array(x_solution),
np.array(y_solution),
np.array(z_solution),
)
# Extract polarization-resolved response
sX, sY, sZ = [], [], []
for w in freqs:
axx = -cpp_prop.rsp_property["response_functions"][("x", "x", w)].imag
ayy = -cpp_prop.rsp_property["response_functions"][("y", "y", w)].imag
azz = -cpp_prop.rsp_property["response_functions"][("z", "z", w)].imag
sX.append(4.0 / 3.0 * np.pi * w * axx / 137.035999)
sY.append(4.0 / 3.0 * np.pi * w * ayy / 137.035999)
sZ.append(4.0 / 3.0 * np.pi * w * azz / 137.035999)
# Create empty object of dimension nocc (to add spectrum contributions from each occupied orbital)
spec_comp = []
for occ in np.arange(0, nocc):
spec_comp.append([])
# Appending spec_comp object with decomposed response
for i in np.arange(len(freqs)):
x_tmp = x_solution[i]
y_tmp = y_solution[i]
z_tmp = z_solution[i]
for occ in np.arange(nocc):
comp_X = np.sum(
np.abs(np.imag(x_tmp[occ * nvirt : (occ + 1) * nvirt]))
) / np.sum(np.abs(np.imag(x_tmp[:])))
comp_Y = np.sum(
np.abs(np.imag(y_tmp[occ * nvirt : (occ + 1) * nvirt]))
) / np.sum(np.abs(np.imag(y_tmp[:])))
comp_Z = np.sum(
np.abs(np.imag(z_tmp[occ * nvirt : (occ + 1) * nvirt]))
) / np.sum(np.abs(np.imag(z_tmp[:])))
spec_comp[occ].append((comp_X * sX[i] + comp_Y * sY[i] + comp_Z * sZ[i]))
# Extract contribution of two carbon atoms, as well as the full spectrum
cont_chf = np.array(spec_comp[1])
cont_ch2 = np.array(spec_comp[2])
cont_full = sigma
plt.figure(figsize=(7, 4))
# Create splined spectra for each component
x = np.arange(min(au2ev * freqs), max(au2ev * freqs), 0.01)
ychf = interp1d(au2ev * freqs, cont_chf, kind="cubic")
ych2 = interp1d(au2ev * freqs, cont_ch2, kind="cubic")
yfull = interp1d(au2ev * freqs, cont_full, kind="cubic")
# Spectrum from the non-carbon atoms
yrest = yfull(x) - ychf(x) - ych2(x)
# Plot full spectrum
plt.plot(x, yfull(x), "k-", linewidth=2)
# Plot components
plt.fill_between(x, yrest, yrest + ychf(x), alpha=0.6)
plt.fill_between(x, yrest + ychf(x), yrest + ych2(x) + ychf(x), alpha=0.6)
plt.fill_between(x, 0, yrest, alpha=0.6)
plt.legend(("Total", "-CHF", r"-CH$_2$", "Rest"))
plt.xlim((272, 284))
plt.tight_layout()
plt.show()
```

Where we again clearly see the chemical shift and different contributions.

### Polarization dependence#

For structured molecular samples, the local orientation of the molecules can be studied by looking at the polarization-resolved spectra. This will not work for the gas phase, where the isotropic signal is observed, but it can provide very useful information on, *e.g.*, the orientation and structuring of self-assembly monolayers.

Resolving the polarization-dependence from ADC and CPP-DFT calculations is straightforward:

```
# Extract excitation energies
adc_freq = adc_xas.excitation_energy
# Extract Cartesian components from the ADC state (here skipping the prefactors)
adc_sX, adc_sY, adc_sZ = [], [], []
for i in np.arange(len(adc_freq)):
tmp_tms = adc_xas.transition_dipole_moment[i]
adc_sX.append(adc_freq[i] * tmp_tms[0] ** 2)
adc_sY.append(adc_freq[i] * tmp_tms[1] ** 2)
adc_sZ.append(adc_freq[i] * tmp_tms[2] ** 2)
plt.figure(figsize=(7, 4))
# Resolved spectrum region
xmin, xmax = 284, 296
# Broadened spectra for each component
x, y = au2ev * adc_freq, adc_sX
xX, yX = lorentzian(x, y, xmin, xmax, 0.01, 0.8)
x, y = au2ev * adc_freq, adc_sY
xY, yY = lorentzian(x, y, xmin, xmax, 0.01, 0.8)
x, y = au2ev * adc_freq, adc_sZ
xZ, yZ = lorentzian(x, y, xmin, xmax, 0.01, 0.8)
# Plot total spectrum (line) and Cartesian contributions (area)
plt.plot(xX, yX + yY + yZ, "k-", linewidth=2)
plt.fill_between(xX, 0, yX, alpha=0.6)
plt.fill_between(xX, yX, yX + yY, alpha=0.6)
plt.fill_between(xX, yX + yY, yX + yY + yZ, alpha=0.6)
plt.legend(("Isotropic", "x-pol.", "y-pol.", "z-pol."))
plt.xlim((xmin, xmax))
plt.tight_layout()
plt.show()
```

```
plt.figure(figsize=(7, 4))
# Extract frequencies
x = np.arange(min(au2ev * freqs), max(au2ev * freqs), 0.01)
# Spline spectra for each Cartesian contribution
yX = interp1d(au2ev * freqs, sX, kind="cubic")
yY = interp1d(au2ev * freqs, sY, kind="cubic")
yZ = interp1d(au2ev * freqs, sZ, kind="cubic")
# Plot total spectrum (line) and Cartesian contributions (area)
plt.plot(x, yX(x) + yY(x) + yZ(x), "k-", linewidth=2)
plt.fill_between(x, 0, yX(x), alpha=0.6)
plt.fill_between(x, yX(x), yX(x) + yY(x), alpha=0.6)
plt.fill_between(x, yX(x) + yY(x), yX(x) + yY(x) + yZ(x), alpha=0.6)
plt.legend(("Isotropic", "x-pol.", "y-pol.", "z-pol."))
plt.xlim((272, 284))
plt.tight_layout()
plt.show()
```

We see that the first two transitions are out-of-plane, correspond to the two strong \(1s \rightarrow \pi^{\ast}\) excitations. As such, if the molecule is rotated in space, these two lines would provide very good probes on the angle with respect to the out-of-plane direction.