Finding transition states is of crucial importance for computational chemistry, and there are many methods for doing so. However, most of these methods require some initial guess of the transition state structure. If the guess is not good enough, the method might not converge. Reaction force field interpolation allows for generating such guesses at a very cheap computational level by relying mostly on molecular mechanics.
Force field interpolation works by describing the reactant and product states with classical force fields and respectively (where is a vector containing all positions). By taking a linear combination
one can crudely describe the potential energy surface (PES) along the reaction. By doing an MM optimisation of the structure at various values of , the relative MM energy can be obtained. The optimised structure can also be used to perform a single-point quantum mechanical (QM) calculation, which will give a more accurate energy. The structure at which these energies are highest is a good guess for the transition state structure. This workflow can be executed with just a couple lines of code. Let’s consider the bromine substitution of ethyl chloride:
The following code will perform the entire workflow at the MM level and visualise the results:
import veloxchem as vlxrea1 = vlx.Molecule.read_smiles("[Br]")
rea2 = vlx.Molecule.read_smiles("CCl")
rea1.set_charge(-1)
pro1 = vlx.Molecule.read_smiles("[Cl]")
pro2 = vlx.Molecule.read_smiles("CBr")
pro1.set_charge(-1)
tsguesser = vlx.TransitionStateGuesser()
tsguesser.results_file = "ts_results_sn2.h5"
results_sn2 = tsguesser.find_transition_state([rea1, rea2], [pro1, pro2])Output
* Info * Building forcefields. Disable mute_ff_build to see detailed output.
Reaction summary
1 breaking bonds:
ReaType ProType ID - ReaType ProType ID
c3 c3 2 - cl cl 3
1 forming bonds:
ReaType ProType ID - ReaType ProType ID
br br 1 - c3 c3 2
* Info * System has charge -1.0 and multiplicity 1. Provide correct values if this is wrong.
* Info * Rounding lambda vector to 3 decimal places: [np.float64(0.0), np.float64(0.05), np.float64(0.1), np.float64(0.15), np.float64(0.2), np.float64(0.25), np.float64(0.3), np.float64(0.35), np.float64(0.4), np.float64(0.45), np.float64(0.5), np.float64(0.55), np.float64(0.6), np.float64(0.65), np.float64(0.7), np.float64(0.75), np.float64(0.8), np.float64(0.85), np.float64(0.9), np.float64(0.95), np.float64(1.0)]
* Info * Building MM systems for the transition state guess. Disable mute_ff_build to see detailed output.
Starting MM Scan
MD steps : 1000
conf. snapshots : 1
MD temperature : 600 K
MD step size : 0.001 ps
folder name : ts_data_1781179886
saving MD traj : False
Lambda | E1 (kj/mol) | E2 (kj/mol) | V (kj/mol) | n_conf
------------------------------------------------------------
0.00 -26.95 508.42 -26.95 1
0.05 -27.09 461.99 -2.63 1
0.10 -25.10 416.12 19.02 1
0.15 -20.64 372.79 38.38 1
0.20 -13.39 331.71 55.63 1
0.25 -3.24 292.90 70.80 1
0.30 10.23 255.56 83.83 1
0.35 27.22 219.80 94.62 1
0.40 48.49 184.82 103.02 1
0.45 74.55 150.72 108.82 1
0.50 106.19 117.30 111.75 1
0.55 143.58 85.22 111.49 1
0.60 185.91 55.68 107.78 1
0.65 231.18 30.17 100.52 1
0.70 278.58 8.98 89.86 1
0.75 327.44 -7.82 76.00 1
0.80 377.85 -20.50 59.17 1
0.85 430.25 -29.35 39.59 1
0.90 484.15 -34.47 17.39 1
0.95 538.88 -36.23 -7.48 1
1.00 591.74 -35.24 -35.24 1
* Info * Found highest MM E: 111.746 at Lambda: 0.5 and conformer index: 0.
* Info * Saving results to ts_results_sn2.h5
The results can the be visualised as follows:
try:
tsguesser.show_results(results_sn2)
except NameError:
vlx.TransitionStateGuesser().show_results(filename="ts_results_sn2.h5")* Info * Loading results from ts_results_sn2.h5
* Info * Loading results from ts_results_sn2.h5
To better understand the workflow and its limitations, let’s go through it step by step.
Step by step calculation of reaction¶
To create interpolated force fields, we need to provide reactant and product structures as molecule objects, either as a single molecule or as a list of molecules. The program will then calculate partial charges, but it is also possible to provide those manually. The program automatically figures out which atom in the reactant corresponds to which atom in the product, and from this mapping derives which bonds are broken and which are formed. It then performs an MM optimisation of the combined reactant and product systems, which will be used as starting points for the scan.
The returned dictionary contains the breaking, forming and static bonds (as 0-indexed atom pairs), as well as the generated reactant and product force fields. The output of the force field generation is normally muted, but this can be turned on by setting mute_ff_build = False before calling build_forcefields(). The output will then print out details of every step of the force field generation.
rea1 = vlx.Molecule.read_smiles("[Br]")
rea2 = vlx.Molecule.read_smiles("CCl")
rea1.set_charge(-1)
pro1 = vlx.Molecule.read_smiles("[Cl]")
pro2 = vlx.Molecule.read_smiles("CBr")
pro1.set_charge(-1)
tsguesser = vlx.TransitionStateGuesser()
tsguesser.mute_ff_build = False
tsguesser.results_file = "ts_results_sn2_steps.h5"
tsguesser.build_forcefields([rea1, rea2], [pro1, pro2])Output
Building force field for REA_1
* Info * Assigning partial charges based on electronegativity
* Info * Creating topology
* Info * Using GAFF (v2.11) parameters.
Reference: J. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman, D. A. Case, J. Comput. Chem. 2004,
25, 1157-1174.
Building force field for REA_2
* Info * Assigning partial charges based on electronegativity
* Info * Creating topology
* Info * Sum of partial charges is not a whole number.
* Info * Compensating by removing 1.000e-06 from the largest charge.
* Info * Using GAFF (v2.11) parameters.
Reference: J. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman, D. A. Case, J. Comput. Chem. 2004,
25, 1157-1174.
* Info * Creating combined reactant force field
* Info * max_x: 0.0, min_x: 0.0, shifting next molecule by 2.0 angstrom
* Info * Setting multiplicity of the combined molecule to 1 based on the multiplicities of the provided molecules.
* Info * Combined reactant with total charge -1.0 and multiplicity 1
Building force field for PRO_1
* Info * Assigning partial charges based on electronegativity
* Info * Creating topology
* Info * Using GAFF (v2.11) parameters.
Reference: J. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman, D. A. Case, J. Comput. Chem. 2004,
25, 1157-1174.
Building force field for PRO_2
* Info * Assigning partial charges based on electronegativity
* Info * Creating topology
* Info * Using GAFF (v2.11) parameters.
Reference: J. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman, D. A. Case, J. Comput. Chem. 2004,
25, 1157-1174.
* Info * Creating combined reactant force field
* Info * max_x: 0.0, min_x: 0.0, shifting next molecule by 2.0 angstrom
* Info * Setting multiplicity of the combined molecule to 1 based on the multiplicities of the provided molecules.
* Info * Combined reactant with total charge -1.0 and multiplicity 1
* Info * Matching reactant and product force fields with no forced breaking bonds and no forced forming bonds.
* Info * Searching for matching subgraphs to assign assist ids
* Info * Assigned 0 assist ids: {}
* Info * Set max breaking depth to 4 from max breaking attempts 10000000.0
* Info * Trying all 720 permutations of unassigned nodes. Expecting to consider 6 permutations based on element counts.
* Info * Multiple mappings with same least amount of changing bonds found, picking one the most changing H-bonds.
* Info * 6 equally good mappings found, picking first one.
* Info * Mapping {0: 2, 1: 1, 2: 0, 3: 3, 4: 4, 5: 5} with breaking bonds: {(1, 2)} and forming bonds: {(1, 0)}
* Info * Mapping {0: 2, 1: 1, 2: 0, 3: 3, 4: 5, 5: 4} with breaking bonds: {(1, 2)} and forming bonds: {(1, 0)}
* Info * Mapping {0: 2, 1: 1, 2: 0, 3: 4, 4: 3, 5: 5} with breaking bonds: {(1, 2)} and forming bonds: {(1, 0)}
* Info * Mapping {0: 2, 1: 1, 2: 0, 3: 4, 4: 5, 5: 3} with breaking bonds: {(1, 2)} and forming bonds: {(1, 0)}
* Info * Mapping {0: 2, 1: 1, 2: 0, 3: 5, 4: 3, 5: 4} with breaking bonds: {(1, 2)} and forming bonds: {(1, 0)}
* Info * Mapping {0: 2, 1: 1, 2: 0, 3: 5, 4: 4, 5: 3} with breaking bonds: {(1, 2)} and forming bonds: {(1, 0)}
* Info * Picking mapping {0: 2, 1: 1, 2: 0, 3: 3, 4: 4, 5: 5} with breaking bonds: {(1, 2)} and forming bonds: {(1, 0)}
* Info * Total subgraph isomorphism checks: 0, total subgraph monomorphism checks: 0
* Info * Mapping: {3: 1, 2: 2, 1: 3, 4: 4, 5: 5, 6: 6}
Reaction summary
1 breaking bonds:
ReaType ProType ID - ReaType ProType ID
c3 c3 2 - cl cl 3
1 forming bonds:
ReaType ProType ID - ReaType ProType ID
br br 1 - c3 c3 2
* Info * Guessing intra-molecular constraints based on breaking bonds. This option can be turned off with mm_opt_constrain_bonds.
* Info * Adding distance restraint for atoms (1, 2) with r0 0.443 and k 1000.0 for MM equilibration of the reactant
* Info * Running conformational sampling with 10000 steps and 10 snapshots at 600 K for reactant molecule.
* Info * This can be turned off with optimize_ff = False.
* Info * Found 1 conformers with energies [-24.480082925970315] during optimization of the reactant molecule.
* Info * Guessing intra-molecular constraints based on breaking bonds. This option can be turned off with mm_opt_constrain_bonds.
* Info * Adding distance restraint for atoms (2, 3) with r0 0.435 and k 1000.0 for MM equilibration of the product
* Info * Running conformational sampling with 10000 steps and 10 snapshots at 600 K for product molecule.
* Info * This can be turned off with optimize_ff = False.
* Info * Found 1 conformers with energies [-32.70609154743284] during optimization of the product molecule.
{'breaking_bonds': {(1, 2)},
'forming_bonds': {(0, 1)},
'static_bonds': {(1, 3), (1, 4), (1, 5)},
'reactant': <veloxchem.mmforcefieldgenerator.MMForceFieldGenerator at 0x7efae024c440>,
'product': <veloxchem.mmforcefieldgenerator.MMForceFieldGenerator at 0x7efae024cc20>}The optimised structures can be inspected as follows:
print("Reactant:")
tsguesser.reactant.molecule.show()
print("Product:")
tsguesser.product.molecule.show()Reactant:
Product:
If this looks reasonable, we can proceed to scan the reaction. By default, the reaction is split up into 21 evenly spaced values, but here we will only use 11 to speed it up. At every step, the structure is first minimised with respect to the interpolated force field, then sampled with molecular dynamics (by default 1000 steps at 600 K with a 0.001 ps step size, controlled by mm_steps, mm_temperature and mm_step_size), and then optimised again. The printed table shows the reactant force field energy , the product force field energy , the interpolated energy and the number of conformers found at each value.
Note that by default, only a results file (in HDF5 format) is written to disk. Set save_intermediates = True to also save the force fields, OpenMM systems and topologies for debugging purposes, and see Saving and loading results below for more details on the results file.
tsguesser.lambda_vector = [
0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
]
tsguesser.build_systems()
mm_results_sn2 = tsguesser.scan_mm()Output
* Info * temperature: 300 (default)
* Info * nb_cutoff: 1.0 (default)
* Info * bonded_integration_bond_fac: 0.2 (default)
* Info * bonded_integration_angle_fac: 0.1 (default)
* Info * torsion_lambda_switch: 0.25 (default)
* Info * soft_core_coulomb_pes_static: False (default)
* Info * soft_core_lj_pes_static: False (default)
* Info * soft_core_coulomb_pes_dynamic: True (default)
* Info * soft_core_lj_pes_dynamic: True (default)
* Info * soft_core_coulomb_int: False
* Info * soft_core_lj_int: False
* Info * int_nb_const_exceptions: True (default)
* Info * dynamic_bond_tightening: 3.25 (default)
* Info * dynamic_bond_fc_factor: 0.8 (default)
* Info * pressure: -1.0 (default)
* Info * solvent: None (default)
* Info * padding: 1.5 (default)
* Info * no_reactant: False (default)
* Info * E_field: [0, 0, 0] (default)
* Info * neutralize: False (default)
* Info * morse_D_default: 500 (default)
* Info * morse_couple: 1 (default)
* Info * posres_k: 1000 (default)
* Info * posres_residue_radius: -1 (default)
* Info * coul14_scale: 0.833 (default)
* Info * lj14_scale: 0.5 (default)
* Info * pdb: None (default)
* Info * pdb_active_res: None (default)
* Info * no_force_groups: False (default)
* Info * nb_switching_function: True (default)
* Info * graphene: False (default)
* Info * graphene_size_nm: 4 (default)
* Info * CNT: False (default)
* Info * CNT_radius_nm: 0.5 (default)
* Info * decompose_nb: None (default)
* Info * decompose_bonded: True (default)
* Info * implicit_solvent_model: None (default)
* Info * solute_dielectric: 1.0 (default)
* Info * solvent_dielectric: 78.39 (default)
* Info * conformer_phi_reactant: 0.0 (default)
* Info * conformer_phi_product: 0.0 (default)
* Info * conformer_k: 200.0 (default)
* Info * System has charge -1.0 and multiplicity 1. Provide correct values if this is wrong.
* Info * Rounding lambda vector to 3 decimal places: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Starting MM Scan
MD steps : 1000
conf. snapshots : 1
MD temperature : 600 K
MD step size : 0.001 ps
folder name : ts_data_1781179895
saving MD traj : False
Lambda | E1 (kj/mol) | E2 (kj/mol) | V (kj/mol) | n_conf
------------------------------------------------------------
0.00 -26.95 508.58 -26.95 1
0.10 -25.14 416.48 19.02 1
0.20 -13.39 331.68 55.63 1
0.30 10.19 255.65 83.83 1
0.40 48.43 184.91 103.02 1
0.50 106.25 117.25 111.75 1
0.60 185.91 55.68 107.77 1
0.70 278.56 8.99 89.86 1
0.80 378.01 -20.54 59.17 1
0.90 484.10 -34.46 17.39 1
1.00 591.57 -35.25 -35.25 1
* Info * Found highest MM E: 111.754 at Lambda: 0.5 and conformer index: 0.
* Info * Saving results to ts_results_sn2_steps.h5
The results can then be visualised interactively. The slider selects the value and the dropdown selects the conformer shown (if more than one conformer was found at that value). The plot shows the relative MM energy along the scan, and the 3D view shows the corresponding structure with the forming and breaking bonds indicated.
try:
tsguesser.show_results(mm_results_sn2)
except NameError:
vlx.TransitionStateGuesser().show_results(
filename="ts_results_sn2_steps.h5")While the MM energies give a reasonable description of the energetic contributions from coulombic and van der Waals interactions, they do not capture bond breaking and formation. To get a better estimate of the energy, we can perform single-point QM calculations on the MM optimised structures with scan_qm(). The functional and basis set are controlled by the qm_xcfun and qm_basis attributes, which default to PBE0 and def2-SVP. At every value, up to max_qm_conformers (default 5) of the lowest-MM-energy conformers are evaluated. If an SCF calculation does not converge, it is retried with a looser convergence threshold, and if that also fails the energy is reported as NaN and that conformer is ignored.
A QM scan can also be included directly in find_transition_state() by setting do_qm_scan = True. It is furthermore possible to provide your own (possibly cheaper) driver through the scf_drv attribute, for example an XTB driver, and to unmute the SCF output with mute_scf = False. See the bottom of this page for a complete list of attributes that can be set to control the workflow.
tsguesser.qm_xcfun = "pbe0" # default
tsguesser.qm_basis = "def2-svp" # default
qm_results_sn2 = tsguesser.scan_qm()Output
* Info * Disable mute_scf to see detailed output.
Starting QM Scan
basis set : def2-svp
DFT xc functional : pbe0
max conformers : 5
Lambda | Conf. i | Rel. E (kJ/mol) | MM V (kJ/mol)
------------------------------------------------------------
0.00 1 0.000 -26.945
0.10 1 4.593 19.031
0.20 1 23.294 55.627
0.30 1 49.809 83.827
0.40 1 76.661 103.022
0.50 1 95.763 111.749
0.60 1 96.567 107.771
0.70 1 79.452 89.862
0.80 1 58.192 59.179
0.90 1 46.640 17.389
1.00 1 53.358 -35.254
* Info * Found highest QM E: -8068983.011 at Lambda: 0.6 and conformer index: 0.
* Info * Saving results to ts_results_1781171538.h5
When the results are visualised again, the plot now also shows the QM energies. These results show how the energies of the MM scan are in general different from the QM energies because the MM scan does not capture the energetics of bond breaking and formation.
Energy difference bteween reactant and product end structures. Complete MM energy profile is only qualitatively correct, and the actual energy values are not meaningful. The QM energies are more accurate, but still not perfect because they are single point energies on MM optimised structures. In this example, the peak of the MM energy is very close to the peak of the QM energy, showing that even just using the MM guess can give a good estimate of the transition state structure. For simpler systems, this is usually the case. For more complex systems, it is thus recomended to perform the QM scan.
try:
tsguesser.show_results(qm_results_sn2)
except NameError:
vlx.TransitionStateGuesser().show_results(
filename="ts_results_sn2_steps.h5")The transition state guess is available as a molecule object through tsguesser.molecule, with which we can proceed to a proper QM transition state optimisation.
basis = vlx.MolecularBasis.read(tsguesser.molecule, "def2-svp")
scf_drv = vlx.ScfRestrictedDriver()
scf_drv.ostream.mute()
scf_drv.xcfun = "pbe0"
scf_results = scf_drv.compute(tsguesser.molecule, basis)
scf_opt_drv = vlx.OptimizationDriver(scf_drv)
scf_opt_drv.ostream.mute()
scf_opt_drv.transition = True # We set the option to find transition states
opt_results = scf_opt_drv.compute(tsguesser.molecule, basis, scf_results)
vlx.Molecule.read_xyz_string(opt_results['final_geometry']).show()Conformer searching¶
Some systems have many conformers along the PES, and in order to find the correct transition state, it is important to find the lowest lying conformer at every value. The TransitionStateGuesser provides three options for this, all turned off by default because they are more time consuming:
peak_conformer_search: after the initial scan, a conformer search is performed around the peak of the PES. The search covers the peak ±peak_conformer_search_range(default 2) values, and is repeated iteratively if the peak moves to a value that has not been searched yet.discont_conformer_search: along the scan, should increase monotonically and should decrease monotonically. If this is not the case, the obtained structures could be part of different valleys of the energy landscape. With this option enabled, such discontinuities trigger a conformer search at the affected values, repeated until no discontinuities remain.force_conformer_search: performs a conformer search at every value.
During a conformer search, the MD sampling is repeated conformer_snapshots (default 10) times, and every minimised snapshot is kept as a conformer candidate. Conformers whose interpolated energies differ by less than mm_conformer_equivalence_threshold (default 0.1 kJ/mol) are considered duplicates and pruned. It is also possible to scan the PES from product to reactant in addition to the forward direction by setting mm_scan_backward = True, which can likewise help to find lower lying conformers.
Depending on the system and the use case, the user has to decide which options are appropriate. In the following example, the bromide ion attacks a branched substrate. The flexible backbone gives the system more conformational freedom: in vacuum, there is nothing to counteract the strong electrostatic attraction, and the alkyl side groups curl around the ion. Rescanning the entire PES is not necessary, but performing a conformer search at the peak is useful to obtain the correct conformer for the transition state guess.
rea1 = vlx.Molecule.read_smiles("[Br-]")
rea2 = vlx.Molecule.read_smiles("ClCCCCN1C=C[N+](C)=C1")
pro1 = vlx.Molecule.read_smiles("[Cl-]")
pro2 = vlx.Molecule.read_smiles("BrCCCCN1C=C[N+](C)=C1")
tsguesser_conf = vlx.TransitionStateGuesser()
tsguesser_conf.peak_conformer_search = True
tsguesser_conf.results_file = "ts_results_conformer_vacuum.h5"
vacuum_results = tsguesser_conf.find_transition_state([rea1, rea2],
[pro1, pro2])Output
* Info * Building forcefields. Disable mute_ff_build to see detailed output.
Reaction summary
1 breaking bonds:
ReaType ProType ID - ReaType ProType ID
cl cl 2 - c3 c3 3
1 forming bonds:
ReaType ProType ID - ReaType ProType ID
br br 1 - c3 c3 3
* Info * System has charge 0.0 and multiplicity 1. Provide correct values if this is wrong.
* Info * Rounding lambda vector to 3 decimal places: [np.float64(0.0), np.float64(0.05), np.float64(0.1), np.float64(0.15), np.float64(0.2), np.float64(0.25), np.float64(0.3), np.float64(0.35), np.float64(0.4), np.float64(0.45), np.float64(0.5), np.float64(0.55), np.float64(0.6), np.float64(0.65), np.float64(0.7), np.float64(0.75), np.float64(0.8), np.float64(0.85), np.float64(0.9), np.float64(0.95), np.float64(1.0)]
* Info * Building MM systems for the transition state guess. Disable mute_ff_build to see detailed output.
Starting MM Scan
MD steps : 1000
conf. snapshots : 1
MD temperature : 600 K
MD step size : 0.001 ps
folder name : ts_data_1781179902
saving MD traj : False
Lambda | E1 (kj/mol) | E2 (kj/mol) | V (kj/mol) | n_conf
------------------------------------------------------------
0.00 -79.96 766.69 -79.96 1
0.05 -67.86 563.34 -36.30 1
0.10 -75.55 507.21 -17.27 1
0.15 -74.14 431.93 1.77 1
0.20 -64.48 386.04 25.62 1
0.25 -54.27 364.21 50.35 1
0.30 -37.88 323.42 70.51 1
0.35 -12.91 259.64 82.48 1
0.40 13.59 219.13 95.81 1
0.45 45.20 179.32 105.55 1
0.50 83.35 140.25 111.80 1
0.55 127.55 102.09 113.55 1
0.60 178.62 65.47 110.73 1
0.65 230.90 34.09 102.97 1
0.70 287.14 6.90 90.98 1
0.75 345.82 -15.38 74.92 1
0.80 479.30 -58.96 48.69 1
0.85 535.13 -72.27 18.84 1
0.90 590.40 -80.18 -13.12 1
0.95 645.27 -83.34 -46.91 1
1.00 699.97 -81.73 -81.73 1
* Info * Found peak MM E: 113.548 at Lambda: 0.55 (iteration 1).
* Info * Doing conformer search from Lambda: 0.45 to Lambda: 0.65.
Starting MM Scan (with conformer search)
MD steps : 1000
conf. snapshots : 10
MD temperature : 600 K
MD step size : 0.001 ps
folder name : ts_data_1781179902
saving MD traj : False
Lambda | E1 (kj/mol) | E2 (kj/mol) | V (kj/mol) | n_conf
------------------------------------------------------------
0.45 45.05 179.37 105.50 4
0.50 83.61 139.91 111.76 4
0.55 178.48 55.82 111.02 5
0.60 224.14 21.27 102.42 3
0.65 263.72 -6.70 87.95 4
* Info * Found highest MM E: 111.759 at Lambda: 0.5 and conformer index: 1.
* Info * Saving results to ts_results_conformer_vacuum.h5
The results now show some extra stripes, indicating higher lying conformers. The cation on the flexible side-chain now stabilises the charged transition state structure.
try:
tsguesser_conf.show_results(vacuum_results)
except NameError:
vlx.TransitionStateGuesser().show_results(
filename="ts_results_conformer_vacuum.h5")Implicit solvation¶
In the gas phase, the electrostatic interaction between the ion and the substrate is unscreened. This is why in the previous example, cation coordinates with the active site. In solution this picture changes qualitatively: the solvent screens the Coulomb interaction and stabilises the localised charge of the Chlorine and Bromine, so the ionic ligand float around more freely instead of coordinating tightly to the charge.
These effects can be captured at almost no extra cost with an implicit solvent model. Setting the implicit_solvent_model attribute to one of the generalised Born models 'gbn', 'gbn2', 'obc1', 'obc2' or 'hct' adds the corresponding solvation term to the MM systems. The dielectric constants can be adjusted with solute_dielectric (default 1.0) and solvent_dielectric (default 78.39, corresponding to water). Note that when a solvent model is enabled, find_transition_state automatically switches to RESP charges for the force field generation, which requires a (cheap) SCF calculation per molecule and makes the force field build somewhat more expensive.
Let’s rerun the previous example with implicit solvation:
rea1 = vlx.Molecule.read_smiles("[Br-]")
rea2 = vlx.Molecule.read_smiles("ClCCCCN1C=C[N+](C)=C1")
pro1 = vlx.Molecule.read_smiles("[Cl-]")
pro2 = vlx.Molecule.read_smiles("BrCCCCN1C=C[N+](C)=C1")
tsguesser_impl = vlx.TransitionStateGuesser()
tsguesser_impl.peak_conformer_search = True
tsguesser_impl.implicit_solvent_model = "obc2"
tsguesser_impl.results_file = "ts_results_conformer_solvent.h5"
solvent_results = tsguesser_impl.find_transition_state([rea1, rea2],
[pro1, pro2])Output
* Info * Building forcefields. Disable mute_ff_build to see detailed output.
Reaction summary
1 breaking bonds:
ReaType ProType ID - ReaType ProType ID
cl cl 2 - c3 c3 3
1 forming bonds:
ReaType ProType ID - ReaType ProType ID
br br 1 - c3 c3 3
* Info * System has charge 0.0 and multiplicity 1. Provide correct values if this is wrong.
* Info * Rounding lambda vector to 3 decimal places: [np.float64(0.0), np.float64(0.05), np.float64(0.1), np.float64(0.15), np.float64(0.2), np.float64(0.25), np.float64(0.3), np.float64(0.35), np.float64(0.4), np.float64(0.45), np.float64(0.5), np.float64(0.55), np.float64(0.6), np.float64(0.65), np.float64(0.7), np.float64(0.75), np.float64(0.8), np.float64(0.85), np.float64(0.9), np.float64(0.95), np.float64(1.0)]
* Info * Building MM systems for the transition state guess. Disable mute_ff_build to see detailed output.
Starting MM Scan
MD steps : 1000
conf. snapshots : 1
MD temperature : 600 K
MD step size : 0.001 ps
folder name : ts_data_1781179930
saving MD traj : False
implicit solvent : obc2
solvent dielectric : 78.39
solute dielectric : 1.00
Lambda | E1 (kj/mol) | E2 (kj/mol) | V (kj/mol) | n_conf
------------------------------------------------------------
0.00 -511.72 307.42 -511.72 1
0.05 -503.84 180.84 -469.60 1
0.10 -495.86 130.29 -433.24 1
0.15 -488.31 73.44 -404.05 1
0.20 -474.56 29.66 -373.71 1
0.25 -456.35 -13.00 -345.51 1
0.30 -435.53 -54.04 -321.08 1
0.35 -410.51 -94.52 -299.92 1
0.40 -380.22 -136.76 -282.84 1
0.45 -345.97 -177.15 -270.00 1
0.50 -306.34 -216.95 -261.65 1
0.55 -262.69 -254.89 -258.40 1
0.60 -215.64 -290.27 -260.42 1
0.65 -166.04 -322.16 -267.52 1
0.70 -115.66 -350.21 -279.84 1
0.75 -65.07 -374.27 -296.97 1
0.80 -12.51 -394.86 -318.39 1
0.85 42.36 -411.61 -343.52 1
0.90 96.35 -423.12 -371.17 1
0.95 153.15 -432.21 -402.95 1
1.00 334.56 -439.63 -439.63 1
* Info * Found peak MM E: -258.400 at Lambda: 0.55 (iteration 1).
* Info * Doing conformer search from Lambda: 0.45 to Lambda: 0.65.
Starting MM Scan (with conformer search)
MD steps : 1000
conf. snapshots : 10
MD temperature : 600 K
MD step size : 0.001 ps
folder name : ts_data_1781179930
saving MD traj : False
implicit solvent : obc2
solvent dielectric : 78.39
solute dielectric : 1.00
Lambda | E1 (kj/mol) | E2 (kj/mol) | V (kj/mol) | n_conf
------------------------------------------------------------
0.45 -346.01 -177.18 -270.04 4
0.50 -306.13 -217.10 -261.61 3
0.55 -262.66 -254.62 -258.24 3
0.60 -215.77 -290.20 -260.43 7
0.65 -166.79 -322.06 -267.71 5
* Info * Found highest MM E: -258.400 at Lambda: 0.55 and conformer index: 0.
* Info * Saving results to ts_results_conformer_solvent.h5
And indeed, now the results show a structure where the ionic ligand is not coordinating.
try:
tsguesser_impl.show_results(solvent_results)
except NameError:
vlx.TransitionStateGuesser().show_results(
filename="ts_results_conformer_solvent.h5")Comparing the two scans, the difference in behaviour is clearly visible in the structures: in vacuum the substrate wraps itself to the chlorine and bromine, while in implicit solvent the chain stays extended and the ion sits much more loosely at the reactive carbon.
When a QM scan is performed with implicit solvation enabled, the solvent-model micro-environment is mirrored on the QM side using the SMD solvation model. The solvent is selected with the smd_solvent attribute (default 'water'). If a custom scf_drv that does not support SMD (such as XTB) is provided, a warning is printed and the QM scan proceeds without solvation.
Conformational transition states¶
Not every transition state involves bond breaking and formation. The rotation around a single bond also passes through a barrier, and such conformational transition states are relevant for example for rotamer populations. If the reactant and product have identical bond topologies, no bonds are broken or formed, and the interpolated force fields would be identical. In this case, the TransitionStateGuesser automatically switches to a different strategy: it identifies the active torsion — the dihedral angle that differs most between the reactant and product geometries — and replaces it in the sampling systems with a periodic restraint whose minimum shifts gradually from the reactant angle () to the product angle ().
The active torsion is detected automatically by measuring every dihedral in the force field and selecting the one with the largest (shortest-path) angle difference. If no dihedral differs by more than 20°, an error is raised. The detection can be bypassed by setting the active_torsion attribute to a 1-indexed tuple of four atom indices. The strength of the restraint is controlled by conformer_k (default 200 kJ/mol).
As an example, let’s find the barrier for the classic anti to gauche rotation of butane. We read the molecule from SMILES and set the backbone dihedral of the reactant and product to 180° and 60° respectively:
rea = vlx.Molecule.read_smiles("CCCC")
pro = vlx.Molecule.read_xyz_string(rea.get_xyz_string())
rea.set_dihedral_in_degrees([1, 2, 3, 4], 180.0) # anti
pro.set_dihedral_in_degrees([1, 2, 3, 4], 60.0) # gauche
tsguesser = vlx.TransitionStateGuesser()
tsguesser.results_file = "ts_results_butane.h5"
# tsguesser.active_torsion = (1, 2, 3, 4) # Optional if the active torsion is not detected correctly
results_butane = tsguesser.find_transition_state(rea, pro)Output
* Info * Building forcefields. Disable mute_ff_build to see detailed output.
Reaction summary
0 breaking bonds:
0 forming bonds:
* Info * System has charge 0.0 and multiplicity 1. Provide correct values if this is wrong.
* Info * Rounding lambda vector to 3 decimal places: [np.float64(0.0), np.float64(0.05), np.float64(0.1), np.float64(0.15), np.float64(0.2), np.float64(0.25), np.float64(0.3), np.float64(0.35), np.float64(0.4), np.float64(0.45), np.float64(0.5), np.float64(0.55), np.float64(0.6), np.float64(0.65), np.float64(0.7), np.float64(0.75), np.float64(0.8), np.float64(0.85), np.float64(0.9), np.float64(0.95), np.float64(1.0)]
* Info * Building MM systems for the transition state guess. Disable mute_ff_build to see detailed output.
* Info * No forming or breaking bonds detected. Treating as a conformational transition state: using a shifting dihedral restraint in the integration systems.
* Info * Active torsion (1-indexed): (1, 2, 3, 4), phi_reactant = 180.0°, phi_product = 60.0°, |Δφ| = 120.0°.
* Info * Conformer restraint force constant: 200.0 kJ/mol.
Starting MM Scan
MD steps : 1000
conf. snapshots : 1
MD temperature : 600 K
MD step size : 0.001 ps
folder name : ts_data_1781180011
saving MD traj : False
conformational TS : yes
active torsion (1-idx) : (1, 2, 3, 4)
phi reactant : 180.0 deg
phi product : 60.0 deg
restraint k : 200.0 kJ/mol
Lambda | E1 (kj/mol) | E2 (kj/mol) | V (kj/mol) | n_conf
------------------------------------------------------------
0.00 19.05 17.55 19.05 1
0.05 19.31 17.82 19.24 1
0.10 20.05 18.55 19.90 1
0.15 21.23 19.74 21.01 1
0.20 22.87 21.37 22.57 1
0.25 24.69 23.17 24.31 1
0.30 26.55 25.03 26.09 1
0.35 28.29 26.77 27.76 1
0.40 29.92 28.39 29.31 1
0.45 30.92 29.37 30.22 1
0.50 31.39 29.83 30.61 1
0.55 31.20 29.65 30.35 1
0.60 30.38 28.81 29.44 1
0.65 29.22 27.65 28.20 1
0.70 27.92 26.36 26.83 1
0.75 26.57 25.01 25.40 1
0.80 25.31 23.74 24.06 1
0.85 24.38 22.78 23.02 1
0.90 23.81 22.23 22.39 1
0.95 23.54 21.93 22.01 1
1.00 23.69 22.07 22.07 1
* Info * Found highest MM E: 30.610 at Lambda: 0.5 and conformer index: 0.
* Info * Saving results to ts_results_butane.h5
try:
tsguesser.show_results(results_butane)
except NameError:
vlx.TransitionStateGuesser().show_results(filename="ts_results_butane.h5")Atom mapping and forced bond breaking¶
To correctly interpolate between the reactant and the product, each atom in the reactant needs to be matched to an atom in the product. This mapping is detected automatically by searching for the assignment that breaks and forms as few bonds as possible, but it can also be provided manually through the product_mapping keyword (a 1-indexed dictionary from reactant to product atoms), in which case the matching step is skipped entirely.
Sometimes the most economical mapping is not the reaction of interest. A typical case is catalysis: if a catalyst molecule appears unchanged on both sides of the reaction, the optimal mapping simply maps the catalyst onto itself and leaves it untouched, even though the actual mechanism proceeds through the catalyst. For these situations, bonds can be forced to break with the forced_breaking_bonds keyword (a set of 1-indexed atom-index pairs). Forced bonds are not allowed to reconnect, so the mapping has to route the reaction through the catalyst. Bonds can analogously be forced to form with forced_forming_bonds.
The following example demonstrates this with the tautomerisation of formamide to formamidic acid, assisted by a water molecule — a classic model for water-catalysed proton transfer. Without any forced bonds, the optimal mapping leaves the water untouched and transfers the proton directly from the nitrogen to the carbonyl oxygen (one breaking and one forming bond). By forcing the water O–H bond to break, the water instead acts as a proton shuttle: it donates its proton to the carbonyl oxygen and accepts a proton from the nitrogen. The atom_indices parameter for molecule.show() can be used to show the atom indices.
rea1 = vlx.Molecule.read_smiles("C(=O)N") # formamide, atoms 1-6
rea2 = vlx.Molecule.read_smiles("O") # water, atoms 7-9
rea1.show(atom_indices=True)
rea2.show(atom_indices=True, starting_index=rea1.number_of_atoms() + 1)
pro1 = vlx.Molecule.read_smiles("C(=N)O") # formamidic acid
pro2 = vlx.Molecule.read_smiles("O")tsguesser = vlx.TransitionStateGuesser()
tsguesser.results_file = "ts_results_formamide.h5"
results_formamide = tsguesser.find_transition_state(
[rea1, rea2],
[pro1, pro2],
forced_breaking_bonds={(7, 8)}, # the water O-H bond
)Output
* Info * Building forcefields. Disable mute_ff_build to see detailed output.
Reaction summary
2 breaking bonds:
ReaType ProType ID - ReaType ProType ID
ow ow 7 - hw ho 8
nt n2 3 - hn hw 6
2 forming bonds:
ReaType ProType ID - ReaType ProType ID
o oh 2 - hw ho 8
hn hw 6 - ow ow 7
* Info * System has charge 0.0 and multiplicity 1. Provide correct values if this is wrong.
* Info * Rounding lambda vector to 3 decimal places: [np.float64(0.0), np.float64(0.05), np.float64(0.1), np.float64(0.15), np.float64(0.2), np.float64(0.25), np.float64(0.3), np.float64(0.35), np.float64(0.4), np.float64(0.45), np.float64(0.5), np.float64(0.55), np.float64(0.6), np.float64(0.65), np.float64(0.7), np.float64(0.75), np.float64(0.8), np.float64(0.85), np.float64(0.9), np.float64(0.95), np.float64(1.0)]
* Info * Building MM systems for the transition state guess. Disable mute_ff_build to see detailed output.
Starting MM Scan
MD steps : 1000
conf. snapshots : 1
MD temperature : 600 K
MD step size : 0.001 ps
folder name : ts_data_1781180018
saving MD traj : False
Lambda | E1 (kj/mol) | E2 (kj/mol) | V (kj/mol) | n_conf
------------------------------------------------------------
0.00 -67.30 1182.04 -67.30 1
0.05 -62.67 898.72 -14.60 1
0.10 -51.69 825.24 36.01 1
0.15 -32.41 753.27 85.44 1
0.20 -3.88 681.19 133.13 1
0.25 39.10 609.35 181.66 1
0.30 84.60 536.35 220.13 1
0.35 136.43 463.51 250.90 1
0.40 192.30 391.62 272.03 1
0.45 245.50 323.26 280.49 1
0.50 311.48 254.82 283.15 1
0.55 372.66 192.78 273.72 1
0.60 433.22 136.58 255.24 1
0.65 490.33 88.11 228.89 1
0.70 554.33 44.70 197.59 1
0.75 615.47 9.41 160.92 1
0.80 677.96 -18.62 120.70 1
0.85 743.89 -38.89 78.53 1
0.90 814.26 -51.70 34.89 1
0.95 894.55 -58.63 -10.97 1
1.00 1230.13 -61.32 -61.32 1
* Info * Found highest MM E: 283.151 at Lambda: 0.5 and conformer index: 0.
* Info * Saving results to ts_results_formamide.h5
The reaction summary shows that besides the forced water O–H bond, an N–H bond of the formamide is broken, and that two new bonds are formed: one from the water proton to the carbonyl oxygen and one from the amide proton to the water oxygen — the water indeed acts as a proton shuttle.
try:
tsguesser.show_results(results_formamide)
except NameError:
vlx.TransitionStateGuesser().show_results(
filename="ts_results_formamide.h5")Saving and loading results¶
By default, the results are saved to an HDF5 file after every MM and QM scan (ts_results_<timestamp>.h5), so that no data is lost if a later step crashes. This behaviour is controlled by the save_results_file attribute, and the file name can be set through the results_file attribute. The file contains the scan data (structures and energies of all conformers at every value), the breaking, forming and static bonds, the reactant and product force fields, and the transition state guess.
Results can also be saved manually with save_results(filename, results), and loaded back with the static method TransitionStateGuesser.load_results(filename), which returns the results dictionary. The visualiser can read such a file directly through its filename argument, which makes it easy to inspect a scan that was run earlier (for example on a cluster):
tsguesser.save_results("ts_results_formamide.h5", results_formamide)
tsguesser.show_results(filename="ts_results_formamide.h5")* Info * Saving results to ts_results_formamide.h5
* Info * Loading results from ts_results_formamide.h5
* Info * Loading results from ts_results_formamide.h5
Available parameters and options¶
Below is a complete list of all options with their default values. They are all set as attributes of the TransitionStateGuesser object before calling find_transition_state or the individual steps.
MM scan
| Option | Default | Description |
|---|---|---|
lambda_vector | 21 evenly spaced values from 0 to 1 | The values of the scan |
mm_temperature | 600 | Temperature (K) of the MD sampling |
mm_steps | 1000 | Number of MD steps per value |
mm_step_size | 0.001 | MD step size (ps) |
save_mm_traj | False | Save the MD trajectories to disk |
mm_scan_backward | False | Also scan from product to reactant |
mute_ff_build | True | Suppress detailed force field build output |
Conformer searching
| Option | Default | Description |
|---|---|---|
peak_conformer_search | False | Conformer search around the PES peak after the scan |
peak_conformer_search_range | 2 | Number of values on each side of the peak to include |
discont_conformer_search | False | Conformer search at discontinuities in / |
force_conformer_search | False | Conformer search at every value |
conformer_snapshots | 10 | MD snapshots (conformer candidates) per search |
mm_conformer_equivalence_threshold | 0.1 | Energy threshold (kJ/mol) below which conformers are considered duplicates |
Conformational transition states
| Option | Default | Description |
|---|---|---|
active_torsion | None | 1-indexed atom indices of the active torsion; automatically detected if None |
conformer_k | 200.0 | Force constant (kJ/mol) of the shifting dihedral restraint |
Implicit solvation
| Option | Default | Description |
|---|---|---|
implicit_solvent_model | None | One of 'gbn', 'gbn2', 'obc1', 'obc2', 'hct'; None runs in vacuum |
solute_dielectric | 1.0 | Solute dielectric constant |
solvent_dielectric | 78.39 | Solvent dielectric constant |
smd_solvent | 'water' | Solvent name for the SMD model in the QM scan |
QM scan
| Option | Default | Description |
|---|---|---|
do_qm_scan | False | Include the QM scan in find_transition_state |
qm_xcfun | 'PBE0' | DFT exchange-correlation functional |
qm_basis | 'def2-svp' | Basis set |
max_qm_conformers | 5 | Maximum number of conformers evaluated per value |
mute_scf | True | Suppress detailed SCF output |
scf_drv | None | Custom QM driver; a restricted SCF driver is created if None |
File output
| Option | Default | Description |
|---|---|---|
save_results_file | True | Save the results to an HDF5 file after every scan |
results_file | ts_results_<timestamp>.h5 | Name of the results file |
save_intermediates | False | Save force fields (JSON), OpenMM systems (XML) and topologies (PDB) to disk |
folder_name | ts_data_<timestamp> | Folder for the intermediate files |
Force field builder options¶
Under the hood, the TransitionStateGuesser builds the reactant and product force fields with a ReactionForceFieldBuilder. This builder is exposed as the ffbuilder attribute, and its options can be changed before calling find_transition_state or build_forcefields:
tsguesser = vlx.TransitionStateGuesser()
# skip the automatic atom mapping (e.g. when a product_mapping is supplied)
tsguesser.ffbuilder.skip_reaction_matching = True
# recompute guessed bond/angle parameters from a QM hessian
tsguesser.ffbuilder.reparameterize_bonds = True
tsguesser.ffbuilder.reparameterize_angles = True
# disable the MM equilibration of the matched reactant/product geometries
tsguesser.ffbuilder.optimize_ff = FalseKeyword arguments of build_forcefields¶
The keyword arguments accepted by find_transition_state and build_forcefields are forwarded directly to the builder. In addition to forced_breaking_bonds, forced_forming_bonds and product_mapping discussed above, these are:
| Argument | Default | Description |
|---|---|---|
reactant_partial_charges | None | Partial charges for the reactant (per molecule); calculated if not provided |
product_partial_charges | None | Partial charges for the product (per molecule); calculated if not provided |
reactant_hessians | None | Hessians for the reactant, used by the Seminario method; calculated if not provided |
product_hessians | None | Hessians for the product; calculated if not provided |
reactant_total_multiplicity | -1 | Override the total multiplicity of the reactant (-1 keeps the calculated value) |
product_total_multiplicity | -1 | Override the total multiplicity of the product |
forced_breaking_bonds | () | Set of 1-indexed atom pairs forced to break |
forced_forming_bonds | () | Set of 1-indexed atom pairs forced to form |
product_mapping | None | 1-indexed reactant→product atom mapping; skips the matching step |
Builder attributes (tsguesser.ffbuilder)¶
| Option | Default | Description |
|---|---|---|
calculate_resp | set automatically | Use RESP charges (requires an SCF) instead of electronegativity-based charges. The TransitionStateGuesser enables this only when an implicit solvent model is set |
skip_reaction_matching | False | Skip the automatic atom mapping entirely |
optimize_mol | False | Optimise the geometry of each input molecule with XTB before building the force field |
optimize_ff | True | After matching, equilibrate the reactant and product geometries with a short MM run (only done when bonds break or form) |
optimize_steps | 1000 | MD steps for the force field equilibration |
optimize_conformer_snapshots | 10 | Number of snapshots taken during the equilibration |
optimize_temp | 600 | Temperature (K) of the equilibration |
optimize_dist_restraint_offset | 0.5 | Offset (Å) added to the equilibrium distance of the restraints used during equilibration |
optimize_dist_restraint_k | 1000.0 | Force constant (kJ/mol/nm²) of those distance restraints |
mm_opt_constrain_bonds | True | Guess intramolecular constraints from the breaking bonds during equilibration |
reparameterize_bonds | False | Recompute guessed bond parameters from a QM hessian |
reparameterize_angles | False | Recompute guessed angle parameters from a QM hessian |
hessian_xc_fun | 'B3LYP' | Functional used for the reparameterisation hessian |
hessian_basis | 'def2-SV(P)' | Basis set used for the reparameterisation hessian |
water_model | 'cspce' | Water model used when building topologies |
mute_scf | True | Suppress SCF output during charge and hessian calculations |