11-11 Koopmans’ Theorem

Chapter 11 The SCF-LCAO-MO Method and Extensions

Recall from Chapter 5 that the numbers in parentheses stand for the spatial coordinates of an electron; that is, φ1(1) really means φ1(x1, y1, z1)α(1) or φ1(r1, θ1, φ1)α(1).¹ In other words, if we pick values of r, θ, and φ for each of the four electrons and insert them into Eq. (11-22), we can evaluate each function and obtain a determinant of numbers, which can in turn be evaluated to give numerical values for ψ and ψ². The latter number (times dν) can be taken as the probability for finding one electron in the volume element dν1 around r1, θ1, and φ1, another electron simultaneously in dν2 at r2, θ2, and φ2, etc. The important point to notice is that the effect on ψ² of a particular choice of r1, θ1, and φ1 does not depend on the choices of r, θ, and φ for the other electrons, because the wavefunction is built from products of functions of independent coordinates. Physically, this corresponds to saying that the probability for finding an electron in dν1 at some instant is not influenced by the presence or absence of another electron in some nearby element dν2 at the same instant. This is consistent with the fact that the Fock operator $\hat{F}$ [Eq. (11-7)] treats each electron as though it were moving in the time-averaged potential field due to the other electrons.

Because electrons repel each other, there is a tendency for them to keep out of each other’s way. That is, in reality, their motions are correlated. The HF energy is higher than the true energy because the HF wavefunction is formally incapable of describing correlated motion. The energy difference between the HF and the “exact” (for a simpliﬁed nonrelativistic hamiltonian) energy for a system is referred to as the correlation energy.
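The definition in the preceding paragraph can be written compactly. The sign convention here is an assumption (some authors define the correlation energy with the opposite sign); with this choice the quantity is never positive, because the variational principle places the HF energy at or above the exact nonrelativistic energy.

```latex
E_{\text{corr}} = E_{\text{exact}} - E_{\text{HF}} \le 0
```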

11-11 Koopmans’ Theorem

Despite the fact that the total electronic energy is not given by the sum of SCF one-electron energies, it is still possible to relate the $\epsilon_i$'s to physical measurements. If certain assumptions are made, it is possible to equate orbital energies with molecular ionization energies or electron affinities. This identification is related to a theorem due to Koopmans.

Koopmans [1] proved² that the wavefunction obtained by removing one electron from φk, or adding one electron to the virtual (i.e., unoccupied) MO φj, in a Hartree–Fock wavefunction is stable with respect to any subsequent variation in φk or φj.

Notice that this ignores the question of subsequent variation of all of the MOs φ with unchanged occupations. It is not necessarily true that they remain optimized, since the potential they experience is changed by addition or removal of an electron. Nevertheless, Koopmans’ theorem suggests a model. It suggests that we approximate the wavefunction for a positive ion by removing an electron from one of the occupied HF MOs for a neutral molecule without reoptimizing any of the MOs. Let us do this and compare the electronic energies for the two wavefunctions.

For the neutral molecule, which we assume is a closed-shell system,

$$E = \sum_i \Bigl[ 2H_{ii} + \sum_j (2J_{ij} - K_{ij}) \Bigr] \tag{11-23}$$

¹Note that φ1 in parentheses represents a coordinate of electron 1, whereas φ1 outside the parentheses represents an MO.

²See also Smith and Day [2].


For the cation, produced by removing an electron from φk,

$$E^+ = \sum_{i \neq k} \Bigl[ 2H_{ii} + \sum_{j \neq k} (2J_{ij} - K_{ij}) \Bigr] + H_{kk} + \sum_{i \neq k} (2J_{ik} - K_{ik}) \tag{11-24}$$

The first sum in Eq. (11-24) gives the total electronic energy due to all but the unpaired electron in φk. $H_{kk}$ gives the kinetic and nuclear attraction energies for the unpaired electron, and the final sum gives the repulsion and exchange energy between this electron and all the others. Now we note that the last sum is exactly equal to the void produced in the first sum by the restriction $j \neq k$. Therefore, we can combine these by removing that index restriction and deleting the last sum. This gives

$$E^+ = \sum_{i \neq k} \Bigl[ 2H_{ii} + \sum_j (2J_{ij} - K_{ij}) \Bigr] + H_{kk} \tag{11-25}$$

To compare this with E of (11-23) we should remove the remaining index restriction. We do this by allowing i to equal k in the sum and simultaneously subtracting the new terms thus produced:

$$E^+ = \sum_i \Bigl[ 2H_{ii} + \sum_j (2J_{ij} - K_{ij}) \Bigr] - H_{kk} - \sum_j (2J_{kj} - K_{kj}) \tag{11-26}$$

But, by virtue of Eqs. (11-15) and (11-23), this is

$$E^+ = E - \epsilon_k \tag{11-27}$$

Hence, the ionization energy $I_k^0$ for ionization from φk is

$$I_k^0 = E^+ - E = -\epsilon_k \tag{11-28}$$

This illustrates that, within the context of this simplified model, the negatives of the orbital energies for occupied HF MOs are to be interpreted as ionization energies.
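The algebra leading to Eq. (11-28) can be checked numerically. The following is a minimal sketch, assuming the orbital energy has the form implied by Eqs. (11-26) and (11-27), namely ε_k = H_kk + Σ_j (2J_kj − K_kj). The integral values are random, hypothetical numbers rather than data for any real molecule; only their symmetry (J_ij = J_ji, K_ij = K_ji, K_ii = J_ii) is imposed.

```python
import numpy as np

def closed_shell_energy(H, J, K):
    """Eq. (11-23): E = sum_i [2 H_ii + sum_j (2 J_ij - K_ij)]."""
    return 2.0 * np.trace(H) + np.sum(2.0 * J - K)

def orbital_energy(H, J, K, k):
    """Assumed form of Eq. (11-15): eps_k = H_kk + sum_j (2 J_kj - K_kj)."""
    return H[k, k] + np.sum(2.0 * J[k, :] - K[k, :])

def frozen_cation_energy(H, J, K, k):
    """Eq. (11-24): one electron removed from MO k, all MOs left unrelaxed."""
    others = [i for i in range(H.shape[0]) if i != k]
    # Energy of the electrons in the remaining doubly occupied MOs.
    core = sum(
        2.0 * H[i, i] + sum(2.0 * J[i, j] - K[i, j] for j in others)
        for i in others
    )
    # One-electron energy of the unpaired electron plus its interaction
    # with the doubly occupied MOs.
    unpaired = H[k, k] + sum(2.0 * J[i, k] - K[i, k] for i in others)
    return core + unpaired

# Hypothetical MO-basis integrals (arbitrary units), not from a real system.
rng = np.random.default_rng(0)
n = 4
H = -np.diag(rng.uniform(1.0, 5.0, n))
J = rng.uniform(0.3, 0.8, (n, n))
J = 0.5 * (J + J.T)
K = rng.uniform(0.0, 0.2, (n, n))
K = 0.5 * (K + K.T)
np.fill_diagonal(K, np.diag(J))  # K_ii = J_ii by definition

E = closed_shell_energy(H, J, K)
for k in range(n):
    ionization = frozen_cation_energy(H, J, K, k) - E
    print(k, ionization, -orbital_energy(H, J, K, k))
```

For every k the two printed numbers should agree to rounding error, which is Eq. (11-28). The agreement is purely algebraic: it says nothing about how well the frozen-orbital cation approximates a real ion.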

Another way to see the relation between $I_k^0$ and $-\epsilon_k$ is to recognize that the physical interactions lost upon removal of an electron from φk are precisely those that constitute $\epsilon_k$ [see Eq. (11-15)].

A similar result holds for orbital energies of unoccupied HF MOs and electron afﬁnities. (However, this is less successful in practice; see Problem 11-3.)

In actuality, the relation (11-28) is only approximately obeyed. One reason for this has to do with our assumption that doubly occupied SCF MOs produced by a variational procedure on the neutral molecule will be suitable for the doubly occupied MOs of the cation as well. These MOs minimize the energy of the neutral molecule but give an energy for the cation that is higher than what would be produced by an independent variational calculation. For this mathematical reason, we expect the Koopmans' theorem prediction for the ionization energy to be higher than the value predicted by taking the difference between separate SCF calculations on the molecule and cation (which we will symbolize ΔSCF). The corresponding physical argument is that use of Eq. (11-28) views ionization as removal of an electron without any reorganization of the remaining electronic charge. This neglects a process that stabilizes the cation and lowers the ionization energy. Whichever argument we choose, we have here a reason for expecting $-\epsilon_k$ to be an overestimate of the value obtained by independent calculations, ΔSCF.

TABLE 11-1 Ionization Energies (in electron volts) of Water as Measured Experimentally and as Predicted from SCF Calculations

                              SCF (near HF limit)ᵇ
Cation state    Observedᵃ    Koopmans    ΔSCF
²B₁             12.62        13.79       11.08
²A₁             14.74        15.86       13.34
²B₂             18.51        19.47       17.61

ᵃ From Potts and Price [3].
ᵇ From Dunning et al. [4].

Another error results from the neglect of the change in correlation energy. We have seen that the total SCF energy for the molecule is too high because the single-determinantal form of the wavefunction cannot allow for correlated electronic motion. The SCF energy for the cation is too high for the same reason, but the error is different for the two cases because there are fewer electrons in the cation. We expect the neutral molecule to have the greater correlation energy (since it has more electrons),³ so that proper inclusion of this feature would lower the energy of the neutral molecule more than that of the cation, making the true $I_k^0$ larger than that obtained by neglect of correlation. Hence, this leads us to expect ΔSCF to underestimate $I_k^0$. Since $-\epsilon_k$ overestimates ΔSCF, and ΔSCF underestimates the ionization energy, we can expect some cancellation of errors in using Eq. (11-28).

An illustration of these relations is provided in Table 11-1, where observed vertical ionization energies (i.e., no nuclear relaxation), the appropriate values of $-\epsilon_k$, and the values of ΔSCF are compared.
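The trends argued above can be read directly from the numbers in Table 11-1. The short sketch below simply subtracts the observed values from the two sets of predictions; the rows are entered in table order, and the variable names are illustrative.

```python
# Rows of Table 11-1 for water, in table order (all values in eV):
# (observed, Koopmans, Delta-SCF)
rows = [
    (12.62, 13.79, 11.08),
    (14.74, 15.86, 13.34),
    (18.51, 19.47, 17.61),
]

# Signed errors relative to experiment.
koopmans_errors = [koopmans - observed for observed, koopmans, _ in rows]
delta_scf_errors = [delta_scf - observed for observed, _, delta_scf in rows]

for (observed, koopmans, delta_scf), e_k, e_d in zip(
    rows, koopmans_errors, delta_scf_errors
):
    print(f"observed {observed:6.2f}  Koopmans error {e_k:+.2f}  "
          f"Delta-SCF error {e_d:+.2f}")
```

Every Koopmans error is positive and every ΔSCF error is negative, so the observed value lies between the two predictions for each of the three states, consistent with the partial cancellation of errors described in the text.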

11-12 Conﬁguration Interaction

There are several techniques for going beyond the SCF method and thereby including some effects of electron correlation. Some extremely accurate calculations on small atoms and molecules, making explicit use of interparticle coordinates, were described in Section 7-8. There is one general technique, however, that has traditionally been used for including effects of correlation in many-electron systems. This technique is called conﬁguration interaction (CI).

The mathematical idea of CI is quite obvious. Recall that we restricted our SCF wavefunction to be a single determinant for a closed-shell system. To go beyond the optimum (restricted Hartree–Fock) level, then, we allow the wavefunction to be a linear combination of determinants. Suppose we choose two determinants D1 and D2, each corresponding to a different orbital occupation scheme (i.e., different configurations). Then we can let

$$\psi = c_1 D_1 + c_2 D_2 \tag{11-29}$$

and minimize $\bar{E}$ as a function of the linear mixing coefficients c1 and c2.

³This reasoning is rather naive. A significant correlation energy contribution can result from a small energy-level separation between filled and empty MOs (rather than from merely the number of electrons), but production of a cation should normally increase this gap and lead to reduced correlation.

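If we go through the mathematical formalism and express $\bar{E}$ as $\langle\psi|\hat{H}|\psi\rangle / \langle\psi|\psi\rangle$, expand this as integrals over D1 and D2, and require $\partial\bar{E}/\partial c_i = 0$, we obtain the same sort of 2 × 2 determinantal equation that we find when minimizing an MO energy as a function of mixing of two AOs. That is, we obtain

$$\begin{vmatrix} H_{11} - \bar{E}S_{11} & H_{12} - \bar{E}S_{12} \\ H_{21} - \bar{E}S_{21} & H_{22} - \bar{E}S_{22} \end{vmatrix} = 0 \tag{11-30}$$

where now

$$H_{ij} = \langle D_i | \hat{H} | D_j \rangle \tag{11-31}$$

$$S_{ij} = \langle D_i | D_j \rangle \tag{11-32}$$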

We see that, whereas before we might have had two AOs interacting to form two MOs, here we have two conﬁgurations (i.e., two determinantal functions) interacting to form two approximate wavefunctions. Our example involves only two conﬁgurations, but there is no limit to the number of conﬁgurations that can be mixed in this way.
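To make the determinantal equation (11-30) concrete, here is a minimal sketch that finds its roots for a two-configuration problem. The matrix elements are hypothetical numbers chosen only for illustration, and the helper name `ci_roots` is not from the text. The overlap matrix is handled by symmetric orthogonalization, which is one standard way (an assumed choice here) to reduce det(H − ES) = 0 to an ordinary eigenvalue problem.

```python
import numpy as np

def ci_roots(H, S):
    """Return energies and coefficients solving det(H - E S) = 0.

    H is the symmetric matrix of elements H_ij, and S is the symmetric,
    positive-definite overlap matrix S_ij between configurations.
    """
    s_vals, s_vecs = np.linalg.eigh(S)
    S_inv_half = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
    H_orth = S_inv_half @ H @ S_inv_half
    energies, coeffs_orth = np.linalg.eigh(H_orth)
    coeffs = S_inv_half @ coeffs_orth  # columns are (c1, c2) for each root
    return energies, coeffs

# Hypothetical matrix elements in arbitrary energy units.
H = np.array([[-1.10, 0.15],
              [0.15, -0.40]])
S = np.eye(2)  # orthonormal configurations

energies, coeffs = ci_roots(H, S)
print("CI energies:", energies)
print("Mixing coefficients for the lower root:", coeffs[:, 0])
```

The lower root lies below H11, the energy of the lower configuration by itself, and the upper root lies above H22. This is the same pattern found when two AOs mix to form two MOs.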

Since each configuration D contains products of MOs, each of which is typically a sum of AOs, the integrals $H_{ij}$ and $S_{ij}$ can result in very large numbers of integrals over basis functions when they are expanded. This is the sort of situation where a computer is essential, and CI calculations on atoms and molecules, while still expensive compared to SCF, have become routine on modern computers.

Our purpose in this chapter is not to describe how to carry out a CI calculation, but rather to convey what a CI calculation is and what its predictive capabilities are. Therefore, we will not concern ourselves with the mathematical complexities of evaluating $H_{ij}$ and $S_{ij}$.⁴ But we will consider one practical aspect of CI calculations, namely, how one goes about choosing which configurations should be mixed together, and which ones may be safely ignored.

We begin by considering the H2 molecule. The LCAO-MO-SCF method expresses the ground state wavefunction for H2 as

$$\psi(1,2) = \begin{vmatrix} 1\sigma_g(1)\alpha(1) & 1\sigma_g(2)\alpha(2) \\ 1\sigma_g(1)\beta(1) & 1\sigma_g(2)\beta(2) \end{vmatrix} \tag{11-33}$$

that is, as the configuration $1\sigma_g^2$. The SCF procedure mixes the AO basis functions together in the optimum way to produce the 1σg MO.

We have noted at several points in this book that, if one begins with a basis set of n linearly independent functions, one ultimately arrives at n independent MOs. Hence, the 1σg MO of Eq. (11-33) is but one of several MOs produced by the SCF procedure. It is called an occupied MO because it is occupied with electrons in this configuration. All the other MOs in this case are unoccupied or virtual MOs. The virtual MOs of H2 have symmetry properties related to the molecular hamiltonian, just as does the occupied MO. Thus, we can refer to 1σu, 2σg, 2σu, 1πu, 1πg, etc., virtual MOs of H2.

⁴In most actual calculations, the D's are orthonormal, and $S_{ij} = \delta_{ij}$.

Which of these virtual MOs are produced by an SCF calculation depends on the number and nature of the AO basis set provided at the outset. If no π -type AOs are provided, no π -type MOs will be produced. If only a minimal basis (1sa and 1sb) is provided, 1σu will be the only virtual MO produced.

It is important to distinguish between the physical content of occupied versus virtual SCF MOs. The SCF procedure finds the set of occupied MOs for a system leading to the lowest SCF electronic energy. The virtual orbitals are the residue of this process. The virtual MOs span that part of the basis set function space that the SCF procedure found least suitable for describing ψ. The subspace is sometimes referred to as the orthogonal complement of the occupied orbital subspace. (Note that this situation differs from that pertaining to Hückel-type calculations, where MOs and energy levels are calculated without regard for electron occupancy. Only after the variational procedure are electrons added.)

Our concern with virtual MOs is due to the fact that they provide a ready means for constructing new configurations to mix with our $1\sigma_g^2$ configuration for H2. Thus, using some of the above-mentioned virtual MOs, we could write determinantal functions corresponding to the excited configurations $1\sigma_g 1\sigma_u$, $1\sigma_g 2\sigma_g$, $1\sigma_g 2\sigma_u$, $1\sigma_g 1\pi_u$, etc.⁵ These are commonly referred to as singly excited configurations because one electron has been promoted from a ground-state-occupied MO to a virtual MO. (This is not meant to imply that the orbital energy difference is equal to the expected spectroscopic energy of the transition.) It is also possible to construct doubly excited configurations, such as $1\sigma_u^2$, $1\sigma_u 2\sigma_g$, $2\sigma_g^2$, $1\sigma_u 2\sigma_u$, $1\sigma_u 1\pi_u$, etc. For systems having more electrons, one can write determinants corresponding to triple, quadruple, etc., excitations. If one has a reasonably large number, say 50, of virtual orbitals and, say, 10 electrons to distribute among them, then there is an enormous number of possible configurations. A major step in doing a CI calculation is deciding which configurations might be important in affecting the results and ought therefore to be included.
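A rough count shows how quickly the number of configurations grows in the 50-virtual-orbital, 10-electron example. The sketch below counts Slater determinants in a spin-orbital picture under stated assumptions: a closed-shell reference with 10 occupied spin orbitals, 100 virtual spin orbitals (50 spatial orbitals, each with α and β spin), and no spin or spatial symmetry restrictions. Those restrictions would reduce the counts, so the numbers are upper bounds rather than the size of a practical calculation.

```python
from math import comb

def n_excited_determinants(n_occ_spin, n_virt_spin, level):
    """Count determinants made by promoting `level` electrons.

    Choose which occupied spin orbitals are vacated and, independently,
    which virtual spin orbitals are filled.
    """
    return comb(n_occ_spin, level) * comb(n_virt_spin, level)

n_occ_spin = 10    # assumed: 10 electrons in 10 occupied spin orbitals
n_virt_spin = 100  # assumed: 50 virtual spatial orbitals x 2 spins

counts = {
    level: n_excited_determinants(n_occ_spin, n_virt_spin, level)
    for level in range(0, n_occ_spin + 1)
}
total = sum(counts.values())

for level in (1, 2, 3, 4):
    print(f"{level}-fold excitations: {counts[level]:,}")
print(f"all excitation levels: {total:,}")
```

The total equals the number of ways to place 10 electrons in 110 spin orbitals, which is why truncating the expansion by excitation level (or by some other selection rule) is unavoidable for all but the smallest systems.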

We can gain insight into this problem by considering our minimal basis set H2 problem in more detail. We have

$$1\sigma_g = N_g (1s_A + 1s_B) \tag{11-34}$$

$$1\sigma_u = N_u (1s_A - 1s_B) \tag{11-35}$$

where $N_g$ and $N_u$ are normalization constants. The spatial part of the ground configuration is

$$\psi_{\text{space}} = 1\sigma_g(1)\,1\sigma_g(2) \tag{11-36}$$

which expands to

$$\psi_{\text{space}} = N_g^2 \left[ 1s_A(1)1s_A(2) + 1s_B(1)1s_B(2) + 1s_A(1)1s_B(2) + 1s_B(1)1s_A(2) \right]$$

⁵As was shown in Chapter 5, the symmetry requirements of the wavefunction require that each of these open shell configurations be expressed as a linear combination of two 2 × 2 determinants; for example, $1\sigma_g 2\sigma_u$ stands for the combination

$$\frac{1}{\sqrt{2}} \left\{ \left| 1\sigma_g(1)\, 2\bar{\sigma}_u(2) \right| \pm \left| 1\bar{\sigma}_g(1)\, 2\sigma_u(2) \right| \right\}$$

If both electrons are near nucleus A, the first term is quite large. This may be rephrased to say that ψ² gives a sizable probability for finding both electrons near nucleus A. The second term gives a similar likelihood for finding both electrons near B. These two terms are referred to as ionic terms because they become large whenever the instantaneous electronic dispositions correspond to $H_A^- H_B^+$ and $H_A^+ H_B^-$, respectively. The last two terms cause ψ² to be sizable whenever an electron is near each nucleus. Hence, these are called covalent terms, and their presence means that ψ contains significant "covalent character." In fact, because all four terms have the same coefficient, the configuration $1\sigma_g^2$ is said to have 50% covalent and 50% ionic character.

Is this bad? It turns out to be no problem at all when the nuclei are close together. Indeed, in the united-atom (helium) limit, the ionic-covalent distinction vanishes. But at large internuclear separations it is very inaccurate to describe H2 as 50% ionic. In reality, H2 dissociates to two neutral ground state H atoms, that is, to a state that is 100% "covalent," with an electron near each nucleus. In short, the SCF-MO description does not properly describe the molecule as it dissociates. This means that the calculation of $\bar{E}$ versus $R_{AB}$ for H2 will deviate from experiment more and more as $R_{AB}$ increases. This defect in the SCF treatment of H2 occurs for many other molecular species also.

Can we correct this defect through use of CI? We ask the question this way: "What configuration could we mix with $1\sigma_g^2$ in order to make the mixture of covalent and ionic character variable?" Since $1\sigma_g^2$ expands to give us covalent and ionic terms of the same sign, we need an additional configuration that will give them with opposite sign. Then admixture of the two configurations will affect the two kinds of term differently. The configuration that will accomplish this is $1\sigma_u^2$:

$$1\sigma_u(1)\,1\sigma_u(2) = N_u^2 \left[ 1s_A(1)1s_A(2) + 1s_B(1)1s_B(2) - 1s_A(1)1s_B(2) - 1s_B(1)1s_A(2) \right] \tag{11-37}$$

Mixing these two configurations together gives

$$\begin{aligned} \psi(c_1/c_2) &= c_1\, 1\sigma_g(1)\,1\sigma_g(2) + c_2\, 1\sigma_u(1)\,1\sigma_u(2) \\ &= \left( c_1 N_g^2 + c_2 N_u^2 \right) \left[ 1s_A(1)1s_A(2) + 1s_B(1)1s_B(2) \right] \\ &\quad + \left( c_1 N_g^2 - c_2 N_u^2 \right) \left[ 1s_A(1)1s_B(2) + 1s_B(1)1s_A(2) \right] \end{aligned} \tag{11-38}$$

If $c_1/c_2$ is readjusted at each value of $R_{AB}$ to minimize $\bar{E}$, it is evident that the relative weights of covalent and ionic character in Eq. (11-38) will change to suit the circumstances. Actual calculations on this system show that, as $R_{AB}$ gets large, $c_1/c_2$ approaches a value such that $c_1 N_g^2 + c_2 N_u^2$ approaches zero, so that the ionic component of ψ vanishes.
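The way the two coefficients in Eq. (11-38) respond to the mixing ratio can be sketched numerically. The code below assumes the standard minimal-basis normalization constants, N_g² = 1/[2(1 + S)] and N_u² = 1/[2(1 − S)], where S is the overlap between the two 1s AOs. These expressions are not given in this section and are introduced here as an assumption, as are the function names.

```python
def ionic_covalent_coefficients(c1, c2, overlap):
    """Coefficients of the ionic and covalent brackets in Eq. (11-38)."""
    n_g_sq = 1.0 / (2.0 * (1.0 + overlap))  # assumed N_g^2
    n_u_sq = 1.0 / (2.0 * (1.0 - overlap))  # assumed N_u^2
    ionic = c1 * n_g_sq + c2 * n_u_sq
    covalent = c1 * n_g_sq - c2 * n_u_sq
    return ionic, covalent

def c2_removing_ionic_terms(c1, overlap):
    """Value of c2 for which the ionic coefficient is exactly zero."""
    return -c1 * (1.0 - overlap) / (1.0 + overlap)

# Pure 1-sigma-g squared configuration: equal ionic and covalent coefficients.
print(ionic_covalent_coefficients(1.0, 0.0, overlap=0.6))

# Large separation (overlap near zero): c2 = -c1 removes the ionic terms.
c2 = c2_removing_ionic_terms(1.0, overlap=0.0)
print(c2, ionic_covalent_coefficients(1.0, c2, overlap=0.0))
```

With c2 = 0 the two coefficients are equal, which is the 50% ionic, 50% covalent SCF description. A negative c2 reduces the ionic coefficient and increases the covalent one, and in the zero-overlap limit the ionic part vanishes when c2 = −c1.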

This example illustrates that CI of this sort has an associated physical picture. It suggests that, in any CI calculation involving the dissociation (or extensive stretching) of a covalent bond, important configurations are likely to include double excitations into the antibonding virtual "mates" of occupied bonding MOs.

What about other configurations for H2? What will $1\sigma_g 2\sigma_g$ do for the calculation, assuming now that an extended basis set has produced a 2σg MO? Suppose we take as our trial function

$$\psi = c_1\, 1\sigma_g^2 + c_2\, 1\sigma_g 2\sigma_g \tag{11-39}$$

where the configurations are understood to stand for determinants. If the 1σg MO has been produced by an SCF calculation on the ground state, and 2σg is a virtual MO from that SCF calculation, then it is possible to show that the CI energy minimum occurs when c2 in Eq. (11-39) is zero. In other words, these determinants will not mix when they are combined in this way. An equivalent statement is that the mixing element $H_{12} = \langle 1\sigma_g^2 | \hat{H} | 1\sigma_g 2\sigma_g \rangle$ vanishes. Hence, the CI determinant (11-30) is already in diagonal form, and no variational mixing will occur. This is an example of Brillouin's theorem, which may be stated as follows:

If D1 is an optimized single determinantal function and Dj is a determinant corresponding to any single excitation out of an orbital φj occupied in D1 and into the virtual subspace (orthogonal complement) of D1, then no improvement in energy is possible by taking ψ = c1D1 + c2Dj.

The proof of Brillouin's theorem is very simple. We start with a basis set that spans a function space. An SCF calculation is performed, which produces the best single-determinantal wavefunction we can possibly get within this function space. This is D1. Dj differs from D1 in only one orbital, which means they differ in only one row. A general property of determinants is that, if two of them differ in only one row or column, any linear combination of the two can be written as a single determinant (see Problem 11-4). This means that any combination c1D1 + c2Dj is still expressible as a single determinant. Since Dj makes no use of functions outside our original basis set, c1D1 + c2Dj is a single determinant within our original function space. However, D1 is already known to be the single determinant within this function space that gives the lowest energy, and c1D1 + c2Dj cannot do better. QED.
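The determinant property used in the proof is easy to check numerically. The sketch below uses random matrices of plain numbers as stand-ins for determinants of orbital values (an illustrative assumption, not a wavefunction calculation). Two matrices that differ in a single row are combined, and the result is compared with the determinant of one matrix whose differing row is the same linear combination of the two original rows.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
row = 2  # index of the single row in which the two determinants differ

D1 = rng.normal(size=(n, n))
Dj = D1.copy()
Dj[row, :] = rng.normal(size=n)  # "single excitation": replace one row

c1, c2 = 0.8, -0.6

# Single matrix whose differing row is c1*(row of D1) + c2*(row of Dj).
combined = D1.copy()
combined[row, :] = c1 * D1[row, :] + c2 * Dj[row, :]

lhs = c1 * np.linalg.det(D1) + c2 * np.linalg.det(Dj)
rhs = np.linalg.det(combined)
print(lhs, rhs)
```

The two printed values agree to rounding error because a determinant is linear in each row separately. If the matrices differed in two rows, as for a double excitation, no such single-matrix rewriting would be available in general.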

A doubly excited conﬁguration differs from D1 in two rows, and mixing such a conﬁguration with D1 produces a result that cannot be expressed as a single determinant.

Because of Brillouin's theorem, one might decide to omit all single excitations from CI calculations. But it is important to recognize that singly excited configurations can affect the results of CI calculations in the presence of doubly excited configurations. This comes about because nonzero mixing elements can occur between singly and doubly excited configurations in the CI determinant. To illustrate, let ψ0 be an SCF single determinant, ψ1 be a singly excited configuration, and ψ2 be a "double." Then the CI determinant could be, assuming orthogonal determinants,

$$\begin{vmatrix} H_{00} - E & 0 & H_{02} \\ 0 & H_{11} - E & H_{12} \\ H_{02} & H_{12} & H_{22} - E \end{vmatrix} = 0 \tag{11-40}$$
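The indirect effect of a singly excited configuration can be seen in a small numerical sketch of Eq. (11-40). The matrix elements below are hypothetical values in arbitrary units, chosen only so that H01 = 0 (Brillouin's theorem) while H02 and H12 are nonzero. The lowest root is computed with and without the single included.

```python
import numpy as np

def lowest_ci_root(H):
    """Lowest eigenvalue of a symmetric CI matrix (orthonormal configurations)."""
    return np.linalg.eigvalsh(H)[0]

# Hypothetical matrix elements; index 0 = SCF, 1 = single, 2 = double.
H00, H11, H22 = -1.00, -0.30, -0.10
H02, H12 = 0.20, 0.15

full = np.array([[H00, 0.0, H02],
                 [0.0, H11, H12],
                 [H02, H12, H22]])

# Drop the single: keep only the SCF and doubly excited configurations.
without_single = full[np.ix_([0, 2], [0, 2])]

# Same 3 x 3 problem, but with the single decoupled from the double.
decoupled = full.copy()
decoupled[1, 2] = decoupled[2, 1] = 0.0

E_full = lowest_ci_root(full)
E_without_single = lowest_ci_root(without_single)
E_decoupled = lowest_ci_root(decoupled)
print(E_full, E_without_single, E_decoupled)
```

With these numbers the lowest root drops slightly when the single is included, even though H01 is zero, because ψ1 couples to ψ0 through ψ2. When H12 is set to zero the single has no effect on the lowest root, as expected.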