STRUCTURE SOLUTION (Or How to Solve the Phase Problem) Lecture 8 110209

$$\rho(X,Y,Z) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l} \vec{F}_{hkl} \, \exp[-2\pi i(hX + kY + lZ)]$$

where we know both magnitude and phase of the structure factors,

$$\rho(X,Y,Z) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l} |F_{hkl}| \cos 2\pi(hX + kY + lZ - \alpha'_{hkl})$$

where we know the magnitude, but have to determine the phase α’hkl of the structure factors.

Note: the XYZ in this expression is ANY position inside the unit cell where we wish to calculate the electron density. It is NOT the x,y,z fractional coordinates of the atoms!

Note: the phase problem is different for centro- and non-centrosymmetric structures.

Fhkl is real for centrosymmetric structures and α hkl = 2π α’ hkl = 0 or π. There are only 2 possible values for the phase angle of each reflection.

Therefore, the problem is to determine the sign of each reflection (a phase of either 0 or π). For 2000 data, there are only 2^2000 possibilities = BIG NUMBER!!!!!
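To get a feeling for how big that number is, here is a small Python sketch that counts its decimal digits. The 2000 reflections are the figure quoted above; nothing else is assumed.

```python
# Each centrosymmetric reflection has a phase of 0 or pi (sign + or -),
# so n reflections give 2**n possible sign combinations.
n_reflections = 2000  # the figure quoted in the notes
n_combinations = 2 ** n_reflections

# Python integers are exact, so the digit count can be read off directly.
n_digits = len(str(n_combinations))
print(f"2^{n_reflections} has {n_digits} decimal digits")
```

Trying every combination is clearly out of the question, which is why the methods that follow are needed.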

For non-centrosymmetric structures, Fhkl has both real AND imaginary parts, and each reflection has its own phase (0 ≤ αhkl < 2π) = anywhere between 0° and 359.9°.

Guides for solving structures and for knowing when structures are correct:

1-structure has to make chemical sense (bond distances and angles have to agree with literature values). Ex.: paper published with a C-O distance of 0.2 Å - ridiculous!!

2-Temperature factors should not be unusually large nor unusually small. If the temp. factors are unusually large, then we have placed too much electron density at that site; therefore, we have the wrong atom type, or NO atom exists there at all; vice-versa, if temperature factors are too small, then we have placed too little electron density at that site = wrong kind of atom at that site (probably have to increase Z). (Experience is best teacher).

3-R-factor (residual) should be relatively low (and comparable to other like structures) at the end of refinement

$$R = \frac{\sum_{hkl} w_i \, \bigl|\, |F_{obs}|_{hkl} - |F_{calc}|_{hkl} \,\bigr|}{\sum_{hkl} w_i \, |F_{obs}|_{hkl}}$$

If the R value is low, then the coordinates and the α hkl are most likely correct.
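As a rough illustration of the residual, the following Python sketch evaluates R for a handful of reflections. The function name `r_factor`, the unit weights, and the four amplitude values are all made up for the example; real refinement programs use their own weighting schemes.

```python
def r_factor(f_obs, f_calc, weights=None):
    """Weighted residual R = sum(w * ||Fobs| - |Fcalc||) / sum(w * |Fobs|)."""
    if weights is None:
        weights = [1.0] * len(f_obs)  # assumption: unit weights
    numerator = sum(
        w * abs(abs(fo) - abs(fc)) for w, fo, fc in zip(weights, f_obs, f_calc)
    )
    denominator = sum(w * abs(fo) for w, fo in zip(weights, f_obs))
    return numerator / denominator


# Made-up amplitudes for four reflections (illustration only).
f_obs = [100.0, 50.0, 25.0, 10.0]
f_calc = [95.0, 53.0, 24.0, 12.0]

r_value = r_factor(f_obs, f_calc)
print(f"R = {r_value:.3f}")
```

Only the magnitudes enter the residual, so R measures how well the model reproduces the measured amplitudes, not the phases directly.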

Rule of thumb:

“poor” structures: 0.10 < R < 0.12, and σ on C-C bond 0.02-0.10 Å

“good” structures: 0.05 < R < 0.08, and σ on C-C bond 0.007-0.02 Å

“excellent” structures: 0.02 < R < 0.05, and σ on C-C bond 0.002-0.007 Å

This residual (R) depends on the size of the structure (# atoms / asymmetric unit), the nature of the crystal (amount of mosaic character inherent in the crystal), whether or not there is disorder (and how well it has been modeled), decomposition (low temperature helps here), loss of solvent during data collection (low temperature helps here as well), etc.

4-In general, there should be no “holes” in the structure (i.e., the atoms should

pack closely together to fill the entire space = which is the complete unit cell).

5-R should show no unusual trends with respect to: sinθ, │F│, or different classes

of reflections.

Ex.: R = 0.09 for Fhkl when h = k = odd

but R = 0.03 for Fhkl when h = k = even. This suggests that there may be

something wrong with the structure.

6-At beginning of the structure (before finding any real atoms):

R random (centric structures) = 0.83.

R random (acentric structures) = 0.59.

What this suggests is that you should be better than these values at the START of the solution.

2 main techniques for solving structures:

1- Heavy-atom technique = Patterson method

Patterson Function (1935):

$$P(U,V,W) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l} |F_{hkl}|^{2} \cos 2\pi(hU + kV + lW)$$

which looks very much like the electron density equation, which is:

$$\rho(X,Y,Z) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l} \vec{F}_{hkl} \, \exp[-2\pi i(hX + kY + lZ)]$$

except that the Fourier coefficients here are the squared magnitudes │F│², and not the vectors F.

Therefore, we can calculate PUVW from the X-ray intensities, with no phases

required.

Ihkl ≈ │Fhkl│² = (F)(F*) = (A+iB)(A-iB) = A² + B²
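The point that no phases are needed can be checked numerically. The sketch below is a 1-D toy case under several assumptions that are not in the notes: two point atoms (Z = 6 at x = 0.10 and Z = 17 at x = 0.40), a cell length of 1 so that 1/V = 1, and reflections limited to |h| ≤ 10. Only the intensities │F│² are passed to the Patterson sum.

```python
import cmath
import math

# Hypothetical 1-D structure: (atomic number, fractional coordinate).
atoms = [(6, 0.10), (17, 0.40)]
H_MAX = 10  # assumption: reflections h = -10 ... +10 were measured


def intensity(h):
    """|F_h|^2 for point atoms; the phase of F_h is thrown away here."""
    f_h = sum(z * cmath.exp(2j * math.pi * h * x) for z, x in atoms)
    return abs(f_h) ** 2


def patterson(u):
    """P(u) = sum_h |F_h|^2 cos(2 pi h u), with 1/V set to 1."""
    return sum(
        intensity(h) * math.cos(2 * math.pi * h * u)
        for h in range(-H_MAX, H_MAX + 1)
    )


# Search away from the origin peak, which always dominates the map.
grid = [k / 100 for k in range(10, 51)]
u_max = max(grid, key=patterson)
print(f"largest non-origin peak at u = {u_max:.2f}")
```

The largest non-origin peak falls at u = 0.30, which is the separation 0.40 - 0.10 of the two atoms, even though the sum never saw a phase.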

Since PUVW is a 3-D Fourier series, it should represent a periodically repeating 3-D function with the same volume as the crystal’s unit cell, where UVW are the coords of ANY point in Patterson space.

What is the nature of this Patterson periodic function?

Peaks in PUVW space occur at positions corresponding to all possible interatomic vectors in the real space unit cell.

Consider 1 pair of atoms

in a real space unit cell:

At COORDINATES x,y,z: ρX,Y,Z is large, and therefore there is an atom.

At coords x+U,y+V,z+W: ρX+U,Y+V,Z+W is large, and therefore there is an atom.

The product (ρX,Y,Z)(ρX+U,Y+V,Z+W) is large only when both the tail and the tip of the vector (U,V,W) sit on an atom’s electron density.

Ex.: Calculate the positions of peaks in PUVW for a 2-D structure:

Atom A (0.25, 0.25)

Atom B (0.75, 0.25)

Atom C (0.50, 0.75)

N² vectors = 3² = 9 vectors [includes all null vectors, A-A, B-B, C-C, which are at (0,0)]

Vector    (U, V)
A-A       (0, 0)  (null)
A-B       (0.50, 0)
A-C       (0.25, 0.50)
B-A       (-0.50, 0)
B-B       (0, 0)  (null)
B-C       (-0.25, 0.50)
C-A       (-0.25, -0.50)
C-B       (0.25, -0.50)
C-C       (0, 0)  (null)
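The same list of vectors can be generated mechanically. This is a minimal Python sketch using the three atoms of the example; the function name `patterson_vectors` is just a label chosen here, and a vector "A-B" is taken as the position of B minus the position of A, which is the convention the table follows.

```python
from fractions import Fraction as Fr
from itertools import product

# Fractional coordinates of the 2-D example structure.
atoms = {
    "A": (Fr("0.25"), Fr("0.25")),
    "B": (Fr("0.75"), Fr("0.25")),
    "C": (Fr("0.50"), Fr("0.75")),
}


def patterson_vectors(atoms):
    """Return {(start, end): r_end - r_start} for all N*N ordered atom pairs."""
    vectors = {}
    for (name_i, r_i), (name_j, r_j) in product(atoms.items(), repeat=2):
        vectors[(name_i, name_j)] = tuple(b - a for a, b in zip(r_i, r_j))
    return vectors


vectors = patterson_vectors(atoms)
for (start, end), (u, v) in vectors.items():
    print(f"{start}-{end}: ({float(u):+.2f}, {float(v):+.2f})")
```

Exact fractions are used instead of floats so that the null vectors come out as exactly (0, 0).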

Real space → Patterson space = easy

Patterson space → Real space = hard

If one can decipher Patterson space, then one can determine the atomic coordinates and therefore get αhkl

Properties of PUVW :

1- # of peaks: for a structure of N atoms/cell, get N² peaks total, but N²-N unique (non-origin) peaks, when we disregard the null vectors.

2- The density of the peaks: in real space, d = N atoms/Volume; in Patterson space, d = (N²-N)/Volume. Therefore, much more crowded.

For large # of atoms/cell (~100), the large density of peaks is overwhelming and obscures any features of the map.

3- Breadth of the peaks (width at ½ height):

[Figure: 1-D peak profile in real space vs. 1-D peak profile in Patterson space]

It ends up that the peaks in Patterson space are about 2X as broad as the peaks in

real space. There are ways to sharpen the peaks, and these are used routinely

these days (all built into the programs).

4- Peak heights in Patterson space:

[Figure: peak heights in real space vs. Patterson space]

Total information:

a) Patterson = densely packed broad peaks.

b) Relatively featureless for a large number of equal atoms & very difficult to interpret.

c) Heavy atoms give large peaks, which are EASILY distinguishable.

d) Pattersons are ALL centrosymmetric.

e) Moving all of the vectors to the origin suppresses the translational symmetry. Therefore, there are only 24 Patterson space groups, instead of 230 space groups.

HARKER ANALYSIS: Helps to pick out certain peaks in the Patterson map. We will look at specific sections of the Patterson map for symmetry-related atoms in real space.

OR: Symmetry-related atoms in real space lead to vectors in very special positions in

Patterson space.

Ex.: space group P21/c: x,y,z; -x,-y,-z; -x,½+y, ½-z; x, ½-y, ½+z

Every atom in the molecule has 3 other equivalent positions in the unit cell. The

easiest way to determine general coordinates for Patterson vectors from

symmetry-related atoms is to construct a matrix:

                x,y,z              -x,-y,-z            -x,½+y,½-z          x,½-y,½+z

x,y,z           0                  -2x, -2y, -2z       -2x, ½, ½-2z        0, ½-2y, ½

-x,-y,-z        2x, 2y, 2z         0                   0, ½+2y, ½          2x, ½, ½+2z

-x,½+y,½-z      2x, -½, -½+2z      0, -½-2y, -½        0                   2x, -2y, 2z

x,½-y,½+z       0, -½+2y, -½       -2x, -½, -½-2z      -2x, 2y, -2z        0

(Each entry is the column position minus the row position.)

The matrix contains all vectors (4² = 16) with 4 diagonal elements (null vectors). Therefore, there are 12 non-origin or unique vectors.

There are General position vectors (they contain no 0 or ½): 2x 2y 2z

and its centro counterpart -2x -2y -2z

and 2x -2y 2z

and its centro counterpart -2x 2y -2z

****All of these vectors only appear ONCE and, therefore get a weight of 1.

There are Harker PLANES (only ONE value is EITHER a 0 or ½): 2x ½ ½+2z

which appears a second time in the matrix as 2x -½ -½+2z = 2x ½ ½+2z (the same vector after a lattice translation).

Also: -2x ½ ½-2z (the centro counterpart of 2x ½ ½+2z)

which appears a second time as -2x -½ -½-2z = -2x ½ ½-2z

****These vectors appear TWICE each and, therefore get a weight of 2.

------

There are Harker LINES (TWO of the values are EITHER a 0 or ½): 0 ½-2y ½

which appears a second time in the matrix as 0 -½-2y -½ = 0 ½-2y ½ (the same vector after a lattice translation).

Also: 0 ½+2y ½ (the centro counterpart of 0 ½-2y ½)

which appears a second time as 0 -½+2y -½ = 0 ½+2y ½

****These vectors appear TWICE each and, therefore get a weight of 2.

Summary: since the Patterson is centrosymmetric, only need ½ of the matrix to get the necessary information. Therefore, you get the following:

Type                    Vector            Patterson position    Weight
General interactions:   2x 2y 2z          U V W                 1
                        2x -2y 2z         U V W                 1
Harker Planes:          2x ½ ½+2z         U ½ W                 2
                        -2x ½ ½-2z        U ½ W                 2
Harker Lines:           0 ½+2y ½          0 V ½                 2
                        0 ½-2y ½          0 V ½                 2
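The weights in this summary can be checked by brute force: generate the four equivalent positions, form all 16 difference vectors, bring each back into the cell, and count how often each one occurs. This is a sketch only; the test coordinates (0.11, 0.23, 0.37) are arbitrary values chosen here so that no accidental coincidences occur, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction as Fr

HALF = Fr(1, 2)


def p21c_positions(x, y, z):
    """The four equivalent positions of P21/c as listed in the notes."""
    return [
        (x, y, z),
        (-x, -y, -z),
        (-x, HALF + y, HALF - z),
        (x, HALF - y, HALF + z),
    ]


def difference_vectors(positions):
    """Count every vector r_j - r_i, reduced into the range [0, 1)."""
    counts = Counter()
    for r_i in positions:
        for r_j in positions:
            counts[tuple((b - a) % 1 for a, b in zip(r_i, r_j))] += 1
    return counts


# Arbitrary general position (assumption, for illustration only).
x, y, z = Fr("0.11"), Fr("0.23"), Fr("0.37")
counts = difference_vectors(p21c_positions(x, y, z))

for vector, weight in sorted(counts.items(), key=lambda item: -item[1]):
    print([float(c) for c in vector], "weight", weight)
```

The origin collects the 4 null vectors, the Harker plane and Harker line vectors each turn up twice, and the general vectors turn up once, which accounts for all 16 entries of the matrix.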

Assume that a structure has 2 atoms in the asymmetric unit: 1 C and 1 Cl:

General interaction: C-C vector at U V W: 6 x 6 x 1 (weight) = 36 electrons²/Å³

Cl-Cl vector at U V W: 17 x 17 x 1 (weight) = 289 electrons²/Å³

Harker Plane: C-C vector at U ½ W: 6 x 6 x 2 (weight) = 72 electrons²/Å³

Cl-Cl vector at U ½ W: 17 x 17 x 2 (weight) = 578 electrons²/Å³

Harker Line: C-C vector at 0 V ½: 6 x 6 x 2 (weight) = 72 electrons²/Å³

Cl-Cl vector at 0 V ½: 17 x 17 x 2 (weight) = 578 electrons²/Å³

Origin Peak (null vectors): (4 peaks x 6 x 6) + (4 peaks x 17 x 17) = 1300 electrons²/Å³
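The arithmetic above is simply (atomic number) x (atomic number) x (weight). A short Python sketch, with a hypothetical helper `peak_height`, reproduces the numbers for the C and Cl example:

```python
# Atomic numbers of the two atoms in the asymmetric unit.
Z = {"C": 6, "Cl": 17}

# Weights from the Harker analysis of P21/c.
WEIGHTS = {"general": 1, "Harker plane": 2, "Harker line": 2}


def peak_height(z_1, z_2, weight):
    """Approximate Patterson peak height as Z1 * Z2 * weight."""
    return z_1 * z_2 * weight


for kind, weight in WEIGHTS.items():
    for element, z in Z.items():
        height = peak_height(z, z, weight)
        print(f"{kind:13s} {element}-{element}: {height}")

# Origin peak: every one of the 4 equivalent atoms of each type
# contributes its own null vector.
origin_height = sum(4 * z * z for z in Z.values())
print("origin peak:", origin_height)
```

The Cl-Cl Harker peaks (578) stand far above the C-C peaks (36 to 72), which is what makes the heavy atom easy to pick out.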

Heavy Atom method (50-70% of all structures solved this way until about 1970-1972).

The HA in real space has a large Z (atomic #), therefore large # of electrons. This leads to large distinguishable peaks in Patterson space. This then leads to the coordinates of HA in real space via Harker analysis (IF there are no great complications).

Outline the method: Ex.: structure contains 1 HA and 10 light atoms LA in the asymmetric unit.

1- use F²hkl (obs) to calculate the Patterson function:

$$P(U,V,W) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l} |F_{hkl}|^{2} \exp[-2\pi i(hU + kV + lW)]$$

2-use Patterson map and Harker analysis to find large peaks in UVW to give xHA,

yHA, zHA in real space.

3- use xHA, yHA, zHA to calculate approximate phases for ALL the reflections in a

structure factor calculation:

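$$F_{hkl}(calc) = \sum_{i=1}^{N} f_i \exp[2\pi i(hx_i + ky_i + lz_i)]$$

(N = 11 atoms in this example: 1 HA + 10 LA), but break it up this way:

$$F_{hkl}(calc) = \sum_{i=1}^{1} f_{HA} \exp[2\pi i(hx_{HA} + ky_{HA} + lz_{HA})] + \sum_{i=2}^{N} f_{LA} \exp[2\pi i(hx_{LA} + ky_{LA} + lz_{LA})]$$

AND, ignore all the LA terms at this point!!

To yield:

$$F_{hkl}(calc) = f_{HA}[\cos 2\pi(hx_{HA} + ky_{HA} + lz_{HA}) + i \sin 2\pi(hx_{HA} + ky_{HA} + lz_{HA})]$$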

If CENTROsymmetric space group, then Bhkl = 0 and can calculate Ahkl,

and α = 0 or π for each reflection from the HA contribution alone. This will be INCORRECT, since we have disregarded the LA. However, it is better than having no structure at all!!

If NONCENTROsymmetric space group, then calculate Ahkl, Bhkl, and αhkl all from

the HA contribution alone. Again, this will be INCORRECT, since we have disregarded the LA. However, it is better than having no structure at all!!
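A heavy-atom-only structure factor calculation can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the heavy atom coordinates (0.2, 0.1, 0.3), the centrosymmetric pair at ±(x, y, z), and a constant scattering factor of 53 electrons (no fall-off with angle, no temperature factor).

```python
import cmath
import math


def structure_factor(hkl, atoms):
    """F_hkl(calc) = sum_i f_i * exp[2 pi i (h x_i + k y_i + l z_i)].

    atoms is a list of (f, (x, y, z)); f is treated as a constant here.
    """
    h, k, l = hkl
    return sum(
        f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
        for f, (x, y, z) in atoms
    )


def sign_phase(f_calc):
    """For a centrosymmetric structure the phase is 0 (F > 0) or pi (F < 0)."""
    return "0" if f_calc.real >= 0 else "pi"


# Hypothetical heavy atom and its inversion-related partner.
ha_pair = [(53.0, (0.2, 0.1, 0.3)), (53.0, (-0.2, -0.1, -0.3))]

for hkl in [(1, 0, 0), (2, 0, 0), (1, 1, 1)]:
    f_calc = structure_factor(hkl, ha_pair)
    print(hkl, f"A = {f_calc.real:+.1f}", f"B = {f_calc.imag:+.1f}",
          "phase =", sign_phase(f_calc))
```

Because the two atoms are related by a center of symmetry at the origin, the imaginary part B cancels and only the sign of A is left to determine. For a noncentrosymmetric arrangement the same function returns a general complex number, and the phase would be taken from `cmath.phase`.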

4- Use the calculated αhkl and the observed │Fhkl│(obs) to synthesize an electron density (Fourier) map.

$$\rho(X,Y,Z) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l} |F_{hkl}(obs)| \, \exp[-2\pi i(hX + kY + lZ - \alpha'_{hkl})]$$

where α’hkl = αhkl/2π is the phase of each reflection based on the position of the HA alone.

ρX,Y,Z will show the positions of some of the light atoms, since it contains more

information than just the HA coordinates and phase.

5- Examine ρX,Y,Z map for light atom (LA) positions; typically, they occur at about 1/3 their expected height. Suppose that we find 3 LA in our example of HL10 (1 HA + 10 LA), and that they make chemical sense.

6- Calculate improved phases now using HA and 3 LA now in the calculations of Ahkl

and αhkl (and Bhkl in the non-centrosymmetric case):

$$F_{hkl}(calc) = \sum_{i=1}^{1} f_{HA} \exp[2\pi i(hx_{HA} + ky_{HA} + lz_{HA})] + \sum_{i=2}^{4} f_{LA} \exp[2\pi i(hx_{LA} + ky_{LA} + lz_{LA})]$$

7- Calculate new ρX,Y,Z map based on the new Fhkl(calc) to give improved αhkl combined

with Fhkl(obs). Keep iterating this process until you have all the atoms = complete

structure.

-For CENTROsymmetric crystals, SOME phases will change sign during this process.

-For NONCENTROsymmetric crystals, ALL the phases will change somewhat, and will approach their correct values, if real atoms have been found.

How does one monitor this iteration process to determine if the atoms are added at the correct positions?

1-atom should make chemical sense (bond distances and angles).

2-R should get lower as good atoms are added, and higher when bad atoms are added.

3-Peak heights of the added atoms should be proper on electron density map: if peak is too small, then the atom may be false.

When all the atoms have been located, then we have what is known as a trial structure.

Choice of HA: how large should HA be for this type of iteration to be successful? The heavier the HA, the better the initial phases, but the poorer will be the coordinates of the light atoms.

If HA is too light, iteration may not converge.

Rule of thumb:

$$\sum_{i} Z_i^{2}(HA) = \sum_{i} Z_i^{2}(LA)$$

A couple of examples:

1) Vitamin B12: Σ Z²(HA) / Σ Z²(LA) = 0.17 / 1

This structure required MUCH work to find the LA; HA atom not quite heavy enough to phase the structure properly.

2) N-methylpyridinium iodide: Σ Z²(HA) / Σ Z²(LA) = (53)² / 273 = 10.3 / 1

Found the iodide easily. Had poor LA coordinates. HA too heavy and therefore dominated the phases extensively. Also, since the space group was NONCENTROSYMMETRIC, the structure had to be separated from its mirror image.
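The ratio in the second example can be reproduced with a few lines of Python. The light-atom composition C6H8N used below is an assumption made here for the N-methylpyridinium cation (it does give the 273 quoted above); the helper name `z2_sum` is likewise just a label.

```python
# Atomic numbers needed for the example.
Z = {"H": 1, "C": 6, "N": 7, "I": 53}


def z2_sum(composition):
    """Sum of Z^2 over all atoms in a {element: count} composition."""
    return sum(count * Z[element] ** 2 for element, count in composition.items())


heavy = {"I": 1}                  # the iodide
light = {"C": 6, "H": 8, "N": 1}  # assumed cation composition C6H8N

ratio = z2_sum(heavy) / z2_sum(light)
print(f"sum Z^2(HA) = {z2_sum(heavy)}, sum Z^2(LA) = {z2_sum(light)}")
print(f"ratio = {ratio:.1f} / 1")
```

A ratio near 1 is the target of the rule of thumb; the value of about 10 here shows why the iodide dominated the phases.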

Practical considerations in locating the HA:

Consider several space groups with 1 HA/ asymmetric unit:

Ex. 1: P1, Triclinic, NONCENTRO, no symmetry:

Where is the HA?

            x y z
x y z       0

Therefore, no vectors between HA if only 1/cell.

Assume that HA is at 0,0,0 --- this fixes the origin and chooses the lattice points.

What will the calculated phases be with only the HA at 0,0,0? αhkl(calc) = 0°

$$F_{hkl}(calc) = \sum_{i=1}^{1} f_{HA} \exp[2\pi i(h \cdot 0 + k \cdot 0 + l \cdot 0)] + \sum_{i=2}^{4} f_{LA} \exp[2\pi i(hx_{LA} + ky_{LA} + lz_{LA})]$$

With the LA terms ignored, as before, this reduces to

$$F_{hkl}(calc) = \sum_{i=1}^{1} f_{HA}$$

Therefore, Fhkl(calc) are real numbers with αhkl(calc) = 0°

Claim: Electron density (Fourier) map, calculated from HA phases (αhkl(calc) = 0°), appears to have centers of symmetry (1̄).

$$\rho(X,Y,Z) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l} |F_{hkl}| \, \exp\{-2\pi i[hX + kY + lZ - \alpha'_{hkl}(calc)]\}$$

Since ρX,Y,Z phased with only 1 HA looks as if it is centrosymmetric, it shows peaks from the true structure AND, superimposed on this, peaks from its mirror image.

How can you tell which is the correct structure or the correct enantiomorph?

With a great deal of difficulty!! Can only resolve this problem from electron density maps, ρ, and NOT from Δρ maps.

****Optically active materials always crystallize in NONCENTRO space groups & racemic mixtures usually crystallize in CENTRO space groups.

______

Ex. 2: P1̄, Triclinic, CENTRO:

Where is the HA?

              x y z           -x -y -z
x y z         0               -2x -2y -2z
-x -y -z      2x 2y 2z        0

If there are two HA/cell, then there is 1 HA/asymmetric unit. The Patterson will have large peaks at U = 2xHA, V = 2yHA, W = 2zHA.

Therefore, find xHA, yHA, zHA

Since the space group is P1̄, the molecule is identical with its mirror image and therefore, we expect no difficulties in solving the structure.

------

Ex. 3: P21, Monoclinic, NONCENTRO: 2HA/cell = 1HA/asymmetric unit

Where is the HA?

              x y z           -x ½+y -z
x y z         0               -2x ½ -2z
-x ½+y -z     2x ½ 2z         0

In the Patterson, find big vector at 2x ½ 2z = U ½ W. Therefore, find xHA and zHA

What is yHA? Must choose some value for yHA. Free choice: therefore, let yHA = 0, which fixes the origin of the cell.

In real space, 2 atoms are at: xHA 0 zHA and at -xHA ½ -zHA (found from the Patterson).

These two positions are centrosymmetrically related through the point (0 ¼ 0), halfway between them [remember that this is a NONCENTRO system] and so, again, we will get the “true” structure and its mirror image in the initial electron density map.

------

Ex. 4: P21/c, Monoclinic, CENTRO: x,y,z; -x,-y,-z; -x, ½+y, ½-z; x, ½-y, ½+z

The Patterson unit cell showed the following peaks arising from 1 HA/asymmetric unit:

Patterson coordinates

Peak     U       V       W       Height
1)      -0.4    -0.6    -0.8     105
2)       0.4     0.6     0.8     102
3)       0.4    -0.6     0.8     100
4)      -0.4     0.6    -0.8      98
5)       0.0     0.1     0.5     206
6)       0.0    -0.1     0.5     202
7)       0.4     0.5     0.3     198
8)      -0.4     0.5    -0.3     200

Positions 1&2, 3&4, 5&6, 7&8 are related to each other by 1̄.

Harker analysis: General positions: 2x 2y 2z weight = 1

2x -2y 2z weight = 1

Harker Plane: 2x ½ ½+2z weight = 2

-2x ½ ½-2z weight = 2

Harker Line: 0 ½+2y ½ weight = 2

0 ½-2y ½ weight = 2

Therefore, use Harker Plane to find xHA and zHA: 2x ½ ½+2z = 0.4 0.5 0.3

Choice 1: 2xHA = 0.4, therefore xHA = 0.2
and ½+2z = 0.3, therefore zHA = -0.1

or Choice 2: -2xHA = 0.4, therefore xHA = -0.2
and ½-2z = 0.3, therefore zHA = 0.1

Use Harker Line to find yHA coordinate: 0 ½+2y ½ = 0.0 0.1 0.5

Choice 1: ½+2y = 0.1, therefore yHA = -0.2

or Choice 2: ½-2y = 0.1, therefore yHA = 0.2

There are 4 possible positions for HA: x y z

1 0.2 -0.2 -0.1

2 0.2 0.2 -0.1

3 -0.2 -0.2 0.1

4 -0.2 0.2 0.1

Numbers 1 & 4 are related by center of symmetry and so are numbers 2 & 3. Therefore, only 1 & 2 are unique. However, these are related by a mirror perpendicular to b at y = 0. These will give enantiomorphic structures, so we can choose either and be correct. Since the space group is CENTRO, it makes no difference which one we choose; we will get the same structure. However, had the space group been NONCENTRO, then you would get a different structure (depending on which choice you made!!).

Therefore, the single HA atom in P21/c is located at:

x, y, z               -x, -y, -z           -x, ½+y, ½-z          x, ½-y, ½+z

0.2, -0.2, -0.1       -0.2, 0.2, 0.1       -0.2, 0.3, 0.6        0.2, 0.7, 0.4

through the 4 symmetry elements.
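One way to gain confidence in a heavy atom position is to work forwards again: generate the four equivalent positions, form every interatomic vector, and compare with the observed Patterson peaks. The Python sketch below does this for the solution (0.2, -0.2, -0.1) and the eight peaks of this example. The function names are hypothetical, and all vectors are reduced into the range 0 to 1 before comparison (so -0.4 is treated as 0.6).

```python
from collections import Counter
from fractions import Fraction as Fr

HALF = Fr(1, 2)


def p21c_positions(x, y, z):
    """The four equivalent positions of P21/c as listed in the notes."""
    return [
        (x, y, z),
        (-x, -y, -z),
        (-x, HALF + y, HALF - z),
        (x, HALF - y, HALF + z),
    ]


def wrap(vector):
    """Reduce each component into the range [0, 1)."""
    return tuple(c % 1 for c in vector)


# Heavy atom position found from the Harker plane and Harker line.
ha = (Fr("0.2"), Fr("-0.2"), Fr("-0.1"))
positions = p21c_positions(*ha)

# All non-origin interatomic vectors, counted to give their weights.
predicted = Counter(
    wrap(tuple(b - a for a, b in zip(r_i, r_j)))
    for i, r_i in enumerate(positions)
    for j, r_j in enumerate(positions)
    if i != j
)

# Observed Patterson peaks 1) to 8) from the table in the notes.
observed = [
    ("-0.4", "-0.6", "-0.8"), ("0.4", "0.6", "0.8"),
    ("0.4", "-0.6", "0.8"), ("-0.4", "0.6", "-0.8"),
    ("0.0", "0.1", "0.5"), ("0.0", "-0.1", "0.5"),
    ("0.4", "0.5", "0.3"), ("-0.4", "0.5", "-0.3"),
]
observed_wrapped = {wrap(tuple(Fr(c) for c in peak)) for peak in observed}

print("all peaks explained:", set(predicted) == observed_wrapped)
for vector, weight in predicted.items():
    print([float(c) for c in vector], "weight", weight)
```

The predicted vectors with weight 2 are the Harker peaks 5) to 8), whose observed heights (about 200) are roughly twice those of the general peaks 1) to 4) (about 100).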

RAL080309