Exploring Semiconductors on the Nanometer Scale:
The Development of X-Ray Reflectivity Analysis Tools
Christopher Bishop Payne
Princeton University
United States Department of Energy
Office of Science, Science Undergraduate Laboratory Internship Program
SLAC National Accelerator Laboratory
Menlo Park, California
July 20th, 2011
Prepared in partial fulfillment of the requirements of the Office of Science, Department of Energy’s Science Undergraduate Laboratory Internship under the direction of Apurva Mehta and Matt Bibee in the SSRL division of SLAC National Accelerator Laboratory.
Participant:______
Signature
Research Advisors:______
Signature
Note to draft reader:
Thanks for taking the time to read my paper. To make your life a little easier, I would like to point out a few areas of the paper that I know are lacking or still in development:
1) The MATLAB figures are hard to read in some cases. Something happened when they were imported into Word, so I will fix these.
2) The mathematical connection between oscillations and thickness is not easily made in the theory section. I am working with my advisor to clarify this (the transition from eq. 10 to eq. 11).
3) Not all the figures are labeled, or labeled neatly. I will remedy this.
4) In the results section, I do not discuss how altering the range affects either algorithm. This will be inserted in the final draft.
5) The Conclusion/Discussion section is being overhauled, as it is greatly lacking. I am preparing a separate presentation for GE (separate from my talk on Thursday) telling them what I found in their data, and I am going to use parts of this presentation to connect the development of 2N back to the initial problem GE had.
Thanks,
Chris
I. Introduction
The next generation of electronic devices that will power an increasingly technology-dependent world will push the boundaries of our ability to manufacture semiconductor chips that are both smaller and composed of new materials. To become smaller, semiconductor chips, which are at the heart of all electronics, will need to be manufactured with precision on the order of a few nanometers in order to keep pace with Moore’s Law. This law, a fundamental trend in semiconductor manufacturing, corresponds to electronics that are cheaper and have higher processing power per area than their predecessors. Additionally, nearly all semiconductors are now made of silicon; however, the use of new semiconductor materials could yield electronics that are more efficient or that can operate under greater extremes.
In order to unlock the favorable attributes of these next generation semiconductors, we need to be able to characterize and analyze materials with a resolution of a few nanometers. Many different techniques have been developed to perform this analysis, but the one we will focus on is called X-Ray Reflectivity (XRR). This technique quantifies parameters of materials by revealing the number of layers the material is composed of and each layer’s corresponding thickness, density, and roughness value.[i] We will discuss these parameters in more detail in the theory section. For now, it is sufficient to understand that XRR allows the user to understand the physical structure of a material on the nanometer scale. It should also be noted that, unlike some other techniques, XRR does not destroy the sample, nor does it require specially prepared samples as Transmission Electron Microscopy does.[ii]
Our collaborators on this project, GE, want to realize the potential of these next generation semiconductors by developing semiconductors made from silicon carbide (SiC). GE believes SiC semiconductor devices would be much more energy efficient than those in use today and would thus make a wide range of applications, from wind turbines to hybrid-electric vehicles, more efficient.[iii] Before society can benefit from these favorable SiC semiconductor attributes, GE needs to understand more about the structure of SiC semiconductors. To do this, GE has been working with Apurva Mehta and Matthew Bibee of SLAC National Accelerator Laboratory, who have been taking XRR measurements in order to help answer GE’s questions about the structure of SiC semiconductors.
I was specifically tasked with developing tools to analyze the XRR data that my colleagues collected so that we could answer GE with a higher level of confidence, a crucial step on the path to developing more efficient electronic devices in a world that urgently seeks them.
II. Materials and Methods
i. Reflectivity Theory
In order to fully appreciate and understand the XRR data analysis tools I developed during this project, we must first establish the theory behind the XRR measurements that were taken on Beam Line 2-1 at SSRL to provide the XRR data to analyze. XRR is a well-established technique in which a sample is illuminated by an x-ray beam and the reflectivity of the sample is measured with respect to theta (see Figure 2), with theta typically ranging from zero to eight degrees.
When the sample is struck in this low theta range, the corresponding reflectivity signal contains information about the electron density of the surface of the sample.[iv] Gradients in this electron density are correlated with different layers of material existing on the surface of the sample, and thus we are able to detect and characterize surface layers on the order of 1 nm.
To understand how this layer information is encoded in the reflectivity versus theta data, we turn to Reflectivity Theory, a model I will now discuss in brief. In Figure 3, we model a thin layer of thickness d on a substrate:
According to Bragg’s Law, the extra distance traveled by the beam that reflects off the substrate is:
\[ \Delta = 2 d \sin\theta \tag{1} \]
This extra distance corresponds to a phase difference at the point where the reflected beams interfere at the detector, dependent on the wavelength (λ) of the incident beam:
\[ \phi = \frac{2\pi \Delta}{\lambda} = \frac{4\pi d \sin\theta}{\lambda} \tag{2} \]
As noted in Figure 3, this phase difference causes the relationship between the two waves to be:
(3)
(4)
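The coefficient which modulates the amplitude of both waves is defined to be dependent on q:
\[ q = \frac{4\pi \sin\theta}{\lambda} \tag{5} \]
\[ \sin\theta \approx \theta \quad \text{(small } \theta \text{, in radians)} \tag{6} \]
It is important to note that because of (6) and the fact that λ is constant, q is approximately linearly dependent on θ. With this in mind, we can now rewrite our equations to be: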
When these two waves interfere far away at the detector, the detector records the summation of these waves:
(9)
(10)
Accounting for all the reflections occurring on the sample, the detector ultimately records an intensity of the following nature, where σ can be thought of as just a constant:
(11)[v]
The two important points to understand from this result are that the intensity oscillates with respect to q and that these oscillations decay exponentially with q. These oscillations are important because their characteristics contain information, such as thickness, about each layer on the substrate.
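The link between the oscillation period and the layer thickness can be sketched as follows. This is a simplified two-wave illustration, assuming both reflected waves share the same amplitude A(q) and ignoring refraction, roughness, and multiple reflections; it is not the full expression referred to as (11).

```latex
% Phase difference between the two reflected waves, written in terms of q
\phi = \frac{2\pi}{\lambda}\,(2 d \sin\theta) = q\,d,
\qquad q = \frac{4\pi \sin\theta}{\lambda}

% Intensity of the sum of two equal-amplitude waves with that phase difference
I(q) \propto \left| A(q) + A(q)\,e^{i q d} \right|^{2}
     = 2\,|A(q)|^{2}\,\bigl[\,1 + \cos(q d)\,\bigr]

% The cosine repeats every time q d grows by 2\pi, so the fringe spacing gives d
\Delta q = \frac{2\pi}{d}
\quad\Longrightarrow\quad
d = \frac{2\pi}{\Delta q}
```

For example, under these assumptions a 20 nm layer would produce fringes spaced by Δq = 2π / (20 nm) ≈ 0.314 nm^-1, while a 2 nm layer would produce much slower fringes spaced by about 3.14 nm^-1.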
ii. Analysis Tool Development Outline
The challenge that I was tasked with solving was taking the XRR data and extracting the layer information encoded within the intensity versus theta data. This extraction is challenging because the oscillations occur over a range of many decades of intensity, which both distorts the oscillations and exacerbates the effects of experimental noise.
To meet this task, I used MATLAB to create and test algorithms that would extract the information-rich oscillations from the intensity data. A secondary algorithm would then take the Fourier transform of the extracted oscillations, a procedure that quantitatively reveals the different frequency components present in the extracted oscillations. These frequencies mathematically correspond to the thicknesses of layers present within the sample. Other data, such as the roughness or density of the layers, is thought to be contained within the amplitude of the oscillations in Fourier space; however, this is only a hypothesis at this point in time.
In order to test the accuracy of my MATLAB algorithm, I used a MATLAB program called Multig, developed by Anneli Munkholm and Sean M. Brennan. This program took as input the parameters I was trying to extract from the oscillation data and output a simulated intensity versus theta curve. I would then apply my algorithm to the simulated data and see if I could determine the same parameters I had created the curve with. I used this simple process to develop and tune my algorithm until I could apply it to a simulated data set and return the number of layers present in the simulation and their corresponding thicknesses. As I will discuss, it is not currently understood how other parameters such as density can be extracted from the intensity versus theta curves. This entire process is outlined in the four steps below:
1. Data Input: Simulate a simple 2 nm layer of Al between a 20 nm layer of silicon oxide (SiO2) and a SiC substrate, with no roughness.
2. Data Analysis: Convert the intensity versus theta data into log(intensity) versus theta and apply an algorithm that quantifies how much the oscillations in the data are being distorted at each point. Additionally, the algorithm removes any data below a specified theta, 0.18 degrees in this case.
3. Distortion Removal: Remove this distortion (plotted in green) from the curve (plotted in red) and convert the oscillations back to real intensity space (non-log). We can now visually see the oscillations that were present in the original intensity versus theta curve, plotted in blue.
4. Fourier Space Analysis: Lastly, we take the Fourier transform of the extracted oscillations in order to quantitatively determine the number of layers and their corresponding thicknesses. In this case, the final plot is interpreted to indicate that two layers are present on a substrate. Peak four indicates a layer of 2.01 nm and peak three indicates a layer of 20.1 nm. Peak two is simply the sum of these two layers and is not relevant to our present discussion.
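The Data Input and first Data Analysis steps above can be sketched in Python as follows. This is only an illustration: `toy_reflectivity` is a hypothetical, crude stand-in for a simulated curve (a q^-4 decay multiplied by cosine fringes) and is not Multig; the 0.154 nm wavelength and the 0.3 fringe amplitude are assumed values chosen for the example.

```python
import math

def toy_reflectivity(thetas_deg, thicknesses_nm, wavelength_nm=0.154):
    """Crude stand-in for a simulated curve: q**-4 decay times thickness fringes.

    Stays positive only while 0.3 * len(thicknesses_nm) < 1.
    """
    intensities = []
    for theta in thetas_deg:
        q = 4.0 * math.pi * math.sin(math.radians(theta)) / wavelength_nm
        fringes = 1.0 + sum(0.3 * math.cos(q * d) for d in thicknesses_nm)
        intensities.append(fringes / q ** 4)
    return intensities

def clip_and_log(thetas_deg, intensities, theta_low_clip):
    """Drop points below theta_low_clip and return (theta, log10(intensity))."""
    kept = [(t, math.log10(i)) for t, i in zip(thetas_deg, intensities)
            if t >= theta_low_clip]
    return [t for t, _ in kept], [v for _, v in kept]

thetas = [i / 100.0 for i in range(5, 801)]  # 0.05 to 8.00 degrees
raw = toy_reflectivity(thetas, [2.0, 20.0])  # 2 nm and 20 nm fringes
clipped_thetas, log_intensity = clip_and_log(thetas, raw, theta_low_clip=0.18)

# How many decades of intensity the clipped curve covers
decades_spanned = max(log_intensity) - min(log_intensity)
```

Even this toy curve spans several decades of intensity over the clipped theta range, which is why the oscillations are hard to see until the smooth decay is removed.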
To perform the above four-step process, I developed two separate algorithms and tested them on a wide range of simulations in order to ascertain which most accurately recovered the parameters used to create the simulated data. I will first discuss the dN algorithm and then the 2N algorithm, focusing on how they implement the four steps outlined above. It should be noted that the Data Input step is the same for each method, as both operate on the same intensity versus theta data sets.
iii. The dN Algorithm
This algorithm operates on an intensity versus theta data set using four input parameters: theta_low_clip, theta_high_clip, xray_energy, N_range. During the Data Analysis step, the algorithm restricts the domain of the data to between theta_low_clip and theta_high_clip. This is necessary to remove the low theta region that lacks pertinent oscillations, and it is sometimes used to remove high theta data points, which usually contain experimental noise because the corresponding intensities are so low. The second part of the Data Analysis step is to take an averaged derivative of the intensity versus theta signal as described in Enhanced Fourier Transforms for X-Ray Scattering Applications[vi]. The discrete version of this, as we are dealing with a finite data set, is described as:
Expressed in words, the algorithm takes the jth intensity and theta data point and replaces it with the ‘local’ average derivative of the intensity. The ‘local’ area is defined by N and is centered on the jth data point.
After applying this transform, the oscillations are immediately extracted, and thus no separate Distortion Removal step is needed.
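A minimal Python sketch of a local averaged derivative is given below. It is one possible discretization, assumed here for illustration: it averages the symmetric difference quotients over the N nearest neighbors on each side of the jth point. It is not necessarily the exact formula used in the cited reference, and the function name is hypothetical.

```python
def averaged_derivative(x, y, n_range):
    """Replace each interior point with the mean of its symmetric slopes.

    For point j, the slope across (j - k, j + k) is computed for
    k = 1..n_range and the slopes are averaged. The first and last
    n_range points are dropped because their windows are incomplete.
    """
    out_x, out_y = [], []
    for j in range(n_range, len(x) - n_range):
        slopes = [
            (y[j + k] - y[j - k]) / (x[j + k] - x[j - k])
            for k in range(1, n_range + 1)
        ]
        out_x.append(x[j])
        out_y.append(sum(slopes) / n_range)
    return out_x, out_y

# Example: a straight line y = 3x + 2 has slope 3 at every point
xs = [0.1 * k for k in range(30)]
ys = [3.0 * v + 2.0 for v in xs]
dx, dy = averaged_derivative(xs, ys, n_range=4)
```

A larger n_range smooths more noise but also attenuates fast oscillations, so it trades noise rejection against sensitivity to thick layers.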
Lastly, Fourier Space Analysis is performed by first converting the theta values into q values using the function described in (5). Note that these new q values have units of m^-1, as they are a function of the wavelength of the x-rays used. The standard MATLAB Fast Fourier Transform function is then applied to the intensity versus q data set. The coefficients that result from this transform are then squared in order to plot amplitude versus thickness (now in m). The maxima of this plot then indicate the presence of the thickness values they correspond to.
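The Fourier Space Analysis step can be sketched in Python as below. Several things here are assumptions made for the illustration: the function names are hypothetical, the x-ray energy is taken to be in keV, lengths are in nm rather than m, and a direct (slow) Fourier sum over a chosen thickness grid is swapped in for the Fast Fourier Transform so that the example needs no extra libraries and tolerates a q grid that is not perfectly uniform.

```python
import cmath
import math

HC_KEV_NM = 1.23984198  # Planck constant times speed of light, in keV*nm

def energy_to_wavelength_nm(xray_energy_kev):
    """Convert photon energy (keV) to wavelength (nm)."""
    return HC_KEV_NM / xray_energy_kev

def theta_to_q(theta_deg, wavelength_nm):
    """q = 4*pi*sin(theta)/lambda, in nm^-1."""
    return 4.0 * math.pi * math.sin(math.radians(theta_deg)) / wavelength_nm

def thickness_spectrum(q_values, signal, thicknesses_nm):
    """Squared magnitude of the Fourier sum of signal(q) at each trial thickness."""
    spectrum = []
    for d in thicknesses_nm:
        coeff = sum(s * cmath.exp(-1j * q * d) for q, s in zip(q_values, signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

# Synthetic check: a pure cos(q * 20 nm) oscillation sampled from 0.18 to 8 degrees
wavelength = energy_to_wavelength_nm(8.05)  # assumed energy, for illustration only
q_values = [theta_to_q(i / 100.0, wavelength) for i in range(18, 801)]
signal = [math.cos(q * 20.0) for q in q_values]
thicknesses = [k / 10.0 for k in range(10, 401)]  # trial thicknesses, 1.0 to 40.0 nm
spectrum = thickness_spectrum(q_values, signal, thicknesses)
best_thickness = thicknesses[spectrum.index(max(spectrum))]
```

The thickness at which the spectrum is largest is read off as the layer thickness, mirroring the way the maxima of the plot are interpreted in the text.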
iv. The 2N Algorithm
This algorithm also operates on an intensity versus theta data set using the four input parameters: theta_low_clip, theta_high_clip, xray_energy, N_range. It also narrows the domain of the data using the clip arguments in the first part of the Data Analysis step. The second part of the Data Analysis step is simply to take the local average of the intensity versus theta data and record it as the distortion at that given theta value, as alluded to in Enhanced Fourier Transforms for X-Ray Scattering Applications[vii]:
The method is called ‘2N’ because the local average includes the jth point plus the N closest data points at lower theta and the N closest data points at higher theta than that of the jth point.
After quantifying the distortion value at each point, we now subtract these distortions from the corresponding original intensity values in the Distortion Removal step:
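\[ O_j = \log(I_j) - D_j \quad \text{for all } j \tag{14} \]
Here D_j denotes the distortion (the local average) at the jth point and O_j denotes the extracted oscillation value.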
With the distortion removed, we can now visually see the oscillations that were contained inside the raw data. The last part of the Distortion Removal step is to take the anti-log of the oscillation values, which converts the scale of the oscillations from log(intensity) back into normal intensity. This ‘decompression’ serves to amplify the oscillation signals, providing a more robust Fourier transform result.
The final Fourier Space Analysis step is identical to that used in the dN method.
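The local-average and Distortion Removal steps of the 2N method described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the input is assumed to already be log10(intensity) on a clipped theta range, the anti-log is taken in base 10, and points without a full window of N neighbors on each side are simply dropped.

```python
import math

def two_n_extract(log_intensity, n_range):
    """Subtract the (2N + 1)-point local average, then take the anti-log.

    Returns one value for each index j in [n_range, len - n_range).
    """
    oscillations = []
    for j in range(n_range, len(log_intensity) - n_range):
        window = log_intensity[j - n_range: j + n_range + 1]
        distortion = sum(window) / len(window)   # local average at point j
        residual = log_intensity[j] - distortion # oscillation in log space
        oscillations.append(10.0 ** residual)    # back to linear intensity
    return oscillations

# Example: a smooth decay with a small ripple added in log space
log_curve = [-0.05 * k + 0.1 * math.sin(0.8 * k) for k in range(200)]
extracted = two_n_extract(log_curve, n_range=13)
```

Because the window is symmetric, a purely linear trend in log space averages to its own center value, so a curve with no oscillations maps to a constant 1 and only the ripple survives.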
III. Results
I will now discuss the results from testing the above algorithms against Multig simulations, as well as how sensitive the accuracy of the algorithms is to parameters such as theta_low_clip, theta_high_clip, and N_range.
i. dN Results Under Normal Conditions
Under favorable conditions, such as the simulation of the 2 nm of Al and 20 nm of SiO2 discussed earlier, dN successfully extracted the number of layers present in the model and their corresponding thicknesses. It should be noted that the oscillations determined by the derivative are out of phase with the original oscillations, as theta values that corresponded to peaks in the original oscillations now correspond to zeroes of the derivative. This is simply a natural result of applying the derivative to a function. As seen in Figure Four below, the oscillations are extracted very cleanly from the original data set in the mathematically sound yet compact step of taking the derivative:
In terms of the accuracy of the Fourier Space Analysis step associated with the dN method, we see only two peaks indicating the presence of the two layers. Peak two rightly indicates the presence of the 20 nm SiO2 layer, while the difference between peaks one and two indirectly indicates the existence of the 2 nm Al layer. Keep in mind that we cannot determine whether the 20 nm layer is really made of Al or SiO2; we only know that it exists and what its thickness is. Additionally, using this model we cannot establish the order of the layers, such as whether the SiO2 is on top of the Al layer.
Another item to point out is the second harmonics of the signal, present at approximately 45 nm. The more pronounced these harmonics are, the less sinusoidal our extracted oscillation is and thus the worse the extraction algorithm is.
The last and most important result of running the dN method on a simple test simulation is the low frequency artifact highlighted in the orange box. This should not be there, as it indicates a strong DC (constant or very slowly varying) component in the data. Additionally, it drowns out the low frequency peak that should be present indicating the 2 nm layer.
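To illustrate what a DC component looks like in Fourier space, the short Python sketch below compares the spectrum of a synthetic oscillation with and without a constant offset. It uses made-up numbers (an offset of 5 and a single cos(20 q) fringe) and a hypothetical helper, `power_at`; it is an illustration of the general effect, not an analysis of the dN output.

```python
import cmath
import math

def power_at(q_values, signal, thickness):
    """Squared magnitude of the Fourier sum of signal(q) at one thickness."""
    coeff = sum(s * cmath.exp(-1j * q * thickness)
                for q, s in zip(q_values, signal))
    return abs(coeff) ** 2

q_values = [0.01 * k for k in range(1, 1001)]  # 0.01 to 10 nm^-1
offset_signal = [5.0 + math.cos(20.0 * q) for q in q_values]

# Remove the constant offset by subtracting the mean of the signal
mean_value = sum(offset_signal) / len(offset_signal)
centered_signal = [s - mean_value for s in offset_signal]

dc_before = power_at(q_values, offset_signal, 0.0)     # zero-thickness bin
peak_before = power_at(q_values, offset_signal, 20.0)  # 20 nm fringe
dc_after = power_at(q_values, centered_signal, 0.0)
peak_after = power_at(q_values, centered_signal, 20.0)
```

In this synthetic case the zero-thickness term dominates the 20 nm peak before the offset is removed and essentially vanishes afterwards, while the 20 nm peak is preserved.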
ii. dN Results with Simulated Roughness
Under actual experimental conditions, roughness is present on the layers of any real sample being tested, since no semiconductor can be perfectly smooth. To simulate this, we ran the same simulation as above, but with 1.5 Å of roughness added to the SiO2 layer. Using an N_range value of 9 to help average out this noise, I obtained the result in Figure Five:
Note how the roughness greatly attenuates the high frequency oscillation for theta above approximately four degrees. We do note that the highest peak corresponds to a thickness of 20.0 nm, while peak two corresponds to a thickness of 2.03 nm.
It is interesting to note that the highest peak, peak one, is no longer a summation of the two layers; rather, it is just the thickest layer. Also, the low frequency artifact present in the simulation without roughness seems to have disappeared, leaving the low frequency peak corresponding to 2.03 nm exposed. Lastly, many smaller peaks appear as a result of the extracted oscillation not being very sinusoidal, and these additional peaks could be mistaken for additional layers.
iii. 2N Results Under Normal Conditions
Under favorable conditions, such as the simulation of the 2 nm of Al and 20 nm of SiO2 discussed earlier, 2N successfully extracted the number of layers present in the model and their corresponding thicknesses. The results of using an N_range of 13 can be seen below in Figure Six:
Looking at the result of the Fourier Space Analysis, we see three distinct peaks. The first, peak four, corresponds to a thickness of 2.01 nm, while peak three corresponds to 20.1 nm. Peak two is approximately the sum of these two layers. We notice that the second harmonics are extremely small, as indicated by peak five; thus the extracted oscillations are very sinusoidal in nature. One anomaly is a peak that appears to correspond to 0 nm. The reason for its existence and its physical meaning are not understood. Additionally, as with dN, the Fourier Space Analysis cannot currently reveal the density of the layers the peaks correspond to, nor the order of the layers.