The Cross Spectum.
In normal spectral analysis the spectrum shows how energy is distributed in frequency (or wavenumber) space. This can be obtained either by taking the fourier transform of the lagged autocorrelation. The lagged autocorrelation is infact the convolution of a time series with itslef.
The autospectrum can be written as
FFT(u*u) (1)
where the star represents the convolution.
Recall the with the convolution theorem we found that the fft of the convolution was the product of the fft i.e.
FFT(u*u)= fft(u)*fft(u). (2)
Thus the power spectra can be obtained in two ways—the Blackmun Turkey approach (1) or the periodgram method (2).
The cross specturm is very similar—expect that instead of the using the auto-correlation taken from a single time series. You would take cross correlation of two different time series.
so (1) and (2) become
FFT(u*v)
and
FFT(u*u)= fft(u)*fft(v)
The Cross-correlation of two complex functions f,g is simply the convolution of those records.
This can be conceptually understood by recalling the convolution is simply the sum of the product of the two records—with one record shifting down the time axis.
Recall then that the fourier Transfrom of g*h was G(f) H(f)
Therefore one could calculate the cross-correlation by taking the inverse fft of product of the fft of G and H.
Likewise the fft of the autocorrelation of x (x*x) is equal to |X(f)|2.
Summary of standard spectral analysis approach
1 Remove mean and trend.
2) Could Pad with K zeros to increase spectral resolution (K<N). This partially reduces end effects ( second motivation to do this is to get the record to be a power of two—for faster fft—however this is typically not an issue if the data set is not to large). Padding with zeros also allows one to select a record length such at a fourier frequency will fall exactly on a major Harmonic (if one exists), it improves spectral resolution.
3) Break up the data into equal blocks of size M
4) Take FFT and average results in frequency space.
5) Rescale spectra to account for the loss of energy for given window. For the Hannign window mulitpy by 8/sqrt(3).
6) Compute the raw spectral density for the two-sided spectrum as
Gyy(0)=Sff(0)
Gyy(1,n-1)=2*Sff(1,n-2)
Gyy(n)=Sff(n)
One is the calculation of energy from the fft. We need to divide by T and multiply
Each spectral constituent by 2 except the first and the last.
Cross-Covariance Function
Cross-correlation Function
x1 and x2 are different time series—could be apples and oranges!
FOr example supplse X1 and X2 were sea-level observations at Sandy Hook and Atlantic City n The time lag of maximum correlation is the time that the signal travels down the coast the coast.
They can be complex numbers
For example if you had two vector time series—say surface currents and wind, you would make these time series into complex numbers
U=u+i*v Current
W=ue+i*un wind
Here’s an example using MATLAB to using complex numbers to get the correlation, between a scalar times series and a vector time series.
Wind_eta_Correlation.m matlab script
Wind and Sea-level at the Battery.
w=u+i*v;
c1=corrcoef(h,w);
R=real(c1(1,2));
I=imag(c1(1,2));
ang=atan2(I,R);
% Rotate wind so that real part is along axis correlated with wind
w=w*exp(-i*ang);
u=real(w);
v=imag(w);
corrcoef(h,u)
After rotation you could find the lag that produces the maximum correlation between the wind and sealevel records.
lag_corr.m matlab script
Nearly 80% of the variance in low-frequency sea-level is attributed to wind forcing.
This could probably be improved if I removed barometer effects
However it will never be 100 % because other processes effect sea-level such as that are not correlated with winds such as:
Waves (coastally trapped)
Tides (long period—these could be removed)
River discharge may impact sea-level at Battery.
Still, 80% is pretty good.
The first analysis—finding the wind direction that yields the best correlation could also be done with a lagged correlation. This might yield slightly difference
To extend this into the frequency domain—to characterize the cross-correlation as a function of frequency we estimate the cross-spectra
Time series may be better correlated in some frequencies than in others.
Following Blackman the Tukey we could take the spectra of the cross-correlation function to yield the cross spectra—
Or we could take FFT of each record obtain Y1(f), Y2(f) and the cross spectra is simply Cs=Y1* Y2
Note that since this is not the autospectra that Cs will be complex. This is important because it give us phase information on the two series—Who leads Who!
For example current velocity and temperature fluctuations can drive a net heat flux—if they are in phase. This is often referred to as an eddy heat flux. The cross-spectrum of temperature and velocity would give a frequency dependent measure of the eddy heat flux.
Coherent fluctuations that are in phase produce heat flux. IF they are in quadriture (90 degrees out of phase) there is no heat flux in that frequency band.
Show example of how advection due to oscillatory tidal motion will yield no- heat flux
But
Cross Spectra
Taking the cross spectrum is quite easy—identical to autospectra.
1) Ensure that the time series span the same time/space period with identical
2) resolution.
3) Remove the means the trends from both records
4) Determine how much block averaging you will need to do for statistical reliability.
5) Window the data
6) Compute FFT of data
7) Adjust scale factor of spectra according to windowing
8) Compute raw one-sided cross spectral density
Since the cross-spectrum is the transform pair of the covariance function the inverse Fourier transform of the cross-spectrum will recover the cross-covariance function. This is a more computationally efficient way to calculate the cross-covariance due to the robust computational efficiency of the fft.
i.e.
inv(G12)= cross-covaraines
in matlab
two time series x1 & x2.
F=fft(x1)
G=fft(x2);
CC=ifft(conj(F) G)
Two ways to quantify the real and imaginary parts of the cross-spectrum.
1) product of an amplitude function (Cross amplitude spectrum) and a phase function ( phase spectrum)
The amplitude spectrum gives the frequency dependent of co-amplitudes, while the phase indicates the relative phase between these two signals.
When the co-amplitudes are large-that means that there are covariance in that frequency band—and the phase is meaningful. However if the amplitude is small (specifically statistically insignificant) then the phase is meaningless. (Need to discuss significance levels for this).
So the amplitude is simply G*conj(G)
And the phase is
Atan2(imag(g), real(G))
Easily done in Matlab
xk(t)=Akcos(2pfot+qk) k=1,2
The Fourier transform of this is the sinc function that is shifted by eiq
(note that for zero phase shift e0 is 1)
i.e. Xk(f)=Ak/2(eiqksinc(f- fo)T+ e-iqksinc(f+ fo)T) k= 1,2
Hence the cross-power-spectra of these two time series would be
S12(f)= 1/T (X1(f) X*2(f))
For T going towards inifinity
More generally the sample cross-spectrum (one-sided) would read
So A12 is frequency dependent amplitude of the cross spectrum and theta is the frequency depended phase of the cross spectra. The phase only has meaning when the amplitude is significantly different than zero.
Here we can Show MATLAB examples of cross-spectrum for
1) Single frequency with phase
2) Multiple Frequency
3) Sea-level and Wind at the Battery.
It can also be decomposed into a co-spectrum (in phase fluctuations) and quad-spectrum (fluctuations that occur in quadrature). This is also related to the Rotary spectra that can decompose the cross-spectra of a vector time series into clockwise and counterclockwise rotating components.
Ellipse analysis, (tidal ellipse)
U=Acoswt+Bsinwt
V=Ccoswt+Dsinwt
This can be written in complex form as:
R=u+iv
R= Acoswt+Bsinwt+i (Ccoswt+Dsinwt)
R=(A+iC) cosw+(B+iD) sinwt (1)
Since we are dealing with vector that oscillates at a single frequency this can be represented in terms of a clockwise rotating vector and a counter clockwise rotating vector.
i.e.
R=R+eiwt+ R-e-iwt
Where R+ and R- are the radii of the clockwise and counter-clockwise rotating constituents.
Recall
eiwt=coswt+isinwt
so
R= R+( coswt+isinwt) + R-( coswt-isinwt)
R=( R++ R-) coswt +i( R+- R-) sinwt (2)
Solving for R+ and R- by equating equations (1) and (2) we find:
The magnitude of these are then
Since these are rotating at the same frequency but in opposite directions there will be times when they are additive (pointing in the same direction) and times when they are pointing in opposite direction and tend to cancel each other out.
These two times define the Major Axis is ( R++ R- ) and the minor axis (R+- R-)
of an ellipse.
While the orientation and phase of the ellipses:
G1=atan2(B1,A2)
G2=atan2(B2,A2)
OREN= 0.5*(G1+G2);
PHASE=0.5*(G2-G1);
Where
A1=(A+D)
B1=(C-B);
A2=(A-D);
B2=(C+B)
Back to the heat flux example – if fluctuations occur in phase (large co-spectrum amplitude) then there will be a large eddy heat flux.
In contrast if the co-spectrum is small but the quad-spectrum is large there will be little eddy heat flux at that frequency.
This of course could be induced by the amplitude and phase spectrums—but often this is a convienent way to plot it.
Here the cross Spectrum, which has a real and complex part
Coherence spectrum (Coherency or Squared Coherency)
The squared coherence represents the fraction of variance in x1 ascribable to x2 through a linear relationship between x1 and x2.
Phase estimates are generally unreliable when amplitudes fall below the 90-95% confidence intervals.
Confidence Intervals
EDOF is the independent cross-spectral realizations in each frequency band. Thomson and Emery suggest that it is equal to the number of frequency bands that you average.
For 95 % confidence level a =0.05
For EDOF=2
Requiring remarkably high coherence!!
However for EDOF =5
A limit that is more likely to be exceeded to be found in a noisy geophysical data set!
10