Correlator FITS Format for Phased Array Feed

Wednesday, November 14, 2018

Rick Fisher

Logging of GBT actions and data products is done through a well-established hierarchy of FITS files and directories that will need to be accessed to interpret the results of the PAF test measurement on the GBT. Each GBT subsystem (Antenna, Local Oscillator, Receiver, Spectrometer, etc.) has its own FITS file format, and it makes sense for the PAF to adopt a similar FITS logging and data structure. Two advantages of the GBT FITS system are that all scans are automatically self-documenting, and this documentation is clearly linked to the data. Since the GBT system is well tested for completeness, its FITS files make good templates for the PAF.

Each GBT data scan generates a new FITS file for each of the subsystems, and these files are stored in the Linux directory /home/gbtdata/program_name/subsystem, where the program_name is assigned to the observing project when the observing request is submitted to NRAO. A typical name looks like AGBT03B_023_03, where 03B designates the observing trimester, 023 is the serial number of the observing request in the trimester, and 03 is the observing session serial number that typically increments each day of a multi-dayobserving run. Hence, each observing session has its own subdirectory under /home/gbtdata, and each subsystem has its own sub-subdirectory. A typical observing session subdirectory is shown in Figure 1. For the PAF the subsystem subdirectories are RcvrArray1_2 and FXcorPAF. Each telescope “scan”, typically lasting between ten seconds and a few minutes, will generate a new FITS file in each of the subsystem subdirectories. These files will have identical file names constructed from the date and UTC time at the scan start, e.g., 2012_10_17_16:58:00.fits[1]. The only ways to tell which subsystem wrote a specific FITS file are by its subdirectory location or by looking at the header contents of the file. The common file name ties the subsystem files for each data scan together.

Figure 1 GBT FITS Directory Structure

The first step in PAF calibration and beamforming is to generate complex cross-products of the ADC samples from the array elements. Table 1 shows the FITS format for storing these cross products. Much of the data in the primary FITS table is boiler plate for compatibility with GBT FITS custom, but the number of active array elements can be found in the NUMELEM keyword. Note that the number of array elements may be smaller than the number of correlator input channels. Mapping of array element number to correlator input channels is recorded in the receiver FITS file in the RcvrArray1_2 directory.

The correlation data are in the second (CROSS_CORREL) FITS table. The correlator is assumed to be of the “FX” type where the signal from each array element is Fourier transformed into complex spectra with a given number of frequency channels, typically a power of two between 25 and 211. The total bandwith of these spectra are given by the BANDWIDTH header keyword in this table. The ‘DC’ frequency channel is omitted, and the baseband channel center frequencies run from BANDWIDTH / num_chan to BANDWIDTH. The same frequency channels from each pair of array elements of the same polarization are multiplied together to produce a complex cross product, and these cross products are accumulated for the duration of the ADC data scan. A PAF with Ncorrelator input channels will have N2 cross products, but some of these are redundant. The product of element m times element n is the complex conjugate of ntimesm so only one of the two need to be stored. Hence there are N(N+1)/2 unique complex cross products accumulations to be stored for each frequency channel of each polarization. However, one has the option of storing the entire correlation matrix and setting the value of the REDUND keyword in the second table header to ‘True’.

Because the nun-redundant correlation array is not rectangular, and, as far as I know, FITS does not have a complex number data type, the correlation data are stored as a one-dimensional 32-bit (4-byte) floating point array. In this one-dimensional array the spectral frequency channel number varies most slowly, one of the input channel indices changes next most slowly, the other channel index changes next most slowly, and the real-imaginary index changes fastest. Hence, the order on data values is freq1_ch1*ch1.real, freq1_ch1*ch1.imag, freq1_ch1*ch2.real, freq1_ch1*ch2.imag, freq1_ch1*ch3.real, freq1_ch1*ch3.imag,… freq1_ch1*chN.real,freq1_ch1*chN.imag, freq1_ch2*ch2.real, freq1_ch2*ch2.imag, freq1_ch2*ch3.real, freq1_ch2*ch3.imag,… freq1_ch2*chN.real, freq1_ch2*chN.imag,freq1_ch3*ch3.real, freq1_ch3*ch3.imag,… freq1_chN*chN.real, freq1_chN*chN.imag,freq2_ch1*ch1.real, freq2_ch1*ch1.imag, etc. The programming code logic to convert the correlator output to a 1-D array depends on the correlator, but the logic for reading the 1-D array and returning it as a rectangular array of imaginary number spectra is well defined as shown in the next section and the code in Table 2..

Writing a Correlator FITS File

Reading and writing of FITS files are supported in a number of programming languages. In this document we will use Python to illustrate the process and to define a basic FITS writer. The Python modules needed for file operation is “pyfits” available from the Space Telescope Science Institute at . This module makes use of the “numpy” module and currently works with Python versions 2.5 through 3.2. The pyfits user manual is available at . The PAF correlator uses the binary “table” data format (as opposed to the “image” format).

The returned Python array is a rectangular array of 64-bit complex floating-point numbers with the dimensions [num_frequency_channels, num_correl_inputs, num_correl_inputs].

FXcorPAF FITS File Format

To avoid a large number of FITS table rows and columns the cross products are stored in one-dimensional arrays with one array (table column) for each polarization. The actual dimension of these arrays is given by the MTXSHAPE keyword value, where the fastest changing index is the real-imaginary index. The second fastest is the frequency channel, and the slowest is the correlation product index. The product index sequence is element 1 times elements 1 through N, 2 x (2….N), 3 x (3….N), ….N x N.

If and when the PAF system incorporates a real-time beamformer, a new FITS format will be needed that is similar to the current GBT Spectrometer format where the outputs of the formed beams are treated the same way as the data from independent feed horns in the focal plane.

Table 1. FXcorPAF FITS File Format
FXcorPAF HDU structure:
Filename: ……/FXcorPAF/2010_11_07_00h39m08.fits
No. Name Type Cards Dimensions Format
0 PRIMARY PrimaryHDU 21 () uint8
1 CROSS_CORREL BinTableHDU 18 1R x 2C ['53760E', '53760E']
name:
PRIMARY
header:
SIMPLE = T / conforms to FITS standard
BITPIX = 8 / array data type
NAXIS = 0 / number of array dimensions
EXTEND = T
ORIGIN = 'NRAO Green Bank'
INSTRUME= 'FXcorrelator' / device or program of origin
GBTMCVER= '00.0 ' / telescope control software release
FITSVER = '0.0 ' / FITS definition version for this device
DATEBLD = '0000-00-00T00:00:00' / time program was released
SIMULATE= 0 / Is the instrument in simulate mode
DATE-OBS= '2011-07-28T18:05:30' / Manager parameter startTime
TIMESYS = 'UTC ' / time scale specification for DATE-OBS
TELESCOP= 'GBT ' / Green Bank Telescope
OBJECT = 'none ' / Manager parameter source
PROJID = 'A20Mxxxx' / Manager parameter projectId
OBSID = 'xxx ' / Manager parameter scanId
SCAN = 0 / integer scan identifier
NELEM = 19 / number of array elements
COMMENT FITS (Flexible Image Transport System) format is defined in Astronomy
COMMENT and Astrophysics, volume 376, page 359; bibcode: 2001A&A...376..359H
name:
CROSS_CORREL
XTENSION= 'BINTABLE' / binary table extension
BITPIX = 8 / array data type
NAXIS = 2 / number of array dimensions
NAXIS1 = 215040 / length of dimension 1
NAXIS2 = 1 / length of dimension 2
PCOUNT = 0 / number of group parameters
GCOUNT = 1 / number of groups
TFIELDS = 2 / number of table fields
TTYPE1 = 'XCROSCOR'
TFORM1 = '53760E '
TUNIT1 = 'corcoef '
TTYPE2 = 'YCROSCOR'
TFORM2 = '53760E '
TUNIT2 = 'corcoef '
EXTNAME = 'CROSS_CORREL' / name of this binary table extension
REDUND = F
BANDWDTH= 625000.0 / bandwidth of correlated spectra, Hz
MTXSHAPE= '[210,128,2]' / [num_correl, num_chan, real_imag]
data shape: (1,)
data[0]: [4.606e+08 0.000e+00 1.187e+08 0.000e+00 2.518e+08 0.000e+00 ...]
Table 2. Basic Python Code for Writing and Reading FXcorPAF FITS File
import pyfits as pf
import os
import time
import numpy as np
import pylab as pl
DATA_DIR = os.path.join('C:\\','Users','Rick','Data')
#FITS_DIR = os.path.join(DATA_DIR, 'FITS', 'PafSoftCorrel')
FITS_DIR = '.'
#CROSS_SPEC_DIR = os.path.join(DATA_DIR, 'CrossSpec')
CROSS_SPEC_DIR = '.'
class FXcorFits :
def __init__ ( self, fits_dir, cspec_dir=None, write_flag=False ) :
self.fits_dir = fits_dir
self.cspec_dir = cspec_dir
if not os.path.exists(self.fits_dir) :
os.makedirs(self.fits_dir)
self.software_start_time = self.make_time_string()
self.write_flag = write_flag
if not write_flag :
print 'FXcorPAFFits opened READ ONLY'
self.file_name = ''
self.file_open = False
self.fits_struct_open = False
self.fits_struct_unsaved = False
self.prihdr = False
defmake_time_string ( self, file_name=False, year=False, month=False,
day=False, hour=False, min=False, sec=False ) :
if file_name :
return file_name
if not year :
td = time.gmtime()
year = td[0]
month = td[1]
day = td[2]
hour = td[3]
min = td[4]
sec = td[5]
tstr = file_name
else :
tstr = '%04d-%02d-%02dT%02d:%02d:%02d' % (year. month, day,
hour, min, sec)
return tstr
deftranslate_name ( self, file_name ) :
if os.name == 'posix' :
return file_name
else :
i1 = file_name.find(':')
if i1 < 0 :
return file_name
i2 = file_name[i1 + 1:].find(':')
if i2 < 0 :
return file_name
else :
i2 += i1 + 1
new_name = file_name[:i1] + 'h'
new_name += file_name[(i1 + 1):i2] + 'm'
new_name += file_name[(i2 + 1):]
return new_name
defopen_file ( self, file_name ) :
if self.fits_struct_open :
print 'A FXcorPAF FITS structure is already open'
return
if self.file_open :
print 'A FXcorPAF FITS file, %s, is already open' % (self.file_name)
return
path_name = os.path.join(self.fits_dir, '%s.fits' % file_name)
if not os.path.exists(path_name) :
print '%s does not exist' % (path_name)
return
self.hdulist = pf.open(path_name)
self.file_name = file_name + '.fits'
self.prihdr = self.hdulist[0].header
self.file_open = True
defget_kw_val ( self, kw_name ) :
return self.prihdr[kw_name]
defclose_file ( self ) :
if self.fits_struct_unsaved :
self.save_fits_struct()
self.hdulist.close()
self.file_name = ''
self.file_open = False
self.fits_struct_open = False
defopen_new_fits_struct ( self, file_name, redundant=False,
num_elem=19, num_adc_chan=20 ) :
file_name = self.translate_name(file_name)
if self.file_open :
print 'A FXcorPAF FITS file is already open' % (self.file_name)
return
if self.fits_struct_open :
print 'A FXcorPAF FITS structure is already open'
return
self.fits_struct_open = True
self.file_name = file_name + '.fits'
self.hdu0 = pf.PrimaryHDU()
prihdr = self.hdu0.header
self.prihdr = prihdr
prihdr.update('ORIGIN', 'NRAO Green Bank')
prihdr.update('INSTRUME', 'FXcorrelator', 'device or program of origin')
prihdr.update('GBTMCVER', '00.0 ',
'telescope control software release')
prihdr.update('FITSVER', '0.0 ',
'FITS definition version for this device')
prihdr.update('DATEBLD', '0000-00-00T00:00:00',
'time program was released')
prihdr.update('SIMULATE', 0, 'Is the instrument in simulate mode')
prihdr.update('DATE-OBS', self.software_start_time,
'Manager parameter startTime')
prihdr.update('TIMESYS', 'UTC', 'time scale specification for DATE-OBS')
prihdr.update('TELESCOP', 'GBT', 'Green Bank Telescope')
prihdr.update('OBJECT', 'none', 'Manager parameter source')
proj_id = 'Test'
obs_id = 'xxx'
scan = 0
prihdr.update('PROJID', proj_id, 'Manager parameter projectId')
prihdr.update('OBSID', obs_id, 'Manager parameter scanId')
prihdr.update('SCAN', scan, 'integer scan identifier')
prihdr.update('NELEM', num_elem, 'number of array elements')
prihdr.add_comment('FITS (Flexible Image Transport System) format is defined in')
prihdr.add_comment('and Astrophysics, volume 376, page 359; bibcode: 2001A&A...376..359H')
self.temp_hdulist = [self.hdu0]
self.add_correlation_matrix(file_name, redundant, num_adc_chan)
self.fits_struct_unsaved = True
defadd_correlation_matrix ( self, file_name, redundant,
num_adc_chan, bandwidth=0.625e6 ) :
cor_file = os.path.join(self.cspec_dir, '%s.cspec' % file_name)
# Raw correlation file format:
# Each datum is 8-byte complex == 2, 4-byte floating point (real, imag)
# The data values are accumulated cross-product spectral values with
# the frequency channel being the most slowly varying index. the first
# receiver number in the cross-product being the next most slowly
# varying index, and the other receiver number varyng the fastest,
# i.e., freq1_chA1 * freq1_chB1, freq1_chA1 * freq1_chB2,
# freq1_chA1 * freq1_chB3, .... freq1_chA1 * freq1_chBN,
# freq1_chA2 * freq1_chB2, ... freq1_chA2 * freq1_chBN, ...
# freqM_chAN * freq1_chBN. Note that when the redundant flag is False
# the redundant cross-products are not stored to save disk space since
# freqX_chAY * freqX_chBZ == conj(freqX_chAZ * freqX_chBY)
fd = open(cor_file, 'rb')
cor_matrix = np.fromfile(fd, dtype= np.float32)
if not redundant :
num_correl = num_adc_chan * (num_adc_chan + 1) / 2
else :
num_correl = num_adc_chan**2
num_spec_chan = (len(cor_matrix) / num_correl) / 2
col1 = pf.Column(name='CROSS_COR', format='%dE' % len(cor_matrix),
unit='corcoef', array=[cor_matrix])
cols = pf.ColDefs([col1])
new_hdu = pf.new_table(cols)
new_hdu.name = 'CROSS_CORREL'
hdr = new_hdu.header
hdr.update('REDUND', redundant)
hdr.update('BANDWDTH', bandwidth)
hdr.update('MTXSHAPE', '[%d,%d,%d]' % \
(num_spec_chan, num_correl, 2),
'[num_chan, num_correl, real_imag]')
hdr.update('EXTNAME', 'CROSS_CORREL',
'name of this binary table extension')
self.temp_hdulist.append(new_hdu)
self.hdulist = pf.HDUList(self.temp_hdulist)
defget_correlation_matrix ( self, num_chan=20 ) :
hdu = self.hdulist['CROSS_CORREL']
hdr = hdu.header
print 'REDUND:', hdr['REDUND']
dat = hdu.data[0][0]
# The data order in the 'cross_accum' buffer is ch1Xch1_freq1_real,
# ch1Xch1_freq1_imag, ch1Xch2_freq1_real... ch1Xch20_freq1_imag,
# ch2Xch1_freq1_real...ch2Xch20_freq1_imag, ch3Xch1_freq1_real...
# ch1Xch1_freq2_real, ch1Xch1_freq2_imag...ch20Xch20_freq2_imag,
# ch1Xch1_freq3_real...ch20Xch20_freqN_imag
cdat = np.zeros(dat.shape[0] / 2, dtype=np.complex64)
cdat.real = dat[0:len(dat):2]
cdat.imag = dat[1:len(dat):2]
if hdr['REDUND'] :
num_freq = len(cdat) / num_chan**2
cdat.shape = (num_freq, num_chan, num_chan)
return cdat
else :
num_cor = (num_chan + 1) * num_chan / 2
num_freq = len(cdat) / num_cor
ret_dat = np.zeros((num_freq, num_chan, num_chan),
dtype=np.complex64)
print num_cor, num_freq
ix = 0; jx = 0
print cdat.shape, '---'
for fn in range(num_freq) :
for i in range(num_chan) :
jx += num_chan - i
ret_dat[fn, i, i:num_chan] = cdat[ix:jx]
ret_dat[fn, (i + 1):num_chan, i] = cdat[(ix + 1):jx].conj()
ix = jx
print dat.shape
print dat[1000:1006]
print cdat[500:503]
print 210 * 128 * 2
print hdr['MTXSHAPE']
return ret_dat
defupdate_keyword ( self, keyword, value, comment=False ) :
if self.prihdr :
if comment :
self.prihdr.update(keyword, value, comment)
else :
self.prihdr.update(keyword, value)
else :
print 'FXcorPAF primary header not open'
defsave_fits_struct ( self ) :
if not self.write_flag :
print 'FXcorPAFFits class not open for writing'
return
file_path = os.path.join(self.fits_dir, self.file_name)
if os.path.exists(file_path) :
os.remove(file_path)
print 'Write:', file_path
self.hdulist.writeto(file_path)
self.fits_struct_unsaved = False
defshow_hdu_info ( self ) :
if (not self.file_open) and (not self.fits_struct_open):
print 'FXcorPAF file not open'
return
print self.hdulist.info()
defshow_primary_card_list ( self ) :
if (not self.file_open) and (not self.fits_struct_open):
print 'FXcorPAF file not open'
return
print self.hdulist[0].header.ascardlist()
# To create a new fits file from a cross-product file in the directory,
# CROSS_SPEC_DIR:
#fx = FXcorFits(FITS_DIR, CROSS_SPEC_DIR, True)
#fx.open_new_fits_struct('2011_04_29_01h01m01')
#fx.show_hdu_info()
#fx.show_primary_card_list()
#fx.close_file()
# To read and plot data from an existing fits file
fx = FXcorFits(FITS_DIR)
fx.open_file('2011_04_29_01h01m01')
dat = fx.get_correlation_matrix()
pl.plot(dat[:,19,17].real, 'k')
pl.plot(dat[:,19,17].imag, 'r')
pl.plot(dat[:,17,19].real + 1e7, ':k')
pl.plot(-dat[:,17,19].imag + 1e7, ':r')
pl.text(90, 4e8, 'Black: Real')
pl.text(90, 3.5e8, 'Red: Imaginary')
pl.title('Correlation Test Data')
pl.xlabel('Frequency Channel_Number')
pl.ylabel('Relative Correlation Amplitude')
pl.show()
fx.close_file()

[1] Note that colons are not allowed in Windows file names so these file names need to be translated to file something like 2012_10_17_16h58m00.fitsbefore copying to a Windows system. See the translate_name() function in the code listing in Table 2.