Spatial and temporal analysis of cattle herd breakdowns in the Randomised Badger Culling Trial
Defra Project SE3240
AC Mill, SP Rushton, MDF Shirley, AWA Murray, GC Smith, RA McDonald
IRES, Newcastle University
and
Fera, York
October 2009
Executive summary
- The Randomised Badger Culling Trial (RBCT) confirmed that Mycobacterium bovis infection in badgers causes some cattle herd breakdowns (CHB). The original RBCT analysis was performed at the ‘triplet’ level: each unit of measurement was ~100km2 in size.
- Here, we took the available data collected during the RBCT, allied with data from GIS, and subjected them to progressively more advanced statistical analysis at a fine spatial scale. The analysis examined farm characteristics, treatment (culling) and badger related variables, to relate them to CHBs.
- The complex interaction of potentially causative factors (covariates) with the outbreak of Foot-and-Mouth Disease (FMD) in 2001 meant that it was necessary to restrict the analysis to post-2001 data.
- Badger territorial data were collected on half the triplets empirically (by bait marking) and on the other half by a predictive mathematical formula (tessellation). The two methods of data collection are not comparable, so it was necessary to remove half of the data from the analyses. Further, these data were not collected in the 2km zone outside the trial areas, so the interactions that occurred here with cattle herd breakdowns could not be analysed.
- As expected, the results from the statistical tests did not agree completely. However, there were consistent findings that larger herds, larger farms and holdings with multiple parcels of land were all at greater risk of a herd breakdown. In addition, as identified by previous analysis, the tests demonstrated that proactive badger culling had a protective effect for some time after implementation.
- Badger related variables (disease prevalence, number of social groups and length of badger territorial boundaries) did not consistently point to an increase in risk. This could be because (1) the collected variables were not important to risk in cattle, or (2) there were insufficient data to demonstrate their importance.
Contents
Executive summary
1. Introduction
2. RBCT Data Specification
2.1 Data
2.1.1 Analysis Approach
2.1.2 Comparison between GIS and Access databases
2.1.3 Aston Down data
2.1.4 The Effect of Foot and Mouth Disease in 2001
2.1.5 Triplet J
2.2 Data sets
2.2.1 Covariates
3. Analytical Methods
3.1 Assessing spatial similarity of CHB and risk factors using Mantel testing
3.1.1 Single Mantel tests
3.1.2 Multiple Regression on resemblance Matrices (MRM).
3.2 Space–time clustering of CHB and badger disease prevalence using K-function tests
3.3 Event analyses: Investigating CHB as disease events using Cox proportional hazards.
3.4 Investigating risk factors for CHB using mixed effect modelling.
3.5 Investigating direct and indirect risk factors for CHB using structured equation modelling.
3.5.1 The conceptual model
3.6 Investigating direct and indirect risk factors for CHB using Bayesian Belief Networks
3.6.1 Discretization
3.7 Sensitivity of Cattle Herd testing: Implications on analyses
4. Results
4.1 Assessing spatial similarity of CHB and risk factors using Mantel testing
4.1.1 Single Mantel Tests
4.1.2 MRM
4.2 Space–time clustering of CHB and badger disease prevalence using K-function tests
4.3 Event analyses: Investigating CHB as disease events using Cox proportional hazards
4.4 Investigating risk factors for CHB using mixed effect modelling
4.5 Investigating direct and indirect risk factors for CHB using structured equation modelling
4.6 Investigating direct and indirect risk factors for CHB using Bayesian Belief Networks
4.7 Sensitivity of Cattle Herd testing: Implications on analyses
4.7.1 Failure to detect disease
4.7.2 Over-prediction
4.7.3 Generalised Linear Modelling
5. Discussion
6. Conclusions
7. References
8. Appendix:
1. Introduction
Bovine tuberculosis (bTB), caused by Mycobacterium bovis, is among the greatest economic threats to the UK cattle industry. Following the recommendations of the 1997 Krebs review, the Randomised Badger Culling Trial (RBCT) was established in 1998. The trial had a simple design with 10 “triplets”, each comprising three sample areas of ca. 100km2, one for each of three treatments: proactive culling of badgers; reactive culling, where badgers were removed when disease was detected in cattle; and a control (survey only) with no badger removal.
The RBCT confirmed that badger ecology and behaviour were critical features of the epidemiology of bTB within the study areas (Independent Scientific Group 2007). Patterns of cattle and badger disease in and around treatment areas were clearly affected by culling and these effects were most likely explained in terms of changes in the social biology of badgers. The Independent Scientific Group (ISG) concluded that whereas proactive culling reduced bTB incidence in cattle within treatment areas, badger culling could lead to increased incidence of the disease in reactive culling areas and in the periphery of proactive treatment areas.
Most of the analyses undertaken on the RBCT data have been based on analysing trends or differences at the 'triplet' scale. Whilst this is appropriate given the original experimental design and the aim of evaluating specified culling options, it is of limited value in understanding the epidemiology of the disease in the context of the herd, which is the scale at which the disease has its direct impact on agricultural activity. A 100 km2 sampling unit is also too coarse to capture biological processes that drive disease spread, which operate at the scale of the individual badger social group and the farming context in which these are found. Indeed, finer scale analyses of RBCT data by the ISG have themselves revealed complex patterns of spatial and temporal effects on trial outcomes (Donnelly et al. 2007, Independent Scientific Group 2007, Jenkins et al. 2007).
We proposed that Defra would gain a better understanding of the processes at work in the cattle:badger:disease system if the issue were addressed at the scale of the relevant agricultural and biological processes, which is that of the farm/herd and badger social group.
Furthermore, we recognise that cattle herd breakdowns are themselves the outcome of a testing system that is subject to variation in sensitivity and specificity and that this may be reflected in the observed differences in confirmed and unconfirmed cattle herd breakdowns. Unlike previous analyses, we deal explicitly with uncertainty in the cattle testing system.
In addition to providing new insight at a fine scale by using novel statistical approaches, our project will also provide the basis for direct comparison between the RBCT and the contrasting outcomes of survival analyses assessing the impacts of badger culling on bTB in cattle in Ireland (Griffin et al. 2005, Olea-Popelka et al. 2006).
Work was carried out in 6 phases, corresponding to the scientific objectives and approaches outlined.
We adopted a progressive modelling strategy that characterises the pattern of spatial and temporal dependence of disease in badgers and cattle, before moving on to assess risk factors for disease incidence in each, and then considering how the sensitivity of the bTB testing in cattle could have impacted on our assessment of risk factors.
We had 6 scientific objectives, adopting a distinct technical approach in each:
1. To characterise and compare the spatial pattern of disease in the RBCT, in individual herds and badger social groups, using Mantel testing of similarity matrices.
2. To characterise the spatial-temporal clustering of disease, among herds and badger social groups, using point pattern analysis methods based on K-function analyses.
3. To analyse the influence of variation in treatments, including the timing, duration, location and extent of culling operations, on disease dynamics in badgers and cattle, using Generalised Linear Mixed Models (GLMMs).
4. To undertake survival analysis, analysing the influence of herd, badger and environmental covariates on the hazard of cattle herd breakdowns, using Cox-proportional hazard models, allowing direct comparison with Irish badger culling analyses.
5. To investigate the interactions among the components of the Farm:Cattle:Badger:Disease system, using Structural Equation Modelling (SEM) and Bayesian Belief Networks.
6. To estimate the effects of uncertainty in cattle herd testing on the measured effect size for RBCT treatments, using Monte Carlo simulation.
These objectives addressed the following specific points from the RBCT research call:
• the location and timing of cattle herd breakdowns (CHBs) in relation to the location, timing and intensity of badger culling in the trial, and the infectious status of the badgers removed;
• that culling treatment areas extended to a variable distance into the 1-2km buffer zones around the trial area boundary in order to remove all badgers that had a territory impinging on farms within the trial area; and
• the observed difference between the effects of badger culling on confirmed and unconfirmed CHBs.
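Objective 1 relies on Mantel testing, which asks whether two distance (or similarity) matrices over the same units are correlated, with significance judged by jointly permuting the rows and columns of one matrix. The following is a minimal, generic sketch of that idea and not the implementation used in this project. The five coordinates, the second matrix, the number of permutations and all function names are hypothetical and chosen for illustration only.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle values after relabelling units by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test: correlation and permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    b_vals = upper_triangle(dist_b, identity)
    observed = pearson(upper_triangle(dist_a, identity), b_vals)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel the units of matrix A
        if pearson(upper_triangle(dist_a, order), b_vals) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Hypothetical example: five units with made-up coordinates.
coords = [(0, 0), (1, 0), (0, 2), (3, 1), (2, 3)]
spatial = [[math.dist(p, q) for q in coords] for p in coords]
# A second matrix that is an exact multiple of the first (perfect association).
other = [[2.0 * d for d in row] for row in spatial]

r_obs, p_value = mantel(spatial, other)
```

In practice the second matrix would hold dissimilarities in a herd-level attribute rather than a rescaled copy of the spatial distances; the rescaled copy is used here only so that the expected correlation is obvious.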
2. RBCT Data Specification
2.1 Data
Data were provided by the Veterinary Laboratories Agency (Weybridge), which is the repository for the data collected from the original trial. The data were provided in the form of an 'Access' database with predefined tables and headings and GIS shape files of County Parish Holdings (CPHs) and badger social group boundaries. The data tables were queried and relevant data abstracted prior to manipulation in either GIS software or custom built code to generate data tables for each analysis. Data collated in these tables were generated for two geographical areas: Triplets A, D, E, G and I in Herefordshire and Gloucester, originally collated by the Gloucestershire Wildlife Unit (WLU) at Aston Down, and Triplets B, C, F, H and J in Devon, Somerset and Cornwall, collated by the Cornwall WLU at Polwhele. A Cattle Herd Breakdown (CHB) was defined as a ‘confirmed’ breakdown from the RBCT Access database table [tblTrialAreaBreakdowns]. CHBs were recorded as a CPH level event, and all subsequent modelling was undertaken with the CPH as the modelled unit. The number of separate parcels of land within a CPH was used as a covariate.
2.1.1 Analysis Approach
We also obtained access to holding level statistical and administrative data, extensive environmental and habitat data sets held by Defra under SPIRE (Spatial Information Repository), as well as additional data such as Land Cover Map 2000. These data sources allowed us to extract / create relevant farm and habitat covariates for statistical analysis. A schema showing the methods used to calculate these covariates is shown in Figure 1 and the covariates are fully described in Table 1. It was necessary to restrict all analyses to the core area of the Triplets, since the data relating to badgers were most complete here: badger social group boundaries were not derived in the 1-2km buffer zone outside the Trial area.
Figure 1: Schematic of the data analysis approach adopted.
Table 1: Covariate descriptions. All related to individual CPH, for full descriptions of how each is derived see description section below.
2.1.2 Comparison between GIS and Access databases
There is a substantial mismatch between the supplied GIS CPH data and the RBCT Access database (specifically [tblTrialOccupier]) with respect to identification of CPHs (Figure 2). There were 8955 CPHs in [tblTrialOccupier] and 8602 CPHs in the GIS. Part of the discrepancy in numbers arises because the RBCT Access database recorded CPHs for the full duration of the trial, including CPHs without any livestock. The CPHs recorded in the GIS data, in contrast, represent those present at a single time point, at the start of the trial.
Figure 2: Identifications of CPHs from the two data sources
For the badger data there was no information linking social group numbers to their spatial position in the GIS shape files for any of the triplets A, D, E, G, or I (Aston Down data). Furthermore, there were substantial differences in the ways in which the spatial disposition of social groups was recorded in these Triplets and this is considered further below.
In addition, there was no information on the size and position of badger social groups for the Treatment area J3 (Survey Only). For the fourteen triplet/treatments where data were available, there were 718 social groups in the Access database (specifically [tblTrialSocialGroups]) and 719 social groups in the GIS (Figure 3).
Only those CPHs and badger social groups where information was present in both data sources could be used in the analyses.
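The restriction to records present in both sources amounts to an intersection of identifiers. The sketch below illustrates this under stated assumptions: the CPH identifiers are made up, and the function name `reconcile_ids` is hypothetical rather than part of the project's code.

```python
def reconcile_ids(access_ids, gis_ids):
    """Split identifiers into those found in both sources or in only one."""
    access, gis = set(access_ids), set(gis_ids)
    return {
        "both": access & gis,          # usable in the analyses
        "access_only": access - gis,   # no spatial record
        "gis_only": gis - access,      # no database record
    }


# Made-up identifiers standing in for CPH numbers from each source.
access_cphs = ["10/001/0001", "10/001/0002", "10/002/0007"]
gis_cphs = ["10/001/0002", "10/002/0007", "10/003/0011"]

matched = reconcile_ids(access_cphs, gis_cphs)
usable = sorted(matched["both"])
```

The same comparison applies to badger social group identifiers; the two "only" sets give a quick measure of how much of each source is lost by requiring a match.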
Figure 3: Identification of badger social groups in the two sources
2.1.3 Aston Down data
All of the badger social group shape files that did not have social group labels were collected by the Aston Down WLU. In order to create covariates describing the number of social groups and social group boundaries on a CPH it was necessary to overlay badger social group boundaries and CPH occupier boundaries. However, the manner in which badger social group boundaries were recorded differed substantially between the two WLU. Polwhele had provided social group boundaries based on bait-marking, whereas Aston Down had provided only tessellations based on the primary sett locations (Figure 4).
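The exact tessellation procedure used at Aston Down is not specified here. As an illustration only, the sketch below assumes a Dirichlet (Voronoi) tessellation, in which every location is allocated to its nearest primary sett. The sett labels, coordinates and sample points are hypothetical.

```python
import math


def nearest_sett(point, setts):
    """Label of the sett closest to `point` (straight-line distance)."""
    return min(setts, key=lambda label: math.dist(point, setts[label]))


# Hypothetical primary sett locations (arbitrary units).
setts = {"SG1": (0.0, 0.0), "SG2": (4.0, 0.0), "SG3": (2.0, 3.0)}

# Hypothetical sample points lying within one holding.
holding_points = [(0.5, 0.5), (3.5, 0.2), (2.0, 2.5), (2.1, 0.1)]

assignment = [nearest_sett(p, setts) for p in holding_points]
groups_on_holding = set(assignment)  # distinct social groups touching the holding
```

Under this assumption territory boundaries depend only on sett positions, whereas bait-marking boundaries reflect where badgers were actually recorded, which is one reason the two methods can give different covariates for the same holding.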
In order to assess the impact of these two methods for determining the boundaries of territories for each social group, we applied the tessellation approach used in Aston Down to sett data for Triplets assessed using bait marking in Polwhele (Figure 5).
Figure 4: Definition of home range of badger social groups at Aston Down and Polwhele.
Figure 5: Example of the comparison of methods to calculate badger social group boundaries, for triplet / treatment B1.
It is clear that there were substantial differences in the predicted lengths of badger social group boundary and number of social groups in CPHs when estimated by the two methods (Figures 6a and b).
Figure 6: Comparison of covariates a) length of boundary and b) number of social groups per CPH at Triplet replicate B1 using the data collation method at Polwhele (x axis) and Aston Down (y axis).
The low R2 values show that the correspondence between the two methods for calculating social group boundaries is poor. Furthermore, the differences between the two methods are not consistent: for some CPHs the tessellation method grossly over-estimates the number of social groups or the total length of social group boundaries relative to the bait-marking method, and for other CPHs it grossly under-estimates them.
Since the Aston Down tessellation approach introduces additional variation (noise) to the data, it is not comparable to that of Polwhele. We therefore had to restrict our analyses to the data generated by the Polwhele WLU, where the data were most complete (and more accurate) in relation to badger and landscape characteristics.
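The comparison between the two methods can be summarised by the squared correlation between paired per-CPH values, together with counts of over- and under-estimation. The sketch below shows that calculation on made-up numbers; the values are hypothetical and are not taken from Figure 6.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between paired values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Hypothetical number of social groups per CPH under each method.
bait_marking = [0, 1, 1, 2, 3, 2]
tessellation = [1, 1, 3, 0, 2, 2]

r2 = r_squared(bait_marking, tessellation)

# Direction of disagreement, CPH by CPH.
over = sum(t > b for t, b in zip(tessellation, bait_marking))
under = sum(t < b for t, b in zip(tessellation, bait_marking))
```

A low R2 combined with disagreement in both directions indicates noise rather than a systematic bias that could be corrected by rescaling.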
2.1.4 The Effect of Foot and Mouth Disease in 2001
There was a dramatic difference between the number of CHBs before and after the outbreak of Foot and Mouth Disease (FMD) in 2001 (Figure 7).
Furthermore, not all triplet/treatments commenced at the same time (Figure 8).
Figure 7: Number of new CHBs occurring each month over the trial period
Figure 8: The activity of triplet/treatments over time
There was no badger culling during the FMD period, which would have impacted on the effects of the treatment. Furthermore, not all of the triplet/treatments had commenced prior to the onset of the FMD epidemic in 2001 (Figure 8), and the intensity of cattle testing in 2001 decreased by an order of magnitude.
Due to the complexity of these potential covariates, it was impossible to model the entire duration of the trial using the suite of approaches outlined previously. Thus, it was necessary to restrict all analyses to CHBs commencing between 01/01/2002 and 31/12/2005. The duration since the starting point of each triplet is included in our analysis, as it is captured in the Triplet covariate.
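The time restriction can be expressed as a simple date filter on breakdown start dates. The sketch below assumes that both end dates are inclusive, which the text does not state explicitly; the holding labels and dates are hypothetical.

```python
from datetime import date

WINDOW_START = date(2002, 1, 1)
WINDOW_END = date(2005, 12, 31)


def in_analysis_window(start_date):
    """True if a breakdown start date falls in the window (ends assumed inclusive)."""
    return WINDOW_START <= start_date <= WINDOW_END


# Hypothetical breakdown records: (holding label, breakdown start date).
breakdowns = [
    ("CPH-A", date(2000, 6, 14)),   # before the window
    ("CPH-B", date(2002, 1, 1)),    # first day of the window
    ("CPH-C", date(2004, 9, 30)),
    ("CPH-D", date(2006, 2, 3)),    # after the window
]

kept = [cph for cph, started in breakdowns if in_analysis_window(started)]
```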
2.1.5 Triplet J
Triplet J was the last Triplet to undergo Proactive culling (started in July 2002) and no reactive culling took place in this triplet (Figure 8). In addition, no social group boundaries were defined for the Survey trial area (J3) within this Triplet. This led to missing badger covariates for this Triplet and meant that its inclusion in some analyses was compromised. Triplet J was therefore excluded from the mixed effects modelling, Structured Equation Modelling and the Bayesian Belief Network analyses.
2.2 Data sets
All analyses were thus restricted to the core area of Polwhele Triplets (B, C, F, H and J), as defined by the extent of badger information during the time period 2002 to 2005. In total there were 1662 holdings (CPH), detailed by triplet in Table 2. In these CPHs there were a total of 682 CHBs in this time period.
Several CPHs had missing information (e.g. size of farm and location information) that made it impossible to calculate other relevant covariates for them and in some analyses these CPHs had to be omitted. For this reason there were slight differences in composition of the datasets used for each of the analyses.
2.2.1 Covariates
Most of the covariates in the analyses described below were derived directly from the databases provided. However, some of the covariates were derived from manipulations of the GIS shape files provided, or from outside data sources in a GIS. These are described in this section.
Database: Badger variables
The number of badgers culled per Trial area varied from 295 in B1 to 1098 in J1. Prevalence of bTB in culled badgers varied by Triplet and treatment and was lowest in F1 and highest in H1, varying between 9% and 24% (Table 2).