
Selecting the best design for non-standard toxicology experiments

Corresponding Author:

A. John Bailer

Department of Statistics

Miami University

Oxford, OH 45056

Office: 513-529-7828

Fax: 513-529-0989

Total Number of Words: 6948 (including references and appendices)

References: 15

Tables: 5

Figures: 5

Selecting the best design for non-standard toxicology experiments

Jennifer M. Webb†, Byran J. Smucker† and A. John Bailer*,†

†Department of Statistics, Miami University, Oxford, Ohio 45056, USA

Office: 513-529-7828

Fax: 513-529-0989

* To whom correspondence should be addressed

Abstract‒Although many experiments in environmental toxicology use standard statistical experimental designs, situations arise in which no standard design is natural or applicable due to logistical constraints. For example, the layout of a lab may suggest that each shelf serve as a block, with the number of experimental units per shelf either greater than or less than the number of treatments in a way that precludes the use of a typical block design. In such cases, an effective and powerful alternative is to employ optimal experimental design principles, a strategy that yields designs supporting precise statistical estimation. Here, a D-optimal design was generated for an experiment in environmental toxicology that has two factors, sixteen treatments, and constraints similar to those described above. After initial consideration of a randomized complete block design and an intuitive cyclic design, we compared a D-optimal design and a slightly more complicated version of the cyclic design. Simulations were conducted in which random responses were generated under a variety of scenarios reflecting conditions motivated by a similar toxicology study, and the designs were evaluated via D-efficiency as well as by a power analysis. In this case, we found that the cyclic design performed well compared to the D-optimal design.

Keywords – ecotoxicology; dose-response modeling; effects-based monitoring; D-optimal design

INTRODUCTION

Often in aquatic toxicology tests, the size and design of experiments are constrained by limitations in the amount of physical space available. But what if the availability of space exceeds the needs of an experiment? How can this excess space be used efficiently to enhance the accompanying statistical analysis? Alternatively, what if the available space is less than would be needed to employ standard statistical experimental designs? A strategy for addressing questions such as these is described in this article.

An experiment was being designed by colleagues in a setting that did not lend itself to a standard experimental design (S. Rumschlag and M. Boone, personal communication). In this study, the physical layout of the lab imposed constraints upon the experiment (see Figure 1). Mass at metamorphosis of a tadpole was the response, and there were 16 treatments to be replicated: the presence or absence of a pesticide at a particular concentration (2 levels) crossed with timing of pathogen exposure (8 levels). The experiment was conducted in a room containing three racks, each rack containing four shelves, each shelf accommodating up to twenty-seven beakers, and each beaker housing a single tadpole. Shelf was a natural blocking factor since there was concern that light levels, proximity to the door, and/or other characteristics that differ between shelves might impact the response. Therefore, space for 324 (=3 racks x 4 shelves/rack x 27 beakers/shelf) experimental units was available, which allowed for approximately 20 (≈324/16) replicates of each treatment. A complementary problem would be the situation where there was room for fewer experimental units than treatments on each shelf; say only 10 beakers could fit on each shelf while 16 treatments were to be assigned. In some block-treatment configurations, this leads to incomplete block designs. This situation can also be addressed using the techniques in this paper; however, for the remainder of this paper, we focus on the situation where a block can hold more experimental units than the number of specified treatments.

This situation presents an interesting experiment design problem. In particular, how should we assign 16 treatments to 324 experimental units that are arranged in 12 blocks of size 27? A completely randomized design is inappropriate because of the block structure imposed by the shelves. A randomized complete block design (RCBD) could be considered if we only wanted to use 16 units per shelf. However, this clearly wastes an opportunity to collect more data, since there is a capacity of 27 experimental units on each shelf but only 16 treatments, and the limiting factor is not the overall number of tadpoles but the physical restrictions of the experimental environment. Balanced or partially balanced incomplete block designs are more promising, except that they are typically employed when the number of treatments exceeds the size of the blocks. In this case, the number of experimental units per block is larger than the number of treatments, but not in a way that divides evenly. One might even consider splitting the design in two but this reduces the power of the experiment to detect differences in treatments, and would still involve designing an experiment for which the size of the blocks does not nicely divide the number of treatments. An intuitive solution would involve assigning the treatments to blocks in a cyclic way. The statistical performance of this design is unclear, but we explore such an approach later in this article.
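
To make the cyclic idea concrete, the following R sketch gives an illustrative construction of our own (not the exact cyclic design evaluated later in the article): the 16 treatment labels are cycled while filling the 12 shelves of 27 beakers.

# A minimal sketch of a cyclic assignment of 16 treatments to 12 shelves of
# 27 beakers each (illustrative construction only).
n_trt    <- 16
n_blocks <- 12
blk_size <- 27

treatment <- rep(1:n_trt, length.out = n_blocks * blk_size)   # 324 labels, cycling through 1-16
cyclic    <- data.frame(Shelf     = rep(1:n_blocks, each = blk_size),
                        Treatment = factor(treatment))

table(cyclic$Shelf, cyclic$Treatment)   # treatment replication within each shelf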

We propose D-optimal designs as a solution to this problem. These designs are based upon a well-known, flexible design construction criterion developed in the statistics literature (see [1] and [2]) that chooses a design to minimize the determinant of the variance-covariance matrix of the regression parameter estimates. In other words, such a design estimates these parameters precisely. To obtain these designs, the experimenter must specify the nature of the predictors (i.e. whether they are continuous or categorical, and if categorical how many levels they possess), the form of the statistical model to be fit (e.g. which predictors will appear in the model, as well as interactions), the number of observations, and the blocking structure of the design. The designs can then be constructed, via algorithm, in common commercial software.
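
As an illustration of how such a construction might proceed, the sketch below uses the optBlock() function from the freely available AlgDesign package in R (our choice for illustration; commercial packages offer comparable routines, and this is not necessarily the software used for the design reported here) to search for a D-optimal allocation of the 16 treatments to 12 blocks of 27 units.

# A minimal sketch (not the authors' exact construction) of building a
# D-optimal blocked design in R with the AlgDesign package.
library(AlgDesign)

# Candidate treatments: 2 pesticide levels crossed with 8 timing levels
candidates <- expand.grid(Pesticide = factor(c("absent", "present")),
                          Timing    = factor(1:8))

# Replicate the 16 treatments enough times to fill 12 shelves x 27 beakers
cand_full <- candidates[rep(seq_len(nrow(candidates)), times = 21), ]

# Search for a D-optimal allocation under the main-effects model; the formula
# could be changed to ~ Pesticide * Timing if the interaction model is of
# primary interest.
set.seed(2023)
d_opt <- optBlock(~ Pesticide + Timing,
                  withinData = cand_full,
                  blocksizes = rep(27, 12),
                  criterion  = "D")

# d_opt$Blocks lists the treatments assigned to each shelf
str(d_opt$Blocks, max.level = 1)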

To demonstrate the flexibility of this approach, consider again the motivating experiment, which had 16 treatments but blocks of size 27. The D-optimal design, and indeed any design for this experiment, will indicate which treatments should be included in each block. The D-optimal approach, however, works for any treatment/block size combination. For instance, instead of 16 treatments and 12 blocks of size 27, we might, in a different situation in which resources are scarce, only have 6 blocks of size 11. The possible situations that preclude the routine use of standard blocking designs are endless, but the D-optimal approach would provide designs with good statistical properties regardless.
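
Continuing the earlier sketch (and under the same assumptions), accommodating the resource-scarce scenario just described only requires changing the block specification.

# Same candidate treatments as before, now allocated to 6 blocks of size 11
d_opt_small <- optBlock(~ Pesticide + Timing,
                        withinData = candidates[rep(1:16, times = 5), ],
                        blocksizes = rep(11, 6),
                        criterion  = "D")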

Optimal designs are often applied in industrial statistics settings, as alternatives to response surface designs, where process or product design and improvement is the objective. Most of the applications of optimal design to the chemical and biological sciences have been in the context of nonlinear regression models (environmental toxicology [3]; systems biology [4,5,6]; microbiology [7]; and environmental engineering [8]), though chemometrics applications [9,10] use it in the context of linear regression. More on optimal design can be obtained in the aforementioned books [1,2] as well as in an environmetrics context [11].

In this article, we describe in more detail this optimal approach to experiment design. Specifically, we define and discuss the benefits of D-optimal designs and show how the D-criterion can be used to compare designs. Finally, we conduct a small computer simulation study to compare the D-optimal design with an alternative design in terms of power to detect effects of interest.

METHODS

In this section, we first provide additional background for the motivating example in environmental toxicology and describe the statistical model of interest. We then relate the model to more general regression models, which are described in preparation for the definition of the D-criterion and D-optimal designs. Subsequently, we describe D-optimal designs along with other design possibilities for the toxicology experiment. Finally, we describe a simulation experiment that was conducted to compare alternative designs.

Statistical model for analyzing the impact of treatments on mass at metamorphosis

The mass at metamorphosis of tadpoles may be affected by the presence of a pesticide. Further, tadpoles may be simultaneously exposed to a pathogen at a specific time prior to or at metamorphosis. Each combination of pesticide concentration and timing of pathogen exposure defines a unique treatment that might impact the biological response, mass at metamorphosis. Evaluating the effects of these two factors as well as their interaction on a biological endpoint is the ultimate goal of such a study.

Linear statistical models are a standard tool for detecting differences among treatments. In this case, we wish to evaluate the effect of a pesticide and timing of pathogen introduction on mass at metamorphosis. If we assume no interaction between the two factors, an appropriate model is as follows:

Massijk = μ + Pesticidei + Timingj + Blockk + εijk (1)

where εijk~N(0, σε2) and

Massijk = Response for tadpole k exposed to pesticide level i for pathogen timing level j

μ = Overall mean

Pesticidei= Pesticide effect for i= present, absent

Timingj= Timing of pathogen exposure effect for j=1(=no pathogen), 2, 3, 4, 5, 6, 7, 8 (=pathogen introduced late)

Blockk = Block/shelf effect for k = 1, 2, …, 12

εijk = Random Error term

The response variable, Mass, is measured at the metamorphosis of the tadpole. As an aside, when estimating the model some restrictions need to be imposed, e.g. Pesticideabsent = Timing1 = Block1 = 0. Then the overall mean, μ, represents the mean mass at metamorphosis for a tadpole that was not exposed to a pesticide, was not exposed to a pathogen, and was in a beaker located in the first block. Therefore, all effects are interpreted in relation to these reference levels of the two factors and block. The various times of pathogen introduction prior to metamorphosis are labeled 1-8, where “1” signifies no pathogen was introduced and “2” is the time of introduction where these tadpoles have been exposed to the pathogen for the shortest amount of time. The random error term, εijk, is assumed to be independently normally distributed with mean zero and constant variance σε2.

The second model is an extension of the first, this time assuming an interaction effect exists between the factors. The model is as follows:

Massijk = μ + Pesticidei + Timingj + Blockk + Pesticidei*Timingj + εijk (2)

where the Pesticidei*Timingj term represents the interaction effect between pesticide and timing levels. The lm() function in R (R Core Team 2013, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org/) was used to fit these models to data simulated as explained below.
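
For concreteness, a minimal sketch of these model fits is shown below, assuming a simulated data frame sim_dat (a hypothetical name) with a numeric Mass column and factor columns Pesticide, Timing, and Block; R's default treatment contrasts impose the reference-level constraints described above.

# Sketch of fitting models (1) and (2) with lm(); sim_dat is a hypothetical
# simulated data set with columns Mass, Pesticide, Timing, and Block.
fit1 <- lm(Mass ~ Pesticide + Timing + Block, data = sim_dat)  # model (1): no interaction
fit2 <- lm(Mass ~ Pesticide * Timing + Block, data = sim_dat)  # model (2): with interaction

# The pesticide-by-timing interaction can be assessed by comparing the fits
anova(fit1, fit2)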

Relating the statistical model to a design

We note that models (1) and (2) can be reformulated as regression models in matrix form. We present this version of the model because it is necessary for the subsequent explanation of D-optimal design. However, we give more details in Appendix 1; see also [12] for additional background. The general form of the model is

Y = Xβ + ε (3)

where Y is an n×1 vector of responses (n is the total number of observations), β is a p×1 vector of model parameters, X is an n×p model matrix, and ε is an n×1 vector of error terms such that ε~N(0, σ2I), with 0 representing the n×1 vector of 0’s and I the n×n identity matrix. The matrix X is also called the design matrix, and each row of this matrix corresponds to a replicate of a unique treatment-block combination. The optimal design problem can be framed as deciding which rows should be included in the design matrix. Using ordinary least squares to estimate β gives the vector of estimates β̂ = (XTX)-1XTY. This formulation is particularly useful in designing experiments because the variance-covariance matrix of β̂ is σ2(XTX)-1, and we call XTX the information matrix. Minimizing the generalized variance of these estimators amounts to minimizing the determinant of their variance-covariance matrix, which is equivalent to maximizing the determinant of the information matrix.
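
As a small illustration, the following R sketch computes the information matrix and its log determinant for a candidate design stored in a data frame named cand_design (a hypothetical name) with factor columns Pesticide, Timing, and Block.

# Sketch: evaluating the (log) determinant of the information matrix for a
# candidate design; larger values indicate more precise parameter estimates.
X    <- model.matrix(~ Pesticide + Timing + Block, data = cand_design)  # n x p model matrix
info <- crossprod(X)                                                    # information matrix X'X
# Maximizing det(X'X) is equivalent to minimizing det((X'X)^-1), the
# D-criterion; the log scale is used for numerical stability.
log_det_info <- as.numeric(determinant(info, logarithm = TRUE)$modulus)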

Rows of the design matrix X reflect a replicate of a particular combination of treatments (pesticide, timing of pathogen) and block (shelf), and the design implemented by the experimenter determines the rows of this matrix. This is a key observation, since the rows (i.e., the conditions run in the experiment) can be chosen to minimize the determinant of the variance-covariance matrix of β̂ (i.e. the D-criterion), which results in relatively small variances for estimates of the model parameters. We will revisit this in the subsequent section discussing experiment designs. More details regarding how model (3) is equivalent to models (1) and (2) are given in Appendix 1.

Here, we have a slightly more complicated situation because we wish to make inference on just the Pesticide and Timing factors; the blocks appear in the model, but inference on them is not of interest. Because this is the case, we split the matrix X into two smaller matrices: X1 is n×p1 and includes the columns in X corresponding to the intercept and pesticide/timing factors, and X2 is n×p2 and corresponds to the columns in X associated with the blocks (p1 is the number of parameters associated with the intercept and treatment factors, and p2 is the number of parameters associated with blocks; see Appendix 1). Then X can be partitioned as X = [X1 | X2], where the vertical bar “|” denotes the concatenation of the two matrices; in a similar way, β is partitioned into β1, a p1×1 vector, and β2, a p2×1 vector. In our experiment, the total number of parameters for the first-order model is p=20 (1 for the intercept β0, 1 for the pesticide factor, 7 for the timing factor, and 11 for the blocks), so that p1=9 and p2=11 (again, see Appendix 1). Given the above development, we can rewrite (3) as