Major Prey Index: Procedure for Metric Calculations and Determination of Major Prey Taxa
Contact Joseph J. Bizzarro with any questions or comments.
Major Prey Index (MPI) calculations were automated through the use of R scripts, and a sample data set is included to provide users with tractable results for reference during their own computations. Three files are provided: 1) the species-specific, high-resolution diet composition data set that was used in the manuscript (“Pacific_Coast_groundfish_diet_composition.xls”), 2) an R script that runs the steps necessary to calculate the MPI and the associated determination of Major Prey taxa (“calculate_major_prey_index.R”), and 3) an R script (“run_this_code.R”) that installs the packages and libraries necessary for the calculations and runs the MPI calculations script.
Open the “Pacific_Coast_groundfish_diet_composition.xls” file to use for reference. To input your own data, you will need to create an Excel spreadsheet that meets three requirements: 1) the first two columns must be string columns (e.g., scientific name and common name, species and sex), 2) the third column must list relative data quality, as described in the manuscript, on a scale of 0–1 (if data are unranked, enter “1” for each row in this column), and 3) columns 4 and greater must contain diet composition data as a percentage. There is no limit on the number of rows (one per set of descriptive information, e.g., scientific name and common name) or columns (one per prey category), except those imposed by Excel.
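The three layout requirements can be checked in R before running the MPI scripts. The following is a minimal sketch, not part of the provided scripts: the function name check_diet_input and the example values are hypothetical, and the check runs on a data frame that has already been read into R (reading the Excel file itself is left out, since it depends on the package used).

```r
# Hypothetical helper (not part of the provided scripts): returns a character
# vector of layout problems, or an empty vector if the layout looks valid.
check_diet_input <- function(diet) {
  stopifnot(is.data.frame(diet), ncol(diet) >= 4)
  problems <- character(0)

  # Requirement 1: the first two columns are descriptive string columns
  if (!all(vapply(diet[1:2], is.character, logical(1)))) {
    problems <- c(problems, "columns 1-2 must be text")
  }

  # Requirement 2: the third column is relative data quality on a 0-1 scale
  quality <- diet[[3]]
  if (!is.numeric(quality) || anyNA(quality) || any(quality < 0 | quality > 1)) {
    problems <- c(problems, "column 3 must be numeric between 0 and 1")
  }

  # Requirement 3: columns 4 and greater are diet composition percentages
  prey <- diet[4:ncol(diet)]
  if (!all(vapply(prey, is.numeric, logical(1)))) {
    problems <- c(problems, "columns 4+ must be numeric")
  } else if (any(prey < 0 | prey > 100, na.rm = TRUE)) {
    problems <- c(problems, "columns 4+ must be percentages (0-100)")
  }

  problems
}

# Made-up example rows, for illustration only
example_diet <- data.frame(
  scientific_name = c("Predator A", "Predator B"),
  common_name     = c("predator a", "predator b"),
  quality         = c(1, 0.5),
  prey_1          = c(60, 10),
  prey_2          = c(40, 90),
  stringsAsFactors = FALSE
)

check_diet_input(example_diet)
```

An empty result means that none of the three requirements was violated; each violated requirement otherwise appears as one message.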
R is free statistics software that compiles and runs on a variety of Windows, Mac, and UNIX platforms. Access the R website, and download and install the proper software for your computer. Once installed, double-click on the R program icon to open it.
Set your working directory so that R can locate the three files that are needed to run the MPI calculations. Type “getwd()” at the prompt in R to determine your current working directory. You can either move your files to that directory or establish a different working directory. For example, if you wanted to put the files in a folder on your Desktop called “MPI” and your current working directory is “C:/Users/username/Documents”, you would direct R to the Desktop folder by inputting this string: setwd("C:/Users/username/Desktop/MPI").
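As a short sketch of this step, the commands below show the current working directory, switch to the example folder if it exists on your computer, and list any of the three files that R cannot find. The variable names (mpi_dir, needed, missing_files) are illustrative only; replace the example path with your own.

```r
# Show the current working directory
getwd()

# Example path from the text above; replace with your own folder
mpi_dir <- "C:/Users/username/Desktop/MPI"
if (dir.exists(mpi_dir)) {
  setwd(mpi_dir)
}

# Confirm that the three provided files are visible from the working directory
needed <- c(
  "Pacific_Coast_groundfish_diet_composition.xls",
  "calculate_major_prey_index.R",
  "run_this_code.R"
)
missing_files <- needed[!file.exists(needed)]
if (length(missing_files) > 0) {
  message("Not found in ", getwd(), ": ", paste(missing_files, collapse = ", "))
}
```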
Open “run_this_code.R” in any program that opens text documents (e.g., Notepad, Microsoft Word). Copy the entire contents of the script and paste them into R at the (>) prompt.
You may need to specify a CRAN mirror (i.e., software distribution site). If so, choose the CRAN mirror that is closest to your location.
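If you prefer to set the mirror yourself before pasting the script, one option is shown below. This is a sketch and is not taken from the provided scripts; the URL is the general-purpose CRAN address that redirects to a nearby server, and the base R function chooseCRANmirror() can be used instead to pick a mirror from a menu.

```r
# Set the CRAN mirror for the current R session
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Confirm the setting
getOption("repos")
```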
The script will calculate the five diet metrics (Mean = Mean Diet Composition, Median = Median Diet Composition, PSA = Prey-Specific Abundance, Min = Minimum Diet Contribution, and FO = Frequency of Occurrence), then scale and rank the metrics, and compare them to determine their degree of correlation. The minimum diet contribution used in the published study was 20%; however, this number can be adjusted by the user to best reflect a substantial dietary contribution based on the specifics of the data set (e.g., number of prey categories, diet breadth). To adjust the minimum diet contribution value, modify the default entry (minDiet <- 20) in the “calculate_major_prey_index.R” script.
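To give a sense of what the five metrics measure, the sketch below computes them for a small made-up data set. It is an illustration under stated assumptions and not the code in the provided script: PSA is taken as the mean percentage among only those predators in which the prey occurs, Min is assumed to be the percentage of predators in which the prey contributes at least minDiet percent, and the data quality column is ignored. Refer to the manuscript for the definitions actually used.

```r
minDiet <- 20  # minimum diet contribution (%), same default as in the text

# Made-up percentages: rows are predators, columns are prey categories
diet_pct <- matrix(
  c(50, 30, 20,
    80,  0, 20,
     0, 90, 10,
    25, 25, 50),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("predator_", 1:4), c("prey_a", "prey_b", "prey_c"))
)

metrics <- data.frame(
  Mean   = colMeans(diet_pct),                               # mean % across all predators
  Median = apply(diet_pct, 2, median),                       # median % across all predators
  PSA    = apply(diet_pct, 2, function(x) mean(x[x > 0])),   # ASSUMPTION: mean % where prey occurs
  Min    = 100 * colMeans(diet_pct >= minDiet),              # ASSUMPTION: % of predators at or above minDiet
  FO     = 100 * colMeans(diet_pct > 0)                      # % of predators in which prey occurs
)

metrics
```

Changing minDiet affects only the Min column in this sketch, which is why the threshold should reflect what counts as a substantial contribution for your data set.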
Pearson Correlation Coefficients and associated p-values are calculated and presented. The user must then determine which metrics to advance for MPI calculations as indicated at the prompt. Highly correlated metrics (r > 0.70 and P < 0.05 with two or more metrics) provide redundant information and should be removed prior to this analysis. In the example data set, you will see that Mean Diet Composition is highly correlated with three metrics, but all other metrics are either not highly correlated or highly correlated with only one other metric (Min and PSA). If using this data set, advance these metrics: Median, PSA, Min, and FO.
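The screening rule can be sketched as follows. This is not the code in the provided script: the function name screen_metrics, its arguments, and the toy values are hypothetical. It uses the base R function cor.test to obtain Pearson r and p-values for every pair of metrics, then counts, for each metric, how many other metrics it is highly correlated with (r > 0.70 and p < 0.05).

```r
# Hypothetical helper: pairwise Pearson correlations among metric columns,
# plus the number of "high" correlations per metric.
screen_metrics <- function(metrics, r_cut = 0.70, p_cut = 0.05) {
  m <- ncol(metrics)
  r <- matrix(NA_real_, m, m, dimnames = list(names(metrics), names(metrics)))
  p <- r
  for (i in seq_len(m - 1)) {
    for (j in (i + 1):m) {
      ct <- cor.test(metrics[[i]], metrics[[j]], method = "pearson")
      r[i, j] <- r[j, i] <- unname(ct$estimate)
      p[i, j] <- p[j, i] <- ct$p.value
    }
  }
  # The diagonal stays NA, so a metric is never compared with itself
  high <- !is.na(r) & r > r_cut & p < p_cut
  n_high <- rowSums(high)
  list(r = r, p = p, n_high = n_high, flagged = names(metrics)[n_high >= 2])
}

# Toy values, for illustration only: A, B, and C rise together; D does not
x <- 1:10
toy <- data.frame(
  A = x,
  B = x + rep(c(0.5, -0.5), 5),
  C = 2 * x + rep(c(-1, 0, 1, 0, 0), 2),
  D = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
)

res <- screen_metrics(toy)
res$n_high   # number of high correlations per metric
res$flagged  # metrics highly correlated with two or more others
```

The flagged metrics are only candidates for removal; which metrics to advance remains your decision at the prompt.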
MPI values are calculated, and Major Prey taxa are determined through a randomization test (9999 draws) that incorporates the ranked values among the advanced metrics, including tied ranks. The ranked values among metrics constitute the data set of observed values to draw from, instead of drawing from all values (1–47, given the 47 prey categories used in the example data set). Because the expected values generated from the randomization test could fall outside of the observed upper or lower limits (due to tied ranks in the observed data set), they are then scaled by the observed values to create the final null (expected) data set.
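The general idea of drawing from the observed ranks can be sketched as below. This is a simplified illustration and not the procedure in the provided script. The following are assumptions made for the sketch: the raw MPI is the sum of a prey taxon's ranks across the advanced metrics, a rank of 1 is the most important, Major Prey are the taxa whose raw MPI falls at or below the lower pCutoff quantile of the null values, and the scaling step described above is omitted. The rank values are made up.

```r
set.seed(1)      # for a repeatable illustration
nDraws  <- 9999  # number of draws, as in the text
pCutoff <- 0.05  # default probability value, as in the text

# Made-up ranks (ASSUMPTION: 1 = most important) for six prey taxa on three
# advanced metrics; tied ranks are included as averaged values
obs_ranks <- cbind(
  Median = c(1, 2, 3.5, 3.5, 5, 6),
  PSA    = c(2, 1, 3, 5, 4, 6),
  FO     = c(1, 3, 2, 4.5, 4.5, 6)
)
rownames(obs_ranks) <- paste0("prey_", 1:6)

# ASSUMPTION: raw MPI is the sum of ranks across metrics
obs_mpi <- rowSums(obs_ranks)

# Null values: for each draw, take one observed rank at random from each
# metric (so only ranks that actually occur can be drawn) and sum them
null_mpi <- replicate(
  nDraws,
  sum(apply(obs_ranks, 2, function(col) sample(col, 1)))
)

# Cut-off associated with the chosen probability, and the resulting Major Prey
cutoff <- quantile(null_mpi, probs = pCutoff, names = FALSE)
major  <- names(obs_mpi)[obs_mpi <= cutoff]
```

Drawing from the observed ranks rather than from all integers keeps tied values such as 3.5 or 4.5 in the null distribution, which is the point made in the paragraph above.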
A series of output files is written to the working directory: 1) out_metrics.xls (contains diet metric calculations and ranks), 2) out_corr_rho.xls (contains results (rho values) of Pearson Correlation tests among metrics), 3) out_corr_pval.xls (contains p-values associated with rho values), 4) out_mpi.xls (contains relative ranks for each prey taxon among each forwarded diet metric, raw MPI ranks, scaled MPI ranks, and the determination of which prey taxa are Major Prey), and 5) out_mpi_rand (contains the expected distribution of MPI values among all possible rank combinations from the provided data set). The MPI (cut-off) value associated with a 5% probability of occurrence is indicated as the final output of the program. To adjust the default probability value, modify the appropriate entry (pCutoff <- 0.05) in the “calculate_major_prey_index.R” script.