© 2009 Amanda McDonald
Professor White
Anthropology 174AW
10 December 2009
A Cross-Cultural Study of Rape and Analysis of Sanday’s Article
Within our society, it is a common misconception that rape is an act caused by uncontrollable lust, instinctual sexual urges and the need for sexual gratification. Rape is defined as an assault involving sexual intercourse with or without sexual penetration of another person without their consent (“Rape,” Wikipedia: The Free Encyclopedia). Biologist Randy Thornhill and anthropologist Craig Palmer argue for this “instinctual” theory in their book “A Natural History of Rape,” which describes rape as “a natural, biological phenomenon that is a product of the human evolutionary heritage.” However, the idea that human behavior is shaped and defined by cultural norms and patterns is a widely accepted theoretical position in the cultural anthropological community, and it challenges Thornhill and Palmer’s assertions. Anthropologist Malinowski describes human sexual behavior as “rather a sociological and cultural force than a mere bodily relation of two individuals” (Malinowski 1929, xxiii). The misconception described above is therefore a “rape myth” that ignores the underlying social and cultural forces that lead to rape. The next question is: what are the socio-cultural factors and patterns that lead to rape?
Anthropologist Peggy Sanday has done extensive work on the cross-cultural study of rape. Her article, “The Socio-Cultural Context of Rape: A Cross-Cultural Study,” provides a profile of both “rape-prone” and “rape-free” societies as well as an analysis of the socio-cultural factors that relate to the incidence of rape. Sanday’s article uses the standard cross-cultural sample to analyze rape committed by males against females in tribal societies, with a sample size of 156 societies. Sanday describes a “rape-prone” society as “one in which incidence of rape is high, rape is a ceremonial act or rape is an act in which men punish or threaten women.” She defines “rape-free” societies as “societies where the act of rape is infrequent or does not occur.” Sanday tests four general hypotheses. The first is that sexual repression is related to the incidence of rape. The second is that intergroup and interpersonal violence is enacted in male sexual dominance. The third is that the character of parent-child relations is enacted in male sexual violence, and the fourth is that rape is an expression of a social ideology of male violence.
Sanday tests sixteen variables: four related to sexual repression, four related to intergroup and interpersonal violence, three related to child rearing and five related to the ideology of male dominance. Sanday’s results show that all of the hypotheses tested were significant with the exception of the hypothesis of sexual repression. Sanday attributes this lack of significance to the fact that sexual repression is very difficult to measure and that the variables may not accurately relate to sexual abstinence. Although some correlation was found between parent-child relations and rape, it was weak at best. Sanday’s results showed interpersonal violence and the ideology of male toughness to be highly significant.
Sanday concluded her article by stating, “Rape in tribal societies is a part of a cultural configuration that includes interpersonal violence, male dominance and sexual separation.” She concludes that when there is an imbalance in the value of characteristics associated with maleness and femaleness, men are elevated to a higher role. Sexual violence is then a way to assert male superiority, which leads Sanday to conclude, “Rape is a struggle for control in the face of difficult circumstances” (Sanday, 106). Sanday goes on to apply her findings from tribal societies to our own culture and points out that violence is programmed, not a part of male nature. She argues that to prevent future rape in our own society, young boys must be taught to respect women and women must gain this respect through their struggle for equal rights.
In cross-cultural research there are two significant problems: missing data and network autocorrelation effects, the latter also known as Galton’s problem. Galton’s problem is “the problem of drawing inferences from cross-cultural data, due to the statistical phenomenon of autocorrelation” (“Galton’s Problem,” Wikipedia). Eff and Dow have developed a methodology to deal with these two problems using multiple imputation and a new approach based on multivariate regression. Eff and Dow’s multiple imputation method uses auxiliary data to impute missing values, producing several completed data sets. Statistical models are then estimated on each of these data sets, and the parameter estimates are combined according to a set of rules. Eff and Dow created a set of auxiliary data for the Standard Cross-Cultural Sample and introduced two new R programs to be used.
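The combining step in multiple imputation can be illustrated with a minimal sketch. It assumes the combining rules are the standard ones usually attributed to Rubin (pooled estimate = mean across imputations; total variance = within-imputation variance plus an inflated between-imputation variance). The function name and the numbers are hypothetical and are not taken from Eff and Dow’s programs.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool one coefficient across m imputed data sets.

    Assumes the standard multiple-imputation combining rules:
    the pooled estimate is the mean of the per-imputation estimates,
    and the total variance adds the average within-imputation variance
    to (1 + 1/m) times the between-imputation variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)          # average squared standard error
    between = variance(estimates)     # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical coefficient estimates and their variances from three imputations.
pooled, total = pool_estimates([0.30, 0.34, 0.32], [0.010, 0.012, 0.011])
print(pooled, total)
```

The between-imputation term is what carries the extra uncertainty caused by the missing data: the more the imputed data sets disagree, the larger the pooled variance becomes.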
The first program is the unrestricted model (xUR), which uses auxiliary data to create imputed datasets with variables from the SCCS. The xUR results provide a p-value for each estimated coefficient, which indicates which independent variables can be dropped. Any variable whose p-value is at or above the cutoff of 0.10 is then excluded from the restricted model (xR). The second program, the restricted model, uses these data sets to estimate a two-stage least-squares (instrumental variables) regression, which helps control for Galton’s problem. Estimates are produced from each imputed dataset and then combined. The xR results are provided in three objects: bbb (coefficients with p-values and VIFs); ccc (diagnostics with p-values); and r2 (R2 for the final model and for each of the models creating the instrumental variables) (Eff and Dow). These results enable us to see the relationship between the dependent variable and the independent variables and their significance, if any.
For my cross-cultural analysis of rape, I took five of Sanday’s sixteen variables, as well as three additional variables of my own, and ran them through R using modified versions of Eff and Dow’s xUR and xR programs and their methodology for multiple imputation (MI). I used Peggy Sanday’s variable 667 as my dependent variable:
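The idea behind two-stage least squares can be sketched in a few lines. This is a generic, single-instrument illustration with made-up numbers, not Eff and Dow’s xR program: stage one regresses the problematic regressor on an instrument, and stage two regresses the outcome on the fitted values from stage one. All names and data below are hypothetical.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of a least-squares fit of y on one regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """Generic 2SLS with one instrument z for one regressor x."""
    # Stage 1: predict the regressor from the instrument.
    a0, a1 = ols_simple(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    # Stage 2: regress the outcome on the predicted regressor.
    return ols_simple(x_hat, y)


# Hypothetical toy data: z is the instrument, x the regressor, y the outcome.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3]
y = [2.0, 3.1, 4.2, 4.8, 6.1, 7.4]

intercept, slope = two_stage_least_squares(z, x, y)
print(intercept, slope)
```

With a single instrument, the stage-two slope equals cov(z, y) / cov(z, x), which is a convenient check that the two stages were chained correctly.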
667. Rape: Incidents reports, or thought of as means of punishment women, or part of ceremony.
91 . = Missing data
45 1 = Absent
50 2 = Present
Then I used the following independent variables to test my hypothesis:
240. POST-PARTUM SEX TABOOS
242. SEGREGATION OF ADOLESCENT BOYS
300. Aggression: Late Boy
664. Ideology of Male Toughness
78 . = Missing data
21 1 = Absent
87 2 = Present
665. Male Segregation: One or more places where males congregate alone, or males occupy a separate part of the household, or there is sharp ceremonial segregation of the sexes.
666. Moderate or Frequent Interpersonal Violence
55 . = Missing data
43 1 = Absent
88 2 = Present
668. At least some Wives taken from Hostile Groups
672. Male Avoidance of Female Sexuality
With the new programs developed by Eff and Dow, I predicted that the variables shown to be significant in Sanday’s results would prove to be insignificant once autocorrelation (Galton’s problem) was separated from causal effects by the newly developed method of two-stage OLS regression.
Below are the results from my first xUR model, in which I chose variables that I felt would have a correlation with rape.
> bbb
2SLS model for rape
coef Fstat ddf pvalue VIF
(Intercept) 2.562 0.336 167.139 0.563 NA
fyll -0.779 0.288 1083.896 0.592 4.020
fydd 0.448 0.287 2671.850 0.592 3.012
dateobs 0.000 0.043 128.399 0.836 3.595
cultints 0.085 0.727 966.792 0.394 9.227
roots 0.220 0.259 435.561 0.611 9.420
cereals 0.331 0.587 477.379 0.444 12.597
gath 0.012 0.025 1614.886 0.874 4.748
plow 0.217 0.304 814.607 0.581 4.793
hunt 0.063 0.545 4253.794 0.460 7.640
fish -0.010 0.018 440.129 0.893 5.969
anim 0.031 0.137 2549.142 0.711 10.677
pigs -0.153 0.305 1748.048 0.581 3.814
milk 0.103 0.086 3238.293 0.769 6.897
bovines -0.201 0.388 3475.813 0.533 5.909
tree 0.444 0.707 347.629 0.401 6.841
foodtrade 0.005 0.447 3496.470 0.504 1.789
foodscarc 0.016 0.067 338.090 0.796 1.687
ecorich -0.029 0.120 317.220 0.729 3.458
popdens -0.015 0.034 11862.371 0.853 5.462
pathstress -0.026 0.788 11684.485 0.375 3.863
exogamy -0.015 0.043 919.170 0.836 2.224
ncmallow -0.025 0.510 294.359 0.476 2.136
famsize -0.068 0.033 2624.650 0.856 68.830
settype 0.034 0.257 489.409 0.613 8.019
localjh -0.022 0.016 649.662 0.900 3.094
superjh -0.155 2.114 1105.436 0.146 4.040
moralgods -0.055 0.392 541.527 0.532 2.699
fempower 0.005 0.010 542.865 0.919 2.131
femsubs 0.013 0.036 7385.257 0.850 3.261
sexratio -0.107 0.431 53.240 0.514 2.186
war 0.009 0.397 877.498 0.529 2.897
himilexp -0.080 0.190 419.162 0.663 2.428
money 0.058 0.704 1884.435 0.402 2.744
wagelabor 0.036 0.078 31.348 0.782 2.120
migr 0.052 0.054 63.238 0.816 2.556
brideprice -0.082 0.140 390.846 0.708 3.323
nuclearfam -0.258 1.057 379.763 0.305 3.974
pctFemPolyg 0.003 0.777 229.551 0.379 2.663
nonmatrel 0.046 0.207 1436.922 0.649 1.961
lrgfam 0.003 0.000 1891.865 0.983 66.072
malesexag 0.083 1.887 66.098 0.174 1.922
segadlboys 0.001 0.000 116.689 0.992 2.308
agrlateboy -0.001 0.001 724.970 0.977 2.625
> r2
R2:final model R2:IV(distance) R2:IV(language)
0.3827469 0.9280168 0.9421643
> ccc
Fstat df pvalue
RESET 1.817 1838.106 0.178
Wald on restrs. -0.002 116384.622 1.000
NCV 0.622 3242.557 0.430
SWnormal 2.379 141.466 0.125
lagll 0.956 1332831.121 0.328
lagdd 1.200 24711.297 0.273
None of these results were significant. Several VIFs were extremely high, meaning the significance could not be calculated reliably. In addition, this model included variable v.80 “lrgfam,” which is redundant with Eff and Dow’s variable v68 “famsize” and therefore should not have been included, as the high VIFs for both variables signify. However, although I was not able to find any significance among these variables, I was able to eliminate many independent variables.
After failing to find significance between my dependent variable, rape, and my handpicked independent variables, I began researching the work of other scholars who had done similar studies. It was then that I discovered Sanday’s study and set out to see whether the results she had obtained using descriptive statistical methods would hold up when the same variables were analyzed with inferential statistical methods. Below are my xUR model results using Sanday’s variables:
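As a small illustration of how the VIF column can be screened, the sketch below flags variables from the first xUR table whose VIF exceeds 10. The VIF values are copied from the table above; the cutoff of 10 is a common rule of thumb and an assumption here, not a threshold stated by Eff and Dow.

```python
# VIF values copied from the first xUR output (a subset of the rows).
vifs = {
    "cultints": 9.227,
    "roots": 9.420,
    "cereals": 12.597,
    "anim": 10.677,
    "famsize": 68.830,
    "settype": 8.019,
    "war": 2.897,
    "lrgfam": 66.072,
}

# Assumed rule-of-thumb cutoff; not taken from the source.
VIF_CUTOFF = 10.0


def flag_high_vif(table, cutoff=VIF_CUTOFF):
    """Return variable names with VIF above the cutoff, largest first."""
    flagged = [name for name, vif in table.items() if vif > cutoff]
    return sorted(flagged, key=lambda name: table[name], reverse=True)


high_vif = flag_high_vif(vifs)
print(high_vif)
```

The two largest values belong to famsize and lrgfam, which is consistent with the two variables being redundant with each other.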
> bbb
2SLS model for rape = SCCS$v668
ideomaltough v664. Ideology of Male Toughness
78 . = Missing data
21 1 = Absent
87 2 = Present
freintovio v666. Moderate or Frequent Interpersonal Violence
55 . = Missing data
43 1 = Absent
88 2 = Present
wivhotgr v668. At least some Wives taken from Hostile Groups
55 . = Missing data
84 1 = Absent
47 2 = Present
segadlboys v242. SEGREGATION OF ADOLESCENT BOYS
29 . = Missing data
108 1 = Absence
19 2 = Partial
8 3 = Complete, with relatives outside nuclear family
4 4 = Complete, with non-relatives
18 5 = Complete, with peers
coef Fstat ddf pvalue VIF
(Intercept) 1.248 1.337 3668.645 0.248 NA
fyll 0.044 0.004 2392.471 0.951 1.426 <-- no language clustering
fydd -0.560 1.249 1199.506 0.264 1.486 <-- no distance clustering
segadlboys -0.002 0.002 311.539 0.967 1.367
agrlateboy 0.015 0.255 319.289 0.614 1.295
postsextab -0.027 0.359 85.438 0.551 1.653
freintovio 0.282 4.620 262.522 0.033 1.402 <-- significant
ideomaltough 0.291 3.535 153.221 0.062 1.513 <-- significant
maleseg 0.043 0.116 625.197 0.734 1.326
postsextab2 -0.026 0.365 116.080 0.547 1.648
mavofemsex -0.016 0.061 212.265 0.805 1.148
wivhotgr 0.121 1.120 683.239 0.290 1.295
> r2
R2:final model R2:IV(distance) R2:IV(language)
0.2734577 0.9461911 0.9609407 <-- excellent R2
> ccc
Fstat df pvalue
RESET 0.195 1519.599 0.659 <-- good
Wald on restrs. -0.051 207.302 1.000 <-- good
NCV 0.174 265.848 0.677 <-- good
SWnormal 7.462 295.726 0.007
lagll 0.366 132466.550 0.545 <-- good
lagdd 0.503 127523.692 0.478 <-- good
From the results above, I found that two variables showed potential significance, “freintovio” (v.666) and “ideomaltough” (v.664). The variable “wivhotgr” (v.668) had shown significance before, but no longer did, so I eliminated it from my xR model. There was no language or distance clustering and the R2 was very good, so I moved forward with my xR model and produced the following results:
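The selection step can be written out as a short sketch. The p-values are copied from the xUR table above, and the 0.10 cutoff is the one described earlier; the function name is hypothetical. The network lag term fydd is treated as a control that stays in the xR model regardless of its p-value, so it is left out of this filter.

```python
# p-values of the substantive variables, copied from the xUR output above.
xur_pvalues = {
    "segadlboys": 0.967,
    "agrlateboy": 0.614,
    "postsextab": 0.551,
    "freintovio": 0.033,
    "ideomaltough": 0.062,
    "maleseg": 0.734,
    "postsextab2": 0.547,
    "mavofemsex": 0.805,
    "wivhotgr": 0.290,
}

CUTOFF = 0.10  # variables at or above this p-value are dropped


def select_for_restricted_model(pvalues, cutoff=CUTOFF):
    """Split variables into those kept for xR and those dropped."""
    kept = [name for name, p in pvalues.items() if p < cutoff]
    dropped = [name for name, p in pvalues.items() if p >= cutoff]
    return kept, dropped


kept, dropped = select_for_restricted_model(xur_pvalues)
print("kept:", kept)
print("dropped:", dropped)
```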
> bbb
2SLS model for rape = SCCS$v668
coef Fstat ddf pvalue VIF
(Intercept) 1.204 3.717 3743.789 0.054 NA
fydd -0.514 1.383 1078.445 0.240 1.158
freintovio 0.325 7.306 212.412 0.007 1.188
ideomaltough 0.317 5.144 241.921 0.024 1.337
> r2
R2:final model R2:IV(distance) R2:IV(language)
0.2258225 0.9461911 0.9609407
> ccc
Fstat df pvalue
RESET -0.109 90.199 1.000
Wald on restrs. -0.051 207.302 1.000
NCV 0.855 1587.143 0.355
SWnormal 18.056 2424.530 0.000
lagll 0.251 309437.657 0.616
lagdd 0.455 337267.059 0.500
The results from my xR program show that both ideology of male toughness (v664) and moderate or frequent interpersonal violence (v666) are variables that affect rape, which agrees with Sanday’s results. The final model has an R2 of about 23 percent, with no significant autocorrelation. The chi-squared test shows significant values for variable 664, with a p-value of 0.0001, and for variable 666, with a p-value of 0.0006. However, my 2SLS p-values are 0.024 for variable 664 and 0.007 for variable 666, which are roughly 240 and 11 times larger, respectively. The two-stage OLS regression separates out autocorrelation from causal effects, meaning that the chi-squared values are not truly descriptive of the relationships among these variables. This is a substantial overestimation of significance that reflects the problems with using descriptive statistics and the falsity of the literature that has used these methods to show such statistical significance up until this point (White 2009). Therefore, the results produced by Sanday’s cross-cultural survey are highly inflated, making the random results seem more significant than they really are.
The results from Sanday’s cross-cultural study of rape using descriptive statistical methods were not fully replicated in my research, where I used Eff and Dow’s inferential statistical methods. The results for post-partum sex taboos (v.240 PostSexTab) and male avoidance of female sexuality (v.672 MavoFemSex) showed no significant correlation with the incidence of rape (v.667), which matched the results produced by Sanday. However, this is not to say that these variables do not truly correlate with the incidence of rape, as they are incredibly hard to measure and may not be accurately represented. As discussed above, the variables ideology of male toughness (v.664 IdeoMalTough) and frequent interpersonal violence (v.666 FreIntoVio) did prove to be significant. However, their significance was greatly deflated, which shows that these results are less significant than claimed by Sanday. This difference in significance can be attributed to the use of Eff and Dow’s inferential statistical methods, as discussed in the previous paragraph. The last variable from Sanday’s survey that I reproduced was at least some wives taken from hostile groups (v.668 WivHotGr). In her research, Sanday found this to be positively correlated with the incidence of rape; in my results, however, it has no significant correlation with rape. The last three variables that I tested were my own: Segregation of Adolescent Boys (v.242 SegAdlBoys), Aggression: Late Boy (v.300 AgrLateBoy) and Male Segregation (v.665 MaleSeg). Unfortunately, these did not produce any significant results either.
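The size of the gap between the two sets of p-values can be checked with simple arithmetic. The sketch below divides each 2SLS p-value by the corresponding chi-squared p-value, using only the four numbers reported in the paragraph above; the variable names are for illustration.

```python
# p-values reported in the text, keyed by SCCS variable number.
chi_squared_p = {664: 0.0001, 666: 0.0006}
two_sls_p = {664: 0.024, 666: 0.007}

# How many times larger the 2SLS p-value is than the chi-squared p-value.
inflation = {var: two_sls_p[var] / chi_squared_p[var] for var in chi_squared_p}

for var, ratio in inflation.items():
    print(f"v{var}: 2SLS p-value is about {ratio:.1f} times larger")
```

For variable 664 the ratio is 240. For variable 666 it is about 11.7, which the text reports as 11.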
Various diagnostic tests are used to check the model; in each, a p-value greater than 0.05 means that the null hypothesis of no problem cannot be rejected. The first of these tests is the Ramsey RESET, which tests for omitted non-linear transformations of the independent variables. In my results above, because the p-value is greater than 0.05, there is no evidence of omitted non-linear transformations of the independent variables. The second is a Wald restriction test, which tests whether any significant variables are missing; in my results the p-value for this test was greater than 0.05, which means there are no significant excluded variables. A third diagnostic test (NCV), a Lagrange multiplier test for heteroskedasticity, tests whether the errors are bunched, that is, whether their variance is non-constant. Since the p-value in my results is greater than 0.05, there is no evidence of bunched errors. Another diagnostic test (SWnormal), applied to the model residuals, tests whether the errors are normally distributed. In my xR results the p-value for this diagnostic is 0.000, which is less than 0.05, meaning that the errors are not normally distributed. The last two diagnostic tests use a Lagrange multiplier to test for additional network dependence (network lag) using the two weight matrices of language and distance. Since the p-values in my results for these tests are 0.616 and 0.500 respectively, both greater than 0.05, there are no additional network effects in this variable. It is important to note that these tests are only valid if the variables in dropt, the list of independent variables that were dropped, are included in the final model.
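The reading of the diagnostics can be summarized in a short sketch that applies the 0.05 threshold to the p-values in the xR ccc output. The p-values are copied from that output; the function name is hypothetical.

```python
# Diagnostic p-values copied from the xR "ccc" output.
xr_diagnostics = {
    "RESET": 1.000,
    "Wald on restrs.": 1.000,
    "NCV": 0.355,
    "SWnormal": 0.000,
    "lagll": 0.616,
    "lagdd": 0.500,
}

ALPHA = 0.05  # threshold used in the text


def failed_diagnostics(pvalues, alpha=ALPHA):
    """Return the diagnostics whose null hypothesis is rejected (p < alpha)."""
    return [name for name, p in pvalues.items() if p < alpha]


problems = failed_diagnostics(xr_diagnostics)
print(problems)
```

Only the normality test is flagged, which matches the discussion above.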
Up until now, descriptive statistics have been used to correlate relationships between variables in cross-cultural survey research, which has a serious problem of missing data. In the past, this problem was dealt with by a procedure known as listwise deletion, which drops any observation that contains a missing value for any of the model’s variables (Eff and Dow 2009). This leads to the survey research being based upon a subsample that poorly represents the full population. Another issue with these methods and the use of cross-cultural data sets is that sample cases often are not independent of one another as a result of inter-societal network processes, also known as Galton’s problem. In previous modeling procedures, these networks of relations were overlooked, producing inconsistent and biased results. Eff and Dow developed a new approach using inferential statistical methods that carefully estimates missing observations and uses a new autocorrelation effects regression approach to Galton’s problem (Eff and Dow 2009). Through the use of their xUR and xR programs, I evaluated the results of a previously published survey that used the now dated methods of descriptive statistics. Peggy Sanday’s cross-cultural survey of rape used these descriptive statistics to evaluate the significance of sixteen independent variables in relation to rape (v.667). Through the use of Eff and Dow’s inferential statistical methods and R programs, I re-evaluated five of the variables Sanday used in her survey. After controlling for Galton’s effects and adding uncertainties due to missing data, it became clear that Sanday’s results had inflated significance, with p-values roughly 11 to 240 times smaller than mine. Though the two key variables were still significant, this shows the impact that Galton’s problem has on descriptive statistics, making results appear more significant than they in fact are.
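Listwise deletion is easy to see on a toy example. The six records below are invented for illustration (they are not SCCS data), with None standing for a missing code; the function name is hypothetical.

```python
# Hypothetical records; None marks a missing value.
societies = [
    {"id": "A", "rape": 2, "violence": 2, "toughness": 2},
    {"id": "B", "rape": None, "violence": 1, "toughness": 1},
    {"id": "C", "rape": 1, "violence": None, "toughness": 1},
    {"id": "D", "rape": 2, "violence": 2, "toughness": None},
    {"id": "E", "rape": 1, "violence": 1, "toughness": 1},
    {"id": "F", "rape": None, "violence": 2, "toughness": 2},
]


def listwise_delete(rows, variables):
    """Keep only the rows with no missing value in any model variable."""
    return [row for row in rows if all(row[v] is not None for v in variables)]


complete = listwise_delete(societies, ["rape", "violence", "toughness"])
print([row["id"] for row in complete], f"{len(complete)} of {len(societies)} kept")
```

Each record is missing at most one value, yet only two of the six survive, which is why the resulting subsample can represent the full sample poorly.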
Cross-cultural methods and research have been given a facelift, allowing for a more modern and accurate evaluation of cross-cultural data. These new methods have opened the door for new research and the ability to return to the results of previous researchers and reevaluate their true significance and validity. Using these new inferential methods will also create a new era of learning, where students are able to study the correlation of variables across different cultures, helping spur new ideas and theories in anthropology.
Works Cited
Dow, Malcolm M., and Anthon E. Eff. "Global, Regional, and Local Network Autocorrelation in the Standard Cross-Cultural Sample." Cross-Cultural Research 42.2 (2008): 148-71. Sage Journals Online. Web. 15 Oct. 2009.
Dow, Malcolm M., and Anthon E. Eff. "Multiple Imputation of Missing Data in Cross-Cultural Samples." Cross-Cultural Research 43.3 (2009): 206-29. Sage Journals Online. Web. 10 Oct. 2009.
Sanday, Peggy R. "The Socio-Cultural Context of Rape: A Cross-Cultural Study." Confronting Rape and Sexual Assault. Wilmington, Del: Scholarly Resources, 1998. 93-108. Print.
White, Douglas R. Inferential Statistics with Digital Learning Media: A Roadmap for Causal and Autocorrelation Estimates with SCCS data. Web. 11 Dec. 2009.