Supplementary Material
DIAGNOSIS OF PD
The diagnosis of PD was based on a careful history and neurological exam performed by the enrolling investigator who documented the presence of at least two of the three cardinal signs of PD: rigidity, bradykinesia, and rest tremor with no evidence of diagnostic alternatives. Patients who had been treated with levodopa must have demonstrated a clear response.
OPERATIONAL DEFINITION OF SUBSYNDROMAL DEPRESSION
Subsyndromal depression was defined as the presence of two or more depressive symptoms at threshold or subthreshold levels on the Structured Clinical Interview for DSM-IV (SCID)e1, at least one of which had to include depressed mood or anhedonia. Subjects had to have a score of > 12 on the first 17 items of the HAM-D.
DIAGNOSIS OF DEMENTIA
Dementia was defined as meeting the DSM-IV criteria for dementia or a Mini Mental State Examination (MMSE)e2 score < 23.
DEFINITION OF “ADEQUATE TRIAL” OF STUDY MEDICATIONS
Subjects were excluded if they had had an adequate trial of paroxetine or venlafaxine defined as 20 mg or greater of paroxetine (25 mg of paroxetine CR) or 225 mg or greater of venlafaxine or venlafaxine XR for at least 4 weeks.
OTHER EXCLUSION CRITERIA
Other exclusion criteria were neurosurgery for PD, electroconvulsive therapy within the past 6 months, significant suicidal ideation or prior hospitalization for suicidal ideation/attempts, depression requiring hospitalization within the past 6 months, bipolar disorder, spontaneous hallucinations or psychotic symptoms, or an anxiety disorder (except for simple phobias and social phobia that were clearly related to PD symptoms). Excluded medical conditions were uncontrolled hypertension, unstable coronary artery disease, significant hepatic or renal disease, narrow angle glaucoma and severe urinary retention.
STUDY MEDICATION
Wyeth Pharmaceuticals provided venlafaxine XR and matching placebo. GlaxoSmithKline provided paroxetine. The central study pharmacy (Rochester, NY) over-encapsulated the paroxetine and prepared matching placebo capsules. The medication bottles were labeled and shipped from the central study pharmacy to the sites in advance.
RANDOMIZATION PROCESS
The CTCC informed the site of the identity of the coded medication bottles to be supplied to the participant. Each participant received two medication bottles, one containing active or placebo paroxetine and the other containing active or placebo venlafaxine XR, each labeled with a code number. Participants took pills from each of the two bottles (one or both of which contained placebo). This “double dummy” procedure was used to maintain the double-blind of the treatment. The computer-generated randomization plan was prepared by an unblinded programmer in the Department of Biostatistics and Computational Biology at the University of Rochester. The plan specified equal allocation to the three treatment arms and included stratification (by site and type of depression [major, non-major]) and blocking. Nobody else involved in the conduct of the trial had access to the identity of the treatment assignments except an unblinded statistician who served as a liaison with the Data and Safety Monitoring Board (DSMB), and the central pharmacy personnel involved in packaging the study medication.
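The stratified, blocked allocation described above can be illustrated with a minimal sketch. The block size of 6, the seed, the site labels, and the function names are assumptions made for illustration only; the source does not report the block size or the software used to generate the plan.

```python
import random

ARMS = ["paroxetine", "venlafaxine XR", "placebo"]


def make_block(rng, block_size=6):
    """Return one permuted block containing each arm equally often.

    block_size=6 is an assumed value; the source does not state it.
    """
    if block_size % len(ARMS) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    block = ARMS * (block_size // len(ARMS))
    rng.shuffle(block)
    return block


def randomization_plan(strata, blocks_per_stratum, seed=0, block_size=6):
    """Build an independent allocation sequence for every stratum."""
    rng = random.Random(seed)
    plan = {}
    for stratum in strata:
        sequence = []
        for _ in range(blocks_per_stratum):
            sequence.extend(make_block(rng, block_size))
        plan[stratum] = sequence
    return plan


# Strata are site by type of depression, as in the text; site names are hypothetical.
strata = [(site, dep) for site in ("site A", "site B") for dep in ("major", "non-major")]
plan = randomization_plan(strata, blocks_per_stratum=2)
for stratum, sequence in plan.items():
    print(stratum, sequence)
```

Because every block contains each arm the same number of times, allocation stays balanced within each stratum after every completed block, which is the purpose of combining stratification with blocking.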
TRAINING AND RELIABILITY ASSESSMENT FOR THE SCID AND HAM-D
For the Structured Clinical Interview for DSM-IV (SCID) training, raters (who were different from the investigators who administered the HAM-D) were required to view and rate three videotapes of patients with depressive disorders that were independently rated by an expert rater (WMM). If the rater did not get the same diagnoses as the expert rater, the rater was given feedback. All raters were required to achieve concordance on three SCID videotapes before being certified to administer the SCID in the multicenter trial.
In order to ensure accurate administration and inter-rater reliability for the Hamilton Depression Rating Scale (HAM-D) and the Montgomery Asberg Depression Rating Scale (MADRS), the co-PI and psychiatric consultant (WMM and JL, respectively) prepared training tapes for the investigators that demonstrated the administration of the HAM-D and MADRS. A prepared script was used to help administer the scales. All raters were certified prior to enrollment. The certification process included two steps. First, each rater viewed three tapes of depressed PD patients and rated them according to the scripted interview. Raters were required to have scores for each tape that were no more than three points from the HAM-D and MADRS scores established by WMM and JL. Any raters with deviations greater than three points on a specific tape and rating scale discussed their ratings with the co-PI (WMM) and were given an additional set of videotapes to score.
Quarterly teleconferences were held to discuss the assessment of participants in the trial with feedback from WMM and his staff on videotape ratings. Finally, investigators were required to videotape and rate the baseline and final HAM-D and MADRS interviews for each participant. Our expert rater (WMM) and his staff evaluated the tapes and determined a score for each item on the HAM-D and MADRS while blind to the original rater’s scores. The total scores for the HAM-D and MADRS were required to be within two points of the corresponding expert ratings or the tapes were discussed with the site and a consensus score was assigned. The ratings for both scales were within two points in 90% of the tapes reviewed and the majority of discrepancies occurred when the investigator did not follow the script and the expert rater had difficulty determining a rating. With clarification, the expert rater and investigator came to agreement on the ratings for the items on the scales in all of the participants.
ADDITIONAL SECONDARY OUTCOME MEASURES
Other secondary outcome measures included the Unified Parkinson’s Disease Rating Scalee3 (UPDRS, total and subscale scorese4) to assess PD motor function, Schwab and England Activities of Daily Living (S/E ADL) Scalee5, Parkinson’s Disease Questionnaire 39 (PDQ-39, overall and subscale scores)e6 which served as a PD-specific measure of quality of life, Medical Outcomes Study Short Form-36 Health Status Survey (SF-36, summary and subscale scores)e7, Clinical Anxiety Scale (CAS)e8, Brief Psychiatric Rating Scale (BPRS)e9, and MMSEe2. A neuropsychological test battery was administered which included the Hopkins Verbal Learning Teste10, Brief Test of Attentione11, Symbol Digit Modalities Teste12, Controlled Oral Word Association Teste13, and Letter-Number Sequencing Teste14. Sleep was assessed with the Pittsburgh Sleep Quality Index (PSQI)e15. The motor component of the UPDRS, S/E ADL, CAS, BPRS, and PSQI were obtained at all follow-up visits. The mental and ADL components of the UPDRS were obtained at baseline, Week 6, and Week 12. The PDQ-39, SF-36, MMSE, and neuropsychological test battery were obtained at baseline and Week 12.
SAMPLE SIZE DETERMINATION
For sample size planning, published data from two randomized, double-blind, controlled multicenter clinical trials in elderly people with depression were used to estimate the variability of the changes in HAM-D scoree16-e17. In both studies, the standard deviation of the change in HAM-D from baseline to end of study (6-8 weeks) was approximately 6.0 points. A sample size of 165 participants (55 per group) was initially chosen to provide approximately 89% power to detect a difference of 4.0 points in mean HAM-D score between either of the active treatment groups and the placebo group, using a t-test and a Bonferroni-adjusted significance level of 2.5% (two-tailed). Group differences in mean response of less than 4.0 points were thus not considered to be clinically important. To account for an anticipated 15% attrition rate, the required sample size was inflated to 228 participants (76 per group) using the inflation factor 1/(1 − π)², where π = 0.15 is the attrition ratee18. The trial did not achieve its planned sample size, however, due to difficulties with recruitment; only 115 participants were enrolled in the trial.
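The arithmetic in this section can be checked with a short sketch. It uses a normal approximation to the two-sample t-test, which is an assumption of this example rather than the authors' stated method, so the power it returns is slightly higher than the t-based figure. The function names are hypothetical.

```python
from statistics import NormalDist


def power_two_sample(n_per_group, diff, sd, alpha_two_sided):
    """Approximate power of a two-group comparison of means.

    Normal approximation (assumed here); the negligible probability of
    rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha_two_sided / 2)
    standard_error = sd * (2 / n_per_group) ** 0.5
    return z.cdf(diff / standard_error - z_crit)


def inflate_for_attrition(n_total, attrition):
    """Apply the inflation factor 1/(1 - pi)^2 given in the text."""
    return n_total / (1 - attrition) ** 2


power = power_two_sample(n_per_group=55, diff=4.0, sd=6.0, alpha_two_sided=0.025)
inflated = inflate_for_attrition(n_total=165, attrition=0.15)
print(f"approximate power: {power:.3f}")
print(f"inflated total sample size: {inflated:.1f}")
```

Under these assumptions the power comes out close to the stated 89%, and the inflated total rounds to the 228 participants (76 per group) given above.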
DATA AND SAFETY MONITORING
An independent DSMB was appointed by the National Institute of Neurological Disorders and Stroke to monitor the trial. The DSMB met twice per year to review data related to study performance (including recruitment) and safety outcomes; monitoring of outcomes for efficacy or futility was not performed. An independent medical monitor reviewed safety information periodically in a blinded fashion and interacted with both the site investigators and the DSMB as necessary.
EXPLORATORY ANALYSES OF TREATMENT EFFECTS IN SUBGROUPS
Exploratory analyses of treatment effects on the HAM-D in pre-specified subgroups of participants were performed by adding appropriate main effect and interaction terms to the statistical model used in the primary analysis (e.g., treatment group by type of depression). Subgroups of interest included those defined by age (≤ 64, > 64), sex, type of depression (major, non-major), dopamine agonist use, years since onset of PD symptoms (< 6, ≥ 6), MMSE score (≤ 27, > 27), CAS score (≤ 7, > 7), and baseline HAM-D score (≤ 21, > 21).
These analyses suggest that paroxetine and venlafaxine XR treatment may have been more effective in participants with worse depression (major depression or baseline HAM-D score > 21) and more cognitive impairment, although the interactions were not statistically significant (Table e-4). Venlafaxine XR also tended to be more effective in women, those not taking dopamine agonists, and those with a longer duration since onset of PD symptoms (Table e-4).
CLINICAL GLOBAL IMPRESSION SCALES
Because of the small number of participants who were rated as worse than “no change” by either the participant or the investigator (n = 7), the CGI results were analyzed by comparing the percentages of participants rated as “much improved” or “very much improved” between each active treatment group and the placebo group using chi-square tests. A complete presentation of these data is provided in Table e-5.
For the investigator-rated CGI at Week 12, the percentages of participants who were rated “much improved” or “very much improved” were 71% (24/34) in the paroxetine group, 50% (15/30) in the venlafaxine XR group, and 44% (15/34) in the placebo group (paroxetine vs. placebo, p = 0.03; venlafaxine XR vs. placebo, p = 0.64). These percentages decreased to 57% (paroxetine), 44% (venlafaxine XR), and 38% (placebo) when poor responses were imputed for participants with missing CGI scores at Week 12 (paroxetine vs. placebo, p = 0.09; venlafaxine XR vs. placebo, p = 0.62).
For the participant-rated CGI at Week 12, the percentages of participants who rated themselves “much improved” or “very much improved” were 56% (19/34) in the paroxetine group, 43% (13/30) in the venlafaxine XR group, and 29% (10/34) in the placebo group (paroxetine vs. placebo, p = 0.03; venlafaxine XR vs. placebo, p = 0.25). These percentages decreased to 45% (paroxetine), 38% (venlafaxine XR), and 26% (placebo) when poor responses were imputed for participants with missing CGI scores at Week 12 (paroxetine vs. placebo, p = 0.07; venlafaxine XR vs. placebo, p = 0.25).
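The CGI comparisons above can be illustrated with a minimal sketch of a chi-square test on a 2 × 2 table. It assumes a Pearson chi-square statistic without continuity correction; the source says only "chi-square tests", so this choice and the function name are assumptions. The counts are the observed Week 12 responder counts quoted in the text.

```python
import math


def chi_square_2x2(responders_a, n_a, responders_b, n_b):
    """Pearson chi-square test (1 df, no continuity correction) for two proportions."""
    a, b = responders_a, n_a - responders_a
    c, d = responders_b, n_b - responders_b
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Investigator-rated CGI, observed cases
_, p_parox_inv = chi_square_2x2(24, 34, 15, 34)  # paroxetine vs. placebo
_, p_venla_inv = chi_square_2x2(15, 30, 15, 34)  # venlafaxine XR vs. placebo
# Participant-rated CGI, observed cases
_, p_parox_par = chi_square_2x2(19, 34, 10, 34)  # paroxetine vs. placebo
_, p_venla_par = chi_square_2x2(13, 30, 10, 34)  # venlafaxine XR vs. placebo

for label, p in [
    ("investigator, paroxetine", p_parox_inv),
    ("investigator, venlafaxine XR", p_venla_inv),
    ("participant, paroxetine", p_parox_par),
    ("participant, venlafaxine XR", p_venla_par),
]:
    print(f"{label}: p = {p:.2f}")
```

Under this assumption the four p-values round to the values reported for the observed-case analyses (0.03, 0.64, 0.03, and 0.25).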
GUESSES OF TREATMENT ASSIGNMENTS
For participants in the active treatment groups, investigators guessed that 75% were taking active treatment; for those in the placebo group, the investigators guessed that 61% were taking active treatment (p = 0.11, chi-square test). For participants in the active treatment groups, 72% guessed that they were taking active treatment; for those in the placebo group, 66% guessed that they were taking active treatment (p = 0.48). These results may reflect the observation that participants’ depression generally improved to some extent during the trial, regardless of treatment group, as well as the fact that each participant had a 67% chance of receiving active treatment.
ADDITIONAL DISCUSSION TOPICS
Our findings differ somewhat from those of the trial by Menza et ale19. In that single-site trial, 52 patients with dPD were randomized to receive paroxetine (n = 18), nortriptyline (a TCA, n = 17) or placebo (n = 17) for 8 weeks. Nortriptyline, but not paroxetine, was found to be significantly better than placebo in decreasing the mean score on the HAM-D. This finding led some to question the use of SSRIs in dPD and whether TCAs should be preferred in the treatment of dPDe20. There are several possible explanations for the different results. In addition to a shorter duration of treatment, the Menza et al study had a relatively high dropout rate due to adverse events (33% in the paroxetine group and 24% in the nortriptyline group, compared to 12% in the paroxetine group and 6% in the venlafaxine XR group in our study). Menza et al also used a “last observation carried forward” imputation strategy for missing data. This may have led to an artificial attenuation of the response in the paroxetine group, which had a slightly higher dropout rate than the other groups and may have had a response that appeared more gradually over time than that of a TCA. Indeed, the percentages of participants with a ≥ 50% reduction in HAM-D score from baseline to Week 8 were 53% in the nortriptyline group, 11% in the paroxetine group, and 24% in the placebo group. The 11% response rate for paroxetine is unusually low for an SSRI in major depression and was less than half of the placebo response rate. In our trial, these rates were 55% in the paroxetine group, 47% in the venlafaxine XR group, and 38% in the placebo group, which are conservative since poor responses were imputed for participants with missing HAM-D scores at Week 12.
Due to difficulties in recruitment, we were only able to enroll approximately 50% of our goal of 228 participants. Some barriers to recruitment in this trial may have included issues specific to depression treatment studies (e.g., reduced awareness of depression in PD, reluctance of patients with depression to get involved in a clinical trial). Additional factors that may have hindered recruitment include the fact that the trial was placebo-controlled, the medications were available by prescription and frequently utilized, and there were relatively frequent (every 2 weeks) visits.
We found that both treatments were associated with a significant improvement in depression but neither was associated with significant improvement in overall quality of life. This is a bit surprising on the surface since depression has been noted to be the factor most important in reducing QOLe21-e23. Measures of overall QOL, however, would be expected to be less sensitive to treatment effects than measures of the condition being studied. Also, it may take more than 12 weeks for improvements in depression symptoms to translate into improvements in quality of life.
We had originally anticipated recruiting more patients with minor depression, dysthymia or subsyndromal depression since these have been reported to be more commone24-e25 or almost as commone26 as major depression in PD. Our study sample, however, was composed mostly of patients with major depression. This may be because these patients are more readily diagnosed. Although those with more severe depression tended to have better responses to treatment than other participants, our sample was not large enough to permit definitive conclusions about differential effects of treatment in these and other subgroups of patients (Table e-4).
Table e-1. Baseline characteristicsa
Variable / Paroxetine (n = 42) / Venlafaxine XR (n = 34) / Placebo (n = 39)
Age / 65.2 (9.8) / 62.5 (11.4) / 62.7 (11.0)
Male (%) / 73.8% / 55.9% / 59.0%
Caucasian (%) / 95.2% / 91.2% / 89.7%
Hispanic (%) / 9.5% / 11.8% / 10.3%
Education > High School (%) / 83.3% / 79.4% / 69.2%
Married (%) / 64.3% / 79.4% / 71.8%
Major Depression (%) / 69.1% / 64.7% / 56.4%
Past Antidepressant Use (%) / 7.1% / 5.9% / 12.8%
HAM-D / 22.2 (6.5) / 21.2 (6.0) / 21.4 (4.8)
MADRS / 21.0 (6.8) / 19.4 (7.9) / 19.9 (5.9)
GDS / 15.5 (6.2) / 15.1 (5.8) / 15.0 (4.9)
BDI-II / 17.2 (9.2) / 17.1 (9.1) / 17.5 (7.4)
Years since PD Onset / 6.7 (5.8) / 7.4 (4.2) / 7.0 (3.8)
Years since PD Diagnosis / 5.2 (5.9) / 4.7 (3.7) / 4.9 (3.6)
H/Y Stage (%)
  1.0-1.5 / 7.1% / 0.0% / 7.7%
  2.0-2.5 / 76.2% / 85.3% / 79.5%
  3.0-4.0 / 16.7% / 14.7% / 12.8%
UPDRS
  Mental / 4.8 (2.4) / 5.2 (1.9) / 4.6 (2.0)
  ADL / 27.3 (9.6) / 26.8 (12.3) / 26.4 (11.5)
  Motor / 10.6 (6.7) / 11.4 (7.4) / 10.8 (5.5)
  Total / 42.7 (14.9) / 43.1 (19.0) / 41.7 (15.9)
S/E ADL (“On”) / 82.9 (11.2) / 80.3 (14.6) / 80.8 (13.4)
Motor Fluctuations (%) / 59.5% / 58.8% / 61.5%
Treatment for PD
  Levodopa / 90.5% / 79.4% / 82.1%
  Agonist / 40.5% / 38.2% / 30.8%
  COMT Inhibitor / 21.4% / 26.5% / 28.2%
  Amantadine / 14.3% / 11.8% / 15.4%
  Anticholinergic / 7.1% / 11.8% / 7.7%
PDQ-39 Total / 36.8 (16.3) / 37.2 (15.6) / 39.6 (14.8)
SF-36
  PCS / 37.5 (9.3) / 36.2 (10.3) / 37.1 (10.4)
  MCS / 41.4 (8.8) / 38.3 (10.1) / 40.0 (9.3)
Snaith CAS / 6.7 (3.9) / 7.8 (4.5) / 7.5 (4.3)
MMSE / 28.7 (1.4) / 28.9 (1.8) / 28.5 (1.5)
BPRS / 33.6 (10.7) / 35.7 (8.9) / 34.4 (9.3)
Abbreviations: HAM-D, Hamilton Depression Rating Scale; MADRS, Montgomery-Asberg Depression Rating Scale; GDS, Geriatric Depression Scale; BDI-II, Beck Depression Inventory II; H/Y, Hoehn and Yahr; UPDRS, Unified Parkinson’s Disease Rating Scale; S/E ADL, Schwab and England Activities of Daily Living Scale; PDQ-39, Parkinson’s Disease Questionnaire; PCS, Physical Component Summary; MCS, Mental Component Summary; CAS, Clinical Anxiety Scale; MMSE, Mini Mental State Examination; BPRS, Brief Psychiatric Rating Scale.
a Values are presented as mean (standard deviation) unless otherwise indicated
Table e-2. Treatment effects on primary and secondary outcome variables at Week 12
Variable / Treatment Group / Adjusted Mean (SE) Change from Baseline / Treatment Effect / 97.5% CI / P-Value
HAM-D / Paroxetine / -13.0 (1.3) / -6.2 / (-10.3, -2.2) / 0.0007
HAM-D / Venlafaxine / -11.0 (1.4) / -4.2 / (-8.4, -0.1) / 0.02
HAM-D / Placebo / -6.8 (1.3)
MADRS / Paroxetine / -13.6 (1.2) / -7.0 / (-10.8, -3.2) / < 0.0001
MADRS / Venlafaxine / -10.9 (1.3) / -4.3 / (-8.2, -0.4) / 0.01
MADRS / Placebo / -6.6 (1.2)
GDS / Paroxetine / -6.9 (1.0) / -4.2 / (-7.4, -1.0) / 0.004
GDS / Venlafaxine / -6.9 (1.1) / -4.2 / (-7.5, -0.8) / 0.005
GDS / Placebo / -2.8 (1.0)
BDI-II / Paroxetine / -9.7 (1.1) / -4.5 / (-8.0, -1.1) / 0.004
BDI-II / Venlafaxine / -9.6 (1.2) / -4.4 / (-7.9, -0.8) / 0.006
BDI-II / Placebo / -5.2 (1.1)
Snaith CAS / Paroxetine / -3.6 (0.6) / -1.2 / (-3.0, 0.6) / 0.13
Snaith CAS / Venlafaxine / -3.2 (0.6) / -0.8 / (-2.6, 1.0) / 0.33
Snaith CAS / Placebo / -2.4 (0.6)
BPRS / Paroxetine / -9.0 (1.3) / -4.6 / (-8.6, -0.6) / 0.01
BPRS / Venlafaxine / -9.8 (1.4) / -5.4 / (-9.5, -1.3) / 0.004
BPRS / Placebo / -4.4 (1.3)
PSQI / Paroxetine / -2.1 (0.4) / -1.0 / (-2.4, 0.4) / 0.10
PSQI / Venlafaxine / -2.6 (0.5) / -1.5 / (-3.0, -0.1) / 0.02
PSQI / Placebo / -1.1 (0.4)
UPDRS Total / Paroxetine / -8.7 (2.1) / -4.5 / (-11.0, 2.1) / 0.12
UPDRS Total / Venlafaxine / -7.0 (2.3) / -2.7 / (-9.4, 4.0) / 0.36
UPDRS Total / Placebo / -4.3 (2.0)
UPDRS Motor / Paroxetine / -4.3 (1.5) / -3.3 / (-8.0, 1.3) / 0.11
UPDRS Motor / Venlafaxine / -2.0 (1.6) / -1.0 / (-5.8, 3.8) / 0.64
UPDRS Motor / Placebo / -1.0 (1.5)
UPDRS Tremor / Paroxetine / 0.4 (0.5) / 0.9 / (-0.6, 2.4) / 0.15
UPDRS Tremor / Venlafaxine / 0.5 (0.5) / 1.1 / (-0.4, 2.6) / 0.11
UPDRS Tremor / Placebo / -0.6 (0.5)
UPDRS Bulbar / Paroxetine / -1.4 (0.3) / -0.9 / (-1.8, -0.04) / 0.02
UPDRS Bulbar / Venlafaxine / -1.4 (0.3) / -0.9 / (-1.7, 0.02) / 0.03
UPDRS Bulbar / Placebo / -0.5 (0.3)
PDQ-39 Overall / Paroxetine / -8.0 (2.3) / -2.7 / (-9.5, 4.2) / 0.37
PDQ-39 Overall / Venlafaxine / -8.4 (2.4) / -3.0 / (-9.9, 3.8) / 0.32
PDQ-39 Overall / Placebo / -5.3 (2.1)
PDQ-39 Emotional Well-Being / Paroxetine / -21.4 (3.3) / -10.5 / (-20.4, -0.6) / 0.02
PDQ-39 Emotional Well-Being / Venlafaxine / -20.7 (3.5) / -9.8 / (-19.8, 0.2) / 0.03
PDQ-39 Emotional Well-Being / Placebo / -10.9 (3.1)
SF-36 MCS / Paroxetine / 11.4 (1.7) / 6.6 / (1.6, 11.5) / 0.003
SF-36 MCS / Venlafaxine / 9.5 (1.8) / 4.7 / (-0.5, 9.8) / 0.04
SF-36 MCS / Placebo / 4.8 (1.6)
SF-36 Vitality / Paroxetine / 13.5 (3.1) / 8.7 / (-0.5, 17.9) / 0.03
SF-36 Vitality / Venlafaxine / 9.1 (3.3) / 4.3 / (-5.3, 13.9) / 0.31
SF-36 Vitality / Placebo / 4.7 (2.9)
SF-36 Role-Emotional / Paroxetine / 39.5 (7.5) / 26.8 / (4.6, 49.0) / 0.007
SF-36 Role-Emotional / Venlafaxine / 26.9 (8.0) / 14.1 / (-8.7, 36.9) / 0.16
SF-36 Role-Emotional / Placebo / 12.7 (6.9)
SF-36 Mental Health / Paroxetine / 16.7 (2.7) / 7.1 / (-0.8, 15.0) / 0.04
SF-36 Mental Health / Venlafaxine / 17.4 (2.8) / 7.7 / (-0.4, 15.9) / 0.03
SF-36 Mental Health / Placebo / 9.7 (2.5)
Abbreviations: CI, Confidence Interval; HAM-D, Hamilton Depression Rating Scale; MADRS, Montgomery-Asberg Depression Rating Scale; GDS, Geriatric Depression Scale; BDI-II, Beck Depression Inventory II; CAS, Clinical Anxiety Scale; BPRS, Brief Psychiatric Rating Scale; PSQI, Pittsburgh Sleep Quality Index; UPDRS, Unified Parkinson’s Disease Rating Scale; PDQ-39, Parkinson’s Disease Questionnaire; MCS, Mental Component Summary.
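The 97.5% confidence intervals in Table e-2 correspond to a Bonferroni adjustment for the two comparisons with placebo. The sketch below shows roughly how such an interval relates to the tabulated values, using the HAM-D row for paroxetine. It rests on two assumptions that are not the authors' stated method: a normal critical value rather than a t-based one, and a standard error of the difference approximated from the two rounded group standard errors as if they were independent. It therefore reproduces the published interval only approximately. The function name is hypothetical.

```python
from statistics import NormalDist


def bonferroni_ci(mean_active, se_active, mean_placebo, se_placebo, comparisons=2, alpha=0.05):
    """Approximate Bonferroni-adjusted confidence interval for a difference in means."""
    effect = mean_active - mean_placebo
    # Assumed approximation: treat the two adjusted means as independent.
    se_diff = (se_active ** 2 + se_placebo ** 2) ** 0.5
    # Two comparisons at an overall 5% level give 97.5% intervals.
    z_crit = NormalDist().inv_cdf(1 - (alpha / comparisons) / 2)
    return effect, effect - z_crit * se_diff, effect + z_crit * se_diff


# HAM-D, paroxetine vs. placebo, values taken from Table e-2
effect, lower, upper = bonferroni_ci(-13.0, 1.3, -6.8, 1.3)
print(f"effect = {effect:.1f}, 97.5% CI = ({lower:.1f}, {upper:.1f})")
```

The small difference from the tabulated interval of (-10.3, -2.2) is expected, given the rounding of the standard errors and the model-based estimates used in the actual analysis.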