BIOSTATISTICS II (MATH 335)
Instructor: Doug Landsittel, Ph.D.
Phone: 412-396-1419; E-Mail:
Office: 419 College Hall; Office Hours: TBD.
Teaching Assistant(s): TBD
Prerequisites: Biostatistics I
You should understand the following concepts:
- Basic concept of sampling and sampling variability
- Numerical summary statistics and graphical displays
- 1- and 2-sample confidence intervals for μ or μ1 – μ2
- 1- and 2-sample hypothesis tests for μ = μ0 or μ1 = μ2
- Analysis of variance tests
- Chi-square tests (OK if you didn’t cover)
Course Objectives: This course extends the statistical methods of Biostatistics I to scenarios and types of data not covered there.
- Non-parametric Statistics:
- Methods for non-normal data
- 1- and 2-sample hypothesis tests
- Categorical Data Analysis:
- Methods for contingency tables (2 or more categorical variables)
- Summary statistics and hypothesis tests
- Rates and Standardization:
- Methods for calculating and standardizing morbidity/mortality rates
- Survival Analysis:
- Methods for survival times, i.e. continuous data with censoring
- Displaying survival distributions and 2-sample hypothesis tests
- Correlation and Simple Linear Regression:
- Relationship between 2 continuous variables (usually one independent and one dependent variable)
- Summary statistics, the regression model, and associated tests
- Multiple Regression:
- Regression methods for a single continuous dependent variable but two or more independent variables
- Summary statistics, the regression model, and associated tests
- Logistic Regression:
- Regression methods for a single binary dependent variable
- Summary statistics, the regression model, and associated tests
Overall Goal: learn enough statistics to analyze non-normal, categorical, and multidimensional data; provide a background for more advanced work.
Lectures will typically include a description of the concept/methods followed by a numerical example that generally focuses on a biological or other scientific application. Mathematical notation will be relatively minimal, although some mathematics is necessary. Students are responsible for attending class, taking notes, collecting any additional handouts, and completing HW as we finish covering each topic in class. Blackboard will be used for this course.
Statistical Software Package: SPSS or S-Plus.
Required Textbook: Principles of Biostatistics (2nd Edition). By M. Pagano and K. Gauvreau. Pacific Grove, CA: Duxbury. Homework assignments originate from the textbook.
Grading:
Homework: Homework will be assigned, but will not be collected or graded. Class time will be reserved for answering homework questions.
Attendance: Attendance will be taken, but not formally incorporated into grading.
Quizzes: Quizzes will be given (without prior announcement), but not graded.
Exams: Exams 1, 2, and 3 are worth 20%, 25%, and 25%, respectively (not cumulative).
Final Exam: 30%; the final is cumulative.
Final grades may be curved if necessary, but will likely follow the usual 10-point scale (e.g. 90-100 = A, 80-89 = B, etc.), with borderline grades just below the cutoff potentially receiving a + or -. Typically, there will not be any specific curve on individual exams. It is unlikely anyone will be ‘curved into’ the A range.
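As a rough illustration of how the exam weights combine, the following Python sketch computes a weighted course percentage and maps it to a letter on the 10-point scale. The weights come from the grading breakdown above; the cutoffs below B (70 = C, 60 = D, otherwise F) are an assumption about what "etc." means, and the example scores are hypothetical. Any curve or +/- adjustment is ignored here.

```python
# Exam weights as stated in the grading breakdown (they sum to 100%).
WEIGHTS = {"exam1": 0.20, "exam2": 0.25, "exam3": 0.25, "final": 0.30}


def course_percentage(scores):
    """Weighted average of exam scores, each given on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Usual 10-point scale; cutoffs below B are assumed, not stated."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percentage >= cutoff:
            return letter
    return "F"


# Hypothetical scores, for illustration only.
example_scores = {"exam1": 85, "exam2": 78, "exam3": 92, "final": 88}
total = course_percentage(example_scores)
print(f"{total:.1f}% -> {letter_grade(total)}")
```

With these made-up scores the weighted total is 0.20(85) + 0.25(78) + 0.25(92) + 0.30(88) = 85.9%, which falls in the B range.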
Exams will NOT be open book; students are allowed to bring notes fitting on an 8.5×11 sheet of paper (back and front). Also, bring a calculator to the exam. Copies of needed statistical tables will be provided with the exam.
All of the exams will have a take-home component.
Policy on missed exams: Students must make arrangements PRIOR to the exam to have the opportunity to take the exam late.
Expectations of the Students:
- Lecture: Attend lecture, take notes, and review notes
- Homework: Complete homework for both repetition and application/expansion of concepts learned in lecture
- Statistical Software: Learn and apply statistical software (during later sections of the course)
- Exams: Demonstrate an understanding of both the concepts and methods, and an ability to apply them to “similar” problems.
Anything I write down during class is potentially fair game for an exam (except notes that I clearly denote as ‘an aside’). Exam questions are meant to be reflective of the notes and HW as a whole, but also challenge that understanding.
Course Schedule:
1. Mon 1/9, Wed 1/11, Fri 1/13 - Non-parametric Statistics
2. Mon 1/16, Wed 1/18, Fri 1/20 - Non-parametric Statistics
3. Mon 1/23, Wed 1/25, Fri 1/27 - Categorical Data
4. Mon 1/30, Wed 2/1, Fri 2/3 - Categorical Data; Exam 1: 2/3
5. Mon 2/6, Wed 2/8, Fri 2/10 - Rates and Standardization
6. Mon 2/13, Wed 2/15, Fri 2/17 - Rates and Standardization
7. Mon 2/20, Wed 2/22, Fri 2/24 - Survival Analysis
8. Mon 2/27, Wed 3/1, Fri 3/3 - Survival Analysis; Exam 2: 3/3
Mon 3/6, Wed 3/8, Fri 3/10
9. Mon 3/13, Wed 3/15, Fri 3/17 - Correlation & Simple Regression
10. Mon 3/20, Wed 3/22, Fri 3/24 - Simple & Multiple Regression
11. Mon 3/27, Wed 3/29, Fri 3/31 - Multiple Regression
12. Mon 4/3, Wed 4/5, Fri 4/7 - Exam 3: 4/5; Logistic Regression
13. Mon 4/10, Wed 4/12, Fri 4/14 - Logistic Regression
14. Mon 4/17, Wed 4/19, Fri 4/21 - Overview & Review for the Final
The final is scheduled by the University for Mon 5/1, 11am-1pm.
Reading Assignments:
The given reading assignments provide an important supplement to the lecture notes. However, they do not provide a replacement for lecture! Lecture notes will include material not contained within the text. You may want to do the reading assignment before (for preparation) and/or immediately after (for repetition of concepts) the relevant lecture discussions.
Non-parametric Statistics: Chapter 13
Categorical Data: Chapters 15 and 16
Rates and Standardization: Chapter 4
Survival Analysis: Chapter 21
Correlation: Chapter 17
Simple Regression: Chapter 18
Multiple Regression: Chapter 19
Logistic Regression: Chapter 20
Homework Assignments:
Homework assignments should be completed immediately after completing the relevant lectures in class. Within 1-2 classes after completing a given topic, I will reserve class discussion time for homework. This period will only be useful if you have at least attempted the assignment, and have formulated subsequent questions. I will not pass out solutions (since students tend to just wait for the solutions!).
Non-parametric Statistics: Section 13.6; page 317-321
Assignment: #1-5, #6, #7a, #10
Additional practice: #8a, #9, #11, #12, #13, #14, #15b, #16
Categorical Data: Section 15.6; page 366-372.
Assignment: #1-5, #8a-b, #10, #13; Additional practice: #6-7, #9, #12a, #14, #16
Section 16.4; page 393-396.
Assignment: #1-4, #5; Additional practice: #6-8
Rates and Standardization: Section 4.4; page 89-95.
Assignment: #1-6, #7, #8, #15
Additional practice: #16
Survival Analysis: Section 21.5; page 511-512.
Assignment: #1-5, #6; Additional practice: #7, #8, #9
Correlation: Section 17.5; page 412-414.
Assignment: #1-4, #5
Additional practice: #6-8
Simple Linear Regression: Section 18.5; page 443-447.
Assignment: #1-7, #9, #11
Additional practice: #10, #12, #13
Multiple Linear Regression: Section 19.4; page 465-469.
Assignment: #1-6, #7, #8, #9
Additional practice: #10, #11, #12
Logistic Regression: Section 20.5; page 484-487.
Assignment: #1-4, #5, #7
Additional practice: #6, #8, #9
Policy regarding Student Disabilities:
Students who feel they may have a disability that requires special accommodation should contact the Office of Freshman Development and Special Student Services, 309 Student Union, at 412-396-6658. This office will then provide the appropriate paperwork for the student to pass on to his/her instructors. Appropriate arrangements will then be made for exam accommodations or other such issues in accordance with University policy.
Policy regarding Academic Integrity:
Any students found to be sharing answers or assisting each other on any exams will be assigned an F (0%) for that exam.
Extra Credit:
Students wishing to raise their grades can turn in an additional project which uses as many of the techniques we learned as possible. You must pick a dataset to analyze and prepare a 3-6 page report which has abstract, introduction, methods, results, and discussion sections. You must organize your report around one or more logical research questions. You must also include a list of references for any literature that you cite in the report. All extra credit projects must use a statistical software package for at least some of the analyses.
The introduction section describes relevant background information and a summary of existing knowledge on the subject. Your introduction must include at least a few references to the published literature. The methods section includes a brief description of the data set and the statistical approaches that you use to analyze the data. This section also must include at least one citation of the literature, which could be your textbook (but should also include a reference to where you got the dataset, which could cite the internet if you get your data online). The results section describes the statistical results you got, and does not need to include any references. The discussion should describe the importance and limitations of the results you got. The discussion should include at least a few literature citations. Since this is for extra credit, I will not provide extensive help on this project.
Students must select individual projects; team projects will not be accepted. All of the above specifications must be met to receive any extra credit. Any pages beyond the 6-page limit will be discarded and not considered at all in the grading. Projects satisfying all of the above specifications will get between 1% and 10% added to their final grade, with an expected average of 2-3% extra credit. Only extremely innovative and well-done projects will receive more than 4-5% extra credit.
The data set and research question for this project are not limited to biostatistics or the medical field. You may, for instance, choose a subject (such as NFL statistics) that is of personal interest to you. I will accept multiple projects that happen to be on the same subject, but you may not work together at all. Projects which appear to be very similar in content (not just the overall idea) will be discounted and given no extra credit.