Course Catalog Description
This course prepares students to employ the essential ideas and reasoning of applied statistics. It teaches theoretical statistical concepts and tests the student’s understanding of them. The course gives students a solid foundation for solving empirical problems, including the ability to summarize observed univariate and multivariate data and to calibrate statistical models.
While financial applications are emphasized, the course may also serve areas of science and engineering where statistical concepts are needed. The course is designed to familiarize students with the use of R for statistical data analysis; familiarity with programming in R is assumed (see below).
Students require a sound understanding of probability, gained through an undergraduate class such as MA222 or equivalent. Students must also be able to program in R; please consider taking FE515 if you are not familiar with R.
Attendance is mandatory, and there may be short pop quizzes every week, starting from the second week.
This course will allow the students to:
- Understand and summarize complex data sets through graphs and numerical measures.
- Calculate estimates of parameters using fundamental statistical methods.
- Measure the “goodness” of an estimator by computing confidence intervals.
- Apply statistical tests to experimental observations.
- Estimate and calibrate parameters of mathematical models using real data.
- Study relationships between two or more random variables.
- Be prepared for more advanced applied statistical courses.
The only required textbook is Moore, McCabe, and Craig (2017).
- Moore et al. (2017) will be our main textbook. Earlier editions (say back to the sixth edition) should be OK.
- Greene (2012) (or other editions) will be useful for classes on Inference.
- Dalgaard (2004) is a useful basic reference concentrating on the use of R in statistics.
- Florescu and Tudor (2013) is useful for Probability and Estimation Methods.
- James et al. (2013) will be useful for its chapter on Variable Selection. It is available free online from its authors: http://www-bcf.usc.edu/~gareth/ISL/.
Peter Dalgaard. Introductory Statistics with R. Springer, 2004.
Ionut Florescu and Ciprian Tudor. Handbook of Probability. Wiley, 2013.
William H. Greene. Econometric Analysis. Prentice Hall, Seventh edition, 2012.
Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning with Applications in R. Springer Verlag, 2013.
David Moore, George P. McCabe, and Bruce A. Craig. Introduction to the Practice of Statistics. W. H. Freeman and Co., Ninth edition, 2017.
You will be required to submit four homework assignments.
All homework assignments must be submitted in R markdown (.Rmd) format, with all answers written as functions. For your information, the main markdown page is here: https://rmarkdown.rstudio.com/. A nice summary of the use of R markdown appears here: http://www.stat.cmu.edu/~cshalizi/rmarkdown/. You may wish to include mathematical expressions in your markdown code. If so, it is useful to use LaTeX, which is taught in FE505. If you wish, you may optionally submit a .pdf version of your assignment, but no other formats will be accepted.
To emphasize: submission in R markdown format is mandatory. When I grade your homework, I will automatically parse your markdown code to extract your functions. I will run your functions with test data to confirm that they work and provide the correct results.
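To illustrate what "answers written as functions" might look like, here is a minimal sketch. The function name `sample_mean_ci`, its arguments, and the choice of a t-based confidence interval are hypothetical and are not taken from any actual assignment; in an .Rmd file this code would sit inside an R code chunk.

```r
# Hypothetical homework answer written as a function.
# The name, arguments, and task are illustrative assumptions only.
# Returns a two-sided t-based confidence interval for the mean of x.
sample_mean_ci <- function(x, level = 0.95) {
  stopifnot(is.numeric(x), length(x) >= 2, level > 0, level < 1)
  n <- length(x)
  se <- sd(x) / sqrt(n)                          # standard error of the mean
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)  # two-sided critical value
  c(lower = mean(x) - t_crit * se, upper = mean(x) + t_crit * se)
}

# Example call with made-up data
sample_mean_ci(c(4.1, 5.3, 4.8, 5.0, 4.6))
```

A function like this takes all of its inputs as arguments and returns its result rather than printing it, so it can be called directly with test data.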
Late assignments will not be accepted unless you inform me of your circumstances before the assignment is due, and I grant you an extension. I will only grant extensions for serious medical or compassionate reasons. You will not receive an extension just because your computer fails or the network goes down at an inconvenient time.
Examination and Project
There will be an in-class, closed-book, hand-written, mid-term examination. This will test your understanding of the basic concepts. There will also be a take-home final project that will test your ability to put theory into practice.
For the project, you will work in groups of three to propose, design, and analyze a research topic that contains a significant data component and is applicable to your primary field of study. The project must use statistical methods that are taught in this course. Before you spend more than a few hours of work on your project, you must get my formal approval of your topic.
Your final grade will be determined by your performance in the homework, mid-term examination, project, and spot quizzes, as weighted below. However, I reserve the right to “curve” the grades; i.e., to adjust the grades such that they follow the usual distribution at Stevens.
Final Presentation: 10%
Final Project: 30%
Quizzes, Class Participation: 10%
| Week | Topic | Reading |
|---|---|---|
| Week 1 | Descriptive Graphical Measures. | Moore et al. (2017): Ch. 1, Ch. 2 |
| Week 2 | | Moore et al. (2017): Ch. 5 |
| Week 3 | Introduction to Inference. | Moore et al. (2017): Ch. 6 |
| Week 4 | Inference for Distributions. | Moore et al. (2017): Ch. 7 |
| Week 5 | Inference for Proportions. | Moore et al. (2017): Ch. 8 |
| Week 6 | Estimation. Methods in General. Method of Moments; Maximum (and Conditional) Likelihood; Bayesian estimators. | Greene (2012): Ch. 12, Ch. 13, Ch. 14, Ch. 16 |
| Week 7 | | |
| Week 8 | Analysis of Two-Way (and One-Way) Tables. Goodness of Fit Test. | Moore et al. (2017): Ch. 9 |
| Week 9 | Simple Linear Regression. Least Squares Method. Analysis and Testing. | Moore et al. (2017): Ch. 10 |
| Week 10 | ANOVA Table, Multiple R², Residuals. | Moore et al. (2017): Ch. 11 |
| Week 11 | Selection of Variables. Variance Inflation Factors. Generalized Additive Models. | James et al. (2013): Ch. 6 |
| Week 12 | Analysis of Variance (ANOVA) Models. Two-Way Analysis of Variance. Expansion to Mixture Models. Analysis of Covariance. | Moore et al. (2017): Ch. 12, Ch. 13 |
| Week 13 | | Moore et al. (2017): Ch. 14 |
| Week 14 | Bootstrap Method and Permutation Tests. | Moore et al. (2017): Ch. 16 |