University of Washington, Spring 2016
Primary | Jeffrey Arnold | jrnold@uw.edu |
TA | Andreu Casas | acasas2@uw.edu |
Class | Tues, Thurs | 4:30–5:50 pm | Mary Gates Hall 284 |
Lab | Fri | 1:30–3:20 pm | Savery 121 |
Jeffrey Arnold | Mon 4–5pm, Wed 2–4pm | Smith 221B |
Andreu Casas | Tues and Thurs 3:20–4:20 pm | Smith 221E |
This course continues the graduate sequence in quantitative political methodology from POLS 501. In this course, students will learn the statistical and computational principles necessary to perform modern, flexible, and creative analysis of quantitative social data. The course focuses particularly on fitting, interpreting, and refining the linear regression model. Emphasis is placed on modern interpretations of linear regression as causal inference, as well as an introduction to several modern computational tools (bootstrapping, cross-validation, regularization).
By the end of the quarter, you will be able to:
Further, because we cannot possibly cover everything that you will need to know during your career as a researcher, there are two final long-term goals. After this course is over, you will be able to:
This course is designed to be a continuation of POLS/CS&SS 501. Although that is not a formal prerequisite for this course, I will assume that students have a basic understanding of the material covered in that course. In particular, students should have had a course in hypothesis testing, univariate statistical tests, and linear regression. I also assume that students have proficiency in R prior to starting the course.
There are two required texts for this course,
and one optional text,
Other readings will come from articles or chapters which, if not openly accessible, will be available through the UW library or posted on Canvas.
Finally, much of the material and reading for this course will be available in the course notes.
This course takes an applied and computational approach to learning statistics. As such, a programming language is essential. This course uses R as its statistical programming language, and the RStudio IDE as an interface to R. We will make use of several R packages, with extensive use of the Hadleyverse packages (ggplot2, dplyr, tidyr, …). Additionally, this course will use R Markdown for writing reproducible research reports with R, and git and GitHub for version control, collaboration, and distribution of code and research.
Assignments for this course comprise:
Research project: Every student in this class will execute their own statistical data analysis of a research question. The results of this analysis will be presented as a paper due at the end of the course. See the schedule for the due date.
The purpose of this paper is for students to apply the quantitative methods used in this course to the real-world research problems that they will encounter in their research careers. However, due to the limited time in this course, it is not necessary for this paper to address an important research problem or make a novel contribution to the literature. While those will not be criteria for the evaluation of this paper, students are encouraged to pursue them, as they are what lead to publications. The paper will be evaluated on the appropriateness of the statistical methods applied to the data and question, and not the novelty or contribution of the question itself.
If you developed a research design for POLS 501, you can continue to use it for 503. If you did not take POLS 501, then talk to the instructor to confirm that your project is feasible.
While the final paper is the ultimate objective of the project, students will work with their data throughout the course, including the following assignments related to the research project.
Peer review of assignments/projects: Students will review each other's code and analysis and provide feedback.
The exact nature and timing of the assignments will be adjusted to the exigencies of the course, in consultation with the students.
Students will be evaluated on the whole of their work in this course with an emphasis on the final paper. For this course, grades on the 4.0 scale have the following interpretation:
4.0 | Exceptional |
3.9 | Very good |
3.8 | Meeting expectations |
3.7 | Somewhat below average |
3.6 | Not up to expectations |
≤ 3.5 | Way below expectations |
Below is a list of some of the topics that this course may cover. What is actually covered in the course will depend on how it evolves in practice. See the Schedule for readings and dates, though it, too, will change over the course of the quarter.
For questions about the course that would be of general interest to all students in the course, email the course mailing list, rather than the individual instructors. Please reserve emails to individual instructors for individual concerns, such as your data analysis project or personal matters.
Beyond what the teaching team can provide, there are several resources on campus where you can go for assistance with data, computing, and statistical problems:
This course was inspired by and makes use of some material from:
This work is licensed under a Creative Commons Attribution 4.0 International License.
Parts of the course materials are derived from
The source for the materials of this course is on GitHub at https://github.com/UW-POLS503/pols_503_sp16.