POLS/CS&SS 503

Advanced Quantitative Political Methodology

University of Washington, Spring 2016

Class Meetings

Class Tues, Thurs 4:30–5:50 pm Mary Gates Hall 284
Lab Fri 1:30–3:20 pm Savery 121

Office Hours

Jeffrey Arnold Mon 4–5pm, Wed 2–4pm Smith 221B
Andreu Casas Tues and Thurs 3:20–4:20 pm Smith 221E

Overview and Class Goals

This course continues the graduate sequence in quantitative political methodology from POLS 501. In this course, students will learn the statistical and computational principles necessary to perform modern, flexible, and creative analysis of quantitative social data. The course focuses particularly on fitting, interpreting, and refining the linear regression model. Emphasis is placed on modern interpretations of linear regression as causal inference, as well as on an introduction to several modern computational tools (bootstrapping, cross-validation, regularization).

Learning Objectives

By the end of the quarter, you will be able to:

  • Conduct, interpret, and communicate results from analysis using multiple regression (including dummy variables and interactions).
  • Explain the limitations of observational data for making causal claims, and begin to use existing strategies for attempting to make causal claims from observational data.
  • Write clean, reusable, and reliable R code.
  • Build a solid, reproducible research pipeline to go from raw data to final paper.
  • Feel empowered working with data.

Further, because we cannot possibly cover everything that you will need to know during your career as a researcher, there are two final long-term goals. After this course is over, you will be able to:

  • Learn new statistics
  • Learn new programming

Prerequisites

This course is designed to be a continuation of POLS/CS&SS 501. Although that is not a formal prerequisite for this course, I will assume that students have a basic understanding of the material covered in that course. In particular, students should have had a course in hypothesis testing, univariate statistical tests, and linear regression. I also assume that students have proficiency in R prior to starting the course.

Materials

Reading

There are two required texts for this course:

  • Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion.
  • Wooldridge, Jeffrey M. Introductory Econometrics. 5th edition or earlier.

and one optional text:

  • Angrist, Joshua D., and Jörn-Steffen Pischke. 2014. Mastering ’Metrics: The Path from Cause to Effect. This covers most of the same material as Mostly Harmless but at a less technical level.

Other reading will come from articles or chapters which, if not openly accessible, will be available through the UW library or posted on Canvas.

Finally, much of the material and reading for this course will be available in the course notes.

Software

This course takes an applied and computational approach to learning statistics. As such, a programming language is essential. This course uses R as its statistical programming language, and the RStudio IDE as an interface to R. We will make use of several R packages, with extensive use of the Hadleyverse packages (ggplot2, dplyr, tidyr, …). Additionally, this course will use R Markdown for writing reproducible research reports with R, and git and GitHub for version control, collaboration, and distribution of code and research.

Assessment and Evaluation

Assignments for this course comprise:

  1. Research project: Every student in this class will execute their own statistical data analysis of a research question. The results of this analysis will be presented as a paper due at the end of the course. See the schedule for the due date.

    The purpose of this paper is for students to apply the quantitative methods used in this course to the real-world research problems that they will encounter in their research careers. However, due to the limited time in this course, it is not necessary for this paper to address an important research problem or make a novel contribution to the literature. While those will not be criteria for the evaluation of this paper, authors are encouraged to pursue them, as they are what lead to publications. The paper will be evaluated on the appropriateness of the statistical methods applied to the data and question, and not on the novelty or contribution of the question itself.

    If you developed a research design for POLS 501, you can continue to use it for 503. If you did not take POLS 501, then talk to the instructor to confirm that your project is feasible.

    While the final paper is the ultimate objective of the project, students will work with their data throughout the course, including the following assignments related to the research project.

    1. Proposal (week 2)
    2. Several analyses throughout the quarter
    3. Draft (week 9)
    4. Poster presentation (week 10)
  2. Participation: Students will submit either pull requests or issues that contribute to, or raise questions about, the current week’s readings.
  3. Weekly or bi-weekly assignments: These assignments will largely focus on applying the concepts to either real or simulated data.
  4. Peer review of assignments/projects: Students will review each other’s code and analysis and provide feedback.

The exact nature and timing of the assignments will adjust with the exigencies of the course in consultation with the students.

Students will be evaluated on the whole of their work in this course with an emphasis on the final paper. For this course, grades on the 4.0 scale have the following interpretation:
4.0 Exceptional
3.9 Very good
3.8 Meeting expectations
3.7 Somewhat below average
3.6 Not up to expectations
≤ 3.5 Way below expectations

Topics

Below is a list of some of the topics that this course may cover. What is actually covered in the course will depend on how the course evolves in practice. See the Schedule for readings and dates, though it, too, will change over the course of the quarter.

Statistical Topics

  1. Types of Research Questions: Prediction vs. Causal Inference
  2. Potential outcomes framework for causal inference
  3. Linear Regression
  4. Matching estimators
  5. Instrumental variables
  6. Fixed effects and difference-in-differences designs
  7. Regression discontinuity

Technical topics

  1. Reproducible Research
  2. Version control with GitHub
  3. Reproducible documents with R Markdown
  4. Programming with R

Communication

For questions about the course that would be of general interest to all students in the course, email the course mailing list, rather than the individual instructors. Please reserve emails to individual instructors for individual concerns, such as your data analysis project or personal matters.

Resources

Beyond what the teaching team can provide, there are several resources on campus where you can go for assistance with data, computing, and statistical problems:

  • The Center for Social Science Computing and Research (CSSCR) has a drop-in statistical consulting center in Savery 119. They provide consulting on statistical software, e.g. R. Go there for software- or data-related questions.
  • CSSS Statistical Consulting provides general statistical consulting. Go there for questions about statistical methods.
  • eScience Data Science Office Hours

References

Inspirations

This course was inspired by and makes use of some material from: