POLS/CSSS 503, University of Washington, Spring 2015
Due in lab: Friday, April 17, 2015 at 3:30 pm
Create an R project named `hw1` and load that project.

Submit a zipped file of the directory with your R project through Canvas. It should contain all the materials another person needs to run your R Markdown file:

- The R project (`.Rproj`) file
- The R Markdown document (`.Rmd`) of your analyses
- The HTML document (`.html`) compiled from your R Markdown document

You can work together on this, but you should each turn in your own assignment and write up your work separately. Include the names of your collaborators on your assignment.
Some other guidance
The file `democracy.csv` contains data from Przeworski et al., Democracy and Development: Political Institutions and Well-Being in the World, 1950-1990 [1]. The data have been slightly recoded so that higher values indicate higher levels of political liberty and democracy.
| Variable | Description |
|---|---|
| COUNTRY | numerical code for each country |
| CTYNAME | name of each country |
| REGION | name of region containing country |
| YEAR | year of observation |
| GDPW | GDP per capita in real international prices |
| EDT | average years of education |
| ELF60 | ethnolinguistic fractionalization |
| MOSLEM | percentage of Muslims in country |
| CATH | percentage of Catholics in country |
| OIL | whether oil accounts for 50+% of exports |
| STRA | count of recent regime transitions |
| NEWC | whether country was created after 1945 |
| BRITCOL | whether country was a British colony |
| POLLIB | degree of political liberty (1–7 scale, rising in political liberty) |
| CIVLIB | degree of civil liberties (1–7 scale, rising in civil liberties) |
| REG | presence of democracy (0=non-democracy, 1=democracy) |
Load the Democracy dataset into memory as a data frame. Use the `read.csv` function with the `stringsAsFactors = FALSE` option. Note that missing values are indicated by “.” in the data. Find the option in `read.csv` that controls the string used to indicate missing values.
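To illustrate how a missing-value marker is handled on import, here is a minimal sketch on a small made-up CSV. The country names and values are invented and are not taken from `democracy.csv`; the sketch relies on the `na.strings` argument of `read.csv`.

```r
# Toy CSV text standing in for a file on disk; all values are invented.
toy_csv <- "CTYNAME,YEAR,GDPW,EDT
Aland,1985,1200,.
Borduria,1985,.,4.5
Syldavia,1985,8300,7.1"

# `text =` lets read.csv parse a string instead of a file path.
# `na.strings = "."` converts fields that are exactly "." into NA;
# a value such as "4.5" is not affected.
toy <- read.csv(text = toy_csv, stringsAsFactors = FALSE, na.strings = ".")

str(toy)             # GDPW and EDT are read as numbers, not character strings
colSums(is.na(toy))  # number of missing values in each column
```

Without `na.strings`, a column containing "." would likely be read as character, which breaks later numeric summaries.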
Report summary statistics (means and medians, at least) for all variables.
Report a correlation matrix of all the variables in the dataset. You will need to find the function in R that calculates correlations, and you will need to exclude the identifier columns. Watch out for missing values: even though your input data contains missing values, your correlation matrix should not have missing values in any of its entries.
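The effect of missing values on a correlation matrix can be seen in a small sketch. The data frame below is invented for illustration; the sketch compares the default behavior of `cor` with two settings of its `use` argument.

```r
# Invented toy data with a few missing values.
toy <- data.frame(
  id = 1:6,
  x  = c(1, 2, 3, 4, 5, NA),
  y  = c(2, 4, 5, 4, NA, 7),
  z  = c(6, 5, NA, 3, 2, 1)
)

# Drop the identifier column before computing correlations.
num <- toy[, c("x", "y", "z")]

# Default behavior: pairs involving a column with missing values come out NA.
c_default <- cor(num)

# Use only the rows that have no missing values in any column.
c_complete <- cor(num, use = "complete.obs")

# For each pair of columns, use every row where both are observed.
c_pairwise <- cor(num, use = "pairwise.complete.obs")

c_complete
c_pairwise
```

The two options can give different answers because they use different rows. Pairwise deletion keeps more data, but each entry may then be based on a different subset of observations.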
Create a histogram for political liberties in which each unique value of the variable is in its own bin.
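One way to give each value of an integer-valued variable its own bin is to place the break points halfway between the integers. The sketch below uses base R's `hist` on an invented vector of scores, not the assignment data.

```r
# Invented scores on a 1-7 integer scale.
scores <- c(1, 1, 2, 3, 3, 3, 5, 6, 7, 7)

# Break points at 0.5, 1.5, ..., 7.5 put each integer in the middle of its own bin.
breaks <- seq(0.5, 7.5, by = 1)

# plot = FALSE returns the binned counts without drawing anything.
h <- hist(scores, breaks = breaks, plot = FALSE)
h$counts  # one count per integer value from 1 to 7

# Draw the histogram itself.
hist(scores, breaks = breaks, main = "Toy scores", xlab = "Score")
```

In ggplot2, `geom_bar()` applied to the variable also counts each unique value separately.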
Create a histogram for GDP per capita.
Create a histogram for log GDP per capita. How does this histogram differ from the one for unlogged GDP per capita?
Create a scatterplot of political liberties against GDP per capita.
When there is a lot of overlap in a scatterplot, it is useful to “jitter” the points (move them randomly by a small amount). Remake the previous plot, but jitter the points to mitigate overplotting. Jitter the points vertically only. You can use `geom_jitter` in ggplot2 for this.
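As a sketch of vertical-only jittering, the example below uses simulated data (not the assignment data) and the `width` and `height` arguments of `geom_jitter`. The jitter amount of 0.2 is an arbitrary choice for illustration.

```r
library(ggplot2)

# Simulated data: a continuous x and an integer-valued y, which overplots badly.
set.seed(1)
toy <- data.frame(
  x = runif(200, 0, 10),
  y = sample(1:7, 200, replace = TRUE)
)

# width = 0 leaves x untouched; height = 0.2 moves each point
# up or down by at most 0.2.
p <- ggplot(toy, aes(x = x, y = y)) +
  geom_jitter(width = 0, height = 0.2)

print(p)
```

Keeping the vertical jitter well below 0.5 keeps the bands for adjacent integer values visually separate.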
Create a scatterplot of political liberties against log GDP per capita. Jitter the points. How does the relationship differ from the one with unlogged GDP per capita?
Create a boxplot of GDP per capita for oil-producing and non-oil-producing nations.
Calculate the mean GDP per capita in countries with at least 40 percent Catholics. How does it compare to the mean GDP per capita for all countries?
Calculate the average GDP per capita in countries with greater than 60% ethnolinguistic fractionalization, less than 60%, and missing ethnolinguistic fractionalization. Hint: you can calculate this with the dplyr verbs `mutate`, `group_by`, and `summarise`.
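The `mutate`, `group_by`, `summarise` pattern from the hint can be sketched on invented data. The column names `share` and `outcome`, the 0 to 1 scale, and the decision to put a value of exactly 0.6 in the lower group are all assumptions made for this illustration.

```r
library(dplyr)

# Invented data: a share variable with one missing value, and an outcome.
toy <- data.frame(
  share   = c(0.10, 0.75, NA, 0.40, 0.90, 0.65),
  outcome = c(10, 20, 30, 40, 50, 60)
)

result <- toy %>%
  # Label each row by which side of the cutoff it falls on; NA stays NA.
  mutate(group = ifelse(share > 0.6, "above 0.6", "at or below 0.6")) %>%
  # group_by treats NA as its own group.
  group_by(group) %>%
  # One row per group: the mean outcome and the number of rows.
  summarise(mean_outcome = mean(outcome, na.rm = TRUE), n = n())

print(result)
```

Because the missing values form their own group, all three categories appear in a single summary table.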
What was the median of the average years of education in 1985 for all countries?
Which country was (or countries were) closest to the median years of education in 1985 among all countries?
What was the median of the average years of education in 1985 for democracies?
Which democracy was (or democracies were) closest to the median years of education in 1985 among all democracies?
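Finding the observations closest to a median, including ties, can be sketched as follows. The data are invented, and the names `toy`, `med`, and `closest` are only illustrative.

```r
# Invented data with one missing value.
toy <- data.frame(
  name  = c("A", "B", "C", "D", "E"),
  value = c(2.0, 3.5, NA, 5.0, 6.5),
  stringsAsFactors = FALSE
)

# Median of the observed values.
med <- median(toy$value, na.rm = TRUE)

# Absolute distance of each row from the median; rows with NA stay NA.
dist <- abs(toy$value - med)

# which() drops the NA comparisons; comparing against the minimum
# keeps every tied row, whereas which.min() would return only the first.
closest <- toy$name[which(dist == min(dist, na.rm = TRUE))]

med
closest
```

With real-valued data, exact equality can miss ties because of floating-point rounding, so a small tolerance may be safer.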
What were the 25th and 75th percentiles of ethnolinguistic fractionalization for new and old countries?
Derived from Christopher Adolph, “Problem Set 1”, POLS/CSSS 503, University of Washington, Spring 2014, http://faculty.washington.edu/cadolph/503/503hw1.pdf; “Problem Set 2”, POLS/CSSS 503, University of Washington, Spring 2014, http://faculty.washington.edu/cadolph/503/503hw2.pdf. Used with permission.
[1] Przeworski, Adam, Michael E. Alvarez, Jose Antonio Cheibub, and Fernando Limongi. 2000. Democracy and Development: Political Institutions and Well-Being in the World, 1950-1990. Cambridge University Press.