Learning Objectives

  • Write R functions to make your code less redundant and more efficient

Required Packages and Datasets

If you dont’ have the uwpols501 package yet, install it first from GitHub.

library(devtools)
install_github("UW-POLS501/r-uwpols501")

Make sure you can load all these packages and dataset before starting the module

library(dplyr)
library(uwpols501)
data(turnout)
data(iver)

Intro

Why should we use functions? Well, don’t take it from me, take it from Grolemund & Wickham (2016):

Writing a function has three big advantages over using copy-and-paste:

  1. You drastically reduce the chances of making incidental mistakes when you copy and paste.
  2. As requirements change, you only need to update code in one place, instead of many.
  3. You can give a function an evocative name that makes your code easier to understand.

Writing a function step by step:

  1. Specify a function name
  2. Use the function() command
  3. Specify inside the function() command the arguments that the functions takes
  4. Write what the function does between the curly brackets { } (Function names should be verbs, and arguments should be nouns)

A function has the following skeleton:

function_name <- function(argument1, argument2, ...) {
  # do_something with the arguments
  output <- argument1 + argument2
  return(output)
}

A short/silly example:

say_hi <- function(name) {
  full_statement <- paste0("Hi! My name is ", name)
  return(full_statement)
}
full_statement <- say_hi("Andreu")
full_statement
## [1] "Hi! My name is Andreu"

Real Example (1)

For the final project last quarter, one of your classmates needed to retrieve the first digit of all numbers for several numeric variables. So the person had to write the same code multiple times. Let’s now write a function that would simplify that code.

get_first_digit <- function(variable) {
  num_digits <- nchar(variable)
  var_first_num <- variable %/% (10 ^ (nchar(variable) - 1))
  return(var_first_num)
}

Now apply the function get_first_digit() that we just created to the variables age and educate of the turnout dataset.

turnout$age_new <- get_first_digit(turnout$age)
turnout$educate_new <- get_first_digit(turnout$educate)

Real Example (2)

Last quarter some other classmates needed to create a dummy variable indicating if the country or political party for any given row was part of a particular list of country or political party.

Here we first create a list of countries we are interested in, and then we write a function that takes a country variable as an argument, and returns a dummy variable indicating which observations are in the list of countries of interest.

peripherial_countries <- c("Portugal", "Italy", "Ireland", 
                           "Cyprus", "Greece", "Spain")
create_per_dummy <- function(country_variable) {
  country_variable <- as.character(country_variable)
  dummy_boolean <- country_variable %in% peripherial_countries
  dummy_numeric <- as.numeric(dummy_boolean)
  return(dummy_numeric)
}

Now we apply the function to the cty variable of the iver dataset from the uwpols501 package

head(iver, n = 10)
## Source: local data frame [10 x 4]
## 
##            cty elec_sys povred   enp
##         (fctr)   (fctr)  (dbl) (dbl)
## 1    Australia      maj  42.16  2.38
## 2      Belgium       pr  78.79  7.01
## 3       Canada      maj  29.90  1.69
## 4      Denmark       pr  71.54  5.04
## 5      Finland       pr  69.08  5.14
## 6       France      maj  57.91  2.68
## 7      Germany      maj  46.90  3.16
## 8        Italy       pr  42.81  4.11
## 9  Netherlands       pr  66.93  3.49
## 10      Norway       pr  67.17  3.09
iver$peripherial_cty <- create_per_dummy(iver$cty)
head(iver, n = 10)
## Source: local data frame [10 x 5]
## 
##            cty elec_sys povred   enp peripherial_cty
##         (fctr)   (fctr)  (dbl) (dbl)           (dbl)
## 1    Australia      maj  42.16  2.38               0
## 2      Belgium       pr  78.79  7.01               0
## 3       Canada      maj  29.90  1.69               0
## 4      Denmark       pr  71.54  5.04               0
## 5      Finland       pr  69.08  5.14               0
## 6       France      maj  57.91  2.68               0
## 7      Germany      maj  46.90  3.16               0
## 8        Italy       pr  42.81  4.11               1
## 9  Netherlands       pr  66.93  3.49               0
## 10      Norway       pr  67.17  3.09               0

Challenge

Look through some of your previous code from other classes and projects, and write a function to reduce redundancy (10 min.)