Chapter 8 Presenting relationships

Table 8.1 lists the geometric objects we will use below to show how different variables are related, along with links to where you can find more information.

Table 8.1: Selected geometric objects for relations in ggplot2
Name          Function        Cookbook for R
Box plot      geom_boxplot()  Plotting distributions
Scatter plot  geom_point()    Scatterplots

8.1 Box plot

For the box plot, we will use geom_boxplot() to show how Obama’s vote share is related to abortion laws. The laws are measured with the abortlaw3 variable, which sorts states into three tiers by their number of abortion restrictions.

ggplot(states, aes(x=abortlaw3, group=abortlaw3, y=obama2012)) +
  geom_boxplot()

Here we can see that Obama got a greater vote share in states with fewer restrictions on abortion.
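To make the reading of the box plot more concrete, the following is a minimal sketch of what each box summarises. It does not use the real states data: toy_states is a hypothetical data frame with invented vote shares, and the tier labels "Few", "Some" and "Many" are made up for illustration. The thick line inside each box is the group median, which we can also compute directly.

```r
library(ggplot2)

# Hypothetical toy data (NOT the real states data): nine made-up states,
# three in each tier, with invented vote shares in percent.
toy_states <- data.frame(
  abortlaw3 = factor(
    rep(c("Few", "Some", "Many"), each = 3),
    levels = c("Few", "Some", "Many")  # hypothetical tier labels
  ),
  obama2012 = c(55, 60, 62, 48, 51, 53, 38, 41, 45)
)

# Median vote share per tier: the value drawn as the line inside each box.
tier_medians <- tapply(toy_states$obama2012, toy_states$abortlaw3, median)
print(tier_medians)

# The same box plot call as in the text, applied to the toy data.
p_box <- ggplot(toy_states, aes(x = abortlaw3, group = abortlaw3, y = obama2012)) +
  geom_boxplot()
```

Printing p_box draws the plot. Comparing the medians across tiers is the numerical counterpart of comparing the heights of the boxes.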

8.2 Scatter plots

To illustrate the relation between the abortion rate and Obama’s vote share, measured with the variables abort_rate08 and obama2012, we will create a scatter plot with geom_point().

ggplot(states, aes(x=abort_rate08, y=obama2012)) +
  geom_point()

If we are working with a lot of observations, some points will overlap and hide one another. To reveal more of them, we can add a small amount of random noise to the position of each point. To do this, we can use geom_jitter() instead of geom_point().

ggplot(states, aes(x=abort_rate08, y=obama2012)) +
  geom_jitter()

We can also use geom_point(position = "jitter") instead of geom_jitter(). However, in this particular case, as we only have 50 observations, overlapping points are not a major concern.
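When overlap is a real problem, it can help to control how much noise is added. The sketch below uses hypothetical toy data (200 invented points on a 5 by 5 grid, not the states data) and position_jitter() to set the amount of jitter and a seed. The values 0.2 and 42 are arbitrary choices for illustration.

```r
library(ggplot2)

# Hypothetical toy data with heavy overplotting: 200 points that can only
# take the integer values 1 to 5 on each axis.
set.seed(1)
toy <- data.frame(
  x = sample(1:5, 200, replace = TRUE),
  y = sample(1:5, 200, replace = TRUE)
)

# Without jitter, at most 25 distinct positions are visible.
p_plain <- ggplot(toy, aes(x = x, y = y)) +
  geom_point()

# With explicit jitter: each point is moved by at most 0.2 units in either
# direction on both axes, and the seed makes the plot reproducible.
p_jitter <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(position = position_jitter(width = 0.2, height = 0.2, seed = 42))
```

Keeping the jitter small relative to the distance between values preserves the overall pattern while still separating the points.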

8.3 Line plots

To create a regression line we can use the geom_smooth() function. Here we will again look at the relation between abort_rate08 and obama2012.

ggplot(states, aes(x=abort_rate08, y=obama2012)) +
  geom_smooth()

Here we can see that as the abortion rate increases, so does the vote share for Obama. As we can also see, geom_smooth() draws a smoothed curve by default. To get a straight regression line instead, we can specify method="lm" as an option.

ggplot(states, aes(x=abort_rate08, y=obama2012)) +
  geom_smooth(method="lm")
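To see what method="lm" does, the following sketch fits the same kind of line with lm() and draws it on top of the points. The data frame toy is hypothetical, with invented values rather than the real states data. The options formula = y ~ x and se = FALSE are additions for this illustration: the first states the model explicitly and the second hides the confidence band.

```r
library(ggplot2)

# Hypothetical toy data (NOT the real states data): six invented observations.
toy <- data.frame(
  abort_rate08 = c(5, 10, 15, 20, 25, 30),
  obama2012    = c(40, 44, 47, 52, 55, 61)
)

# The line drawn by geom_smooth(method = "lm") is the fit from a simple
# linear regression of y on x.
fit <- lm(obama2012 ~ abort_rate08, data = toy)
print(coef(fit))  # intercept and slope of the line

# Points plus the fitted straight line, without the confidence band.
p_lm <- ggplot(toy, aes(x = abort_rate08, y = obama2012)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

A positive slope in coef(fit) corresponds to a line that rises from left to right in the plot.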