Line graphs are typically used for visualizing how one continuous variable, on the y-axis, changes in relation to another continuous variable, on the x-axis. As with bar graphs, there are exceptions: line graphs can also be used with a discrete variable on the x-axis.
Making a Basic Line Graph
To make a simple line graph, use ggplot() with geom_line() and map the variables you want on the x-axis and y-axis. Load ggplot2 with library(ggplot2), type the code below in the RStudio console, and view the result in the Plots pane:
ggplot(BOD, aes(x=Time, y=demand)) + geom_line()
In the code above, BOD is a built-in data frame with 6 rows and 2 columns (Time and demand) that records biochemical oxygen demand.
Step 2: Basic line graph with a factor on the x-axis. In the BOD data set there is no entry for Time=6, so there is no level 6 when Time is converted to a factor. Factors hold categorical values, and in that context, 6 is just another value. It happens to not be in the data set, so there’s no space for it on the x-axis.
Line graphs can be made with discrete (categorical) or continuous (numeric) variables on the x-axis. In the example here, the variable Time is numeric, but it could be treated as a categorical variable by converting it to a factor with factor(). When the x variable is a factor, you must also use aes(group=1) so that ggplot() knows the data points belong together and should be connected with a line.
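The factor case described above can be sketched as follows. The copy named BOD1 and the plot variable p_factor are names chosen here for illustration; they are not part of the original walkthrough.

```r
library(ggplot2)

# Work on a copy so the built-in BOD data frame is left unchanged
BOD1 <- BOD
BOD1$Time <- factor(BOD1$Time)   # Time has no value 6, so there is no level "6"

# group=1 tells ggplot() that all points belong to one line
p_factor <- ggplot(BOD1, aes(x=Time, y=demand, group=1)) +
    geom_line()
print(p_factor)
```

Without group=1, ggplot2 treats each factor level as its own group and has nothing to connect.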
Step 3: By default, the y range of a line graph is just large enough to include the y values in the data. For some kinds of data, it’s better to have the y range start from zero. You can use ylim() to set the range, or expand_limits() to expand the range to include a value. For example, ylim(0, max(BOD$demand)) sets the range from zero to the maximum value of the demand column in BOD.
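A minimal sketch of both ways to start the y range at zero, using the BOD plot from earlier. The variable names p, p_ylim and p_expand are illustrative only.

```r
library(ggplot2)

p <- ggplot(BOD, aes(x=Time, y=demand)) + geom_line()

# Option 1: set both ends of the y range explicitly
p_ylim <- p + ylim(0, max(BOD$demand))

# Option 2: keep the automatic range, but make sure it includes 0
p_expand <- p + expand_limits(y=0)

print(p_ylim)
print(p_expand)
```

For this data set the two plots look the same. expand_limits() is the more convenient choice when you only care about one end of the range.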
Adding Points to a Line Graph
Step 1: Add geom_point() to the line graph created above to draw a point at each observation.
The code is:
ggplot(BOD, aes(x=Time, y=demand)) + geom_line() + geom_point()
Step 2: Points are especially useful when the observations are unevenly spaced, as in the worldpop data set from the gcookbook package. With a log y-axis, you can see that the rate of proportional change has increased in the last thousand years. The estimates for the years before 0 have a roughly constant rate of change of 10 times per 5,000 years. In the most recent 1,000 years, the population has increased at a much faster rate.
In the worldpop data set, the intervals between each data point are not consistent. In the far past, the estimates were not as frequent as they are in the more recent past. Displaying points on the graph illustrates when each estimate was made.
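The worldpop plot discussed above might look like the following sketch. It assumes the gcookbook package is installed and that worldpop has columns named Year and Population; scale_y_log10() is the standard ggplot2 function for a log y-axis.

```r
library(ggplot2)
library(gcookbook)   # provides the worldpop data set

# Assumed column names: Year and Population
p_pop <- ggplot(worldpop, aes(x=Year, y=Population)) +
    geom_line() +
    geom_point() +       # one point per estimate, showing the uneven spacing
    scale_y_log10()      # log-scaled y-axis
print(p_pop)
```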
Making a Line Graph with Multiple Lines
To make a graph with multiple lines, the data needs a column that splits it into two or more groups.
In the ToothGrowth data frame, the supp column has two groups, OJ and VC, so mapping supp to colour draws one coloured line per group.
As before, geom_line() draws the lines.
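As a rough sketch of the multiple-line graph, the data can first be reduced to one mean value per supp and dose combination. This version uses base R's aggregate() to avoid extra packages, which is a substitution made for this example. The name tg and the column name length are chosen to match the summary used later in the text.

```r
library(ggplot2)

# Mean tooth length for each supp/dose combination (6 rows)
tg <- aggregate(len ~ supp + dose, data=ToothGrowth, FUN=mean)
names(tg)[names(tg) == "len"] <- "length"

# Mapping supp to colour draws one line per group
p_multi <- ggplot(tg, aes(x=dose, y=length, colour=supp)) +
    geom_line()
print(p_multi)
```

Mapping supp to linetype instead of colour works the same way and is handy for black-and-white output.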
Adding Different Size and Shape to the Line Graphs
Step 1: If your plot has points along with the lines, you can also map variables to properties of the points, such as shape and fill. The call to ggplot() sets the x and y variables, and geom_point() then draws the points on each line, with their shape or fill determined by the variable supp.
Step 2: Here position_dodge(0.2) shifts the lines and points of each group sideways, using a dodge width of 0.2.
This is usually done to keep the points from overlapping.
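The two steps above could be combined as in this sketch. The summary data frame tg is rebuilt here with aggregate() so the example stands alone, and the point size of 4 is an arbitrary choice for illustration.

```r
library(ggplot2)

# Mean tooth length per supp/dose combination
tg <- aggregate(len ~ supp + dose, data=ToothGrowth, FUN=mean)
names(tg)[names(tg) == "len"] <- "length"

# supp is mapped to shape, so it is also used for grouping the lines
p_dodge <- ggplot(tg, aes(x=dose, y=length, shape=supp)) +
    geom_line(position=position_dodge(0.2)) +           # dodge lines by 0.2
    geom_point(position=position_dodge(0.2), size=4)    # dodge points by 0.2
print(p_dodge)
```

The lines are dodged by the same amount as the points. Otherwise the points would no longer sit on the lines.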
Changing the Appearance of the Lines
Step 1: To change the appearance of a line (its type, thickness, colour and so on), set these properties by passing values in the call to geom_line().
Here we pass linetype="dashed", colour="blue" and size=1 to geom_line(). In ggplot2 3.4.0 and later, linewidth replaces size for line thickness.
ggplot(BOD, aes(x=Time, y=demand)) +
geom_line(linetype="dashed", size=1, colour="blue")
Step 2: If there is more than one line, setting the aesthetic properties will affect all of the lines. On the other hand, mapping variables to the properties will result in each line looking different. The default colors aren’t the most appealing, so you may want to use a different palette. We can do that by using scale_colour_brewer() or scale_colour_manual().
library(plyr) # For ddply()
tg <- ddply(ToothGrowth, c("supp", "dose"), summarise, length=mean(len))
ggplot(tg, aes(x=dose, y=length, colour=supp)) +
geom_line() + scale_colour_brewer(palette="Set1") # palette name is an illustrative choice
The code above changes the color palette. First we summarize the data using ddply() from the plyr package. Then we plot the line graph with a Brewer palette.
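Step 3: If you want to use a single colour for both lines, specify the colour outside of aes(). Because no variable is then mapped to colour, you also need to map supp to group so that the two lines stay separate.
ggplot(tg, aes(x=dose, y=length, group=supp)) +
geom_line(colour="darkgreen") # colour value is an illustrative choice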
Step 4: You can also set the size, shape and fill parameters in geom_point() to change the appearance of the points on the graph. All of these functions come from the ggplot2 package.
# Since supp is mapped to colour, it will automatically be used for grouping
ggplot(tg, aes(x=dose, y=length, colour=supp)) +
geom_point(shape=22, size=3, fill="white")
Changing the Appearance of Points
Step 1: To change the appearance of the points, pass values for shape and size to geom_point(). In this example, size is set to 4 and shape to 22, which is a square with a separate outline and fill. The colour parameter sets the outline colour, and fill is set to pink.
Step 2: To set a single constant shape or size for all the points, specify shape or size outside of aes(). Values set outside aes() apply to every point, whereas a variable mapped inside aes() gives each group its own shape or size.
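A small sketch of the point settings described above, applied to the BOD plot. The text gives the size, shape and fill; the outline colour "darkred" is an assumption made for this example.

```r
library(ggplot2)

# All values are set outside aes(), so they apply to every point
p_pts <- ggplot(BOD, aes(x=Time, y=demand)) +
    geom_line() +
    geom_point(size=4, shape=22, colour="darkred", fill="pink")  # "darkred" is assumed
print(p_pts)
```

Only shapes 21 to 25 have both an outline and a fill, so fill has no visible effect on the other shapes.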
Making a Graph with a Shaded Area
Step 1: To shade the area under the line, use geom_area() in place of geom_line(). Everything else works as in the previous examples.
You can set fill to colour the area and make the graph more appealing.
You can also pass alpha to geom_area() to make the fill semi-transparent and reduce the colour intensity.
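A minimal sketch of a shaded area graph follows. The text does not name a data set for this step, so this example assumes the built-in sunspot.year time series, converted to a data frame. The fill colour and alpha value are likewise illustrative.

```r
library(ggplot2)

# Assumed example data: yearly sunspot numbers from the built-in time series
sunspotyear <- data.frame(
    Year     = as.numeric(time(sunspot.year)),
    Sunspots = as.numeric(sunspot.year)
)

# fill sets the area colour; alpha=0.2 makes it 80% transparent
p_area <- ggplot(sunspotyear, aes(x=Year, y=Sunspots)) +
    geom_area(fill="blue", alpha=0.2)
print(p_area)
```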
Making a Stacked Area Graph
Use geom_area() and map a factor to fill, as in the code below. The data set comes from the gcookbook package.
library(gcookbook) # For the data set
ggplot(uspopage, aes(x=Year, y=Thousands, fill=AgeGroup)) + geom_area()
You may not like the default color palette, so you can change it and make the areas semi-transparent by setting the alpha parameter to a value between 0 and 1, so that the grid lines show through.
Here AgeGroup from the uspopage data set is mapped to fill.
You can also remove the outline from the sides of the areas by drawing them without an outline and adding only the top edges with geom_line().
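The palette, transparency and outline adjustments above might be combined as in this sketch. The "Blues" palette, the alpha value of 0.4 and the line width are assumptions made for illustration. linewidth requires ggplot2 3.4.0 or later; older versions use size instead.

```r
library(ggplot2)
library(gcookbook)   # provides the uspopage data set

p_stack <- ggplot(uspopage, aes(x=Year, y=Thousands, fill=AgeGroup)) +
    geom_area(colour=NA, alpha=0.4) +            # no outline, semi-transparent fill
    scale_fill_brewer(palette="Blues") +         # illustrative palette choice
    geom_line(position="stack", linewidth=0.2)   # draw only the top edge of each area
print(p_stack)
```

Drawing the edges with a stacked geom_line() avoids the vertical outlines that geom_area() would draw at the left and right sides of each polygon.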
Adding a Confidence Region
Step 1: To add a confidence region to the line graph, use geom_ribbon() and map values to ymin and ymax.
The data comes from the gcookbook package.
In this example, a subset of the climate data (the year, anomaly and uncertainty columns) is stored in clim. A graph is then created with ggplot2, and geom_ribbon() draws the confidence region.
Step 2: You can also show the upper and lower bounds as dotted lines instead of a shaded region, by drawing them with geom_line() and linetype="dotted" and omitting geom_ribbon().
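Both variants can be sketched as follows. The column names Year, Anomaly10y and Unc10y and the filter Source == "Berkeley" are assumptions about the gcookbook climate data set, not taken from the text above, so check them against names(climate) before relying on this.

```r
library(ggplot2)
library(gcookbook)   # provides the climate data set

# Assumed columns and source name; verify with names(climate)
clim <- subset(climate, Source == "Berkeley",
               select=c("Year", "Anomaly10y", "Unc10y"))

# Step 1: shaded confidence region, drawn before the line so the line stays on top
p_ribbon <- ggplot(clim, aes(x=Year, y=Anomaly10y)) +
    geom_ribbon(aes(ymin=Anomaly10y - Unc10y, ymax=Anomaly10y + Unc10y), alpha=0.2) +
    geom_line()

# Step 2: dotted lines for the upper and lower bounds, no geom_ribbon()
p_dotted <- ggplot(clim, aes(x=Year, y=Anomaly10y)) +
    geom_line(aes(y=Anomaly10y - Unc10y), colour="grey50", linetype="dotted") +
    geom_line(aes(y=Anomaly10y + Unc10y), colour="grey50", linetype="dotted") +
    geom_line()

print(p_ribbon)
print(p_dotted)
```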