Time Series Analysis in R Part 1: The Time Series Object

Many of the methods used in time series analysis and forecasting have been around for quite some time but have taken a back seat to machine learning techniques in recent years. Nevertheless, time series analysis and forecasting are useful tools in any data scientist’s toolkit. Some recent time series-based competitions have recently appeared on kaggle, such as one hosted by Wikipedia where competitors are asked to forecast web traffic to various pages of the site. As an economist, I have been working with time series data for many years; however, I was largely unfamiliar with (and a bit overwhelmed by) R’s functions and packages for working with them. From the base ts objects to a whole host of other packages like xts, zoo, TTR, forecast, quantmod and tidyquant, R has a large infrastructure supporting time series analysis. I decided to put together a guide for myself in Rmarkdown. I plan on sharing this as I go in a series of blog posts. In part 1, I’ll discuss the fundamental object in R – the ts object.

The Time Series Object

In order to begin working with time series data and forecasting in R, you must first acquaint yourself with R’s ts object. The ts object is a part of base R. Other packages such as xts and zoo provide other APIs for manipulating time series objects. I’ll cover those in a later part of this guide.

Here we create a vector of simulated data that could potentially represent some real-world time-based data generation process. It is simply a sequence from 1 to 100 with some random normal noise added to it. We can use R’s base plot() function to see what it looks like:

This could potentially represent some time series, with time represented along the x-axis. However it’s hard to tell. The x-axis is simply an index from 1 to 100 in this case.

A vector object such as t above can easily be converted to a time series object using the ts() function. The ts() function takes several arguments, the first of which, x, is the data itself. We can look at all of the arguments of ts() using the args() function:

To begin, we will focus on the first four arguments – data, start, end and frequency. The data argument is the data itself (a vector or matrix). The start and end arguments allow us to provide a start date and end date for the series. Finally the frequency argument lets us specify the number of observations per unit of time. For example, if we had monthly data, we would use 12 for the frequency argument, indicating that there are 12 months in the year.

Let’s assume our generated data is quarterly data that starts in the first quarter of 2000. We would turn it into a ts object as below. We specify the start argument as a two element vector. The first element is the year and the second element is the observation of that year in which the data start. Because our data is quarterly, we use 4 for the frequency argument.

Notice that now when we plot the data, R recognizes that it is a ts object and plots the data as a line with dates along the x-axis.

Aside from creating ts objects containing a single series of data, we can also create ts objects that contain multiple series. We can do this by passing a matrix rather than a vector to the x argument of ts().

Now when we plot the ts object, R automatically facets the plot.

At this point, I should mention what really happens when we call the plot() function on a ts object. R recognizes when the x argument is a ts object and actually calls the plot.ts() function under the hood. We can verify this by using it directly. Notice that it produces an identical graph.

The plot.ts() function has different arguments geared towards time series objects. We can look at these again using the args() function.

Notice that it has an argument called plot.type that lets us indicate whether we want our plot to be faceted (multiple) or single-panel (single). Although in the case of our data above we would not want to plot all three series on the same panel given the difference in scale for ts3, it can be done quite easily.

I am not going to go in-depth into using R’s base plotting capability. Although it is perfectly fine, I strongly prefer to use ggplot2 as well as the ggplot-based graphing functions available in Rob Hyndman’s forecast package. We will discuss these in parts 2 and 3.

Convenience Functions for Time Series

There are several useful functions for use with ts objects that can make programming easier. These are window(), start(), end(), and frequency(). These are fairly self-explanatory. The window function is a quick and easy way to obtain a slice of a time series object. For example, look again at our object tseries. Assume that we wanted only the data from the first quarter of 2000 to the last quarter of 2012. We can do so using window():

The start() function returns the start date of a ts object, end() gives the end date, and frequency() returns the frequency of a given time series:

That’s all for now. In Part 2, we’ll dive into some of the many transformation functions for working with time series in R. See you then.

2 thoughts on “Time Series Analysis in R Part 1: The Time Series Object

    1. troywalters760@gmail.com Post author

      Indeed it does. I plan on discussing xts in later posts. Just getting started here.

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *