Manipulating Time Series Data in R with xts & zoo
xts and zoo are two R packages that provide classes and functions for manipulating time series data. Both can build time series from data imported from external sources such as CSV files, and write the results back out. We start by introducing the xts and zoo classes, then cover basic manipulations and merging and modifying time series, and finish with applying and aggregating by time.
XTS and Zoo class
Syntax
In R, xts extends the zoo class. An xts object is essentially a matrix of observations indexed by a time object. We can create an xts object using the syntax below −
xts(myData, order.by)
Here myData represents the data and order.by is a vector of date/time type that is used to index the data.
Note that one may also attach metadata to an xts object by passing name-value pairs such as birthdayDate = as.POSIXct("2000-09-07").
Example
Consider the example below, which creates an xts object from a vector of data indexed by a vector of dates. It also passes a name-value pair (born) to attach metadata −
# Importing xts library
library(xts)

# Creating the data
myData <- xts(x = rnorm(n = 10),
              order.by = seq(as.Date("2022-01-01"), length.out = 10, by = "days"),
              born = as.POSIXct("2000-09-07"))

# Print the data
print(myData)
Output
                 [,1]
2022-01-01 -0.7307375
2022-01-02  0.2299910
2022-01-03 -0.6965284
2022-01-04 -0.2002072
2022-01-05  1.1121364
2022-01-06 -0.6601843
2022-01-07  0.2926226
2022-01-08  1.2273859
2022-01-09 -0.5464344
2022-01-10  0.1108407
As the output shows, we have created an xts object indexed by a vector of dates running from 2022-01-01 to 2022-01-10. Because the values come from rnorm(), the numbers differ on every run.
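The two claims made above, that xts extends zoo and that extra name-value pairs are stored as metadata, can be checked directly. The following is a minimal sketch rather than part of the original tutorial: the object name sketchData is hypothetical, and it assumes inherits() from base R and xtsAttributes() from the xts package.

```r
library(xts)

# Hypothetical object for illustration; fixed values keep it reproducible
sketchData <- xts(x = c(1.5, 2.5, 3.5),
                  order.by = as.Date("2022-01-01") + 0:2,
                  born = as.POSIXct("2000-09-07", tz = "UTC"))

# An xts object is also a zoo object
isZoo <- inherits(sketchData, "zoo")
print(isZoo)

# User-supplied metadata comes back as a named list
metadata <- xtsAttributes(sketchData)
print(names(metadata))
```

The metadata travels with the object, so it can be read back later without keeping a separate variable around.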
The zoo class provides the coredata() and index() functions, which separate the core data from the index so that each can be analysed and manipulated on its own.
Example
Consider the following program −
# Importing xts library
library(xts)

# Creating the data
myData <- xts(x = rnorm(n = 10),
              order.by = seq(as.Date("2022-01-01"), length.out = 10, by = "days"),
              born = as.POSIXct("2000-09-07"))

# Using coredata() function
print(coredata(myData))
Output
            [,1]
 [1,] -2.4213283
 [2,] -0.8433878
 [3,]  2.0066340
 [4,]  0.2640308
 [5,] -0.3049552
 [6,]  0.2998816
 [7,] -0.4239970
 [8,]  0.5577881
 [9,] -0.5870677
[10,]  0.6856740
Using the index() function to get the actual dates −
print(index(myData))
Output
 [1] "2022-01-01" "2022-01-02" "2022-01-03" "2022-01-04"
 [5] "2022-01-05" "2022-01-06" "2022-01-07" "2022-01-08"
 [9] "2022-01-09" "2022-01-10"
As the output shows, coredata() returns the core data as a plain matrix and index() returns the dates.
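One reason to separate the two parts is to compute on the values and then map the result back to a date. The sketch below is illustrative only: the object peakData and its values are hypothetical, chosen so that the result is reproducible.

```r
library(xts)

# Hypothetical data with fixed values so the result is reproducible
peakData <- xts(x = c(5, 3, 9, 1),
                order.by = as.Date("2022-01-01") + 0:3)

# coredata() gives the values, index() gives the matching dates
values <- coredata(peakData)
dates <- index(peakData)

# Position of the largest value, then the date at that position
peakDate <- dates[which.max(values)]
print(peakDate)
```

Here the largest value (9) is the third observation, so the date returned is 2022-01-03.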
Basic Time Series Data Manipulations
In this section, we will discuss basic operations such as extracting observations by index and using a forward slash to select the range between two points in time.
Extraction on the basis of index
An xts object lets us extract values by passing a date string as the index. Let us consider an example below −
Example
# Importing library
library(xts)

# Create myData
myData <- rnorm(n = 500)
dates <- seq(as.Date("2022-01-01"), length.out = 500, by = "days")

# Creating myObject
myObject <- xts(x = myData, order.by = dates)

# Print the number of rows at index "2022"
nrow(myObject["2022"])
Output
[1] 365
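The index string does not have to be a whole year: the same style of subsetting accepts a year-month or a full date. The following is a small sketch under the same setup as above (500 daily observations starting on 2022-01-01); the object name dailyData is hypothetical, and the values are a simple counter instead of random numbers so that results are reproducible.

```r
library(xts)

# Same 500-day index as the example above, with deterministic values
dates <- seq(as.Date("2022-01-01"), length.out = 500, by = "days")
dailyData <- xts(x = seq_len(500), order.by = dates)

# A year-month string selects a single month
marchRows <- nrow(dailyData["2022-03"])

# A full date selects a single observation
oneDay <- dailyData["2022-03-15"]

# The remaining observations fall in 2023
rows2023 <- nrow(dailyData["2023"])

print(c(marchRows, nrow(oneDay), rows2023))
```

March has 31 days, a single date yields one row, and the 135 observations left after the 365 days of 2022 fall in 2023.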
Using forward slash (/) to select a range
We can place a forward slash between two dates when indexing an xts object to select every observation in that range, with both end dates included. For example, consider the program given below −
Example
# Importing library
library(xts)

# Create myData
myData <- rnorm(n = 500)
dates <- seq(as.Date("2022-01-01"), length.out = 500, by = "days")

# Creating myObject
myObject <- xts(x = myData, order.by = dates)

# Print the number of observations between the two dates
nrow(myObject["2022-01-01/2022-03-01"])
Output
[1] 60
As the output shows, the object holds 60 daily observations between 2022-01-01 and 2022-03-01, both dates included.
We can also use a forward slash between two date-times −
# Importing library
library(xts)

# 20 days of data by minute
values <- rnorm(n = 60*24*20)
dateTimes <- as.POSIXct("2022-11-01") + (1:(60*24*20))*60

# Creating myObject
myObject <- xts(x = values, order.by = dateTimes)

# Show the first observations between
# 2022-11-01 4AM and 2022-11-01 6AM
head(myObject["2022-11-01T04:00/2022-11-01T06:00"])
Output
                          [,1]
2022-11-01 04:00:00 -0.4277830
2022-11-01 04:01:00  0.6544654
2022-11-01 04:02:00  0.4196311
2022-11-01 04:03:00 -0.1766988
2022-11-01 04:04:00 -1.8570621
2022-11-01 04:05:00  0.3229214
As the output shows, the observations between the two specified times are selected, and head() prints the first six of them.
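A range does not need both ends. Leaving one side of the slash empty selects everything up to, or everything from, the given point. The sketch below is an illustration that reuses the 500-day daily index from the earlier examples; the object name rangeData is hypothetical and the values are a simple counter.

```r
library(xts)

# 500 daily observations starting on 2022-01-01
dates <- seq(as.Date("2022-01-01"), length.out = 500, by = "days")
rangeData <- xts(x = seq_len(500), order.by = dates)

# Everything up to and including 2022-01-10
firstTen <- rangeData["/2022-01-10"]

# Everything from May 2023 to the end of the series
fromMay <- rangeData["2023-05/"]

print(nrow(firstTen))
print(nrow(fromMay))
```

The series ends on 2023-05-15, so the second subset contains the 15 observations from May 2023.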
Merge and Modify Time Series Data
Syntax
The xts package provides a merge() method that joins xts objects on their index; it can also align an xts object with a vector of dates. This function has the following syntax −
merge(object1, object2, ..., objectN, join = typeOfJoin, fill = fillValue)
The leading arguments are the objects that you want to merge. The join argument is the type of join to be performed: "outer" (the default), "inner", "left" or "right". The fill argument is optional and specifies the value that replaces the NA entries created where an object has no observation for a date.
Performing an Inner Join
Now let us consider a program that performs an inner join between two xts objects indexed by dates −
Example
# Importing library
library(xts)

# Creating an object of xts class
myObject1 <- xts(x = rnorm(n = 4),
                 order.by = as.Date(c("2022-11-01", "2022-11-04",
                                      "2022-11-10", "2022-11-23")))

# Creating another object of xts class
myObject2 <- xts(x = rnorm(n = 4),
                 order.by = as.Date(c("2022-11-04", "2022-11-10",
                                      "2022-11-15", "2022-11-21")))

# Performing inner join
merge(myObject1, myObject2, join = "inner")
Output
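           myObject1 myObject2
2022-11-04 0.9151754  1.332591
2022-11-10 0.4244563 -1.494515
Only the two dates present in both objects survive the inner join. Similarly, we can perform a left outer join with join = "left" and a right outer join with join = "right".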
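The left and right joins mentioned above can be sketched as follows. This is an illustrative example, not the tutorial's own code: the object names and the fixed values are hypothetical, while the dates match the inner join example.

```r
library(xts)

# Hypothetical objects with fixed values and the same dates as above
leftObject <- xts(x = c(1, 2, 3, 4),
                  order.by = as.Date(c("2022-11-01", "2022-11-04",
                                       "2022-11-10", "2022-11-23")))
rightObject <- xts(x = c(10, 20, 30, 40),
                   order.by = as.Date(c("2022-11-04", "2022-11-10",
                                        "2022-11-15", "2022-11-21")))

# Left join: keep every date of the first object
leftJoin <- merge(leftObject, rightObject, join = "left")
print(leftJoin)

# Right join: keep every date of the second object
rightJoin <- merge(leftObject, rightObject, join = "right")
print(rightJoin)
```

In each result, the object whose index is kept has no missing values, while the other column is NA on the two dates it does not cover.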
Performing a Full Outer Join
Now we will see a program demonstrating the full outer join of two objects of xts class −
Example
# Importing library
library(xts)

# Creating an object of xts class
myObject1 <- xts(x = rnorm(n = 4),
                 order.by = as.Date(c("2022-11-01", "2022-11-04",
                                      "2022-11-10", "2022-11-23")))

# Creating another object of xts class
myObject2 <- xts(x = rnorm(n = 4),
                 order.by = as.Date(c("2022-11-04", "2022-11-10",
                                      "2022-11-15", "2022-11-21")))

# Performing full outer join
merge(myObject1, myObject2, join = "outer")
Output
            myObject1   myObject2
2022-11-01 -0.1080882          NA
2022-11-04  0.6906676 -0.75314257
2022-11-10 -0.3375777  1.29528001
2022-11-15         NA  0.09088094
2022-11-21         NA  0.20408394
2022-11-23 -1.7205721          NA
Performing a Full Outer Join with a Fill Value
Consider another program below that also performs a full outer join of the two objects. This time we pass fill = 0 as a third argument to the merge() function, so every NA value is replaced by 0 in the output −
Example
# Importing library
library(xts)

# Creating an object of xts class
myObject1 <- xts(x = rnorm(n = 4),
                 order.by = as.Date(c("2022-11-01", "2022-11-04",
                                      "2022-11-10", "2022-11-23")))

# Creating another object of xts class
myObject2 <- xts(x = rnorm(n = 4),
                 order.by = as.Date(c("2022-11-04", "2022-11-10",
                                      "2022-11-15", "2022-11-21")))

# Performing full outer join, replacing NA values with 0
merge(myObject1, myObject2, join = "outer", fill = 0)
Output
             myObject1 myObject2
2022-11-01 -0.27983799  0.000000
2022-11-04  0.56771575  1.079353
2022-11-10  0.09849405  1.169731
2022-11-15  0.00000000 -1.022448
2022-11-21  0.00000000  1.031976
2022-11-23  0.99577871  0.000000
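Replacing missing values with 0 is one option; for many time series, carrying the last observation forward is a more natural choice. The sketch below shows that alternative. It is an assumption-laden illustration: the object names and fixed values are hypothetical, and it applies na.locf() (from zoo, loaded together with xts) to the merged result rather than using the fill argument.

```r
library(xts)

# Hypothetical objects with fixed values and the same dates as above
seriesA <- xts(x = c(1, 2, 3, 4),
               order.by = as.Date(c("2022-11-01", "2022-11-04",
                                    "2022-11-10", "2022-11-23")))
seriesB <- xts(x = c(10, 20, 30, 40),
               order.by = as.Date(c("2022-11-04", "2022-11-10",
                                    "2022-11-15", "2022-11-21")))

# Full outer join leaves NA where a series has no observation
merged <- merge(seriesA, seriesB, join = "outer")

# Carry the last observation forward; na.rm = FALSE keeps leading NAs
filled <- na.locf(merged, na.rm = FALSE)
print(filled)
```

The first value of the second column stays NA because there is no earlier observation to carry forward.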
Apply and aggregate by time
The xts package also provides the endpoints() function, which returns the positions of the last observation in each interval. The interval is chosen with the on argument, which accepts values such as "years", "quarters", "months", "weeks", "days", "hours" and "minutes" −
endpoints(myData, on = "months")
Example
Let us consider the program below, which creates an xts object with dates from 2022-01-01 to 2022-01-10 −
# Importing xts library
library(xts)

# Creating the data
myData <- xts(x = rnorm(n = 10),
              order.by = seq(as.Date("2022-01-01"), length.out = 10, by = "days"),
              born = as.POSIXct("2000-09-07"))

# Print myData
print(myData)
Output
                  [,1]
2022-01-01 -0.71176135
2022-01-02  0.07589876
2022-01-03 -0.06607525
2022-01-04  0.53143095
2022-01-05  0.11743337
2022-01-06 -0.29164378
2022-01-07 -0.04782661
2022-01-08 -1.93776118
2022-01-09 -0.04961253
2022-01-10 -0.45633307
Now we can call endpoints() with on = "weeks" to print the last observation of each week: the two Sundays (2022-01-02 and 2022-01-09) and the final observation (2022-01-10), which closes the incomplete last week. The vector returned by endpoints() starts with 0, which selects nothing when used as an index. Note that the program creates the data again, so the values differ from the output above −
# Importing xts library
library(xts)

# Creating the data
myData <- xts(x = rnorm(n = 10),
              order.by = seq(as.Date("2022-01-01"), length.out = 10, by = "days"),
              born = as.POSIXct("2000-09-07"))

# Get endpoints
endPoints <- endpoints(myData, on = "weeks")

# Print the endpoints data from myData
myData[endPoints]
Output
                   [,1]
2022-01-02  0.399972612
2022-01-09 -0.009547296
2022-01-10 -0.622855139
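The endpoints are most useful as input to an aggregation. The following sketch shows one way to apply a function to each weekly interval using period.apply() from xts. It is an illustration under stated assumptions: the object name weekData is hypothetical, and the values 1 to 10 replace the random data so that the result can be checked by hand.

```r
library(xts)

# Ten daily observations with fixed values 1..10
weekData <- xts(x = as.numeric(1:10),
                order.by = seq(as.Date("2022-01-01"), length.out = 10, by = "days"))

# Positions of the last observation in each week (starts with 0)
weekEnds <- endpoints(weekData, on = "weeks")
print(weekEnds)

# Apply mean() to the observations between consecutive endpoints
weeklyMean <- period.apply(weekData, INDEX = weekEnds, FUN = mean)
print(weeklyMean)
```

The first interval covers two days (mean 1.5), the second a full week of values 3 to 9 (mean 6), and the last a single day (10). xts also offers apply.weekly() as a shortcut for this pattern.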
Conclusion
In this tutorial, we have discussed how to manipulate time series data in R using xts and zoo. We covered basic manipulation, merging and modifying time series, and lastly how to apply and aggregate by time. I hope this tutorial has helped you strengthen your knowledge in the field of data science.
