R - Interview Questions


Dear readers, these R Interview Questions have been designed to acquaint you with the kind of questions you may encounter during an interview on R programming. In my experience, good interviewers rarely plan to ask any particular question. They usually start with a basic concept of the subject and continue based on the discussion and on your answers.

R is a programming language meant for statistical analysis and for creating graphs for that purpose. Variables are not declared with a data type; instead they are assigned data objects, which are used for calculations. R is used in fields such as data mining, regression analysis and probability estimation, with the help of the many packages available for it.

There are 6 basic data objects in R: vectors, lists, matrices, arrays, factors and data frames.

A valid variable name consists of letters, numbers and the dot or underscore characters. The name must start with a letter, or with a dot that is not followed by a number.
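
As a quick check of these rules, the sketch below uses base R's make.names(), which rewrites a string into a syntactically valid name; a candidate counts as valid when it comes back unchanged. The helper is_valid_name and the sample names are made up for illustration. Reserved words such as "if" are also altered by make.names(), so this check treats them as invalid too.

```r
# Hypothetical helper: a name is valid if make.names() leaves it unchanged
is_valid_name <- function(x) make.names(x) == x

# Sample names (illustrative only)
candidates <- c("var_name2.", ".var_name", "2var_name", ".2var_name", "_var_name")
valid <- is_valid_name(candidates)

# Show each candidate next to the result of the check
print(data.frame(name = candidates, valid = valid))
```

Only the first two candidates pass: the others start with a number, with a dot followed by a number, or with an underscore.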

A matrix is always two-dimensional, as it has only rows and columns. An array can have any number of dimensions, and a three-dimensional array can be viewed as a stack of matrices. For example, a 3x3x2 array represents 2 matrices, each of dimension 3x3.
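
The 3x3x2 case can be sketched as follows; the variable names and the values 1 to 18 are chosen only for illustration.

```r
# A 3x3x2 array: two 3x3 matrices stacked along the third dimension
a <- array(1:18, dim = c(3, 3, 2))

first  <- a[, , 1]   # holds the values 1 to 9
second <- a[, , 2]   # holds the values 10 to 18

print(dim(a))            # dimensions of the whole array
print(is.matrix(first))  # each slice is an ordinary matrix
```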

Factor data objects are used to store and process categorical data in R.
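
A minimal sketch of a factor built from made-up categorical values; levels() lists the distinct categories and table() counts the observations in each.

```r
# Sample categorical data (made up for illustration)
sizes <- c("small", "large", "medium", "small", "large", "small")

f <- factor(sizes)
print(levels(f))   # the distinct categories
print(table(f))    # number of observations per category
```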

A CSV file can be loaded using the read.csv() function. R creates a data frame when it reads a CSV file with this function.
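
The sketch below writes a small CSV to a temporary file so that it is self-contained, then reads it back. The file contents and the name employees are invented for the example.

```r
# Write a small CSV to a temporary file so the example needs no external data
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,salary", "1,Rick,623.3", "2,Dan,515.2"), path)

# read.csv() returns a data frame, using the first line as column names
employees <- read.csv(path)
print(is.data.frame(employees))
print(employees)

unlink(path)  # remove the temporary file
```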

The command getwd() gives the current working directory in the R environment.

The R base package is the package that is loaded by default when the R environment is set up. It provides basic functionality such as input/output and arithmetic calculations in the R environment.

Logistic regression deals with measuring the probability of a binary response variable. In R, the function glm() with family = binomial is used to fit a logistic regression model.
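
As an illustration, the sketch below fits a logistic regression on the built-in mtcars data set, using the transmission type am (0 or 1) as the binary response and the weight wt as the only predictor. The choice of data set and predictor is an assumption made for the example.

```r
# Binary response: am (0 = automatic, 1 = manual) from the built-in mtcars data
model <- glm(am ~ wt, data = mtcars, family = binomial)
print(summary(model))

# Predicted probability of a manual transmission for each car
probs <- predict(model, type = "response")
print(head(probs))
```

With type = "response", predict() returns probabilities rather than values on the log-odds scale.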

The expression M[4,2] gives the element at 4th row and 2nd column.

When two vectors of different lengths are involved in an operation, the elements of the shorter vector are reused to complete the operation. This is called element recycling. Example: if v1 <- c(4,1,0,6) and v2 <- c(2,4), then v1*v2 gives (8,4,0,24). The elements 2 and 4 are repeated.

We can call a function in R in 3 ways. The first method is to call it using the position of the arguments. The second method is to call it using the names of the arguments, and the third method is to call it with its default arguments.
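
The three calling styles can be sketched with a small made-up function; calc and its arguments x, y and z are hypothetical names.

```r
# Hypothetical function with default values for all three arguments
calc <- function(x = 3, y = 6, z = 1) x * y + z

by_position <- calc(5, 3, 11)              # arguments matched by position: 5 * 3 + 11
by_name     <- calc(z = 11, x = 5, y = 3)  # matched by name, so order does not matter
by_default  <- calc()                      # all defaults used: 3 * 6 + 1

print(c(by_position, by_name, by_default))
```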

Lazy evaluation of a function means that an argument is evaluated only if it is used inside the body of the function. If the body never refers to the argument, it is simply ignored.
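
A minimal sketch of this behaviour, using a made-up function whose second argument is never referenced in its body:

```r
# The second argument b is never used in the body, so it is never evaluated
first_only <- function(a, b) a * 2

r1 <- first_only(3)                         # b is missing, yet no error is raised
r2 <- first_only(3, stop("not evaluated"))  # the stop() call never runs

print(c(r1, r2))
```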

To install a package in R, use the following command.

install.packages("package Name")

The package named "XML" is used to read and process the XML files.

We can update any element of a list. We can also delete any element, not only the one at the end, by assigning NULL to it; the elements after it then shift down by one position.
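
A minimal sketch of updating and deleting list elements with made-up values. In R, assigning NULL to an element removes it, whatever its position.

```r
items <- list("red", "green", "blue", "yellow")

items[[2]] <- "GREEN"           # update an element in place
items[[3]] <- NULL              # delete the third element; later elements shift down
items[[length(items)]] <- NULL  # delete the element at the end of the list

print(items)
```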

The general expression to create a matrix in R is - matrix(data, nrow, ncol, byrow, dimnames)
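
For illustration, the sketch below fills in each argument of matrix() with made-up values and then reads one element by row and column index.

```r
# 4 rows, 3 columns, filled row by row, with row and column names
M <- matrix(1:12, nrow = 4, ncol = 3, byrow = TRUE,
            dimnames = list(paste0("r", 1:4), paste0("c", 1:3)))
print(M)

# Element at the 4th row and 2nd column
print(M[4, 2])
```

With byrow = TRUE the values fill the first row before moving to the second; the default, byrow = FALSE, fills column by column.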

The boxplot() function is used to create boxplots in R. It takes a formula and a data frame as inputs to create the boxplots.

A frequency of 6 indicates that the time interval for the time series data is every 10 minutes of an hour.

In R, data objects can be converted from one form to another. For example, we can create a data frame by merging many lists. This involves a series of R commands that bring the data into the new format. This is called data reshaping.
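
One possible way to merge several lists into a data frame is sketched below. The records are invented, and the combination of lapply(), as.data.frame() and rbind() is just one of several approaches.

```r
# Each list describes one record (made-up values)
records <- list(
  list(name = "Rick", age = 31),
  list(name = "Dan",  age = 27),
  list(name = "Ana",  age = 45)
)

# Convert every list to a one-row data frame, then stack the rows
people <- do.call(rbind, lapply(records, as.data.frame))
print(people)
```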

It generates 4 random numbers between 0 and 1.

Use the command


It splits the strings in vector x into substrings at the position of letter e.

x <- "The quick brown fox jumps over the lazy dog"
split.string <- strsplit(x, " ")
extract.words <- split.string[[1]]
result <- unique(tolower(extract.words))

Error in v * x[1] : non-numeric argument to binary operator

[1] 5 12 21 32

It converts a list to a vector.

x <- pbinom(26,51,0.5)


Using the function as.data.frame()

function(x) { x[is.na(x)] <- sum(x, na.rm = TRUE); x }

It is used to apply the same function to each of the elements in an array. For example, finding the mean of the values in every row.
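
The text does not name the function; assuming it refers to base R's apply(), the row-mean example can be sketched as follows with a made-up matrix.

```r
m <- matrix(1:6, nrow = 2)       # rows are (1, 3, 5) and (2, 4, 6)

row_means <- apply(m, 1, mean)   # MARGIN = 1 applies the function to each row
col_sums  <- apply(m, 2, sum)    # MARGIN = 2 applies the function to each column

print(row_means)
print(col_sums)
```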

Every matrix can be called an array, but not the reverse. A matrix is always two-dimensional, but an array can have any number of dimensions.


sd(x, na.rm=TRUE)


"%%" gives the remainder of dividing the first vector by the second, while "%/%" gives the quotient of that division.
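
A short sketch of both operators on made-up vectors; each operation is applied element by element.

```r
v1 <- c(7, 10, 15)
v2 <- c(2, 3, 4)

remainder <- v1 %% v2    # 1 1 3
quotient  <- v1 %/% v2   # 3 3 3

print(remainder)
print(quotient)
```

For each pair of elements, quotient * v2 + remainder gives back the original value in v1.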

It finds the column that has the maximum value for each row.



data(package = "MASS")

data(package = .packages(all.available = TRUE))

It is used to install an R package from a local directory by browsing and selecting the file.

15 %in% x

pairs(formula, data)

Where formula represents the series of variables used in pairs and data represents the data set from which the variables will be taken.

The subset() function is used to select variables and observations. The sample() function is used to choose a random sample of size n from a dataset.
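
Both functions can be sketched on the built-in mtcars data set. The threshold, the selected columns and the seed are arbitrary choices made for the example.

```r
# subset(): keep the rows where mpg > 30 and only the mpg and cyl columns
efficient <- subset(mtcars, mpg > 30, select = c(mpg, cyl))
print(efficient)

# sample(): draw 5 row indices at random, without replacement
set.seed(42)  # arbitrary seed so the draw is repeatable
picked <- mtcars[sample(nrow(mtcars), 5), ]
print(picked)
```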

is.matrix(m) should return TRUE.

[1] NA

The function t() is used for transposing a matrix. Example: t(m), where m is a matrix.
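
A minimal sketch with a made-up 2x3 matrix:

```r
m  <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns
tm <- t(m)                    # 3 rows, 2 columns

print(m)
print(tm)
```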

The "next" statement in R programming language is useful when we want to skip the current iteration of a loop without terminating it.
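
As a small illustration, the loop below uses next to skip the even values while continuing with the remaining iterations. The variable names are made up.

```r
kept <- c()
for (i in 1:6) {
  if (i %% 2 == 0) next   # skip even values, but keep looping
  kept <- c(kept, i)      # reached only for odd values
}
print(kept)   # 1 3 5
```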