Replicates of a data frame in R can be created with the sapply function; the times argument of rep.int sets how many times the rows are repeated. For example, if we have a data frame df and want 5 stacked copies of it, we can use sapply(df, rep.int, times = 5). Note that sapply returns a matrix here, so wrap the call in as.data.frame() if a data frame is required.
Example
Consider the data frame below:
set.seed(151) x1
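As a minimal sketch of the replication idea above: the data frame below is a small hypothetical one (not the one from the truncated example), and the names rep_matrix and rep_df are chosen only for illustration.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(x1 = c(1, 2, 3), x2 = c(10, 20, 30))

# Repeat every column 5 times; sapply simplifies the result to a matrix
rep_matrix <- sapply(df, rep.int, times = 5)

# Convert back to a data frame if that is what is needed
rep_df <- as.data.frame(rep_matrix)

dim(rep_df)  # 15 rows (3 original rows x 5 copies), 2 columns
```

For data frames with mixed column types, sapply would coerce everything to one type, so this approach fits best when all columns share a type.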
To find the day of the year from dates, we can use the yday function of the lubridate package. For example, if we have a date or a vector of dates, we simply pass it to yday using the syntax below:
yday("date")
or
yday("vector_of_date")
Loading the lubridate package:
library(lubridate)
Examples
date1
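A short sketch of the yday usage described above, assuming the lubridate package is installed. The dates are hypothetical values picked for illustration, and the base R line is only a cross-check.

```r
library(lubridate)

# Hypothetical dates chosen for illustration
dates <- as.Date(c("2020-01-01", "2020-03-01", "2021-12-31"))

# Day of the year for each date
day_of_year <- yday(dates)

# Base R cross-check: POSIXlt stores a zero-based day of the year
base_check <- as.POSIXlt(dates)$yday + 1

day_of_year
```

Because 2020 is a leap year, 1 March 2020 falls on day 61 rather than day 60.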
To find row means we can use the rowMeans function. If the data frame has missing values, the na.rm = TRUE argument can be used in the same way as when calculating column means. For example, if we have a data frame df that contains two columns x and y, each with some missing values, then the row means can be calculated as rowMeans(df, na.rm = TRUE).
Example
Consider the data frame below:
set.seed(1515) x1
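The effect of na.rm can be sketched on a tiny hypothetical data frame (the values below are illustrative, not taken from the truncated example):

```r
# Hypothetical data frame with missing values in both columns
df <- data.frame(x = c(1, NA, 3, NA), y = c(2, 4, NA, NA))

# Without na.rm, any row containing an NA yields NA
plain_means <- rowMeans(df)

# With na.rm = TRUE, each row mean uses only the non-missing values
row_means <- rowMeans(df, na.rm = TRUE)

row_means
```

A row in which every value is missing has nothing left to average, so its result is NaN even with na.rm = TRUE.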
One of the most important aspects of a boxplot is its Y-axis labels, because they help us understand the range of the variable. Since R generates these labels automatically and sensibly, we usually keep them, but with ggplot2 we can change the visible range using the coord_cartesian function with the ylim argument, as shown in the example below.
Example
Consider the data frame below:
set.seed(1212) x
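A minimal sketch of the coord_cartesian approach, assuming ggplot2 is installed. The data, the seed, and the limits c(-5, 5) are hypothetical choices for illustration.

```r
library(ggplot2)

# Hypothetical data: one numeric variable to summarise in a boxplot
set.seed(1)
df <- data.frame(x = rnorm(50))

# A single boxplot; coord_cartesian(ylim = ...) sets the visible Y range
p <- ggplot(df, aes(x = factor(0), y = x)) +
  geom_boxplot() +
  coord_cartesian(ylim = c(-5, 5))

# print(p) would draw the plot on the active graphics device
```

coord_cartesian only zooms the view, so the boxplot statistics are still computed from all the data. Setting limits on the scale instead would drop out-of-range points before the statistics are calculated.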
To replace missing values with the median, we can use the same trick that is used to replace missing values with the mean. For example, if we have a data frame df that contains columns x and y, both with some missing values, then the missing values can be replaced with the median as df$x[is.na(df$x)]
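The excerpt above is cut off before the replacement is complete. One common way to do it, shown here as an assumption rather than as the author's exact code, is to assign the column median (computed with na.rm = TRUE) to the missing positions. The data frame is hypothetical.

```r
# Hypothetical data frame with missing values in both columns
df <- data.frame(x = c(1, NA, 3, 10), y = c(NA, 2, 4, NA))

# Assumed idiom: fill each column's NAs with that column's median
df$x[is.na(df$x)] <- median(df$x, na.rm = TRUE)
df$y[is.na(df$y)] <- median(df$y, na.rm = TRUE)

df
```

The median must be computed with na.rm = TRUE, otherwise median() itself returns NA and nothing is replaced.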
Sometimes we want to find out which value lies at a given position in an R data frame column; this helps us understand the data collection or data simulation process. For example, if we have a data frame df that contains columns x, y, and z, each with 5000 values, then we can use df$x[[253]] to find the value in the 253rd row of column x.
Example
Consider the data frame below:
set.seed(987) x
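A short sketch of positional lookup in a column. The data frame below is simulated with rnorm as an assumption, since the original example is truncated; the three forms shown are equivalent for a single position.

```r
# Hypothetical data frame with 5000 values per column
set.seed(987)
df <- data.frame(x = rnorm(5000), y = rnorm(5000), z = rnorm(5000))

# Three equivalent ways to read the 253rd value of column x
v1 <- df$x[[253]]
v2 <- df$x[253]
v3 <- df[253, "x"]

v1
```

The double-bracket form always returns exactly one element, whereas single brackets also accept a vector of positions such as df$x[c(253, 254)].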
To create a point chart of cumulative sums using ggplot2, we apply the cumsum function to the dependent variable inside aes, the aesthetic mapping that describes how the variables are plotted. For example, if we have a data frame df that contains columns x and y, where y is the dependent variable, then the point chart of cumulative sums can be created as ggplot(df, aes(1:20, y = cumsum(y))) + geom_point(). Here 1:20 assumes that df has 20 rows.
Example
Consider the data frame below:
set.seed(666) x
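A minimal sketch of the cumulative-sum chart, assuming ggplot2 is installed. The data frame is hypothetical, and seq_along(y) is used in place of the hard-coded 1:20 so that the code does not depend on the number of rows.

```r
library(ggplot2)

# Hypothetical data: 20 counts whose running total we want to plot
set.seed(666)
df <- data.frame(y = rpois(20, lambda = 5))

# cumsum() is applied inside aes(); seq_along(y) supplies the positions
p <- ggplot(df, aes(x = seq_along(y), y = cumsum(y))) +
  geom_point()

# print(p) would draw the plot on the active graphics device
```

An alternative is to store the running total in a column first, for example df$cum_y <- cumsum(df$y), which keeps the aes call simpler.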
If we have two lists of the same size, we can create a data frame from them with the help of the expand.grid function. The expand.grid function creates a data frame from all combinations of the supplied lists, vectors, or factors. For example, if we have two lists defined as List1 and List2, then we can create a data frame using the code expand.grid(List1, List2).
Example
Consider the lists below:
List1
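The combination behaviour of expand.grid can be sketched as follows. Because the definitions of List1 and List2 are cut off in the excerpt, this sketch uses two plain hypothetical vectors, v1 and v2, instead.

```r
# Hypothetical inputs used only for illustration
v1 <- c("a", "b", "c")
v2 <- c(1, 2, 3)

# One row per combination; the first argument varies fastest
combos <- expand.grid(v1, v2)

nrow(combos)   # 3 x 3 combinations
names(combos)  # unnamed arguments get the default names Var1, Var2
```

The inputs do not have to be the same length: the number of rows is always the product of the input lengths.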
In mathematics, the dot (inner) product of two vectors is a scalar: the sum of the products of the corresponding values. For example, if we have two vectors x and y, each containing 1 and 2, then the product is 1*1 + 2*2 = 5. In R, we can compute it with t(x) %*% y, which returns the result as a 1x1 matrix.
Example 1
x1
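The worked example above (two vectors each containing 1 and 2) can be sketched directly. The variable names are illustrative, and sum(x * y) is shown as a plain-number alternative.

```r
x <- c(1, 2)
y <- c(1, 2)

# Matrix form: t(x) is a 1x2 matrix, so the result is a 1x1 matrix
inner <- t(x) %*% y

# Element-wise form: returns an ordinary number instead of a matrix
inner_scalar <- sum(x * y)

inner
inner_scalar
```

Note that x * y alone multiplies element by element and returns a vector, not the dot product.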
To find the mean of list elements, we need to unlist those elements. For example, if we have a list named List that contains three elements of equal or different sizes, such as element1, element2, and element3, then we can find the mean of all the list elements by using mean(unlist(List)).
Example 1
List1
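A brief sketch of mean(unlist(List)) on a hypothetical list with elements of different sizes. The per-element means are included only to show that they answer a different question.

```r
# Hypothetical list with three elements of different lengths
List <- list(element1 = c(1, 2, 3), element2 = c(10, 20), element3 = 4)

# Mean over all values pooled together
overall_mean <- mean(unlist(List))

# Mean of each element separately, for comparison
per_element <- sapply(List, mean)

overall_mean
per_element
```

The pooled mean weights every value equally, so it generally differs from the average of the per-element means when the elements have different lengths.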