To create a stacked bar plot with the barplot function, we need to pass a matrix rather than a data frame, because barplot in base R accepts only a vector or a matrix as its height argument. Each column of the matrix becomes one bar and each row becomes a stacked segment within it. We must be careful when stacking, because a stacked bar is meaningful only when the segments can be added together, as with count data. Here, you will see examples with count as well as continuous data; read the graphs carefully and note how they differ from each other. Example 1: M1 ...
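As a rough illustration of the matrix requirement described above, the sketch below builds a small count matrix and stacks it. The matrix name counts, its values, and its row and column labels are made up for illustration and do not come from the original example.

```r
# Hypothetical count data: rows are the stacked segments, columns are the bars.
counts <- matrix(c(5, 3, 2,
                   4, 6, 1),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("Male", "Female"), c("A", "B", "C")))

# beside = FALSE (the default) stacks the rows within each bar.
# barplot() returns the x positions of the bar midpoints.
mids <- barplot(counts,
                col = c("steelblue", "orange"),
                legend.text = rownames(counts),
                main = "Stacked bar plot of counts")

# The total height of each stacked bar equals the column sum.
totals <- colSums(counts)
totals
```

If the data starts out as a data frame, converting the numeric columns with as.matrix before calling barplot is one common way to get the required input.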
Missing data is often filled in with zeros, which is possible when zero lies outside the valid range of a variable. In such situations, we can remove the rows in which all the values are zero. For this purpose, we can use the rowSums function: if the row sum is greater than zero we keep the row, otherwise we drop it. This test assumes the data contains no negative values, since positive and negative entries could otherwise cancel to zero. Example 1: consider the below data frame: set.seed(251) x1 ...
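A minimal sketch of the row filter described above, using a small hand-made data frame (the column names and values are assumptions, not the original example data). A second variant counts non-zero entries, which may be safer if negative values can occur.

```r
# Hypothetical data frame in which rows 1 and 3 contain only zeros.
df <- data.frame(x1 = c(0, 2, 0, 5),
                 x2 = c(0, 0, 0, 1),
                 x3 = c(0, 3, 0, 0))

# Keep rows whose sum is greater than zero (assumes no negative values).
kept <- df[rowSums(df) > 0, ]

# Variant: keep rows with at least one non-zero entry,
# which also works when negative values are present.
kept_nonzero <- df[rowSums(df != 0) > 0, ]

kept
```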
To generate every combination of x values across y positions (permutations with repetition), we can use the expand.grid function. For example, if we want three columns covering all combinations of the values 0 to 5, it can be done in R with the below command: x ...
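The original command is cut off, so the following is only one plausible way to build such a grid, not necessarily the author's. It passes a list of three copies of 0:5 to expand.grid; the object name grid is an assumption.

```r
# One copy of the value range per column; three columns in total.
value_sets <- rep(list(0:5), 3)

# expand.grid() returns a data frame with one row per combination.
grid <- expand.grid(value_sets)

dim(grid)   # 6^3 = 216 rows and 3 columns
head(grid)
```

The first column varies fastest, so the rows run through every value of the first position before the second position changes.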
The Mahalanobis distance is the distance between a case and the centroid, scaled by the covariance of the data, where the centroid can be thought of as an overall mean for multivariate data. We can say that the centroid is the multivariate equivalent of the mean. A Mahalanobis distance of zero means that the case lies exactly at the centroid, and the larger the value, the farther the case is from the centroid. In R, we can use the mahalanobis function to find the Mahalanobis distance; note that this function returns the squared distance. Example 1: consider the below data frame: set.seed(981) x1 ...
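A short sketch of the calculation described above, on simulated data. The seed, the data frame, and the column names are assumptions made for illustration, since the original example is truncated.

```r
# Hypothetical multivariate data: 20 cases, 3 variables.
set.seed(1)
df <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
m <- as.matrix(df)

# The centroid is the vector of column means.
centroid <- colMeans(m)

# mahalanobis() returns the SQUARED distance of each row from the centroid.
d2 <- mahalanobis(m, center = centroid, cov = cov(m))

# Take the square root to get the distance on the original scale.
d <- sqrt(d2)

# The centroid itself is at distance zero.
d_centroid <- mahalanobis(centroid, center = centroid, cov = cov(m))

head(d)
```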
When we have two or more categorical columns in an R data frame, with the category levels stored as strings or as numbers, we can find the frequency of one column based on another. This helps us identify the cross-column frequencies and understand the distribution of one categorical column within each level of another. To do this with the dplyr package, we can use the filter function together with count.

Example: consider the below data frame: Group ...

df1%>%filter(Standard=="II")%>%count(Group)

Output
  Group n
1     1 1
2     2 1
3     3 2
4     4 1

Example

df1%>%filter(Standard=="III")%>%count(Group)

Output
  Group n
1     1 1
2     3 2
3     4 6
4 ...
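Because the data frame definition is missing from the excerpt, here is a small self-contained sketch of the same filter-then-count pattern. It assumes the dplyr package is installed, and the values in df1 are invented, so the counts will not match the output shown above.

```r
library(dplyr)

# Hypothetical data: a numeric Group column and a string Standard column.
df1 <- data.frame(Group    = c(1, 2, 2, 3, 1, 3, 3),
                  Standard = c("II", "II", "II", "I", "III", "II", "II"))

# Frequency of each Group among the rows where Standard is "II".
res <- df1 %>%
  filter(Standard == "II") %>%
  count(Group)

res
```

Calling count(Standard, Group) on the unfiltered data frame would give the frequencies for every level at once.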
A geometric progression is a sequence of numbers in which each number after the first is found by multiplying the previous one by a fixed number, the common ratio. To generate a geometric progression in R, we can raise the common ratio to a sequence of exponents created with the seq function. For example, a geometric progression with common ratio 2 and exponents running from 0 to 5 in steps of 1 can be generated as 2^seq(0, 5, by=1), and the output would be 1, 2, 4, 8, 16, 32. A step other than 1 changes the effective ratio: 2^seq(0, 5, by=2) has common ratio 4.

Examples

2^seq(0, 5, by=1)
[1] 1 2 4 8 16 32

2^seq(0, 5, by=2)
[1] 1 4 16

2^seq(0, ...
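The same idea extends to a progression with an arbitrary first term. The sketch below is an illustration only; the names a, r, and n are assumptions introduced here.

```r
# First term, common ratio, and number of terms (all hypothetical values).
a <- 3
r <- 2
n <- 6

# k-th term is a * r^(k - 1), so the exponents run from 0 to n - 1.
gp <- a * r^seq(0, n - 1, by = 1)
gp
# [1]  3  6 12 24 48 96

# Check: the ratio of consecutive terms is constant.
ratios <- gp[-1] / gp[-n]
ratios
```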
To create a random sample in R, we can use the sample function, but if weights are provided for the values then we need to pass them as sampling probabilities. For example, if we have a data frame df that contains a column X with some values and another column Weight with the corresponding weights, then a random sample of size 10 can be generated as follows: df[sample(seq_len(nrow(df)),10,prob=df$Weight),] Example: consider the below data frame: set.seed(1256) x ...
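A minimal sketch of weighted row sampling, assuming a made-up data frame with columns X and Weight as named in the text. One row is given weight zero to show that it can never be drawn.

```r
set.seed(42)

# Hypothetical data: 20 rows; the last row has weight 0.
df <- data.frame(X = 1:20,
                 Weight = c(rep(1, 10), rep(5, 9), 0))

# Draw 10 row indices without replacement; prob need not sum to 1,
# sample() normalises the weights internally.
idx <- sample(seq_len(nrow(df)), size = 10, prob = df$Weight)
smp <- df[idx, ]

smp
```

Adding replace = TRUE would allow the same row to be drawn more than once.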
If we have an integer column that actually contains date values, for example 29th September 2020 stored as 20200929, then we can convert it to a date with the transform function. The values are parsed with the as.Date function, but they must first be converted with as.character, because as.Date would otherwise treat the integer as a number of days rather than as a formatted date. Example 1: consider the below data frame: ID ...
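The conversion described above might look like the following sketch. The data frame and its ID and Date columns are assumptions; the format string "%Y%m%d" matches the 20200929 layout mentioned in the text.

```r
# Hypothetical data frame with dates stored as integers (YYYYMMDD).
df <- data.frame(ID = 1:3,
                 Date = c(20200929L, 20201001L, 20210115L))

# Convert to character first, then parse with an explicit format.
df <- transform(df,
                Date = as.Date(as.character(Date), format = "%Y%m%d"))

str(df)
```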
To create a separate scatterplot for each factor level, we can use the facet_grid function of the ggplot2 package. For example, suppose we have a data frame df with a factor column named Factor and numerical columns named x and y; then the scatterplots for the factor levels can be created as: ggplot(df,aes(x,y))+geom_point()+facet_grid(~Factor) Example: consider the below data frame: set.seed(1251) Factor ...
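A self-contained sketch of the faceted scatterplot, assuming the ggplot2 package is installed. The simulated data and the level names A, B, and C are invented for illustration.

```r
library(ggplot2)

# Hypothetical data: three factor levels with 10 points each.
set.seed(1)
df <- data.frame(Factor = factor(rep(c("A", "B", "C"), each = 10)),
                 x = rnorm(30),
                 y = rnorm(30))

# facet_grid(~Factor) places one panel per factor level side by side.
p <- ggplot(df, aes(x, y)) +
  geom_point() +
  facet_grid(~Factor)

print(p)
```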
In base R, we can use the legend function to add a legend to a plot. For example, if we want to create a histogram with a legend in the top-right position then we can use legend("topright",legend="Normal Distribution"), and if we want to change the font size then we need to use the cex argument as shown below: legend("topright",legend="Normal Distribution",cex=2) Example: consider the below histogram: x ...
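A short sketch of the legend sizing described above, on simulated data (the seed and sample size are assumptions). It also measures the legend box at the default size without drawing it, to show that cex = 2 enlarges the legend.

```r
# Hypothetical data for the histogram.
set.seed(1)
x <- rnorm(1000)

hist(x, main = "Histogram of x")

# Draw the legend at twice the default text size.
# legend() invisibly returns the geometry of the legend box.
big <- legend("topright", legend = "Normal Distribution", cex = 2)

# Compute (but do not draw) the default-size legend for comparison.
small <- legend("topright", legend = "Normal Distribution",
                cex = 1, plot = FALSE)

# The enlarged legend box is wider than the default one.
big$rect$w > small$rect$w
```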