To find the means of all columns in an R data frame, we can use the colMeans function. Base R has no direct counterpart for standard deviations, so we can use sd with apply across the columns instead. For example, for a data frame df, apply(df, 2, sd) returns the standard deviation of every column; here 2 refers to the columns. If we want to ...
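A minimal sketch of the two calls side by side. The data frame df and its columns a and b are made up for illustration; sapply is shown only as an alternative that gives the same result here.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8))

# Column means with the built-in helper
col_means <- colMeans(df)

# Column standard deviations: apply sd over margin 2 (columns)
col_sds <- apply(df, 2, sd)

# Alternative: a data frame is a list of columns, so sapply also works
col_sds_alt <- sapply(df, sd)

print(col_means)
print(col_sds)
```

Note that apply converts the data frame to a matrix first, so it is best suited to data frames whose columns are all numeric.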
When we take a random sample of rows from an R data frame, the sampled rows keep the row numbers they had in the original data frame, so the row numbers appear in random order. This can create confusion during analysis, especially when we need to refer to rows by number. Therefore, we can renumber the rows from 1 to the number of rows in the selected sample.

Example

Consider the below data frame:

> set.seed(111)
> df1
  x1 x2 x3
1 1.735220712 2.8616625 1.824274
2 1.169264128 2.8469644 ...
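The renumbering described above can be sketched as follows. The column definitions and the sample size of 5 are assumptions for illustration, not the values from the original example.

```r
set.seed(111)

# Hypothetical data frame; the columns are made up for this sketch
df1 <- data.frame(x1 = rnorm(20), x2 = rnorm(20))

# Random sample of 5 rows; row names still come from df1
sampled <- df1[sample(nrow(df1), 5), ]
print(rownames(sampled))

# Renumber the rows from 1 to the number of sampled rows
rownames(sampled) <- 1:nrow(sampled)
print(rownames(sampled))
```

Assigning NULL with rownames(sampled) <- NULL resets the row names to the same 1 to n sequence.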
A vector may have thousands of values, and they may all be different or include repeats. The values may also be grouped or randomly drawn, with only a few of them repeated. Regardless of the values in a vector, to find the largest few values we can sort the vector in ascending order and then select the values at the end.

Examples

> x1
[1] -1.4447473195 3.2906645299 -0.4680055849 0.1611487482 -0.7715094280
[6] 0.4442103640 0.3702444686 0.0783124252 1.3476432299 1.0140576107
[11] -0.0968917066 0.4628821017 0.3102594626 -0.2946001275 0.1498108166
[16] -0.6002154305 0.5905382364 1.3892651534 0.1008921325 -0.6486318692
[21] -0.0562831933 -0.6887431711 0.4907512082 -0.3994662410 0.7827897030
[26] 0.5294704584 -1.3802965730 -0.6159076490 -0.0009408529 1.6182294859 ...
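A short sketch of the sort-then-select approach. The vector x and the choice of three values are assumptions for illustration.

```r
# Hypothetical vector with a repeated maximum
x <- c(5, 1, 9, 3, 7, 9)

# Sort ascending, then take the last three values
top3_ascending <- tail(sort(x), 3)

# Equivalent: sort descending, then take the first three values
top3_descending <- head(sort(x, decreasing = TRUE), 3)

print(top3_ascending)   # 7 9 9
print(top3_descending)  # 9 9 7
```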
A numeric vector may contain a large number of elements, so we might want to convert it into a vector of intervals. For example, if a vector holds the values 1 to 10, we might want to convert it into intervals such as (1, 5) for 1, 2, 3, 4, and 5 and (6, 10) for 6, 7, 8, 9, and 10. This can be done with the cut function, whose breaks argument defines the intervals that the vector elements are grouped into.

Examples

> x1
[1] 1 2 3 4 5 6 7 ...
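A minimal sketch of cut on the 1 to 10 case. The break points c(0, 5, 10) are an assumption chosen for this sketch; by default cut builds intervals that are open on the left and closed on the right, so the labels differ slightly from the (1, 5) and (6, 10) notation above.

```r
# Values 1 to 10, as in the description above
x <- 1:10

# Breaks at 0, 5 and 10 give the intervals (0,5] and (5,10]
intervals <- cut(x, breaks = c(0, 5, 10))

print(intervals)
print(table(intervals))  # five values fall in each interval
```

Custom labels can be supplied through the labels argument of cut if the default interval notation is not wanted.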
Cumulative sums are the running totals of consecutive values, and we can compute them for any numeric vector or for a column of an R data frame. If the vector contains an NA, we need to skip it, so the number of cumulative sums is reduced by the number of NA values. We can ignore the NA values while calculating the cumulative sums with the cumsum function by subsetting with !is.na.

Examples

> x1
[1] 1 2 3 4 5 6 7 8 9 10 NA
> cumsum(x1[!is.na(x1)])
[1] 1 3 6 ...
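A small sketch showing why the NA has to be dropped first. The vector x is an assumption for illustration, with the NA placed in the middle so that its effect is visible.

```r
# Hypothetical vector with an NA in the middle
x <- c(1, 2, NA, 4, 5)

# Without filtering, every sum from the NA onwards is NA
with_na <- cumsum(x)

# Drop the NA values first, then accumulate
without_na <- cumsum(x[!is.na(x)])

print(with_na)     # 1 3 NA NA NA
print(without_na)  # 1 3 7 12
```

The filtered result has four elements instead of five, one fewer for the single NA that was removed.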
The fraction form of a decimal value represents the value with a division sign, for example 0.5 as 1/2. In R, we can use the fractions function of the MASS package to convert a decimal value or a vector of decimal values to fractional form. To do so, we pass the value to the function as fractions(x), where x is a decimal value or a vector of decimal values.

Examples

Loading MASS package:

> library(MASS)
> fractions(0.14)
[1] 7/50
> fractions(1.14)
[1] 57/50
> library(MASS)
> fractions(0.5)
[1] 1/2
> fractions(0.3)
[1] 3/10
> fractions(0.31)
[1] 31/100
> fractions(1.31)
[1] 131/100
> fractions(2.01)
[1] 201/100
> ...
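The same function also accepts a vector, which the examples above do not show. This is a minimal sketch; the vector name decimals is an assumption, and it relies on the MASS package being installed.

```r
library(MASS)

# Hypothetical vector of decimal values
decimals <- c(0.5, 0.25, 0.14)

# Convert the whole vector at once
fracs <- fractions(decimals)
print(fracs)

# The fraction labels are available as character strings
frac_labels <- as.character(fracs)
```

The result keeps its numeric values, so it can still be used in arithmetic; only the printed form changes.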
To add text inside a plot created by ggplot2, we can use the annotate function. It is used to explain something about the plot or to add information that helps readers understand the plot better. Sometimes we might want to change the angle of the annotated text, especially when some information is presented vertically in the plot; for that, we can use the angle argument of the annotate function.

Example

Consider the below data frame:

> df
  x y
1 4.086537 5.890591
2 ...
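A minimal sketch of a rotated annotation, assuming ggplot2 is installed. The data, the label text, and the position (5, 5) are made up for illustration.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data frame for the scatterplot
df <- data.frame(x = rnorm(20, mean = 5), y = rnorm(20, mean = 5))

# angle = 90 rotates the annotated text so that it reads vertically
p <- ggplot(df, aes(x, y)) +
  geom_point() +
  annotate("text", x = 5, y = 5, label = "Rotated note", angle = 90)

# print(p) draws the plot on the active graphics device
```

The angle is given in degrees, so values such as 45 or 270 work in the same way.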
A dot plot is a type of histogram that displays dots instead of bars, and it is created for small data sets. In ggplot2, the geom_dotplot function creates the dot plot, but we have to pass a suitable binwidth, which is an argument of geom_dotplot, so that we don’t get the warning saying “Warning: Ignoring unknown parameters: bins `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.”

Example

Consider a data frame df1 with a numeric column x:

> library(ggplot2)

Creating the dot plot of x:

> ggplot(df1, aes(x))+geom_dotplot(binwidth=0.2)

Let’s have a look at one more example:

> ggplot(df2, aes(y))+geom_dotplot(binwidth=0.2)
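Since the definition of the data frame is not shown above, here is a self-contained sketch of the same call. The data in df1 is an assumption generated only for illustration; ggplot2 must be installed.

```r
library(ggplot2)

set.seed(1)
# Hypothetical small data set for the dot plot
df1 <- data.frame(x = rnorm(30))

# An explicit binwidth sets the bin size used for stacking the dots
p <- ggplot(df1, aes(x)) + geom_dotplot(binwidth = 0.2)

# print(p) draws the plot on the active graphics device
```

A reasonable binwidth depends on the spread of the data, so 0.2 is only a starting point for values on this scale.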
A percentile helps us determine the value below which a certain percentage of a data set lies. For example, suppose we have a vector of size 100 and its tenth percentile is 25. This means that ten percent of the values in the vector are less than 25, or equivalently, ninety percent of the values are greater than 25. We can find the percentiles of a vector using the quantile function in R.

Examples

> x1
[1] 7 1 7 6 6 5 3 1 5 5 4 ...
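A short sketch of quantile on a vector of size 100. The vector x is an assumption for illustration, and the results use the default interpolation method of quantile (type 7).

```r
# Hypothetical vector of size 100
x <- 1:100

# 10th, 50th and 90th percentiles
pct <- quantile(x, probs = c(0.10, 0.50, 0.90))
print(pct)

# Share of values below the 10th percentile
share_below <- mean(x < pct[["10%"]])
print(share_below)  # 0.1
```

The probs argument takes proportions between 0 and 1, so the tenth percentile is requested as 0.10.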
Deleting or adding rows and columns in a matrix of any size is mostly done with single square brackets, which is also the easiest way. To delete a row or column, we pass its index with a negative sign. To delete more than one, we combine the negative indices inside c, separated by commas, as in c(-1, -2). To delete several consecutive rows or columns, a colon can be used.

Examples

> M
[,1] [,2] [,3] [, ...
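The three deletion patterns can be sketched as follows. The matrix M and its 4 by 5 shape are assumptions for illustration.

```r
# Hypothetical 4 x 5 matrix
M <- matrix(1:20, nrow = 4)

# Delete the first row
no_first_row <- M[-1, ]

# Delete columns 1 and 2 with negative indices inside c()
no_first_two_cols <- M[, c(-1, -2)]

# Delete rows 1 to 2 with a colon; the parentheses matter,
# because -1:2 expands to -1, 0, 1, 2 and cannot be used as an index
no_rows_1_to_2 <- M[-(1:2), ]

print(dim(no_first_row))       # 3 5
print(dim(no_first_two_cols))  # 4 3
print(dim(no_rows_1_to_2))     # 2 5
```

None of these calls changes M itself; each returns a new matrix, so the result has to be assigned back to M to make the deletion permanent.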