Sometimes we need to multiply two columns and store the product in a new column so that it can be used in further analysis. For example, BMI is mass divided by the square of height, so we need the squared height. We can either multiply height by itself or square it directly; both give the same result. Hence, if we have only a height column in an R data frame, we can multiply it by itself.
Example: consider a data frame df with columns x, y, and z, created after calling set.seed(957) ...
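The BMI case described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical data frame with mass in kilograms and height in metres; the column names and values are invented for the example and are not taken from the original article.

```r
# Hypothetical data: mass in kg, height in m (values invented for illustration)
df <- data.frame(mass = c(60, 72, 85), height = c(1.60, 1.75, 1.80))

# Multiply the height column by itself and store the product in a new column
df$height_sq <- df$height * df$height

# The squared height can then be reused, here to compute BMI
df$bmi <- df$mass / df$height_sq

print(df)
```

Writing `df$height^2` instead of `df$height * df$height` gives the same column; which one to use is a matter of taste.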
In a ggplot2 boxplot, the whisker end lines are by default as wide as the box. Making them narrower or wider can help draw the viewer's attention, and this can be done with the width argument of the stat_boxplot function of the ggplot2 package. The example below shows how it works.
Example: consider a data frame df with columns x and y:
Output:
  x y
1 B 5
2 B 4
3 A 6
4 A 9
5 B 2
6 B 4
7 B 6
8 B ...
To find the sum of an element-wise division when the vectors contain zeros, we assign NA to the zeros in both vectors and then use the sum function with na.rm set to TRUE. For example, if we have two vectors x and y that contain some zeros, then we can replace the zeros with x[x==0] <- NA and y[y==0] <- NA, and sum the quotients with sum(x/y, na.rm=TRUE).
Example:
> y
Output:
[1] 1 5 3 1 9 1 3 8 9 0 1 7 3 2 3 3 2 9 3 1 9 5 5 2 5 4 4 7 4 5 9 1 9 9 4 2 3
[38] ...
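The zero-handling steps above can be sketched with two short vectors. The values are hypothetical and chosen so that the result is easy to check by hand.

```r
# Hypothetical vectors containing zeros
x <- c(4, 0, 6, 8, 0)
y <- c(2, 5, 0, 4, 0)

# Replace zeros with NA in both vectors
x[x == 0] <- NA
y[y == 0] <- NA

# Element-wise division; any pair involving NA yields NA and is dropped by na.rm
total <- sum(x / y, na.rm = TRUE)
print(total)  # only 4/2 and 8/4 remain, so the sum is 4
```

Without the NA step, a division by zero would produce Inf or NaN and distort the total.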
The chi-square goodness-of-fit test checks whether the observed counts of a nominal (categorical) variable match a hypothesized distribution, and it can also be applied to other distributions once the data are grouped into categories. The Kolmogorov-Smirnov test, on the other hand, tests goodness of fit for continuous data only. The difference is not about the programming tool; it is a statistical concept.
Example:
> x
Output:
[1] 0.078716115 -0.682154062 0.655436957 -1.169616157 -0.688543382
[6] 0.646087104 0.472429834 2.277750805 0.963105637 0.414918478
[11] 0.575005958 -1.286604138 -1.026756390 2.692769261 -0.835433410
[16] 0.007544065 0.925296720 1.058978610 0.906392907 0.973050503
Example:
> ks.test(x, pnorm)
One-sample Kolmogorov-Smirnov test
data: x ...
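To make the contrast concrete, the sketch below runs both tests with base R. The seed, the sample, and the category counts are assumptions made up for illustration; they are not the data from the original example.

```r
# Continuous data: Kolmogorov-Smirnov test against the standard normal
set.seed(123)                      # arbitrary seed, for reproducibility only
x <- rnorm(20)                     # hypothetical continuous sample
ks_result <- ks.test(x, "pnorm")

# Categorical data: chi-square goodness of fit against equal proportions
counts <- c(18, 22, 20)            # hypothetical counts for three categories
chi_result <- chisq.test(counts, p = rep(1/3, 3))

print(ks_result$p.value)
print(chi_result$statistic)        # (4 + 4 + 0) / 20 = 0.4
```

The chi-square test works on counts per category, while ks.test works on the raw continuous values, which is why the two calls take differently shaped inputs.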
To create a boxplot for data frame columns we can simply use the boxplot function. For matrix columns, one option is to convert the matrix into a data frame and then use the boxplot function; base R's boxplot also accepts a matrix directly and draws one box per column. For example, if we have a matrix called M, then the boxplot for the columns of M can be created by using boxplot(as.data.frame(M)).
Example:
> M
Output:
      [,1]     [,2]     [,3]      [,4]     [,5]
[1,] 1.688556 1.697216 1.9469573 1.873956 2.010246
[2,] 1.655357 1.927145 2.0937415 2.273638 1.966972
[3,] 1.886917 1.182852 2.0291452 2.507944 2.338664
[4,] 2.013053 1.995526 1.8122830 2.531708 2.483359
[5,] 1.812015 1.950053 1.8902859 2.453222 2.123253
[6,] 1.781764 1.786285 2.3384120 2.275382 2.509708
[7,] 1.836378 1.192781 1.5382031 2.012324 2.290340
[8,] 2.061482 1.705481 2.5542404 1.958202 1.991252
[9,] 2.162214 1.958862 1.8096081 1.810033 1.856942
[10,] 1.897020 1.614834 2.3407207 2.199068 1.807968
[11,] 2.491147 2.317192 2.4486029 2.131722 1.947841
[12,] 1.860307 1.932982 2.2034280 1.982581 2.720482
[13,] 1.814205 2.214286 1.6917036 1.854341 2.150684
[14,] 1.224437 1.800944 1.7600398 1.503382 2.775012
[15,] 2.309462 2.534766 1.5111472 2.058761 1.823550
[16,] 2.190564 1.588298 1.8854163 1.694651 1.939035
[17,] 2.521611 2.339012 2.2959581 2.501148 1.951673
[18,] 1.808799 2.314207 1.8704730 1.937851 1.877917
[19,] 2.476626 1.806194 2.7111663 2.156506 1.521197
[20,] 1.819725 1.633549 1.9438948 2.213533 2.247944
[21,] 2.412117 1.797531 2.5320892 1.889267 2.586912
[22,] 1.679395 2.276218 1.6120445 1.648766 1.889033
[23,] 2.286285 2.221312 0.9408758 1.896072 1.996449
[24,] 2.274975 2.398884 2.0146319 1.814092 2.350100
[25,] 2.106620 1.640401 1.6416454 2.452356 1.638885
[26,] 1.556329 1.706762 1.8324196 2.348518 1.593293
[27,] 2.171867 1.707615 1.9667116 2.191344 1.595531
[28,] 1.796751 2.753674 2.1741976 1.623239 2.399018
[29,] 2.635992 2.180735 2.2114669 2.258419 2.277367
[30,] 1.874671 2.113165 2.3653358 2.231705 1.919449
Example:
> boxplot(as.data.frame(M))
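As a rough check of the conversion approach, the sketch below builds a hypothetical 30 x 5 matrix (the seed and distribution parameters are assumptions, not the values behind the output above), draws the boxplot through as.data.frame, and compares the box statistics with those obtained by passing the matrix to boxplot directly.

```r
set.seed(1)                                   # arbitrary seed
M <- matrix(rnorm(150, mean = 2, sd = 0.4), nrow = 30, ncol = 5)

# Draw the boxplot via a data frame; pdf(NULL) opens a device that writes no
# file, so the sketch also runs in a non-interactive session
pdf(NULL)
boxplot(as.data.frame(M))
dev.off()

# plot = FALSE returns the box statistics without drawing anything
via_df     <- boxplot(as.data.frame(M), plot = FALSE)
via_matrix <- boxplot(M, plot = FALSE)

# Both routes summarise the same five columns
same_stats <- isTRUE(all.equal(via_df$stats, via_matrix$stats))
print(same_stats)
```

The two routes differ mainly in the axis labels: the data frame route labels boxes V1 to V5 when the matrix has no column names.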
Sometimes we need to compare maximum values, or set every value in some columns of a data frame or data.table object to the column maximum, for example in studies that deliberately introduce bias. In that case, we can set all the column values to the maximum. For a data.table object, we can use single square brackets to access the columns and assign the maximum to them, as shown in the example below.
Example: loading the data.table package and creating a data.table object DT1 with columns x1 and x2:
> library(data.table)
> DT1
Output:
   x1 x2
1:  3  4
2:  3  5
3:  5  6
4: 10  5
5:  8  2
6:  3 ...
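The same idea can be sketched for a plain base R data frame, which avoids the data.table dependency; this is an assumption-laden illustration rather than the data.table code of the original example. The data frame name and values are hypothetical.

```r
# Hypothetical data frame (not the DT1 object from the example above)
df <- data.frame(x1 = c(3, 3, 5, 10, 8), x2 = c(4, 5, 6, 5, 2))

# Single square brackets select a column; the scalar maximum is recycled
# over every row of that column
for (col in names(df)) {
  df[, col] <- max(df[, col])
}

print(df)  # every x1 is 10 and every x2 is 6
```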
The mtext function can write X-axis or Y-axis labels, and its at argument lets us place those labels at the positions we want. For example, to label the X-axis with the first ten capital letters, A to J, we can use the command mtext(text=LETTERS[1:10],outer=FALSE,side=1,las=1,at=1:10).
Example:
> plot(1:10,xaxt="n")
> mtext(text=LETTERS[1:10],outer=FALSE,side=1,las=1,at=1:10)
The rownames and colnames functions set the row and column names of a matrix, and the same functions are used to extract those names. For example, if we have a matrix called M that has row names and column names, then these names can be retrieved with rownames(M) and colnames(M).
Example:
> M1
Output:
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1   11   21   31   41   51   61   71   81    91
[2,]    2   12   22   32   42   52   62 ...
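A short sketch of both uses follows, with a small hypothetical matrix whose names are invented for illustration.

```r
# Hypothetical 2 x 3 matrix with row and column names
M <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

# Extract the names
row_labels <- rownames(M)
col_labels <- colnames(M)
print(row_labels)
print(col_labels)

# The same functions, used on the left of an assignment, set the names
rownames(M) <- c("first", "second")
print(M)
```

For a matrix without names, such as one created by matrix(1:100, nrow = 10), both functions return NULL.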
The F statistic has two degrees of freedom, one for the numerator and one for the denominator, and the F distribution is right-skewed, with the p-value of an F-test taken from the upper tail. Therefore, we pass the F statistic, the two degrees of freedom, and the lower.tail=FALSE argument to the pf function to find the p-value for an F statistic.
Examples:
> pf(5, 1, 99, lower.tail=FALSE)
> pf(5, 1, 24, lower.tail=FALSE)
> pf(5, 1, 239, lower.tail=FALSE)
> pf(5, 5, 239, lower.tail=FALSE)
> pf(5, 5, 49, lower.tail=FALSE)
> pf(12, 5, 49, lower.tail=FALSE)
> pf(120, 5, 49, lower.tail=FALSE)
> pf(120, 1, 49, lower.tail=FALSE)
> pf(120, 1, 149, lower.tail=FALSE)
> pf(3, 1, 149, ...
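The sketch below shows what lower.tail=FALSE does, using the first call from the examples. The variable names are introduced here for illustration. It also checks the result against two equivalent computations: the complement of the lower tail, and the two-sided t-test p-value, which matches when the numerator has one degree of freedom because the F statistic is then the square of a t statistic.

```r
f_stat <- 5    # F statistic
df1    <- 1    # numerator degrees of freedom
df2    <- 99   # denominator degrees of freedom

# Upper-tail probability: P(F > f_stat)
p_upper <- pf(f_stat, df1, df2, lower.tail = FALSE)

# Same value as one minus the lower-tail probability
p_complement <- 1 - pf(f_stat, df1, df2)

# With df1 = 1, F = t^2, so the two-sided t p-value agrees
p_from_t <- 2 * pt(-sqrt(f_stat), df2)

print(c(p_upper, p_complement, p_from_t))
```

Using lower.tail = FALSE rather than 1 - pf(...) is generally preferable for very large F statistics, where the subtraction can lose precision.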
To detach a package in R, we can simply use the detach function. But we need to remember that once the package is detached, its functions can no longer be called by their bare names until the package is attached again with library (or called as package::function). It is an easy mistake to forget that a package was detached. For example, if we detach the ggplot2 package using detach(package:ggplot2, unload=TRUE) and then run the ggplot or qplot function again, there will be an error.
Example: consider a data frame df with columns x and y:
Output:
            x          y
1 -0.09124881  0.8106691
2 -0.20521435 -1.0067072
3 -1.07904498  1.3867400
4  1.34461945 -1.4676405
5 -0.21731862  0.5801624 ...
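The behaviour can be sketched without ggplot2 by using splines, a package that ships with every R installation; swapping in splines is an assumption made here so the example runs anywhere, and its bs function stands in for ggplot or qplot.

```r
library(splines)                                   # attach the package
attached_before <- "package:splines" %in% search()

detach("package:splines", unload = TRUE)           # detach it again
attached_after <- "package:splines" %in% search()

# Calling a function from the detached package by its bare name now fails;
# tryCatch captures the error message instead of stopping the script
result <- tryCatch(bs(1:10), error = function(e) conditionMessage(e))
print(result)

# The function is still reachable through the namespace operator
still_reachable <- is.function(splines::bs)
print(still_reachable)
```

Running library(splines) again restores access by bare name.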