To find the row-wise variance across columns that share the same name in an R data frame, we can follow the steps below −
First, create a data frame in which some columns have the same name (this requires check.names=FALSE, otherwise R renames the duplicates).
Then, apply tapply to each row, grouping the row's values by colnames and passing var as the function, to get one variance per column name for every row.
Let’s create a data frame as shown below −
df <- data.frame(x=rpois(25,1), x=rpois(25,10), y=rpois(25,3), y=rpois(25,5), check.names=FALSE)
df
On executing, the above script generates the below output (this output will vary on your system because the values are drawn at random) −
   x  x y  y
1  1 13 4  5
2  2  9 5  9
3  1 13 6  7
4  1 16 4  4
5  2  7 2  6
6  1  9 2  5
7  1 11 7  7
8  1  9 3  7
9  1 13 4  1
10 1 10 6  6
11 3  7 3  5
12 0 11 4  2
13 0  9 4  7
14 0  7 3  7
15 0  9 2 10
16 0 14 1  3
17 0 12 4  6
18 3 10 1  8
19 1 13 2  4
20 0  7 3  9
21 0 14 3  2
22 0 10 2  4
23 0  4 1  4
24 2  9 1  4
25 1  5 1 11
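Because the values come from rpois, each run produces different numbers. If a repeatable data frame is wanted, one option is to fix the random seed before creating it. The following is a minimal sketch of that idea; the seed value 123 and the names df_seeded and df_again are illustrative assumptions, not part of the original example, and the seeded values will not match the output printed above.

```r
# Fix the seed so that rpois() returns the same draws on every run.
set.seed(123)  # 123 is an arbitrary, illustrative seed
df_seeded <- data.frame(x = rpois(25, 1), x = rpois(25, 10),
                        y = rpois(25, 3), y = rpois(25, 5),
                        check.names = FALSE)

# Resetting the same seed and rebuilding gives an identical data frame.
set.seed(123)
df_again <- data.frame(x = rpois(25, 1), x = rpois(25, 10),
                       y = rpois(25, 3), y = rpois(25, 5),
                       check.names = FALSE)

identical(df_seeded, df_again)
```

The final line compares the two data frames, which is a quick way to confirm that the seed makes the example repeatable.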
Find the row variance of columns having the same name
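Use apply to go through the rows of df, and within each row use tapply with colnames(df) as the grouping and var as the function. Since apply returns one column per row, wrap the call in t() so that each row of the result corresponds to a row of df. Each name here covers two columns, so every entry is the sample variance of two values −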
# df is the data frame created above; re-running data.frame() with rpois() would draw new values
t(apply(df, 1, function(x) tapply(x, colnames(df), var)))
          x    y
 [1,]  72.0  0.5
 [2,]  24.5  8.0
 [3,]  72.0  0.5
 [4,] 112.5  0.0
 [5,]  12.5  8.0
 [6,]  32.0  4.5
 [7,]  50.0  0.0
 [8,]  32.0  8.0
 [9,]  72.0  4.5
[10,]  40.5  0.0
[11,]   8.0  2.0
[12,]  60.5  2.0
[13,]  40.5  4.5
[14,]  24.5  8.0
[15,]  40.5 32.0
[16,]  98.0  2.0
[17,]  72.0  2.0
[18,]  24.5 24.5
[19,]  72.0  2.0
[20,]  24.5 18.0
[21,]  98.0  0.5
[22,]  50.0  2.0
[23,]   8.0  4.5
[24,]  24.5  4.5
[25,]   8.0 50.0
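To check the approach by hand, it can help to run the same call on a tiny data frame with fixed values. The sketch below reuses the first two rows of the output shown earlier; the names df_small and row_var are illustrative assumptions and do not appear in the original example.

```r
# A two-row data frame with fixed values (the first two rows shown earlier).
df_small <- data.frame(x = c(1, 2), x = c(13, 9),
                       y = c(4, 5), y = c(5, 9),
                       check.names = FALSE)

# For each row, group the values by column name and take the variance per group.
row_var <- t(apply(df_small, 1, function(r) tapply(r, colnames(df_small), var)))
row_var

# By hand for row 1, name "x": the values are 1 and 13, with mean 7,
# so the sample variance is ((1 - 7)^2 + (13 - 7)^2) / (2 - 1) = 72.
var(c(1, 13))
```

The x column of row_var holds 72 and 24.5 and the y column holds 0.5 and 8, which agrees with the first two rows of the result above.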