How to find the variance of frequency data in R?


If the data are stored as values together with their frequencies, we first need to reconstruct the complete data by repeating each value as many times as its frequency. We can then apply the var function to this complete data.

For example, if we have a data frame called df that contains two columns, say X and Frequency, then we can build the complete data by using the command given below −

Total_data<-rep(df$X,df$Frequency)

Now the variance can be found by using the command as follows −

var(Total_data)
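As a quick check of the two commands above, here is a minimal sketch on a tiny data frame whose result can be verified by hand. The values in df are hypothetical and chosen only for illustration.

```r
# Hypothetical toy data: the value 2 occurs once, 4 twice, and 6 once
df <- data.frame(X = c(2, 4, 6), Frequency = c(1, 2, 1))

# Repeat each value as many times as its frequency
Total_data <- rep(df$X, df$Frequency)
print(Total_data)       # 2 4 4 6

# Sample variance of the expanded data: mean is 4, squared deviations sum to 8,
# and the divisor is n - 1 = 3, so the result is 8/3
print(var(Total_data))  # 2.666667
```

Note that var uses the divisor n - 1, where n is the sum of the frequencies, not the number of rows in the data frame.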

Example 1

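The following snippet creates a sample data frame. The values are drawn at random and no seed is set, so your output will differ from what is shown here −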

x<-rpois(20,20)
frequency<-sample(1:10,20,replace=TRUE)
df1<-data.frame(x,frequency)
df1

The following dataframe is created

    x frequency
1  11         3
2  15         9
3  23         2
4  16         3
5  16         4
6  17        10
7  19         6
8  23         9
9  15         6
10 22         4
11 21         5
12 18        10
13 21         3
14 27         1
15 16         5
16 27         5
17 19         8
18 23         5
19 19         3
20 16         8

To expand df1 into the complete data, add the following code to the above snippet −

Total_data1<-rep(df1$x,df1$frequency)
Total_data1

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] 11 11 11 15 15 15 15 15 15 15 15 15 23 23 16 16 16 16 16 16 16 17 17 17 17
[26] 17 17 17 17 17 17 19 19 19 19 19 19 23 23 23 23 23 23 23 23 23 15 15 15 15
[51] 15 15 22 22 22 22 21 21 21 21 21 18 18 18 18 18 18 18 18 18 18 21 21 21 27
[76] 16 16 16 16 16 27 27 27 27 27 19 19 19 19 19 19 19 19 23 23 23 23 23 19 19
[101] 19 16 16 16 16 16 16 16 16

To find the variance of Total_data1, add the following code to the above snippet −

var(Total_data1)

Output


If you execute all the above given snippets as a single program, it generates the following Output −

[1] 12.58699
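The same variance can also be computed directly from the values and their frequencies, without building the expanded vector. The sketch below is an alternative to the rep approach, not a replacement for it. It hard-codes the values printed for df1 above so that the result is reproducible; the names n, weighted_mean and weighted_var are introduced here for illustration.

```r
# Data copied from the df1 printout in Example 1
x <- c(11, 15, 23, 16, 16, 17, 19, 23, 15, 22,
       21, 18, 21, 27, 16, 27, 19, 23, 19, 16)
frequency <- c(3, 9, 2, 3, 4, 10, 6, 9, 6, 4,
               5, 10, 3, 1, 5, 5, 8, 5, 3, 8)

n <- sum(frequency)                       # total number of observations (109)
weighted_mean <- sum(frequency * x) / n   # mean of the complete data

# Sample variance with divisor n - 1, which is what var() uses
weighted_var <- sum(frequency * (x - weighted_mean)^2) / (n - 1)

print(weighted_var)            # 12.58699
print(var(rep(x, frequency)))  # 12.58699, the same value via expansion
```

The direct formula avoids allocating a long vector, which may matter when the frequencies are very large.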

Example 2

The following snippet creates a sample data frame −

y<-rpois(20,20)
count<-sample(1:10,20,replace=TRUE)
df2<-data.frame(y,count)
df2

The following dataframe is created

    y count
1  25     2
2  14     2
3  13     8
4  22     6
5  18     1
6  30     9
7  22     9
8  26     1
9  23     3
10 20     2
11 17     2
12 12     5
13 20     3
14 12     8
15 20     1
16 11     7
17 19     3
18 13     3
19 17     8
20 15     8

To expand df2 into the complete data, add the following code to the above snippet −

Total_data2<-rep(df2$y,df2$count)
Total_data2

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] 25 25 14 14 13 13 13 13 13 13 13 13 22 22 22 22 22 22 18 30 30 30 30 30 30
[26] 30 30 30 22 22 22 22 22 22 22 22 22 26 23 23 23 20 20 17 17 12 12 12 12 12
[51] 20 20 20 12 12 12 12 12 12 12 12 20 11 11 11 11 11 11 11 19 19 19 13 13 13
[76] 17 17 17 17 17 17 17 17 15 15 15 15 15 15 15 15

To find the variance of Total_data2, add the following code to the above snippet −

var(Total_data2)

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] 33.33138
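Since the expand-then-var steps are identical in every example, they can be wrapped in a small function. The sketch below is one possible way to do this. The function name freq_var and its arguments are hypothetical, not part of base R, and the data are copied from the df2 printout above so that the result is reproducible.

```r
# Hypothetical helper: variance of values stored together with their frequencies.
# values_col and freq_col are the names of the two columns in data.
freq_var <- function(data, values_col, freq_col) {
  values <- data[[values_col]]
  freq <- data[[freq_col]]
  # rep() needs non-negative whole-number counts, so check before expanding
  stopifnot(is.numeric(values), is.numeric(freq),
            all(freq >= 0), all(freq == round(freq)))
  var(rep(values, freq))
}

# Data copied from the df2 printout in Example 2
df2 <- data.frame(
  y = c(25, 14, 13, 22, 18, 30, 22, 26, 23, 20,
        17, 12, 20, 12, 20, 11, 19, 13, 17, 15),
  count = c(2, 2, 8, 6, 1, 9, 9, 1, 3, 2,
            2, 5, 3, 8, 1, 7, 3, 3, 8, 8)
)

result2 <- freq_var(df2, "y", "count")
print(result2)  # 33.33138
```

Passing the column names as strings keeps the function usable with data frames whose columns are named differently, such as df1 with x and frequency.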

Example 3

The following snippet creates a sample data frame −

z<-sample(1:2,20,replace=TRUE)
count<-sample(1:10,20,replace=TRUE)
df3<-data.frame(z,count)
df3

The following dataframe is created

   z count
1  1     8
2  1     1
3  1     3
4  1     5
5  1     3
6  2     5
7  2     6
8  1     1
9  1    10
10 2    10
11 2     6
12 2     7
13 2     1
14 1     5
15 1     4
16 1     1
17 2     2
18 1     5
19 2     2
20 2     6

To expand df3 into the complete data, add the following code to the above snippet −

Total_data3<-rep(df3$z,df3$count)
Total_data3

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1
[39] 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
[77] 2 2 1 1 1 1 1 2 2 2 2 2 2 2 2

To find the variance of Total_data3, add the following code to the above snippet −

var(Total_data3)

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] 0.2527473
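All three examples draw their data with rpois and sample, so each run produces different values and a different variance. The sketch below shows one way to make such an example repeatable by calling set.seed before the random draws. The seed value 123 is an arbitrary assumption, and the numbers it produces will not match the output printed above.

```r
# Fix the random number generator state; 123 is an arbitrary choice
set.seed(123)
z <- sample(1:2, 20, replace = TRUE)
count <- sample(1:10, 20, replace = TRUE)
df3 <- data.frame(z, count)
first_run <- var(rep(df3$z, df3$count))

# Resetting the same seed reproduces exactly the same data and variance
set.seed(123)
z <- sample(1:2, 20, replace = TRUE)
count <- sample(1:10, 20, replace = TRUE)
df3 <- data.frame(z, count)
second_run <- var(rep(df3$z, df3$count))

print(identical(first_run, second_run))  # TRUE
```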

Updated on: 02-Nov-2021
