How to find the rank of each value in the numeric columns of an R data frame when some columns are categorical?



To rank each value in the numeric columns of an R data frame that also contains categorical columns, we can follow the steps below −

  • First of all, create a data frame.

  • Then, use the numcolwise function from the plyr package to apply rank to the numeric columns only; the categorical columns are skipped.
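
The two steps can be sketched on a tiny data frame with fixed values, so the result does not depend on random draws. The data frame `small` and its column names are made up for illustration, and the sketch assumes the plyr package is installed.

```r
library(plyr)  # provides numcolwise()

# Hypothetical data: one categorical column and two numeric columns
small <- data.frame(
  Group = c("a", "b", "a", "b"),
  X = c(2.5, -1.0, 4.0, 0.3),
  Y = c(10, 40, 30, 20)
)

# numcolwise(rank) builds a function that applies rank() to the numeric columns only
ranks <- numcolwise(rank)(small)
ranks
```

Here the smallest value in each column gets rank 1, so X is ranked 3, 1, 4, 2 and Y is ranked 1, 4, 3, 2, while the Group column is left out of the result.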

Example

Create the data frame

Let’s create a data frame as shown below −

# Two categorical columns and two numeric columns, 25 rows each
Level <- sample(c("low", "medium", "high"), 25, replace = TRUE)
Group <- sample(c("first", "second"), 25, replace = TRUE)
DV1 <- rnorm(25)
DV2 <- rnorm(25)
df <- data.frame(Level, Group, DV1, DV2)
df

Output

On execution, the above script generates the output below (the values will vary on your system due to randomization) −

   Level  Group     DV1           DV2
1  medium first  -0.15444635   0.44771691
2  low    first   0.64594002   0.70918039
3  medium first   0.11612343  -0.46156286
4  medium second -2.07505385  -0.19145800
5  medium first   0.91928571   0.80887669
6  medium first   0.71592841   0.16538757
7  high   second -1.45712679   0.40105329
8  high   second -0.57098794   0.97701583
9  high   second -0.55531986   0.52548578
10 medium first   0.21788069  -0.89447993
11 low    second  0.13378146  -1.54879981
12 low    first  -1.25162532   0.21650691
13 low    second  0.14558721   1.24260380
14 medium second  0.93689245   0.34528017
15 high   second -1.25450836   0.34797171
16 low    second -0.38612538   0.31359466
17 high   first   2.70415465   0.73713265
18 high   second -0.12480067   0.37259163
19 high   second  0.78704330  -0.35841561
20 low    first   0.81727351  -0.74304509
21 medium second  0.61382411  -0.40644606
22 low    first   0.39757586  -2.33494132
23 high   second -2.07106056  -0.90051548
24 high   second -0.08953589   0.09631326
25 high   second  0.65695959  -1.10357835

Find the rank of each value in columns if some columns are categorical

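Use the numcolwise function from the plyr package to rank the values in the numeric columns of the data frame df. The script below creates df again so that it can run on its own; because the values are sampled afresh, the ranks in the output belong to this new draw and do not correspond to the values printed above −

# Re-create the data frame (a new random draw)
Level <- sample(c("low", "medium", "high"), 25, replace = TRUE)
Group <- sample(c("first", "second"), 25, replace = TRUE)
DV1 <- rnorm(25)
DV2 <- rnorm(25)
df <- data.frame(Level, Group, DV1, DV2)
# Apply rank() to the numeric columns only
library(plyr)
numcolwise(rank)(df)

Output

Only the numeric columns DV1 and DV2 appear in the result (the ranks will vary on your system due to randomization) −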

  DV1 DV2
1  11 15
2   5  3
3  15  5
4  23 23
5   8 17
6   4 11
7  17 10
8  16 21
9   7 24
10  6 14
11 14  1
12 10  9
13 19 19
14 22 12
15  9 16
16 20  2
17 18 22
18 21 18
19 13  7
20  2 25
21  1 20
22 24  6
23  3 13
24 12  8
25 25  4
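
If plyr is not available, a similar result can probably be obtained with base R alone. The sketch below is one possible approach rather than the method used above; the seed and the names `df2`, `is_num`, `ranks2` and `ranked` are assumptions introduced only for illustration.

```r
set.seed(1)  # hypothetical seed, used only to make the sketch reproducible

df2 <- data.frame(
  Level = sample(c("low", "medium", "high"), 25, replace = TRUE),
  Group = sample(c("first", "second"), 25, replace = TRUE),
  DV1 = rnorm(25),
  DV2 = rnorm(25)
)

# Logical vector marking which columns are numeric
is_num <- vapply(df2, is.numeric, logical(1))

# Rank each numeric column; data.frame() reassembles the list returned by lapply()
ranks2 <- data.frame(lapply(df2[is_num], rank))

# Optionally keep the categorical columns next to the ranks
ranked <- cbind(df2[!is_num], ranks2)
head(ranked)
```

By default rank() assigns tied values the average of their ranks; a different rule can be requested through its ties.method argument, for example lapply(df2[is_num], rank, ties.method = "min").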
