How to find the position of the maximum of each numerical column when some columns are categorical in an R data frame?



To find the position of the maximum in each numerical column of an R data frame that also contains categorical columns, we can follow the steps below −

  • First of all, create a data frame.

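  • Then, use the numcolwise function from the plyr package together with which.max to find the row position of the maximum in each numerical column; numcolwise applies the function to the numeric columns only, so the categorical columns are skipped.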
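
The two steps above can also be sketched in base R, without plyr. This is only an illustrative alternative, not the method used in the examples that follow; the data frame `df` and its values are hypothetical and chosen so the result is easy to check by eye.

```r
# Small deterministic data frame (hypothetical values, for illustration only)
df <- data.frame(
  Level = c("low", "high", "medium", "low"),
  Group = c("first", "second", "first", "second"),
  DV1   = c(3, 9, 4, 1),
  DV2   = c(12, 5, 7, 20)
)

# Flag the numeric columns, then apply which.max to each of them
num_cols  <- vapply(df, is.numeric, logical(1))
positions <- vapply(df[num_cols], which.max, integer(1))

# Named integer vector: one row position per numeric column
positions
```

Here the maximum of DV1 (9) is in row 2 and the maximum of DV2 (20) is in row 4, so `positions` holds 2 and 4 under the names DV1 and DV2.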

Example 1

Create the data frame

Let’s create a data frame as shown below −

Level<-sample(c("low","medium","high"),25,replace=TRUE)
Group<-sample(c("first","second"),25,replace=TRUE)
DV1<-rpois(25,5)
DV2<-rpois(25,10)
df1<-data.frame(Level,Group,DV1,DV2)
df1

Output

On executing, the above script generates the below output (this output will vary on your system due to randomization) −

   Level  Group  DV1 DV2
1  low    first    8   7
2  low    first    6  11
3  high   first    2  14
4  medium second   3  11
5  low    second   4  10
6  medium second   7   7
7  high   second   4  15
8  low    second   3   8
9  high   second   5   6
10 medium second   3  13
11 medium second   1  13
12 low    first    3  10
13 high   first    6  10
14 high   first    5  14
15 medium first   10  11
16 low    first    6   7
17 medium second   7  10
18 high   second   5  11
19 medium second   4  11
20 low    first    5  13
21 medium first    2   9
22 medium first    6  12
23 low    second   5   8
24 low    second   6  10
25 low    second   2   6
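
Because sample and rpois draw random values, each run produces a different data frame. One way to get repeatable output is to fix the random seed first. The sketch below assumes an arbitrary seed of 123 (not part of the original example) and checks that two runs with the same seed give identical data.

```r
# Build the example data frame after fixing the seed (123 is an arbitrary choice)
make_df1 <- function(seed) {
  set.seed(seed)
  Level <- sample(c("low", "medium", "high"), 25, replace = TRUE)
  Group <- sample(c("first", "second"), 25, replace = TRUE)
  DV1   <- rpois(25, 5)
  DV2   <- rpois(25, 10)
  data.frame(Level, Group, DV1, DV2)
}

df1_a <- make_df1(123)
df1_b <- make_df1(123)

# TRUE when both runs produced exactly the same data frame
same_data <- identical(df1_a, df1_b)
same_data
```

With a fixed seed, the positions reported later for the maxima refer to the same rows on every run.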

Find the position of the maximum of each numerical column

Use the numcolwise function from the plyr package with which.max to find the position of the maximum in each numerical column of df1. The data frame created above is reused here, because calling sample and rpois again would draw new random values and the result would no longer refer to the table shown −

library(plyr)
# Apply which.max to the numeric columns only (DV1 and DV2)
numcolwise(which.max)(df1)

Output

For the data frame printed above, the maximum of DV1 (10) is in row 15 and the maximum of DV2 (15) is in row 7 −

  DV1 DV2
1  15   7
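
Once the positions are known, they can be used to look up the rows and values that hold each maximum. The following is a minimal base R sketch under assumed data: the data frame `df` and the names `pos` and `max_vals` are hypothetical and not part of the example above.

```r
# Hypothetical data frame with one categorical and two numeric columns
df <- data.frame(
  Group = c("a", "b", "a", "b", "a"),
  DV1   = c(4, 10, 2, 10, 7),
  DV2   = c(6, 3, 15, 8, 15)
)

# Position of the (first) maximum in each numeric column
num_cols <- vapply(df, is.numeric, logical(1))
pos <- vapply(df[num_cols], which.max, integer(1))

# Full row that holds the maximum of DV1
row_dv1 <- df[pos[["DV1"]], ]

# Maximum value of each numeric column, looked up by position
max_vals <- mapply(function(col, i) df[[col]][i], names(pos), pos)
max_vals
```

In this data both DV1 and DV2 reach their maximum twice; which.max reports only the first occurrence, so `pos` is 2 for DV1 and 3 for DV2.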

Example 2

Create the data frame

Let’s create a data frame as shown below −

factor1<-sample(c("Super","Lower","Medium"),25,replace=TRUE)
factor2<-sample(c("I","II","III"),25,replace=TRUE)
v1<-rnorm(25)
v2<-rnorm(25)
df2<-data.frame(factor1,factor2,v1,v2)
df2

Output

On executing, the above script generates the below output (this output will vary on your system due to randomization) −

   factor1 factor2          v1          v2
1  Lower   II      -0.88708231  0.30097842
2  Super   I       -1.15358512 -0.50595244
3  Lower   II      -0.07962128 -0.74934137
4  Super   I       -1.48634012  0.19566058
5  Lower   III      1.14577383 -1.09185066
6  Super   II       0.88951251 -0.02418110
7  Lower   III      0.13711621 -1.02686656
8  Super   I        0.27011965  1.26320650
9  Medium  III      0.16775174 -1.92041942
10 Medium  III     -0.15766279  1.26627694
11 Medium  I       -1.23267080 -0.93831033
12 Medium  II       0.38065869  2.09701663
13 Medium  I       -1.45391083 -0.08486117
14 Lower   III      0.80940837 -1.06338634
15 Medium  II       0.20411080 -0.29534513
16 Lower   III      0.59453629  2.64966638
17 Medium  III      0.31227512  1.68916757
18 Lower   I        2.89731076  0.96783335
19 Super   III     -0.06000641  0.58903660
20 Lower   III      0.92520811 -1.03121594
21 Medium  III      1.85323653 -1.33632487
22 Medium  II       1.13713484 -1.27496569
23 Super   I        0.52744948  0.28164512
24 Lower   I        0.17266053  0.57324301
25 Lower   II       2.67321967 -1.80427360

Find the position of the maximum of each numerical column

Use the numcolwise function from the plyr package with which.max to find the position of the maximum in each numerical column of df2. The data frame created above is reused here, because calling sample and rnorm again would draw new random values −

library(plyr)
# Apply which.max to the numeric columns only (v1 and v2)
numcolwise(which.max)(df2)

Output

  v1 v2
1 18 16
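
Two behaviours of which.max are worth keeping in mind when reading such results: it returns only the first position when the maximum occurs more than once, and it skips missing values. The sketch below uses a hypothetical vector `x` to show both, along with one way to list every position that attains the maximum.

```r
# Hypothetical vector with a missing value and a tied maximum
x <- c(3, NA, 9, 9, 1)

# First position of the maximum; the NA is ignored
first_pos <- which.max(x)

# All positions that attain the maximum
all_pos <- which(x == max(x, na.rm = TRUE))

first_pos
all_pos
```

Here `first_pos` is 3, while `all_pos` contains both 3 and 4.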
