How to convert all words of a string or categorical variable in an R data frame to uppercase?
Data often arrives in a format other than the one we need, so we have to change it to suit our purpose. When the levels of a categorical variable are represented by words instead of numbers, we can convert those levels to lowercase or to uppercase. Sometimes this is done just to make the information easier to read. The values are usually in lowercase, and we can convert them to uppercase by applying the toupper function to every column of the data frame with sapply.
Example
Consider the below data frame −
> x1<-letters[1:20]
> x2<-20:1
> x3<-rep(c("india","china","usa","saudi arabia","jordan"),times=4)
> df<-data.frame(x1,x2,x3)
> df
   x1 x2           x3
1   a 20        india
2   b 19        china
3   c 18          usa
4   d 17 saudi arabia
5   e 16       jordan
6   f 15        india
7   g 14        china
8   h 13          usa
9   i 12 saudi arabia
10  j 11       jordan
11  k 10        india
12  l  9        china
13  m  8          usa
14  n  7 saudi arabia
15  o  6       jordan
16  p  5        india
17  q  4        china
18  r  3          usa
19  s  2 saudi arabia
20  t  1       jordan
> df_new<-as.data.frame(sapply(df, toupper))
> df_new
   x1 x2           x3
1   A 20        INDIA
2   B 19        CHINA
3   C 18          USA
4   D 17 SAUDI ARABIA
5   E 16       JORDAN
6   F 15        INDIA
7   G 14        CHINA
8   H 13          USA
9   I 12 SAUDI ARABIA
10  J 11       JORDAN
11  K 10        INDIA
12  L  9        CHINA
13  M  8          USA
14  N  7 SAUDI ARABIA
15  O  6       JORDAN
16  P  5        INDIA
17  Q  4        CHINA
18  R  3          USA
19  S  2 SAUDI ARABIA
20  T  1       JORDAN
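One side effect of the call above is worth noting: sapply(df, toupper) returns a character matrix, so the numeric column x2 also comes back as text in df_new. If the numeric columns should keep their type, one option is to uppercase only the character columns. The following is a minimal sketch of that idea, not the only way to do it. The names is_chr and df_upper are illustrative, and stringsAsFactors = FALSE is set explicitly as an assumption so that the text columns are character vectors regardless of the R version.

```r
# Rebuild the data frame from the example above
x1 <- letters[1:20]
x2 <- 20:1
x3 <- rep(c("india", "china", "usa", "saudi arabia", "jordan"), times = 4)
# Assumption: keep text columns as character (not factor) in every R version
df <- data.frame(x1, x2, x3, stringsAsFactors = FALSE)

# Logical vector marking which columns hold character data
is_chr <- sapply(df, is.character)

# Uppercase only the character columns; x2 is left untouched
df_upper <- df
df_upper[is_chr] <- lapply(df_upper[is_chr], toupper)

# Inspect the class of each column after the conversion
sapply(df_upper, class)
```

Here sapply(df_upper, class) should report x2 as integer, while x1 and x3 remain character and now hold uppercase values.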
Let’s have a look at one more example, where the values of the second variable start with an uppercase letter −
> y1<-letters[26:7]
> y2<-rep(c("Statistics","Biology","Psychology","Marketing","Physics"),each=4)
> y3<-rep(c(2,4,6,8),times=5)
> df_y<-data.frame(y1,y2,y3)
> df_y
   y1         y2 y3
1   z Statistics  2
2   y Statistics  4
3   x Statistics  6
4   w Statistics  8
5   v    Biology  2
6   u    Biology  4
7   t    Biology  6
8   s    Biology  8
9   r Psychology  2
10  q Psychology  4
11  p Psychology  6
12  o Psychology  8
13  n  Marketing  2
14  m  Marketing  4
15  l  Marketing  6
16  k  Marketing  8
17  j    Physics  2
18  i    Physics  4
19  h    Physics  6
20  g    Physics  8
> df_y_new<-as.data.frame(sapply(df_y, toupper))
> df_y_new
   y1         y2 y3
1   Z STATISTICS  2
2   Y STATISTICS  4
3   X STATISTICS  6
4   W STATISTICS  8
5   V    BIOLOGY  2
6   U    BIOLOGY  4
7   T    BIOLOGY  6
8   S    BIOLOGY  8
9   R PSYCHOLOGY  2
10  Q PSYCHOLOGY  4
11  P PSYCHOLOGY  6
12  O PSYCHOLOGY  8
13  N  MARKETING  2
14  M  MARKETING  4
15  L  MARKETING  6
16  K  MARKETING  8
17  J    PHYSICS  2
18  I    PHYSICS  4
19  H    PHYSICS  6
20  G    PHYSICS  8
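When the categorical variable is stored as a factor, only its level labels need to change, not every element. The sketch below assumes y2 is created explicitly as a factor (an assumption; the example above leaves that choice to data.frame) and relabels its levels with toupper, leaving the numeric column y3 as it is.

```r
# Rebuild the second example, storing y2 explicitly as a factor (assumption)
y1 <- letters[26:7]
y2 <- factor(rep(c("Statistics", "Biology", "Psychology", "Marketing", "Physics"), each = 4))
y3 <- rep(c(2, 4, 6, 8), times = 5)
df_y <- data.frame(y1, y2, y3)

# Relabel the factor levels in place; the column stays a factor
levels(df_y$y2) <- toupper(levels(df_y$y2))

# Inspect the new level labels and confirm y3 is still numeric
levels(df_y$y2)
class(df_y$y3)
```

Because only the level labels are rewritten, the underlying integer codes of the factor are unchanged, and every row that held "Statistics" now reads "STATISTICS".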