How to create a random sample based on group columns of a data.table in R?
Random sampling helps reduce bias in an analysis. When the data is organized in groups, we may want to draw the random sample within each group. For example, if a data.table has a group column and each group contains ten values, we might want a sample made up of two values selected at random from each group. This can be done by using the sample function inside .SD together with by.
Example
Consider the below data.table −
library(data.table)
Group<-rep(c("A","B","C","D","E"),times=4)
Percentage<-sample(1:100,20)
dt1<-data.table(Group,Percentage)
dt1
Output
    Group Percentage
 1:     A         97
 2:     B         68
 3:     C         19
 4:     D         32
 5:     E         98
 6:     A         48
 7:     B         94
 8:     C         54
 9:     D          7
10:     E         76
11:     A         10
12:     B         31
13:     C         59
14:     D         84
15:     E         41
16:     A         99
17:     B          1
18:     C         72
19:     D         42
20:     E         17
Creating a random sample of size 2 from each group −
Example
dt1[,.SD[sample(.N, min(2,.N))],by=Group]
Output
    Group Percentage
 1:     A         48
 2:     A         99
 3:     B         94
 4:     B         31
 5:     C         54
 6:     C         59
 7:     D         42
 8:     D         84
 9:     E         98
10:     E         76
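To see what each part of the expression does, here is a minimal sketch on a small table. The table dt_demo, the fixed seed, and the single-row group C are assumptions made for illustration; they are not part of the example above. Inside the grouped call, .N is the number of rows in the current group, sample(.N, k) draws k distinct row positions, and .SD[...] keeps those rows. The min(2, .N) guard matters when a group has fewer rows than the requested sample size.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed, used only to make the run repeatable

# Hypothetical table: groups A and B have four rows each, group C has one
dt_demo <- data.table(
  Group = c(rep("A", 4), rep("B", 4), "C"),
  Percentage = sample(1:100, 9)
)

# For each group, pick min(2, .N) row positions and subset .SD by them
sampled <- dt_demo[, .SD[sample(.N, min(2, .N))], by = Group]

# Number of sampled rows per group
counts <- sampled[, .N, by = Group]
print(counts)
```

Groups A and B each contribute two rows, while group C contributes its only row. Without the min() guard, sample(.N, 2) would raise an error for group C, because sample cannot draw two distinct values from one when replace is FALSE.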
Let’s have a look at another example −
Example
Class<-rep(c("First","Second","Third","Fourth"),times=10)
Experience<-sample(1:5,40,replace=TRUE)
dt2<-data.table(Class,Experience)
head(dt2,10)
Output
     Class Experience
 1:  First          4
 2: Second          2
 3:  Third          4
 4: Fourth          2
 5:  First          4
 6: Second          5
 7:  Third          3
 8: Fourth          5
 9:  First          3
10: Second          5
Example
tail(dt2,10)
Output
     Class Experience
 1:  Third          4
 2: Fourth          2
 3:  First          5
 4: Second          2
 5:  Third          1
 6: Fourth          4
 7:  First          5
 8: Second          2
 9:  Third          4
10: Fourth          4
Example
dt2[,.SD[sample(.N, min(5,.N))],by=Class]
Output
     Class Experience
 1:  First          3
 2:  First          3
 3:  First          4
 4:  First          5
 5:  First          5
 6: Second          5
 7: Second          2
 8: Second          5
 9: Second          2
10: Second          1
11:  Third          3
12:  Third          1
13:  Third          4
14:  Third          3
15:  Third          4
16: Fourth          2
17: Fourth          5
18: Fourth          2
19: Fourth          4
20: Fourth          2
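A variant that is sometimes preferred on larger tables is to sample row numbers with .I and subset the table once, rather than subsetting .SD inside every group. The sketch below is an alternative to the approach shown above, not a replacement for it. The names dt_demo2, idx and sampled2, and the fixed seed, are assumptions for illustration.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, used only to make the run repeatable

# Hypothetical table with the same shape as dt2: four classes, ten rows each
dt_demo2 <- data.table(
  Class = rep(c("First", "Second", "Third", "Fourth"), times = 10),
  Experience = sample(1:5, 40, replace = TRUE)
)

# .I holds the row numbers of the current group within the full table,
# so this collects up to five sampled row numbers per class in column V1
idx <- dt_demo2[, .I[sample(.N, min(5, .N))], by = Class]$V1

# Subset the original table once using the collected row numbers
sampled2 <- dt_demo2[idx]
print(sampled2[, .N, by = Class])
```

Both forms draw the same kind of per-group sample. The .I form also gives you the selected row numbers, which is useful if you later need the rows that were not sampled, for example with dt_demo2[-idx].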