An inner join returns only the rows in which the left table has matching keys in the right table. An outer join returns all rows from both tables and joins the records from the left table that have matching keys in the right table. Both can be done by using the merge function.

Example

Inner Join

> df1 = data.frame(CustomerId = c(1:5), Product = c(rep("Biscuit", 3), rep("Cream", 2)))
> df1
  CustomerId Product
1          1 Biscuit
2          2 Biscuit
3          3 Biscuit
4          4   Cream
5          5   Cream
> df2 = data.frame(CustomerId = c(2, 5, 6), City = c(rep("Chicago", 2), rep("NewYorkCity", 1)))
> df2
  CustomerId City
...
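Since the excerpt stops before the merge call, here is a minimal sketch of how the two joins might be written with base R's merge. The data frames repeat the ones defined above; the result names inner, outer and left are illustrative only.

```r
# Same example data as in the text above.
df1 <- data.frame(CustomerId = 1:5,
                  Product = c(rep("Biscuit", 3), rep("Cream", 2)))
df2 <- data.frame(CustomerId = c(2, 5, 6),
                  City = c(rep("Chicago", 2), "NewYorkCity"))

# Inner join: keep only CustomerId values present in both tables.
inner <- merge(df1, df2, by = "CustomerId")

# Full outer join: keep every row from both tables; unmatched cells become NA.
outer <- merge(df1, df2, by = "CustomerId", all = TRUE)

# Left join, for comparison: keep every row of df1 only.
left <- merge(df1, df2, by = "CustomerId", all.x = TRUE)

print(inner)
print(outer)
```

With this data the inner join keeps customers 2 and 5, while the outer join keeps all six customer ids and fills the missing Product or City with NA.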
set.seed is used to make sure that we get the same results from randomization. If we randomly select some observations for a task in R, or in any other statistical software, we get different values every time because of randomization. If we want to keep the values produced by the first random selection, we can either store them in an object after randomization or fix the randomization procedure so that we get the same results every time.

Example

Randomization without set.seed

> sample(1:10)
[1] 4 10 5 3 1 6 ...
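The second option, fixing the randomization procedure, can be sketched as follows. The seed value 123 is an arbitrary choice for illustration; any integer works as long as the same one is reused.

```r
# Without a seed, two calls generally give different permutations.
unseeded_a <- sample(1:10)
unseeded_b <- sample(1:10)

# With the same seed set before each call, the permutation is reproduced.
set.seed(123)            # 123 is an arbitrary, illustrative seed
first <- sample(1:10)
set.seed(123)
second <- sample(1:10)

print(identical(first, second))   # the two seeded draws match
```

The seed has to be reset immediately before each draw that should be reproduced, because every random call advances the generator state.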
This can be done by using the list function.

Example

> df1
We can do this by using the filter function of the dplyr package together with grepl, which is part of base R.

Example

Consider the mtcars data set.

> data(mtcars)
> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1 ...
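The excerpt ends before the filter call, so the exact task is not shown. Assuming the goal is to keep rows whose text matches a pattern, a minimal sketch could look like this. The column name model and the pattern "Merc" are illustrative assumptions, not taken from the original.

```r
library(dplyr)

data(mtcars)

# Copy the row names into a regular column so that filter() can test them.
# The column name "model" is an assumption made for this sketch.
cars <- mtcars
cars$model <- rownames(cars)

# grepl() returns TRUE for every model name containing "Merc";
# filter() keeps exactly those rows.
merc <- filter(cars, grepl("Merc", model))

print(merc$model)
```

Because grepl accepts regular expressions, the same call can match more specific patterns, for example "^Merc" for names that start with "Merc".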
This can be done by using the theme function of ggplot2.

Example

> df
                   x          y
1  long text label a -0.8080940
2  long text label b  0.2164785
3  long text label c  0.4694148
4  long text label d  0.7878956
5  long text label e -0.1836776
6  long text label f  0.7916155
7  long text label g  1.3170755
8  long text label h  0.4002917
9  long text label i  0.6890988
10 long text label j  0.6077572

The plot is created as follows:

> library(ggplot2)
> ggplot(df, aes(x=x, y=y)) + geom_point() + theme(text = element_text(size=20), axis.text.x = element_text(angle=90, hjust=1))
The easiest way to do it is by using the select_if function of the dplyr package, but we can also do it through lapply.

Using dplyr

> df
   X1 X2 X3 X4 X5
1   1 11 21  a  k
2   2 12 22  b  l
3   3 13 23  c  m
4   4 14 24  d  n
5   5 15 25  e  o
6   6 16 26  f  p
7   7 17 27  g  q
8   8 18 28  h  r
9   9 19 29  i  s
10 10 20 30  j  t
> library("dplyr")
> select_if(df, is.numeric)
   X1 X2 X3
1   1 11 21
2   2 12 22
3   3 13 23
4   4 14 24
5   5 15 25
6   6 16 26
7   7 17 27
8   8 18 28
9   9 19 29
10 10 20 30

Using lapply

> numeric_only <- unlist(lapply(df, is.numeric))
> df[ , numeric_only]
   X1 X2 X3
1   1 11 21
2   2 12 22
3   3 13 23
4   4 14 24
5   5 15 25
6   6 16 26
7   7 17 27
8   8 18 28
9   9 19 29
10 10 20 30
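For a self-contained version of the base R route, the sketch below rebuilds a data frame that matches the printed output. The construction of df is an assumption inferred from that output, and the vapply variant is an alternative that is not part of the original text.

```r
# A data frame matching the printed example (construction is assumed).
df <- data.frame(X1 = 1:10, X2 = 11:20, X3 = 21:30,
                 X4 = letters[1:10], X5 = letters[11:20])

# One TRUE/FALSE per column: is the column numeric?
numeric_only <- unlist(lapply(df, is.numeric))

# Keep only the columns flagged TRUE.
numeric_df <- df[, numeric_only]

# vapply gives the same flags and checks that each result is a single logical.
numeric_only_v <- vapply(df, is.numeric, logical(1))

print(names(numeric_df))
```

vapply is a slightly stricter choice than unlist(lapply(...)) because it fails early if the test function ever returns something other than one logical value per column.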
We can do this by setting the column to NULL with the := operator of data.table.

Example

> library(data.table)
> data_table[, x:=NULL]
> data_table
    numbers
 1:       1
 2:       2
 3:       3
 4:       4
 5:       5
 6:       6
 7:       7
 8:       8
 9:       9
10:      10

To delete two columns

> Data_table
    numbers
 1:       0
 2:       1
 3:       2
 4:       3
 5:       4
 6:       5
 7:       6
 8:       7
 9:       8
10:       9
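The commands that build the tables and delete the two columns did not survive in the excerpt, so the following is only a sketch of how it might look. The table names dt_one and dt_two and the column names x and y are hypothetical; only the numbers column is taken from the printed output.

```r
library(data.table)

# Hypothetical table with one extra column besides "numbers".
dt_one <- data.table(x = letters[1:10], numbers = 1:10)

# Delete a single column by reference: no copy is made and no reassignment is needed.
dt_one[, x := NULL]

# Hypothetical table with two extra columns.
dt_two <- data.table(x = letters[1:10], y = LETTERS[1:10], numbers = 0:9)

# Delete several columns at once by passing their names as a character vector.
dt_two[, c("x", "y") := NULL]

print(names(dt_one))
print(names(dt_two))
```

Because := modifies the table in place, any other variable that refers to the same data.table sees the columns disappear as well; use copy() first if the original must be kept.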
This can be done by using the scale function.

Example

> data
          x        y
1  49.57542 2.940931
2  49.51565 2.264866
3  50.70819 2.918803
4  49.09796 2.416676
5  49.90089 2.349696
6  49.03445 3.883145
7  51.29564 4.072614
8  49.11014 3.526852
9  49.41255 3.320530
10 49.42131 3.033730
> standardized_data <- scale(data)
> standardized_data
               x           y
 [1,] -0.1774447 -0.20927607
 [2,] -0.2579076 -1.28232321
 [3,]  1.3476023 -0.24439768
 [4,] -0.8202493 -1.04137095
 [5,]  0.2607412 -1.14768085
 [6,] -0.9057468  1.28619932
 [7,]  2.1384776  1.58692277
 [8,] -0.8038439  0.72069363
 [9,] -0.3967165  0.39321942
[10,] -0.3849124 -0.06198639
attr(,"scaled:center")
        x         y 
49.707220  3.072784 
attr(,"scaled:scale")
        x         y 
0.7427788 0.6300430 
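To make explicit what scale does with its default arguments, the sketch below compares it with a manual computation: each column has its mean subtracted and is then divided by its standard deviation. The data here are freshly simulated, with an arbitrary seed, and are not the values printed above.

```r
set.seed(1)   # arbitrary seed, only so the sketch is repeatable

# Simulated data; the means 50 and 3 are illustrative choices.
data <- data.frame(x = rnorm(10, mean = 50), y = rnorm(10, mean = 3))

# Default scale(): center each column by its mean, divide by its sd.
standardized_data <- scale(data)

# The same computation done by hand, column by column.
manual <- sapply(data, function(col) (col - mean(col)) / sd(col))

# Each standardized column should have mean ~0 and sd 1.
print(round(colMeans(standardized_data), 10))
print(apply(standardized_data, 2, sd))
```

The scaled:center and scaled:scale attributes on the result store the column means and standard deviations, which is useful for applying the same transformation to new data.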
It can be done by using the weekdays function.

Example

> df = data.frame(date=c("2020-07-01", "2020-08-10", "2020-11-15"))
> df$day
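The example is cut off at df$day, so the rest is not shown. A likely completion, offered here as an assumption rather than the original code, converts the date strings with as.Date and passes them to weekdays.

```r
df <- data.frame(date = c("2020-07-01", "2020-08-10", "2020-11-15"))

# weekdays() needs a Date (or date-time) object, so convert the strings first.
# The day names are locale-dependent; in an English locale these three dates
# are a Wednesday, a Monday and a Sunday.
df$day <- weekdays(as.Date(df$date))

print(df)
```

If locale-independent values are needed, as.POSIXlt(as.Date(df$date))$wday returns the weekday as a number from 0 (Sunday) to 6 (Saturday).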
We can use rm to remove all or a few objects.

Example

> x<-rnorm(100,0.5)
> y<-1:100
> z<-rpois(100,5)
> a<-rep(1:5,20)

To remove all objects

> rm(list=ls())
> ls()
character(0)

To remove all except a

> rm(list=setdiff(ls(), "a"))
> ls()
[1] "a"

To remove all except x and a

> rm(list=ls()[! ls() %in% c("x","a")])
> ls()
[1] "a" "x"
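Running rm(list=ls()) wipes the whole workspace, so it can be safer to try these patterns in a scratch environment first. The sketch below does that; the environment name env is illustrative, and the envir argument is the only addition to the calls shown above.

```r
# A scratch environment, so the real workspace is left untouched.
env <- new.env()
assign("x", rnorm(100, 0.5), envir = env)
assign("y", 1:100, envir = env)
assign("z", rpois(100, 5), envir = env)
assign("a", rep(1:5, 20), envir = env)

# Remove everything except x and a.
rm(list = setdiff(ls(env), c("x", "a")), envir = env)
kept <- ls(env)

# Remove everything that is left.
rm(list = ls(env), envir = env)
remaining <- ls(env)

print(kept)
print(remaining)
```

ls returns names in sorted order, which is why the kept objects are listed as "a" then "x" regardless of the order in which they were created.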