Inner Join and Outer Join of Two Data Frames in R

Nizamuddin Siddiqui
Updated on 06-Jul-2020 15:00:12

844 Views

An inner join returns only the rows whose keys appear in both tables. A full outer join returns all rows from both tables, combining the records whose keys match and filling the unmatched columns with NA. Both can be done with the merge function.

Example

Inner Join

> df1 = data.frame(CustomerId = c(1:5), Product = c(rep("Biscuit", 3), rep("Cream", 2)))
> df1
  CustomerId Product
1          1 Biscuit
2          2 Biscuit
3          3 Biscuit
4          4   Cream
5          5   Cream
> df2 = data.frame(CustomerId = c(2, 5, 6), City = c(rep("Chicago", 2), rep("NewYorkCity", 1)))
> df2
  CustomerId City ...
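The excerpt stops before the merge calls, so here is a minimal sketch of both joins on the same two data frames. The object names inner and outer are illustrative, and the choice of by = "CustomerId" is an assumption based on the only column the two tables share.

```r
# Same example data as in the excerpt above
df1 <- data.frame(CustomerId = 1:5,
                  Product = c(rep("Biscuit", 3), rep("Cream", 2)))
df2 <- data.frame(CustomerId = c(2, 5, 6),
                  City = c(rep("Chicago", 2), "NewYorkCity"))

# Inner join: keep only CustomerId values present in both tables (2 and 5)
inner <- merge(df1, df2, by = "CustomerId")

# Full outer join: keep every CustomerId from either table (1 to 6);
# columns with no match are filled with NA
outer <- merge(df1, df2, by = "CustomerId", all = TRUE)

print(inner)
print(outer)
```

Setting all.x = TRUE or all.y = TRUE instead of all = TRUE would give a left or right outer join respectively.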

Why We Should Use set.seed() in R

Nizamuddin Siddiqui
Updated on 06-Jul-2020 14:58:52

5K+ Views

set.seed makes random results reproducible. Whenever we randomly select observations in R, or in any other statistical software, each run produces different values. To keep the values from the first random selection, we can either store them in an object or fix the state of the random number generator with set.seed so that the same results come back every time.

Example

Randomization without set.seed

> sample(1:10)
[1] 4 10 5 3 1 6 ...
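As a small sketch of the second option, the snippet below resets the generator to the same seed before each call and checks that the two permutations agree. The seed value 123 and the names first and second are arbitrary choices for illustration.

```r
# Fix the random number generator state, then draw a permutation
set.seed(123)
first <- sample(1:10)

# Resetting to the same seed reproduces the same permutation
set.seed(123)
second <- sample(1:10)

# TRUE: both draws started from the same generator state
same_draw <- identical(first, second)
print(same_draw)
```

The seed has to be set again immediately before each draw that should match; a single call at the top of a script makes the whole script reproducible, but successive draws within it still differ from one another.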

Make List of Data Frames in R

Nizamuddin Siddiqui
Updated on 06-Jul-2020 14:57:34

268 Views

This can be done by using the list function.

Example

> df1
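The example in this excerpt is cut off, so the following is only a sketch of the idea with made-up data. The data frames df1 and df2, their columns, and the element names first and second are all hypothetical.

```r
# Two small data frames with hypothetical contents
df1 <- data.frame(x = 1:3)
df2 <- data.frame(y = c("a", "b", "c"))

# Collect them in a named list
df_list <- list(first = df1, second = df2)

# Access one data frame by name, or apply a function to all of them
print(df_list$first)
row_counts <- sapply(df_list, nrow)
print(row_counts)
```

Naming the elements is optional, but it makes later access with $ or [[ ]] easier to read than positional indexing.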

Filter Rows Containing a Certain String in R

Nizamuddin Siddiqui
Updated on 06-Jul-2020 14:55:13

2K+ Views

We can do this by using the filter function of the dplyr package together with base R's grepl function.

Example

Consider the mtcars data set.

> data(mtcars)
> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1 ...
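The excerpt ends before the filtering step. One way it might look is sketched below, assuming the dplyr package is installed. Copying the row names into a column called model, and searching for "Mazda", are illustrative choices and not taken from the original.

```r
library(dplyr)

# mtcars stores the car names as row names, so copy them into a
# regular column (the column name "model" is an assumption)
cars <- data.frame(model = rownames(mtcars), mtcars, row.names = NULL)

# grepl returns TRUE for every model name containing the string;
# filter keeps exactly those rows
mazda <- filter(cars, grepl("Mazda", model))
print(mazda)
```

By default grepl treats the pattern as a regular expression; passing fixed = TRUE matches the string literally, which is safer when it contains characters such as "." or "(".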

Change Orientation and Font Size of X-Axis Labels using ggplot2 in R

Nizamuddin Siddiqui
Updated on 06-Jul-2020 14:53:37

996 Views

This can be done by using the theme function in ggplot2.

Example

> df
                    x          y
1   long text label a -0.8080940
2   long text label b  0.2164785
3   long text label c  0.4694148
4   long text label d  0.7878956
5   long text label e -0.1836776
6   long text label f  0.7916155
7   long text label g  1.3170755
8   long text label h  0.4002917
9   long text label i  0.6890988
10  long text label j  0.6077572

The plot is created as follows:

> library(ggplot2)
> ggplot(df, aes(x=x, y=y)) + geom_point() + theme(text = element_text(size=20), axis.text.x = element_text(angle=90, hjust=1))
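The definition of df is not shown in the excerpt, so the sketch below builds a stand-in with the same labels and random y values (the seed and the rnorm call are assumptions). It also adds vjust = 0.5, which is not in the original call, to center the rotated labels on their tick marks. It assumes the ggplot2 package is installed.

```r
library(ggplot2)

# Stand-in data: same labels as above, y values are freshly generated
set.seed(1)
df <- data.frame(x = paste("long text label", letters[1:10]),
                 y = rnorm(10, mean = 0.5))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  theme(
    text = element_text(size = 20),           # font size for all text
    axis.text.x = element_text(angle = 90,    # rotate the x-axis labels
                               hjust = 1,     # align label ends at the axis
                               vjust = 0.5)   # center labels on the ticks
  )

print(p)
```

To change only the x-axis label size and leave other text alone, the size argument can be moved from text into axis.text.x.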

Select Only Numeric Columns from an R Data Frame

Nizamuddin Siddiqui
Updated on 06-Jul-2020 14:51:29

1K+ Views

The easiest way is the select_if function of the dplyr package, but we can also do it with lapply.

Using dplyr

> df <- data.frame(X1 = 1:10, X2 = 11:20, X3 = 21:30, X4 = letters[1:10], X5 = letters[11:20])
> df
   X1 X2 X3 X4 X5
1   1 11 21  a  k
2   2 12 22  b  l
3   3 13 23  c  m
4   4 14 24  d  n
5   5 15 25  e  o
6   6 16 26  f  p
7   7 17 27  g  q
8   8 18 28  h  r
9   9 19 29  i  s
10 10 20 30  j  t
> library("dplyr")
> select_if(df, is.numeric)
   X1 X2 X3
1   1 11 21
2   2 12 22
3   3 13 23
4   4 14 24
5   5 15 25
6   6 16 26
7   7 17 27
8   8 18 28
9   9 19 29
10 10 20 30

Using lapply

> numeric_only <- unlist(lapply(df, is.numeric))
> df[ , numeric_only]
   X1 X2 X3
1   1 11 21
2   2 12 22
3   3 13 23
4   4 14 24
5   5 15 25
6   6 16 26
7   7 17 27
8   8 18 28
9   9 19 29
10 10 20 30
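In dplyr 1.0.0 and later, select_if is marked as superseded in favor of select combined with where. A sketch of that form on the same data, assuming a recent dplyr is installed (the name numeric_df is illustrative):

```r
library(dplyr)

# Same data as in the example above
df <- data.frame(X1 = 1:10, X2 = 11:20, X3 = 21:30,
                 X4 = letters[1:10], X5 = letters[11:20])

# where() applies the predicate to each column and keeps those
# for which it returns TRUE
numeric_df <- select(df, where(is.numeric))
print(names(numeric_df))
```

select_if still works, so this is a matter of following the package's current recommendation and not a required change.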

Delete Columns by Their Name in Data Table in R

Nizamuddin Siddiqui
Updated on 06-Jul-2020 14:50:15

1K+ Views

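We can do this by setting the column to NULL with the := operator.

Example

Here data_table is a data.table with the columns x and numbers.

> library(data.table)
> data_table[, x:=NULL]
> data_table
    numbers
 1:       1
 2:       2
 3:       3
 4:       4
 5:       5
 6:       6
 7:       7
 8:       8
 9:       9
10:      10

To delete two columns

Pass both column names as a character vector on the left of := and assign NULL. After the deletion, only the numbers column of Data_table is left:

> Data_table
    numbers
 1:       0
 2:       1
 3:       2
 4:       3
 5:       4
 6:       5
 7:       6
 8:       7
 9:       8
10:       9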
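The table definitions did not survive in the excerpt, so this sketch uses a hypothetical table dt with made-up columns x, y, z, and numbers to show both cases. It assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical table; only the column "numbers" is meant to survive
dt <- data.table(x = letters[1:10],
                 y = 10:1,
                 z = rep(c(TRUE, FALSE), 5),
                 numbers = 1:10)

# Delete a single column by name
dt[, x := NULL]

# Delete several columns at once by passing a character vector
dt[, c("y", "z") := NULL]

print(names(dt))
```

The := operator modifies the table by reference, so there is no need to assign the result back to dt. If the original table must be kept, work on copy(dt) instead.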

Standardize Columns in an R Data Frame

Nizamuddin Siddiqui
Updated on 06-Jul-2020 14:46:24

251 Views

This can be done by using the scale function.

Example

> data
          x        y
1  49.57542 2.940931
2  49.51565 2.264866
3  50.70819 2.918803
4  49.09796 2.416676
5  49.90089 2.349696
6  49.03445 3.883145
7  51.29564 4.072614
8  49.11014 3.526852
9  49.41255 3.320530
10 49.42131 3.033730
> standardized_data <- scale(data)
> standardized_data
               x           y
 [1,] -0.1774447 -0.20927607
 [2,] -0.2579076 -1.28232321
 [3,]  1.3476023 -0.24439768
 [4,] -0.8202493 -1.04137095
 [5,]  0.2607412 -1.14768085
 [6,] -0.9057468  1.28619932
 [7,]  2.1384776  1.58692277
 [8,] -0.8038439  0.72069363
 [9,] -0.3967165  0.39321942
[10,] -0.3849124 -0.06198639
attr(,"scaled:center")
        x         y
49.707220  3.072784
attr(,"scaled:scale")
        x         y
0.7427788 0.6300430
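As the output above shows, scale returns a matrix with the centers and scales attached as attributes. The sketch below checks that each column ends up with mean 0 and standard deviation 1 and converts the result back to a data frame. The input data here is generated with an arbitrary seed and is not the data from the example.

```r
# Stand-in data with two numeric columns (values are an assumption)
set.seed(1)
data <- data.frame(x = rnorm(10, mean = 50), y = rnorm(10, mean = 3))

# Subtract each column's mean and divide by its standard deviation
standardized <- scale(data)

# After standardizing, column means are 0 and standard deviations are 1
# (up to floating point error)
col_means <- colMeans(standardized)
col_sds <- apply(standardized, 2, sd)

# scale() returns a matrix; convert it if a data frame is needed
standardized_df <- as.data.frame(standardized)
print(standardized_df)
```

scale only works on numeric columns, so a data frame with character or factor columns needs those removed or handled separately first.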

Find Day of the Week from Data Frame in R

Nizamuddin Siddiqui
Updated on 06-Jul-2020 14:43:52

252 Views

It can be done by using the weekdays function.

Example

> df = data.frame(date=c("2020-07-01", "2020-08-10", "2020-11-15"))
> df$day
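The excerpt is cut off at the assignment to df$day. A plausible completion, which is an assumption and not the original code, converts the strings with as.Date and passes them to weekdays. The extra day_number column is added here only because it does not depend on the locale.

```r
df <- data.frame(date = c("2020-07-01", "2020-08-10", "2020-11-15"))

# weekdays() needs Date (or date-time) input, so convert the strings first.
# The names it returns follow the current locale (English names in an
# English locale).
df$day <- weekdays(as.Date(df$date))

# Locale-independent alternative: day of week as a number, 0 = Sunday
df$day_number <- as.POSIXlt(as.Date(df$date))$wday

print(df)
```

In an English locale, the three dates fall on a Wednesday, a Monday, and a Sunday, which corresponds to day_number values 3, 1, and 0.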

Remove All Objects Except One or Few in R

Nizamuddin Siddiqui
Updated on 06-Jul-2020 14:42:16

3K+ Views

We can use rm to remove all objects or only some of them.

Example

> x <- rnorm(100, 0.5)
> y <- 1:100
> z <- rpois(100, 5)
> a <- rep(1:5, 20)

Each of the following cases starts again from these four objects.

To remove all objects

> rm(list=ls())
> ls()
character(0)

To remove all except a

> rm(list=setdiff(ls(), "a"))
> ls()
[1] "a"

To remove all except x and a

> rm(list=ls()[! ls() %in% c("x","a")])
> ls()
[1] "a" "x"
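Running rm(list = ls()) at the top level wipes the whole workspace, so the sketch below applies the same setdiff pattern inside a scratch environment. The environment e and the vector keep are assumptions made for a safe demonstration; the original examples operate on the global environment.

```r
# Create the four objects inside a separate environment
e <- new.env()
assign("x", rnorm(100, 0.5), envir = e)
assign("y", 1:100, envir = e)
assign("z", rpois(100, 5), envir = e)
assign("a", rep(1:5, 20), envir = e)

# Names of the objects that should survive
keep <- c("x", "a")

# Remove everything in e whose name is not listed in keep
rm(list = setdiff(ls(e), keep), envir = e)

remaining <- ls(e)
print(remaining)
```

Keeping the names to preserve in one vector makes the call easy to adapt. Dropping the envir argument and using ls() applies the same logic to the global environment.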
