Server Side Programming Articles - Page 1654 of 2650

How to join two data frames with the same row order using dplyr in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:50:47

203 Views

When two data frames share a common column, we can join them on that column to build a larger data frame and analyze the combined characteristics together. The inner_join function of the dplyr package does this. When key values repeat in both data frames, as in the example below, every matching pair of rows is returned.

Example

Consider the below data frames:

> set.seed(111)
> df1
   x1  R1
1   1  78
2   2  84
3   3  83
4   4  47
5   5  25
6   1  59
7   2  69
8   3  35
9   4  72
10  5  26
11  1  49
12  2  45
13  3  74
14  4   8
15  5 100
16  1  96
17  2  24
18  3  48
19  4  95
20  5   7
> df2
   x1 R2
1   1 21
2   2 15
3   1  1
4   2  9
5   1 63
6   2 40
7   1 25
8   2 35
9   1 71
10  2 52

Loading dplyr package:

> library(dplyr)

Merging two data frames:

> inner_join(df2,df1)
Joining, by = "x1"
   x1 R2 R1
1   1 21 78
2   1 21 59
3   1 21 49
4   1 21 96
5   2 15 84
6   2 15 69
7   2 15 45
8   2 15 24
9   1  1 78
10  1  1 59
11  1  1 49
12  1  1 96
13  2  9 84
14  2  9 69
15  2  9 45
16  2  9 24
17  1 63 78
18  1 63 59
19  1 63 49
20  1 63 96
21  2 40 84
22  2 40 69
23  2 40 45
24  2 40 24
25  1 25 78
26  1 25 59
27  1 25 49
28  1 25 96
29  2 35 84
30  2 35 69
31  2 35 45
32  2 35 24
33  1 71 78
34  1 71 59
35  1 71 49
36  1 71 96
37  2 52 84
38  2 52 69
39  2 52 45
40  2 52 24
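The join in this example can be reproduced with a minimal sketch like the one below, assuming the dplyr package is installed. The data frames are rebuilt from the printed values rather than regenerated from the seed, and the key is named explicitly with by = "x1". The suppressWarnings call is an addition of this sketch, because recent dplyr releases may warn about a many-to-many match.

```r
library(dplyr)

# Values copied from the printed data frames (not regenerated from the seed).
df1 <- data.frame(
  x1 = rep(1:5, times = 4),
  R1 = c(78, 84, 83, 47, 25, 59, 69, 35, 72, 26,
         49, 45, 74, 8, 100, 96, 24, 48, 95, 7)
)
df2 <- data.frame(
  x1 = rep(1:2, times = 5),
  R2 = c(21, 15, 1, 9, 63, 40, 25, 35, 71, 52)
)

# Name the key explicitly instead of relying on the "Joining, by" message.
# Each key value repeats in both data frames, so every matching pair is returned.
joined <- suppressWarnings(inner_join(df2, df1, by = "x1"))

nrow(joined)     # 10 rows of df2, each matching 4 rows of df1
head(joined, 4)
```

Because each value of x1 in df2 matches four rows of df1, the result has 40 rows. A join that should pair rows one to one needs a key that is unique in both data frames.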

How to convert multiple numerical variables to factor variable in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:45:17

1K+ Views

Sometimes a variable is read with the wrong data type. It is very common for a factor variable to be read as numeric, especially when its levels are represented by numbers. If the data type is left unchanged, the results of the analysis can be incorrect, so any factor variable stored as another type should be converted to factor. To convert multiple variables to factor type, we can create a vector that will have the name of all factor variables then using ... Read More
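One way to apply this idea is sketched below in base R. The data frame, its column names, and the use of lapply are assumptions made for illustration; the truncated text above does not show which function the article goes on to use.

```r
# Hypothetical data: group and grade are categorical codes stored as numbers.
df <- data.frame(
  group = c(1, 2, 1, 3),
  grade = c(10, 20, 10, 30),
  score = c(4.5, 3.2, 5.1, 2.8)
)

# Names of the columns that should be factors.
factor_cols <- c("group", "grade")

# Convert only those columns; all other columns keep their type.
df[factor_cols] <- lapply(df[factor_cols], factor)

str(df)
```

Keeping the column names in one vector makes the conversion easy to review and to extend when more coded variables appear.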

How to create an empty matrix in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:39:17

5K+ Views

An empty matrix is created in the same way as a regular matrix in R, except that no values are passed to the matrix function. The number of rows and columns can differ, and the byrow argument is not needed because every value is missing. By default, R creates one column for a matrix, so to create a matrix without a column we can use ncol = 0.

Example

> M1
     [,1]
[1,]   NA
[2,]   NA
... Read More
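The cases described above can be sketched in base R as follows. The object names are illustrative only.

```r
# No data supplied: every cell is NA.
m_na <- matrix(nrow = 2, ncol = 3)

# Only nrow given: one column is created by default.
m_default <- matrix(nrow = 2)

# ncol = 0 gives a matrix with rows but no columns.
m_nocol <- matrix(nrow = 2, ncol = 0)

dim(m_na)
dim(m_default)
dim(m_nocol)
```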

How to plot means inside boxplot using ggplot2 in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:30:27

490 Views

A boxplot shows the minimum, the maximum, the first quartile, the median, and the third quartile, but we might also want to plot the means so that factor levels can be compared on the basis of means as well. To create this type of plot, we first find the group-wise means and then pass them to the geom_text function of ggplot2.

Example

Consider the CO2 data that ships with R:

> head(CO2, 20)
  Plant   Type  Treatment conc uptake
1   Qn1 Quebec nonchilled   95   16.0
2   Qn1 Quebec nonchilled  175   30.4
3   Qn1 Quebec nonchilled  250   34.8
... Read More
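A minimal sketch of the two steps, assuming ggplot2 is installed. The choice of Treatment as the grouping variable, the use of aggregate for the means, and the rounding of the labels are assumptions of this sketch; the truncated text above does not show the article's own code.

```r
library(ggplot2)

# Work on a plain data frame copy of the built-in CO2 data.
co2 <- as.data.frame(CO2)

# Step 1: group-wise means of uptake for each Treatment level.
means <- aggregate(uptake ~ Treatment, data = co2, FUN = mean)
means$label <- round(means$uptake, 1)

# Step 2: draw the boxplot and print each mean at its own height.
p <- ggplot(co2, aes(x = Treatment, y = uptake)) +
  geom_boxplot() +
  geom_text(data = means, aes(label = label))

print(p)
```

The text layer inherits x and y from the main plot, so each label sits at the mean of its group.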

How to create a scatterplot in R with legend position inside the plot area using ggplot2?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:27:10

321 Views

A legend shows which factor level each point in a scatterplot belongs to, so we can see how the levels of a factor variable relate to the response variable. Inside the plot area, the legend is usually placed in one of the four corners: bottom left, top right, top left, or bottom right. We can use the theme function to position the legend.

Example

Consider the below data frame:

> set.seed(99)
> library(ggplot2)

Creating the plot with different legend positions:

> ggplot(df, aes(x=x1, y=x2, colour=F)) + geom_point(aes(colour=F)) +
+ theme(legend.justification = c(1, 0), legend.position = c(1, 0))

Output

> ggplot(df, aes(x=x1, ... Read More
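The theme call above can be tried with a small sketch like this one, assuming ggplot2 is installed. The data are hypothetical, since the definitions of x1, x2 and F are not shown, and the grouping column is named grp here. Newer ggplot2 releases deprecate a numeric legend.position in favour of legend.position = "inside" together with legend.position.inside, so this form may produce a deprecation warning.

```r
library(ggplot2)

set.seed(99)
# Hypothetical data; the source does not show how its data frame was built.
df <- data.frame(
  x1  = rnorm(20),
  x2  = rnorm(20),
  grp = factor(rep(c("A", "B"), each = 10))
)

p <- ggplot(df, aes(x = x1, y = x2, colour = grp)) +
  geom_point() +
  # c(1, 0) anchors the legend's bottom-right corner at the
  # bottom-right corner of the plotting area.
  theme(legend.justification = c(1, 0), legend.position = c(1, 0))
```

Changing both vectors to c(0, 1) would move the legend to the top left, c(1, 1) to the top right, and c(0, 0) to the bottom left.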

How to create a subset for a factor level in an R data frame?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:24:52

4K+ Views

In data analysis, we often deal with factor variables that have several levels. Sometimes we want a subset of the data frame for one specific factor level so that we can analyze the data for that level only. This can be done with the subset function.

Example

Consider the below data frame:

> set.seed(99)
> df
  Factor Percentage
1  India         48
2  China         33
3    USA         44
4     UK         22
5 Canada         62
6  India         32
7  China         13
8 ... Read More
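A minimal sketch of the subset call, using the first seven rows printed above as fixed values. The droplevels step is an addition of this sketch and is not taken from the article.

```r
# First seven rows of the printed data frame, entered as fixed values.
df <- data.frame(
  Factor = factor(c("India", "China", "USA", "UK", "Canada", "India", "China")),
  Percentage = c(48, 33, 44, 22, 62, 32, 13)
)

# Keep only the rows whose factor level is "India".
india <- subset(df, Factor == "India")

# subset() keeps the unused levels; droplevels() removes them.
india_clean <- droplevels(india)

india
levels(india$Factor)
levels(india_clean$Factor)
```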

How to convert a vector into matrix in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:20:02

11K+ Views

To convert a vector into a matrix, we just need to use the matrix function. We can also define the number of rows and columns if required. If the number of values in the vector is not a multiple of the number of rows or columns, R issues a warning and recycles the values to fill the matrix, because the vector does not fit those dimensions exactly.

The vectors below are given simple names to make them easy to read, but you can change the names if you want. There are four vectors of different lengths that are shown in these examples:

Examples

> Vector1
[1] ... Read More
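The behaviour described above can be checked with a short base R sketch. The vectors here are illustrative and are not the four vectors from the article.

```r
v <- 1:12

# Filled column by column: 3 rows, 4 columns.
m_cols <- matrix(v, nrow = 3)

# Filled row by row instead.
m_rows <- matrix(v, nrow = 3, byrow = TRUE)

# 10 values do not fit 3 rows evenly; capture the resulting warning message.
msg <- tryCatch(
  matrix(1:10, nrow = 3),
  warning = function(w) conditionMessage(w)
)

m_cols
m_rows
msg
```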

How to create a stacked bar plot in which each bar sums to 1 or 100% in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:14:30

412 Views

A stacked bar plot shows several segments stacked within one bar: each bar represents one category of a categorical variable and the segments represent its levels. Usually, a stacked bar chart is built from the counts of the levels in each category, but it can also be built from percentages within each category. We can use the prop.table function to compute the proportion of levels for each category and then create the bar plot.

Example

Consider the below data frame:

> set.seed(99)
> df
  x1 x2 x3
1 48 98 68
2 33 ... Read More
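A minimal base R sketch of the prop.table approach. The counts and the labels are hypothetical, and margin = 2 assumes that each column of the matrix is one bar.

```r
# Hypothetical counts: rows are levels, columns are categories (one bar each).
counts <- matrix(
  c(12, 30, 18,
    25, 10, 15,
     8, 22, 40),
  nrow = 3,
  dimnames = list(level = c("low", "mid", "high"),
                  category = c("A", "B", "C"))
)

# Divide every column by its total so that each bar sums to 1.
props <- prop.table(counts, margin = 2)
colSums(props)

# barplot() stacks the rows of a matrix within each column by default.
barplot(props, legend.text = TRUE, ylab = "Proportion")
```

Multiplying props by 100 before plotting gives the same chart on a percentage scale.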

How to create a lagged variable in R for groups?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:11:53

466 Views

A lagged variable holds the previous value of the variable it is built from, so its first value is missing. When the lag is computed within the groups of a grouping variable or factor variable, each group therefore has one missing value.

Example

Consider the below data frame:

> set.seed(2)
> df
   Factor Rate
1      F1   12
2      F1   54
3      F1   18
4      F1   26
5      F1   14
6      F2   25
7      F2   81
8      F2   47
9      F2   15
10     F2 ... Read More
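A group-wise lag can be sketched in base R with ave, using the nine complete rows printed above. The use of ave and the column name Rate_lag are assumptions of this sketch; the truncated text does not show the article's own method.

```r
# The nine complete rows of the printed data frame, entered as fixed values.
df <- data.frame(
  Factor = c(rep("F1", 5), rep("F2", 4)),
  Rate   = c(12, 54, 18, 26, 14, 25, 81, 47, 15)
)

# Within each group, shift the values down by one position.
# The first row of every group has no previous value and becomes NA.
df$Rate_lag <- ave(df$Rate, df$Factor,
                   FUN = function(x) c(NA, head(x, -1)))

df
```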

How to multiply matrix rows with a vector in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12:08:19

905 Views

When we multiply a matrix by a vector in R with the * operator, the vector is recycled down the columns, so element i of the vector multiplies every value in row i. If we want the vector to be applied along each row instead, so that element j multiplies every value in column j, we can use the transpose function: multiply the transpose of the matrix by the vector and then take the transpose of the result.

Example

Consider the below matrix:

> M1
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    2    7   12   17   22
[3,]    3    8   13   18   23
[4,]    4    9   14   19   24
[5,]    5   10   15   20   25
> M1*V1
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    4   14   24   34   44
[3,]    9   24   39   54   69
[4,]   16   36   56   76   96
[5,]   25   50   75  100  125

Row-wise multiplication:

> t(t(M1)*V1)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1   12   33   64  105
[2,]    2   14   36   68  110
[3,]    3   16   39   72  115
[4,]    4   18   42   76  120
[5,]    5   20   45   80  125

Let's have a look at one more example:

> M2
     [,1] [,2] [,3] [,4] [,5]
[1,]   72    5   36   11   76
[2,]   61   38   17   73   25
[3,]   96    9   62   79   64
[4,]   77   53   80   78   50
[5,]   81   15   21   43   23
> V2
[1] 28 20  1 68 86
> t(t(M2)*V2)
     [,1] [,2] [,3] [,4] [,5]
[1,] 2016  100   36  748 6536
[2,] 1708  760   17 4964 2150
[3,] 2688  180   62 5372 5504
[4,] 2156 1060   80 5304 4300
[5,] 2268  300   21 2924 1978
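The two multiplications can be compared in a short base R sketch. The definitions M1 = matrix(1:25, nrow = 5) and V1 = 1:5 are inferred from the printed output and are assumptions here, and the sweep call is an alternative added by this sketch rather than something the article uses.

```r
# Inferred from the printed output; the original definitions are not shown.
M1 <- matrix(1:25, nrow = 5)
V1 <- 1:5

# Plain *: V1 is recycled down the columns, so V1[i] scales row i.
by_recycling <- M1 * V1

# Transpose, multiply, transpose back: V1[j] scales column j.
by_transpose <- t(t(M1) * V1)

# The same result with sweep() over the column margin.
by_sweep <- sweep(M1, 2, V1, `*`)

by_recycling[2, ]
by_transpose[1, ]
```

The sweep form states the margin explicitly, which can be easier to read than the double transpose.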
