Articles on Trending Technologies

Technical articles with clear explanations and examples

How to subset rows that do not contain NA and blank in one of the columns in an R data frame?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 5K+ Views

A data set may contain both NA values and blanks in the same column, so we need a way to deal with them. One option is to select only the rows that contain neither, which can be done by subsetting with single square brackets.
Example
Consider the below data frame −
> set.seed(1)
> df
x1 x2 x3 1 4 1 5 2 39 5 3 1 3 5 4 34 4 5 5 23 1 6 43 7 14 3 8 18 ...
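A minimal sketch of this kind of subsetting, assuming a small made-up data frame (the column names and values are illustrative, not the article's data):

```r
# Hypothetical data frame: x2 holds both NA and blank strings
df <- data.frame(
  x1 = c(4, 39, 1, 34, 23),
  x2 = c("a", NA, "", "b", "c"),
  stringsAsFactors = FALSE
)

# Keep rows where x2 is neither NA nor an empty string.
# is.na() is checked first because NA != "" evaluates to NA, not FALSE.
clean <- df[!is.na(df$x2) & df$x2 != "", ]
print(clean)
```

Both conditions are combined with `&` inside the single square brackets, so a row is kept only when it passes both checks.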


How to convert a list to matrix in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 12K+ Views

If we have a list that contains vectors whose elements, taken together, can fill a rectangular grid, then we can create a matrix of those elements. For example, if a list contains 8 vectors and the total number of elements in those 8 vectors is 100, or any other multiple of the desired number of columns, then we can create a matrix of those elements. This can be done by using the unlist function inside the matrix function.
Example
Consider the below list x −
> x
[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[[2]]
 [1] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
[[3]]
 [1] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
[[4]]
 [1]  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94
[20]  95  96  97  98  99 100
[[5]]
 [1] 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
[20] 120 121 122 123 124 125
[[6]]
 [1] 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
[20] 145 146 147 148 149 150
[[7]]
 [1] 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169
[20] 170 171 172 173 174 175
[[8]]
 [1] 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194
[20] 195 196 197 198 199 200
Matrix filled row by row −
> Matrix_x
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1    2    3    4    5    6    7    8    9    10
 [2,]   11   12   13   14   15   16   17   18   19    20
 [3,]   21   22   23   24   25   26   27   28   29    30
 [4,]   31   32   33   34   35   36   37   38   39    40
 [5,]   41   42   43   44   45   46   47   48   49    50
 [6,]   51   52   53   54   55   56   57   58   59    60
 [7,]   61   62   63   64   65   66   67   68   69    70
 [8,]   71   72   73   74   75   76   77   78   79    80
 [9,]   81   82   83   84   85   86   87   88   89    90
[10,]   91   92   93   94   95   96   97   98   99   100
[11,]  101  102  103  104  105  106  107  108  109   110
[12,]  111  112  113  114  115  116  117  118  119   120
[13,]  121  122  123  124  125  126  127  128  129   130
[14,]  131  132  133  134  135  136  137  138  139   140
[15,]  141  142  143  144  145  146  147  148  149   150
[16,]  151  152  153  154  155  156  157  158  159   160
[17,]  161  162  163  164  165  166  167  168  169   170
[18,]  171  172  173  174  175  176  177  178  179   180
[19,]  181  182  183  184  185  186  187  188  189   190
[20,]  191  192  193  194  195  196  197  198  199   200
Matrix filled column by column −
> Matrix_x
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1   21   41   61   81  101  121  141  161   181
 [2,]    2   22   42   62   82  102  122  142  162   182
 [3,]    3   23   43   63   83  103  123  143  163   183
 [4,]    4   24   44   64   84  104  124  144  164   184
 [5,]    5   25   45   65   85  105  125  145  165   185
 [6,]    6   26   46   66   86  106  126  146  166   186
 [7,]    7   27   47   67   87  107  127  147  167   187
 [8,]    8   28   48   68   88  108  128  148  168   188
 [9,]    9   29   49   69   89  109  129  149  169   189
[10,]   10   30   50   70   90  110  130  150  170   190
[11,]   11   31   51   71   91  111  131  151  171   191
[12,]   12   32   52   72   92  112  132  152  172   192
[13,]   13   33   53   73   93  113  133  153  173   193
[14,]   14   34   54   74   94  114  134  154  174   194
[15,]   15   35   55   75   95  115  135  155  175   195
[16,]   16   36   56   76   96  116  136  156  176   196
[17,]   17   37   57   77   97  117  137  157  177   197
[18,]   18   38   58   78   98  118  138  158  178   198
[19,]   19   39   59   79   99  119  139  159  179   199
[20,]   20   40   60   80  100  120  140  160  180   200
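The excerpt's assignment statements did not survive extraction, so the following is only a sketch of how such a list and the two matrices could be produced. The way the list is built and the names `by_row` and `by_col` are assumptions.

```r
# Assumed construction: a list of 8 vectors holding the integers 1 to 200
x <- lapply(0:7, function(i) (25 * i + 1):(25 * i + 25))

# unlist() flattens the list into one vector of 200 elements;
# matrix() then arranges that vector into 10 columns (200 / 10 = 20 rows)
by_row <- matrix(unlist(x), ncol = 10, byrow = TRUE)  # fill row by row
by_col <- matrix(unlist(x), ncol = 10)                # default: fill column by column

dim(by_row)
by_row[1, ]
by_col[, 1]
```

If the total number of elements is not a multiple of the requested number of columns, `matrix` recycles values and issues a warning, so it is worth checking `length(unlist(x))` first.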


How to import csv file data from Github in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 2K+ Views

If you have a CSV file on GitHub, it can be imported directly into R by using its URL, but make sure that you click the Raw option on the GitHub page where the data is stored and use the URL of that raw view. Many people skip the Raw option, so they read HTML instead of CSV and get confused. Here, I am sharing a public data set that contains a list of data sets. This data set has 12 variables. Now let’s import it −
> str(Data)
'data.frame': 57 obs. of 12 variables:
 $ Dataset.Name : Factor w/ 57 levels " ", "2008 Election ...
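As a rough illustration of the Raw step, the sketch below rewrites a GitHub page URL into the form that the Raw button leads to. The helper name `github_raw_url` and the repository path are hypothetical placeholders, not the data set used in the article.

```r
# Hypothetical helper: turn a GitHub "blob" page URL into its Raw URL
github_raw_url <- function(url) {
  url <- sub("^https://github\\.com/", "https://raw.githubusercontent.com/", url)
  sub("/blob/", "/", url, fixed = TRUE)
}

# Placeholder repository path (an assumption, not a real data set)
page_url <- "https://github.com/some-user/some-repo/blob/main/data.csv"
raw_url <- github_raw_url(page_url)
print(raw_url)

# With a real URL, the file could then be read directly (needs network access):
# Data <- read.csv(raw_url)
# str(Data)
```

Copying the address from the browser after clicking Raw gives the same kind of URL without any helper.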


How to change the automatic sorting of X-axis of a bar plot using ggplot2 in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 1K+ Views

When the categorical variable is stored as a character vector, ggplot2 sorts the X-axis labels of a bar plot alphabetically. We might instead want to keep the original sequence of categories as it appears in the categorical variable. Therefore, we can store the categorical variable as a factor with its levels in the desired order and then create the bar plot.
Example
Consider the below data frame −
> df
    Group Frequency
1   India        12
2     USA        18
3      UK        35
4 Germany        20
> ...
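A minimal sketch of the factor approach, using the four rows shown in the excerpt. The use of `unique` to fix the level order and of `geom_col` for the bars are assumptions about how one might write it, not necessarily the article's code.

```r
library(ggplot2)

df <- data.frame(
  Group = c("India", "USA", "UK", "Germany"),
  Frequency = c(12, 18, 35, 20),
  stringsAsFactors = FALSE
)

# Fix the level order to the order of appearance instead of alphabetical order
df$Group <- factor(df$Group, levels = unique(df$Group))

# The bars now follow the factor levels: India, USA, UK, Germany
p <- ggplot(df, aes(x = Group, y = Frequency)) +
  geom_col()
print(levels(df$Group))
```

Calling `factor(df$Group)` without the `levels` argument would sort the levels alphabetically again, so the explicit `levels` is the part that matters.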


How to create a vector with dates between two dates in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 2K+ Views

Creating a vector of dates becomes easy in R with the help of seq and as.Date. With these functions we can create a vector that contains the dates between two dates. With a positive step this cannot be done in reverse order: for example, if we give the future date as the first element, seq stops with an error. A descending sequence needs a negative step (such as by = "-1 day") or rev applied to the ascending vector.
Example
> V1
 [1] "2020-01-01" "2020-01-02" "2020-01-03" "2020-01-04" "2020-01-05"
 [6] "2020-01-06" "2020-01-07" "2020-01-08" "2020-01-09" "2020-01-10"
[11] "2020-01-11" "2020-01-12" "2020-01-13" "2020-01-14" "2020-01-15"
[16] "2020-01-16" "2020-01-17" "2020-01-18" "2020-01-19" "2020-01-20"
[21] "2020-01-21" "2020-01-22" "2020-01-23" ...
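A short sketch of both directions. The end date 2020-01-31 is an assumption, since the excerpt's output is cut off.

```r
# Ascending daily sequence between two dates
V1 <- seq(as.Date("2020-01-01"), as.Date("2020-01-31"), by = "day")

# Descending sequence: use a negative step ...
V2 <- seq(as.Date("2020-01-31"), as.Date("2020-01-01"), by = "-1 day")

# ... or simply reverse the ascending vector
V3 <- rev(V1)

head(V1, 3)
head(V2, 3)
```

The `by` argument also accepts "week", "month", "quarter" and "year", optionally preceded by an integer, for coarser steps.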


How to convert row index number or row index name of an R data frame to a vector?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 546 Views

We might want to extract the row index, whether it is numeric or a string, to do some calculations when a data column has been incorrectly set as the row index. This can happen during the data collection process or through incorrect processing of the data. Also, since row indexes are used to access rows, they should have proper names instead of values that might cause confusion. For example, if a data frame has row indexes such as 43, 94, etc., then it might be confusing. Therefore, we should convert row indexes to a vector or a column if required.
Example
Consider the below data frame (Here, ...
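A minimal sketch of the conversion, assuming a made-up data frame whose row names happen to be numbers (the values and the column name `id` are illustrative):

```r
# Hypothetical data frame whose row names carry values such as 43 and 94
df <- data.frame(x = c(2.5, 3.1, 4.8), row.names = c("43", "94", "17"))

# Row names are always returned as character strings
ids_chr <- rownames(df)

# Convert to numeric when the names are really numbers
ids_num <- as.numeric(ids_chr)

# Optionally keep them as a regular column and reset the row names to 1, 2, 3
df$id <- ids_num
rownames(df) <- NULL
print(df)
```

If the row names are genuine labels rather than numbers, the character vector from `rownames` can be used as it is.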


How to create a scatterplot in R using ggplot2 with transparency of points?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 365 Views

A scatterplot is used to observe the relationship between two continuous variables. If the sample size is large, the points on the plot overlap each other and the plot does not look appealing. Such scatterplots are also hard to interpret; therefore, we can increase the transparency of the points to make the plot more readable. We can do this by using the alpha argument in geom_point of ggplot2.
Example
Consider the below data frame −
> set.seed(123)
> library(ggplot2)
> ggplot(df, aes(x, y))+geom_point()
Output: [scatterplot with default point opacity]
> ggplot(df, aes(x, y))+geom_point(alpha=0.10)
Output: [scatterplot with alpha = 0.10]
> ggplot(df, aes(x, y))+geom_point(alpha=0.05)
Output: [scatterplot with alpha = 0.05]
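The definitions of x and y were lost from the excerpt, so the sketch below assumes 5000 correlated normal values purely for illustration. Only the ggplot calls follow the article.

```r
library(ggplot2)

set.seed(123)
# Assumed data: 5000 points with a linear relationship plus noise
x <- rnorm(5000)
y <- x + rnorm(5000)
df <- data.frame(x, y)

# Default points are fully opaque, so dense regions turn into a solid blob
p_opaque <- ggplot(df, aes(x, y)) + geom_point()

# alpha = 0.10 makes each point 10% opaque; darker areas then mark denser regions
p_alpha <- ggplot(df, aes(x, y)) + geom_point(alpha = 0.10)
```

Printing `p_opaque` and `p_alpha` one after the other shows the difference. Lower alpha values suit larger samples.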


How to find the mean of each variable using dplyr by factor variable with ignoring the NA values in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 466 Views

If a data set contains NA values in several numerical variables alongside a grouping variable, then na.rm = TRUE has to be passed to the mean function separately for each variable to find the mean or any other statistic. Instead, we can use the summarise_all function of the dplyr package, which returns the mean of all numerical variables for each group in just two lines of code.
Example
Loading dplyr package −
> library(dplyr)
Consider the ToothGrowth data set in base R −
> str(ToothGrowth)
'data.frame': 60 obs. of 3 variables:
 $ len : num 4.2 11.5 7.3 5.8 ...
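A minimal sketch of the grouped mean with NA values ignored. ToothGrowth itself has no missing values, so a few NAs are inserted here as an assumption to make the effect visible.

```r
library(dplyr)

# Copy ToothGrowth and introduce a few NA values for illustration
tg <- ToothGrowth
tg$len[c(1, 31)] <- NA
tg$dose[c(2, 32)] <- NA

# Group by the factor variable, then take the mean of every remaining column;
# na.rm = TRUE is passed on to mean() for each column
means <- tg %>%
  group_by(supp) %>%
  summarise_all(mean, na.rm = TRUE)
print(means)
```

In recent dplyr versions `summarise_all` is superseded by `summarise(across(...))`, but it still works as shown.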


How to join two data frames with the same row order using dplyr in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 233 Views

When two data frames share one common column, joining them can be used to create a bigger data frame. This helps us analyze a combined data set with many characteristics. We can do this by using the inner_join function of the dplyr package.
Example
Consider the below data frames −
> set.seed(111)
> df1
   x1  R1
1   1  78
2   2  84
3   3  83
4   4  47
5   5  25
6   1  59
7   2  69
8   3  35
9   4  72
10  5  26
11  1  49
12  2  45
13  3  74
14  4   8
15  5 100
16  1  96
17  2  24
18  3  48
19  4  95
20  5   7
> df2
   x1 R2
1   1 21
2   2 15
3   1  1
4   2  9
5   1 63
6   2 40
7   1 25
8   2 35
9   1 71
10  2 52
Loading dplyr package −
> library(dplyr)
Merging two data frames −
> inner_join(df2,df1)
Joining, by = "x1"
   x1 R2 R1
1   1 21 78
2   1 21 59
3   1 21 49
4   1 21 96
5   2 15 84
6   2 15 69
7   2 15 45
8   2 15 24
9   1  1 78
10  1  1 59
11  1  1 49
12  1  1 96
13  2  9 84
14  2  9 69
15  2  9 45
16  2  9 24
17  1 63 78
18  1 63 59
19  1 63 49
20  1 63 96
21  2 40 84
22  2 40 69
23  2 40 45
24  2 40 24
25  1 25 78
26  1 25 59
27  1 25 49
28  1 25 96
29  2 35 84
30  2 35 69
31  2 35 45
32  2 35 24
33  1 71 78
34  1 71 59
35  1 71 49
36  1 71 96
37  2 52 84
38  2 52 69
39  2 52 45
40  2 52 24
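A smaller sketch of the same join, assuming made-up data frames in which each key appears only once, so the effect on row order is easy to see:

```r
library(dplyr)

# Hypothetical data frames sharing the key column x1
df1 <- data.frame(x1 = 1:5, R1 = c(78, 84, 83, 47, 25))
df2 <- data.frame(x1 = c(4L, 2L), R2 = c(9, 15))

# Rows follow the order of the first data frame (df2);
# only keys present in both data frames are kept
joined <- inner_join(df2, df1, by = "x1")
print(joined)
```

Naming the key with `by = "x1"` avoids the "Joining, by" message and makes the join column explicit. When keys repeat in both data frames, as in the example output above, every matching pair of rows is returned.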


How to convert multiple numerical variables to factor variable in R?

Nizamuddin Siddiqui
Updated on 12-Aug-2020 2K+ Views

Sometimes the data type of a variable is not correct; it is very common for a factor variable to be read as a numeric variable, especially when the factor levels are represented by numbers. If we do not change the data type of such a variable, the result of the analysis can be incorrect. Therefore, a factor variable that has been read with a different data type must be converted to the factor data type. To convert multiple variables to factor type, we can create a vector that holds the names of all factor variables and then using ...
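The excerpt is cut off before it names the function it uses, so the sketch below assumes `lapply` with `factor`, one common way to convert several columns at once. The data frame and column names are made up.

```r
# Hypothetical data frame: x1 and x2 are categories coded as numbers
df <- data.frame(
  x1 = c(1, 2, 1, 3),
  x2 = c(0, 1, 1, 0),
  y  = c(10.5, 11.2, 9.8, 12.1)
)

# Vector holding the names of the variables that should be factors
factor_vars <- c("x1", "x2")

# Convert all of them in one step; the other columns are left untouched
df[factor_vars] <- lapply(df[factor_vars], factor)
str(df)
```

Keeping the names in one vector means the same line works unchanged when more columns are added to `factor_vars`.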
