Create New Column with Row Sums in R Data Frame

Nizamuddin Siddiqui
Updated on 11-Aug-2020 09:12:07

1K+ Views

In data analysis, we sometimes need to create a new column that holds the row sums of only some rows. These sums are repeated so that the number of values equals the number of rows in the data frame. We can use rowSums with the rep function to create such columns.

Example: Consider the below data frame:
> set.seed(99) > x1 x2 x3 x4 x5 df df x1 x2 x3 x4 x5 1 0.7139625 4 9.321058 0.33297863 4 2 0.9796581 2 4.298837 -1.47926432 11 3 0.5878287 ...
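A minimal sketch of the rowSums-plus-rep idea described above. The data frame, the choice of the first two rows, and the column name `Sum` are illustrative assumptions, not the article's own example.

```r
# Hypothetical data frame with four rows (not the article's data)
df <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(10, 20, 30, 40))

# Row sums of the first two rows only: 11 and 22
partial_sums <- unname(rowSums(df[1:2, ]))

# Repeat the sums until there is one value per row of df
df$Sum <- rep(partial_sums, length.out = nrow(df))
```

Using `length.out = nrow(df)` keeps the new column the right length even when the number of rows is not a multiple of the number of sums.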

Deal with Error: Undefined Columns Selected in R Data Frame Subsetting

Nizamuddin Siddiqui
Updated on 11-Aug-2020 09:03:47

52K+ Views

The error “undefined columns selected” when subsetting a data frame means that R cannot find the columns you asked for. Generally, this happens when we forget the comma while subsetting with single square brackets.

Example: Consider the below data frame:
> set.seed(99) > x1 x2 x3 x4 x5 df df x1 x2 x3 x4 x5 1 0.7139625 4 9.321058 0.33297863 4 2 0.9796581 2 4.298837 -1.47926432 11 3 0.5878287 3 7.389898 -0.07847958 5 4 0.9438585 4 7.873764 -1.35241100 6 5 0.1371621 2 5.534758 -1.17969925 4 6 0.6226740 4 8.786676 -1.15705659 5 7 -0.3638452 1 ...
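A small sketch of how the missing comma can trigger this error, using a hypothetical two-column data frame rather than the article's data. Without the comma, the logical vector is treated as a column selector; with it, the vector selects rows.

```r
# Hypothetical data: 3 rows, 2 columns
df <- data.frame(x1 = c(5, 1, 8), x2 = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Missing comma: the condition is applied to columns, and the third
# TRUE points at a column that does not exist, so R raises an error.
# tryCatch captures the error message instead of stopping the script.
msg <- tryCatch(df[df$x1 > 2], error = function(e) conditionMessage(e))

# With the comma, the condition filters rows and keeps all columns
ok <- df[df$x1 > 2, ]
```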

Combine Lists in R

Nizamuddin Siddiqui
Updated on 11-Aug-2020 08:56:18

378 Views

When we have multiple lists that hold similar types of data, we might want to combine or merge them. This is helpful because we can then perform calculations using one list name instead of applying them to several lists. We can combine multiple lists with the help of the mapply function.

Example: Consider the below lists:
> List1 List1 [[1]] [1] "a" "b" "c" "d" "e" [[2]] [1] 1 2 3 4 5 [[3]] [1] 5 4 3 2 1 [[4]] [1] 25 [[5]] ...
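One way mapply can merge lists is to apply `c` to matching elements. This sketch assumes that element-wise merge is what is wanted; the two short lists are hypothetical stand-ins for the article's lists.

```r
# Hypothetical lists with matching element types
List1 <- list(c("a", "b"), 1:3)
List2 <- list(c("c", "d"), 4:6)

# Concatenate the first elements together, the second elements together, ...
# SIMPLIFY = FALSE keeps the result as a list
combined <- mapply(c, List1, List2, SIMPLIFY = FALSE)
```

If the goal is instead to append one list after the other, `c(List1, List2)` does that directly.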

Find Unique Values in a Column of an R Data Frame

Nizamuddin Siddiqui
Updated on 11-Aug-2020 08:48:12

11K+ Views

Categorical variables have multiple categories, but if the data set is large and the categories are numerous, it becomes difficult to recognize them. Extracting the unique values of a categorical variable makes its categories easy to see. We can do this by using unique on each column of an R data frame.

Example: Consider the below data frame:
> x1 x2 x3 x4 df df x1 x2 x3 x4 1 A 5 India a 2 A 5 India b 3 A ...
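A brief sketch of unique applied to one column and then to every column. The data frame below is a hypothetical example loosely modelled on the column names in the teaser.

```r
# Hypothetical data frame with repeated categories
df <- data.frame(x1 = c("A", "A", "B", "C"),
                 x3 = c("India", "India", "China", "India"),
                 stringsAsFactors = FALSE)

# Unique values of a single column
u1 <- unique(df$x1)

# Unique values of every column, returned as a named list
u_all <- lapply(df, unique)
```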

Extract Unique Combinations of Two or More Variables in R Data Frame

Nizamuddin Siddiqui
Updated on 11-Aug-2020 08:41:56

10K+ Views

An R data frame can have a large number of categorical variables, and these variables form different combinations. For example, one value of a variable could be linked with two or more values of another variable, while another categorical variable may have all unique categories. We can find the unique combinations for as many variables as we want with the help of the unique function.

Example: Consider the below data frame:
> x1 x2 x3 x4 df df x1 x2 x3 x4 1 1 A a 5 2 2 A b 5 3 3 A c 10 ...
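A short sketch of unique applied to a subset of columns, which returns one row per distinct combination. The data frame and the chosen columns are assumptions for illustration.

```r
# Hypothetical data: (A, 5) appears twice
df <- data.frame(x2 = c("A", "A", "A", "B"),
                 x4 = c(5, 5, 10, 5),
                 stringsAsFactors = FALSE)

# Keep only the distinct (x2, x4) pairs
combos <- unique(df[, c("x2", "x4")])
```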

Create Data Frame with One or More Columns as a List in R

Nizamuddin Siddiqui
Updated on 11-Aug-2020 08:37:40

131 Views

Creating a data frame with a list as a column is not difficult, but we need to wrap the list in I so that its elements are not spread out into individual columns. Here, the common but incorrect way of inserting a list into a data frame is shown first, and the correct method is given at the end.

The incorrect way:
> x1 x2 df df x1 c.1..1. c.2..2. c.3..3. c.4..4. c.5..5. c.6..6. c.7..7. c.8..8. c.9..9. 1 1 1 2 3 ...
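A minimal sketch contrasting the two approaches, using a hypothetical three-element list. Without I, each list element becomes its own column; with I, the list is stored as a single column.

```r
# Hypothetical inputs: three rows, and a list with one element per row
x1 <- 1:3
x2 <- list(1:3, 4:6, 7:9)

# Without I(): every list element is expanded into a separate column
wrong <- data.frame(x1, x2)

# With I(): the list is kept as one list-column
right <- data.frame(x1 = x1, x2 = I(x2))
```

Here `right$x2[[2]]` returns the second list element unchanged.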

Create Bar Graph Using ggplot2 Without Gridlines and Y-Axis Labels in R

Nizamuddin Siddiqui
Updated on 11-Aug-2020 08:12:38

415 Views

A bar graph plotted with the ggplot function of ggplot2 shows horizontal and vertical gridlines. If we are interested only in the bar heights, we might prefer to remove the horizontal gridlines. That way the X-axis still shows the categories of the variable of interest, and the unnecessary information is gone. This can be done by setting the breaks argument to NULL in the scale_y_discrete function.

Example: Consider the below data frame:
> x y df library(ggplot2)
Creating the plot with all gridlines:
> ggplot(df, aes(x, y))+ + geom_bar(stat='identity')
Creating the plot without horizontal gridlines ...
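A hedged sketch of the breaks = NULL idea. It assumes the ggplot2 package is installed and that y is numeric, so it uses scale_y_continuous rather than the scale_y_discrete named in the text; the data frame is hypothetical.

```r
library(ggplot2)

# Hypothetical data: three categories with numeric heights
df <- data.frame(x = c("A", "B", "C"), y = c(4, 7, 2))

# Default bar graph, with gridlines and Y-axis labels
p_default <- ggplot(df, aes(x, y)) +
  geom_bar(stat = "identity")

# breaks = NULL removes the Y-axis breaks, and with them the
# horizontal gridlines and the Y-axis labels
p_clean <- ggplot(df, aes(x, y)) +
  geom_bar(stat = "identity") +
  scale_y_continuous(breaks = NULL)

# print(p_clean) renders the plot on the active graphics device
```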

Convert Data Frame to Data Table in R

Nizamuddin Siddiqui
Updated on 11-Aug-2020 08:10:10

1K+ Views

Since operations on a data.table are sometimes faster than on a data frame, we might want to convert a data frame to a data.table object. One difference between the two is that data frames are available in base R, whereas to use data.table we have to install the data.table package. We can do the conversion with the help of the setDT function in the data.table package.

Example: Consider the below data frame:
> set.seed(1) > x1 x2 x3 x4 x5 df df x1 x2 x3 x4 x5 1 -0.1264538 1.7189774 2 6 9.959193 2 0.6836433 1.5821363 3 4 7.477968 3 -0.3356286 ...
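A small sketch of the conversion, assuming the data.table package is installed. The data frame is a hypothetical stand-in. Note that setDT converts the object in place, so no assignment is needed.

```r
library(data.table)

# Hypothetical data frame
df <- data.frame(x1 = c(1.5, 2.5, 3.5), x2 = c(2L, 3L, 4L))

# Convert by reference: df itself becomes a data.table
setDT(df)

# A data.table is still a data.frame, with the extra class added
converted <- is.data.table(df)
```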

Change Axes Labels Using Plot Function in R

Nizamuddin Siddiqui
Updated on 11-Aug-2020 08:03:29

312 Views

In a plot, the axes labels help us understand the range of the variables for which the plot is created. When creating a plot in R with the plot function, the axes labels are chosen automatically, but we can change them. To do this, we first remove the axes, then add each axis with the labels we want, and finally draw the box around the plot.

Example: Consider the below data:
> x y plot(x, y)
Changing the axes labels for X and Y axes:
> plot(x, y, axes=FALSE)+ + axis(side = 1, at = c(2, 5, 10))+ + ...
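The three steps above (drop the axes, add each axis, draw the box) can be sketched as a small helper. The function name `plot_with_custom_axes` and its arguments are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: draws a scatter plot with user-chosen tick positions
plot_with_custom_axes <- function(x, y, x_at, y_at) {
  plot(x, y, axes = FALSE)    # step 1: plot without any axes
  axis(side = 1, at = x_at)   # step 2a: X-axis with the chosen ticks
  axis(side = 2, at = y_at)   # step 2b: Y-axis with the chosen ticks
  box()                       # step 3: frame around the plot region
  invisible(NULL)
}
```

For example, `plot_with_custom_axes(1:10, (1:10)^2, c(2, 5, 10), c(20, 50, 100))` labels only those positions on each axis. The calls are written as separate statements rather than chained with `+`.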

Get Row or Column Index by Name in R

Nizamuddin Siddiqui
Updated on 11-Aug-2020 08:00:43

787 Views

During an analysis we might prefer to refer to a row or column by its index rather than its name, and we can get that index with the help of the grep function. This is helpful with a large data set, because with a large number of rows and columns it is easier to look up a position by name than to count it by hand. Column indexes are needed most often; row indexes are required only in special cases, such as analysing a particular case.

Example: Consider the below data frame:
> set.seed(1) > x1 x2 x3 x4 x5 df head(df, 20) ...
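A short sketch of looking up positions with grep. The data frame and its row names are hypothetical. Because grep matches regular expressions, the pattern is anchored with `^` and `$` so that a name such as `x1` does not also match `x10`.

```r
# Hypothetical data frame with named rows
df <- data.frame(x1 = 1:3, x2 = 4:6, x10 = 7:9)
rownames(df) <- c("a", "b", "c")

# Anchored pattern: exact match on the column name
col_idx <- grep("^x2$", colnames(df))

# Unanchored pattern: matches both "x1" and "x10"
loose_idx <- grep("x1", colnames(df))

# The same approach works for row names
row_idx <- grep("^b$", rownames(df))
```

For exact names, `which(colnames(df) == "x2")` is an equivalent base R alternative.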
