Programming Articles - Page 1805 of 3363

How to change a column in an R data frame with some conditions?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 14:07:43

579 Views

Sometimes the values of one column are related to another column, and we might need to change the values of that column based on some conditions. Making such a change lets us check how it affects the relationship between the two columns under consideration. In R, we can use single square brackets to change the column values.

Example

Consider the below data frame:

> set.seed(1)
> df
  x1 x2       x3
1  4  4 4.462839
2  4  1 3.941181
3 ...
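
As a minimal sketch of the single-square-bracket approach described above, the following uses a small hypothetical data frame (the values and the conditions are assumptions for illustration, not the article's data):

```r
# Hypothetical data frame; values are illustrative only.
df <- data.frame(x1 = c(4, 4, 2, 1, 3),
                 x2 = c(4, 1, 3, 2, 5))

# Where x1 equals 4, set the matching x2 values to 0.
df$x2[df$x1 == 4] <- 0

# The same idea with a row condition and a column name inside df[ , ].
df[df$x1 == 3, "x2"] <- 10

print(df)
```

The logical vector inside the brackets selects the rows to change, so only values that meet the condition are overwritten.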

How to increase the length of an R data frame by repeating the number of rows?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 13:43:01

678 Views

If we strongly believe that new data collection will result in the same type of data, we might want to stretch our data frame in R with more rows. This is not recommended, because the repeated rows are no longer unbiased observations, but it is sometimes done to save the time and money that new data collection would require. In R, we can use rep with the seq_len function to repeat the rows of an R data frame.

Example

Consider the below data frame:

> df
          x1 x2
1     Fruits  2
2 Vegetables  5
...
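
A minimal sketch of the rep and seq_len idea, using a two-row data frame modeled on the rows visible above (the repeat count of 3 and the name df_long are assumptions):

```r
# Two-row data frame modeled on the rows shown in the teaser.
df <- data.frame(x1 = c("Fruits", "Vegetables"),
                 x2 = c(2, 5))

# seq_len(nrow(df)) gives the row indices 1..n; rep() repeats them 3 times,
# and indexing with the repeated indices stacks the rows.
df_long <- df[rep(seq_len(nrow(df)), times = 3), ]

# Reset the row names, which otherwise become 1, 2, 1.1, 2.1, ...
rownames(df_long) <- NULL

print(df_long)
```

Using each = 3 instead of times = 3 would keep the copies of each row next to each other.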

How to find group-wise summary statistics for an R data frame?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 13:37:38

513 Views

To compare different groups, we need the summary statistics for each group, which help us observe the differences between them. The summary statistics provide the minimum, first quartile, median, third quartile, and maximum values, so we can compare each of these across the groups. To find the group-wise summary statistics for an R data frame, we can use the tapply function.

Example

Consider the below data frame:

> set.seed(99)
> head(df, 20)
  x1 x2
1 48 G1
2 33 G2
3 44 G3
4 22 G4
5 99 G5
6 62 G1
7 ...
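
The tapply approach can be sketched as follows. The data frame here is hypothetical (ten illustrative values in five groups), not the article's full data:

```r
# Hypothetical data: a numeric column x1 and a grouping column x2.
df <- data.frame(x1 = c(48, 33, 44, 22, 99, 62, 10, 75, 51, 87),
                 x2 = rep(c("G1", "G2", "G3", "G4", "G5"), times = 2))

# Apply summary() to the x1 values within each level of x2.
group_summary <- tapply(df$x1, df$x2, summary)

# The result holds one summary per group; look one up by group name.
print(group_summary[["G1"]])
```

Printing group_summary as a whole shows the summaries for all groups at once.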

How to find the sum of column values of an R data frame?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 13:24:05

546 Views

An R data frame contains columns that might represent similar types of variables; therefore, we might want to find the sum of the values in each column and compare the columns based on those sums. This can be done with the sum function, but first we need to extract each column.

Example

Consider the below data frame:

> set.seed(1)
> df
           x1         x2         x3        x4        x5       x6       x7
1 -0.62645381 1.41897737 0.83547640 3.9016178 1.4313313 1.879633 2.494043
2  0.18364332 1.28213630 0.74663832 1.4607600 1.8648214 2.542116 4.343039
3 ...
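
A minimal sketch of extracting columns and summing them, on a small hypothetical data frame (the values are assumptions for illustration):

```r
# Hypothetical numeric data frame.
df <- data.frame(x1 = c(1, 2, 3),
                 x2 = c(10, 20, 30))

# Extract one column with $ and sum its values.
sum_x1 <- sum(df$x1)

# Apply sum() to every column to compare them side by side.
all_sums <- sapply(df, sum)

print(sum_x1)
print(all_sums)
```

If a column may contain missing values, sum(df$x1, na.rm = TRUE) ignores them instead of returning NA.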

How to convert all words of a string or categorical variable in an R data frame to uppercase?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 13:19:16

363 Views

Most of the time, the data we get is not in the format we are looking for; therefore, we need to change it according to our needs. When the levels of categorical variables are represented by words instead of numbers, we can convert those levels to lowercase or to uppercase. Sometimes this is done just to make the information look user friendly. Mostly, we find that the values are in lowercase, so we can convert them to uppercase with the help of the sapply function.

Example

Consider the below data frame:

> df
  x1 x2 ...
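
One way this could look is sketched below on a hypothetical data frame. It uses sapply to find the text columns and lapply with toupper to convert them; lapply is used for the assignment as an assumption of convenience, since it keeps the result as a list of columns:

```r
# Hypothetical data frame with two text columns and one numeric column.
df <- data.frame(x1 = c("india", "china"),
                 x2 = c("low", "high"),
                 x3 = c(1, 2),
                 stringsAsFactors = FALSE)

# sapply() returns one TRUE/FALSE per column: is it a character column?
is_text <- sapply(df, is.character)

# Convert only the text columns to uppercase; numeric columns stay as they are.
df[is_text] <- lapply(df[is_text], toupper)

print(df)
```

Replacing toupper with tolower converts in the other direction.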

How to select rows with group wise minimum or maximum values of a variable in an R data frame using dplyr?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 13:15:01

2K+ Views

If an R data frame contains a group variable that has many group levels, finding the minimum and maximum values of a discrete or continuous variable for each group level becomes difficult. But this can be done with the slice function in the dplyr package.

Consider the below data frame that has one group variable and continuous as well as discrete variables:

> set.seed(2)
> df
  x1 x2 x3 x4 x5       x6  x7 Group
1 85  8 14  7  8 2.900301 749     1
2 79  7 12  4  3 3.331022 200     2
...
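
A minimal sketch of group-wise selection with slice, assuming the dplyr package is installed. The data frame is hypothetical, and pairing slice with which.min and which.max is one common pattern rather than necessarily the article's exact code:

```r
library(dplyr)

# Hypothetical data: three groups with two rows each.
df <- data.frame(Group = c(1, 1, 2, 2, 3, 3),
                 x1    = c(85, 79, 12, 40, 7, 3))

# Within each group, keep the row where x1 is smallest.
min_rows <- df %>%
  group_by(Group) %>%
  slice(which.min(x1)) %>%
  ungroup()

# Within each group, keep the row where x1 is largest.
max_rows <- df %>%
  group_by(Group) %>%
  slice(which.max(x1)) %>%
  ungroup()

print(min_rows)
print(max_rows)
```

which.min and which.max return only the first position on ties, so each group contributes exactly one row.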

How to split a data frame in R into multiple parts randomly?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 13:01:29

2K+ Views

When a data frame is large, we can split it into multiple parts randomly. This might be required when we want to analyze only part of the data at a time. We can do this with the split function, using the sample function to assign the rows at random.

Example

Consider the trees data in base R:

> str(trees)
'data.frame': 31 obs. of 3 variables:
 $ Girth : num 8.3 8.6 8.8 10.5 10.7 10.8 11 11 11.1 11.2 ...
 $ Height: num 70 65 63 72 81 83 66 75 80 75 ...
 $ Volume: num 10.3 10.3 10.2 16.4 18.8 19.7 15.6 18.2 22.6 19.9 ...
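
The split and sample combination can be sketched on the built-in trees data as follows. The number of parts, the seed, and the names n_parts, part_id, and parts are assumptions for illustration:

```r
# Fix the seed so the random assignment is reproducible.
set.seed(123)

n_parts <- 3

# Build balanced part labels (1, 2, 3, 1, 2, 3, ...) and shuffle them,
# so every row of trees gets a random part number.
part_id <- sample(rep(seq_len(n_parts), length.out = nrow(trees)))

# split() returns a list with one data frame per part.
parts <- split(trees, part_id)

# Number of rows in each part.
print(sapply(parts, nrow))
```

Each row lands in exactly one part, and the part sizes differ by at most one row.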

How to convert empty values to NA in an R data frame?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 12:58:17

751 Views

When our data has empty values, it is difficult to perform the analysis, so we might want to convert those empty values to NA in order to see how many values are not available. This can be done by using single square brackets.

Example

Consider the below data frame that has some empty values:

> df
x1 x2 x3
1 1 2 5
2 2 2 5
3 3 2 4
4 1 2 4
5 2 4 4
6 3 4 4
7 1 4 4
8 2 4 2
9 3 2
10 1 2
11 2
12 3
13 1 4
14 2 4
15 3 4
16 4
17
18
19 2
20 1

Converting empty values to NA:

> df[df == ""] <- NA
> df
x1 x2 x3
1 1 2 5
2 2 2 5
3 3 2 4
4 1 2 4
5 2 4 4
6 3 4 4
7 1 4 4
8 2 4 2
9 3 2
10 1 2
11 2
12 3
13 1 4
14 2 4
15 3 4
16 4
17
18
19 2
20 1
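
A self-contained sketch of the same conversion on a small hypothetical data frame (the values are assumptions; the columns are stored as text so that empty strings can occur):

```r
# Hypothetical data frame with empty strings standing in for missing values.
df <- data.frame(x1 = c("1", "2", ""),
                 x2 = c("2", "", "4"),
                 stringsAsFactors = FALSE)

# df == "" is a logical matrix marking the empty cells;
# assigning NA through it replaces exactly those cells.
df[df == ""] <- NA

# Count the missing values per column.
print(colSums(is.na(df)))
```

After the conversion, functions such as is.na and complete.cases recognize the missing entries.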

How to remove empty rows from an R data frame?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 12:54:43

730 Views

During a survey or any other form of data collection, getting all the information from all units is not possible. Sometimes we get partial information and sometimes nothing. Therefore, it is possible that some rows in our data are completely blank and some have partial data. The blank rows can be removed, and the other empty values can be filled in with methods that help to deal with missing information.

Example

Consider the below data frame, which has some missing rows and some missing values:

> df
  x1 x2 x3
1  1  2  5
2  2  2  5
...
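
The teaser does not show the article's method, so the following is only one possible base R approach, on a hypothetical data frame: a row is treated as empty when every cell is NA or an empty string.

```r
# Hypothetical data: row 2 is all empty strings, row 4 is all NA,
# and row 3 is only partially filled.
df <- data.frame(x1 = c("1", "", "3", NA),
                 x2 = c("2", "", "", NA),
                 stringsAsFactors = FALSE)

# Logical matrix: TRUE where a cell is NA or an empty string.
is_blank <- is.na(df) | df == ""

# Keep rows that have at least one non-blank cell.
df_clean <- df[rowSums(is_blank) < ncol(df), ]

print(df_clean)
```

Partially filled rows are kept, so their remaining gaps can still be handled with a missing-data method later.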

How to select columns in R based on the string that matches with the column name using dplyr?

Nizamuddin Siddiqui
Updated on 11-Aug-2020 12:48:03

1K+ Views

Selection of columns in R is generally done with the column number, or with the column name and the $ operator. We can also select columns by a partial or complete name string without using the $ operator. This can be done with the select and matches functions of the dplyr package.

Example

Loading dplyr package:

> library(dplyr)

Consider the BOD data in base R:

> str(BOD)
'data.frame': 6 obs. of 2 variables:
 $ Time  : num 1 2 3 4 5 7
 $ demand: num 8.3 10.3 19 16 15.6 19.8
 - attr(*, "reference")= chr "A1.4, p. 270"

Selecting the column of BOD ...
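
A minimal sketch of select with matches on the BOD data, assuming dplyr is installed. The patterns "dem" and "^Time$" and the result names are assumptions for illustration:

```r
library(dplyr)

# Select columns whose names contain the partial string "dem".
demand_col <- BOD %>% select(matches("dem"))

# matches() takes a regular expression, so anchors select an exact name.
time_col <- BOD %>% select(matches("^Time$"))

print(names(demand_col))
print(names(time_col))
```

matches ignores case by default; passing ignore.case = FALSE makes the pattern case sensitive.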
