Server Side Programming Articles - Page 1657 of 2646

How to remove empty rows from an R data frame?

R Programming Server Side Programming Programming

Updated on 11-Aug-2020 12:54:43

746 Views

Data collected through surveys or other means is rarely complete: some units return partial information and some return nothing. As a result, a data frame can contain rows that are completely blank alongside rows with only a few missing values. The blank rows can be removed, and the remaining missing values can be handled with methods designed for missing data.

Example

Consider the data frame below, which has some missing rows and some missing values:

> df
  x1 x2 x3
1  1  2  5
2  2  2  5
...
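
The idea of dropping completely blank rows can be sketched as follows. The data frame here is hypothetical (it is not the one from the excerpt), and the sketch assumes that blank cells are stored as NA rather than as empty strings.

```r
# Hypothetical data: row 3 is entirely NA, row 2 is only partially NA.
df <- data.frame(x1 = c(1, 2, NA, 4),
                 x2 = c(2, NA, NA, 5),
                 x3 = c(5, 5, NA, 7))

# Count the NA cells in each row and keep the rows where
# at least one cell is not missing.
df_clean <- df[rowSums(is.na(df)) != ncol(df), ]

print(df_clean)
```

Row 2 is kept because it still carries some information; only the fully empty row 3 is removed.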

How to select columns in R based on the string that matches with the column name using dplyr?

Nizamuddin Siddiqui

Updated on 11-Aug-2020 12:48:03

1K+ Views

Columns in R are usually selected by position or by name with the $ operator. They can also be selected by a partial or complete name string, without the $ operator, using the select and matches functions of the dplyr package.

Example

Loading the dplyr package:

> library(dplyr)

Consider the BOD data in base R:

> str(BOD)
'data.frame': 6 obs. of 2 variables:
 $ Time  : num 1 2 3 4 5 7
 $ demand: num 8.3 10.3 19 16 15.6 19.8
 - attr(*, "reference")= chr "A1.4, p. 270"

Selecting the column of BOD ...
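
A minimal sketch of selecting by a partial name, assuming the dplyr package is installed. The pattern "dem" and the variable name demand_only are chosen here for illustration and are not taken from the truncated excerpt.

```r
library(dplyr)

# BOD is a built-in data set with the columns Time and demand.
# matches() takes a regular expression and keeps the columns
# whose names match it.
demand_only <- select(BOD, matches("dem"))

names(demand_only)
```

Because matches() works on regular expressions, a complete column name such as "demand" can be passed in the same way.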

How to select the first row for each level of a factor variable in an R data frame?

Nizamuddin Siddiqui

Updated on 11-Aug-2020 12:37:42

648 Views

Comparing rows is an important part of data analysis. We may compare one variable with another, one value with another, one case or row with another, or even one complete data set with another, in order to check that the data values are accurate and consistent. To do so, we first need to select the required rows or columns. To select the first row for each level of a factor variable, we can negate the duplicated function with the ! operator.

Example

Consider the data frame below:

> head(df, 20)
   x1 ...
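
The following sketch shows how !duplicated() picks the first row of each factor level. The data frame is hypothetical and much smaller than the one in the excerpt.

```r
# Hypothetical data: x1 is a factor with levels A, B and C.
df <- data.frame(x1 = factor(c("A", "A", "B", "C", "B", "C")),
                 x2 = 1:6)

# duplicated() is TRUE for every repeat of a value already seen,
# so negating it keeps only the first occurrence of each level.
first_rows <- df[!duplicated(df$x1), ]

print(first_rows)
```

The result keeps the original row order, so "first" means first in the data frame as it currently stands, not first after sorting.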

How to create a line chart for all columns of a data frame in R?

Nizamuddin Siddiqui

Updated on 11-Aug-2020 12:33:23

281 Views

To check the trend in every column of a data frame, we can create a line chart for each column. These charts show how the data points rise or fall. Once we know the trend, we can look for the reasons behind it and take appropriate action. The plot.ts function, which plots data as a time series, draws a line chart for each column.

Example

Consider the data frame below:

> set.seed(1)
> head(df, 20)
   x1 x2 x3 x4 x5 x6 ...
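
As a rough sketch, plot.ts can be called directly on a numeric data frame. The three columns below are made up for illustration, since the column definitions in the excerpt are not recoverable.

```r
set.seed(1)

# Hypothetical numeric columns; the row order is treated as time.
df <- data.frame(x1 = rnorm(20),
                 x2 = rpois(20, lambda = 5),
                 x3 = runif(20))

# plot.ts converts the data frame to a multivariate time series
# and draws one line-chart panel per column.
plot.ts(df, main = "Line chart for each column")
```

All columns need to be numeric for this to be meaningful, and the stacked-panel layout is limited to a small number of series.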

How to find the index of the minimum and maximum value of a vector in R?

Nizamuddin Siddiqui

Updated on 11-Aug-2020 12:26:53

491 Views

During data exploration, we sometimes need to find the index of particular values, most often the minimum and maximum, to check whether the corresponding data row carries crucial information or can be neglected. If we do not want to neglect these values, they are sometimes transformed to other values based on the characteristics of the data.

Example

> which(x==min(x))
[1] 1
> which(x==max(x))
[1] 25
> set.seed(2)
> x1
 [1] 85 79 70  6 32  8 17 93 81 76 41 50 75 65  3 80 96 50 55
[20] 63  8 33 ...
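
The which(x == min(x)) idiom from the example returns every position that holds the extreme value. Base R also provides which.min and which.max, which return only the first such position. The vector below is hypothetical and contains ties to make the difference visible.

```r
# Hypothetical vector in which both the minimum and maximum are tied.
x <- c(7, 3, 9, 3, 9)

all_min <- which(x == min(x))   # every index of the minimum: 2 4
all_max <- which(x == max(x))   # every index of the maximum: 3 5

first_min <- which.min(x)       # first index of the minimum: 2
first_max <- which.max(x)       # first index of the maximum: 3
```

Which form to use depends on whether tied rows should all be inspected or only the first one.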

How to find the number of days and number of weeks between two dates in R?

Nizamuddin Siddiqui

Updated on 11-Aug-2020 09:20:30

460 Views

Time series are among the most common kinds of data in analysis, and they often contain dates alongside other variables. We may want to find the difference between two points in time to see how many days or weeks separate them. This can be done with the difftime function.

Example

> difftime(strptime("25/07/2021", format = "%d/%m/%Y"),
+ strptime("25/07/2020", format = "%d/%m/%Y"), units="weeks")
Time difference of 52.14286 weeks
> difftime(strptime("25.07.2021", format = "%d.%m.%Y"),
+ strptime("25.07.2020", format = "%d.%m.%Y"), units="weeks")
Time difference of 52.14286 weeks
> difftime(strptime("25.07.2021", format = "%d.%m.%Y"),
+ strptime("25.07.2020", format = ...
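
The same calculation can be sketched for both days and weeks. This version parses the strings with as.Date instead of strptime, a substitution made here so that the result does not depend on the local time zone. The variable names are illustrative.

```r
# Parse the two dates from the example; as.Date carries no time of day.
start <- as.Date("25/07/2020", format = "%d/%m/%Y")
end   <- as.Date("25/07/2021", format = "%d/%m/%Y")

# The units argument controls how the difference is reported.
days_between  <- difftime(end, start, units = "days")
weeks_between <- difftime(end, start, units = "weeks")

as.numeric(days_between)   # 365
as.numeric(weeks_between)  # 365 / 7, about 52.14286
```

Wrapping the result in as.numeric drops the "Time difference of" label when only the number is needed.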

How to extract the regression coefficients, standard error of coefficients, t scores, and p-values from a regression model in R?

Nizamuddin Siddiqui

Updated on 11-Aug-2020 09:17:33

973 Views

Regression output in R contains many values, but if we believe the model is good enough we may want to extract only the coefficients, standard errors, and t-scores or p-values, because these are the values that ultimately matter. The coefficients are especially important because they let us interpret the model. We can extract these values from the regression model summary with the $ operator.

Example

Consider the data below:

> set.seed(99)
> Regression_Model <- lm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7)
> summary(Regression_Model)

Call:
lm(formula = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7)
...
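
A minimal sketch of the extraction, using a smaller hypothetical model with two predictors rather than the seven in the excerpt. The simulated data and the name coef_table are assumptions made for illustration.

```r
set.seed(99)

# Hypothetical data: y depends linearly on x1 and x2 plus noise.
x1 <- rnorm(50)
x2 <- rnorm(50)
y  <- 2 + 3 * x1 - x2 + rnorm(50)

Regression_Model <- lm(y ~ x1 + x2)

# summary()$coefficients is a matrix with one row per term.
coef_table <- summary(Regression_Model)$coefficients

estimates  <- coef_table[, "Estimate"]     # regression coefficients
std_errors <- coef_table[, "Std. Error"]   # standard errors
t_scores   <- coef_table[, "t value"]      # t scores
p_values   <- coef_table[, "Pr(>|t|)"]     # p-values
```

Indexing by column name rather than by position keeps the code readable and guards against mixing up the columns.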

How to create a new column with a subset of row sums in an R data frame?

Nizamuddin Siddiqui

Updated on 11-Aug-2020 09:12:07

1K+ Views

One situation that comes up in data analysis is creating a new column that holds the row sums of only some of the rows. The sums are repeated so that the number of values equals the number of rows in the data frame. We can use rowSums together with rep to create such a column.

Example

Consider the data frame below:

> set.seed(99)
> df
          x1 x2       x3          x4 x5
1  0.7139625  4 9.321058  0.33297863  4
2  0.9796581  2 4.298837 -1.47926432 11
3  0.5878287 ...
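
One possible reading of this technique is sketched below. The data frame, the choice of the first three rows, and the column name Sum are all hypothetical, and the sketch assumes the number of rows is an exact multiple of the number of sums.

```r
# Hypothetical data frame with six rows.
df <- data.frame(x1 = 1:6, x2 = 7:12)

# Row sums for the first three rows only: 8, 10, 12.
subset_sums <- unname(rowSums(df[1:3, ]))

# Repeat the three sums until they fill all six rows.
df$Sum <- rep(subset_sums, times = nrow(df) / length(subset_sums))

print(df)
```

The unname call drops the row labels that rowSums attaches, so the new column holds plain numbers.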

How to deal with error “undefined columns selected when subsetting data frame” in R?

Nizamuddin Siddiqui

Updated on 11-Aug-2020 09:03:47

53K+ Views

The error “undefined columns selected when subsetting data frame” means that R cannot find the columns you asked for while subsetting the data frame. This usually happens when we forget the comma while subsetting with single square brackets.

Example

Consider the data frame below:

> set.seed(99)
> df
          x1 x2       x3          x4 x5
1  0.7139625  4 9.321058  0.33297863  4
2  0.9796581  2 4.298837 -1.47926432 11
3  0.5878287  3 7.389898 -0.07847958  5
4  0.9438585  4 7.873764 -1.35241100  6
5  0.1371621  2 5.534758 -1.17969925  4
6  0.6226740  4 8.786676 -1.15705659  5
7 -0.3638452  1 ...
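
The missing-comma case can be reproduced with a small hypothetical data frame. The sketch catches the error with tryCatch so the script keeps running, and then shows the corrected call.

```r
# Hypothetical data frame with five rows and two columns.
df <- data.frame(x1 = 1:5, x2 = 6:10)

# Without a comma, the logical vector is treated as a column
# selector. It has five elements but df has only two columns,
# so R reports undefined columns.
result <- tryCatch(df[df$x1 > 2],
                   error = function(e) conditionMessage(e))
print(result)

# With the comma, the condition selects rows and all columns are kept.
rows <- df[df$x1 > 2, ]
print(rows)
```

The rule of thumb is that df[i] selects columns while df[i, ] selects rows.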

How to combine lists in R?

Nizamuddin Siddiqui

Updated on 11-Aug-2020 08:56:18

406 Views

When we have multiple lists that hold similar types of data, we may want to combine or merge them. This is useful because we can then perform calculations on one list instead of applying them to several. We can combine multiple lists with the mapply function.

Example

Consider the lists below:

> List1
[[1]]
[1] "a" "b" "c" "d" "e"

[[2]]
[1] 1 2 3 4 5

[[3]]
[1] 5 4 3 2 1

[[4]]
[1] 25

[[5]]
...
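
A sketch of one way mapply can merge lists, using two short hypothetical lists rather than the ones in the excerpt. It assumes the lists have the same length and that matching elements hold the same type of data. A plain c() call is shown alongside for contrast.

```r
# Hypothetical lists with matching structure.
List1 <- list(c("a", "b"), 1:3)
List2 <- list(c("c", "d"), 4:6)

# mapply applies c() to the first elements, then to the second
# elements, and so on. SIMPLIFY = FALSE keeps the result a list.
combined <- mapply(c, List1, List2, SIMPLIFY = FALSE)

# c() on its own appends one list after the other instead.
appended <- c(List1, List2)

print(combined)
```

Here combined has two elements, each holding the merged values, while appended has four elements.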