## How to create a data frame of the maximum value for each group in an R data frame using dplyr?

Updated on 10-Aug-2020 14:06:37
Group-wise maximum values often need to be extracted as a subset during data analysis, and that subset is then used for comparative analysis. The main objective is to compare these maximums with each other or against a threshold value. In R, we can find the group-wise maximum by using the `group_by` and `slice` functions of the dplyr package.

Example: consider the data frame below, with a group column `x` and a value column `y`. The output of `head(df, 20)` starts with group S1 paired with the values 1, 2, 3, 4 ...
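The `group_by` plus `slice` approach named in this excerpt can be sketched as follows. The data values are made up for illustration (the excerpt does not show how `x` and `y` were built), and `group_max` is a hypothetical name.

```r
library(dplyr)

# Hypothetical data: three groups with made-up values (not the article's data).
df <- data.frame(
  x = rep(c("S1", "S2", "S3"), each = 4),
  y = c(1, 2, 3, 4, 8, 6, 7, 5, 10, 12, 11, 9)
)

# Within each group, keep only the row that holds the largest y.
group_max <- df %>%
  group_by(x) %>%
  slice(which.max(y)) %>%
  ungroup()

print(group_max)
```

`which.max` returns the first maximum, so ties within a group yield a single row. In dplyr 1.0.0 and later, `slice_max(y, n = 1)` is an alternative that keeps ties by default.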

## How to deal with warning “removed n rows containing missing values” while using ggplot2 in R?

Updated on 10-Aug-2020 12:16:43
The warning “removed n rows containing missing values” typically occurs when the range specified for the X-axis or Y-axis is narrower than the data, so points outside the limits are dropped. We can set this range in ggplot2 using `scale_x_continuous(limits=c(?, ?))` for the x axis and `scale_y_continuous(limits=c(?, ?))` for the y axis. If the range covers the actual data range there is no warning; otherwise, we get a warning reporting the number of rows removed.

Example: consider a data frame `df` with columns `x` and `y`, created after `set.seed(2)`, with ggplot2 loaded via `library(ggplot2)`. Creating the plot with Y-axis limits from 0 to 5 using `ggplot(df, aes(x, y)) + geom_point() + scale_y_continuous(limits=c(0, 5))` produces the warning message ...
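A minimal sketch of the situation, assuming made-up data (the excerpt does not show how `x` and `y` were generated). It builds one plot whose limits cut off part of the data and two ways of avoiding that; the names `p_narrow`, `p_full`, `p_zoom`, and `n_outside` are hypothetical.

```r
library(ggplot2)

set.seed(2)
# Hypothetical data; the distribution parameters are assumptions.
df <- data.frame(x = rnorm(20), y = rnorm(20, mean = 5, sd = 3))

# Number of points outside the 0..5 limits; these are the rows that
# ggplot2 reports as removed when p_narrow is printed.
n_outside <- sum(df$y < 0 | df$y > 5)

# Limits narrower than the data: printing this plot triggers the warning
# whenever n_outside is greater than zero.
p_narrow <- ggplot(df, aes(x, y)) +
  geom_point() +
  scale_y_continuous(limits = c(0, 5))

# Limits taken from the data, so every point is kept.
p_full <- ggplot(df, aes(x, y)) +
  geom_point() +
  scale_y_continuous(limits = range(df$y))

# Zoom into 0..5 without discarding any rows.
p_zoom <- ggplot(df, aes(x, y)) +
  geom_point() +
  coord_cartesian(ylim = c(0, 5))
```

`coord_cartesian` only changes the visible window, so statistics such as smoothers are still computed on all rows, whereas scale limits drop out-of-range values before any computation.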

## How to join points on a scatterplot with smooth lines in R using plot function?

Updated on 10-Aug-2020 14:04:45
Joining the points of a scatterplot with a smooth line is difficult when the scatter is high, but a smooth curve can reveal a trend that cannot be seen by looking at the points alone. It also helps to judge whether the relationship is linear. We can do this by fitting a loess model and plotting it with the `plot` function.

Example: consider data `x` and `y` generated after `set.seed(3)`, with the model fitted by `loess(formula = y ~ x)`. The output of `summary(Model)` reports: Number of Observations: 10; Equivalent Number of Parameters: 4.77; Residual Standard Error: 8.608; Trace of smoother matrix: 5.27 (exact); Control ...
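One way to draw the loess curve over a base-R scatterplot is sketched below. The data are invented for illustration (the excerpt's own definitions of `x` and `y` are not shown), and 30 points are used rather than the 10 reported in the summary.

```r
set.seed(3)
# Hypothetical data: a noisy linear trend, sorted by x so the fitted
# values can be joined from left to right.
x <- sort(runif(30, min = 0, max = 10))
y <- 2 * x + rnorm(30, sd = 4)

# Fit a local regression with the default span.
Model <- loess(y ~ x)

# Scatterplot of the raw points, then the smooth curve on top.
plot(x, y, main = "Scatterplot with loess curve")
lines(x, fitted(Model), col = "blue", lwd = 2)
```

If `x` is not sorted, order both vectors first (for example with `order(x)`), otherwise `lines` connects the fitted values in data order and the curve zigzags.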

## How to find the standard error of mean in R?

Updated on 10-Aug-2020 14:03:41
The standard error of the mean is the standard deviation divided by the square root of the sample size. The easiest way to find it is to apply this formula directly.

Example: after `set.seed(1)`, we will find the standard errors for a normal random variable, a sequence of numbers from one to one hundred, a random sample, a binomial random variable, and a uniform random variable using the same formula. At the end, I will confirm whether the method was correct for all the types of variables considered here. The first variable, `x`, begins: -0.6264538 0.1836433 -0.8356286 ...
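The formula above translates directly into a small helper. This is a sketch: `std_error` is a hypothetical name, and the sample size of 10 for the normal variable is an assumption because the excerpt is truncated.

```r
# Standard error of the mean: sd(v) / sqrt(n), ignoring missing values.
std_error <- function(v) {
  v <- v[!is.na(v)]
  sd(v) / sqrt(length(v))
}

set.seed(1)
x <- rnorm(10)               # normal random variable (sample size assumed)
se_x <- std_error(x)

se_seq <- std_error(1:100)   # sequence of numbers from one to one hundred

print(se_x)
print(se_seq)
```

Dropping `NA` values before counting matters: `sd(v, na.rm = TRUE) / sqrt(length(v))` would divide by the wrong sample size when values are missing.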

## How to find the inverse of a matrix in R?

Updated on 10-Aug-2020 14:02:10
The inverse of a matrix can be calculated in R with the `solve` function. People who do not use R frequently often reach for an `inv` function instead, but base R has no function called `inv` for inverting a matrix.

Example: consider the matrices below and their inverses. `M1` is the 2×2 matrix with rows (1, 3) and (2, 4), and `solve(M1)` returns the matrix with rows (-2, 1.5) and (1, -0.5). A second matrix, `M2` ...
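A short sketch of the `solve` call for the 2×2 case shown in the excerpt. Building `M1` with `matrix(1:4, nrow = 2)` is an assumption that reproduces the printed rows; the original assignment is not visible in the excerpt.

```r
# Fills column by column, giving rows (1, 3) and (2, 4).
M1 <- matrix(1:4, nrow = 2)

# solve() with a single matrix argument returns its inverse.
M1_inv <- solve(M1)
print(M1_inv)

# Sanity check: a matrix times its inverse is the identity (up to rounding).
print(M1 %*% M1_inv)
```

`solve` stops with an error for a singular matrix, so it can be worth checking `det()` or wrapping the call in `tryCatch` when the input is not known to be invertible.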

## How to include a factor level in bar plot using ggplot2 in R if that level has a frequency zero?

Updated on 10-Aug-2020 13:57:01
In research, we sometimes get a count of zero for a particular level of a factor variable, but we might still want to show that level in the bar plot so that anyone who looks at the plot can see what is missing and compare all the factor levels. In ggplot2, this can be done with the `scale_x_discrete` function.

Example: `df$x` holds the values S1, S2, S3, S4 repeated five times (20 values), with Levels: S1 S2 S3 S4 S5, so level S5 has a frequency of zero. Loading the ggplot2 package with `library(ggplot2)`: now when ...
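A sketch of keeping the empty level on the axis, using data shaped like the printed `df$x` (20 values over S1 to S4, with S5 declared as a level but never observed). The construction of `x` is an assumption; only its printed form appears in the excerpt.

```r
library(ggplot2)

# Factor with five declared levels; S5 never occurs in the data.
x <- factor(
  rep(c("S1", "S2", "S3", "S4"), times = 5),
  levels = c("S1", "S2", "S3", "S4", "S5")
)
df <- data.frame(x = x)

# drop = FALSE tells the discrete scale to keep unused factor levels,
# so S5 gets an empty slot on the X-axis.
p <- ggplot(df, aes(x)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE)

print(p)
```

Without `drop = FALSE`, ggplot2 omits levels that have no rows, and the plot would show only four bars.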

## How to save matrix created in R as tables in a text file with column names same as the matrix?

Updated on 10-Aug-2020 13:55:43
Matrix data sometimes needs to be saved as a table in a text file, mainly for storage reasons. But when we save a matrix to a text file in R, the column names can be misplaced, so we need to take care of those names; this can be done by setting the column names to the desired values.

Example: the matrix `M` has four columns; its first row is 1, 5, 9, 13 and its second row starts with 2 ...
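One possible way to keep the header aligned is to name the columns explicitly and leave row names out of the file. This is a sketch under assumptions: the 4×4 shape of `M`, the column names `A` to `D`, and the temporary output path are all made up for illustration.

```r
# Assumed shape: 4 x 4, filled column by column, so row 1 is 1, 5, 9, 13.
M <- matrix(1:16, nrow = 4)

# Set the column names that should appear in the file (hypothetical names).
colnames(M) <- c("A", "B", "C", "D")

# Write a tab-separated table; omitting row names keeps the header line
# the same width as the data lines.
out_file <- tempfile(fileext = ".txt")
write.table(M, file = out_file, sep = "\t",
            row.names = FALSE, col.names = TRUE, quote = FALSE)

# Read it back to confirm the names and values survived the round trip.
M_back <- as.matrix(read.table(out_file, header = TRUE, sep = "\t"))
print(M_back)
```

With the default `row.names = TRUE`, `write.table` writes a header that is one field shorter than the data rows, which is a common source of shifted column names when the file is opened elsewhere.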

## How to create a bar chart using ggplot2 with facets that are in the order of the data in R?

Updated on 10-Aug-2020 12:01:51
Since visualization is an essential part of data analysis, we should make sure that plots are created in a form that is easy for users to read. Facets in a bar chart help us understand the levels of one factor variable within another factor. To create this type of bar chart, we can use the `facet_grid` function of the ggplot2 package.

Example: consider a data frame `df` with columns `y`, `class`, and `quantity`, created after `set.seed(99)`, with ggplot2 loaded via `library(ggplot2)`. Creating the plot with class on the X-axis and y on the Y-axis without any facet: `ggplot(df, aes(class, y)) + geom_bar(stat="identity")`. Creating the plot with class on the X-axis, y ...
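The excerpt is cut off before it shows how the facet order is controlled, so the following is only one common approach and not necessarily the article's: convert the faceting variable to a factor whose levels follow first appearance in the data. The data values and level names are hypothetical.

```r
library(ggplot2)

set.seed(99)
# Hypothetical data; only the column names come from the excerpt.
df <- data.frame(
  class    = rep(c("A", "B", "C"), times = 3),
  quantity = rep(c("Medium", "High", "Low"), each = 3),
  y        = sample(1:20, 9)
)

# By default facets are ordered alphabetically (High, Low, Medium).
# Setting the levels to unique() keeps the order in which values appear.
df$quantity <- factor(df$quantity, levels = unique(df$quantity))

p <- ggplot(df, aes(class, y)) +
  geom_bar(stat = "identity") +
  facet_grid(. ~ quantity)

print(p)
```

The same factor conversion works with `facet_wrap`, since both faceting functions take their panel order from the factor levels.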

## How to stop printing messages while loading a package in R?

Updated on 10-Aug-2020 11:59:32
Loading a package in R often prints messages that are useful only when we are loading a package for the first time. Since these messages look like output, they can be confusing, especially when we are analysing string data, so we may want to get rid of them.

An example of the messages shown while loading the BSDA package with `library(BSDA)`: Loading required package: lattice; Attaching package: ‘BSDA’; The following object is masked from ‘package:datasets’: Orange.

Here we have some messages while loading the package BSDA, but we might not be interested in them if we are sure that the package is installed ...
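The excerpt stops before naming a remedy; one standard base-R option is `suppressPackageStartupMessages`, sketched below. The call on BSDA is guarded because the package may not be installed, and `noisy_load` is a hypothetical stand-in that emits a startup-style message so the effect can be seen without any extra package.

```r
# Silence startup messages for a real package, if it is available.
if (requireNamespace("BSDA", quietly = TRUE)) {
  suppressPackageStartupMessages(library(BSDA))
}

# Hypothetical stand-in for a chatty package load.
noisy_load <- function() {
  packageStartupMessage("Loading required package: example")
  invisible(TRUE)
}

noisy_load()                                   # prints the message
suppressPackageStartupMessages(noisy_load())   # prints nothing
```

This wrapper only hides messages raised through `packageStartupMessage`; genuine warnings and errors from a failed load still come through, which is usually what you want.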

## How to arrange a list of scatterplots in R using grid.arrange?

Updated on 10-Aug-2020 13:46:12
In predictive modeling, we often have many variables in a data set and want to visualize the relationships among them at the same time. This helps us understand how one variable changes with another, and on that basis we can choose a better modeling technique. To arrange a list of plots, we can use the `grid.arrange` function of the gridExtra package, which lays out plots as needed.

Example: consider a data frame `df` with columns `x1`, `x2`, `x3`, and `x4`, created after `set.seed(10)`; the output of `head(df, 20)` begins ...
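A sketch of building a list of scatterplots and handing it to `grid.arrange`. The column names follow the excerpt, but the data values, the choice of variable pairs, and the names `pairs_to_plot` and `plots` are assumptions made for illustration.

```r
library(ggplot2)
library(gridExtra)

set.seed(10)
# Hypothetical data; the excerpt does not show how the columns were generated.
df <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20), x4 = rnorm(20))

# Variable pairs to plot against each other (an arbitrary selection).
pairs_to_plot <- list(c("x1", "x2"), c("x1", "x3"), c("x1", "x4"), c("x2", "x3"))

# One scatterplot per pair; .data[[...]] looks columns up by name.
plots <- lapply(pairs_to_plot, function(v) {
  ggplot(df, aes(x = .data[[v[1]]], y = .data[[v[2]]])) +
    geom_point() +
    labs(x = v[1], y = v[2])
})

# Pass the whole list through the grobs argument and lay it out in two columns.
grid.arrange(grobs = plots, ncol = 2)
```

Passing the list via `grobs =` avoids spelling out each plot as a separate argument, which keeps the call unchanged when the number of plots grows.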