R Programming Articles
How to deal with the error "Error in int_abline---plot.new has not been called yet" in R?
This error means that no plot has been created yet, so the abline function has nothing to draw on. A plot must be created first, and only then can abline add a line or anything else to it. abline is mostly used to draw a regression line, so we need to create a scatterplot before calling it.
Example
    abline(lm(y~x))
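A minimal sketch of the fix, assuming two simulated numeric vectors x and y (hypothetical data, not from the original article): the scatterplot is drawn first, so abline has an open plot to draw on.

```r
# Hypothetical data: x and y are simulated only for illustration.
set.seed(1)
x <- rnorm(20)
y <- 2 * x + rnorm(20)

# Create the scatterplot first; this is the step the error says is missing.
plot(x, y)

# Now abline can add the fitted regression line to the existing plot.
fit <- lm(y ~ x)
abline(fit)
```

Calling abline(fit) before plot(x, y) in a fresh session would be expected to reproduce the "plot.new has not been called yet" error.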
How to create a correlation matrix plot in R?
To create a correlation matrix plot, we can use the ggpairs function of the GGally package. For example, if we have a data frame called df that contains five columns, the correlation matrix plot can be created as ggpairs(df). A correlation matrix plot created with ggpairs displays the correlation values and the scatterplots, with the distribution of each variable on the diagonal.
Example
    library(GGally)
    ggpairs(df)
How to create a boxplot using ggplot2 for a single variable without X-axis labels in R?
The important part of a boxplot is the Y-axis because it shows the variability in the data, so we can remove the X-axis labels if we already know what the data describes. To create a boxplot using ggplot2 for a single variable without X-axis labels, we can use the theme function and set the X-axis title, text, and ticks to blank as shown in the below example.
Example
    library(ggplot2)
    ggplot(df,aes(x=factor(0),y))+
      geom_boxplot()+
      theme(axis.title.x=element_blank(),
            axis.text.x=element_blank(),
            axis.ticks.x=element_blank())
How to perform the Shapiro test for all columns in an R data frame?
The Shapiro-Wilk test is used to test variables for normality, and its null hypothesis is that the variable is normally distributed. If we have numerical columns in an R data frame, we might want to check the normality of all the variables. This can be done with the help of the apply function and shapiro.test as shown in the below example.
Example
    apply(df, 2, shapiro.test)
Output
    $x1
    Shapiro-Wilk normality test
    data: newX[, i]
    W = 0.94053, p-value = 0.2453

    $x2
    Shapiro-Wilk normality test
    data: newX[, i]
    W = 0.95223, p-value = 0.4022

    $x3
    Shapiro-Wilk normality test
    data: newX[, i]
    W = ...
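As a hedged variation on the same idea, the sketch below assumes a hypothetical data frame with three simulated numeric columns and uses lapply instead of apply, so each result stays attached to its column name. Collecting only the p-values is an illustrative choice, not part of the original example.

```r
# Hypothetical data frame: the columns are simulated only for illustration.
set.seed(42)
df <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = runif(20))

# Run shapiro.test on every column; lapply returns a list named by column.
tests <- lapply(df, shapiro.test)

# Pull out just the p-values into a named numeric vector.
p_values <- sapply(tests, function(t) t$p.value)
print(p_values)
```

A p-value below the chosen significance level (commonly 0.05) would lead to rejecting the null hypothesis of normality for that column.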
How to divide the data frame rows in R by row standard deviation?
To divide the data frame row values by the row standard deviation in R, we can follow the below steps:
- First of all, create a data frame.
- Then, use the apply function to divide the data frame row values by the row standard deviation.
Creating the data frame
Let's create a data frame df with columns x and y. On executing, the script generates the below output (this output will vary on your system due to randomization):
          x     y
    1  1.48  0.86
    2 -0.14 -0.58
    3 -0.25  1.22
    4  0.18  0.25
    5  0.50  0.68
    6 -1.34 -0.21
    ...
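The two steps above can be sketched as follows. The data frame here is simulated (its values are an assumption and will not match the output shown), and wrapping the result with t() and as.data.frame() is one possible way to restore the original row layout.

```r
# Step 1: create a data frame (hypothetical simulated values).
set.seed(123)
df <- data.frame(x = round(rnorm(6), 2), y = round(rnorm(6), 2))

# Step 2: apply with MARGIN = 1 visits one row at a time and divides
# that row by its own standard deviation. apply returns each row's
# result as a column, so t() transposes back to the row layout.
scaled <- as.data.frame(t(apply(df, 1, function(row) row / sd(row))))
print(scaled)
```

After this transformation every row of scaled has a standard deviation of 1, which is a quick way to check the result.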
How to create a sample or samples using probability distribution in R?
A probability distribution assigns a probability to each value that a variable can take. For example, if a variable X takes the three values 1, 2, and 3 with probabilities 0.25, 0.50, and 0.25 respectively, then the function that gives the probability of occurrence of each value of X is called its probability distribution. In R, we can create a sample or samples using a probability distribution if we have predefined probabilities for each value or by using known distributions such as ...
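A minimal sketch of both approaches, using the values and probabilities from the example above. The sample sizes are arbitrary, and the normal distribution (rnorm) is an assumed stand-in for a "known distribution" because the original text is cut off before naming one.

```r
set.seed(10)

# Predefined probabilities: X takes 1, 2, 3 with probability 0.25, 0.50, 0.25.
values <- c(1, 2, 3)
probs  <- c(0.25, 0.50, 0.25)

# Draw 1000 values with replacement according to those probabilities.
draws <- sample(values, size = 1000, replace = TRUE, prob = probs)

# The observed proportions should be close to probs.
print(prop.table(table(draws)))

# Known distribution (assumed example): five draws from a standard normal.
normal_draws <- rnorm(5, mean = 0, sd = 1)
print(normal_draws)
```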
How to subset a named vector based on names in R?
To subset a named vector based on names, we can follow the below steps:
- Create a named vector.
- Subset the vector using grepl.
Create the named vector
Let's create a named vector as shown below:
    V ...
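The steps above can be sketched as follows. The vector v, its names, and the pattern "apple" are all hypothetical, chosen only to show how grepl selects elements by name.

```r
# Step 1: create a named vector (hypothetical names and values).
v <- c(apple_1 = 10, banana_1 = 20, apple_2 = 30, cherry_1 = 40)

# Step 2: grepl returns TRUE for every name that matches the pattern,
# and that logical vector is used to index v.
subset_v <- v[grepl("apple", names(v))]
print(subset_v)
```

Because grepl matches a regular expression, a pattern such as "_1$" could be used in the same way to keep only the names ending in "_1".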
How to set the number of digits to be printed for summary command without using options(digits) in R?
To set the number of digits to be printed for the summary command without using options(digits), we can use the digits argument while printing the summary.
Example 1
Using the mtcars data and finding the summary statistics with the number of digits set to 2:
    summary(mtcars, digits=2)
On executing, the above script generates the below output:
Output
         mpg          cyl          disp          hp          drat
     Min.   :10   Min.   :4.0   Min.   : 71   Min. ...
How to find the summary by categorical variable in R?
To find the summary by categorical variable, we can follow the below steps:
- Use inbuilt data sets or create a new data set.
- Find the summary statistics with the by function.
Use inbuilt data set
Let's consider the mtcars data set in base R:
    data(mtcars)
    head(mtcars, 25)
On executing, the above script generates the below output:
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4  21.0   6 160.0 110 3.90 2.620 16.46 ...
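The second step can be sketched as below. Treating cyl as the categorical variable and summarising only the mpg and hp columns are assumptions made for illustration, since the excerpt does not show which columns the original article used.

```r
data(mtcars)

# Summarise mpg and hp separately for each value of cyl.
# by() splits the data frame by the grouping vector and applies
# summary() to each piece.
by_cyl <- by(mtcars[, c("mpg", "hp")], mtcars$cyl, summary)
print(by_cyl)
```

The result holds one summary table per group, here one for each distinct number of cylinders.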
How to filter data frame by categorical variable in R?
To filter a data frame by a categorical variable in R, we can follow the below steps:
- Use inbuilt data sets or create a new data set and look at the top few rows in the data set.
- Then, look at the bottom few rows in the data set.
- Check the data structure.
- Filter the data by categorical column using the split function.
Use inbuilt data set
Let's consider the CO2 data set in base R:
    data(CO2)
    head(CO2, 10)
On executing, the above script generates the below output:
    Grouped Data: uptake ~ conc | Plant
       Plant Type Treatment conc uptake
    1 ...
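The steps above can be sketched as follows. Using Type as the categorical column is an assumption, since the excerpt ends before showing which column the original article split on.

```r
data(CO2)

# Look at the top and bottom rows, then check the structure.
head(CO2, 10)
tail(CO2, 10)
str(CO2)

# split() returns a list with one data frame per level of Type.
by_type <- split(CO2, CO2$Type)
print(names(by_type))

# Keep only the rows for one category.
quebec <- by_type[["Quebec"]]
head(quebec)
```

Each element of by_type contains only the rows for one level of Type, so indexing the list by name acts as a filter on that category.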