R Programming Articles
How to identify the difference between Kolmogorov Smirnov test and Chi Square Goodness of fit test in R?
The chi-square goodness-of-fit test checks whether the observed frequencies of a nominal (categorical) variable match a hypothesized distribution, whereas the Kolmogorov-Smirnov test assesses goodness of fit for continuous data only. The difference is a matter of statistics, not of the programming tool.
Example
> x
Output
 [1]  0.078716115 -0.682154062  0.655436957 -1.169616157 -0.688543382
 [6]  0.646087104  0.472429834  2.277750805  0.963105637  0.414918478
[11]  0.575005958 -1.286604138 -1.026756390  2.692769261 -0.835433410
[16]  0.007544065  0.925296720  1.058978610  0.906392907  0.973050503
Example
> ks.test(x, pnorm)
One-sample Kolmogorov-Smirnov test
data: x
D ...
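The contrast between the two tests can be sketched side by side. This is a minimal illustration, not the article's own example: the seed, the simulated samples, and the names colours, observed, ks_result and chisq_result are assumptions made here.

```r
set.seed(1)  # assumed seed, used only to make the sketch reproducible

# Continuous data: one-sample Kolmogorov-Smirnov test against the standard normal
x <- rnorm(20)
ks_result <- ks.test(x, "pnorm")
print(ks_result)

# Nominal data: chi-square goodness-of-fit test against equal category probabilities
colours <- factor(sample(c("red", "green", "blue"), size = 90, replace = TRUE),
                  levels = c("red", "green", "blue"))
observed <- table(colours)
chisq_result <- chisq.test(observed, p = rep(1/3, 3))
print(chisq_result)
```

ks.test works on the raw continuous values, while chisq.test works on the table of category counts.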
How to change the width of whisker lines in a boxplot using ggplot2 in R?
By default, the whisker lines are as wide as the box of the boxplot, but narrowing or widening them can help draw the viewer's attention. This can be done with the width argument of the stat_boxplot function of the ggplot2 package. Check out the example below to understand how it works.
Example
Consider the data frame df below, with columns x and y −
> df
Output
  x y
1 B 5
2 B 4
3 A 6
4 A 9
5 B 2
6 B 4
7 B 6
8 B 2
...
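A minimal sketch of the width argument in use, assuming the ggplot2 package is installed. The simulated data frame and the value 0.2 are illustrative choices made here, not taken from the article.

```r
library(ggplot2)

set.seed(1)  # assumed seed for reproducibility
df <- data.frame(
  x = sample(c("A", "B"), 20, replace = TRUE),
  y = rpois(20, lambda = 5)
)

# stat_boxplot(geom = "errorbar") draws the whisker lines with end caps;
# width sets how wide those caps are
p <- ggplot(df, aes(x, y)) +
  stat_boxplot(geom = "errorbar", width = 0.2) +
  geom_boxplot()
print(p)
```

Drawing the error bars before geom_boxplot keeps the box on top of the whisker line.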
How to create a column in an R data frame that contains the multiplication of two columns?
Sometimes we need the product of two columns stored as a new column so that it can be used in further analysis. For example, to calculate BMI we need mass and height, and the height is squared, so we need the square of height. For this purpose, we can either multiply height by itself or take its square; both approaches work. Hence, if we only have a height column in an R data frame, we can multiply it by itself.
Example
Consider the data frame df below, with columns x, y and z −
> set.seed(957)
> df
Output
  x y z
1 0 ...
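The BMI case described above could look like the following sketch. The column names mass_kg, height_m, height_sq and bmi and the sample values are hypothetical.

```r
# Hypothetical data: mass in kilograms, height in metres
df <- data.frame(
  mass_kg  = c(60, 72.5, 85),
  height_m = c(1.60, 1.75, 1.90)
)

# New column holding the product of two columns (here a column with itself);
# df$height_m^2 gives the same result
df$height_sq <- df$height_m * df$height_m

# The product column can then be used in further calculations
df$bmi <- df$mass_kg / df$height_sq
df
```

The same pattern, df$new <- df$a * df$b, applies to any two numeric columns.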
How to create a residual plot in R with better looking aesthetics?
The default residual plot can be created in base R by plotting the model object, but the result is not very attractive. To create a residual plot with better-looking aesthetics, we can use the resid_panel function of the ggResidpanel package. It is called in the same way as the residual plot in base R, and it shows all the relevant graphs in one window.
Example
Consider the data frame df below, with columns x and y −
> df
Output
            x            y
1  0.48508894  0.217379409
2  0.75113573 -0.657179470
3 -0.13075185 -0.549613217
4 -0.26867557  1.156736294
5  0.40407850  0.640387394
6 -0.23816272 -0.807847198
7 -0.57278583  0.600249694
8 -0.78222676 -0.711133218
9  1.70161645 ...
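As a rough sketch of the two approaches, assuming the ggResidpanel package may or may not be installed: the simulated data, the seed and the name model are assumptions made here, and the ggResidpanel call is guarded so the script still runs without the package.

```r
set.seed(1)  # assumed seed for reproducibility
df <- data.frame(x = rnorm(20))
df$y <- 0.5 * df$x + rnorm(20)

model <- lm(y ~ x, data = df)

# Base R diagnostic plots, arranged in one 2 x 2 window
par(mfrow = c(2, 2))
plot(model)

# ggplot2-based panel of residual plots, only if the package is available
if (requireNamespace("ggResidpanel", quietly = TRUE)) {
  print(ggResidpanel::resid_panel(model))
}
```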
How to create an empty data frame with fixed number of rows and without columns in R?
To create an empty data frame with a fixed number of rows but no columns, we can use the data.frame function together with the matrix function. That is, we create a matrix without any columns using matrix and convert it to a data frame with data.frame, as shown in the examples below.
Example1
> df1
Output
data frame with 0 columns and 10 rows
Example2
> df2
Output
data frame with 0 columns and 100 rows
Example3
> df3
Output
data frame with 0 columns and 39 rows
Example4
> df4
Output
data frame with 0 columns and 20 rows
Example5
> df5
Output
data frame with 0 columns and ...
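One way to write the matrix and data.frame combination described above, as a minimal sketch (the exact call used in the article is not shown in the excerpt, so this form is an assumption):

```r
# A matrix with 10 rows and no columns, converted to a data frame
df1 <- data.frame(matrix(nrow = 10, ncol = 0))
df1

# Rows and columns of the result
dim(df1)
```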
How to get top values of a numerical column of an R data frame in decreasing order?
To get the top values of a column in an R data frame, we can use the head function, and if we want the values in decreasing order, the sort function is also required. We therefore combine head and sort to find the top values in decreasing order. For example, if we have a data frame df that contains a column x, we can find the top 20 values of x in decreasing order with head(sort(df$x, decreasing=TRUE), n=20).
Example
Consider the CO2 data frame in base R −
> str(CO2)
Output
Classes ‘nfnGroupedData’, ‘nfGroupedData’, ‘groupedData’ and 'data.frame': 84 obs. of 5 ...
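Applied to the built-in CO2 data, the head and sort combination might look like this. Choosing the uptake column is an assumption made for illustration.

```r
# Top 20 values of the numeric uptake column, largest first
top20 <- head(sort(CO2$uptake, decreasing = TRUE), n = 20)
top20
```

sort drops the link to the other columns, so this returns only the values, not the matching rows.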
How to find the number of positive values in an R vector?
Positive values are those greater than 0, so we can use this condition with the length function to find the number of positive values in a vector. For example, if we have a vector x that contains some positive and some negative values, we can count the positive ones with the command length(x[x>0]).
Example1
> x1
Output
 [1]  0.21314126  1.23449384 -1.02721325 -0.23168203 -1.36368881 -0.82416287
 [7]  0.31224895 -0.90773340  0.10312288 -0.38914253  0.01196499  0.44875369
[13]  0.40820219  0.70172242 -0.23766272 -0.01023414  1.12403398  0.05837136
[19] -0.67403563 -0.26134292  0.31192384 -1.25116951  0.22115555  0.46544495
[25]  0.76567139  0.76948285 -1.42650924  0.24616899  0.18043015  1.04896235
[31] ...
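A small sketch of the counting command on hypothetical values, together with an alternative that may be safer when the vector contains missing values:

```r
x <- c(0.2, -1.5, 3, 0, -0.7, 4.1)  # hypothetical values

# Subset the positive values, then count them
n_pos <- length(x[x > 0])
n_pos            # 3

# Equivalent: TRUE counts as 1 when summed
sum(x > 0)       # 3

# With a missing value, subsetting keeps the NA, so length() over-counts
x_na <- c(x, NA)
length(x_na[x_na > 0])        # 4
sum(x_na > 0, na.rm = TRUE)   # 3
```

Zero is not counted because the condition is strictly greater than 0.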
How to remove names from a named vector in R?
To assign names to the values of a vector, we can use the names function, and the names can be removed with the unname function. For example, if we have a vector x whose elements have names and we want to remove those names, we can use the command unname(x).
Example1
A vector x1 with names assigned through names(x1) −
> x1
Output
 G  K  N  V  P  F  F  A  P  D  L  N  K  J  V  H  S  L  F  C  M  F  H  T  I  V
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 ...
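Both steps, naming and un-naming, can be sketched with a short hypothetical vector:

```r
x <- 1:5
names(x) <- c("a", "b", "c", "d", "e")  # assign names
x

y <- unname(x)  # copy of x without names; x itself is unchanged
y
names(y)        # NULL
```

To drop the names in place, the result can be assigned back, as in x <- unname(x).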
How to create horizontal legend using ggplot2 in R?
The default legend direction is vertical, but it can be changed to horizontal with the legend.direction argument of the theme function of the ggplot2 package. For example, if we want to create a bar chart with x as categories and y as frequencies that are contained in a data frame df, the bar chart with a horizontal legend for the categories in x can be created as −
ggplot(df, aes(x, y, fill=x))+geom_bar(stat="identity")+theme(legend.direction="horizontal")
Example
Consider the data frame df below, with columns x and y −
> df
Output
  x  y
1 A 27
2 B 25
3 C 28
Loading ggplot2 package and creating the bar ...
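Put together as a runnable sketch, assuming ggplot2 is installed and rebuilding the three-row data frame from the printed output above:

```r
library(ggplot2)

# Values taken from the printed data frame
df <- data.frame(x = c("A", "B", "C"), y = c(27, 25, 28))

p <- ggplot(df, aes(x, y, fill = x)) +
  geom_bar(stat = "identity") +
  theme(legend.direction = "horizontal")  # lay the legend keys out in a row
print(p)
```

legend.direction only changes how the keys are arranged. Moving the legend elsewhere, for example with theme(legend.position = "bottom"), is a separate setting.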
How to set the X-axis labels in histogram using ggplot2 at the center in R?
The boundary argument of the geom_histogram function and the breaks argument of the scale_x_continuous function can be used to place the X-axis labels of a ggplot2 histogram at the center of the bars. The boundary and breaks must be chosen carefully, depending on the scale of the X-axis values. Check out the example below to understand how it works.
Example
Consider the data frame df below, with column x −
> df
Output
    x
1   5
2   7
3   6
4   4
5   7
6   7
7  10
8   3
9   6
10  6
11  5
12  4
13  4
14  6
15  7
16  4
17  1
18 11
...
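For integer-valued data like the column above, one possible combination of boundary and breaks is sketched here. The simulated values, the bin width of 1 and the boundary of -0.5 are assumptions chosen for illustration.

```r
library(ggplot2)

set.seed(1)  # assumed seed for reproducibility
df <- data.frame(x = rpois(50, lambda = 5))

# With binwidth 1 and a bin edge at -0.5, each bin spans k - 0.5 to k + 0.5,
# so integer breaks fall at the center of the bars
p <- ggplot(df, aes(x)) +
  geom_histogram(binwidth = 1, boundary = -0.5, colour = "white") +
  scale_x_continuous(breaks = 0:max(df$x))
print(p)
```

For other scales, the boundary would shift by half of whatever bin width is used.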