A histogram using weights represents the weighted distribution of the values. In R, we can use the weighted.hist function of the plotrix package to create this type of histogram; we just need the values and the weight corresponding to each value. Since plotrix is not part of base R, we must install it using install.packages("plotrix") and then load it in the R environment.
Loading the plotrix package:
library("plotrix")
Consider the below vector and the weights associated with that vector:
Example
x
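A minimal sketch of the idea, assuming hypothetical values `x` and weights `w` that are not taken from the original article. The base R step shows what a weighted histogram summarizes (total weight per value); the plot itself is drawn only if plotrix happens to be installed.

```r
# Hypothetical values and weights (illustrative only)
x <- c(1, 2, 2, 3, 3, 3, 4, 5)
w <- c(0.5, 1, 1, 2, 2, 2, 1.5, 3)

# Base R: total weight attached to each distinct value of x
weighted_counts <- tapply(w, x, sum)
print(weighted_counts)

# Draw the weighted histogram only when plotrix is available
if (requireNamespace("plotrix", quietly = TRUE)) {
  plotrix::weighted.hist(x, w)
}
```

Each bar height reflects the summed weights of the values falling in that bin rather than a simple count.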
To find the critical value for a t test in R, we use the qt function. This function takes a probability (the level of significance for a left-side test) and the degrees of freedom, which for a one-sample t test is the sample size minus one, and returns the tabulated or critical value of the t distribution. The examples below show the calculation of the critical value for different situations such as a left-side test, right-side test or two-sided test.
Left-side critical value with 30 degrees of freedom and 95% confidence level:
Example
qt(0.05, 30)
Output
[1] -1.697261
Right-side critical value with 30 degrees of freedom and 95% confidence level:
Example
abs(qt(0.05, 30))
Output
[1] 1.697261
Example
qt(0.05, 50)
Output
[1] -1.675905
Example
abs(qt(0.05, 50))
Output
[1] 1.675905
Example
qt(0.01, 50)
Output
[1] -2.403272
Example
abs(qt(0.01, 50))
Output
[1] 2.403272
...
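The two-sided case can be sketched as follows, assuming a hypothetical one-sample test with sample size 31 (so 30 degrees of freedom). For a two-sided test the significance level is split across both tails, so the probability passed to qt is 1 - alpha/2.

```r
alpha <- 0.05   # level of significance
n     <- 31     # hypothetical sample size for a one-sample t test
df    <- n - 1  # degrees of freedom expected by qt()

left_crit  <- qt(alpha, df)                      # left-side test
right_crit <- qt(alpha, df, lower.tail = FALSE)  # right-side test
two_sided  <- qt(1 - alpha / 2, df)              # two-sided: reject when |t| exceeds this

print(c(left = left_crit, right = right_crit, two_sided = two_sided))
```

Using lower.tail = FALSE is equivalent to taking abs() of the left-side value because the t distribution is symmetric about zero.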
Data analysis is a difficult task because a big project breaks down into many smaller objectives. One of the smallest tasks could be finding the minimum value in each row of a data frame. For this purpose, we can use the apply function with MARGIN set to 1 and the FUN argument set to min so that we get the row minimums.
Consider the below data frame:
Example
set.seed(101)
x1
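A minimal sketch of the row-minimum calculation, using a small hypothetical data frame (the column names and values are illustrative, not the article's data):

```r
# Hypothetical numeric data frame
df <- data.frame(
  x1 = c(5, 2, 9, 4),
  x2 = c(3, 8, 1, 6),
  x3 = c(7, 0, 4, 5)
)

# MARGIN = 1 applies min() across each row
row_min <- apply(df, 1, min)
print(row_min)
```

apply converts the data frame to a matrix first, so this works as intended only when all columns are numeric.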
To perform the correlation test in R, we use the cor.test function with two variables. It returns several values: the test statistic, the degrees of freedom, the p-value, the confidence interval, and the correlation coefficient. If we want to extract the correlation coefficient from the correlation test output, we can access the estimate component of the result with the $ operator, as shown in the examples below.
Example
x1
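A minimal sketch of the extraction, assuming two hypothetical numeric vectors `x` and `y`:

```r
# Hypothetical paired observations
x <- c(2.1, 3.4, 1.9, 5.6, 4.8, 3.3, 6.1, 2.7)
y <- c(1.8, 3.9, 2.2, 5.1, 5.0, 2.9, 6.4, 3.1)

result <- cor.test(x, y)  # Pearson correlation test by default

# The coefficient is stored in the "estimate" component (named "cor")
r <- result$estimate
print(r)

# Drop the name if only the bare number is needed
print(unname(r))
```

The same pattern works for the other components, for example result$p.value and result$conf.int.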
Binomial data is described by two quantities, the sample size and the number of successes. To find the 95% confidence interval for the proportion, we can use the prop.test function in R, setting the correct argument to FALSE so that the confidence interval is calculated without continuity correction. In the examples below, we find the 95% confidence interval for different values of sample size and number of successes.
Example
prop.test(x=25, n=100, conf.level=0.95, correct=FALSE)
Output
1-sample proportions test without continuity correction
data: 25 out of 100, null probability 0.5
X-squared = 25, df = 1, p-value = 5.733e-07
...
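To pull out just the interval rather than reading it from the printed output, a short sketch (the variable names are illustrative) is:

```r
# Same call as in the example above
res <- prop.test(x = 25, n = 100, conf.level = 0.95, correct = FALSE)

# The interval is stored in the "conf.int" component
ci <- res$conf.int
print(ci)

# Confidence level recorded alongside the interval
print(attr(ci, "conf.level"))
```

Without continuity correction, the one-sample interval returned by prop.test is the Wilson score interval, so it is not symmetric around the sample proportion 0.25.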
When a data frame has a factor column that groups the values of a numerical column, we might want to find the maximum value for each factor level. This helps us compare the factor levels in terms of their maximum. If we also want to keep all the other columns of the data frame for those maximum rows, the aggregate function needs to be used together with the merge function.
Consider the below data frame:
Example
set.seed(78)
Group
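A minimal sketch of the aggregate-then-merge pattern, assuming a hypothetical data frame with columns `Group`, `ID` and `Score`:

```r
# Hypothetical data: two rows per group
df <- data.frame(
  Group = c("A", "A", "B", "B", "C", "C"),
  ID    = 1:6,
  Score = c(10, 15, 7, 12, 20, 18)
)

# Step 1: maximum Score within each Group
max_by_group <- aggregate(Score ~ Group, data = df, FUN = max)

# Step 2: merge back on the shared columns (Group and Score)
# to recover the remaining columns of the matching rows
result <- merge(max_by_group, df)
print(result)
```

If several rows in a group tie for the maximum, merge returns all of them.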
If we have a matrix that contains NA or Inf values and we want the subset of that matrix with only finite values, then the output should contain only the rows that have no NA or Inf values. We can do this in R by using the rowSums and is.finite functions with the negation operator !.
Example
set.seed(999)
M1
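A minimal sketch with a small hypothetical matrix `M`: `!is.finite(M)` flags NA, NaN and infinite cells, rowSums counts the flagged cells per row, and the outer `!` keeps the rows whose count is zero.

```r
# Hypothetical 4 x 3 matrix, filled column by column
M <- matrix(c(1, NA, 3, 4,
              5, 6, Inf, 8,
              9, 10, 11, 12), ncol = 3)

# Keep only rows with no non-finite entries
M_finite <- M[!rowSums(!is.finite(M)), , drop = FALSE]
print(M_finite)
```

drop = FALSE keeps the result a matrix even when only one row survives.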
Indexing tells us the location of a value in a vector. If we have a vector that contains repeated values, we might want to find the last occurrence of each repeated value. For example, if we have a vector x that contains 1, 1, 2, 1, 2 then the last occurrences of the repeated values are 4 and 5, because the last 1 is at the 4th position and the last 2 is at the 5th position. We can find this by using the tapply function in R.
Example
x1
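The example from the text can be sketched as below; grouping the positions by value and taking the maximum of each group is one way to apply tapply here, not necessarily the exact call used in the original.

```r
x <- c(1, 1, 2, 1, 2)

# Group the positions 1..length(x) by the value at each position,
# then take the largest position in every group
last_pos <- tapply(seq_along(x), x, max)
print(last_pos)
```

The names of the result are the distinct values and the entries are their last positions.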
Usually, a point chart is created to assess the relationship or joint movement of two variables, but sometimes the points are scattered in a way that is confusing to read. Hence, data analysts and researchers often visualize this type of graph by joining the points with lines. In ggplot2, this joining can be done by using the geom_line() function.
Consider the below data frame:
Example
set.seed(111)
x
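A minimal sketch, assuming hypothetical columns `x` and `y` and that ggplot2 may or may not be installed (the plot is skipped if it is not):

```r
set.seed(111)

# Hypothetical data: a random walk over 20 time points
df <- data.frame(x = 1:20, y = cumsum(rnorm(20)))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Points show the observations; geom_line() joins them in order of x
  p <- ggplot(df, aes(x = x, y = y)) +
    geom_point() +
    geom_line()
  print(p)
}
```

geom_line() connects observations in order of the x variable; geom_path() would connect them in the order they appear in the data.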
To create a normal random vector, we can use the rnorm function either with the mean and standard deviation arguments or without them, in which case the defaults of mean 0 and standard deviation 1 are used. If we have a different vector, drawn from another distribution or simply holding some numbers, we can pass the mean of that vector as the mean argument of rnorm.
Example
set.seed(101)
x1
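A minimal sketch, assuming a hypothetical source vector `x` drawn from a Poisson distribution (any numeric vector would do):

```r
set.seed(101)

# Hypothetical source vector from a non-normal distribution
x <- rpois(50, lambda = 5)

# Normal random vector centred on the mean of x
y <- rnorm(50, mean = mean(x), sd = 1)

print(mean(x))
print(mean(y))
```

The sample mean of y will be close to mean(x) but not equal to it, since y is a random draw.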