When we create a histogram and save it in an object name then we can extract the frequencies as count for the mid values or breaks by calling that object. We can consider that mid values or breaks obtained by the object are the actual value against which the frequencies are plotted on the histogram.Examples> x1 Histogram1 Histogram1Output$breaks [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 $counts [1] 45 82 150 156 172 142 113 62 43 20 9 5 1 $density [1] 0.045 0.082 0.150 0.156 0.172 0.142 0.113 0.062 0.043 0.020 ... Read More
When we create a plot, it shows the values passed by the function for creating the plot but we might want to display some other values to provide some information through the plot and that information could be a threshold value as a horizontal line or we can also call it a cut off value. This can be done by using geom_hline function of ggplot2 package.ExampleConsider the below data frame −> x y df dfOutput x y 1 0.27810573 2.6545571 2 1.39185082 3.4845292 3 -0.19068920 1.7043852 4 1.00791317 1.4324814 5 -1.74964913 1.7996093 6 -0.13123079 2.5004350 ... Read More
A string vector can contain any value including spaces. Sometimes a string vector is read with some spaces as well and we want to split the vector then extract few values. For example, if a string has “ABC 123” then we might want to extract the number 123 so that we can use it in analysis. If the string vector has strings of equal sizes then it can be easily done with the help of substr function.Examples> x1 x1 [1] "1 00" "1 01" "1 02" "1 03" "1 03" "1 04" > Numeric_Last_two_x1 Numeric_Last_two_x1 [1] "00" "01" "02" "03" ... Read More
Often, we have duplicate values in a factor column that means a factor column has many levels and each of these levels occur many times. In this situation, if we have a frequency column then we want to find the total of that frequency based on the values of a factor column and this can be done by using aggregate function.Example Live DemoConsider the below data frame −> set.seed(109) > Class Frequency df1 df1Output Class Frequency 1 E 9 2 D 5 3 B 10 ... Read More
There are many units of measurements for a single object or item. For example, weight can be measured in milligrams, grams, kilograms, tons, oz, lbs, etc. Now suppose we have two variables that belong to the same unit of measurement as weight of a coca cola cans and weight of apple juice, if the weights given for both of these variables have different units like one having grams and the other having oz then we might want to convert one of them. This will help us to compare both the variables easily without conflicting the scale of measurements. Therefore, we ... Read More
Creating a numeric vector is the first step towards learning R programming and there are many ways to do that but if we want to generate a sequence of number then it is a bit different thing, not totally different. We can create a vector with a sequence of numbers by using − if the sequence of numbers needs to have only the difference of 1, otherwise seq function can be used.Example Live Demo> x1 x1Output[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 ... Read More
When we create a bar plot or any other plot with legend, the background of the legend is white but it can be changed to any color with the help of scales package. We can make changes in the legend of a plot using alpha in legend.background argument of theme function. This will help us to change the background color of the legend.Example Live Demo> x y df dfOutput x y 1 0 25 2 100 28 3 150 32 4 200 25Creating a bar plot with legend −> library(ggplot2) > ggplot(df, aes(x, y, fill=x))+geom_bar(stat="identity")OutputChanging the background color of the ... Read More
The values of the categorical variable can be represented by numbers, by characters, by a combination of numbers and characters, by special characters, by numerical signs or any other method. But when we create the bar plot, if the size of a label name is large then we might want to reduce it by representing it with a different word or character or sign that gives the same meaning and it can be done by using expression argument inside scale_x_discrete.ExampleConsider the below data frame − Live Demo> x y df dfOutput x y 1 0 25 2 100 28 3 ... Read More
In data analysis, sometimes we need to use zeros for certain calculations, either to nullify the effect of a variable or for some other purpose based on the objective of the analysis. To deal with such type of situations, we need a zero value or a vector of many ways to create a vector with zeros in R. The important thing is the length of the vector.Examples> x1 x1 [1] 0 0 0 0 0 0 0 0 0 0 > x2 x2 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... Read More
The standardized coefficients in regression are also called beta coefficients and they are obtained by standardizing the dependent and independent variables. Standardization of the dependent and independent variables means that converting the values of these variables in a way that the mean and the standard deviation becomes 0 and 1 respectively. We can find the standardized coefficients of a linear regression model by using scale function while creating the model.ExampleConsider the below data frame − Live Demo> set.seed(99) > x y df1 df1Output x y 1 1.7139625 1.2542310 2 1.9796581 2.9215504 3 1.5878287 2.7500544 4 ... Read More