To create a boxplot for multiple categories, we first create a vector of categories and build a data frame with one categorical column and one numerical column. Once the data frame is ready, we can call the boxplot function in base R with a formula that uses the tilde operator, placing the numerical column on the left and the categorical column on the right.
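A minimal sketch of the approach described above. The original example is cut off, so the column names Categories and Values and the simulated data are assumptions made for illustration only.

```r
# Hypothetical data: three categories with ten observations each
set.seed(1)
Categories <- factor(rep(c("A", "B", "C"), each = 10))
Values <- rnorm(30)
df <- data.frame(Categories, Values)

# Numerical column on the left of the tilde, categorical column on the right
bp <- boxplot(Values ~ Categories, data = df)

# boxplot() invisibly returns the statistics it drew; names holds the groups
bp$names
```

One box is drawn per level of Categories, in the order of the factor levels.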
The median, the second most used measure of central tendency, is preferred when the data are ordinal or when continuous data contain outliers. If the data also include a factor column, we may want the median for each level so that the levels can be compared with each other. The easiest way to do this is to summarize the data with the aggregate function.
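A short sketch of the aggregate approach, assuming a hypothetical data frame with one factor column Group and one numerical column Score (the original example data are truncated and not reproduced here).

```r
# Hypothetical data: one factor column and one numerical column
df <- data.frame(
  Group = factor(rep(c("A", "B", "C"), each = 4)),
  Score = c(1, 2, 3, 100,  5, 6, 7, 8,  10, 10, 12, 14)
)

# Median of Score within each level of Group
med <- aggregate(Score ~ Group, data = df, FUN = median)
med
```

Group A contains an outlier (100), yet its median stays at 2.5, which illustrates why the median is preferred in that situation.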
To find the root mean square error, we first need the residuals (also called errors). We square the residuals, take the mean of the squared values, and then take the square root of that mean. Therefore, if we have a linear regression model object, say M, the root mean square error can be found as sqrt(mean(M$residuals^2)).
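The formula above can be sketched as follows. The simulated x and y are assumptions for illustration; only the expression sqrt(mean(M$residuals^2)) comes from the text.

```r
# Hypothetical data for a simple linear regression
set.seed(42)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)

# Fit the model and compute the root mean square error from its residuals
M <- lm(y ~ x)
rmse <- sqrt(mean(M$residuals^2))
rmse
```

The same value results from computing the residuals by hand as y - fitted(M).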
To remove the first replicate in a vector using another vector, we can use the match function. For example, if we have a vector x that contains the values 0, 1, 2, 3, 4, 5 and the values 0, 1, 2 are repeated in another vector, say y, then removing this replicate leaves the values 3, 4, and 5. This can be done with x[-match(y,x)].
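A small sketch of the match approach with the values from the text, plus a second hypothetical vector x2 that shows only the first occurrence of each matched value being dropped. It assumes every element of y occurs in x.

```r
x <- c(0, 1, 2, 3, 4, 5)
y <- c(0, 1, 2)

# match(y, x) gives the position of the first occurrence of each y value in x
remaining <- x[-match(y, x)]
remaining

# Hypothetical vector with repeats: only the first 0, 1, 2 are removed
x2 <- c(0, 1, 2, 0, 1, 2, 3)
remaining2 <- x2[-match(y, x2)]
remaining2
```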
To create a list of matrices, we simply pass the matrix objects to the list function. For example, if we have five matrix objects of the same or different dimensions, defined as Matrix1, Matrix2, Matrix3, Matrix4, and Matrix5, then the list of these matrices can be created by passing all five to list and storing the result in an object such as List_of_Matrix.
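A brief sketch under the naming used above. The contents and dimensions of the five matrices are assumptions chosen only to show that the dimensions may differ.

```r
# Hypothetical matrices of different dimensions
Matrix1 <- matrix(1:4, nrow = 2)
Matrix2 <- matrix(1:6, nrow = 3)
Matrix3 <- matrix(1:9, nrow = 3)
Matrix4 <- matrix(1:8, nrow = 2)
Matrix5 <- matrix(1:10, nrow = 5)

# Pass the matrix objects to list()
List_of_Matrix <- list(Matrix1, Matrix2, Matrix3, Matrix4, Matrix5)

# Each element keeps its own dimensions
lapply(List_of_Matrix, dim)
```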
To create triplets of the elements of a vector, we can use the combn function. For example, if we have a vector x that contains the values 1, 2, 3, 4, 5 then the unordered triplets of x can be created with combn(x,3). This returns a matrix in which each column is one triplet; the triplets are unordered in the sense that different arrangements of the same three elements are not listed separately.
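The call described above can be sketched with the values from the text:

```r
x <- 1:5

# Each column of the result is one combination of three elements of x
triplets <- combn(x, 3)
triplets

# Number of triplets equals "5 choose 3"
ncol(triplets) == choose(5, 3)
```

To get one triplet per row instead, the result can be transposed with t(triplets).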
To create a data frame with a column of repeated values, we can use the rep function, either repeating the whole sequence of values or repeating each value a given number of times. For example, if we have the three values 1, 2, 3 then the column can be created by repeating the sequence as 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3 or by repeating each value as 1, 1, 1, 2, 2, 2, 3, 3, 3.
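Both repetition patterns can be sketched with the times and each arguments of rep. The data frame names df1 and df2 are placeholders for illustration.

```r
# Repeat the whole sequence four times: 1, 2, 3, 1, 2, 3, ...
df1 <- data.frame(x = rep(1:3, times = 4))

# Repeat each value three times: 1, 1, 1, 2, 2, 2, 3, 3, 3
df2 <- data.frame(x = rep(1:3, each = 3))

df1$x
df2$x
```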
We might want to find the positions of values in a matrix column that are less than a certain value. This helps us identify where critical or threshold values occur in each column. For example, if we have a matrix M with 5 rows and 5 columns and values in the range 1 to 100, we might want to find the index of the values in each column that are less than 50, so that we can see which columns contain such values. In R, we can easily do this by ...
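The excerpt is cut off before it names its method. One possible approach, an assumption rather than the article's own code, uses which on each column, or which with arr.ind = TRUE on the whole matrix. The random matrix is hypothetical.

```r
# Hypothetical 5 x 5 matrix with values between 1 and 100
set.seed(7)
M <- matrix(sample(1:100, 25), nrow = 5)

# Row positions of values below 50, one list element per column
below_by_col <- lapply(seq_len(ncol(M)), function(j) which(M[, j] < 50))
below_by_col

# Alternative: all (row, col) index pairs in a single matrix
idx <- which(M < 50, arr.ind = TRUE)
idx
```

The list form makes it easy to count how many columns contain at least one such value with sum(lengths(below_by_col) > 0).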
To subset the factor columns of a data frame, we can apply sapply with is.factor to all columns, store the result in an object, and then pass that object to the single square bracket subsetting operator. Storing the result lets us reuse it whenever only the factor columns are needed. For example, if we have a data frame df with three columns x, y, z, of which x and y are factor columns, then we can store the result of sapply(df, is.factor) in an object such as Factors and use it inside the square brackets.
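A minimal sketch of this pattern. The column contents are assumptions; the names df, x, y, z and Factors follow the text.

```r
# Hypothetical data frame: x and y are factors, z is numeric
df <- data.frame(
  x = factor(c("a", "b", "a")),
  y = factor(c("u", "v", "u")),
  z = c(1.5, 2.5, 3.5)
)

# Logical vector marking which columns are factors
Factors <- sapply(df, is.factor)

# Single square brackets keep only the columns flagged TRUE
df_factors <- df[Factors]
names(df_factors)
```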
By default, the points in a scatterplot are circular, but the shape can be changed by using integer shape codes, a sequence of them, or a variable. We just need to use the shape argument of the geom_point function. For example, to create a scatterplot whose point shapes vary with a variable x, we can use geom_point(shape=x), or map the variable with aes(shape=x) when x is a factor. The size of the points can be changed in the same way by passing numeric values to the size argument.
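A hedged sketch of varying point shape with ggplot2. It maps a hypothetical factor column grp through aes, which differs slightly from the geom_point(shape=x) form quoted above, and it assumes the ggplot2 package is installed; the data are simulated for illustration.

```r
# Hypothetical data: two numeric columns and one grouping factor
set.seed(151)
df <- data.frame(
  x = rnorm(20),
  y = rnorm(20),
  grp = factor(sample(c("A", "B", "C"), 20, replace = TRUE))
)

# Only draw the plot if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Shape varies with grp; size = 3 is a fixed setting for all points
  p <- ggplot(df, aes(x = x, y = y)) +
    geom_point(aes(shape = grp), size = 3)
  print(p)
}
```

A single fixed shape for every point can be set outside aes instead, for example geom_point(shape = 17) for filled triangles.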