Articles on Trending Technologies
Technical articles with clear explanations and examples
How to create a rank variable using mutate function of dplyr package in R?
A rank variable converts a numerical variable into an ordinal one. This is useful for non-parametric analysis: when the numerical variable is not normally distributed, or violates other assumptions of parametric analysis, the raw values are not analyzed directly and their ranks are used instead. To create a rank variable with the mutate function, we can call the dense_rank function of dplyr inside mutate.
Example
Consider the below data frame:
set.seed(7) x1
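As a rough illustration of what dense ranking means, the sketch below uses base R's match as a stand-in for dplyr's dense_rank, so it is not the article's code. The vector `scores` and the column names are hypothetical.

```r
# Hypothetical scores with ties (not the article's data)
scores <- c(50, 20, 50, 70, 20, 90)

# Dense ranking: tied values share a rank and no rank numbers are skipped
dense <- match(scores, sort(unique(scores)))

# For comparison, rank() with ties.method = "min" leaves gaps after ties
min_rank <- rank(scores, ties.method = "min")

df <- data.frame(scores, dense, min_rank)
print(df)
```

With dplyr loaded, `mutate(df, dense = dense_rank(scores))` should give the same dense ranks for non-missing values.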
How to create boxplot with horizontal lines on the minimum and maximum in R?
A boxplot shows the minimum, first quartile, median, third quartile, and maximum. A boxplot created with ggplot2 is drawn without horizontal lines on the minimum and maximum. To add these lines, we can use stat_boxplot(geom = 'errorbar') together with the ggplot function of ggplot2.
Example
Consider the below data frame:
set.seed(101) Gender
How to perform mathematical operations on elements of a list in R?
A list can contain many elements, and each of them can be of a different type. If the elements are numerical, we can perform mathematical operations on them such as addition, multiplication, subtraction, and division. To do this, we can use the Reduce function, passing the mathematical operation and the list name as Reduce("Mathematical_Operation", List_name).
Example
x1
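A minimal sketch of the Reduce call described above, assuming a small hypothetical list `my_list` of equal-length numeric vectors:

```r
# Hypothetical list of numeric vectors (not the article's data)
my_list <- list(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# Element-wise sum across all list elements: (a + b) + c
total <- Reduce("+", my_list)

# Element-wise product across all list elements: (a * b) * c
product <- Reduce("*", my_list)

print(total)    # 12 15 18
print(product)  # 28 80 162
```

Reduce applies the operation pairwise from left to right, so the order of the list elements matters for subtraction and division.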
How to write the plot title in multiple lines using plot function in R?
The main title of a plot is usually short, but sometimes we need a longer one. For example, a short version might be "Scatterplot" and a longer version might be "Scatterplot between X and Y". In the plot function of R, we can split the main title over several lines by inserting the line-break character \n, as in "Scatterplot\nbetween X and Y".
Example
set.seed(123) x
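The following sketch shows one way to break a title over two lines, assuming hypothetical random data and the variable name `title_text`:

```r
# Hypothetical data for illustration
set.seed(123)
x <- rnorm(20)
y <- rnorm(20)

# "\n" inside the string starts a new line in the plot title
title_text <- "Scatterplot\nbetween X and Y"

plot(x, y, main = title_text)
```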
How to fill the missing values of an R data frame from the mean of columns?
Dealing with missing values is one of the initial steps in data analysis, and also one of the most difficult: if we don't fill the missing values with an appropriate method, the result of the whole analysis might become meaningless. Therefore, we must be very careful when dealing with missing values. For learning purposes, people mostly use the mean to fill the missing values, but many other values can be used depending on the characteristics of the data. To fill the missing values with the column means, we can use the na.aggregate function of the zoo package.
Example
Consider the below data frame:
x1
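To make the idea of column-mean filling concrete, here is a sketch in base R. It is a stand-in for zoo's na.aggregate, not a use of it, and the data frame `df` and its columns are hypothetical.

```r
# Hypothetical data frame with missing values
df <- data.frame(x1 = c(1, NA, 3, 4), x2 = c(10, 20, NA, 40))

# Work on a copy so the original stays available for comparison
df_filled <- df

# Replace each column's NA entries with the mean of its observed values
for (col in names(df_filled)) {
  missing <- is.na(df_filled[[col]])
  df_filled[[col]][missing] <- mean(df_filled[[col]], na.rm = TRUE)
}

print(df_filled)
```

With the zoo package installed, `zoo::na.aggregate(df)` performs column-mean filling by default.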
How to create a scatterplot with log10 of dependent variable in R?
Often, the relationship between the independent variable and the dependent variable is not linear, so we transform one of them based on our experience with the data. We also want to plot those transformations to visualize the relationship. One such transformation is taking log10 of the dependent variable. To plot this transformation with ggplot2, we can use scale_y_continuous(trans = 'log10').
Example
Consider the below data frame:
set.seed(10) x
What is the difference between NA and <NA> in R?
Missing values are represented by NA, but if they are read as the string "NA" then "NA" becomes a level of a factor variable. If we believe that a vector is numeric and it contains an "NA" string, then it will not be a numeric vector. On the other hand, a vector that contains NA will still be a numeric vector.
Examples
x1
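The distinction can be checked directly. The sketch below uses hypothetical vectors `x_num` and `x_chr`:

```r
# A real missing value keeps the vector numeric
x_num <- c(1, 2, NA, 4)

# The string "NA" forces the whole vector to character
x_chr <- c(1, 2, "NA", 4)

print(class(x_num))  # "numeric"
print(class(x_chr))  # "character"

# Only the real NA is detected as missing
print(is.na(x_num))
print(is.na(x_chr))

# As a factor, the string "NA" is an ordinary level
f <- factor(c("a", "NA", "b"))
print(levels(f))
```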
How to find the 95% confidence interval for the slope of regression line in R?
The slope of the regression line is a very important part of regression analysis. It estimates the amount by which the dependent variable is expected to increase or decrease for a one-unit increase in the independent variable. The confidence interval gives a range of plausible values for the slope: across repeated samples of the same size, about 95% of such intervals would contain the true slope. To find the 95% confidence interval for the slope of the regression line, we can use the confint function with the regression model object.
Example
Consider the below data frame:
set.seed(1) x
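A minimal sketch of the confint approach, assuming simulated data with a hypothetical true slope of 3:

```r
# Simulated data for illustration (not the article's data)
set.seed(1)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)

# Fit a simple linear regression
model <- lm(y ~ x)

# 95% confidence intervals for all coefficients; the row "x" is the slope
ci <- confint(model, level = 0.95)
slope_ci <- ci["x", ]

print(slope_ci)
```

The `level` argument defaults to 0.95, so it is spelled out here only for clarity.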
How to select a column of a matrix by column name in R?
When we create a matrix in R, its column names are not defined by default, but we can name the columns ourselves or import a matrix that already has column names. If the column names are not defined, we use column numbers to extract columns. If the matrix has column names, we can select a column by its name as well as by its number.
Example 1
M1
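Both ways of selecting a column can be sketched as follows, assuming a hypothetical matrix `M` with columns named A, B, and C:

```r
# Hypothetical 4 x 3 matrix with named columns
M <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("A", "B", "C")))

# Select a column by name; the result is a plain vector
col_by_name <- M[, "B"]

# Select the same column by number
col_by_number <- M[, 2]

# Keep the matrix shape (4 rows, 1 column) with drop = FALSE
col_as_matrix <- M[, "B", drop = FALSE]

print(col_by_name)  # 5 6 7 8
```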
How to draw a violin plot in R?
A violin plot is similar to a boxplot but looks like a violin and shows the distribution of the data for different categories. It shows the density of the data values at different points. In R, we can draw a violin plot with the help of the ggplot2 package, which has a function called geom_violin for this purpose.
Example
Consider the below data frame:
set.seed(1) x