Change Bar Color to Grey Shade in ggplot2

Nizamuddin Siddiqui
Updated on 07-Sep-2020 06:00:14

1K+ Views

When we create a bar graph using ggplot2, the bars are dark grey by default, but they can be given different colors or different shades of grey. Shades of grey are helpful when we plot a pattern of categorical data, for example educational level on the X-axis with frequencies of years of experience on the Y-axis. We can do this with the scale_fill_grey function of the ggplot2 package.

Example

Consider the below data frame:

> df
  x Freq
1 A   14
2 B   12
3 C   13
4 D   15
> library(ggplot2)
> ggplot(df, ...
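The ggplot call in the excerpt is cut off, so the following is only a minimal sketch of how scale_fill_grey might be applied to the data frame printed in the example. The aesthetic mapping (fill = x) and the start and end values are assumptions, not the article's code.

```r
library(ggplot2)

# Data frame rebuilt from the printed output in the example
df <- data.frame(x = c("A", "B", "C", "D"), Freq = c(14, 12, 13, 15))

# fill must be mapped to a variable, otherwise scale_fill_grey() has nothing to shade
p <- ggplot(df, aes(x = x, y = Freq, fill = x)) +
  geom_bar(stat = "identity") +
  scale_fill_grey(start = 0.8, end = 0.2)  # light grey for A through dark grey for D

p
```

The start and end arguments take grey levels between 0 (black) and 1 (white), so swapping them reverses the order of the shades.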

Show All X-Axis Labels in a Bar Graph Using Barplot Function in R

Nizamuddin Siddiqui
Updated on 07-Sep-2020 05:51:30

662 Views

In base R, the barplot function easily creates a bar plot, but when there are many bars (that is, many categories on the X-axis), some of the X-axis labels are not shown in the plot. To show all of them, we need to use las and cex.names: las=2 draws the labels perpendicular to the axis, and cex.names shrinks the label text.

Example

Consider a named vector x and its bar graph:

> barplot(x)

Showing all the X-axis labels:

> barplot(x,las=2,cex.names=0.5)
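The data behind the excerpt's x is not recoverable, so here is a small self-contained sketch with hypothetical data (26 made-up category names) that shows the same two calls.

```r
# Hypothetical data: 26 categories with fairly long labels (not the article's data)
x <- 1:26
names(x) <- paste0("Category_", LETTERS)

# Default call: with this many bars, some labels may be skipped to avoid overlap
barplot(x)

# las = 2 turns the labels perpendicular to the axis; cex.names = 0.5 halves their size
mids <- barplot(x, las = 2, cex.names = 0.5)
```

barplot returns the bar midpoints (stored in mids here), which can be useful for placing extra text on the plot.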

Generate Date Sequence for Fixed Months in R

Nizamuddin Siddiqui
Updated on 04-Sep-2020 13:27:07

915 Views

Not every date occurs in every month: February has no 30th or 31st, and no 29th in non-leap years, while some months have 30 days and others have 31. Therefore, finding, say, the first, a middle, or the last date of each month is not an easy task, but it can be done with the help of the seq function in base R.

Examples

> seq(as.Date("2020-01-01"), length=12, by="1 month")
 [1] "2020-01-01" "2020-02-01" "2020-03-01" "2020-04-01" "2020-05-01"
 [6] "2020-06-01" "2020-07-01" "2020-08-01" "2020-09-01" "2020-10-01"
[11] "2020-11-01" "2020-12-01"

> seq(as.Date("2020-01-01"), length=36, by="1 month")
 [1] "2020-01-01" "2020-02-01" "2020-03-01" "2020-04-01" "2020-05-01"
 [6] ...
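The excerpt only shows first-of-month sequences. As a sketch of how the middle and last dates mentioned in the text could be obtained, the snippet below uses the same seq call. The "first day of the next month minus one day" trick is an assumption about one reasonable approach, not necessarily the article's. It spells the argument as length.out, which the excerpt abbreviates to length.

```r
# First day of each month in 2020 (same call as in the excerpt)
first_days <- seq(as.Date("2020-01-01"), length.out = 12, by = "1 month")

# A middle date: start the sequence on the 15th instead
mid_days <- seq(as.Date("2020-01-15"), length.out = 12, by = "1 month")

# Last day of each month: first day of the following month, minus one day
next_firsts <- seq(as.Date("2020-02-01"), length.out = 12, by = "1 month")
last_days <- next_firsts - 1

last_days
```

Subtracting one day from the next month's first date avoids having to know whether a month has 28, 29, 30, or 31 days.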

Match Two String Vectors with Case Differences in R

Nizamuddin Siddiqui
Updated on 04-Sep-2020 13:06:16

188 Views

R is a case-sensitive programming language, so matching strings that differ in case is not straightforward. For example, if one vector contains tutorialspoint and the other contains TUTORIALSPOINT, we cannot use the match function directly to check whether the strings match. Instead, we have to convert both to the same case (with tolower or toupper) before passing them to match.

Examples

> x1
 [1] "z" "v" "r" "y" "z" "l" "v" "t" "f" "p" "p" "z" "e" "b" "a" "o" "m" "d"
[19] "e" "l" "y" "y" "u" "u" "w" "b" "a" "j" "n" "v" ...
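A minimal sketch of the idea, using two short hypothetical vectors rather than the article's data:

```r
# Hypothetical vectors that differ only in case
x1 <- c("tutorialspoint", "R", "Data")
x2 <- c("DATA", "TUTORIALSPOINT", "r")

# Direct matching is case-sensitive, so nothing matches
direct <- match(x1, x2)

# Convert both sides to lowercase first, then match
pos <- match(tolower(x1), tolower(x2))
pos  # position in x2 of each element of x1
```

Using toupper on both vectors would work equally well; what matters is that both sides are converted the same way.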

T-Test Returns Smallest P-Value of 2.2e-16 in R

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:37:03

3K+ Views

When we perform a t-test in R and the difference between the two groups is very large, the p-value of the test is printed as < 2.2e-16. This is how R prints very small p-values in hypothesis testing procedures (2.2e-16 is approximately the machine epsilon for double-precision numbers), not the exact value. The actual p-value can be extracted from the t-test result with t.test(Var1, Var2, var.equal=FALSE)$p.value, where Var1 and Var2 stand for the two samples. This p-value is unlikely to equal 2.2e-16.

Example 1

> t.test(x1, y1, var.equal=FALSE)

	Welch Two Sample t-test

data:  x1 and y1
t = -3617.2, df = 10098, p-value < 2.2e-16
alternative hypothesis: true difference in means is not ...
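The definitions of x1 and y1 did not survive in the excerpt, so the sketch below uses hypothetical simulated samples (sample size, means, and seed are all assumptions) to show the difference between the printed and the stored p-value.

```r
set.seed(1)                      # hypothetical simulated data, not the article's
x1 <- rnorm(500, mean = 0)
y1 <- rnorm(500, mean = 1)

res <- t.test(x1, y1, var.equal = FALSE)
res                              # the printed summary caps the p-value at "< 2.2e-16"

# The stored value is an ordinary number and is not capped
res$p.value

# The printing threshold is the double-precision machine epsilon
.Machine$double.eps
```

For extremely large differences the stored p-value can underflow to exactly 0, so a result of 0 means "smaller than R can represent" rather than a literal zero probability.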

Concatenate Column Values to Create New Column in R Data Frame

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:34:39

985 Views

Sometimes we want to combine the values of two columns to create a new column. This is mostly used when we have a unique column, such as an ID, that may be combined with a numerical or any other type of column. The combined values can also be separated by a character of our choice. This can be done with the help of the apply function.

Example

Consider the below data frame:

> df1
  ID Country
1  1      UK
2  2      UK
3  3   India
4  4     USA
5  5     USA
6  6      UK
7  7   Nepal
8 ...
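As a rough sketch of the apply approach, the snippet below rebuilds only the seven rows visible in the printed output. The new column names and the "_" separator are assumptions chosen for illustration.

```r
# Only the seven rows visible in the excerpt; the rest of the data is not shown
df1 <- data.frame(ID = 1:7,
                  Country = c("UK", "UK", "India", "USA", "USA", "UK", "Nepal"))

# apply() over rows (MARGIN = 1): each row is pasted together with "_" in between
df1$ID_Country <- apply(df1, 1, paste, collapse = "_")

# paste() is vectorised, so the same result is available without apply()
df1$Combined <- paste(df1$ID, df1$Country, sep = "_")

df1
```

One caveat: apply first converts the data frame to a character matrix, and numeric columns are formatted to a common width in that step, so IDs of different lengths (for example 1 to 10) can pick up leading spaces. The plain paste call does not have this issue.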

Draw Gridlines in a Graph with abline Function in R

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:21:34

192 Views

Gridlines are horizontal and vertical dotted lines that help organize a chart so that the values on the axes are easier for viewers to read. They are especially helpful when we plot a large number of data points. A graph drawn with the plot function can be given gridlines by defining the horizontal and vertical lines with abline.

Example

Consider a scatterplot of x and y:

> plot(x,y)

Adding gridlines using the abline function:

> abline(h=seq(0,5,0.5),lty=5)
> abline(v=seq(-2,2,0.5),lty=5)
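The excerpt hard-codes the gridline positions for its own data. A variation, sketched here with hypothetical simulated data, is to read the tick positions from the current plot with axTicks so the lines always align with the axis labels. This is an assumption about a convenient alternative, not the article's method.

```r
set.seed(1)                 # hypothetical data, not the article's
x <- rnorm(50)
y <- rexp(50)

plot(x, y)

# Tick positions of the current plot: side 1 is the X-axis, side 2 is the Y-axis
v_at <- axTicks(1)
h_at <- axTicks(2)

# lty = 3 gives dotted lines; a light colour keeps the points readable
abline(h = h_at, v = v_at, lty = 3, col = "grey")
```

Fixed sequences such as seq(0, 5, 0.5) remain the better choice when the gridlines should be denser than the axis ticks.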

Select Rows Based on Range of Values in an R Data Frame

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:19:29

2K+ Views

Data can be extracted or selected in many ways, for example based on an individual value or on a range of values. This is mostly required when we want to compare subsets of a data set or use a subset for analysis. Rows may also be selected by a range of values for testing. We can do this with the subset function.

Example

Consider the below data frame:

> df
  x1 x2 x3
1  3  2  6
2  3  4  9
3  4  4 12
4  4  8 12
5  3  5 11
...
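A small sketch of range-based selection with subset, using only the five rows visible in the printed output. The range of 9 to 11 on x3 is an arbitrary choice for illustration.

```r
# Only the five rows visible in the excerpt; the remaining rows are not shown
df <- data.frame(x1 = c(3, 3, 4, 4, 3),
                 x2 = c(2, 4, 4, 8, 5),
                 x3 = c(6, 9, 12, 12, 11))

# Keep rows where x3 lies in the closed range [9, 11]
in_range <- subset(df, x3 >= 9 & x3 <= 11)
in_range
```

Conditions on several columns can be combined in the same way with & and |, since column names are evaluated inside the data frame.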

Change Color and Size of Axes Labels in R Plot

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:16:41

332 Views

The default size of the axis labels created by the plot function is often not large enough, and the labels do not look appealing. Because the appearance of a plot matters a lot, we might want to change their size and color. This can be done by setting the color with col.lab and the size with cex.lab.

Example

> plot(x,y)

Changing the color and the size of the axis labels:

> plot(x,y,col.lab="blue",cex.lab=2)
> plot(x,y,col.lab="dark blue",cex.lab=3)
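The excerpt passes col.lab and cex.lab to each plot call. If several plots should share the same label style, the same settings can be made once with par, as sketched below with hypothetical data. Setting them through par is an addition here, not something the excerpt shows.

```r
set.seed(1)                 # hypothetical data, not the article's
x <- rnorm(20)
y <- rnorm(20)

# Per-call settings, as in the excerpt
plot(x, y, col.lab = "blue", cex.lab = 2)

# Session-wide settings: par() returns the old values so they can be restored
op <- par(col.lab = "darkblue", cex.lab = 1.5)
plot(x, y)
plot(y, x)
par(op)                     # restore the previous label colour and size
```

cex.lab is a multiplier relative to the default text size, so 2 doubles the label size and 0.5 halves it.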

Add New Column to R Data Frame with Largest Value in Each Row

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:14:10

832 Views

When a data frame contains only numerical columns, we might want to find the largest value in each row. For example, in a sales data set where each row represents a customer and each column holds the quantity bought of one product, the maximum of each row shows how much each customer buys of the product they buy the most. This can be done by using max with the apply function over rows.

Example

Consider the below data frame:

> df1
   x1 ...
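The article's data is cut off, so the sketch below uses a small hypothetical sales table (three customers, three products) to show max with apply. The Largest and Which column names are made up for illustration.

```r
# Hypothetical sales data: rows are customers, columns are product quantities
df1 <- data.frame(x1 = c(5, 2, 9),
                  x2 = c(3, 8, 1),
                  x3 = c(7, 4, 6))

# Fix the product columns up front so that new columns are never included in the maximum
cols <- c("x1", "x2", "x3")

# MARGIN = 1 applies max() to each row
df1$Largest <- apply(df1[cols], 1, max)

# which.max() gives the position of the maximum, i.e. which product it was
df1$Which <- cols[apply(df1[cols], 1, which.max)]

df1
```

Restricting apply to the original columns matters once more columns are added: without it, a later call would also take the new Largest column into account.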
