Select Columns of an R Data Frame Not in a Vector

Nizamuddin Siddiqui
Updated on 04-Sep-2020 08:17:38

478 Views

An R data frame can have many columns, and we might want to select all but a few of them. In that situation, it is easier to deselect the columns we do not need than to list every column we do need, because the unwanted columns are fewer. This can be done with the ! operator and single square brackets. The example data frame has the columns ID, Gender, Age, Salary, Experience, and Education ...
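
A minimal sketch of the deselection idea, assuming a small made-up data frame. The column names follow the teaser, but the values and the drop_cols vector are hypothetical.

```r
# Hypothetical data frame; values are invented for illustration.
df <- data.frame(
  ID = 1:3,
  Gender = c("Male", "Female", "Male"),
  Age = c(25, 31, 42),
  Salary = c(30000, 45000, 52000)
)

# Names of the columns we do NOT want (hypothetical choice).
drop_cols <- c("Age", "Salary")

# %in% marks the unwanted columns; ! flips the test so the rest are kept.
# drop = FALSE keeps the result a data frame even if one column remains.
kept <- df[, !(names(df) %in% drop_cols), drop = FALSE]
names(kept)
```

Negating a membership test avoids typing out the long list of wanted columns, which is the situation the teaser describes.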

Add New Column in R Data Frame by Combining Two Columns

Nizamuddin Siddiqui
Updated on 04-Sep-2020 08:03:44

747 Views

A data frame can hold columns of different types, and some of them can be combined into a single column based on their characteristics. For example, if one column holds characters and another holds numbers, we might join them with a special character as separator to form an identifier. The example calls set.seed(111) and builds a data frame with the columns ID (A, B, C, ...) and Frequency (78, 84, 83, ...) ...
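
One way to combine a character column and a numeric column is paste() with a separator. This is a sketch: the first three rows match the teaser output, while the underscore separator and the new column name Combined are assumptions.

```r
# First three rows as shown in the teaser output.
df <- data.frame(ID = c("A", "B", "C"), Frequency = c(78, 84, 83))

# paste() converts the numbers to text and joins each pair with "_".
# The separator and the column name "Combined" are illustrative choices.
df$Combined <- paste(df$ID, df$Frequency, sep = "_")
df
```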

Find Mean of Corresponding Elements of Multiple Matrices in R

Nizamuddin Siddiqui
Updated on 04-Sep-2020 07:54:13

929 Views

If the elements of multiple matrices measure the same characteristic, we might want the mean of the corresponding elements. For example, if matrices M1, M2, M3, and M4 are stored in a list and the first element of each represents a rate, say the rate of decay of rusty iron during the rainy season, then we might want the mean of the first elements of M1, M2, M3, and M4. This mean can be found with the Reduce function. The example starts from the matrices and their list ...
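
A short sketch of the Reduce approach, assuming four small matrices with made-up values: Reduce adds the matrices element by element, and dividing by the number of matrices gives the mean of each position.

```r
# Hypothetical 2 x 2 matrices; the values are invented for illustration.
M1 <- matrix(1:4, nrow = 2)
M2 <- matrix(5:8, nrow = 2)
M3 <- matrix(9:12, nrow = 2)
M4 <- matrix(13:16, nrow = 2)
mats <- list(M1, M2, M3, M4)

# Sum corresponding elements across the list, then divide by the count.
mean_matrix <- Reduce(`+`, mats) / length(mats)
mean_matrix
```

All matrices in the list must have the same dimensions for the element-wise sum to work.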

Add Percentage Column for Groups in R Data Frame

Nizamuddin Siddiqui
Updated on 04-Sep-2020 07:45:19

4K+ Views

In data analysis, we often need the percentage that each value contributes within a group. This shows which values occur frequently and which occur rarely, and the percentages can be plotted in pie charts to give readers a better view of the data. Adding a percentage column per group is straightforward with the mutate function of the dplyr package. The first example uses a data frame df1 with the columns Group and Frequency ...
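
A sketch of a per-group percentage column with dplyr, assuming the package is installed. The first three rows follow the teaser output; the rows for group 2 and the column name Percentage are hypothetical.

```r
library(dplyr)

# Group 1 values follow the teaser; group 2 values are invented.
df1 <- data.frame(
  Group = c(1, 1, 1, 2, 2),
  Frequency = c(67, 58, 54, 40, 60)
)

# Within each group, express every frequency as a share of the group total.
df1 <- df1 %>%
  group_by(Group) %>%
  mutate(Percentage = 100 * Frequency / sum(Frequency)) %>%
  ungroup()
df1
```

Because the data is grouped before mutate runs, sum(Frequency) is the group total rather than the grand total, so the percentages add up to 100 inside each group.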

Change Background Color of a Plot in R

Nizamuddin Siddiqui
Updated on 04-Sep-2020 07:39:15

1K+ Views

One way to draw attention to a plot is to change its background. Most plots have a white background, so a different background color is unusual and stands out to readers. It can be set with par(bg = "color_name"). The example first creates a simple histogram with hist(x), then redraws it after par(bg="green"), par(bg="yellow"), and par(bg="blue") ...
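
A minimal sketch of the par(bg = ...) technique with base graphics. The data in x is randomly generated here as an assumption, since the teaser does not show how x was created.

```r
# Hypothetical data for the histogram.
set.seed(1)
x <- rnorm(100)

# par() returns the previous settings, so they can be restored afterwards.
old_par <- par(bg = "yellow")
hist(x)

# Restore the earlier background so later plots are unaffected.
par(old_par)
```

Saving and restoring the old settings is a design choice for tidiness; calling par(bg = "yellow") on its own changes the background for every later plot on the same device.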

Sort Vector in Increasing Order with Numbers and Characters in R

Nizamuddin Siddiqui
Updated on 04-Sep-2020 07:35:05

219 Views

A vector can contain numbers, characters, or both. Sorting a vector that contains only numbers or only characters is not difficult, but sorting one that contains both is a little tedious. In R, we can sort a vector that contains numbers as well as characters with the help of the order function, but first we must check carefully whether the characters differ across the elements of the vector; if they differ, then we can't do this sorting in the manner explained ...
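
The teaser is cut off before it shows the method, so the following is only one possible sketch using order, not necessarily the article's own code. It assumes every element shares the same letter prefix, and the vector x is hypothetical.

```r
# Hypothetical vector: same letter in every element, different numbers.
x <- c("A10", "A2", "A33", "A1", "A7")

# Strip everything that is not a digit and convert the rest to numbers.
num_part <- as.numeric(gsub("[^0-9]", "", x))

# order() gives the positions that sort the numeric parts increasingly.
sorted <- x[order(num_part)]
sorted
```

A plain sort(x) would compare the elements as text and place "A10" before "A2", which is why the numeric part is extracted first.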

Create a Row at the End of an R Data Frame with Column Totals

Nizamuddin Siddiqui
Updated on 04-Sep-2020 07:32:57

256 Views

In data analysis, we often need column totals, especially when the analysis proceeds step by step. Many analytical techniques, such as analysis of variance, correlation, and regression, use column totals. To find them, we can use the colSums function, and single square brackets let us add the totals as a row of the data frame. The first example uses a data frame df1 with the columns x1, x2, and x3 ...
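
A sketch of appending a totals row with colSums and single square brackets. The column names follow the teaser; limiting the data to four rows is an assumption, because the teaser output is cut off.

```r
# Columns as in the teaser; only four rows are assumed here.
df1 <- data.frame(x1 = 1:4, x2 = 1:4, x3 = 1:4)

# Assign the column sums to the row just past the current last row.
df1[nrow(df1) + 1, ] <- colSums(df1)
df1
```

This works as written only when every column is numeric; a data frame with character or factor columns would need those columns handled separately.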

Select First and Last Row Based on Group Column in R Data Frame

Nizamuddin Siddiqui
Updated on 04-Sep-2020 07:30:13

1K+ Views

Extraction is necessary in data analysis because it lets us keep the important information in a data set. That information could be the first and last row of each group, for example to compare the initial and final values among groups. We can select the first and last row for each value of a group column with the slice function of the dplyr package. The example uses a data frame df1 with the columns x1 and x2 ...
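
A sketch of picking the first and last row per group with dplyr's slice, assuming the package is installed. The column names follow the teaser, while the values are made up.

```r
library(dplyr)

# x1 is the group column; the x2 values are invented for illustration.
df1 <- data.frame(
  x1 = rep(1:3, each = 4),
  x2 = c(3, 5, 2, 8, 1, 4, 7, 6, 9, 2, 5, 3)
)

# Within each group, keep row 1 and row n(), where n() is the group size.
first_last <- df1 %>%
  group_by(x1) %>%
  slice(c(1, n())) %>%
  ungroup()
first_last
```

The rows are taken in their stored order, so the data should be sorted beforehand if "first" and "last" are meant in terms of time or some other variable.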

Get List of Data Sets in Base R or Package in R

Nizamuddin Siddiqui
Updated on 04-Sep-2020 07:27:32

8K+ Views

Many data sets are available in base R and in R packages. Their characteristics vary widely: some are time series, some have only numerical columns, some have numerical as well as factor columns, and some include character columns alongside other column types. This variety makes them helpful to everyone who wants to learn R programming. To get the list of data sets available in base R we can use data(), but to get the list of data sets available in a package we first need ...
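
A brief sketch of listing data sets with data(). The package argument is shown with "datasets", which ships with R; the name of any other installed package could be used instead, and that substitution is an assumption rather than something the teaser shows.

```r
# Data sets in the packages that are currently attached.
attached_sets <- data()

# Data sets in one named package; "datasets" is part of every R install.
pkg_sets <- data(package = "datasets")

# The listing is stored as a character matrix in the "results" element.
head(pkg_sets$results[, c("Package", "Item")])
```

Working with the results matrix rather than printing the returned object is convenient in scripts, because printing opens a separate viewer in an interactive session.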

Plot Multiple Time Series Using ggplot2 in R

Nizamuddin Siddiqui
Updated on 04-Sep-2020 06:58:38

1K+ Views

Over the same period of time we might have multiple time series: weather for multiple cities, price variation across multiple products, expected demand at different locations, or anything else that changes with time and is measured for several things or locations. To show such data in a single plot, we can use the geom_line function of the ggplot2 package. The example uses data frames such as df1 with the columns x1 and y1 ...
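
A sketch of drawing two series in one plot by adding one geom_line layer per data frame, assuming ggplot2 is installed. The names df1, x1, and y1 follow the teaser; df2, x2, y2, the random values, and the colors are hypothetical.

```r
library(ggplot2)

# Hypothetical series; the values are random and only for illustration.
set.seed(123)
df1 <- data.frame(x1 = 1:20, y1 = rnorm(20))
df2 <- data.frame(x2 = 1:20, y2 = rnorm(20))

# Each geom_line layer gets its own data frame and aesthetic mapping.
p <- ggplot() +
  geom_line(data = df1, aes(x = x1, y = y1), colour = "blue") +
  geom_line(data = df2, aes(x = x2, y = y2), colour = "red") +
  labs(x = "Time", y = "Value")
print(p)
```

An alternative design is to stack the series into one long data frame with a column that identifies the series and map that column to colour, which also produces a legend automatically.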
