Found 33676 Articles for Programming

How to count the number of values that satisfy a condition in an R vector?

Nizamuddin Siddiqui
Updated on 07-Nov-2020 07:37:47

3K+ Views

Sometimes we want to count how many values in a vector satisfy a certain condition. For example, if a vector x contains randomly selected integers between 1 and 100, we might want to find how many values are exactly equal to 10. This can be done by combining the which and length functions.

Example 1

> x1
[1] 5 7 3 3 2 7 3 7 6 3
> length(which(x1==5))
[1] 1
> length(which(x1==7))
[1] 3
> length(which(x1==3))
[1] 4

Example 2

> x2
[1] 4 1 5 5 5 3 8 9 8 4 8 1 ...
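As a minimal sketch of the counting idiom, the snippet below rebuilds a vector with the same values as the x1 output shown in the teaser (the name x is an assumption; the original sampling call is not recoverable). It also shows sum() over the logical vector, which is a common alternative and not necessarily the article's method.

```r
# Hypothetical vector holding the same values as the x1 output in the teaser
x <- c(5, 7, 3, 3, 2, 7, 3, 7, 6, 3)

# which() returns the positions where the condition holds; length() counts them
count_which <- length(which(x == 3))

# Alternative: TRUE counts as 1, so summing the logical vector gives the same count
count_sum <- sum(x == 3)

print(count_which)
print(count_sum)
```

The two forms differ when the vector contains NA: which() drops NA positions, while sum() returns NA unless na.rm = TRUE is supplied.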

How to remove a common suffix from column names in an R data frame?

Nizamuddin Siddiqui
Updated on 07-Nov-2020 07:35:45

8K+ Views

To remove a common suffix from column names, we can use the gsub function. For example, if a data frame df contains columns named x1df, x2df, x3df, and x4df, then we can remove df from all the column names by applying gsub to colnames(df).

Example

> df1
     x1Data   x2Data   x3Data
1  29.26500 26.64124 2.598983
2  21.82170 23.41442 4.134393
3  22.71918 25.21586 4.442823
4  19.88633 25.23487 3.338448
5  20.48989 23.33683 3.829757
6  29.07910 25.54084 3.519393
7  24.28573 23.67258 4.667397
8  27.99849 22.97148 4.100405
9  23.48148 25.36574 2.618030
10 26.39401 23.80191 4.235092
11 29.39867 24.36261 2.782559
12 30.11137 ...
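The exact command is cut off in the teaser, so the following is only a sketch of one way to apply gsub to column names. The data frame and its values are hypothetical, and anchoring the pattern with $ is an assumption added here so that only a trailing df is removed.

```r
# Hypothetical data frame whose column names share the suffix "df"
df <- data.frame(x1df = 1:3, x2df = 4:6, x3df = 7:9)

# "df$" matches "df" only at the end of a name, so text elsewhere is untouched
colnames(df) <- gsub("df$", "", colnames(df))

print(colnames(df))
```

Without the $ anchor, gsub would remove every occurrence of df in a name, which matters if the same letters also appear at the start or in the middle.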

How to find the standard deviation if NA’s are present in a column of an R data frame?

Nizamuddin Siddiqui
Updated on 07-Nov-2020 07:34:26

4K+ Views

If a vector or a column of an R data frame contains an NA, the sd command returns NA for the standard deviation. To solve this problem, we need to use na.rm=TRUE, just as we do for vectors that contain missing values. For example, if a data frame df has a column x that contains missing values, then the standard deviation of x can be calculated as sd(df$x,na.rm=TRUE).

Example

Consider the below data frame:

> set.seed(3521)
> df1
         x
1       NA
2 5.107864
3 4.797851
4 5.184345
5 4.680958
6 5.245151
7 5.760667 ...
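A small sketch of the difference na.rm makes, using a hypothetical data frame with made-up values rather than the article's truncated data:

```r
# Hypothetical data frame with one missing value in column x
df <- data.frame(x = c(NA, 5.1, 4.8, 5.2, 4.7))

# Without na.rm the missing value propagates and the result is NA
sd_default <- sd(df$x)

# With na.rm = TRUE the NA is dropped before the standard deviation is computed
sd_clean <- sd(df$x, na.rm = TRUE)

print(sd_default)
print(sd_clean)
```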

How to set the legends using ggplot2 on top-right side in R?

Nizamuddin Siddiqui
Updated on 07-Nov-2020 07:32:44

3K+ Views

The default position of the legend in a plot created with ggplot2 is the right-hand side, but we can change it by using the theme function, which has the legend.position and legend.justification arguments. To place the legend at the top-right, we can use legend.position="top" and legend.justification="right".

Example

Consider the below data frame:

> df
            x freq
1       Mango  212
2       Guava  220
3 Pomegranate  218

Loading the ggplot2 package and creating a bar chart with a legend:

> library(ggplot2)
> ggplot(df, aes(x, freq, fill=x))+geom_bar(stat="identity")

Creating the bar chart with the legend at the top-right of the chart:

> ggplot(df, aes(x, freq, fill=x))+geom_bar(stat="identity")+theme(legend.position="top", legend.justification="right")
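The data-frame construction is lost in the teaser, so here is a self-contained sketch that rebuilds it from the printed rows (the data.frame call itself is an assumption) and stores the plot in a hypothetical variable p. It assumes the ggplot2 package is installed.

```r
library(ggplot2)

# Data frame rebuilt from the rows printed in the teaser
df <- data.frame(
  x = c("Mango", "Guava", "Pomegranate"),
  freq = c(212, 220, 218)
)

# legend.position moves the legend above the plot;
# legend.justification aligns it to the right edge
p <- ggplot(df, aes(x, freq, fill = x)) +
  geom_bar(stat = "identity") +
  theme(legend.position = "top", legend.justification = "right")
```

Calling print(p), or typing p at the console, draws the chart.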

How to create a bar chart using ggplot2 with dots drawn at the center of top edge of the bars in R?

Nizamuddin Siddiqui
Updated on 07-Nov-2020 07:30:58

583 Views

Aesthetics are one of the most important aspects of a chart, so we should try to use the best possible aesthetic properties in a plot. In a bar chart, we can mark the center of the bars in many ways, and one such way is to draw dots at the center of the top edge of the bars. We can use the geom_point function with the colour argument to put points at the center of the top edge of the bars in a bar chart created by using ggplot2.

Example

Consider the below data frame:

> df
      x freq
1 Mango  212
2 Guava  220
3 ...
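The article's plotting code is truncated, so the following is only a sketch of the idea, assuming ggplot2 is installed. It uses just the two rows visible in the teaser; the colour and size values and the variable name p are arbitrary choices made here.

```r
library(ggplot2)

# Only the two rows visible in the truncated output are used
df <- data.frame(x = c("Mango", "Guava"), freq = c(212, 220))

# geom_point inherits the x and freq aesthetics, so each dot sits at
# (category, bar height), i.e. the center of the bar's top edge
p <- ggplot(df, aes(x, freq)) +
  geom_bar(stat = "identity") +
  geom_point(colour = "red", size = 3)
```

Because the point layer is added after the bar layer, the dots are drawn on top of the bars.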

How to find the cumulative sums by using two factor columns in an R data frame?

Nizamuddin Siddiqui
Updated on 07-Nov-2020 07:29:30

377 Views

Generally, cumulative sums are calculated for a single variable, and in some cases within the groups of a single categorical variable; situations where we want them for two categorical variables are less common. If we want cumulative sums grouped by two categorical variables, we can convert the data frame to a data.table object and use the cumsum function to define the column of cumulative sums.

Example

Consider the below data frame:

> set.seed(1361)
> df1
  Factor1 Factor2 Response
1       A      T2        9
2       B      T1        8
3       B      T1        2
4       A      T2        3
5       B ...
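A sketch of the grouped cumulative sum, assuming the data.table package is installed. The data frame below contains only the four complete rows visible in the teaser, and the column name CumSum is a hypothetical choice.

```r
library(data.table)

# The four complete rows visible in the truncated output
df1 <- data.frame(
  Factor1 = c("A", "B", "B", "A"),
  Factor2 = c("T2", "T1", "T1", "T2"),
  Response = c(9, 8, 2, 3)
)

# Convert to data.table, then add a running total within each
# (Factor1, Factor2) combination; := adds the column by reference
dt1 <- as.data.table(df1)
dt1[, CumSum := cumsum(Response), by = .(Factor1, Factor2)]

print(dt1)
```

The row order is preserved, so each row shows the running total of its own group up to that point.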

How to create a vector with lowercase as well as uppercase letters in R?

Nizamuddin Siddiqui
Updated on 07-Nov-2020 07:27:15

871 Views

To create a vector of lowercase letters we can use the built-in constant letters, and for uppercase letters the constant LETTERS. If we want a vector with both types of letters, the two constants can be combined using the c function, and if we want a vector of randomly sampled lowercase and uppercase letters, the sample function can be used.

Examples

> x1
[1] "A" "B" "C" "D" "a" "b" "c" "d"

> x2
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" ...
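The assignments are missing from the teaser, so this is a sketch of calls that would produce vectors of this kind. The first line is chosen to reproduce the x1 output shown; the seed and the sample size of 10 are arbitrary assumptions.

```r
# Combine the first four uppercase and the first four lowercase letters
x1 <- c(LETTERS[1:4], letters[1:4])

# Randomly sample 10 letters from the combined 52-letter pool
set.seed(42)  # arbitrary seed so the draw is reproducible
x2 <- sample(c(letters, LETTERS), 10)

print(x1)
print(x2)
```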

How to repeat a random sample in R?

Nizamuddin Siddiqui
Updated on 07-Nov-2020 07:25:40

3K+ Views

A random sample can be repeated by using the replicate function in R. For example, if we have a vector x that contains 1, 2, 3, 4, 5 and we want to repeat it five times, then replicate(5, x) can be used and the output will be a matrix of the below form:

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    1    1
[2,]    2    2    2    2    2
[3,]    3    3    3    3    3
[4,]    4    4    4    4    4
[5,]    5    5    5    5    5

Example 1

> x1
[1] ...
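replicate() re-evaluates its expression on every repetition, so passing an already-drawn vector gives identical columns, while passing the sampling call itself gives a fresh draw each time. The sketch below contrasts the two; the variable names and the seed are assumptions.

```r
x <- 1:5

# Repeating a fixed vector: every column is the same copy of x
m_fixed <- replicate(5, x)

# Repeating the sampling call: every column is a new permutation of x
set.seed(123)  # arbitrary seed so the draws are reproducible
m_random <- replicate(5, sample(x))

print(m_fixed)
print(m_random)
```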

How to find minimum value in a numerical vector which is written as a character vector in R?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 14:21:17

302 Views

To find the minimum value of a numeric vector we can use the min function directly, but if the values are stored as strings (written in double quotes), min compares them as text rather than as numbers and can return the wrong value. In this case, we have to wrap the vector in as.numeric so that it is converted to numeric form before finding the minimum. For example, if we have a character vector x that contains "1", "2", "3", and "4", then the minimum can be found as min(as.numeric(x)).

Example 1

x1
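A short sketch with a hypothetical character vector, chosen so that the text-based minimum and the numeric minimum differ:

```r
# Hypothetical numbers stored as strings
x <- c("10", "9", "25", "3")

# Character comparison goes left to right, so "10" sorts before "3"
min_chr <- min(x)

# Converting first gives the numeric minimum
min_num <- min(as.numeric(x))

print(min_chr)
print(min_num)
```

If any element cannot be parsed as a number, as.numeric returns NA for it with a warning, so min would need na.rm = TRUE in that case.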

How to subset a data.table in R by removing specific columns?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 14:19:05

520 Views

After gaining some experience with data frames, people often move on to data.table objects, which many find more convenient to work with. We frequently need to create a subset of a data.table object, and this can be done with square brackets. For example, if we have a data.table object called DT that contains 10 columns and we want to keep columns 1 to 8, we can use DT[,-c(9,10),with=FALSE] to create that subset.

Example

library(data.table)
x1
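The example is cut off in the teaser, so the following is a sketch on a smaller, hypothetical four-column table, assuming the data.table package is installed. It shows the negative-index form from the text and a name-based variant added here for comparison.

```r
library(data.table)

# Hypothetical table with four columns
DT <- data.table(x1 = 1:3, x2 = 4:6, x3 = 7:9, x4 = 10:12)

# Drop columns by position; with = FALSE makes j be read as column indices
sub_by_pos <- DT[, -c(3, 4), with = FALSE]

# Drop the same columns by name
sub_by_name <- DT[, !c("x3", "x4"), with = FALSE]

print(names(sub_by_pos))
print(names(sub_by_name))
```

Dropping by name is less fragile than dropping by position when the column order may change.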
