R Programming Articles


How to remove rows for categorical columns that have three or fewer combinations of duplicates in an R data frame?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 416 Views

In data analysis, we sometimes fix the sample size by judgment, which can mean discarding part of the data. One such case is removing rows whose combination of categorical column values occurs three or fewer times. This can be done by grouping on those columns with the group_by function of the dplyr package and then keeping only the sufficiently frequent groups with the filter function.
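As a minimal sketch of that approach, the snippet below groups a small data frame by two categorical columns and keeps only combinations that occur more than three times. The data frame and the column names x1, x2 and y are made up for illustration and are not taken from the article.

```r
library(dplyr)

# Hypothetical data: two categorical columns and one numeric column
df <- data.frame(
  x1 = c("A", "A", "A", "A", "B", "B", "C", "C", "C", "C", "C"),
  x2 = c("u", "u", "u", "u", "v", "v", "w", "w", "w", "w", "w"),
  y  = 1:11
)

# Keep only rows whose (x1, x2) combination occurs more than three times;
# n() is the number of rows in the current group
result <- df %>%
  group_by(x1, x2) %>%
  filter(n() > 3) %>%
  ungroup()
```

Here the B/v combination appears only twice, so its rows are dropped, while the A/u and C/w rows are kept.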


How to create a large vector with repetitive elements of varying size in R?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 475 Views

To create a large vector of repetitive elements of varying size, we can use the rep function together with a logical vector as an index. The logical vector of TRUE and FALSE values determines which elements of the vector created by rep are kept and which are omitted. If the vector created by rep is longer than the logical vector, the logical vector is recycled.
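The idea can be sketched as follows, with hypothetical values chosen only for illustration: rep builds the repeated vector, and a shorter logical index is recycled across it.

```r
# Repeat 1, 2 and 3 a different number of times each
x <- rep(1:3, times = c(2, 3, 4))   # 1 1 2 2 2 3 3 3 3

# Logical index shorter than x: it is recycled as
# TRUE FALSE TRUE TRUE FALSE TRUE TRUE FALSE TRUE
keep <- c(TRUE, FALSE, TRUE)

y <- x[keep]                        # 1 2 2 3 3 3
```

Every second position in each block of three is dropped, which is how the pattern of the logical vector controls the final size of each run.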


How to find the percentage of missing values in an R data frame?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 1K+ Views

To find the percentage of missing values in an R data frame, we can use the sum function together with the prod function. For example, if a data frame called df contains some missing values, the percentage of missing values is the number of NA cells divided by the total number of cells, multiplied by 100: (sum(is.na(df))/prod(dim(df)))*100
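A small worked example of the command, using a made-up data frame with 12 cells, 4 of which are missing:

```r
# Hypothetical data frame: 4 rows x 3 columns, 4 NA cells in total
df <- data.frame(
  a = c(1, NA, 3, 4),
  b = c(NA, NA, "x", "y"),
  c = c(TRUE, FALSE, NA, TRUE)
)

# Missing cells divided by total cells (rows * columns), as a percentage
pct_missing <- (sum(is.na(df)) / prod(dim(df))) * 100   # 4 / 12 * 100

# Per-column percentages, if a breakdown is needed
pct_by_column <- colMeans(is.na(df)) * 100
```

The overall figure comes to about 33.3 percent. The per-column breakdown with colMeans is an addition for convenience and is not part of the command quoted above.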


How to replace the outliers with 5th and 95th percentile values in R?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 417 Views

There are many ways to define an outlying value, and the cutoff can be set manually by researchers or technicians. One common choice is to treat values below the 5th percentile as lower outliers and values above the 95th percentile as upper outliers, and to replace them with those percentile values. For this purpose, we can use the squish function of the scales package.
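A minimal sketch of this capping, assuming the percentiles are computed with quantile and its default method; the vector x is invented for illustration, and a base R equivalent is shown for comparison.

```r
library(scales)

# Hypothetical data with one extreme value at each end
x <- c(-50, 1:20, 200)

# 5th and 95th percentiles (quantile() default method)
limits <- unname(quantile(x, probs = c(0.05, 0.95)))

# squish() replaces values outside the range with the nearest limit
x_capped <- squish(x, range = limits)

# Base R equivalent, for comparison
x_capped_base <- pmin(pmax(x, limits[1]), limits[2])
```

Values inside the two limits are left unchanged, so only the tails of the distribution are affected.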


How to convert multiple columns into single column in an R data frame?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 10K+ Views

To convert multiple columns into a single column in an R data frame, we can use the unlist function. For example, if a data frame called df contains four columns, those columns can be stacked into a single column by using data.frame(x=unlist(df)).
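For illustration, the command can be applied to a hypothetical four-column data frame like this:

```r
# Hypothetical data frame with four integer columns of three rows each
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)

# unlist() concatenates the columns one after another, in column order
long <- data.frame(x = unlist(df))

# Number of rows equals rows * columns of the original data frame
n_rows <- nrow(long)   # 12
```

Note that unlist coerces all values to a common type, so this works most cleanly when the columns share a type.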


How to count the number of duplicate rows in an R data frame?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 3K+ Views

To count the number of duplicate rows in an R data frame, we can first convert the data frame into a data.table object with setDT and then count the rows in each group of identical rows with the special symbol .N, storing the result in a column named Count. For example, if we have a data frame called df, the number of occurrences of each distinct row is given by the command: setDT(df)[,list(Count=.N),names(df)].
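The following sketch applies the same grouping to a made-up data frame. It uses as.data.table instead of setDT so the original data frame is not modified by reference; that substitution, and the column names, are choices made here rather than part of the article.

```r
library(data.table)

# Hypothetical data frame with repeated rows
df <- data.frame(
  id    = c(1, 1, 2, 2, 2, 3),
  label = c("a", "a", "b", "b", "b", "c")
)

# Copy into a data.table (setDT(df) would convert df in place instead)
dt <- as.data.table(df)

# Group by every column; .N is the number of rows in each group
counts <- dt[, list(Count = .N), by = names(dt)]

# Total number of rows that repeat an earlier row
n_duplicates <- sum(counts$Count - 1)
```

The counts table has one row per distinct combination, so a Count greater than 1 marks a duplicated row.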


How to create a categorical variable using a data frame column in R?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 1K+ Views

A numerical variable can be converted into a categorical variable by defining lower and upper limits for each category. For example, ages from 21 to 25 can be placed in a category labelled 21-25. To convert an R data frame column into a categorical variable, we can use the cut function.
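As a short sketch, the age example might look like this; the data, the break points and the labels beyond 21-25 are assumptions added for illustration.

```r
# Hypothetical ages
df <- data.frame(age = c(21, 23, 25, 26, 30, 31, 35))

# With the default right = TRUE, the intervals are (20,25], (25,30], (30,35]
df$age_group <- cut(
  df$age,
  breaks = c(20, 25, 30, 35),
  labels = c("21-25", "26-30", "31-35")
)
```

The new column is a factor, so it can be used directly for grouping or tabulation.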


How to find the counts of categories in categorical columns in an R data frame?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 6K+ Views

If we have two categorical columns in an R data frame, we can find the frequency of each category in one column with respect to each category in the other column. This makes it easy to compare the frequencies across all categories. To find these counts, we can use the table function.
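A minimal illustration with two invented categorical columns, gender and group:

```r
# Hypothetical data frame with two categorical columns
df <- data.frame(
  gender = c("M", "F", "F", "M", "F", "M"),
  group  = c("a", "b", "a", "a", "b", "b")
)

# Cross-tabulation: rows are gender levels, columns are group levels
counts <- table(df$gender, df$group)

# Individual cells can be read by level name
f_in_b <- counts["F", "b"]   # 2
```

Passing a single column to table gives the plain frequency of each category in that column.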


How to replace vector values less than 2 with 2 in an R vector?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 201 Views

Suppose we have a vector that contains values less than, equal to, and greater than 2, and 2 is a lower threshold. To replace the values that are less than 2 with 2, we can use the pmax function. For example, for a vector x, this is done with pmax(x,2).
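For example, with a hypothetical vector chosen for illustration:

```r
# Hypothetical vector with values below, at, and above the threshold
x <- c(0, 1, 2, 3, 5, -4, 1.5)

# pmax() takes the element-wise maximum of x and 2 (2 is recycled)
x_floored <- pmax(x, 2)   # 2 2 2 3 5 2 2
```

Values already at or above 2 are returned unchanged.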


How to find the frequency of values greater than or equal to a certain value in R?

Nizamuddin Siddiqui
Updated on 11-Mar-2026 1K+ Views

In data analysis, we often need to compare values with a threshold using less than, less than or equal to, greater than, or greater than or equal to. Sometimes we also need the frequency of the values that satisfy the comparison. The sum function serves this purpose, because a comparison returns TRUE and FALSE values that sum counts as 1 and 0. For example, if a vector x has 10 integer values, then to check how many of them are greater than or equal to 10 we can use the command sum(x>=10).
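A small worked example of the command, using a made-up vector of 10 values:

```r
# Hypothetical vector of 10 values
x <- c(3, 10, 12, 7, 10, 25, 9, 11, 4, 10)

# The comparison gives a logical vector; sum() counts the TRUE values
n_at_least_10 <- sum(x >= 10)       # 6

# mean() of the same logical vector gives the proportion instead
share_at_least_10 <- mean(x >= 10)  # 0.6
```

If the vector may contain missing values, adding na.rm = TRUE to sum keeps the count from becoming NA.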
