- R Tutorial
- R - Home
- R - Overview
- R - Environment Setup
- R - Basic Syntax
- R - Data Types
- R - Variables
- R - Operators
- R - Decision Making
- R - Loops
- R - Functions
- R - Strings
- R - Vectors
- R - Lists
- R - Matrices
- R - Arrays
- R - Factors
- R - Data Frames
- R - Packages
- R - Data Reshaping

- R Data Interfaces
- R - CSV Files
- R - Excel Files
- R - Binary Files
- R - XML Files
- R - JSON Files
- R - Web Data
- R - Database

- R Charts & Graphs
- R - Pie Charts
- R - Bar Charts
- R - Boxplots
- R - Histograms
- R - Line Graphs
- R - Scatterplots

- R Statistics Examples
- R - Mean, Median & Mode
- R - Linear Regression
- R - Multiple Regression
- R - Logistic Regression
- R - Normal Distribution
- R - Binomial Distribution
- R - Poisson Regression
- R - Analysis of Covariance
- R - Time Series Analysis
- R - Nonlinear Least Square
- R - Decision Tree
- R - Random Forest
- R - Survival Analysis
- R - Chi Square Tests

- R Useful Resources
- R - Interview Questions
- R - Quick Guide
- R - Useful Resources
- R - Discussion

- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who

Factors are the data objects which are used to categorize the data and store it as levels. They can store both strings and integers. They are useful in the columns which have a limited number of unique values. Like "Male, "Female" and True, False etc. They are useful in data analysis for statistical modeling.

Factors are created using the **factor ()** function by taking a vector as input.

# Create a vector as input. data <- c("East","West","East","North","North","East","West","West","West","East","North") print(data) print(is.factor(data)) # Apply the factor function. factor_data <- factor(data) print(factor_data) print(is.factor(factor_data))

When we execute the above code, it produces the following result −

[1] "East" "West" "East" "North" "North" "East" "West" "West" "West" "East" "North" [1] FALSE [1] East West East North North East West West West East North Levels: East North West [1] TRUE

On creating any data frame with a column of text data, R treats the text column as categorical data and creates factors on it.

# Create the vectors for data frame. height <- c(132,151,162,139,166,147,122) weight <- c(48,49,66,53,67,52,40) gender <- c("male","male","female","female","male","female","male") # Create the data frame. input_data <- data.frame(height,weight,gender) print(input_data) # Test if the gender column is a factor. print(is.factor(input_data$gender)) # Print the gender column so see the levels. print(input_data$gender)

When we execute the above code, it produces the following result −

height weight gender 1 132 48 male 2 151 49 male 3 162 66 female 4 139 53 female 5 166 67 male 6 147 52 female 7 122 40 male [1] TRUE [1] male male female female male female male Levels: female male

The order of the levels in a factor can be changed by applying the factor function again with new order of the levels.

data <- c("East","West","East","North","North","East","West", "West","West","East","North") # Create the factors factor_data <- factor(data) print(factor_data) # Apply the factor function with required order of the level. new_order_data <- factor(factor_data,levels = c("East","West","North")) print(new_order_data)

When we execute the above code, it produces the following result −

[1] East West East North North East West West West East North Levels: East North West [1] East West East North North East West West West East North Levels: East West North

We can generate factor levels by using the **gl()** function. It takes two integers as input which indicates how many levels and how many times each level.

gl(n, k, labels)

Following is the description of the parameters used −

**n**is a integer giving the number of levels.**k**is a integer giving the number of replications.**labels**is a vector of labels for the resulting factor levels.

v <- gl(3, 4, labels = c("Tampa", "Seattle","Boston")) print(v)

When we execute the above code, it produces the following result −

Tampa Tampa Tampa Tampa Seattle Seattle Seattle Seattle Boston [10] Boston Boston Boston Levels: Tampa Seattle Boston

Advertisements