MATLAB - Categorical Arrays



A categorical array is a MATLAB data type for working with discrete data. Discrete data consists of distinct, separate values or categories. It can only take on specific values from a finite set, often whole numbers or named categories.

A categorical array offers effective storage and user-friendly handling of non-numeric data, and also helps in preserving descriptive labels for the data values.

Here are a few advantages of using categorical arrays −

  • Categorical arrays use memory more efficiently than cell arrays or regular arrays of strings. This is particularly beneficial when dealing with large datasets.
  • Categorical arrays enable faster data manipulation and analysis. Many functions and operations are optimized for categorical arrays, leading to improved performance compared to working with strings.
  • Categorical arrays make it easy to manage and maintain consistent categories throughout your data.
  • Visualizations created using categorical arrays often have clearer labels and legends, making it easier to convey insights from your data.
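
As a rough illustration of the memory point above, the following sketch compares the storage that whos reports for the same labels held in a cell array and in a categorical array. The variable names and the data size are arbitrary choices made for this illustration, and the exact byte counts depend on the MATLAB version.

```matlab
% Hypothetical data: 30,000 color labels stored in a cell array
labels = repmat({'Red'; 'Blue'; 'Green'}, 10000, 1);

% The same labels stored as a categorical array
colors = categorical(labels);

% whos returns a struct whose bytes field holds the storage size
cellInfo = whos('labels');
catInfo = whos('colors');

fprintf('cell array:  %d bytes\n', cellInfo.bytes);
fprintf('categorical: %d bytes\n', catInfo.bytes);
```

The categorical array stores each distinct label once and keeps a small integer code per element, which is why its reported size should be noticeably smaller.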

There are also a few disadvantages of using categorical arrays −

  • Categorical arrays are most suitable for discrete data. If your data is continuous or numeric, categorical arrays might not be the best option, as they are primarily designed for non-numeric, categorical information.
  • Arithmetic operations such as addition are not defined for categorical arrays, so numeric computations require converting the data to another type first.
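
One practical limitation is that arithmetic is not defined on categorical values. The sketch below, which uses hypothetical variable names, shows that adding a number raises an error and that double() returns the category index of each element rather than any original numeric value.

```matlab
colors = categorical({'Red', 'Blue', 'Green'});

% Arithmetic on a categorical array raises an error
try
    shifted = colors + 1;
    plusWorked = true;
catch
    plusWorked = false;
end

% double() returns each element's position in categories(colors),
% which is the sorted list Blue, Green, Red for this input
codes = double(colors);
```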

Creating Categorical Arrays

Categorical arrays can be created in MATLAB by using −

  • categorical() Function
  • discretize() Function

Using categorical() Function

Arrays of strings or numbers can be converted into a categorical array by using the categorical() function.

Syntax

C = categorical(A)

Here C is the categorical array created from the given array A.

Example 1

This example uses a cell array of character vectors, as shown below.

data = {'Red', 'Blue', 'Green', 'Red', 'Green'};
primary_colors  = categorical(data)

The cell array is converted into a categorical array by using the function categorical().

When you execute this in the MATLAB command window, the output is −

>> data = {'Red', 'Blue', 'Green', 'Red', 'Green'};
primary_colors  = categorical(data)

primary_colors = 

   1x5 categorical array

   Red      Blue      Green      Red      Green 

>> 

To list the categories in primary_colors, use the categories() function, which returns the unique categories of a given categorical array.

categories(primary_colors)

The function returns the distinct categories present in the categorical array primary_colors. Because no category order was specified, they are listed in alphabetical order.

When you execute the same in the MATLAB command window, the output is as follows −

>> categories(primary_colors)

ans =

   3x1 cell array
   
   {'Blue' }
   {'Green'}
   {'Red'  }

>>
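
Beyond listing categories, a few other built-in functions are commonly used to inspect a categorical array. The sketch below assumes the primary_colors data from this example. The variable names counts and isRed are chosen here for illustration.

```matlab
data = {'Red', 'Blue', 'Green', 'Red', 'Green'};
primary_colors = categorical(data);

% Number of elements in each category, in the order given by categories()
counts = countcats(primary_colors);

% Element-wise comparison against a category name gives a logical array
isRed = primary_colors == 'Red';

% Print a per-category summary
summary(primary_colors)
```

Comparing against a category name in this way is a convenient route to logical indexing, for example primary_colors(isRed).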

Example 2

In this example we are going to create a numeric array first and later assign categories to the integer values.

A = [1 3 2; 2 1 3; 3 1 2]

C = categorical(A,[1 2 3],{'AA' 'BB' 'CC'})

In the above example, A is a matrix with 3 rows and 3 columns. Each element of this matrix is an integer.

Now the code C = categorical(A,[1 2 3],{'AA' 'BB' 'CC'}) converts the numeric array A into a categorical array C using specified categories.

  • A is the array you want to convert.
  • [1 2 3] specifies the distinct integer values present in the array A.
  • {'AA' 'BB' 'CC'} provides the corresponding categories you want to assign to the integer values 1, 2, and 3.

Each element of the categorical array C corresponds to the elements of the original array A, and it maps the integer values to their corresponding categories. For instance −

  • The integer 1 in A is mapped to 'AA' in C.
  • The integer 2 in A is mapped to 'BB' in C.
  • The integer 3 in A is mapped to 'CC' in C.

On execution in MATLAB, the output is −

>> A = [1 3 2; 2 1 3; 3 1 2]

C = categorical(A,[1 2 3],{'AA' 'BB' 'CC'})

A =

   1     3     2
   2     1     3
   3     1     2

C = 

  3x3 categorical array

   AA      CC      BB 
   BB      AA      CC 
   CC      AA      BB 

>> 
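
The valueset argument also determines what happens to values it does not list. As a small sketch (the input [1 4 2] is made up for illustration), an element of the input that is missing from the valueset becomes <undefined> in the result, which isundefined() can detect.

```matlab
% 4 does not appear in the valueset [1 2 3]
A = [1 4 2];
C = categorical(A, [1 2 3], {'AA' 'BB' 'CC'});

% Logical mask marking the elements that have no category
missing = isundefined(C);

% Category names of the defined elements only
names = cellstr(C(~missing));
```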

Example 3

In this example we are going to create an ordinal categorical array.

In an ordinal categorical array, the categories have a defined order or hierarchy, so elements can be compared with relational operators such as < and >.

A = [3 2;3 3;3 2;2 1;3 2]
valueset = [1:3];
categorynames = {'poor' 'fair' 'good'};

B = categorical(A,valueset,categorynames ,'Ordinal',true)

The matrix A has 5 rows and 2 columns, and each element is an integer.

valueset is a vector [1, 2, 3] which defines the possible integer values present in the matrix A. This represents the levels of the ordinal categories.

categorynames is a cell array {'poor', 'fair', 'good'} that assigns the corresponding category names to the values in valueset. The order of category names corresponds to the order of values in valueset.

The code B = categorical(A, valueset, categorynames, 'Ordinal', true) converts the numeric array A into an ordinal categorical array B using the specified valueset and categorynames.

  • A is the numeric array to be converted.
  • valueset defines the distinct integer values present in the array.
  • categorynames provides the corresponding category names for the integer values.
  • 'Ordinal', true specifies that the categorical array B should be treated as ordinal, meaning that the order of categories matters.

On execution, the output is as follows −

>> A = [3 2;3 3;3 2;2 1;3 2]
valueset = [1:3];
categorynames = {'poor' 'fair' 'good'};

B = categorical(A,valueset,categorynames ,'Ordinal',true)

A =

   3     2
   3     3
   3     2
   2     1
   3     2

B = 

  5x2 categorical array

   good      fair 
   good      good 
   good      fair 
   fair      poor 
   good      fair 

>> 
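
Because B is ordinal, relational operators and functions such as min() work on it. The following sketch reuses the data from this example. The names atLeastFair, worst, and firstHigher are illustrative.

```matlab
A = [3 2; 3 3; 3 2; 2 1; 3 2];
B = categorical(A, 1:3, {'poor' 'fair' 'good'}, 'Ordinal', true);

% Compare every element against a category name
atLeastFair = B >= 'fair';

% Smallest category present, following the order poor < fair < good
worst = min(B(:));

% Row-wise comparison of the first column against the second
firstHigher = B(:,1) > B(:,2);
```

These comparisons are only defined because 'Ordinal' was set to true. On a non-ordinal categorical array, only equality tests such as == are available.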

Using discretize() Function

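Another way to group numeric data into categories is the discretize() function in MATLAB.

This function takes a numeric array and assigns each element to a bin based on specified edges or the number of desired bins. By default it returns the bin number of each element. When called with the 'categorical' option, it returns a categorical array instead.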

Syntax

Y = discretize(X,edges)

Here

X − The input data that you want to discretize.

edges − The edges of the bins or intervals into which you want to categorize the data.

Example

scores = [68, 75, 82, 90, 55, 78, 92, 60, 88, 72];
edges = [0, 60, 80, 100];
bins = discretize(scores, edges)

In this example, the edges array defines three bins, and discretize() returns the bin number of each score. Scores from 0 up to but not including 60 fall in bin 1 ("Low"), scores from 60 up to but not including 80 fall in bin 2 ("Medium"), and scores from 80 to 100 inclusive fall in bin 3 ("High").

On execution, the output is as follows −

>> scores = [68, 75, 82, 90, 55, 78, 92, 60, 88, 72];
edges = [0, 60, 80, 100];
bins = discretize(scores, edges)

bins =

   2     2     3     3     1     2     3     2     3     2
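
To obtain an actual categorical array instead of bin numbers, discretize() also accepts a 'categorical' option followed by the category names. A minimal sketch using the same scores and edges follows. The variable name grades and the three labels are chosen here for illustration.

```matlab
scores = [68, 75, 82, 90, 55, 78, 92, 60, 88, 72];
edges = [0, 60, 80, 100];

% One name per bin: [0,60), [60,80), [80,100]
grades = discretize(scores, edges, 'categorical', {'Low', 'Medium', 'High'});

% Count how many scores fall in each named bin
summary(grades)
```

Using a name such as grades also avoids creating a variable called categories, which would shadow the categories() function used earlier.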