Which function of the scipy.cluster.vq module is used to assign codes from a code book to observations?


When working with k-means clustering, the scipy.cluster.vq.vq(obs, code_book, check_finite = True) function is used to assign a code from a code book to each observation. It compares each observation vector in the ‘M’ by ‘N’ obs array with the centroids in the code book and assigns to the observation the code of the closest centroid. The features in the obs array should have unit variance, which we can achieve by passing them through the scipy.cluster.vq.whiten(obs, check_finite = True) function.
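The workflow described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the data values are made up, and the initial centroids passed to kmeans (rows 0 and 3 of the whitened data) are an arbitrary choice made only to keep the result reproducible.

```python
import numpy as np
from scipy.cluster.vq import whiten, kmeans, vq

# Made-up data for illustration: 6 observations (rows), 2 features (columns).
# The two features live on very different scales.
obs = np.array([[1.0, 200.0],
                [1.2, 220.0],
                [0.8, 180.0],
                [5.0, 900.0],
                [5.2, 950.0],
                [4.8, 880.0]])

# Rescale every feature (column) to unit variance.
whitened = whiten(obs)

# Build a code book with k-means. Passing an array of initial centroids
# (an assumption of this sketch) avoids random initialisation.
initial_guess = whitened[[0, 3]]
code_book, mean_distortion = kmeans(whitened, initial_guess)

# Assign each whitened observation to its closest code.
codes, dist = vq(whitened, code_book)
print(codes)
print(dist)
```

Note that the observations passed to vq are the whitened ones, so they are on the same scale as the code book that was built from them.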

Parameters

The parameters of the function scipy.cluster.vq.vq(obs, code_book, check_finite = True) are given below −

  • obs − ndarray

It is an ‘M’ by ‘N’ array where each row is an observation, and the columns are the features seen during each observation. The example is given below −

obs = [[ 1., 1., 1.],
   [ 2., 2., 2.],
   [ 3., 3., 3.],
   [ 4., 4., 4.]]

  • code_book − ndarray

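It is a ‘k’ by ‘N’ array, usually generated by using the k-means algorithm, where each row holds a different code, and the columns are the features of that code. The number of codes ‘k’ does not have to match the number of observations ‘M’, but the number of features ‘N’ must be the same as in obs.

A standalone example of a code book with three codes and four features is given below −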

code_book = [
   [ 1., 2., 3., 4.],
   [ 1., 2., 3., 4.],
   [ 1., 2., 3., 4.]]

  • check_finite − bool, optional

This parameter controls whether the function checks that the input arrays contain only finite numbers. Disabling the check may give a performance gain, but it may also result in problems such as crashes or non-termination if the inputs do contain infinities or NaNs. The default value of this parameter is True.
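The sketch below illustrates how these parameters fit together. The array values are chosen only for illustration, and bad_book and obs_with_nan are hypothetical names introduced here to show inputs that vq is expected to reject.

```python
import numpy as np
from scipy.cluster.vq import vq

# obs: M = 4 observations, N = 3 features
obs = np.array([[1., 1., 1.],
                [2., 2., 2.],
                [3., 3., 3.],
                [4., 4., 4.]])

# code_book: k = 2 codes with the same N = 3 features as obs
code_book = np.array([[1., 1., 1.],
                      [4., 4., 4.]])

codes, dist = vq(obs, code_book)
print(codes, dist)

# A code book with a different number of features cannot be compared
# with obs, so the call raises an error.
bad_book = np.array([[1., 2., 3., 4.]])
try:
    vq(obs, bad_book)
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print("Feature mismatch:", err)

# With check_finite left at its default (True), NaN input is rejected.
obs_with_nan = obs.copy()
obs_with_nan[0, 0] = np.nan
try:
    vq(obs_with_nan, code_book)
    nan_rejected = False
except ValueError as err:
    nan_rejected = True
    print("Non-finite input:", err)
```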

Returns

  • code − ndarray

It returns an array of length ‘M’ which holds the code book index for each observation.

  • dist − ndarray

It also returns an array of length ‘M’ which holds the distance, also called the distortion, between each observation and its nearest code.
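To make the two return values concrete, the following sketch recomputes them by hand with NumPy and compares them with the output of vq. It assumes the distance is the ordinary Euclidean distance; the data values and the names pairwise, manual_code and manual_dist are made up for this illustration.

```python
import numpy as np
from scipy.cluster.vq import vq

# Illustrative data: 3 observations and 2 codes, each with 2 features.
obs = np.array([[0., 0.],
                [3., 4.],
                [6., 8.]])
code_book = np.array([[0., 0.],
                      [5., 5.]])

code, dist = vq(obs, code_book)

# Euclidean distance from every observation to every code: shape (M, k).
pairwise = np.linalg.norm(obs[:, None, :] - code_book[None, :, :], axis=2)

# Index of the nearest code and the distance to it, per observation.
manual_code = pairwise.argmin(axis=1)
manual_dist = pairwise.min(axis=1)

print(np.array_equal(code, manual_code))
print(np.allclose(dist, manual_dist))

# The mean of dist is a common summary of how well the code book fits.
print(dist.mean())
```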

Example

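import numpy as np
from scipy.cluster.vq import vq

# Code book with two codes (rows), each described by three features
code_book = np.array([[1., 1., 1.],
   [2., 2., 2.]])
# Three observations with the same three features
observations = np.array([[2.9, 1.3, 1.9],
   [1.7, 3.2, 1.1],
   [1.0, 0.2, 1.7]])
# Print the code index and the distance to that code for each observation
print(vq(observations, code_book))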

Output

(array([1, 1, 0]), array([1.14455231, 1.52970585, 1.06301458]))

Updated on: 24-Nov-2021
