Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
Scientific Computing Articles
Page 3 of 3
Calculating the Hamming distance using SciPy
Hamming distance calculates the distance between two binary vectors. Mostly we find the binary strings when we use one-hot encoding on categorical columns of data. In one-hot encoding the integer variable is removed and a new binary variable will be added for each unique integer value. For example, if a column had the categories say ‘Length’, ‘Width’, and ‘Breadth’. We might one-hot encode each example as a bitstring with one bit for each column as follows −Length = [1, 0, 0]Width = [0, 1, 0]Breadth = [0, 0, 1]The Hamming distance between any of the two categories mentioned above, can ...
Read MoreWhat is the difference between scipy.cluster.vq.kmeans() and scipy.cluster.vq.kmeans2() methods?
The scipy.cluster.vq()has two methods to implement k-means clustering namely kmeans() and kmeans2(). There is a significant difference in the working of both these methods. Let us understand it −scipy.cluster.vq.kmeans(obs, k_or_guess, iter=20, thresh=1e-05, check_finite=True)− The kmeans() method forms k clusters by performing k-means algorithm on a set of observation vectors. To determine the stability of the centroids, this method uses a threshold value to compare the change in average Euclidean distance between the observations and their corresponding centroids. The output of this method is a code book mapping centroid to codes and vice versa.scipy.cluster.vq.kmeans2(data, k, iter=10, thresh=1e-05, minit='random', missing='warn', check_finite=True)− The ...
Read MoreWhat is scipy.cluster.vq.kmeans()method?
The scipy.cluster.vq.kmeans(obs, k_or_guess, iter=20, thresh=1e- 05, check_finite=True)method forms k clusters by performing a k-means algorithm on a set of observation vectors. To determine the stability of the centroids, this method uses a threshold value to compare the change in average Euclidean distance between the observations and their corresponding centroids. The output of this method is a code book mapping centroid to codes and vice versa.Below is given the detailed explanation of its parameters−Parametersobs− ndarrayIt is an ‘M’ by ‘N’ array where each row is an observation, and the columns are the features seen during each observation. Before using, these features ...
Read MoreWhich function of scipy.cluster.vq module is used to assign codes from a code book to observations?
Before implementing k-means algorithms, the scipy.cluster.vq.vq(obs, code_book, check_finite = True) used to assign codes to each observation from a code book. It first compares each observation vector in the ‘M’ by ‘N’ obs array with the centroids in the code book. Once compared, it assigns the code to the closest centroid. It requires unit variance features in the obs array, which we can achieve by passing them through the scipy.cluster.vq.whiten(obs, check_finite = True)function.ParametersBelow are given the parameters of the function scipy.cluster.vq.vq(obs, code_book, check_finite = True) −obs− ndarrayIt is an ‘M’ by ‘N’ array where each row is an observation, and ...
Read MoreWhich function of scipy.cluster.vq module is used to normalize observations on each feature dimension?
Before implementing k-means algorithms, it is always beneficial to rescale each feature dimension of the observation set. The function scipy.cluster.vq.whiten(obs, check_finite = True)is used for this purpose. To give it unit variance, it divides each feature dimension of the observation by its standard deviation (SD).ParametersBelow are given the parameters of the function scipy.cluster.vq.whiten(obs, check_finite = True) −obs− ndarrayIt is an array, to be rescaled, where each row is an observation, and the columns are the features seen during each observation. The example is given below −obs = [[ 1., 1., 1.], [ 2., 2., 2.], ...
Read MoreHow can we call the documentation for NumPy and SciPy?
If you are unsure of how to use a particular function or variable in NumPy and SciPy, you can call for the documentation with the help of ‘?’. In Jupyter notebook and IPython shell we can call up the documentation as follows −ExampleIf you want to know NumPy sin () function, you can use the below code −import numpy as np np.sin?OutputWe will get the details about sin() function something like as follows −We can also view the source with the help of double question mark (??) as follows −import numpy as np np.sin??Similarly, if you want to see the ...
Read MoreTo work with SciPy, do I need to import the NumPy functions explicitly?
When SciPy is imported, you do not need to explicitly import the NumPy functions because by default all the NumPy functions are available through SciPy namespace. But as SciPy is built upon the NumPy arrays, we must need to know the basics of NumPy.As most parts of linear algebra deals with vectors and matrices only, let us understand the basic functionalities of NumPy vectors and matrices.Creating NumPy vectors by converting Python array-like objectsLet us understand this with the help of following example−Exampleimport numpy as np list_objects = [10, 20, 30, 40, 50, 60, 70, 80, 90] array_new = np.array(list_objects) print ...
Read More