Sub-packages in Python SciPy Library

Gaurav Kumar
Updated on 14-Dec-2021 11:29:20

1K+ Views

To cover different scientific computing domains, the SciPy library is organized into various sub-packages. These sub-packages are explained below −
Clustering package (scipy.cluster) − This package contains clustering algorithms that are useful in information theory, target detection, compression, communications, and other areas. It has two modules, namely scipy.cluster.vq and scipy.cluster.hierarchy. As the names imply, the vq module supports only vector quantization and k-means algorithms, whereas the hierarchy module provides functions for hierarchical and agglomerative clustering.
Constants (scipy.constants) − It contains mathematical and physical constants. Mathematical constants include pi, golden and golden_ratio. Physical constants include c, speed_of_light, Planck, gravitational_constant, etc.
Legacy ... Read More
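As a rough illustration of the two sub-packages described above, the sketch below reads a few values from scipy.constants and runs a small agglomerative clustering with scipy.cluster.hierarchy. The six sample points and the choice of Ward linkage are assumptions made for this example only.

```python
import math

import numpy as np
from scipy import constants
from scipy.cluster.hierarchy import fcluster, linkage

# Mathematical constants
pi_matches = constants.pi == math.pi
golden = constants.golden              # same value as constants.golden_ratio

# Physical constants in SI units; several have a short and a long name
c = constants.c                        # same value as constants.speed_of_light
h = constants.Planck                   # same value as constants.h
G = constants.gravitational_constant   # same value as constants.G
print(pi_matches, golden, c, h, G)

# Agglomerative clustering of six hypothetical 2-D points (two obvious groups)
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
Z = linkage(points, method="ward")                 # build the merge tree
labels = fcluster(Z, t=2, criterion="maxclust")    # cut it into 2 flat clusters
print(labels)
```

The first three points should share one label and the last three the other, because the two groups are far apart compared with the distances inside each group.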

What is SciPy and Why Should We Use It

Gaurav Kumar
Updated on 14-Dec-2021 11:27:50

457 Views

SciPy, pronounced as “Sigh Pie”, is an ecosystem of open-source Python libraries for performing mathematical, scientific, and engineering computations. SciPy stands for Scientific Python and comprises the following core packages, together called the SciPy ecosystem −
NumPy − NumPy is the base N-dimensional array package for SciPy that allows us to work efficiently with data in arrays.
Matplotlib − Matplotlib is used to create comprehensive 2-D charts and plots from data.
Pandas − Pandas is an open-source Python package used to organize and analyze our data.
Apart from the SciPy ecosystem, SciPy also refers to other related but distinct entities −
Community − It refers to the community of ... Read More

Calculate Minkowski Distance Using SciPy

Gaurav Kumar
Updated on 14-Dec-2021 10:38:44

621 Views

The Minkowski distance is a generalization of the Euclidean and Manhattan distances between two points. It is mostly used to measure the similarity of vectors. Below is the generalized formula to calculate the Minkowski distance in n-dimensional space −
$$\mathrm{D= \big[\sum_{i=1}^{n}|r_i-s_i|^p\big]^{1/p}}$$
Here,
s_i and r_i are the i-th coordinates of the data points s and r.
n denotes the number of dimensions of the space.
p represents the order of the norm.
SciPy provides us with a function named minkowski that returns the Minkowski distance between two points. Let’s see how we can calculate the Minkowski distance between two points using the SciPy library −
Example
# Importing the SciPy library
from scipy.spatial import distance
# Defining the points
A = ... Read More
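A minimal sketch of the minkowski function for several orders p follows. The two points r and s are hypothetical values chosen for this illustration; they are not the points from the truncated example above.

```python
from scipy.spatial import distance

# Two hypothetical points in 3-dimensional space
r = (1, 2, 3)
s = (4, 6, 3)

# The coordinate differences are 3, 4 and 0
d1 = distance.minkowski(r, s, p=1)   # p = 1 is the Manhattan distance: 3 + 4 + 0
d2 = distance.minkowski(r, s, p=2)   # p = 2 is the Euclidean distance: sqrt(9 + 16)
d3 = distance.minkowski(r, s, p=3)   # p = 3: (27 + 64) ** (1 / 3)

print(d1, d2, d3)
```

Raising p gives more weight to the largest coordinate difference, which is why the result shrinks from 7 (p = 1) to 5 (p = 2) and further for p = 3.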

Calculate Manhattan Distance Using SciPy

Gaurav Kumar
Updated on 14-Dec-2021 10:24:49

1K+ Views

The Manhattan distance, also known as the City Block distance, is calculated as the sum of the absolute differences between the corresponding components of two vectors. It is mostly used for vectors that describe objects on a uniform grid, such as a city block or a chessboard. Below is the generalized formula to calculate the Manhattan distance in n-dimensional space −
$$\mathrm{D =\sum_{i=1}^{n}|r_i-s_i|}$$
Here,
s_i and r_i are the i-th coordinates of the data points s and r.
n denotes the number of dimensions of the space.
SciPy provides us with a function named cityblock that returns the Manhattan distance between two points. Let’s see how we can calculate the Manhattan distance between two points using the SciPy library −
Example
# Importing the SciPy library ... Read More
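The formula can be checked against the cityblock function with a short sketch. The points r and s below are hypothetical values picked for this illustration.

```python
from scipy.spatial import distance

# Two hypothetical points in 3-dimensional space
r = (1, 2, 3)
s = (4, 0, 8)

# SciPy's Manhattan (city block) distance
d_scipy = distance.cityblock(r, s)

# The same sum of absolute differences written out by hand: 3 + 2 + 5
d_manual = sum(abs(ri - si) for ri, si in zip(r, s))

print(d_scipy, d_manual)
```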

Calculate Euclidean Distance Using SciPy

Gaurav Kumar
Updated on 14-Dec-2021 10:24:02

992 Views

Euclidean distance is the straight-line distance between two real-valued vectors. We mostly use it to calculate the distance between two rows of data that hold numerical values (floating-point or integer). Below is the formula to calculate the Euclidean distance −
$$\mathrm{d(r, s) =\sqrt{\sum_{i=1}^{n}(s_i-r_i)^2} }$$
Here,
r and s are the two points in Euclidean n-space.
s_i and r_i are the i-th coordinates of s and r.
n denotes the number of dimensions of the space.
Let’s see how we can calculate the Euclidean distance between two points using the SciPy library −
Example
# Importing the SciPy library
from scipy.spatial import distance
# Defining the points
A = (1, 2, 3, 4, 5, 6)
B = (7, 8, 9, 10, 11, ... Read More
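Since the example above is cut off, here is a separate small sketch of the same idea using scipy.spatial.distance.euclidean. The points r and s are hypothetical and were chosen so that the result is a whole number.

```python
import math

from scipy.spatial import distance

# Two hypothetical points in 3-dimensional space
r = (0, 0, 0)
s = (2, 3, 6)

# SciPy's Euclidean distance
d_scipy = distance.euclidean(r, s)

# The formula written out by hand: sqrt(4 + 9 + 36)
d_manual = math.sqrt(sum((si - ri) ** 2 for ri, si in zip(r, s)))

print(d_scipy, d_manual)
```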

Implement K-Means Clustering on Diabetes Dataset with SciPy

Gaurav Kumar
Updated on 14-Dec-2021 08:59:17

850 Views

The Pima Indian Diabetes dataset, which we will be using here, is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. Based on the following diagnostic factors, this dataset can be used to place a patient in either a diabetic cluster or a non-diabetic cluster −
Pregnancies
Glucose
Blood Pressure
Skin Thickness
Insulin
BMI
Diabetes Pedigree Function
Age
You can get this dataset in .CSV format from the Kaggle website.
Example
The example below uses the SciPy library to create two clusters, namely diabetic and non-diabetic, from the Pima Indian Diabetes dataset.
#importing the required Python libraries:
import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster.vq import whiten, kmeans, vq ... Read More

Implement K-Means Clustering with SciPy by Splitting Random Data into 3 Clusters

Gaurav Kumar
Updated on 14-Dec-2021 08:48:44

180 Views

We can also implement the K-means clustering algorithm by splitting random data into 3 clusters. Let us understand with the example below −
Example
#importing the required Python libraries:
import numpy as np
from numpy import vstack, array
from numpy.random import rand
from scipy.cluster.vq import whiten, kmeans, vq
from pylab import plot, show
#Random data generation:
data = vstack((rand(200, 2) + array([.5, .5]), rand(150, 2)))
#Normalizing the data:
data = whiten(data)
# computing K-Means with K = 3 (3 clusters)
centroids, mean_value = kmeans(data, 3)
print("Code book :", centroids, "")
print("Mean of Euclidean distances :", mean_value.round(4))
... Read More
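The listing above is truncated before it uses the imported vq function. The following sketch is one possible way to assign each observation to its nearest centroid. It is an assumption about how such a pipeline might continue, not the original author's code, and the seeded generator is used only to make the random data repeatable.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq, whiten

# Random data: 200 points shifted by (0.5, 0.5) plus 150 unshifted points
rng = np.random.default_rng(0)
data = np.vstack((rng.random((200, 2)) + np.array([0.5, 0.5]),
                  rng.random((150, 2))))

# Scale each feature to unit variance before clustering
data = whiten(data)

# Ask for 3 centroids; kmeans can return fewer if a cluster ends up empty
centroids, mean_dist = kmeans(data, 3)

# Assign every observation to its nearest centroid
labels, dists = vq(data, centroids)

# Number of observations in each cluster
counts = np.bincount(labels, minlength=len(centroids))
print(centroids)
print(counts)
```

Because kmeans picks its starting centroids at random, the exact centroids and cluster sizes can differ from run to run.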

Implement K-Means Clustering with Scipy on Random Data

Gaurav Kumar
Updated on 14-Dec-2021 08:42:53

433 Views

The K-means clustering algorithm, also called flat clustering, is a method of computing the clusters and cluster centers (centroids) in a set of unlabeled data. It iterates until the centroids converge. We can think of a cluster as a group of data points whose inter-point distances are small compared with the distances to points outside the cluster. The number of clusters identified from the unlabeled data is represented by ‘K’ in the K-means algorithm.
Given an initial set of K centers, K-means clustering can be carried out with the SciPy library through the following steps −
Step 1 − Data point ... Read More
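To make the iteration concrete, here is a minimal plain-NumPy sketch of the standard K-means loop (Lloyd's algorithm), starting from an initial set of K centers. It is an illustration under stated assumptions, not SciPy's implementation and not the truncated steps above; the function name lloyd_kmeans, the sample points, and the initial centers are all hypothetical.

```python
import numpy as np

def lloyd_kmeans(points, centers, max_iter=100):
    """Minimal K-means (Lloyd's algorithm) from an initial set of K centers."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(max_iter):
        # Assignment step: label each point with its nearest center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its assigned points;
        # a center with no assigned points keeps its previous position
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = points[labels == k]
            if len(members) > 0:
                new_centers[k] = members.mean(axis=0)
        # Stop once the centers no longer move
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Six hypothetical points forming two well-separated groups
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
initial = np.array([[0.0, 0.0], [10.0, 10.0]])

centers, labels = lloyd_kmeans(points, initial)
print(centers)
print(labels)
```

With these inputs the loop settles after two passes: each center ends up at the mean of its group of three points.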

Canva or Adobe Spark: Which is Better?

Zahwah Jameel
Updated on 09-Dec-2021 11:35:59

273 Views

Starting out in graphic design? Confused about which platform would be best for you? This short article may help you decide. Adobe Spark and Canva are among the biggest emerging names in the world of graphic design, so here is a comparative study to help you decide which one will suit you best.
What is Adobe Spark?
Adobe Spark is a design platform that can be used to create short videos, webpages, and designs. It also allows you to share these creations on social media platforms.
What is Canva?
Canva is a free graphic design platform that aids ... Read More

Tips to Create an Engaging Infographic in Canva

Zahwah Jameel
Updated on 09-Dec-2021 11:10:00

201 Views

How to Make Your Infographic Stand Out?
Stuck in a creative block? Don’t know how to express your ideas effectively through infographics? This article will help you kick-start your journey of creating an engaging infographic using Canva. Here we will highlight some of the major points that you need to keep in mind in order to create an engaging infographic.
Focus on the Targeted Audience
When creating an infographic, there is always a vision in mind, whether it’s a small business looking to increase its traffic on social platforms or an educator looking for an effective method to educate their students. For this, ... Read More
