Feature scaling is an important step in the data pre-processing stage of building machine learning algorithms. It normalizes the data so that it falls within a specific range. At times, it also speeds up the calculations performed by the machine. Why is it needed? Data fed to the learning algorithm as input should remain consistent and structured. All features of the input data should be on a single scale to effectively predict the values. But in the real world, data is unstructured and, most of the time, not on the same scale. This is when normalization comes into the picture. It is ... Read More
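To make "on a single scale" concrete, here is a minimal sketch of min-max scaling, one common form of normalization, written in plain Python. The function name `min_max_scale`, the `feature_range` parameter and the sample values are illustrative assumptions, not taken from the article.

```python
def min_max_scale(column, feature_range=(0.0, 1.0)):
    """Linearly rescale a list of numbers into feature_range (hypothetical helper)."""
    low, high = feature_range
    col_min, col_max = min(column), max(column)
    span = col_max - col_min
    if span == 0:
        # A constant feature has no spread, so map every value to the lower bound.
        return [low for _ in column]
    return [low + (x - col_min) * (high - low) / span for x in column]


# Made-up features measured on very different scales.
ages = [18, 30, 42, 66]
incomes = [20_000, 50_000, 80_000, 140_000]

scaled_ages = min_max_scale(ages)
scaled_incomes = min_max_scale(incomes)

print(scaled_ages)
print(scaled_incomes)
```

After scaling, both features lie in the same interval, so neither dominates a distance or gradient computation simply because of its units.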
Pre-processing data refers to cleaning the data, removing invalid data and noise, replacing data with relevant values, and so on. Data pre-processing basically refers to the task of gathering all the data (collected from various resources or a single resource) into a common format or into uniform datasets (depending on the type of data). The output of one step becomes the input to the next step, and so on. Mean values might have to be removed from the input data to get a specific result. Let us understand how this can be achieved using the scikit-learn library.
Example
import numpy as np
from sklearn import preprocessing
... Read More
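The article's own example is cut off above, so the following is only a rough sketch of what mean removal does, using the standard library rather than scikit-learn. The helper name `standardize` and the sample heights are assumptions made for illustration.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Subtract the mean and divide by the population standard deviation."""
    mean = fmean(column)
    std = pstdev(column)
    if std == 0:
        # A constant column has nothing to scale; centering alone gives zeros.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


# Made-up measurements, for illustration only.
heights = [150.0, 160.0, 170.0, 180.0]
scaled = standardize(heights)

print("mean after scaling:", fmean(scaled))
print("std after scaling: ", pstdev(scaled))
```

The transformed column has a mean of (approximately) zero and a standard deviation of one, which is the effect mean removal aims for.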
A decision tree is the basic building block of the random forest algorithm. It is considered one of the most popular algorithms in machine learning and is used for classification purposes. The decision given out by a decision tree can be used to explain why a certain prediction was made. This means the ins and outs of the process are clear to the user. Decision trees are also known as CART, i.e. Classification And Regression Trees. A decision tree can be visualized as a binary tree (the one studied in data structures and algorithms). Every node in the tree represents a single input ... Read More
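As a hedged illustration of how a single node of a classification tree might pick its test, the sketch below scores candidate thresholds on one input feature by weighted Gini impurity and keeps the best one. The function names `gini` and `best_split` and the toy measurements are assumptions for this sketch; it is not the article's code and it covers only one node, not a full tree.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best test 'value <= threshold'.

    Returns (None, parent_impurity) when no split improves on the parent node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up toy data: one numeric feature and a class label per sample.
petal_lengths = [1.4, 1.3, 1.5, 4.7, 4.5, 4.9]
species = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

threshold, impurity = best_split(petal_lengths, species)
print(threshold, impurity)
```

Because each node is just a readable test such as "petal length <= threshold", following the path from the root to a leaf explains why a given prediction was made.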
Scikit-learn, commonly known as sklearn, is a library in Python that is used for implementing machine learning algorithms. It is powerful and robust, since it provides a wide variety of tools to perform statistical modelling. This includes classification, regression, clustering, dimensionality reduction, and much more, with the help of a powerful and stable interface in Python. It is built on the NumPy, SciPy and Matplotlib libraries. Before passing the input data to the machine learning algorithm, it has to be split into training and test datasets. Once the chosen model is fit to the data, the model is trained on the input dataset. ... Read More
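The idea of splitting data into training and test sets can be sketched without any library, as below. scikit-learn offers `train_test_split` in `sklearn.model_selection` for this purpose; the hand-written helper `train_test_split_simple`, its parameters and the sample data here are assumptions for illustration only.

```python
import random


def train_test_split_simple(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Twenty made-up sample identifiers standing in for dataset rows.
samples = list(range(20))
train, test = train_test_split_simple(samples, test_fraction=0.25, seed=42)

print(len(train), "training rows,", len(test), "test rows")
```

Holding out the test rows means the model is evaluated on data it never saw during fitting, which gives a fairer estimate of how it will behave on new inputs.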
Visualizing data is an important step since it helps understand what is going on in the data without actually looking at the numbers and performing complicated computations. Seaborn is a library that helps in visualizing data. It comes with customized themes and a high-level interface. General scatter plots, histograms, etc. can’t be used when the variables that need to be worked with are categorical in nature. This is when categorical scatterplots need to be used. Plots such as ‘stripplot’ and ‘swarmplot’ are used to work with categorical variables. The ‘stripplot’ function is used when at least one of the variables is categorical. The ... Read More
Data present in large amounts needs to be dealt with properly. This is why computers with large capacities are used. Scientific and technical computations on large datasets can be done with the help of a Python library known as SciPy. SciPy is short for ‘Scientific Python’. The NumPy library is a prerequisite for SciPy because SciPy is built on top of NumPy. Ensure that the NumPy library is installed before installing the SciPy library. SciPy is open-source software that is easily available to install and use. It has many features of data science and machine learning that are required to ... Read More
Visualizing data is an important step since it helps understand what is going on in the data without actually looking at the numbers and performing complicated computations. Seaborn is a library that helps in visualizing data. It comes with customized themes and a high-level interface. Kernel Density Estimation, also known as KDE, is a method by which the probability density function of a continuous random variable can be estimated. It is a non-parametric method, meaning it does not assume that the data follow a particular distribution. While using ‘distplot’, if the argument ‘kde’ is set to True and ‘hist’ is set to False, the KDE can be visualized. Let ... Read More
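To show what a kernel density estimate actually computes, here is a minimal sketch using a Gaussian kernel in plain Python: one small bell curve is centred on each observation and the curves are averaged. The function name `gaussian_kde`, the fixed `bandwidth` value and the sample observations are assumptions for illustration; this is not how seaborn is implemented, and no plotting is done.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of samples at any point x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, then normalize.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Made-up observations forming two loose clusters.
observations = [1.0, 1.5, 2.0, 5.0, 5.5]
density = gaussian_kde(observations, bandwidth=0.5)

# Numerically integrate the estimate on a grid from -5 to 13 as a sanity check.
step = 0.05
grid = [i * step for i in range(-100, 261)]
area = sum(density(x) for x in grid) * step

print("density near a cluster:", density(1.5))
print("density between clusters:", density(3.5))
print("approximate area under the curve:", area)
```

The estimate is high where observations cluster and low in the gap between them, and the area under the curve is approximately one, as expected of a probability density.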
Scikit-learn, commonly known as sklearn, is an open-source library in Python that is used for implementing machine learning algorithms. This includes classification, regression, clustering, dimensionality reduction, and much more, with the help of a powerful and stable interface in Python. This library is built on the NumPy, SciPy and Matplotlib libraries. Let us see an example to load data −
Example
from sklearn.datasets import load_iris
my_data = load_iris()
X = my_data.data
y = my_data.target
feature_name = my_data.feature_names
target_name = my_data.target_names
print("Feature names are : ", feature_name)
print("Target names are : ", target_name)
print("First 8 rows of the dataset are : ", X[:8])
Output
Feature ... Read More
Scikit-learn, commonly known as sklearn, is a library in Python that is used for implementing machine learning algorithms. It is an open-source library, hence it can be used free of cost. It is powerful and robust, since it provides a wide variety of tools to perform statistical modelling. This includes classification, regression, clustering, dimensionality reduction, and much more, with the help of a powerful and stable interface in Python. This library is built on the NumPy, SciPy and Matplotlib libraries. It can be installed using the ‘pip’ command as shown below −
pip install scikit-learn
This library focuses on data modelling. There are many models ... Read More
Identity Matrix
An identity matrix is an n × n square matrix in which the diagonal consists of ones and all other elements are zeros. For example, an identity matrix of order 3 will be −
const arr = [ [1, 0, 0], [0, 1, 0], [0, 0, 1] ];
We are required to write a JavaScript function that takes in a number, say n, and returns an identity matrix of order n × n.
Example
Following is the code −
const num = 5;
const constructIdentity = (num = 1) => {
   const res = [];
   for(let i = 0; ... Read More