Mithilesh Pradhan


44 Articles Published

Articles by Mithilesh Pradhan

Page 2 of 5

How to Create simulated data for classification in Python

Mithilesh Pradhan
Updated on 26-Mar-2026 469 Views

In this tutorial, we will learn how to create simulated data for classification in Python using popular libraries such as scikit-learn and Faker. Introduction: Simulated data is data that does not come from a real phenomenon but is generated synthetically using parameters and constraints. This synthetic data mimics real-world patterns and relationships while remaining completely controllable. When and Why Do We Need Simulated Data? While prototyping an algorithm in Machine Learning or Deep Learning, we often face a scarcity of good real-world data. Sometimes there is no ...
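As a rough illustration of generating data "using parameters and constraints", the sketch below draws two labelled Gaussian clusters with only the Python standard library. It is not the scikit-learn or Faker code from the article (scikit-learn users would typically reach for `sklearn.datasets.make_classification`); the function name `make_two_class_data` and its parameters are assumptions made for this example.

```python
import random


def make_two_class_data(n_per_class=50, separation=3.0, seed=0):
    """Return (X, y): 2-D points from two Gaussian clusters with labels 0 and 1.

    Hypothetical helper for illustration; all parameters are assumptions.
    """
    rng = random.Random(seed)  # seeded so the data is reproducible
    X, y = [], []
    centers = [(0.0, 0.0), (separation, separation)]
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            # Each feature is the cluster centre plus unit-variance noise.
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    return X, y


X, y = make_two_class_data(n_per_class=50, separation=3.0, seed=42)
print(len(X), y.count(0), y.count(1))
```

Raising `separation` makes the two classes easier to tell apart, while lowering it makes them overlap, which is the kind of control that real data rarely offers.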


Improving model accuracy with cross validation technique

Mithilesh Pradhan
Updated on 26-Sep-2023 460 Views

Introduction: Cross-validation (CV) is a way of training machine learning models in which multiple models are trained on part of the data and their performance is then assessed on an independent, unseen portion of the data. In cross-validation, we generally split the original training data into different parts iteratively, so that the algorithm trains and validates itself on each portion of the data and none is left out of the process. In this article, let us build a good understanding of the cross-validation technique and its significance in improving model accuracy. Cross Validation ...
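The iterative splitting described above can be sketched as a plain k-fold index generator. This is a minimal standard-library illustration, not the article's code; the name `k_fold_indices` is hypothetical, and the sketch assumes the samples are already shuffled.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

A model would be fitted on each `train_idx` and scored on the matching `val_idx`, and the k scores averaged to estimate performance.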


Checking the normality of a data set or a feature

Mithilesh Pradhan
Updated on 26-Sep-2023 696 Views

Introduction: Normality is the property of following a normal (Gaussian) distribution. A normality test checks whether a dataset or variable follows a normal distribution. Many methods can be used to check the normality of a dataset, of which the most popular are the histogram, the QQ plot, and the KS test. Normality testing – Checking for Normality: There are both statistical and graphical approaches to determining the normality of a dataset or a feature. Let us look through some of these methods. Graphical Methods: Histogram ...
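To give a feel for the statistical side, here is a small standard-library sketch of the Kolmogorov-Smirnov (KS) statistic: the largest gap between the empirical CDF of the data and the CDF of a normal distribution fitted to it. The helper name `ks_statistic_vs_normal` is an assumption for this example, and the sketch returns only the statistic, not a p-value (estimating the mean and standard deviation from the same data changes the usual KS critical values).

```python
import random
from statistics import NormalDist, mean, stdev


def ks_statistic_vs_normal(data):
    """Largest gap between the empirical CDF and a normal CDF fitted to the data."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    gap = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x, so check both sides.
        gap = max(gap, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return gap


rng = random.Random(0)
normal_sample = [rng.gauss(10.0, 2.0) for _ in range(500)]
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]

ks_normal = ks_statistic_vs_normal(normal_sample)
ks_skewed = ks_statistic_vs_normal(skewed_sample)
print(ks_normal, ks_skewed)  # the skewed sample shows the larger gap
```

A small statistic suggests the data is close to normal; a large one, as for the exponential sample here, suggests it is not.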


What is OOB error?

Mithilesh Pradhan
Updated on 26-Sep-2023 641 Views

Introduction: OOB (Out-of-Bag) error and OOB score are terms related to Random Forests. A Random Forest is an ensemble of decision trees that improves on the prediction of a single decision tree. OOB error measures the prediction error of models trained with the bagging method, such as random forests and other bagged decision trees. For the OOB samples, the number of wrong classifications gives the OOB error. In this article, let's explore the OOB error and score. Before moving ahead, let us have a short overview of Random Forests and Decision Trees. Random Forest Algorithm: Random ...
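The idea of an out-of-bag sample can be sketched without any ML library: each tree is trained on a bootstrap sample drawn with replacement, and the rows that were never drawn are that tree's OOB rows. The helpers `bootstrap_split` and `oob_error` below are hypothetical names used only for illustration; in scikit-learn the same quantity is exposed through `RandomForestClassifier(oob_score=True)` and its `oob_score_` attribute.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


def oob_error(y_true, oob_predictions):
    """Fraction of wrong OOB predictions; None marks rows that were never out of bag."""
    scored = [(t, p) for t, p in zip(y_true, oob_predictions) if p is not None]
    wrong = sum(1 for t, p in scored if t != p)
    return wrong / len(scored)


rng = random.Random(0)
in_bag, out_of_bag = bootstrap_split(1000, rng)
# For large n, about (1 - 1/n)^n of the rows, roughly 37%, end up out of bag.
print(len(out_of_bag) / 1000)

# Toy labels and aggregated OOB predictions (made-up values for illustration).
print(oob_error([0, 1, 1, 0], [0, 1, 0, None]))
```

Because every row is out of bag for some trees, the forest can be scored on data each tree did not see, without setting aside a separate validation set.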


The Hathaway Effect: Is The Anne Hathaway Effect Really True?

Mithilesh Pradhan
Updated on 26-Sep-2023 1K+ Views

Introduction: Today, Machine Learning plays a crucial role in predicting stock prices and the growth of popular organizations and investment banks. While working on such problems, we consider many relations and correlations between different kinds of factors. The Anne Hathaway Effect is one such peculiar correlation, linking the actress Anne Hathaway with the well-known businessman and investor Warren Buffett and his company Berkshire Hathaway (BRK). In this article, let us learn more about the effects and observations around this phenomenon. The Anne Hathaway Effect: The Hathaway effect was first picked up by CNBC. According to this effect, whenever Anne ...


Techniques to find similarities in recommendation system

Mithilesh Pradhan
Updated on 26-Sep-2023 518 Views

Introduction: Similarity metrics are crucial in recommendation systems for finding users with similar behavior, patterns, or tastes. Nowadays, recommendation systems are found in many useful applications, such as movie recommendations on Netflix and product recommendations on e-commerce sites like Amazon. Organizations use preference matrices to capture user behavior and feedback data on specific attributes of products. They also capture the sequence and trend of users' purchases, and users with similar behavior are identified in the process. In this article, let's understand in brief the idea behind a recommendation system and explore the similarity techniques and measures involved in ...
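The excerpt does not say which similarity measures the article covers, so as an assumption this sketch uses cosine similarity, one common choice, on rows of a tiny made-up preference matrix. The user names and ratings are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two rating vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Rows of a toy preference matrix: one rating per item, 0 meaning "not rated".
ratings = {
    "alice": [5, 4, 0, 1],
    "bob": [4, 5, 0, 2],
    "carol": [1, 0, 5, 4],
}

sim_alice_bob = cosine_similarity(ratings["alice"], ratings["bob"])
sim_alice_carol = cosine_similarity(ratings["alice"], ratings["carol"])
print(sim_alice_bob, sim_alice_carol)  # alice is far closer to bob than to carol
```

A recommender built on this idea would suggest to a user the items that their most similar users rated highly.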


Limitations of fixed basis function

Mithilesh Pradhan
Updated on 26-Sep-2023 478 Views

Introduction: Fixed basis functions help us extend linear models in Machine Learning by taking linear combinations of nonlinear functions. Since a plain linear model is also a linear function of its inputs, it is significantly limited in what it can represent. Basis functions, such as radial basis functions, address this by capturing non-linearity in the data while keeping the model linear in its parameters. Linear regression then uses different linear combinations of the fixed basis functions to build complex functions. In this article, let us look into fixed basis functions and their limitations. Fixed Basis Function: A linear regression model ...
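A minimal sketch of "nonlinear in the input, linear in the parameters": the prediction is a weighted sum of fixed Gaussian basis functions. The choice of Gaussian bases, the centres, the width, and the weights are all assumptions made for this illustration, not values from the article.

```python
import math


def gaussian_basis(x, centers, width):
    """Return [1, phi_1(x), ..., phi_M(x)] with phi_j(x) = exp(-(x - c_j)^2 / (2 * width^2))."""
    return [1.0] + [math.exp(-((x - c) ** 2) / (2 * width ** 2)) for c in centers]


def predict(x, weights, centers, width):
    """y(x, w) = sum_j w_j * phi_j(x): linear in w, nonlinear in x."""
    phi = gaussian_basis(x, centers, width)
    return sum(w * p for w, p in zip(weights, phi))


centers = [-1.0, 0.0, 1.0]       # fixed in advance, not learned
width = 0.5
weights = [0.5, 1.0, -2.0, 1.0]  # bias weight followed by one weight per basis

for x in (-1.0, 0.0, 1.0):
    print(x, predict(x, weights, centers, width))
```

Because the centres and width are fixed before seeing the data, only `weights` would be fitted, which is what keeps the fitting problem linear.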


Handling sparsity issues in recommendation system

Mithilesh Pradhan
Updated on 22-Sep-2023 657 Views

Introduction: In recommendation systems, collaborative filtering is one approach to building a model and finding similarities between users. This concept is widely used on e-commerce sites and on OTT and video-sharing platforms. One of the most discussed issues such systems face in the initial modeling phase is data sparsity, which occurs when only a few users give ratings or reviews on the platform or otherwise interact with it. In this article, let us understand the problem of data sparsity in recommendation systems and look at ways to handle it. ...
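Data sparsity can be quantified as the share of empty cells in the user-item rating matrix. The sketch below measures it for a small made-up matrix; the helper name `sparsity` and the use of `None` for a missing rating are assumptions for this example, and it only measures the problem rather than showing any of the article's handling techniques.

```python
def sparsity(ratings):
    """Fraction of user-item cells that hold no rating (None)."""
    total = sum(len(row) for row in ratings)
    missing = sum(1 for row in ratings for r in row if r is None)
    return missing / total


# Rows are users, columns are items; None means the user never rated the item.
ratings = [
    [5, None, None, 1],
    [None, None, 4, None],
    [None, 2, None, None],
]

print(sparsity(ratings))  # 8 of the 12 cells are empty
```

The closer this value is to 1, the fewer overlapping ratings exist between users, and the harder it is for collaborative filtering to find reliable similarities.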


Difference Between Training and Testing Data

Mithilesh Pradhan
Updated on 22-Sep-2023 7K+ Views

Introduction: In Machine Learning, a good model can be built only if we have data that is representative and sufficient in amount. Data may be divided into different sets that serve different purposes while training a model. Two very common and useful sets are the training set and the testing set. The training set is the part of the original dataset used to train the model and find a good fit. The testing set is the part of the original dataset used to evaluate the trained model and analyze the metrics calculated. In this article, let us explore training and testing data sets in ...
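Dividing one dataset into a training part and a testing part can be sketched in a few lines of standard-library Python. The function name `split_train_test` and the 80/20 ratio are assumptions chosen for illustration; scikit-learn offers `sklearn.model_selection.train_test_split` for the same purpose.

```python
import random


def split_train_test(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and return (train, test) with no overlap."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


samples = list(range(10))
train, test = split_train_test(samples, test_fraction=0.2, seed=42)
print("train:", train)
print("test:", test)
```

Shuffling before splitting matters when the original data is ordered, for example by class or by date, since an unshuffled split could leave the test set unrepresentative.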


Non-Linear SVM in Machine Learning

Mithilesh Pradhan
Updated on 27-Aug-2023 4K+ Views

Introduction: Support Vector Machine (SVM) is one of the most popular supervised Machine Learning algorithms for classification as well as regression. The SVM algorithm strives to find the best decision boundary (a hyperplane) in n-dimensional data to separate it into classes. A new data point can thus be classified into one of these classes. The SVM algorithm creates two parallel margin hyperplanes, one on either side of the decision boundary, while maximizing the margin between them. The points that lie on these hyperplanes are known as support vectors, hence the name Support Vector Machine. The diagram below shows the decision boundary and hyperplanes for an SVM that is used to ...
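The "non-linear" part of the title can be illustrated with a toy example: 1-D data that no single threshold can separate becomes separable once each point is mapped to a higher-dimensional feature. This is only a sketch of the underlying idea, under the assumption that an explicit map x -> (x, x^2) stands in for what a kernel SVM does implicitly; it does not train an SVM, and the data and helper names are made up.

```python
def feature_map(x):
    """Map a 1-D input to 2-D so that distance from the origin becomes a feature."""
    return (x, x * x)


def separable_by_threshold(values, labels):
    """True if a single threshold on `values` puts each class on its own side."""
    pairs = sorted(zip(values, labels))
    changes = sum(1 for a, b in zip(pairs, pairs[1:]) if a[1] != b[1])
    return changes <= 1


xs = [-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0]
labels = [1, 1, 0, 0, 0, 1, 1]  # class 1 lies on both sides of class 0

print(separable_by_threshold(xs, labels))  # False in the original space

squared = [feature_map(x)[1] for x in xs]
print(separable_by_threshold(squared, labels))  # True after the mapping

# In the mapped space the straight line x^2 = 1.5 separates the two classes.
predicted = [1 if feature_map(x)[1] > 1.5 else 0 for x in xs]
print(predicted == labels)
```

A linear boundary in the mapped space corresponds to a non-linear boundary in the original space, which is why kernel-based SVMs can handle data like this.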

Showing 11–20 of 44 articles