Mithilesh Pradhan - Published 44 Articles

Exploring Data Distribution

Machine Learning Python Server Side Programming Programming

Mithilesh Pradhan

Updated on 27-Feb-2025 722 Views

Introduction The distribution of data gives us useful insights while working on any data science or machine learning use case. Data distribution describes how the data is spread: its present condition, information about specific parts of the data, any outliers, and its central tendencies. To explore the data distribution, there are popular graphical methods that prove beneficial while working with the data. In this article let us explore these methods. Know more about your data: The Graphical Way Histograms & KDE Density Plots Histograms are the most ...

Improving model accuracy with cross validation technique

Machine Learning Artificial Intelligence Data Science

Mithilesh Pradhan

Updated on 26-Sep-2023 422 Views

Introduction Cross Validation (CV) is a way of training machine learning models in which multiple models are each trained on a part of the data and then assessed on an independent, unseen set of data. In the cross-validation technique, we generally split the original training data into different parts iteratively so that the algorithm trains and validates itself on each portion of the data and none of them is left out in the process. In this article let us build a good understanding of the cross-validation technique and its significance in improving model accuracy. Cross Validation ...

Checking the normality of a data set or a feature

Machine Learning Artificial Intelligence Data Science

Mithilesh Pradhan

Updated on 26-Sep-2023 670 Views

Introduction In statistical terms, normality is the property of belonging to a normal (Gaussian) distribution. A normality test checks whether a dataset or variable follows a normal distribution. Many tests can be performed to check the normality of a dataset, among which the most popular are the histogram method, the QQ plot, and the KS test. Normality testing – Checking for Normality There are both statistical and graphical approaches to determining the normality of a dataset or a feature. Let us look through some of these methods. Graphical Methods Histogram ...

What is OOB error?

Machine Learning Artificial Intelligence Data Science

Mithilesh Pradhan

Updated on 26-Sep-2023 587 Views

Introduction OOB (Out-of-Bag) error and OOB score are terms related to Random Forests. Random Forest is an ensemble of decision trees that improves on the prediction of a single decision tree. OOB error is used to measure the prediction error of tree-based models such as random forests, decision trees, and other ML models that use the bagging method. The number of wrong classifications in an OOB sample is the OOB error. In this article let's explore OOB error/score. Before moving ahead, let us take a short overview of Random Forest and Decision Trees. Random Forest Algorithm Random ...

The Hathaway Effect: Does The Anne Hathaway Effect Really True?

Machine Learning Artificial Intelligence Data Science

Mithilesh Pradhan

Updated on 26-Sep-2023 971 Views

Introduction Today Machine Learning plays a crucial role in predicting stock prices and the growth of popular organizations and investment banks. While working on many such problems we consider many relations and correlations between different kinds of factors. The Anne Hathaway Effect is one such peculiar correlation involving the popular businessman and investor Warren Buffett, the actress Anne Hathaway, and his company Berkshire Hathaway (BRK). In this article let us learn more about the effects and observations around this phenomenon. The Anne Hathaway Effect The Hathaway effect news was first picked up by CNBC. According to this effect, whenever Anne ...

Techniques to find similarities in recommendation system

Machine Learning Techniques Organization

Mithilesh Pradhan

Updated on 26-Sep-2023 502 Views

Introduction Similarity metrics are crucial in Recommendation Systems to find users with similar behavior, patterns, or tastes. Nowadays Recommendation Systems are found in many useful applications, such as movie recommendations on Netflix and product recommendations on e-commerce sites like Amazon. Organizations use preference matrices to capture user behavioral and feedback data on specific attributes of products. They also capture the sequence and trend of users purchasing products, and users with similar behavior are identified in the process. In this article, let's understand in brief the idea behind a recommendation system and explore the similarity techniques and measures involved in ...

Limitations of fixed basis function

Machine Learning Numpy Server Side Programming

Mithilesh Pradhan

Updated on 26-Sep-2023 448 Views

Introduction Fixed basis functions are functions that help us extend linear models in Machine Learning by taking linear combinations of nonlinear functions. Since linear models depend on a linear combination of parameters, they suffer a significant limitation. Basis functions such as radial functions help address it by capturing non-linearity in the data while keeping the model linear in its parameters. Linear regression can then build complex functions from different linear combinations of the fixed basis functions. In this article let us look into the fixed basis function and its limitations. Fixed Basis function A linear regression model ...

Python | Measure similarity between two sentences using cosine similarity

Python Server Side Programming Programming

Mithilesh Pradhan

Updated on 26-Sep-2023 2K+ Views

Introduction Using Natural Language Processing to find the semantic similarity between sentences, words, or texts is very common in modern use cases. There are numerous ways to calculate the similarity between texts. One popular method is cosine similarity. It applies to two non-zero vectors and measures the cosine of the angle between them using the dot product formula. Through this article let us briefly explore cosine similarity and see its implementation using Python. Cosine similarity – Finding similarity between two texts Cosine Similarity is defined as the cosine of ...
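
The idea in the excerpt above can be illustrated with a minimal sketch. It is not the article's own implementation: the function name `cosine_similarity` and the simple whitespace bag-of-words tokenization are assumptions made for this example, and only the Python standard library is used.

```python
import math
from collections import Counter


def cosine_similarity(sentence_a: str, sentence_b: str) -> float:
    """Cosine of the angle between two bag-of-words count vectors.

    Hypothetical helper for illustration: sentences are lowercased and
    split on whitespace, which is a deliberately naive tokenization.
    """
    counts_a = Counter(sentence_a.lower().split())
    counts_b = Counter(sentence_b.lower().split())

    # Dot product over the words the two sentences share.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # Cosine similarity is only defined for non-zero vectors.
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for an empty sentence")

    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("the cat sat on the mat", "the cat lay on the rug"))
```

Identical sentences score close to 1, and sentences with no words in common score 0. A real pipeline would likely use better tokenization and weighting (for example TF-IDF), which this sketch leaves out.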

Handling sparsity issues in recommendation system

Machine Learning E-Commerce Video Content

Mithilesh Pradhan

Updated on 22-Sep-2023 610 Views

Introduction In Recommendation Systems, collaborative filtering is one of the approaches to building a model and finding similarities between users. This concept is widely used on e-commerce sites and on OTT and video-sharing platforms. One of the most discussed issues that such systems face in the initial modeling phase is data sparsity, which occurs when only a few users give ratings or reviews on the platform or otherwise interact with it. In this article let us understand the problem of data sparsity in Recommendation Systems and learn about ways to handle it. ...

Difference Between Training and Testing Data

Differences Machine Learning Testing Tools

Mithilesh Pradhan

Updated on 22-Sep-2023 7K+ Views

Introduction In Machine Learning, a good model is produced when we have representative data in sufficient quantity. Data may be divided into different sets that serve different purposes while training a model. Two very useful and common sets of data are the training and testing sets. The training set is the part of the original dataset used to train the model and find a good fit. Testing data is the part of the original data used to validate the trained model and analyze the calculated metrics. In this article let us explore training and testing data sets in ...

Showing 1–10 of 44 articles