Hyperparameter Tuning in Machine Learning


Introduction

Hyperparameter tuning is the process of changing the default hyperparameters of a model or algorithm to achieve higher accuracy and better performance. The default values do not always suit the data at hand, because data varies with the problem statement. In such cases, hyperparameter tuning becomes an essential part of model building.

This article discusses hyperparameter tuning, why it is needed, and how to perform it. It should help you understand the concept and apply it to any kind of data and model.

What is the Need for Hyperparameter Tuning

In machine learning, no two datasets share exactly the same behavior and patterns, and the problem statement also varies from project to project. Almost all machine learning algorithms come with default hyperparameter values, which are applied if no specific values are selected.

The default values are not guaranteed to be the best choice for every project. The model may perform well with the defaults, but in many cases its performance and reliability can still be improved by tuning them.

We can tune the hyperparameters according to our needs, the type and behavior of the data, its patterns, and the target we want the model to achieve.

Model Parameters vs. Hyperparameters

You have probably heard the terms model parameters and hyperparameters and may have assumed they mean the same thing. Both shape how the model behaves, but they differ in who sets them.

Model parameters are the values the model derives from training on the data. During training, the model observes the patterns in the data and, by the end, has set its parameters accordingly. For example, linear regression learns the slope and y-intercept from the data. These parameters are derived by the model, and we do not set them directly.

Hyperparameters, in contrast, are under our control: we set them according to our needs when we define the model, before training starts.
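The distinction can be seen directly in scikit-learn. The following is a minimal sketch on made-up toy data (the arrays X_demo and y_demo and the chosen hyperparameter values are assumptions for illustration, not part of the original article):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# Toy data that follows y = 2x + 1 exactly (made up for illustration).
X_demo = np.array([[0.0], [1.0], [2.0], [3.0]])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])

# Model parameters: learned by fit(), not set by us.
lin_reg = LinearRegression().fit(X_demo, y_demo)
slope = lin_reg.coef_[0]
intercept = lin_reg.intercept_
print("learned slope:", round(slope, 3))
print("learned intercept:", round(intercept, 3))

# Hyperparameters: chosen by us when the model is defined, before training.
knn = KNeighborsClassifier(n_neighbors=3, weights="distance")
knn_hyperparams = knn.get_params()
print("n_neighbors:", knn_hyperparams["n_neighbors"])
```

The slope and intercept only exist after fit() has run, whereas n_neighbors is available as soon as the model object is created.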

Hyperparameter Space

As we have seen, a machine learning model has two types of parameters: those the model derives itself and then uses when predicting, and those we can control and change as per our needs.

The job now is to find the combination of hyperparameters that best suits the data and the model. Hyperparameters interact: a value that works well for one hyperparameter may stop working once another is changed, which can hurt the overall accuracy of the model. Selecting the best combination, rather than the best value of each hyperparameter in isolation, is therefore essential.

The hyperparameter space is the set of all combinations of hyperparameter values we are willing to consider. We define it by giving a list or range of candidate values for each hyperparameter. A search procedure then tries combinations from this space and returns the one that scores best.

To search the hyperparameter space, scikit-learn provides two main classes, GridSearchCV and RandomizedSearchCV (often informally called RandomSearchCV). The CV stands for cross-validation, where the model's accuracy is measured several times on different splits of the data.
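To make the idea of a space of combinations concrete, the sketch below enumerates a small grid with scikit-learn's ParameterGrid helper. The specific candidate values are assumptions chosen to keep the output short:

```python
from sklearn.model_selection import ParameterGrid

# Candidate values for two KNN hyperparameters (illustrative choices).
param_grid = {
    "n_neighbors": [1, 3, 5],
    "algorithm": ["auto", "brute"],
}

# ParameterGrid expands the dictionary into every possible combination.
combinations = list(ParameterGrid(param_grid))
print("number of combinations:", len(combinations))
for combo in combinations:
    print(combo)
```

Three values for n_neighbors and two for algorithm give 3 x 2 = 6 combinations. The count grows multiplicatively with every hyperparameter added.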
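As a rough illustration of what the cross-validation part does on its own, the sketch below scores a single hyperparameter setting on five different splits. The Iris dataset and the value n_neighbors=5 are assumptions made so the example is self-contained:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Small built-in dataset, used here only so the example runs as-is.
X, y = load_iris(return_X_y=True)

# One fixed hyperparameter setting, evaluated with 5-fold cross-validation.
model = KNeighborsClassifier(n_neighbors=5)
fold_scores = cross_val_score(model, X, y, cv=5)

# One accuracy value per fold; the mean is what a search would compare.
mean_score = fold_scores.mean()
print("per-fold accuracy:", fold_scores)
print("mean accuracy:", mean_score)
```

A search class repeats this procedure for each candidate combination and keeps the one with the highest mean score.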

Preventing Data Leakage

When training a model and tuning its hyperparameters, the data should be split into three sets: training, validation, and testing. Often the data is split into only two sets, but a three-way split is advisable when performing hyperparameter tuning. The training set is used to fit the model, the validation set is used to compare hyperparameter settings, and the test set is kept aside until the very end.

The extra split exists to prevent data leakage. If the hyperparameters are chosen using the same data on which the final performance is reported, information about that data leaks into the model, and the reported score is too optimistic. The model then appears to perform well during development but fails when it receives real-time data in the prediction phase. Splitting the data into three sets is therefore essential for validating the model honestly.
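One common way to obtain the three sets is to call train_test_split twice. The following is a sketch under assumptions: the Iris dataset, the 60/20/20 proportions, and the random_state value are illustrative choices, not prescribed by the article:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Step 1: hold out 20% of the data as the final test set.
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Step 2: split the remaining 80% into training and validation sets.
# 25% of the remaining 80% equals 20% of the full dataset.
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=42, stratify=y_temp
)

print(len(X_train), len(X_val), len(X_test))
```

Hyperparameters would then be compared on the validation set, and the test set would be used once, after tuning is finished. When a cross-validated search is used, the folds inside the training data play the role of the validation set.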

Using GridSearchCV

We can use GridSearchCV by importing it from sklearn.model_selection. It iterates through every combination of the specified hyperparameter values and returns the combination that scores best under cross-validation.

Example

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

# X_train and y_train are assumed to be an existing training split.
knnc = KNeighborsClassifier()

# Candidate values for each hyperparameter.
param_grid = {'n_neighbors': list(range(1, 10)), 'algorithm': ['auto', 'brute']}

# Evaluate every combination with 10-fold cross-validation.
gs = GridSearchCV(knnc, param_grid, cv=10)
gs.fit(X_train, y_train)

# Best combination found by the search.
print(gs.best_params_)

In the above code, we use a KNN classifier as the algorithm and pass GridSearchCV a parameter grid. GridSearchCV evaluates every combination in the grid and, once fitted, exposes the best combination of hyperparameters through best_params_.
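The snippet above relies on an existing X_train and y_train. A self-contained variant might look like the sketch below, which assumes the Iris dataset and an 80/20 split (both illustrative choices) and also evaluates the tuned model on data the search never saw:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Keep a test set aside so the final score is not influenced by tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

param_grid = {
    "n_neighbors": list(range(1, 10)),
    "algorithm": ["auto", "brute"],
}

# 9 values x 2 values = 18 candidates, each scored with 10-fold CV.
gs = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10)
gs.fit(X_train, y_train)

n_candidates = len(gs.cv_results_["params"])
test_accuracy = gs.score(X_test, y_test)

print("candidates tried:", n_candidates)
print("best parameters:", gs.best_params_)
print("best mean CV accuracy:", gs.best_score_)
print("accuracy on held-out test set:", test_accuracy)
```

By default GridSearchCV refits the best combination on the whole training set, which is why gs.score can be called directly on the test data.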

Using RandomizedSearchCV

The computational cost becomes very high when using GridSearchCV, as it trains and evaluates the model for every combination of the hyperparameter values. RandomizedSearchCV (the scikit-learn class behind what is often called RandomSearchCV) instead samples a fixed number of combinations at random, set by n_iter, and trains the model only on those.
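The difference in cost can be estimated before running anything. The sketch below counts model fits for an exhaustive grid and for a random sample of 8 combinations. The grid values, the 5 folds, and the sample size are assumptions for illustration, and the counts ignore the single final refit that both search classes perform by default:

```python
from scipy.stats import randint
from sklearn.model_selection import ParameterGrid, ParameterSampler

cv_folds = 5

# Exhaustive grid: every combination is evaluated on every fold.
grid = {
    "max_depth": [4, 8, None],
    "min_samples_split": list(range(2, 11)),
    "criterion": ["gini", "entropy"],
}
n_grid = len(ParameterGrid(grid))      # 3 * 9 * 2 combinations
grid_fits = n_grid * cv_folds

# Random search: only n_iter sampled combinations are evaluated.
distributions = {
    "max_depth": [4, 8, None],
    "min_samples_split": randint(2, 11),  # integers from 2 to 10
    "criterion": ["gini", "entropy"],
}
sampled = list(ParameterSampler(distributions, n_iter=8, random_state=0))
random_fits = len(sampled) * cv_folds

print("grid search fits:", grid_fits)
print("random search fits:", random_fits)
```

Here the grid needs 54 x 5 = 270 fits, while the random sample needs 8 x 5 = 40. The trade-off is that random search only sees the combinations it happened to draw.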

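from sklearn.model_selection import RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier
from scipy.stats import randint as sp_randint

# X and y are assumed to be an existing feature matrix and label vector.
clf = RandomForestClassifier(n_estimators=100)

# Lists are sampled uniformly; sp_randint(a, b) draws integers from a to b - 1.
# max_features must not exceed the number of columns in X.
param_dist = {"max_depth": [4, None],
   "max_features": sp_randint(1, 15),
   "min_samples_split": sp_randint(2, 11),
   "criterion": ["gini", "entropy"]}

# Number of random combinations to try.
samples = 8
randomCV = RandomizedSearchCV(clf, param_distributions=param_dist, n_iter=samples, cv=5)
randomCV.fit(X, y)
print(randomCV.best_params_)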
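Since X and y are not defined in the snippet above, a runnable variant is sketched here. It assumes the built-in breast cancer dataset (30 features, so max_features values up to 14 are valid), a smaller forest of 50 trees to keep the run short, and fixed random_state values for repeatability. None of these choices come from the original article:

```python
from scipy.stats import randint as sp_randint
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=50, random_state=0)

param_dist = {
    "max_depth": [4, None],
    "max_features": sp_randint(1, 15),      # integers from 1 to 14
    "min_samples_split": sp_randint(2, 11),  # integers from 2 to 10
    "criterion": ["gini", "entropy"],
}

# Try 8 randomly sampled combinations, each with 5-fold cross-validation.
randomCV = RandomizedSearchCV(
    clf, param_distributions=param_dist, n_iter=8, cv=5, random_state=0
)
randomCV.fit(X_train, y_train)

best_params = randomCV.best_params_
test_accuracy = randomCV.score(X_test, y_test)
print("best parameters:", best_params)
print("accuracy on held-out test set:", test_accuracy)
```

The reported combination is the best of the 8 that were sampled, not necessarily the best in the whole space. Raising n_iter trades more computation for a better chance of finding a strong setting.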

Key Takeaways

Hyperparameter tuning is essential to get the best out of the model.
Keeping a separate validation split, in addition to the training and test splits, helps prevent data leakage.
GridSearchCV is computationally more expensive than RandomizedSearchCV, as it trains on every combination of the specified hyperparameter values.
RandomizedSearchCV gives quicker results by sampling only some of the combinations, so it may miss the best one in the space.

Conclusion

In this article, we discussed hyperparameter tuning of machine learning models, why it is needed, the difference between model parameters and hyperparameters, and how to implement tuning with GridSearchCV and RandomizedSearchCV. This should help you understand the concept of hyperparameter tuning better and apply it to any data and model.

Parth Shukla

Updated on: 24-Feb-2023
