Supervised learning, one of the most widely used approaches in machine learning, takes both training data (also called data samples) and the associated outputs (also called labels or responses) during the training process. The main goal of a supervised learning method is to learn the association between the input training data and their labels; to do so, it processes many training instances.
Let us understand how it works with the help of the following example −
Suppose we have,
Input variable − m
Output variable − N
The mapping function from the input to output is as follows −
N = f(m)
To learn such a mapping function, we need an algorithm whose key objective is to approximate it so well that we can easily predict the output variable N for new, unseen input data as well.
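The idea of approximating N = f(m) can be sketched as follows. This is a minimal illustration using synthetic data and NumPy's least-squares solver; the assumption that f is linear, and the data values themselves, are hypothetical choices made only for this example.

```python
import numpy as np

# Hypothetical synthetic data: assume the true mapping is N = 2m + 1.
m = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])  # input variable
N = 2 * m.ravel() + 1                              # known output labels

# Approximate f with a linear model N ≈ w*m + b via least squares.
A = np.hstack([m, np.ones_like(m)])                # design matrix [m, 1]
(w, b), *_ = np.linalg.lstsq(A, N, rcond=None)

# Predict the output variable N for a new, unseen input.
new_m = 5.0
predicted_N = w * new_m + b
print(round(predicted_N, 2))  # close to 11.0 for this synthetic data
```

Because the synthetic data is exactly linear, the learned weights recover the underlying mapping almost perfectly; with noisy real-world data the approximation would only be as good as the model and the training samples allow.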
Supervised algorithms are called supervised because the machine learning model learns from data samples where the output is known in advance. In this sense, the whole process of learning in supervised learning algorithms can be thought as it is being supervised by a supervisor.
Some well-known supervised machine learning algorithms are k-nearest neighbors (KNN), decision trees, logistic regression, and random forests.
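As a quick taste of one of these algorithms, here is a minimal KNN sketch. It assumes scikit-learn is installed, and the toy one-feature dataset is invented purely for illustration.

```python
# Minimal k-nearest neighbors example (assumes scikit-learn is available).
from sklearn.neighbors import KNeighborsClassifier

# Toy labeled samples: each input has one feature; outputs are known in advance.
X = [[0], [1], [2], [9], [10], [11]]
y = [0, 0, 0, 1, 1, 1]

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)                        # learn from samples with known labels
print(model.predict([[1.5], [9.5]]))   # each point takes its 3 neighbors' majority label
```

The model assigns each new point the majority label of its three nearest training samples, which is exactly the "learn from known outputs, predict for new inputs" pattern described above.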
Based on the task to be performed, supervised learning algorithms can be divided into the following two classes −
Classification − Classification tasks predict categorical output responses, or labels, for the given input data. This output belongs to a specific discrete category and is based on what the ML model learned in the training phase. For example, predicting whether a patient is high-risk or low-risk is a classification task.
Regression − Regression tasks predict continuous numerical output responses for the given input data. This output, too, is based on what the ML model learned in the training phase. For example, predicting the price of a house is a regression task.
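The two task types can be contrasted side by side. The sketch below assumes scikit-learn is installed; the patient features and house prices are made-up toy values, not real data.

```python
# Classification vs. regression on hypothetical toy data (assumes scikit-learn).
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: discrete labels (0 = low-risk patient, 1 = high-risk patient).
X_cls = [[35, 120], [40, 125], [62, 160], [70, 170]]  # [age, blood pressure]
y_cls = [0, 0, 1, 1]
clf = LogisticRegression().fit(X_cls, y_cls)
print(clf.predict([[65, 165]]))   # output is a discrete category

# Regression: continuous output (house price from floor area in sq. ft.).
X_reg = [[1000], [1500], [2000], [2500]]
y_reg = [200000.0, 290000.0, 380000.0, 470000.0]
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[1800]]))      # output is a continuous number
```

The key difference is only in the output: the classifier returns one of a fixed set of categories, while the regressor returns a real-valued number on a continuous scale.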