What is Reinforcement Learning? How is it different from supervised and unsupervised learning?

In reinforcement learning methods, an agent interacts with a specific environment and takes actions based on the current state of that environment. The agent is not trained in advance; it learns from the rewards and penalties that its actions produce.

Reinforcement learning works as follows:

  • First, set up an agent with an initial set of strategies (its policy).
  • Let the agent observe the current state of the environment.
  • Based on that observation, the agent selects an action according to its current policy and performs it.
  • Depending on the action taken, the agent receives a reward or a penalty.
  • Update the agent's strategies if needed, then repeat the observe, act, and feedback steps until the agent learns and adopts the optimal policy.
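
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes tabular Q-learning as the update rule (the text does not name a specific algorithm), and the five-state corridor environment, the `step` and `train` functions, and all parameter values are hypothetical choices made for the example.

```python
import random

N_STATES = 5        # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)  # move one cell left or one cell right

def step(state, action):
    """Hypothetical environment: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01  # reward at the goal, small penalty otherwise
    return next_state, reward, done

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Prepare the agent: one value estimate per (state, action), all zero.
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Observe the state and pick an action (epsilon-greedy policy).
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            # Act and receive a reward or penalty.
            next_state, reward, done = step(state, ACTIONS[a])
            # Update the strategy (Q-learning rule, assumed here).
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy action in each non-goal state after training.
greedy_policy = [
    ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])]
    for row in q_table[:-1]
]
print(greedy_policy)
```

Each pass through the inner loop is one observe, act, feedback, update cycle; the outer loop repeats the process until the value estimates settle.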

Supervised learning methods take both the training data and its associated outputs (labels) during the training process. Unsupervised learning methods do not require any labels or responses along with the training data; they learn patterns and relationships from the raw data alone. In reinforcement learning methods, by contrast, the agent interacts with a specific environment in discrete steps and learns from the feedback it receives at each step.

In terms of output, supervised learning methods predict a target value such as a class label, and unsupervised learning methods discover underlying patterns in the data. Reinforcement learning methods instead work through a system of actions and rewards, from which the learning agent works out which actions to take.
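
The contrast between the three approaches can be made concrete with a toy sketch. Everything here is a hypothetical illustration chosen for brevity: the data values, the nearest-neighbour classifier, the two-cluster routine, and the two-action environment are assumptions for the example and are not taken from any particular library.

```python
from statistics import mean

# Supervised: every training example comes with its correct output (label).
labeled = [(1.0, "small"), (1.2, "small"), (8.9, "large"), (9.4, "large")]

def predict_nearest(x, examples):
    """1-nearest-neighbour: return the label of the closest training point."""
    return min(examples, key=lambda pair: abs(pair[0] - x))[1]

# Unsupervised: only raw data is given; the method finds structure itself.
unlabeled = [1.0, 1.2, 8.9, 9.4]

def two_means(points, iterations=10):
    """Tiny 1-D k-means with k=2; assumes both clusters stay non-empty."""
    low, high = min(points), max(points)
    for _ in range(iterations):
        near_low = [p for p in points if abs(p - low) <= abs(p - high)]
        near_high = [p for p in points if abs(p - low) > abs(p - high)]
        low, high = mean(near_low), mean(near_high)
    return low, high

# Reinforcement: no dataset up front; the agent acts and receives rewards.
def reward_for(action):
    """Hypothetical environment: 'right' pays 1.0, 'left' pays nothing."""
    return 1.0 if action == "right" else 0.0

def learn_preferred_action(actions=("left", "right"), trials=3):
    totals = {a: 0.0 for a in actions}
    for _ in range(trials):
        for a in actions:  # try each action and observe the reward
            totals[a] += reward_for(a)
    return max(totals, key=totals.get)

label = predict_nearest(8.0, labeled)     # output: a predicted class
centres = two_means(unlabeled)            # output: discovered groupings
best_action = learn_preferred_action()    # output: a preferred action
```

The supervised function returns a label, the unsupervised one returns structure found in the data, and the reinforcement one returns the action that earned the most reward.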