Practical Multi-Armed Bandit Algorithms in Python

Edward Pie


Acquire the skills to build digital AI agents that adaptively make critical business decisions under uncertainty.

Updated on Sep, 2023

Language - English


Python, Data Science



Course Description

This course is an entry point into the field of Reinforcement Learning, where digital Artificial Intelligence agents are built to learn automatically how to make sequential decisions through trial and error. Specifically, it focuses on Multi-Armed Bandit problems and the hands-on implementation of algorithmic strategies for balancing exploration and exploitation. Whenever you need to consistently make the best choice out of a limited number of options over time, you are dealing with a Multi-Armed Bandit problem, and this course teaches you what you need to know to build realistic business agents that handle such situations.
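To make the problem statement concrete, here is a minimal sketch of a simulated bandit environment. It is not taken from the course material: the class name `BernoulliBandit`, the 0/1 reward model, and the example win probabilities are all assumptions chosen for illustration, and only the standard library is used to keep the sketch dependency-free.

```python
import random


class BernoulliBandit:
    """Simulated k-armed bandit (hypothetical): each arm pays 1 with a fixed hidden probability."""

    def __init__(self, win_probabilities, seed=None):
        self.win_probabilities = list(win_probabilities)
        self._rng = random.Random(seed)  # private generator so runs can be reproduced

    @property
    def n_arms(self):
        return len(self.win_probabilities)

    def pull(self, arm):
        """Return a reward of 1 or 0 for the chosen arm."""
        return 1 if self._rng.random() < self.win_probabilities[arm] else 0


# Assumed example: three options whose success rates are hidden from the agent.
bandit = BernoulliBandit([0.2, 0.5, 0.8], seed=0)
rewards = [bandit.pull(2) for _ in range(1000)]
print(sum(rewards) / len(rewards))  # should land near 0.8, the hidden rate of arm 2
```

An agent never sees `win_probabilities`; it only observes the rewards returned by `pull`, which is what forces it to trade off trying new arms against repeating the best one found so far.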

With concise explanations, this course teaches you how to confidently translate seemingly scary mathematical formulas into Python code. We understand that many of us are not comfortable with mathematics, so the course intentionally stays away from maths unless it is necessary. Even when maths is necessary, the approach is such that anyone with basic algebra skills can follow it and, most importantly, translate it into code and build useful intuitions in the process.

Some of the algorithmic strategies taught in this course are Epsilon Greedy, Softmax Exploration, Optimistic Initialization, Upper Confidence Bounds, and Thompson Sampling. With these tools under your belt, you are equipped to build and deploy AI agents that can handle critical business operations under uncertainty.
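As a rough illustration of how the strategies named above differ, the sketch below writes each one as an arm-selection function. These are common textbook formulations and not necessarily the versions implemented in the course; the function names, the UCB1 bonus term, and the Beta(s + 1, f + 1) prior for Thompson Sampling with 0/1 rewards are assumptions.

```python
import math
import random


def epsilon_greedy(values, epsilon, rng=random):
    """With probability epsilon explore a random arm, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])


def softmax_choice(values, temperature, rng=random):
    """Sample an arm with probability proportional to exp(value / temperature)."""
    highest = max(values)  # subtracting the max keeps exp() numerically stable
    weights = [math.exp((v - highest) / temperature) for v in values]
    return rng.choices(range(len(values)), weights=weights)[0]


def ucb1(values, counts, total_pulls):
    """Pick the arm with the highest estimate plus an uncertainty bonus (UCB1 form)."""
    for arm, n in enumerate(counts):
        if n == 0:  # try every arm once before the bonus term is defined
            return arm
    return max(
        range(len(values)),
        key=lambda a: values[a] + math.sqrt(2 * math.log(total_pulls) / counts[a]),
    )


def thompson_bernoulli(successes, failures, rng=random):
    """Sample a plausible win rate per arm from Beta(s + 1, f + 1) and pick the largest."""
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=lambda a: samples[a])
```

Optimistic Initialization needs no selection rule of its own in this framing: one way to get it is to start every entry of `values` at an optimistic number (for example 1.0 when rewards are 0 or 1) and call `epsilon_greedy` with `epsilon=0`.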

To bridge the gap between theory and application, I've updated this course to include a section where I show how to apply the MAB algorithms in Robotics using the EV3 Mindstorm. I'll soon upload a section that will show how to apply the algorithms taught in this course to optimize advertisements.


What will you learn in this course:

  1. Understanding and being able to identify Multi-Armed Bandit problems.
  2. Modeling real business problems as MAB and implementing digital AI agents to automate them.
  3. Understanding the challenge of RL regarding the exploration-exploitation dilemma.
  4. Practical implementation of the various algorithmic strategies for balancing between exploration and exploitation.
  5. Python implementation of the Epsilon-greedy strategy.
  6. Python implementation of the Softmax Exploration strategy.
  7. Python implementation of the Optimistic Initialization strategy.
  8. Python implementation of the Upper Confidence Bounds (UCB) strategy.
  9. Understanding the challenges of RL in terms of the design of reward functions and sample efficiency.
  10. Estimation of action values through incremental sampling.
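The last item in the list, estimating action values through incremental sampling, is commonly done with a running-average update. The sketch below shows that update under stated assumptions: the class name `ActionValueEstimator` is hypothetical, and the sample-average step size 1/n is one standard choice, not necessarily the one the course uses.

```python
class ActionValueEstimator:
    """Tracks a running average reward per action without storing past rewards."""

    def __init__(self, n_actions, initial_value=0.0):
        self.values = [initial_value] * n_actions
        self.counts = [0] * n_actions

    def update(self, action, reward):
        # Incremental mean: Q_new = Q_old + (reward - Q_old) / n
        self.counts[action] += 1
        step = 1.0 / self.counts[action]
        self.values[action] += step * (reward - self.values[action])


estimator = ActionValueEstimator(n_actions=2)
for reward in [1, 0, 1, 1]:
    estimator.update(0, reward)
print(estimator.values[0])  # 0.75 up to floating-point rounding: the plain average of the four rewards
```

Because each update needs only the previous estimate and the pull count, memory use stays constant no matter how many rewards have been observed.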


What are the prerequisites for this course?

  1. Be able to understand basic OOP programs in Python.
  2. Have basic NumPy and Matplotlib knowledge.
  3. Basic algebra skills. If you know how to add, subtract, multiply, and divide numbers, you are good to go.


Check out the detailed breakdown of what’s inside the course

Reinforcement Learning & The Multi-Armed Bandit Algorithms
1 Lecture
  • Introduction to reinforcement learning and the multi-armed bandit problem. 22:30
Implementing Simulated Environments For Your Agents
1 Lecture
Using Sampling Techniques To Estimate Action Values
3 Lectures
Implementing MAB Agents In Python
3 Lectures
Balancing Exploration & Exploitation
5 Lectures

Instructor Details

Edward Pie


Course Certificate

Use your certification to make a career change or to advance in your current career. Salaries are among the highest in the world.

