Upper Confidence Bound Algorithm (Multi-Armed Bandit)

Multi-armed bandit algorithms are a powerful class of algorithms behind many modern systems. They are used everywhere, from running clinical trials (RCTs) and massive A/B testing to recommending movies on Netflix. Finding an optimal or near-optimal solution is still an open problem, and studies are still being conducted to find a scalable one. Read More… [link]

Introduction to Reinforcement Learning

We humans learn many things from our day-to-day activities. We observe our environment, take some actions, see how those actions affect the environment, and then choose our next actions to accomplish some goal. Reinforcement learning models a system in a similar way: the system observes its environment and takes actions to achieve a goal. Read More… [link]