Deep Q-Network



The idea behind Reinforcement Learning is that an agent learns from its environment by interacting with it and receiving rewards for the actions it performs. Humans learn the same way, through interaction with their surroundings. You drop something, it breaks, and you learn not to drop it again. Reinforcement Learning is just a computational approach to learning from action.

We don't need many training sets to learn. We just need occasional feedback that we did the right thing, and we can then figure out everything else ourselves. This is the task reinforcement learning tries to solve. Reinforcement learning lies somewhere in between supervised and unsupervised learning. In supervised learning one has a target label for each training example, and in unsupervised learning one has no labels at all. In reinforcement learning one has sparse and time-delayed labels: the rewards. Based only on those rewards, the agent has to learn to behave in the environment.

Once you have figured out a strategy that collects a certain number of rewards, should you stick with it or experiment with something that could result in even bigger rewards? In the Breakout game, for example, a simple strategy is to move to the left edge and wait there. Should you exploit this known working strategy, or explore other, possibly better strategies? This is called the explore-exploit dilemma.
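One common way to handle the explore-exploit dilemma is the epsilon-greedy rule: with a small probability the agent tries a random action, and otherwise it picks the action it currently believes is best. The post does not name this technique, so treat the following as a minimal sketch under that assumption. The function name `epsilon_greedy` and the list `estimated_rewards` are hypothetical, chosen only for illustration.

```python
import random


def epsilon_greedy(estimated_rewards, epsilon, rng=random):
    """Return an action index.

    estimated_rewards: hypothetical list of the agent's current reward
        estimates, one entry per action.
    epsilon: probability of exploring (choosing a random action).
    rng: source of randomness; the random module or a random.Random instance.
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(estimated_rewards))
    # Exploit: pick the action with the highest current estimate.
    return max(range(len(estimated_rewards)), key=lambda a: estimated_rewards[a])
```

With `epsilon=0` the agent always exploits, and with `epsilon=1` it always explores. In practice a value in between is often used, sometimes decreasing over time as the estimates become more reliable.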

How do you formalize a reinforcement learning problem, so that you can reason about it? The most common method is to represent it as a Markov decision process.
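A Markov decision process is usually described by a set of states, a set of actions, transition probabilities between states, and rewards. The sketch below writes down a toy version of that structure to make it concrete. The state names, actions, probabilities, and rewards are all invented for illustration and are not taken from any real game or library.

```python
import random

# Hypothetical toy MDP.
# TRANSITIONS[state][action] is a list of (probability, next_state, reward).
# A state with no actions is terminal.
TRANSITIONS = {
    "ball_in_play": {
        "stay": [(0.8, "ball_in_play", 0.0), (0.2, "ball_lost", -1.0)],
        "move": [(0.9, "ball_in_play", 1.0), (0.1, "ball_lost", -1.0)],
    },
    "ball_lost": {},
}


def step(state, action, rng=random):
    """Sample (next_state, reward) for taking `action` in `state`."""
    outcomes = TRANSITIONS[state][action]
    draw = rng.random()
    cumulative = 0.0
    for probability, next_state, reward in outcomes:
        cumulative += probability
        if draw < cumulative:
            return next_state, reward
    # Guard against floating-point rounding in the cumulative sum.
    _, next_state, reward = outcomes[-1]
    return next_state, reward


def run_episode(policy, start="ball_in_play", max_steps=100, rng=random):
    """Follow `policy` (a function state -> action) and return the total reward."""
    state, total = start, 0.0
    for _ in range(max_steps):
        if not TRANSITIONS[state]:  # terminal state reached
            break
        state, reward = step(state, policy(state), rng)
        total += reward
    return total
```

For example, `run_episode(lambda state: "move")` plays one episode with a policy that always chooses `"move"`. The key property is that the next state and reward depend only on the current state and action, not on the history before it.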


The resources listed below are a great kick-start for Deep Reinforcement Learning.


Deep Reinforcement Learning uses deep neural networks to solve Reinforcement Learning problems, hence the name "deep."



Resources: 

  • https://medium.freecodecamp.org/an-introduction-to-reinforcement-learning-4339519de419
  • https://ai.intel.com/demystifying-deep-reinforcement-learning/

