Reinforcement learning is one of the paradigms and methodologies of machine learning, which is used to describe and solve the problem that agents maximize returns or achieve specific goals through learning strategies in the process of interaction with the environment. Inspired by behaviorism psychology, reinforcement learning focuses on online learning and tries to maintain a balance between exploration-exploitation. Unlike supervised learning and unsupervised learning, reinforcement learning does not require any data to be given in advance. Instead, it obtains learning information and updates model parameters by receiving reward (feedback) from the environment.
In particular, deep reinforcement learning combines the perceptual ability of deep learning with the decision-making ability of reinforcement learning, which can be controlled directly according to the input image. It is an artificial intelligence method closer to human thinking mode.