[RL] RL Category

Following Wikipedia:

Reinforcement learning is an area of machine learning inspired by behavioral psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.


Behavioral Psychology

Behavior is primarily shaped by reinforcement rather than by free will.

  • behaviors that result in praise/pleasure tend to repeat
  • behaviors that result in punishment/pain tend to become extinct

Agent

An entity (learner and decision maker) that is equipped with sensors, end-effectors, and goals.


Action

  • Used by the agent to interact with the environment.
  • May have many different temporal granularities and abstractions.


Reward

A reward R_t is a scalar feedback signal.

It indicates how well the agent is doing at step t.

The agent's job is to maximize cumulative reward.

Reward hypothesis: all goals can be described by the maximization of expected cumulative reward.
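To make "cumulative reward" concrete, here is a minimal sketch of an agent-environment loop that collects the rewards R_t and adds them up. The corridor task, the function names, and the optional discount factor `gamma` are assumptions made for illustration; the notes above only speak of cumulative reward, which corresponds to `gamma=1.0`.

```python
def cumulative_reward(rewards, gamma=1.0):
    """Sum of gamma**t * R_t over the episode; gamma=1.0 is the plain sum."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def run_episode(env_step, policy, start_state, max_steps=100):
    """Agent-environment loop: pick an action, receive a reward, repeat."""
    state, rewards = start_state, []
    for _ in range(max_steps):
        action = policy(state)                         # agent acts
        state, reward, done = env_step(state, action)  # environment responds
        rewards.append(reward)
        if done:
            break
    return rewards


# Hypothetical toy task: walk from position 0 to position 3; each step costs 1.
def corridor_step(state, action):
    next_state = state + action
    done = next_state >= 3
    return next_state, -1.0, done


# A fixed policy that always moves one step to the right.
rewards = run_episode(corridor_step, lambda state: 1, start_state=0)
total = cumulative_reward(rewards)
```

With a per-step cost, maximizing cumulative reward amounts to reaching the goal in as few steps as possible, which is how a goal gets expressed purely through a scalar signal.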


Main Topics of Reinforcement Learning

Learning: by trial and error

Planning: search, reason, thought, cognition

Prediction: evaluation functions, knowledge

Control: action selection, decision making

Dynamics: how the state changes given the actions of the agent
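The difference between dynamics, prediction, and control can be sketched on a tiny example. The 3-state chain, the discount factor, and all names below are hypothetical and chosen only for illustration: the dynamics are a lookup table, prediction evaluates a fixed policy, and control picks the action that looks best under those values.

```python
# Hypothetical 3-state chain: states 0, 1, 2 (2 is terminal).
# Actions: 0 = left, 1 = right.
# DYNAMICS[state][action] = (next_state, reward)
DYNAMICS = {
    0: {0: (0, 0.0), 1: (1, 0.0)},
    1: {0: (0, 0.0), 1: (2, 1.0)},
}
TERMINAL = 2
GAMMA = 0.9  # assumed discount factor, not part of the notes above


def evaluate_policy(policy, sweeps=100):
    """Prediction: estimate the value of each state under a fixed policy."""
    values = {0: 0.0, 1: 0.0, TERMINAL: 0.0}
    for _ in range(sweeps):
        for state in DYNAMICS:
            next_state, reward = DYNAMICS[state][policy[state]]
            values[state] = reward + GAMMA * values[next_state]
    return values


def greedy_action(state, values):
    """Control: choose the action with the best one-step lookahead."""
    def lookahead(action):
        next_state, reward = DYNAMICS[state][action]
        return reward + GAMMA * values[next_state]
    return max(DYNAMICS[state], key=lookahead)


always_right = {0: 1, 1: 1}
values = evaluate_policy(always_right)
best_from_start = greedy_action(0, values)
```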

Model-based RL

  • dynamics are known or are estimated
  • solving RL problems that use models and planning
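As a rough sketch of planning with a known model, the block below runs value iteration, one common planning method, on a hypothetical 3-state chain. The transition table, discount factor, and function name are assumptions for illustration; the point is only that the agent computes a policy from the dynamics without interacting with the environment.

```python
# Known model of a hypothetical 3-state chain (state 2 is terminal).
# Actions: 0 = left, 1 = right. MODEL[state][action] = (next_state, reward)
MODEL = {
    0: {0: (0, 0.0), 1: (1, 0.0)},
    1: {0: (0, 0.0), 1: (2, 1.0)},
}


def value_iteration(model, terminal, gamma=0.9, sweeps=100):
    """Plan with the model: back up the best action value in every state."""
    values = {state: 0.0 for state in model}
    values[terminal] = 0.0
    for _ in range(sweeps):
        for state, actions in model.items():
            values[state] = max(
                reward + gamma * values[next_state]
                for next_state, reward in actions.values()
            )
    # Read off the greedy policy from the converged values.
    policy = {}
    for state, actions in model.items():
        best_action, best_value = None, float("-inf")
        for action, (next_state, reward) in actions.items():
            candidate = reward + gamma * values[next_state]
            if candidate > best_value:
                best_action, best_value = action, candidate
        policy[state] = best_action
    return values, policy


plan_values, plan_policy = value_iteration(MODEL, terminal=2)
```

When the dynamics are estimated rather than given, the same planner could be run on the estimated table instead.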

Model-free RL

  • unknown dynamics
  • explicitly trial-and-error learners
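For contrast, here is a minimal sketch of a trial-and-error learner, using tabular Q-learning as one example of a model-free method. The environment function, hyperparameters, and seed are hypothetical; the learner only calls `step` and never inspects how it works.

```python
import random


def step(state, action):
    """Hypothetical environment: a 3-state chain, state 2 is the goal.

    The learner treats this as a black box (unknown dynamics).
    """
    next_state = state + 1 if action == 1 else max(0, state - 1)
    reward = 1.0 if next_state == 2 else 0.0
    return next_state, reward, next_state == 2


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values purely from sampled transitions."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, sometimes try a random action.
            if rng.random() < epsilon:
                action = rng.choice((0, 1))
            else:
                action = max((0, 1), key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in (0, 1))
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


q = q_learning()
```

After training, the learned values should prefer moving right in both non-terminal states, even though the learner was never given the transition table.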

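Note that the data an agent collects are not necessarily i.i.d.: successive observations are correlated and depend on the agent's own actions.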


P.S. Inverse reinforcement learning.

Summary

image-20200412151935910