Artificial Intelligence
DeepMind and OpenAI Use Human Feedback to Maximize the Performance of Reinforcement Learning Agents
A research paper from 2018, introduces a training model that combines human feedback and reward optimization to maximize the knowledge of RL agents.
Published in
6 min readSep 27, 2021