Artificial Intelligence

DeepMind and OpenAI Use Human Feedback to Maximize the Performance of Reinforcement Learning Agents

A research paper from 2018, introduces a training model that combines human feedback and reward optimization to maximize the knowledge of RL agents.

Jesus Rodriguez

Published in

Towards AI

6 min readSep 27, 2021

--

Image Source: https://cloud.withgoogle.com/build/data-analytics/explore-history-machine-learning/

Jesus Rodriguez

Written by Jesus Rodriguez

Writer for

Towards AI

CEO of IntoTheBlock, President of Faktory, President of NeuralFabric and founder of The Sequence , Lecturer at Columbia University, Wharton, Angel Investor...

Help
Status
About
Careers
Press
Blog
Privacy
Terms
Text to speech
Teams