[AN #79]: Recursive reward modeling as an alignment technique integrated with deep RL LINK: Study demonstrates politically motivated innumeracy - Less Wrong, 01 Jan 2020 Published on January 1, 2020 6:00 PM…