Changes

Timeline of OpenAI

32 bytes added, 17:28, 15 May 2020
| 2017 || {{dts|May 15}} || Robotics || Software release || OpenAI releases Roboschool, open-source software for robot simulation, integrated with OpenAI Gym.<ref>{{cite web |title=Roboschool |url=https://openai.com/blog/roboschool/ |website=openai.com |accessdate=5 April 2020}}</ref>
|-
| 2017 || {{dts|May 16}} || Robotics || Software release || OpenAI introduces a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.<ref>{{cite web |title=Robots that Learn |url=https://openai.com/blog/robots-that-learn/ |website=openai.com |accessdate=5 April 2020}}</ref>
|-
| 2017 || {{dts|May 24}} || Reinforcement learning || Software release || OpenAI releases Baselines, a set of implementations of reinforcement learning algorithms.<ref>{{cite web |url=https://blog.OpenAI.com/OpenAI-baselines-dqn/ |publisher=OpenAI Blog |title=OpenAI Baselines: DQN |date=November 28, 2017 |accessdate=May 5, 2018}}</ref><ref>{{cite web |url=https://github.com/OpenAI/baselines |publisher=GitHub |title=OpenAI/baselines |accessdate=May 5, 2018}}</ref>
|-
| 2017 || {{dts|June 12}} || Safety || Publication || "Deep reinforcement learning from human preferences" is first uploaded to the arXiv. The paper is a collaboration between researchers at OpenAI and Google DeepMind.<ref>{{cite web |url=https://arxiv.org/abs/1706.03741 |title=[1706.03741] Deep reinforcement learning from human preferences |accessdate=March 2, 2018}}</ref><ref>{{cite web |url=https://www.gwern.net/newsletter/2017/06 |author=gwern |date=June 3, 2017 |title=June 2017 news - Gwern.net |accessdate=March 2, 2018}}</ref><ref>{{cite web |url=https://www.wired.com/story/two-giants-of-ai-team-up-to-head-off-the-robot-apocalypse/ |title=Two Giants of AI Team Up to Head Off the Robot Apocalypse |publisher=[[wikipedia:WIRED|WIRED]] |accessdate=March 2, 2018 |quote=A new paper from the two organizations on a machine learning system that uses pointers from humans to learn a new task, rather than figuring out its own—potentially unpredictable—approach, follows through on that. Amodei says the project shows it's possible to do practical work right now on making machine learning systems less able to produce nasty surprises.}}</ref>