Difference between revisions of "Talk:Timeline of OpenAI"

Revision as of 20:24, 5 May 2020

Removed Rows

In case any of these events turns out to be relevant, please place it back on the timeline or let me know and I'll do it.

Year	Month and date	Domain	Event type	Details
2016	May 25		Publication	"Adversarial Training Methods for Semi-Supervised Text Classification" is submitted to the arXiv. The paper proposes a method that achieves better results on multiple benchmark semi-supervised and purely supervised tasks.^[1]
2016	June 21		Publication	"Concrete Problems in AI Safety" is submitted to the arXiv. The paper explores practical problems in machine learning systems.^[2]
2016	October 11		Publication	"Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model", a paper on robotics, is submitted to the arXiv. It investigates settings where the sequence of states traversed in simulation remains reasonable for the real world.^[3]
2016	October 18		Publication	"Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data", a paper on safety, is submitted to the arXiv. It shows an approach to providing strong privacy guarantees for training data: Private Aggregation of Teacher Ensembles (PATE).^[4]
2016	November 2		Publication	"Extensions and Limitations of the Neural GPU" is first submitted to the arXiv. The paper shows that there are two simple ways of improving the performance of the Neural GPU: by carefully designing a curriculum, and by increasing model size.^[5]
2016	November 8		Publication	"Variational Lossy Autoencoder", a paper on generative models, is submitted to the arXiv. It presents a method to learn global representations by combining Variational Autoencoder (VAE) with neural autoregressive models.^[6]
2016	November 9		Publication	"RL²: Fast Reinforcement Learning via Slow Reinforcement Learning", a paper on reinforcement learning, is first submitted to the arXiv. It seeks to bridge the gap in number of trials between machine learning, which requires a huge number of trials, and animals, which can learn new tasks in just a few trials by drawing on their prior knowledge about the world.^[7]
2016	November 11		Publication	"A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models", a paper on generative models, is first submitted to the arXiv.^[8]
2016	November 14		Publication	"On the Quantitative Analysis of Decoder-Based Generative Models", a paper on generative models, is submitted to the arXiv. It introduces a technique to analyze the performance of decoder-based models.^[9]
2016	November 15		Publication	"#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning", a paper on reinforcement learning, is first submitted to the arXiv.^[10]
2017	January 19		Publication	"PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications", a paper on generative models, is submitted to the arXiv.^[11]
2017	February 8		Publication	"Adversarial Attacks on Neural Network Policies" is submitted to the arXiv. The paper shows that adversarial attacks are effective when targeting neural network policies in reinforcement learning.^[12]
2017	March 6		Publication	"Third-Person Imitation Learning", a paper on robotics, is submitted to the arXiv. It presents a method for unsupervised third-person imitation learning.^[13]
2017	March 10		Publication	"Evolution Strategies as a Scalable Alternative to Reinforcement Learning" is submitted to the arXiv. It explores the use of Evolution Strategies (ES), a class of black box optimization algorithms.^[14]
2017	March 12		Publication	"Prediction and Control with Temporal Segment Models", a paper on generative models, is first submitted to the arXiv. It introduces a method for learning the dynamics of complex nonlinear systems based on deep generative models over temporal segments of states and actions.^[15]
2017	March 15		Publication	"Emergence of Grounded Compositional Language in Multi-Agent Populations" is first submitted to the arXiv. The paper proposes a multi-agent learning environment and learning methods that bring about emergence of a basic compositional language.^[16]
2017	March 20		Publication	"Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World", a paper on robotics, is submitted to the arXiv. It explores domain randomization, a simple technique for training models on simulated images that transfer to real images by randomizing rendering in the simulator.^[17]
2017	March 21		Publication	"One-Shot Imitation Learning", a paper on robotics, is first submitted to the arXiv. The paper proposes a meta-learning framework for optimizing imitation learning.^[18]
2017	June 7		Publication	"Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments" is submitted to the arXiv. The paper explores deep reinforcement learning methods for multi-agent domains.^[19]
2017	September 13	Reinforcement learning	Publication	"Learning with Opponent-Learning Awareness" is first uploaded to the arXiv. The paper presents Learning with Opponent-Learning Awareness (LOLA), a method in which each agent shapes the anticipated learning of the other agents in an environment.^[20]^[21]
2017	October 17	Robotics	Publication	"Domain Randomization and Generative Models for Robotic Grasping", a paper on robotics, is first submitted to the arXiv. It explores a novel data generation pipeline for training a deep neural network to perform grasp planning that applies the idea of domain randomization to object synthesis.^[22]
2017	October 18		Publication	"Sim-to-Real Transfer of Robotic Control with Dynamics Randomization", a paper on robotics, is first submitted to the arXiv. It describes a solution for strategies that are successful in simulation but may not transfer to their real world counterparts due to modeling error.^[23]
2017	October 26		Publication	"Meta Learning Shared Hierarchies", a paper on reinforcement learning, is submitted to the arXiv. The paper describes the development of a metalearning approach for learning hierarchically structured policies, improving sample efficiency on unseen tasks through the use of shared primitives.^[24]
2017	October 31		Publication	"Backpropagation through the Void: Optimizing control variates for black-box gradient estimation", a paper on reinforcement learning, is first submitted to the arXiv. It introduces a general framework for learning low-variance, unbiased gradient estimators for black-box functions of random variables.^[25]
2017	November 2		Publication	"Interpretable and Pedagogical Examples", a paper on language, is first submitted to the arXiv. It shows that training the student and teacher iteratively, rather than jointly, can produce interpretable teaching strategies.^[26]

[1] Miyato, Takeru; Dai, Andrew M.; Goodfellow, Ian. "Adversarial Training Methods for Semi-Supervised Text Classification". arxiv.org. Retrieved 28 March 2020.

[2] "[1606.06565] Concrete Problems in AI Safety". June 21, 2016. Retrieved July 25, 2017.

[3] Christiano, Paul; Shah, Zain; Mordatch, Igor; Schneider, Jonas; Blackwell, Trevor; Tobin, Joshua; Abbeel, Pieter; Zaremba, Wojciech. "Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model". arxiv.org. Retrieved 28 March 2020.

[4] Papernot, Nicolas; Abadi, Martín; Erlingsson, Úlfar; Goodfellow, Ian; Talwar, Kunal. "Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data". arxiv.org. Retrieved 28 March 2020.

[5] Price, Eric; Zaremba, Wojciech; Sutskever, Ilya. "Extensions and Limitations of the Neural GPU". arxiv.org. Retrieved 28 March 2020.

[6] Chen, Xi; Kingma, Diederik P.; Salimans, Tim; Duan, Yan; Dhariwal, Prafulla; Schulman, John; Sutskever, Ilya; Abbeel, Pieter. "Variational Lossy Autoencoder". arxiv.org.

[7] Duan, Yan; Schulman, John; Chen, Xi; Bartlett, Peter L.; Sutskever, Ilya; Abbeel, Pieter. "RL2: Fast Reinforcement Learning via Slow Reinforcement Learning". arxiv.org. Retrieved 28 March 2020.

[8] Finn, Chelsea; Christiano, Paul; Abbeel, Pieter; Levine, Sergey. "A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models". arxiv.org. Retrieved 28 March 2020.

[9] Wu, Yuhuai; Burda, Yuri; Salakhutdinov, Ruslan; Grosse, Roger. "On the Quantitative Analysis of Decoder-Based Generative Models". arxiv.org. Retrieved 28 March 2020.

[10] "#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning". arxiv.org. Retrieved 28 March 2020.

[11] Salimans, Tim; Karpathy, Andrej; Chen, Xi; Kingma, Diederik P. "PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications". arxiv.org. Retrieved 28 March 2020.

[12] Huang, Sandy; Papernot, Nicolas; Goodfellow, Ian; Duan, Yan; Abbeel, Pieter. "Adversarial Attacks on Neural Network Policies". arxiv.org. Retrieved 28 March 2020.

[13] Stadie, Bradly C.; Abbeel, Pieter; Sutskever, Ilya. "Third-Person Imitation Learning". arxiv.org. Retrieved 28 March 2020.

[14] Salimans, Tim; Ho, Jonathan; Chen, Xi; Sidor, Szymon; Sutskever, Ilya. "Evolution Strategies as a Scalable Alternative to Reinforcement Learning". arxiv.org. Retrieved 28 March 2020.

[15] Mishra, Nikhil; Abbeel, Pieter; Mordatch, Igor. "Prediction and Control with Temporal Segment Models". arxiv.org. Retrieved 28 March 2020.

[16] Mordatch, Igor; Abbeel, Pieter. "Emergence of Grounded Compositional Language in Multi-Agent Populations". arxiv.org. Retrieved 26 March 2020.

[17] Tobin, Josh; Fong, Rachel; Ray, Alex; Schneider, Jonas; Zaremba, Wojciech; Abbeel, Pieter. "Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World". arxiv.org. Retrieved 28 March 2020.

[18] "One-Shot Imitation Learning". arxiv.org. Retrieved 28 March 2020.

[19] "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments". arxiv.org.

[20] "[1709.04326] Learning with Opponent-Learning Awareness". Retrieved March 2, 2018.

[21] gwern (August 16, 2017). "September 2017 news - Gwern.net". Retrieved March 2, 2018.

[22] "Domain Randomization and Generative Models for Robotic Grasping". arxiv.org. Retrieved 27 March 2020.

[23] Bin Peng, Xue; Andrychowicz, Marcin; Zaremba, Wojciech; Abbeel, Pieter. "Sim-to-Real Transfer of Robotic Control with Dynamics Randomization". arxiv.org. Retrieved 26 March 2020.

[24] Frans, Kevin; Ho, Jonathan; Chen, Xi; Abbeel, Pieter; Schulman, John. "Meta Learning Shared Hierarchies". arxiv.org. Retrieved 26 March 2020.

[25] Grathwohl, Will; Choi, Dami; Wu, Yuhuai; Roeder, Geoffrey; Duvenaud, David. "Backpropagation through the Void: Optimizing control variates for black-box gradient estimation". arxiv.org. Retrieved 26 March 2020.

[26] Milli, Smitha; Abbeel, Pieter; Mordatch, Igor. "Interpretable and Pedagogical Examples". arxiv.org. Retrieved 26 March 2020.

@@ Line 1: / Line 1: @@
 == Removed Rows ==
-In case any of these events turn our to be relevant, please place them back on the timeline or let me know and I'll do it.
+In case any of these events turns out to be relevant, please place it back on the timeline or let me know and I'll do it.
 {| class="sortable wikitable"