Timeline of machine learning

The content on this page is forked from the English Wikipedia page entitled "Timeline of machine learning". The original page still exists at Timeline of machine learning. The original content was released under the Creative Commons Attribution/Share-Alike License (CC-BY-SA), so this page inherits this license.

This page is a timeline of machine learning. Major discoveries, achievements, milestones and other major events are included.


Decade Summary
<1950s Statistical methods are discovered and refined.
1950s Pioneering machine learning research is conducted using simple algorithms.
1960s "In the 1960s, the discovery and use of multilayers opened a new path in neural network research."[1] "1960s: Shallow neural networks"[2]
1970s 'AI Winter' caused by pessimism about machine learning effectiveness. "Backpropagation, developed in the 1970s, allows a network to adjust its hidden layers of neurons/nodes to adapt to new situations."[1]
1980s "In the mid-1980s, artificial neural networks (ANN) came to the foreground, to be then pushed aside by statistical learning systems in the 1990s."[3] "Convolution emerges"[2] Rediscovery of backpropagation causes a resurgence in machine learning research.
1990s "1990s: Unsupervised deep learning"[2] "Thanks to statistics, machine learning became very famous in 1990s. The intersection of computer science and statistics gave birth to probabilistic approaches in AI. This shifted the field further toward data-driven approaches."[4] "In the early 90’s Machine Learning became very popular again due to the intersection of Computer Science and Statistics"[5] Work on machine learning shifts from a knowledge-driven approach to a data-driven approach. Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions, or "learn", from the results.[6] "In the 1990s we began to apply machine learning in data mining, adaptive software and web applications, text learning, and language learning."[7] Support vector machines and recurrent neural networks become popular.
2000s Deep learning becomes feasible and neural networks see widespread commercial use. "2006s-present: Modern deep learning"[2]
2010s Machine learning becomes integral to many widely used software services and receives great publicity.


Figure: A simple neural network with two input units and one output unit.

Figure: OS/2 TD-Gammon game screenshot.

Year Event Type Caption Event
1642 "Blaise Pascal was 19 when he made an “arithmetic machine” for his tax collector father. It could add, subtract, multiply, and divide. Three centuries later, the IRS uses machine learning to combat tax evasion."[8]
1679 "German mathematician, philosopher, and occasional poet Gottfried Wilhelm Leibniz devised the system of binary code that laid the foundation for modern computing"[8]
1763 Discovery The Underpinnings of Bayes' Theorem Thomas Bayes's work An Essay towards solving a Problem in the Doctrine of Chances is published two years after his death, having been amended and edited by a friend of Bayes, Richard Price.[9] The essay presents work which underpins Bayes' theorem.
1770 "A chess-playing automaton debuts, then dupes Europe for decades" "A moving, mechanical device designed to imitate a human, “The Turk” fooled even Napoleon into thinking it could play chess. The jig was up in 1857 when The Turk’s final owner revealed how a person hidden inside moved its arms."[8]
1801 "1801- First Data Storage through the Weaving Loom"[7]
1805 Discovery Least Squares Adrien-Marie Legendre describes the "méthode des moindres carrés", known in English as the least squares method.[10] The least squares method is used widely in data fitting.
1812 Bayes' Theorem Pierre-Simon Laplace publishes Théorie Analytique des Probabilités, in which he expands upon the work of Bayes and defines what is now known as Bayes' Theorem.[11]
1834 "In 1834, Charles Babbage, the father of the computer, conceived a device that could be programmed with punch cards. However, the machine was never built, but all modern computers rely on its logical structure."[12] "The "father of the computer" invents punch-card programming"[8]
1842 "Ada Lovelace's algorithm makes her the world's first computer programmer" "The 27-year-old mathematician described a sequence of operations for solving mathematical problems using Charles' Babbage's theoretical punch-card machine. In the 70s, the US Department of Defense paid homage, naming a new software language Ada."[8]
1847 "Philosopher and closet mystic George Boole created a form of algebra in which all values can be reduced to “true” or “false.” Essential to modern computing, Boolean logic helps a CPU decide how to process new inputs."[8][7]
1890 "1890 - Mechanical System for Statistical calculations" "Herman Hollerith created the first combined system of mechanical calculation and punch cards to rapidly calculate statistics gathered from millions of people."[7]
1913 Discovery Markov Chains Andrey Markov first describes techniques he used to analyse a poem. The techniques later become known as Markov chains.[13]
1936 "In 1936, Alan Turing gave a theory that how a machine can determine and execute a set of instructions."[12]
1940 "In 1940, the first manually operated computer, "ENIAC" was invented, which was the first electronic general-purpose computer. After that stored program computer such as EDSAC in 1949 and EDVAC in 1951 were invented."[12]
1943 "The first case of neural networks was in 1943, when neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper about neurons, and how they work. They decided to create a model of this using an electrical circuit, and therefore the neural network was born."[1] "In 1943, a human neural network was modeled with an electrical circuit. In 1950, the scientists started applying their idea to work and analyzed how human neurons might work."[12]
1949 "First step toward prevalent ML was proposed by Hebb, in 1949, based on a neuropsychological learning formulation. It is called Hebbian Learning theory. With a simple explanation, it pursues correlations between nodes of a Recurrent Neural Network (RNN). It memorizes any commonalities on the network and serves like a memory later."[14]
1950 Turing's Learning Machine Alan Turing proposes a 'learning machine' that could learn and become artificially intelligent. Turing's specific proposal foreshadows genetic algorithms.[15] "Alan Turing creates the “Turing Test” to determine if a computer has real intelligence. To pass the test, a computer must be able to fool a human into believing it is also human."[6][12]
1951 First Neural Network Machine Marvin Minsky and Dean Edmonds build the SNARC, the first neural network machine able to learn.[16]
1952 "1952 saw the first computer program which could learn as it ran. It was a game which played checkers, created by Arthur Samuel."[1]
1952 Machines Playing Checkers Arthur Samuel joins IBM's Poughkeepsie Laboratory and begins working on some of the very first machine learning programs, first creating programs that play checkers.[17] "Arthur Samuel wrote the first computer learning program. The program was the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program."[6]
1957 Discovery Perceptron Frank Rosenblatt invents the perceptron while working at the Cornell Aeronautical Laboratory.[18] The invention of the perceptron generates a great deal of excitement and is widely covered in the media.[19] "Frank Rosenblatt designed the first neural network for computers (the perceptron), which simulate the thought processes of the human brain."[6]
1959 "A neural network learns to make phone calls clearer"[8] "Another extremely early instance of a neural network came in 1959, when Bernard Widrow and Marcian Hoff created two models of them at Stanford University. The first was called ADELINE, and it could detect binary patterns. For example, in a stream of bits, it could predict what the next one would be. The next generation was called MADELINE, and it could eliminate echo on phone lines, so had a useful real world application. It is still in use today."[1]
1959 "In 1959, the term "Machine Learning" was first coined by Arthur Samuel."[12]
1959 "In 1959, the first neural network was applied to a real-world problem to remove echoes over phone lines using an adaptive filter."[12]
1962 "Neural networks use back propagation (explained in detail in the Introduction to Neural Networks), and this important step came in 1986, when three researchers from the Stanford psychology department decided to extend an algorithm created by Widrow and Hoff in 1962. This therefore allowed multiple layers to be used in a neural network, creating what are known as ‘slow learners’, which will learn over a long period of time."[1]
1963 "U.S. government agencies like the Defense Advanced Research Projects Agency (DARPA) fund AI research at universities such as MIT, hoping for machines that will translate Russian instantly."[20]
1965 "Probably the first who decided to “develop” (deepen) [the perceptron] was the Soviet mathematician A.G. Ivakhnenko, who had published a number of articles and books since 1965, which, in particular, described the modeling system “Alpha”."[21]
1967 Nearest Neighbor The nearest neighbor algorithm is created, marking the start of basic pattern recognition. The algorithm is used to map routes.[22] "The “nearest neighbor” algorithm was written, allowing computers to begin using very basic pattern recognition. This could be used to map a route for traveling salesmen, starting at a random city but ensuring they visit all cities during a short tour."[6]
1969 Limitations of Neural Networks Marvin Minsky and Seymour Papert publish their book Perceptrons, describing some of the limitations of perceptrons and neural networks. The interpretation that the book shows that neural networks are fundamentally limited is seen as a hindrance for research into neural networks.[23][24]
1970 Automatic Differentiation (Backpropagation) Seppo Linnainmaa publishes the general method for automatic differentiation (AD) of discrete connected networks of nested differentiable functions.[25][26] This corresponds to the modern version of backpropagation, but is not yet named as such.[27][28][29][30]
1970 "There had been not to much effort until the intuition of Multi-Layer Perceptron (MLP) was suggested by Werbos[6] in 1981 with NN specific Backpropagation(BP) algorithm, albeit BP idea had been proposed before by Linnainmaa [5] in 1970 in the name "reverse mode of automatic differentiation"."[14]
1974 Algorithm "ALOPEX (an acronym from "ALgorithms Of Pattern EXtraction") is a correlation based machine learning algorithm first proposed by Tzanakou and Harth in 1974."
1977 Algorithm The Expectation–maximization algorithm is explained and given its name in a paper by Arthur Dempster, Nan Laird, and Donald Rubin.[31]
1979 Stanford Cart Students at Stanford University develop a cart that can navigate and avoid obstacles in a room.[32] "Students at Stanford University invent the “Stanford Cart” which can navigate obstacles in a room on its own."[6]
1980 Discovery Neocognitron Kunihiko Fukushima first publishes his work on the Neocognitron, a type of artificial neural network.[33] The Neocognitron later inspires convolutional neural networks.[34] "In 1980, Kunihika Fukushima proposed a hierarchical multilayered convolution neural network known as the neocognitron."[21]
1980 The Linde–Buzo–Gray algorithm is introduced by Yoseph Linde, Andrés Buzo and Robert M. Gray.[35]
1980 International Conference on Machine Learning
1981 Explanation Based Learning Gerald Dejong introduces Explanation Based Learning, in which a computer algorithm analyses data and creates a general rule it can follow by discarding unimportant data.[36] "Gerald Dejong introduces the concept of Explanation Based Learning (EBL), in which a computer analyses training data and creates a general rule it can follow by discarding unimportant data."[6]
1981 "There had been not to much effort until the intuition of Multi-Layer Perceptron (MLP) was suggested by Werbos[6] in 1981 with NN specific Backpropagation(BP) algorithm, albeit BP idea had been proposed before by Linnainmaa [5] in 1970 in the name "reverse mode of automatic differentiation"."[14]
1982 Discovery Recurrent Neural Network John Hopfield popularizes Hopfield networks, a type of recurrent neural network that can serve as content-addressable memory systems.[37][1][2]
1982 " Furthermore, in 1982, Japan announced it was focusing on more advanced neural networks, which incentivised American funding into the area, and thus created more research in the area."[1]
1982 Self-learning as machine learning paradigm is introduced along with a neural network capable of self-learning named Crossbar Adaptive Array (CAA).[38]
1985 NetTalk Terry Sejnowski develops a program that learns to pronounce words the same way a baby does.[39] "Terry Sejnowski invents NetTalk, which learns to pronounce words the same way a baby does." "In 1985, Terry Sejnowski and Charles Rosenberg invented a neural network NETtalk, which was able to teach itself how to correctly pronounce 20,000 words in one week."[12]
1985–1986 "There had been not to much effort until the intuition of Multi-Layer Perceptron (MLP) was suggested by Werbos[6] in 1981 with NN specific Backpropagation(BP) algorithm, albeit BP idea had been proposed before by Linnainmaa [5] in 1970 in the name "reverse mode of automatic differentiation". Still BP is the key ingredient of today's NN architectures. With those new ideas, NN researches accelerated again. In 1985 - 1986 NN researchers successively presented the idea of MLP with practical BP training"[14]
1986 Discovery Backpropagation The process of backpropagation is described by David Rumelhart, Geoff Hinton and Ronald J. Williams.[40][41]
1986 "At the another spectrum, a very-well known ML algorithm was proposed by J. R. Quinlan [9] in 1986 that we call Decision Trees, more specifically ID3 algorithm."[14]
1986 Algorithm The Dehaene–Changeux model is developed by cognitive neuroscientists Stanislas Dehaene and Jean-Pierre Changeux.[42] It is used to provide a predictive framework to the study of inattentional blindness and the solving of the Tower of London test.[43][44]
1986 Machine Learning (journal)[45]
1986 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
1987 Conference on Neural Information Processing Systems
1988 Knowledge Engineering and Machine Learning Group
1989 Discovery Reinforcement Learning Christopher Watkins develops Q-learning, which greatly improves the practicality and feasibility of reinforcement learning.[46]
1989 Commercialization Commercialization of Machine Learning on Personal Computers Axcelis, Inc. releases Evolver, the first software package to commercialize the use of genetic algorithms on personal computers.[47]
1989 Algorithm Q-learning, a model-free reinforcement learning algorithm, is introduced by Chris Watkins.[48] A convergence proof is later presented by Watkins and Dayan.[49]
1992 Achievement Machines Playing Backgammon Gerald Tesauro develops TD-Gammon, a computer backgammon program that utilises an artificial neural network trained using temporal-difference learning (hence the 'TD' in the name). TD-Gammon is able to rival, but not consistently surpass, the abilities of top human backgammon players.[50]
1995 "One of the most important ML breakthrough was Support Vector Machines (Networks) (SVM), proposed by Vapnik and Cortes[10] in 1995 with very strong theoretical standing and empirical results. That was the time separating the ML community into two crowds as NN or SVM advocates."[14]
1995 Discovery Random Forest Algorithm Tin Kam Ho publishes a paper describing Random decision forests.[51]
1995 Discovery Support Vector Machines Corinna Cortes and Vladimir Vapnik publish their work on support vector machines.[14][52]
1996 (October 10) Orange (software) is released.
1997 IBM Deep Blue Beats Kasparov IBM’s Deep Blue beats the world champion at chess.[53]
1997 Discovery LSTM Sepp Hochreiter and Jürgen Schmidhuber invent long short-term memory (LSTM) recurrent neural networks,[54] greatly improving the efficiency and practicality of recurrent neural networks.
1997 "Little before, another solid ML model was proposed by Freund and Schapire in 1997 prescribed with boosted ensemble of weak classifiers called Adaboost. This work also gave the Godel Prize to the authors at the time. Adaboost trains weak set of classifiers that are easy to train, by giving more importance to hard instances. This model still the basis of many different tasks like face recognition and detection."[14]
1998 MNIST database A team led by Yann LeCun releases the MNIST database, a dataset comprising a mix of handwritten digits from American Census Bureau employees and American high school students.[55] The MNIST database has since become a benchmark for evaluating handwriting recognition.
1998 "Since then, there have been many more advances in the field, such as in 1998, when research at AT&T Bell Laboratories on digit recognition resulted in good accuracy in detecting handwritten postcodes from the US Postal Service. This used back-propagation, which, as stated above, is explained in detail on the Introduction to Neural Networks."[1]
1999 "Computer-aided diagnosis catches more cancers. Computers can’t cure cancer (yet), but they can help us diagnose it. The CAD Prototype Intelligent Workstation, developed at the University of Chicago, reviewed 22,000 mammograms and detected cancer 52% more accurately than radiologists did."
2000 Algorithm In anomaly detection, the local outlier factor (LOF) is an algorithm proposed by Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng and Jörg Sander for finding anomalous data points by measuring the local deviation of a given data point with respect to its neighbours.[56]
2000 LogitBoost, a boosting algorithm in machine learning and computational learning theory, is formulated by Jerome H. Friedman, Trevor Hastie, and Robert Tibshirani.[57]
2000 Journal of Machine Learning Research
2000 "Machine learning research that began in the 1980s achieves widespread practical use in major software service and mobile devices. One example: Intuitive Surgical’s da Vinci robotics-assisted surgical system becomes the first such device to gain U.S. Food and Drug Administration approval for general laparoscopic surgery. Since then, da Vinci has been used for more than 5 million minimally invasive procedures in multiple specialties including urology, gynecology, thoracic surgery and cardiac surgery."[58]
2001 "Another ensemble model explored by Breiman [12] in 2001 that ensembles multiple decision trees where each of them is curated by a random subset of instances and each node is selected from a random subset of features."[14]
2001 The iDistance indexing and query processing technique is first proposed by Cui Yu, Beng Chin Ooi, Kian-Lee Tan and H. V. Jagadish.[59]
2002 (October) Software release Torch Machine Learning Library Torch, a software library for machine learning, is first released.[60]
2002 Software release Dlib
2003 Algorithm The concept of manifold alignment is first introduced by Ham, Lee, and Saul as a class of machine learning algorithms that produce projections between sets of data, given that the original data sets lie on a common manifold.[61]
2004 "The second is the decrease in the cost of parallel computing and memory. This trend was discovered in 2004 when Google unveiled its MapReduce technology"[21]
2004 "Hierarchical temporal memory (HTM) is a biologically constrained theory (or model) of intelligence, originally described in the 2004 book On Intelligence by Jeff Hawkins with Sandra Blakeslee."
2005 " The 3rd rise of NN has begun roughly in 2005 with the conjunction of many different discoveries from past and present by recent mavens Hinton, LeCun, Bengio, Andrew Ng and other valuable older researchers. "[14]
2006 The Netflix Prize The Netflix Prize competition is launched by Netflix. The aim of the competition was to use machine learning to beat Netflix's own recommendation software's accuracy in predicting a user's rating for a film given their ratings for previous films by at least 10%.[62] The prize was won in 2009. "In 2006, Netflix offered $1M to anyone who could beat its algorithm at predicting consumer film ratings. The BellKor team of AT&T scientists took the prize three years later, beating the second-place team by mere minutes"[8]
2006 "Geoffrey Hinton coins the term “deep learning” to explain new algorithms that let computers “see” and distinguish objects and text in images and videos."[6] " In the year 2006, computer scientist Geoffrey Hinton has given a new name to neural net research as "deep learning," and nowadays, it has become one of the most trending technologies."[12]
2006 "In 2006, the Face Recognition Grand Challenge – a National Institute of Standards and Technology program – evaluated the popular face recognition algorithms of the time. 3D face scans, iris images, and high-resolution face images were tested. Their findings suggested the new algorithms were ten times more accurate than the facial recognition algorithms from 2002 and 100 times more accurate than those from 1995. Some of the algorithms were able to outperform human participants in recognizing faces and could uniquely identify identical twins."[1]
2006 "This trend was discovered in 2004 when Google unveiled its MapReduce technology, followed by its open analogue Hadoop (2006), and together they gave the opportunity to distribute the processing of huge amounts of data between simple processors"[21]
c.2006 "The term deep learning was coined around 2006, and refers to deep neural networks with many layers."[3]
2006 Software release RapidMiner is first released.
2007 "Around the year 2007, Long Short-Term Memory started outperforming more traditional speech recognition programs."[1]
2007 scikit-learn is released in June.[63]
2007 Software release Theano (software) is initially released. It is an open source Python library that allows users to easily make use of various machine learning models.[64]
2008 (January 11) pandas (software)[65]
2008 Algorithm The Isolation Forest (iForest) algorithm was initially proposed by Fei Tony Liu, Kai Ming Ting and Zhi-Hua Zhou in 2008.[66]
2008 Encog is created as a pure-Java/C# machine learning framework to support genetic programming, NEAT/HyperNEAT, and other neural network technologies.[67]
2009 (April 7) Software release Apache Mahout is first released.[68]
2010 (April) Kaggle, a website that serves as a platform for machine learning competitions, is launched.[69][70]
2010 "The Microsoft Kinect can track 20 human features at a rate of 30 times per second, allowing people to interact with the computer via movements and gestures."[6]
2010 (May 20) Software release Accord.NET is initially released.[71]
2010 "Constructing skill trees (CST) is a hierarchical reinforcement learning algorithm which can build skill trees from a set of sample solution trajectories obtained from demonstration. CST was introduced by George Konidaris, Scott Kuindersma, Andrew Barto and Roderic Grupen."
2011 Achievement Beating Humans in Jeopardy Using a combination of machine learning, natural language processing and information retrieval techniques, IBM's Watson beats two human champions in a Jeopardy! competition.[72]
2012 Achievement Recognizing Cats on YouTube The Google Brain team, led by Andrew Ng and Jeff Dean, create a neural network that learns to recognize cats by watching unlabeled images taken from frames of YouTube videos.[73][74] " In 2012, Google created a deep neural network which learned to recognize the image of humans and cats in YouTube videos."[12]
2012 "Google’s X Lab develops a machine learning algorithm that is able to autonomously browse YouTube videos to identify the videos that contain cats."[6]
2012 "AlexNet (2012) - AlexNet won the ImageNet competition by a large margin in 2012, which led to the use of GPUs and Convolutional Neural Networks in machine learning. They also created ReLU, which is an activation function that greatly improves efficiency of CNNs."[1]
2012 Special Interest Group on Knowledge Discovery and Data Mining
2012 (March 12) mlpy is released.[75]
2013 International Conference on Learning Representations
2014 Leap in Face Recognition Facebook researchers publish their work on DeepFace, a system that uses neural networks to identify faces with 97.35% accuracy. The results are an improvement of more than 27% over previous systems and rival human performance.[76] "Facebook develops DeepFace, a software algorithm that is able to recognize or verify individuals on photos to the same level as humans can."[6] "DeepFace was a deep neural network created by Facebook, and they claimed that it could recognize a person with the same precision as a human can do."[12]
2014 (May 26) Software release Apache Spark is first released.[77]
2014 Sibyl Researchers from Google detail their work on Sibyl,[78] a proprietary platform for massively parallel machine learning used internally by Google to make predictions about user behavior and provide recommendations.[79]
2014 "In 2014, the Chabot "Eugen Goostman" cleared the Turing Test. It was the first Chabot who convinced the 33% of human judges that it was not a machine."[12]
2014 "DeepMind (2014) - This company was bought by Google, and can play basic video games to the same levels as humans. In 2016, it managed to beat a professional at the game Go, which is considered to be one the world’s most difficult board games."[1]
2014 "Generative Adversarial Networks (GAN)"[2] "GAN is a class of machine learning systems invented by Ian Goodfellow and his colleagues in 2014."[80]
2014 "the Apache Spark software framework for distributed processing of unstructured and weakly structured data appeared; it was convenient for the implementation of machine learning algorithms."[21]
2015 (February) spaCy is released.[81][82]
2015 (March 27) Software release Keras is first released. It is an open source software library designed to simplify the creation of deep learning models.[83]
2015 (June 9) Software release Chainer is released.[66][84]
2015 (October 8) Software release Apache SINGA is first released.[85]
2015 Achievement Beating Humans in Go Google's AlphaGo program becomes the first Computer Go program to beat an unhandicapped professional human player[86] using a combination of machine learning and tree search techniques.[87]
2015 Software TensorFlow Library Google releases TensorFlow, an open source software library for machine learning.[88]
2015 "Amazon launches its own machine learning platform."[6]
2015 "Microsoft creates the Distributed Machine Learning Toolkit, which enables the efficient distribution of machine learning problems across multiple computers."[6]
2015 " Over 3,000 AI and Robotics researchers, endorsed by Stephen Hawking, Elon Musk and Steve Wozniak (among many others), sign an open letter warning of the danger of autonomous weapons which select and engage targets without human intervention."[6]
2015 " In 2015, the Google speech recognition program reportedly had a significant performance jump of 49 percent using a CTC-trained Long Short-Term Memory."[1]
2015 "OpenAI (2015) - This is a non-profit organisation created by Elon Musk and others, to create safe artificial intelligence that can benefit humanity."[1]
2015 "Amazon Machine Learning Platform (2015) - This is part of Amazon Web Services, and shows how most big companies want to get involved in machine learning. They say it drives many of their internal systems, from regularly used services such as search recommendations and Alexa, to more experimental ones like Prime Air and Amazon Go."[1]
2015 "ResNet (2015) - This was a major advancement in CNNs, and more information can be found on the Introduction to CNNs page."[1]
2015 "U-net (2015) - This is an CNN architecture specialised in biomedical image segmentation. It introduced an equal amount of upsampling and downsampling layers, and also skip connections. More information on what this means can be found on the Semantic Segmentation page."[1]
2015 "Machines and humans pair up to fight fraud online. When PayPal set out to fight fraud and money laundering on its site, it took a hybrid approach. Human detectives define the characteristics of criminal behavior, then a machine learning program uses those parameters to root out the bad guys on the PayPal site"[8]
2015 (November 30) Rnn (software) is released.
2016 (January 25) Microsoft Cognitive Toolkit is initially released. It is an AI solution aimed at helping users to advance in their machine learning projects.[64]
2016 "Google’s artificial intelligence algorithm beats a professional player at the Chinese board game Go, which is considered the world’s most complex board game and is many times harder than chess. The AlphaGo algorithm developed by Google DeepMind managed to win five games out of five in the Go competition."[6] "AlphaGo beat the world's number second player Lee sedol at Go game. In 2017 it beat the number one player of this game Ke Jie."[12]
2016 Software FBLearner Flow Facebook details FBLearner Flow, an internal software platform that allows Facebook software engineers to easily share, train and use machine learning algorithms.[89] FBLearner Flow is used by more than 25% of Facebook's engineers; more than a million models have been trained using the service, and it makes more than 6 million predictions per second.[90]
2016 (October) PyTorch is first released.[91]
2017 (April 18) Software release Caffe (Convolutional Architecture for Fast Feature Embedding) is initially released. It is a machine learning framework that focuses on expressiveness, speed, and modularity.[64]
2017 (April 25) Software release Shogun (toolbox) is released.
2017 "In 2017, the Alphabet's Jigsaw team built an intelligent system that was able to learn the online trolling. It used to read millions of comments of different websites to learn to stop online trolling."[12] "As part of its anti-harassment efforts, Alphabet’s Jigsaw team built a system that learned to identify trolling by reading millions of website comments. The underlying algorithms could be a huge help for sites with limited resources for moderation"[8]
2017 (May 1) CellCognition[92][93]
2017 (September 1) Software release Encog
2017 Software release PlaidML
2019 (September 10) Software release Deeplearning4j is initially released.
2019 (November 26) Software release mlpack is released.[94]
2020 (February 5) Software release KNIME is released.[95]
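
The 1967 entry describes the nearest neighbor approach to mapping a salesman's route: start at some city and keep moving to the closest city not yet visited. A minimal Python sketch of that idea follows. The function name `nearest_neighbor_tour`, the `(x, y)` coordinate format, and the example cities are illustrative assumptions and are not taken from any of the cited sources.

```python
import math


def nearest_neighbor_tour(cities, start=0):
    """Greedy tour: from the current city, always visit the closest unvisited one.

    `cities` is a list of (x, y) coordinates (a hypothetical input format).
    Returns the visiting order as a list of indices into `cities`.
    """
    unvisited = set(range(len(cities)))
    unvisited.remove(start)
    tour = [start]
    while unvisited:
        current = cities[tour[-1]]
        # Pick the unvisited city with the smallest Euclidean distance.
        nearest = min(unvisited, key=lambda i: math.dist(current, cities[i]))
        unvisited.remove(nearest)
        tour.append(nearest)
    return tour


# Hypothetical example: four cities on a line, starting from city 0.
cities = [(0, 0), (3, 0), (1, 0), (6, 0)]
print(nearest_neighbor_tour(cities))  # prints [0, 2, 1, 3]
```

The heuristic visits every city exactly once and is quick to compute, but it does not guarantee the shortest possible tour.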

See also

Meta information on the timeline

How the timeline was built

The initial version of the timeline was written by User:Issa.

Funding information for this timeline is available.

Feedback and comments

Feedback for the timeline can be provided at the following places:


What the timeline is still missing

Timeline update strategy

External links


  1. "A Brief History of Machine Learning". dataversity.net. Retrieved 20 February 2020.
  2. "A History of Machine Learning and Deep Learning". import.io. Retrieved 21 February 2020.
  3. "A brief history of the development of machine learning algorithms". subscription.packtpub.com. Retrieved 25 February 2020.
  4. "A BRIEF HISTORY OF MACHINE LEARNING". provalisresearch.com. Retrieved 21 February 2020. 
  5. "What is Machine Learning?". mlplatform.nl. Retrieved 25 February 2020. 
  6. "A Short History of Machine Learning". forbes.com. Retrieved 20 February 2020.
  7. "History of Machine Learning". medium.com. Retrieved 25 February 2020.
  8. "A history of machine learning". cloud.withgoogle.com. Retrieved 21 February 2020.
  9. Bayes, Thomas (1 January 1763). "An Essay towards solving a Problem in the Doctrine of Chances" (PDF). Philosophical Transactions. 53: 370–418. doi:10.1098/rstl.1763.0053. Retrieved 15 June 2016.
  10. Legendre, Adrien-Marie (1805). Nouvelles méthodes pour la détermination des orbites des comètes (in French). Paris: Firmin Didot. p. viii. Retrieved 13 June 2016. 
  11. O'Connor, J J; Robertson, E F. "Pierre-Simon Laplace". School of Mathematics and Statistics, University of St Andrews, Scotland. Retrieved 15 June 2016. 
  12. "History of Machine Learning". javatpoint.com. Retrieved 21 February 2020.
  13. Hayes, Brian. "First Links in the Markov Chain". American Scientist. Sigma Xi, The Scientific Research Society (March–April 2013): 92. doi:10.1511/2013.101.1. Retrieved 15 June 2016. Delving into the text of Alexander Pushkin’s novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants. On January 23, 1913, he summarized his findings in an address to the Imperial Academy of Sciences in St. Petersburg. His analysis did not alter the understanding or appreciation of Pushkin’s poem, but the technique he developed—now known as a Markov chain—extended the theory of probability in a new direction. 
  14. "Brief History of Machine Learning". erogol.com. Retrieved 24 February 2020.
  15. Turing, Alan (October 1950). "COMPUTING MACHINERY AND INTELLIGENCE". MIND. 59 (236): 433–460. doi:10.1093/mind/LIX.236.433. Retrieved 8 June 2016. 
  16. Crevier 1993, pp. 34–35 and Russell & Norvig 2003, p. 17
  17. McCarthy, John; Feigenbaum, Ed. "Arthur Samuel: Pioneer in Machine Learning". AI Magazine (3). Association for the Advancement of Artificial Intelligence. p. 10. Retrieved 5 June 2016. 
  18. Rosenblatt, Frank (1958). "THE PERCEPTRON: A PROBABILISTIC MODEL FOR INFORMATION STORAGE AND ORGANIZATION IN THE BRAIN" (PDF). Psychological Review. 65 (6): 386–408. 
  19. Mason, Harding; Stewart, D; Gill, Brendan (6 December 1958). "Rival". The New Yorker. Retrieved 5 June 2016. 
  20. "Seventy years of highs and lows in the history of machine learning". fastcompany.com. Retrieved 25 February 2020. 
  21. "History of deep machine learning". medium.com. Retrieved 21 February 2020.
  22. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  23. Cohen, Harvey. "The Perceptron". Retrieved 5 June 2016. 
  24. Colner, Robert. "A brief history of machine learning". SlideShare. Retrieved 5 June 2016. 
  25. Seppo Linnainmaa (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master's Thesis (in Finnish), Univ. Helsinki, 6-7.
  26. Seppo Linnainmaa (1976). Taylor expansion of the accumulated rounding error. BIT Numerical Mathematics, 16(2), 146-160.
  27. Griewank, Andreas (2012). Who Invented the Reverse Mode of Differentiation?. Optimization Stories, Documenta Mathematica, Extra Volume ISMP (2012), 389-400.
  28. Griewank, Andreas and Walther, A.. Principles and Techniques of Algorithmic Differentiation, Second Edition. SIAM, 2008.
  29. Jürgen Schmidhuber (2015). Deep learning in neural networks: An overview. Neural Networks 61 (2015): 85-117. ArXiv
  30. Jürgen Schmidhuber (2015). Deep Learning. Scholarpedia, 10(11):32832. Section on Backpropagation
  31. Dempster, A.P.; Laird, N.M.; Rubin, D.B. (1977). "Maximum Likelihood from Incomplete Data via the EM Algorithm". Journal of the Royal Statistical Society, Series B. 39 (1): 1–38. JSTOR 2984875. MR 0501537. 
  32. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  33. Fukushima, Kunihiko (1980). "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position" (PDF). Biological Cybernetics. 36: 193–202. doi:10.1007/bf00344251. Retrieved 5 June 2016.
  34. Le Cun, Yann. "Deep Learning". Retrieved 5 June 2016. 
  35. Linde, Y.; Buzo, A.; Gray, R. (1980). "An Algorithm for Vector Quantizer Design". IEEE Transactions on Communications. 28: 84–95. doi:10.1109/TCOM.1980.1094577. 
  36. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  37. Hopfield, John (April 1982). "Neural networks and physical systems with emergent collective computational abilities" (PDF). Proceedings of the National Academy of Sciences of the United States of America. 79: 2554–2558. doi:10.1073/pnas.79.8.2554. Retrieved 8 June 2016. 
  38. Bozinovski, S. (1982). "A self-learning system using secondary reinforcement". In Trappl, Robert (ed.). Cybernetics and Systems Research: Proceedings of the Sixth European Meeting on Cybernetics and Systems Research. North Holland. pp. 397–402.
  39. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  40. Rumelhart, David; Hinton, Geoffrey; Williams, Ronald (9 October 1986). "Learning representations by back-propagating errors" (PDF). Nature. 323: 533–536. doi:10.1038/323533a0. Retrieved 5 June 2016. 
  41. "A brief history of machine learning". slideshare.net. Retrieved 24 February 2020. 
  42. Dehaene S, Changeux JP. Experimental and theoretical approaches to conscious processing. Neuron. 2011 Apr 28;70(2):200-27.
  43. Changeux JP, Dehaene S. Hierarchical neuronal modeling of cognitive functions: from synaptic transmission to the Tower of London. Comptes Rendus de l'Académie des Sciences, Série III. 1998 Feb–Mar;321(2–3):241-7.
  44. Dehaene S, Changeux JP, Nadal JP. Neural networks that learn temporal sequences by selection. Proc Natl Acad Sci U S A. 1987 May;84(9):2727-31.
  45. "Machine Learning". springer.com. Retrieved 9 March 2020. 
  46. Watkins, Christopher (1 May 1989). "Learning from Delayed Rewards" (PDF).
  47. Markoff, John (29 August 1990). "BUSINESS TECHNOLOGY; What's the Best Answer? It's Survival of the Fittest". New York Times. Retrieved 8 June 2016. 
  48. Watkins, C.J.C.H. (1989), Learning from Delayed Rewards (PDF) (Ph.D. thesis), Cambridge University 
  49. Watkins, C.J.C.H.; Dayan, P. (1992). "Q-learning". Machine Learning.
  50. Tesauro, Gerald (March 1995). "Temporal Difference Learning and TD-Gammon". Communications of the ACM. 38 (3). 
  51. Ho, Tin Kam (August 1995). "Random Decision Forests" (PDF). Proceedings of the Third International Conference on Document Analysis and Recognition. Montreal, Quebec: IEEE. 1: 278–282. ISBN 0-8186-7128-9. doi:10.1109/ICDAR.1995.598994. Retrieved 5 June 2016. 
  52. Cortes, Corinna; Vapnik, Vladimir (September 1995). "Support-vector networks" (PDF). Machine Learning. Kluwer Academic Publishers. 20 (3): 273–297. ISSN 0885-6125. doi:10.1007/BF00994018. Retrieved 5 June 2016. 
  53. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  54. Hochreiter, Sepp; Schmidhuber, Jürgen (1997). "LONG SHORT-TERM MEMORY" (PDF). Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. 
  55. LeCun, Yann; Cortes, Corinna; Burges, Christopher. "THE MNIST DATABASE of handwritten digits". Retrieved 16 June 2016. 
  56. Breunig, M. M.; Kriegel, H.-P.; Ng, R. T.; Sander, J. (2000). LOF: Identifying Density-based Local Outliers (PDF). Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. SIGMOD. pp. 93–104. ISBN 1-58113-217-4. doi:10.1145/335191.335388. 
  57. Friedman, Jerome; Hastie, Trevor; Tibshirani, Robert (2000). "Additive logistic regression: a statistical view of boosting". Annals of Statistics. 28 (2): 337–407. doi:10.1214/aos/1016218223.
  58. "Intuitive for Patients". davincisurgery.com. Retrieved 9 March 2020. 
  59. Cui Yu, Beng Chin Ooi, Kian-Lee Tan and H. V. Jagadish Indexing the distance: an efficient method to KNN processing, Proceedings of the 27th International Conference on Very Large Data Bases, Rome, Italy, 421-430, 2001.
  60. Collobert, Ronan; Bengio, Samy; Mariethoz, Johnny (30 October 2002). "Torch: a modular machine learning software library" (PDF). Retrieved 5 June 2016.
  61. Ham, Ji Hun; Daniel D. Lee; Lawrence K. Saul (2003). "Learning high dimensional correspondences from low dimensional manifolds" (PDF). Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003). 
  62. "The Netflix Prize Rules". Netflix Prize. Netflix. Retrieved 16 June 2016. 
  63. "What is scikit-learn ?". njtrainingacademy.com. Retrieved 5 March 2020. 
  64. "Sharing is Caring with Algorithms". towardsdatascience.com. Retrieved 8 March 2020.
  65. "Python's pandas library is on its way to v.1.0.0 – first release candidate has arrived". jaxenter.com. Retrieved 9 March 2020. 
  66. Liu, Fei Tony; Ting, Kai Ming; Zhou, Zhi-Hua (December 2008). "Isolation Forest". 2008 Eighth IEEE International Conference on Data Mining: 413–422. ISBN 978-0-7695-3502-9. doi:10.1109/ICDM.2008.17.
  67. "Encog Machine Learning Framework". heatonresearch.com. Retrieved 8 March 2020. 
  68. "Apache Mahout". people.apache.org. Retrieved 9 March 2020. 
  69. "About". Kaggle. Kaggle Inc. Retrieved 16 June 2016. 
  70. Simon, Phil. Too Big to Ignore: The Business Case for Big Data. 
  71. "Accord.NET Framework – An extension to AForge.NET". crsouza.com. Retrieved 9 March 2020.
  72. Markoff, John (17 February 2011). "Computer Wins on 'Jeopardy!': Trivial, It's Not". New York Times. p. A1. Retrieved 5 June 2016. 
  73. Le, Quoc; Ranzato, Marc’Aurelio; Monga, Rajat; Devin, Matthieu; Chen, Kai; Corrado, Greg; Dean, Jeff; Ng, Andrew (12 July 2012). "Building High-level Features Using Large Scale Unsupervised Learning". CoRR. arXiv:1112.6209.
  74. Markoff, John (26 June 2012). "How Many Computers to Identify a Cat? 16,000". New York Times. p. B1. Retrieved 5 June 2016. 
  75. "mlpy". mlpy.sourceforge.net. Retrieved 8 March 2020. 
  76. Taigman, Yaniv; Yang, Ming; Ranzato, Marc’Aurelio; Wolf, Lior (24 June 2014). "DeepFace: Closing the Gap to Human-Level Performance in Face Verification". Conference on Computer Vision and Pattern Recognition. Retrieved 8 June 2016. 
  77. "Popular Big Data Engine Apache Spark 2.0 Released". adtmag.com. Retrieved 8 March 2020. 
  78. Canini, Kevin; Chandra, Tushar; Ie, Eugene; McFadden, Jim; Goldman, Ken; Gunter, Mike; Harmsen, Jeremiah; LeFevre, Kristen; Lepikhin, Dmitry; Llinares, Tomas Lloret; Mukherjee, Indraneel; Pereira, Fernando; Redstone, Josh; Shaked, Tal; Singer, Yoram. "Sibyl: A system for large scale supervised machine learning" (PDF). Jack Baskin School Of Engineering. UC Santa Cruz. Retrieved 8 June 2016. 
  79. Woodie, Alex (17 July 2014). "Inside Sibyl, Google's Massively Parallel Machine Learning Platform". Datanami. Tabor Communications. Retrieved 8 June 2016. 
  80. Goodfellow, Ian; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). Generative Adversarial Networks (PDF). Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014). pp. 2672–2680. 
  81. "A Little spaCy Food for Thought: Easy to use NLP Framework". towardsdatascience.com. Retrieved 5 March 2020. 
  82. "Introducing spaCy". explosion.ai. Retrieved 5 March 2020. 
  83. "Keras". news.ycombinator.com. Retrieved 5 March 2020. 
  84. "Deep Learning のフレームワーク Chainer を公開しました" (in Japanese). 2015-06-09. Retrieved 8 March 2020.
  85. "Apache SINGA". singa.apache.org. Retrieved 8 March 2020. 
  86. "Google achieves AI 'breakthrough' by beating Go champion". BBC News. BBC. 27 January 2016. Retrieved 5 June 2016. 
  87. "AlphaGo". Google DeepMind. Google Inc. Retrieved 5 June 2016. 
  88. Dean, Jeff; Monga, Rajat (9 November 2015). "TensorFlow - Google's latest machine learning system, open sourced for everyone". Google Research Blog. Retrieved 5 June 2016. 
  89. Dunn, Jeffrey (10 May 2016). "Introducing FBLearner Flow: Facebook's AI backbone". Facebook Code. Facebook. Retrieved 8 June 2016. 
  90. Shead, Sam (10 May 2016). "There's an 'AI backbone' that over 25% of Facebook's engineers are using to develop new products". Business Insider. Allure Media. Retrieved 8 June 2016. 
  91. "PyTorch Releases Major Update, Now Officially Supports Windows". medium.com. Retrieved 8 March 2020. 
  92. "CellCognition Explorer". software.cellcognition-project.org. Retrieved 8 March 2020. 
  93. "A deep learning and novelty detection framework for rapid phenotyping in high-content screening". PMC 5687041. PMID 28954863. doi:10.1091/mbc.E17-05-0333.
  94. "mlpack". pypi.org. Retrieved 8 March 2020. 
  95. "KNIME". knime.com. Retrieved 8 March 2020.