Difference between revisions of "Timeline of machine learning"

From Timelines
(See also)
 
(161 intermediate revisions by the same user not shown)
Line 3: Line 3:
 
This page is a '''timeline of [[wikipedia:machine learning|machine learning]]'''. Major discoveries, achievements, milestones and other major events are included.
 
This page is a '''timeline of [[wikipedia:machine learning|machine learning]]'''. Major discoveries, achievements, milestones and other major events are included.
  
==Overview==
+
== Big picture ==
 +
 
 +
{| class="wikitable"
 +
! Time period !! Development summary !! More details
 +
|-
 +
| 1950s-1970s || Early days || The early days of machine learning are marked by the development of statistical methods and the use of simple algorithms. In the 1950s, Arthur Samuel develops a program that learns to play checkers, and Frank Rosenblatt develops the perceptron, a simple neural network that can learn to classify patterns. However, this period is also marked by a wave of pessimism in the 1970s, known as the AI Winter. This is due to a number of factors, including the failure of some early AI projects and the difficulty of scaling up machine learning algorithms to large datasets.
 +
|-
 +
| 1980s-1990s || Resurgence || The rediscovery of backpropagation causes a resurgence in machine learning research. Convolutional neural networks emerge. Support vector machines and recurrent neural networks become popular. Machine learning shifts from a knowledge-driven approach to a data-driven approach.<ref name="Firican">{{cite web |last1=Firican |first1=George |title=The history of Machine Learning |url=https://www.lightsondata.com/the-history-of-machine-learning/ |website=LightsOnData |access-date=5 July 2023 |date=31 January 2022}}</ref>
 +
|-
 +
| 2000s-present || Modern era || The modern era of machine learning begins in the 2000s, when the development of deep learning makes it possible to train neural networks on much larger datasets. This leads to a resurgence of interest in neural networks, which are now used in a wide variety of applications, including image recognition, natural language processing, speech recognition, machine translation, medical diagnosis, financial trading, and self-driving cars.
 +
|}
 +
 
 +
=== Summary by decade ===
  
 
{| class="wikitable sortable"
 
{| class="wikitable sortable"
Line 11: Line 23:
 
| <1950s|| Statistical methods are discovered and refined.
 
| <1950s|| Statistical methods are discovered and refined.
 
|-
 
|-
| 1950s || Pioneering [[wikipedia:machine learning|machine learning]] research is conducted using simple algorithms.
+
| 1950s || Pioneering {{w|machine learning}} research is conducted using simple algorithms.
 
|-
 
|-
| 1960s || "In the 1960s, the discovery and use of multilayers opened a new path in neural network research."<ref name="dataversity.net">{{cite web |title=A Brief History of Machine Learning |url=https://www.dataversity.net/a-brief-history-of-machine-learning/ |website=dataversity.net |accessdate=20 February 2020}}</ref> "1960s: Shallow neural networks"<ref name="import.ioe">{{cite web |title=A History of Machine Learning and Deep Learning |url=https://www.import.io/post/history-of-deep-learning/ |website=import.io |accessdate=21 February 2020}}</ref>
+
| 1960s || The field of neural network research experiences a notable development with the discovery and use of multilayers.<ref name="dataversity.net">{{cite web |title=A Brief History of Machine Learning |url=https://www.dataversity.net/a-brief-history-of-machine-learning/ |website=dataversity.net |accessdate=20 February 2020}}</ref> During this decade, neural networks are primarily shallow in structure, consisting of only a few layers of interconnected neurons. These shallow networks are limited in handling complex problems that require more sophisticated data representations, but they lay the foundation for the deeper and more powerful networks developed later.<ref name="import.ioe">{{cite web |title=A History of Machine Learning and Deep Learning |url=https://www.import.io/post/history-of-deep-learning/ |website=import.io |accessdate=21 February 2020}}</ref>
 
|-
 
|-
| 1970s || '[[wikipedia:AI Winter|AI Winter]]' caused by pessimism about machine learning effectiveness. "Backpropagation, developed in the 1970s, allows a network to adjust its hidden layers of neurons/nodes to adapt to new situations."<ref name="dataversity.net"/>
+
| 1970s || The {{w|AI Winter}} is caused by pessimism about machine learning effectiveness. Backpropagation is developed, allowing a network to adjust its hidden layers of neurons/nodes to adapt to new situations.<ref name="dataversity.net"/>
 
|-
 
|-
| 1980s || ". In the mid-1980s, artificial neural networks (ANN) came to the foreground, to be then pushed aside by statistical learning systems in the 1990s. "<ref name="subscription.packtpub.com"/> "Convolution emerges"<ref name="import.ioe"/> Rediscovery of [[wikipedia:backpropagation|backpropagation]] causes a resurgence in machine learning research.
+
| 1980s || During the mid-1980s, the focus of machine learning research shifts towards artificial neural networks (ANN), although in the 1990s statistical learning systems gain prominence and temporarily overshadow them. Convolution emerges as a significant concept in machine learning, and the rediscovery of {{w|backpropagation}} causes a resurgence in machine learning research.<ref name="subscription.packtpub.com"/><ref name="import.ioe"/>
 
|-
 
|-
| 1990s || "1990s: Unsupervised deep learning"<ref name="import.ioe"/> "Thanks to statistics, machine learning became very famous in 1990s. The intersection of computer science and statistics gave birth to probabilistic approaches in AI. This shifted the field further toward data-driven approaches."<ref name="provalisresearch.coms">{{cite web |title=A BRIEF HISTORY OF MACHINE LEARNING |url=https://provalisresearch.com/blog/brief-history-machine-learning/ |website=provalisresearch.com |accessdate=21 February 2020}}</ref> "In the early 90’s Machine Learning became very popular again due to the intersection of Computer Science and Statistics"<ref name="mlplatform.nlt">{{cite web |title=What is Machine Learning? |url=https://mlplatform.nl/what-is-machine-learning/ |website=mlplatform.nl |accessdate=25 February 2020}}</ref> " Work on machine learning shifts from a knowledge-driven approach to a data-driven approach.  Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions — or “learn” — from the results."<ref name="forbes.com"/>  Work on machine learning shifts from a knowledge-driven approach to a data-driven approach. Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions — or “learn” — from the results.<ref name="forbes.com"/> "In the 1990s we began to apply machine learning in data mining, adaptive software and web applications, text learning, and language learning."<ref name="medium.comb"/> [[wikipedia:Support vector machines|Support vector machines]] and [[wikipedia:recurrent neural networks|recurrent neural networks]] become popular.
+
| 1990s || There is a shift away from neural networks and towards statistical learning methods. Statistical learning methods are able to achieve comparable or better performance than neural networks on a wider range of tasks. However, neural networks continue to be used for some specific tasks, such as natural language processing and image recognition.<ref name="provalisresearch.coms">{{cite web |title=A BRIEF HISTORY OF MACHINE LEARNING |url=https://provalisresearch.com/blog/brief-history-machine-learning/ |website=provalisresearch.com |accessdate=21 February 2020}}</ref><ref name="mlplatform.nlt">{{cite web |title=What is Machine Learning? |url=https://mlplatform.nl/what-is-machine-learning/ |website=mlplatform.nl |accessdate=25 February 2020}}</ref><ref name="forbes.com"/>
 
|-
 
|-
| 2000s || Deep learning becomes feasible and neural networks see widespread commercial use. "2006s-present: Modern deep learning"<ref name="import.ioe"/>
+
| 2000s || Deep learning becomes feasible and neural networks see widespread commercial use.<ref name="import.ioe"/>
 
|-
 
|-
 
| 2010s || Machine learning becomes integral to many widely used software services and receives great publicity.
 
| 2010s || Machine learning becomes integral to many widely used software services and receives great publicity.
 
|}
 
|}
  
==Timeline==
+
== Full timeline ==
 
+
[[wikipedia:File:A simple neural network with two input units and one output unit.png|thumb|A simple neural network with two input units and one output unit]]
 
 
 
[[wikipedia:File:OS2 TD-Gammon Screenshot.png|thumb|OS/2 [[wikipedia:TD-Gammon|TD-Gammon]] game screenshot]]
 
 
 
<!-- Commented out: [[wikipedia:File:Watson Jeopardy.jpg|thumb|[[wikipedia:Ken Jennings|Ken Jennings]], Watson, and [[wikipedia:Brad Rutter|Brad Rutter]] in their Jeopardy! exhibition match]] -->
 
  
 
{| class="wikitable sortable"
 
{| class="wikitable sortable"
Line 38: Line 45:
 
! Year !! Event Type !! Caption !! Event
 
! Year !! Event Type !! Caption !! Event
 
|-
 
|-
| 1642 || || || "Blaise Pascal was 19 when he made an “arithmetic machine” for his tax collector father. It could add, subtract, multiply, and divide. Three centuries later, the IRS uses machine learning to combat tax evasion."<ref name="cloud.withgoogle.com">{{cite web |title=A history of machine learning |url=https://cloud.withgoogle.com/build/data-analytics/explore-history-machine-learning/ |website=cloud.withgoogle.com |accessdate=21 February 2020}}</ref>
+
| 1642 || || || At the age of 19, French child prodigy {{w|Blaise Pascal}} creates an "arithmetic machine" for his father, a tax collector. The machine can perform addition, subtraction, multiplication, and division. Three centuries later, the {{w|Internal Revenue Service}} (IRS) uses machine learning techniques to tackle tax evasion.<ref name="cloud.withgoogle.com">{{cite web |title=A history of machine learning |url=https://cloud.withgoogle.com/build/data-analytics/explore-history-machine-learning/ |website=cloud.withgoogle.com |accessdate=21 February 2020}}</ref>
 +
|-
 +
| 1679 || || || {{w|Gottfried Wilhelm Leibniz}}, a German mathematician, philosopher, and occasional poet, is credited with devising the system of binary code that serves as the basis for contemporary computing.<ref name="cloud.withgoogle.com"/>
 
|-
 
|-
| 1679 || || || "German mathematician, philosopher, and occasional poet Gottfried Wilhelm Leibniz devised the system of binary code that laid the foundation for modern computing"<ref name="cloud.withgoogle.com"/>
+
| 1763 || Discovery || The Underpinnings of Bayes' Theorem || {{w|Thomas Bayes}}'s work ''{{w|An Essay towards solving a Problem in the Doctrine of Chances}}'' is published two years after his death, having been amended and edited by a friend of Bayes, {{w|Richard Price}}.<ref>{{cite journal|last1=Bayes|first1=Thomas|title=An Essay towards solving a Problem in the Doctrine of Chances|journal=Philosophical Transactions|date=1 January 1763|volume=53|pages=370–418|doi=10.1098/rstl.1763.0053|url=http://rstl.royalsocietypublishing.org/content/53/370.full.pdf|accessdate=15 June 2016}}</ref> The essay presents work which underpins {{w|Bayes theorem}}.
 
|-
 
|-
| 1763 || Discovery || The Underpinnings of Bayes' Theorem || [[wikipedia:Thomas Bayes|Thomas Bayes]]'s work ''[[wikipedia:An Essay towards solving a Problem in the Doctrine of Chances|An Essay towards solving a Problem in the Doctrine of Chances]]'' is published two years after his death, having been amended and edited by a friend of Bayes, [[wikipedia:Richard Price|Richard Price]].<ref>{{cite journal|last1=Bayes|first1=Thomas|title=An Essay towards solving a Problem in the Doctrine of Chances|journal=Philosophical Transactions|date=1 January 1763|volume=53|pages=370–418|doi=10.1098/rstl.1763.0053|url=http://rstl.royalsocietypublishing.org/content/53/370.full.pdf|accessdate=15 June 2016}}</ref> The essay presents work which underpins [[wikipedia:Bayes theorem|Bayes theorem]].
+
| 1801 || || || French weaver and merchant {{w|Joseph-Marie Jacquard}} introduces a groundbreaking innovation in data storage through the invention of a programmable weaving loom. The loom utilizes punched cards to control the movement of warp threads, enabling the creation of intricate patterns in fabric. This revolutionary technology not only allows weavers to produce complex designs more efficiently but also paves the way for future advancements in data storage. The concept of punched cards, pioneered by Jacquard, would become a fundamental principle in computer data storage systems during the 20th century. This significant development lays the foundation for the evolution of data storage technology as we know it today.<ref>{{cite web |title=Jacquard Loom, 1934 - The Henry Ford |url=https://www.thehenryford.org/collections-and-research/digital-collections/artifact/354319/ |website=www.thehenryford.org |access-date=14 June 2023 |language=en}}</ref><ref name="medium.comb"/>
 
|-
 
|-
| 1770 || || || "A chess-playing automaton debuts, then dupes Europe for decades" "A moving, mechanical device designed to imitate a human, “The Turk” fooled even Napoleon into thinking it could play chess. The jig was up in 1857 when The Turk’s final owner revealed how a person hidden inside moved its arms."<ref name="cloud.withgoogle.com"/>
+
| 1805 || Discovery || Least Squares || {{w|Adrien-Marie Legendre}} describes the "méthode des moindres carrés", known in English as the {{w|least squares}} method.<ref>{{cite book|last1=Legendre|first1=Adrien-Marie|title=Nouvelles méthodes pour la détermination des orbites des comètes|date=1805|publisher=Firmin Didot|location=Paris|page=viii|url=https://books.google.com.au/books/about/Nouvelles_m%C3%A9thodes_pour_la_d%C3%A9terminati.html?id=FRcOAAAAQAAJ&redir_esc=y|accessdate=13 June 2016|language=French}}</ref> The least squares method is used widely in {{w|data fitting}}, which, in machine learning, refers to the process of finding a model or function that best fits a given dataset.
 
|-
 
|-
| 1801 || || || "1801- First Data Storage through the Weaving Loom"<ref name="medium.comb"/>
+
| 1812 || || Bayes' Theorem || {{w|Pierre-Simon Laplace}} publishes ''Théorie Analytique des Probabilités'', in which he expands upon the work of Bayes and defines what is now known as {{w|Bayes' Theorem}}.<ref>{{cite web|last1=O'Connor|first1=J J|last2=Robertson|first2=E F|title=Pierre-Simon Laplace|url=http://www-history.mcs.st-and.ac.uk/Biographies/Laplace.html|publisher=School of Mathematics and Statistics, University of St Andrews, Scotland|accessdate=15 June 2016}}</ref>
 
|-
 
|-
| 1805 || Discovery || Least Squares || [[wikipedia:Adrien-Marie Legendre|Adrien-Marie Legendre]] describes the "méthode des moindres carrés", known in English as the [[wikipedia:least squares|least squares]] method.<ref>{{cite book|last1=Legendre|first1=Adrien-Marie|title=Nouvelles méthodes pour la détermination des orbites des comètes|date=1805|publisher=Firmin Didot|location=Paris|page=viii|url=https://books.google.com.au/books/about/Nouvelles_m%C3%A9thodes_pour_la_d%C3%A9terminati.html?id=FRcOAAAAQAAJ&redir_esc=y|accessdate=13 June 2016|language=French}}</ref> The least squares method is used widely in [[wikipedia:data fitting|data fitting]].
+
| 1834 || || || English polymath {{w|Charles Babbage}}, known as the father of the computer, envisions a machine that could be programmed using punch cards. Although the device would never be constructed, its logical framework forms the basis for all modern computers. Charles Babbage's contribution to punch-card programming is significant in the development of computer technology.<ref name="javatpoint.comu">{{cite web |title=History of Machine Learning |url=https://www.javatpoint.com/history-of-machine-learning |website=javatpoint.com |accessdate=21 February 2020}}</ref><ref name="cloud.withgoogle.com"/>
 
|-
 
|-
| 1812 || || Bayes' Theorem || [[wikipedia:Pierre-Simon Laplace|Pierre-Simon Laplace]] publishes ''Théorie Analytique des Probabilités'', in which he expands upon the work of Bayes and defines what is now known as [[wikipedia:Bayes' Theorem|Bayes' Theorem]].<ref>{{cite web|last1=O'Connor|first1=J J|last2=Robertson|first2=E F|title=Pierre-Simon Laplace|url=http://www-history.mcs.st-and.ac.uk/Biographies/Laplace.html|publisher=School of Mathematics and Statistics, University of St Andrews, Scotland|accessdate=15 June 2016}}</ref>
+
| 1842 || || || English mathematician and writer {{w|Ada Lovelace}} becomes the world's first computer programmer. She develops an algorithm that outlines a series of steps for solving mathematical problems on Charles Babbage's theoretical punch-card machine. Ada Lovelace's pioneering work in computer programming would be recognized years later when the US Department of Defense names a new software language "Ada" in her honor.<ref name="cloud.withgoogle.com"/>
 
|-
 
|-
| 1834 || || || "In 1834, Charles Babbage, the father of the computer, conceived a device that could be programmed with punch cards. However, the machine was never built, but all modern computers rely on its logical structure."<ref name="javatpoint.comu">{{cite web |title=History of Machine Learning |url=https://www.javatpoint.com/history-of-machine-learning |website=javatpoint.com |accessdate=21 February 2020}}</ref> "The "father of the computer" invents punch-card programming"<ref name="cloud.withgoogle.com"/>
+
| 1847 || || || English mathematician, philosopher, and logician {{w|George Boole}} devises a type of algebra that allows all values to be simplified as either "true" or "false." This concept, known as {{w|Boolean logic}}, would play a crucial role in contemporary computing by aiding the central processing unit (CPU) in determining how to handle incoming inputs.<ref name="cloud.withgoogle.com"/><ref name="medium.comb">{{cite web |title=History of Machine Learning |url=https://medium.com/bloombench/history-of-machine-learning-7c9dc67857a5 |website=medium.com |accessdate=25 February 2020}}</ref>
 
|-
 
|-
| 1842 || || || "Ada Lovelace's algorithm makes her the world's first computer programmer" "The 27-year-old mathematician described a sequence of operations for solving mathematical problems using Charles' Babbage's theoretical punch-card machine. In the 70s, the US Department of Defense paid homage, naming a new software language Ada."<ref name="cloud.withgoogle.com"/>
+
| 1854 || || || English physician {{w|John Snow}}, during a deadly cholera outbreak in {{w|London}}, challenges the prevailing belief that cholera spreads through "bad air." Using a map, Snow plots the locations of cholera cases and identifies the regions closest to each water pump. He makes a significant discovery by finding that most deaths occurred near a specific pump on Broad Street in the Soho district. Snow deduces that the contaminated water from that pump is responsible for the outbreak. By convincing the locals to disable the pump, the epidemic is brought under control. This event marks the birth of epidemiology and serves as an early success of the nearest-neighbor algorithm, even before its official invention, nearly a century later.<ref name=Domingos>{{cite book |last1=Domingos |first1=Pedro |title=The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World |date=22 September 2015 |publisher=Basic Books |edition=1st |url=https://www.amazon.com/Master-Algorithm-Ultimate-Learning-Machine-ebook/dp/B012271YB2#:~:text=In%20The%20Master%20Algorithm%2C%20Pedro,business%2C%20science%2C%20and%20society. |language=English}}</ref>
 
|-
 
|-
| 1847 || || || "Philosopher and closet mystic George Boole created a form of algebra in which all values can be reduced to “true” or “false.” Essential to modern computing, Boolean logic helps a CPU decide how to process new inputs."<ref name="cloud.withgoogle.com"/><ref name="medium.comb">{{cite web |title=History of Machine Learning |url=https://medium.com/bloombench/history-of-machine-learning-7c9dc67857a5 |website=medium.com |accessdate=25 February 2020}}</ref>
+
| 1890 || || || German-American statistician, inventor, and businessman {{w|Herman Hollerith}} develops a pioneering mechanical system that integrates punch cards with mechanical calculation methods. This groundbreaking system would enable the rapid computation of statistics compiled from vast amounts of data collected from millions of individuals.<ref name="medium.comb"/> Such advancement would contribute to the evolution of computing and provide a basis for future developments in machine learning.
 
|-
 
|-
| 1890 || || || "1890 - Mechanical System for Statistical calculations" "Herman Hollerith created the first combined system of mechanical calculation and punch cards to rapidly calculate statistics gathered from millions of people."<ref name="medium.comb"/>
+
| 1913 || Discovery || Markov Chains || {{w|Andrey Markov}} first describes techniques he used to analyse a poem. The techniques later become known as {{w|Markov chains}}.<ref>{{cite journal|last1=Hayes|first1=Brian|title=First Links in the Markov Chain|url=http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain/|accessdate=15 June 2016|work=American Scientist|issue=March–April 2013|publisher=Sigma Xi, The Scientific Research Society|page=92|doi=10.1511/2013.101.1|quote=Delving into the text of Alexander Pushkin’s novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants. On January 23, 1913, he summarized his findings in an address to the Imperial Academy of Sciences in St. Petersburg. His analysis did not alter the understanding or appreciation of Pushkin’s poem, but the technique he developed—now known as a Markov chain—extended the theory of probability in a new direction.}}</ref>
 
|-
 
|-
| 1913 || Discovery || Markov Chains || [[wikipedia:Andrey Markov|Andrey Markov]] first describes techniques he used to analyse a poem. The techniques later become known as [[wikipedia:Markov chains|Markov chains]].<ref>{{cite journal|last1=Hayes|first1=Brian|title=First Links in the Markov Chain|url=http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain/|accessdate=15 June 2016|work=American Scientist|issue=March–April 2013|publisher=Sigma Xi, The Scientific Research Society|page=92|doi=10.1511/2013.101.1|quote=Delving into the text of Alexander Pushkin’s novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants. On January 23, 1913, he summarized his findings in an address to the Imperial Academy of Sciences in St. Petersburg. His analysis did not alter the understanding or appreciation of Pushkin’s poem, but the technique he developed—now known as a Markov chain—extended the theory of probability in a new direction.}}</ref>
+
| 1936 || || || English mathematician {{w|Alan Turing}} proposes a theory outlining how a machine could identify and carry out a predefined set of instructions.<ref name="javatpoint.comu"/> His theory of computation forms the foundation of modern computing and has direct relevance to machine learning. Turing's concept of a universal machine lays the groundwork for the development of computers capable of executing algorithms and processing data.<ref>{{cite journal |last1=Bernhardt |first1=Chris |title=Turing's Vision: The Birth of Computer Science |date=2016 |url=https://www.jstor.org/stable/j.ctt1c2crt7 |publisher=The MIT Press}}</ref>
 
|-
 
|-
| 1936 || || || "In 1936, Alan Turing gave a theory that how a machine can determine and execute a set of instructions."<ref name="javatpoint.comu"/>
+
| 1940 || || || {{w|ENIAC}} (Electronic Numerical Integrator and Computer) is created as the first manually operated computer. This invention marks the birth of the first electronic general-purpose computer. Following this milestone, stored program computers such as {{w|EDSAC}} in 1949 and {{w|EDVAC}} in 1951 would be subsequently developed. These advancements introduce the concept of storing and executing programs electronically, paving the way for the evolution of modern computer systems.<ref name="javatpoint.comu"/>
 
|-
 
|-
| 1940 || || || "In 1940, the first manually operated computer, "ENIAC" was invented, which was the first electronic general-purpose computer. After that stored program computer such as EDSAC in 1949 and EDVAC in 1951 were invented."<ref name="javatpoint.comu"/>
+
| 1943 || || || American neurophysiologist {{w|Warren McCulloch}} and mathematician {{w|Walter Pitts}} publish a paper describing how neurons work and model the idea using an electrical circuit. This marks the first instance of a {{w|neural network}}. Building on this concept, researchers later begin applying the idea and analyzing how human neurons might work.<ref name="dataversity.net"/><ref name="javatpoint.comu"/>
 
|-
 
|-
| 1943 || || || "The first case of neural networks was in 1943, when neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper about neurons, and how they work. They decided to create a model of this using an electrical circuit, and therefore the neural network was born."<ref name="dataversity.net"/> "In 1943, a human neural network was modeled with an electrical circuit. In 1950, the scientists started applying their idea to work and analyzed how human neurons might work."<ref name="javatpoint.comu"/>
+
| 1949 || || || Canadian psychologist {{w|Donald Hebb}} introduces a pioneering concept that marks the initial advancement in machine learning. Known as {{w|Hebbian Learning}} theory, it draws from a neuropsychological framework and aims to establish correlations among nodes within a {{w|recurrent neural network}} (RNN). This theory essentially captures and retains shared patterns within the network, functioning as a memory for future reference. In simpler terms, Hebbian Learning theory enables the network to identify connections and store relevant information for later use.<ref name="erogol.comt"/>
 
|-
 
|-
| 1949 || || || "First step toward prevalent ML was proposed by Hebb, in 1949, based on a neuropsychological learning formulation. It is called Hebbian Learning theory. With a simple explanation, it pursues correlations between nodes of a Recurrent Neural Network (RNN). It memorizes any commonalities on the network and serves like a memory later."<ref name="erogol.comt"/>
+
| 1950 || || Turing's Learning Machine || {{w|Alan Turing}} proposes a 'learning machine' that could learn and become artificially intelligent. Turing's specific proposal foreshadows [[wikipedia:genetic algorithms|genetic algorithms]].<ref>{{cite journal|last1=Turing|first1=Alan|title=COMPUTING MACHINERY AND INTELLIGENCE|journal=MIND|date=October 1950|volume=59|issue=236|pages=433–460|doi=10.1093/mind/LIX.236.433|url=http://mind.oxfordjournals.org/content/LIX/236/433|accessdate=8 June 2016}}</ref> He also creates the "Turing Test" to determine whether a computer has real intelligence: to pass the test, a computer must be able to fool a human into believing it is also human.<ref name="forbes.com">{{cite web |title=A Short History of Machine Learning |url=https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#756b4b2615e7 |website=forbes.com |accessdate=20 February 2020}}</ref><ref name="javatpoint.comu"/>
 
|-
 
|-
| 1950 || || Turing's Learning Machine || [[wikipedia:Alan Turing|Alan Turing]] proposes a 'learning machine' that could learn and become artificially intelligent. Turing's specific proposal foreshadows [[wikipedia:genetic algorithms|genetic algorithms]].<ref>{{cite journal|last1=Turing|first1=Alan|title=COMPUTING MACHINERY AND INTELLIGENCE|journal=MIND|date=October 1950|volume=59|issue=236|pages=433–460|doi=10.1093/mind/LIX.236.433|url=http://mind.oxfordjournals.org/content/LIX/236/433|accessdate=8 June 2016}}</ref> "Alan Turing creates the “Turing Test” to determine if a computer has real intelligence. To pass the test, a computer must be able to fool a human into believing it is also human."<ref name="forbes.com">{{cite web |title=A Short History of Machine Learning |url=https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#756b4b2615e7 |website=forbes.com |accessdate=20 February 2020}}</ref><ref name="javatpoint.comu"/>
+
| 1951 || || First Neural Network Machine || [[wikipedia:Marvin Minsky|Marvin Minsky]] and Dean Edmonds build the [[wikipedia:Stochastic neural analog reinforcement calculator|SNARC]], the first neural network machine able to learn.<ref>{{Harvnb|Crevier|1993|pp=34–35}} and {{Harvnb|Russell|Norvig|2003|p=17}}</ref>
 
|-
 
|-
| 1951 || || First Neural Network Machine || [[wikipedia:Marvin Minsky|Marvin Minsky]] and Dean Edmonds build the first neural network machine, able to learn, the [[wikipedia:Stochastic neural analog reinforcement calculator|SNARC]]. <ref>{{Harvnb|Crevier|1993|pp=34–35}} and {{Harvnb|Russell|Norvig|2003|p=17}}</ref>
+
| 1952 || || Machines Playing Checkers || {{w|Arthur Samuel}} at IBM's Poughkeepsie Laboratory becomes one of the early pioneers of machine learning. He develops some of the first machine learning programs, starting with programs that play checkers. Samuel's program, designed for an IBM computer, analyzes winning strategies by studying gameplay. Over time, the program would improve its performance by incorporating successful moves into its algorithm, thereby enhancing its gameplay abilities. Samuel's use of alpha-beta pruning in his computer program enables it to play checkers at a championship level, marking a significant milestone in the application of machine learning to gaming.<ref name="aaai">{{cite news|last1=McCarthy|first1=John|last2=Feigenbaum|first2=Ed|title=Arthur Samuel: Pioneer in Machine Learning|url=http://www.aaai.org/ojs/index.php/aimagazine/article/view/840/758|accessdate=5 June 2016|work=AI Magazine|issue=3|publisher=Association for the Advancement of Artificial Intelligence|page=10}}</ref><ref name="forbes.com"/><ref name="Koch">{{cite web |last1=Koch |first1=Robert |title=History of Machine Learning - A Journey through the Timeline |url=https://www.clickworker.com/customer-blog/history-of-machine-learning/ |website=clickworker.com |access-date=3 July 2023 |language=en |date=1 September 2022}}</ref>
 
|-
 
|-
| 1952 || || || "1952 saw the first computer program which could learn as it ran. It was a game which played checkers, created by Arthur Samuel."<ref name="dataversity.net"/> "Arthur Samuel wrote the first computer learning program. The program was the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program."
+
| 1957 || Discovery || Perceptron || {{w|Frank Rosenblatt}} invents the {{w|perceptron}} while working at the {{w|Cornell Aeronautical Laboratory}}. This groundbreaking invention garners significant attention and receives extensive media coverage. The perceptron is the first neural network for computers. It aims to simulate the cognitive processes of the human brain, marking a significant milestone in the field of {{w|artificial intelligence}}.<ref>{{cite journal|last1=Rosenblatt|first1=Frank|title=THE PERCEPTRON: A PROBABILISTIC MODEL FOR INFORMATION STORAGE AND ORGANIZATION IN THE BRAIN|journal=Psychological Review|date=1958|volume=65|issue=6|pages=386–408|url=http://www.staff.uni-marburg.de/~einhaeus/GRK_Block/Rosenblatt1958.pdf}}</ref><ref>{{cite news|last1=Mason|first1=Harding|last2=Stewart|first2=D|last3=Gill|first3=Brendan|title=Rival|url=http://www.newyorker.com/magazine/1958/12/06/rival-2|accessdate=5 June 2016|work=The New Yorker|date=6 December 1958}}</ref><ref name="forbes.com"/>
 
|-
 
|-
| 1952 || || Machines Playing Checkers || [[wikipedia:Arthur Samuel|Arthur Samuel]] joins IBM's Poughkeepsie Laboratory and begins working on some of the very first machine learning programs, first creating programs that play checkers.<ref name="aaai">{{cite news|last1=McCarthy|first1=John|last2=Feigenbaum|first2=Ed|title=Arthur Samuel: Pioneer in Machine Learning|url=http://www.aaai.org/ojs/index.php/aimagazine/article/view/840/758|accessdate=5 June 2016|work=AI Magazine|issue=3|publisher=Association for the Advancement of Artificial Intelligence|page=10}}</ref> "Arthur Samuel wrote the first computer learning program. The program was the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program."<ref name="forbes.com"/>
+
| 1959 || || || A significant advancement in neural networks occurs when {{w|Bernard Widrow}} and {{w|Marcian Hoff}} develop two models at {{w|Stanford University}}. The initial model, known as ADALINE, can recognize binary patterns and predict the next bit in a sequence. The subsequent generation, called MADALINE, proves to be highly practical as it effectively eliminates echo on phone lines, providing a valuable real-world application. This technology continues to be used to this day.<ref name="cloud.withgoogle.com"/><ref name="dataversity.net"/>
 
|-
 
|-
| 1957 || Discovery || Perceptron || [[wikipedia:Frank Rosenblatt|Frank Rosenblatt]] invents the [[wikipedia:perceptron|perceptron]] while working at the [[wikipedia:Cornell Aeronautical Laboratory|Cornell Aeronautical Laboratory]].<ref>{{cite journal|last1=Rosenblatt|first1=Frank|title=THE PERCEPTRON: A PROBABILISTIC MODEL FOR INFORMATION STORAGE AND ORGANIZATION IN THE BRAIN|journal=Psychological Review|date=1958|volume=65|issue=6|pages=386–408|url=http://www.staff.uni-marburg.de/~einhaeus/GRK_Block/Rosenblatt1958.pdf}}</ref> The invention of the perceptron generates a great deal of excitement and is widely covered in the media.<ref>{{cite news|last1=Mason|first1=Harding|last2=Stewart|first2=D|last3=Gill|first3=Brendan|title=Rival|url=http://www.newyorker.com/magazine/1958/12/06/rival-2|accessdate=5 June 2016|work=The New Yorker|date=6 December 1958}}</ref> "Frank Rosenblatt designed the first neural network for computers (the perceptron), which simulate the thought processes of the human brain."<ref name="forbes.com"/>
+
| 1959 || || || The term "Machine Learning" is first coined by Arthur Samuel<ref name="javatpoint.comu"/>, who defines it as the “field of study that gives computers the ability to learn without being explicitly programmed”.<ref>{{cite web |last1=Bheemaiah |first1=Kariappa |last2=Esposito |first2=Mark |last3=Tse |first3=Terence |title=What is machine learning? |url=https://theconversation.com/what-is-machine-learning-76759#:~:text=In%201959%2C%20Arthur%20Samuel%2C%20a,learn%20without%20being%20explicitly%20programmed%E2%80%9D. |website=The Conversation |access-date=3 July 2023 |language=en |date=3 May 2017}}</ref>
 
|-
 
|-
| 1959 || || || "A neural network learns to make phone calls clearer"<ref name="cloud.withgoogle.com"/> "Another extremely early instance of a neural network came in 1959, when Bernard Widrow and Marcian Hoff created two models of them at Stanford University. The first was called ADELINE, and it could detect binary patterns. For example, in a stream of bits, it could predict what the next one would be. The next generation was called MADELINE, and it could eliminate echo on phone lines, so had a useful real world application. It is still in use today."<ref name="dataversity.net"/>
+
| 1959 || || || The first practical application of a neural network occurs when an adaptive filter is used to remove echoes on phone lines.<ref name="javatpoint.comu"/>
 
|-
 
|-
| 1959 || || || "In 1959, the term "Machine Learning" was first coined by Arthur Samuel."<ref name="javatpoint.comu"/>
+
| 1962 || || || U.S. professor {{w|Bernard Widrow}} and Ted Hoff introduce the {{w|ADALINE}} algorithm, a single-layer neural network that can be used for classification and regression tasks. The ADALINE algorithm is a significant breakthrough in the field of machine learning, but it is limited to a single layer, because no practical method for training multilayer networks is available at the time.<ref name="dataversity.net"/>
 
|-
 
|-
| 1959 || || || "In 1959, the first neural network was applied to a real-world problem to remove echoes over phone lines using an adaptive filter."<ref name="javatpoint.comu"/>
+
| 1963 || || || United States government agencies like the {{w|Defense Advanced Research Projects Agency}} (DARPA) fund AI research at universities such as MIT, hoping for machines that would translate Russian instantly. The {{w|Cold War}} is in full swing at the time, and the US government is eager to develop technologies that would give them an edge over the {{w|Soviet Union}}. Machine translation is seen as one such technology, and DARPA is willing to invest heavily in its development. {{w|MIT}} is one of the leading universities in the field of AI research at the time, and DARPA funds a number of projects at the university.<ref name="fastcompany.comp">{{cite web |title=Seventy years of highs and lows in the history of machine learning |url=https://www.fastcompany.com/90396217/seventy-years-of-highs-and-lows-in-the-history-of-machine-learning |website=fastcompany.com |accessdate=25 February 2020}}</ref>
 
|-
 
|-
| 1962 || || || "Neural networks use back propagation (explained in detail in the Introduction to Neural Networks), and this important step came in 1986, when three researchers from the Stanford psychology department decided to extend an algorithm created by Widrow and Hoff in 1962. This therefore allowed multiple layers to be used in a neural network, creating what are known as ‘slow learners’, which will learn over a long period of time."<ref name="dataversity.net"/>
+
| 1965 || || || Soviet mathematician {{w|Alexey Ivakhnenko}} publishes a number of articles and books on {{w|group method of data handling}} (GMDH), a method for inductive inference that is used to build complex models from data. Ivakhnenko's work on GMDH is influential in the development of neural networks, as the GMDH algorithm is similar to the backpropagation algorithm, which is a widely used algorithm for training neural networks. Ivakhnenko's work on GMDH is considered to be one of the foundations of {{w|deep learning}}. His work would have a significant impact on the development of machine learning, and it is still used today in a variety of applications.<ref name="medium.comw"/>
 
|-
 
|-
| 1963 || || || "U.S. government agencies like the Defense Advanced Research Projects Agency (DARPA) fund AI research at universities such as MIT, hoping for machines that will translate Russian instantly."<ref name="fastcompany.comp">{{cite web |title=Seventy years of highs and lows in the history of machine learning |url=https://www.fastcompany.com/90396217/seventy-years-of-highs-and-lows-in-the-history-of-machine-learning |website=fastcompany.com |accessdate=25 February 2020}}</ref>
+
| 1967 || || Nearest Neighbor || {{w|Thomas M. Cover}} and {{w|Peter E. Hart}} make a significant contribution to the field of pattern recognition by introducing the nearest neighbor algorithm. This algorithm marks the beginning of basic pattern recognition capabilities for computers. Its initial application is in mapping routes, particularly for traveling salesmen who need to visit multiple cities in a short tour. With the nearest neighbor algorithm, computers can identify similarities between items in large datasets and automatically recognize patterns. This breakthrough paves the way for further advancements in pattern recognition and data analysis.<ref>{{cite web|last1=Marr|first1=Bernard|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref><ref name="forbes.com"/>
 
|-
 
|-
| 1965 || || || " Probably the first who decided to “develop” (deepen) pepperprope was the Soviet mathematician A.G. Ivakhnenko, who had published a number of articles and books since 1965, which, in particular, described the modeling system “Alpha”."<ref name="medium.comw"/>
+
| 1969 || || Limitations of Neural Networks || [[wikipedia:Marvin Minsky|Marvin Minsky]] and [[wikipedia:Seymour Papert|Seymour Papert]] publish their book ''[[wikipedia:Perceptrons (book)|Perceptrons]]'', describing some of the limitations of perceptrons and neural networks. The interpretation that the book shows that neural networks are fundamentally limited is seen as a hindrance for research into neural networks.<ref>{{cite web|last1=Cohen|first1=Harvey|title=The Perceptron|url=http://harveycohen.net/image/perceptron.html|accessdate=5 June 2016}}</ref><ref>{{cite web|last1=Colner|first1=Robert|title=A brief history of machine learning|url=http://www.slideshare.net/bobcolner/a-brief-history-of-machine-learning|website=SlideShare|accessdate=5 June 2016}}</ref>
 
|-
 
|-
| 1967 || || Nearest Neighbor || The nearest neighbor algorithm was created, which is the start of basic pattern recognition. The algorithm was used to map routes. <ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref> "The “nearest neighbor” algorithm was written, allowing computers to begin using very basic pattern recognition. This could be used to map a route for traveling salesmen, starting at a random city but ensuring they visit all cities during a short tour."<ref name="forbes.com"/>
+
| 1970 || || Automatic Differentiation (Backpropagation) || Finnish mathematician and computer scientist {{w|Seppo Linnainmaa}} publishes the general method for automatic differentiation (AD) of discrete connected networks of nested differentiable functions.<ref name="lin1970">[[wikipedia:Seppo Linnainmaa|Seppo Linnainmaa]] (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master's Thesis (in Finnish), Univ. Helsinki, 6-7.</ref><ref name="lin1976">[[wikipedia:Seppo Linnainmaa|Seppo Linnainmaa]] (1976). Taylor expansion of the accumulated rounding error. BIT Numerical Mathematics, 16(2), 146-160.</ref> This corresponds to the modern version of backpropagation, but is not yet named as such.<ref name="grie2012">Griewank, Andreas (2012). Who Invented the Reverse Mode of Differentiation?. Optimization Stories, Documenta Mathematica, Extra Volume ISMP (2012), 389-400.</ref><ref name="grie2008">Griewank, Andreas and Walther, A.. Principles and Techniques of Algorithmic Differentiation, Second Edition. SIAM, 2008.</ref><ref name="schmidhuber2015">[[wikipedia:Jürgen Schmidhuber|Jürgen Schmidhuber]] (2015). Deep learning in neural networks: An overview. Neural Networks 61 (2015): 85-117. [http://arxiv.org/abs/1404.7828 ArXiv] </ref><ref name="scholarpedia2015">[[wikipedia:Jürgen Schmidhuber|Jürgen Schmidhuber]] (2015). Deep Learning. Scholarpedia, 10(11):32832. [http://www.scholarpedia.org/article/Deep_Learning#Backpropagation Section on Backpropagation]</ref><ref name="erogol.comt"/>
 
|-
 
|-
| 1969 || || Limitations of Neural Networks || [[wikipedia:Marvin Minsky|Marvin Minsky]] and [[wikipedia:Seymour Papert|Seymour Papert]] publish their book ''[[wikipedia:Perceptrons (book)|Perceptrons]]'', describing some of the limitations of perceptrons and neural networks. The interpretation that the book shows that neural networks are fundamentally limited is seen as a hindrance for research into neural networks.<ref>{{cite web|last1=Cohen|first1=Harvey|title=The Perceptron|url=http://harveycohen.net/image/perceptron.html|accessdate=5 June 2016}}</ref><ref>{{cite web|last1=Colner|first1=Robert|title=A brief history of machine learning|url=http://www.slideshare.net/bobcolner/a-brief-history-of-machine-learning|website=SlideShare|accessdate=5 June 2016}}</ref>
+
| 1974 || || Algorithm || Greek biomedical engineer {{w|Evangelia Micheli-Tzanakou}} and Harth introduce {{w|ALOPEX}} (ALgorithms Of Pattern EXtraction), a correlation-based machine learning algorithm that extracts patterns from data by identifying correlations between variables or features.
 +
|-
 +
| 1974 || || {{w|Backpropagation}} || American social scientist and machine learning pioneer {{w|Paul Werbos}} lays the foundation for backpropagation in his dissertation, a technique that adjusts the weights of neural networks to improve prediction accuracy.<ref name="Koch"/>
 
|-
 
|-
| 1970 || || Automatic Differentiation (Backpropagation) || [[wikipedia:Seppo Linnainmaa|Seppo Linnainmaa]] publishes the general method for automatic differentiation (AD) of discrete connected networks of nested differentiable functions.<ref name="lin1970">[[wikipedia:Seppo Linnainmaa|Seppo Linnainmaa]] (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master's Thesis (in Finnish), Univ. Helsinki, 6-7.</ref><ref name="lin1976">[[wikipedia:Seppo Linnainmaa|Seppo Linnainmaa]] (1976). Taylor expansion of the accumulated rounding error. BIT Numerical Mathematics, 16(2), 146-160.</ref> This corresponds to the modern version of backpropagation, but is not yet named as such.<ref name="grie2012">Griewank, Andreas (2012). Who Invented the Reverse Mode of Differentiation?. Optimization Stories, Documenta Mathematica, Extra Volume ISMP (2012), 389-400.</ref><ref name="grie2008">Griewank, Andreas and Walther, A.. Principles and Techniques of Algorithmic Differentiation, Second Edition. SIAM, 2008.</ref><ref name="schmidhuber2015">[[wikipedia:Jürgen Schmidhuber|Jürgen Schmidhuber]] (2015). Deep learning in neural networks: An overview. Neural Networks 61 (2015): 85-117. [http://arxiv.org/abs/1404.7828 ArXiv] </ref><ref name="scholarpedia2015">[[wikipedia:Jürgen Schmidhuber|Jürgen Schmidhuber]] (2015). Deep Learning. Scholarpedia, 10(11):32832. [http://www.scholarpedia.org/article/Deep_Learning#Backpropagation Section on Backpropagation]</ref>
+
| 1977 || || Algorithm || The {{w|Expectation–maximization algorithm}} is explained and given its name in a paper by [[w:Arthur P. Dempster|Arthur Dempster]], {{w|Nan Laird}}, and {{w|Donald Rubin}}.<ref name="Dempster1977">
 +
{{cite journal
 +
|last1=Dempster  |first1= A.P.
 +
|last2=Laird |first2=N.M. |last3=Rubin
 +
|first3=D.B.
 +
|title=Maximum Likelihood from Incomplete Data via the EM Algorithm
 +
|journal={{w|Journal of the Royal Statistical Society, Series B}}
 +
|year=1977 |volume=39 |issue=1 |pages=1–38
 +
}}
 +
</ref>
 
|-
 
|-
| 1970 || || || "There had been not to much effort until the intuition of Multi-Layer Perceptron (MLP) was suggested by Werbos[6] in 1981 with NN specific Backpropagation(BP) algorithm, albeit BP idea had been proposed before by Linnainmaa [5] in 1970 in the name "reverse mode of automatic differentiation"."<ref name="erogol.comt"/>
+
| 1979 || || Stanford Cart || Students at {{w|Stanford University}} develop a cart that can navigate and avoid obstacles in a room.<ref>{{cite web |title=Rise of the machines |url=https://mydigitalpublication.com/publication/?i=498301&article_id=3093864&view=articleBrowser |website=mydigitalpublication.com |access-date=5 July 2023}}</ref> The Stanford Cart, a remote-controlled robot, successfully navigates a room filled with obstacles without human intervention, showcasing advancements in autonomous movement.<ref>{{cite web|last1=Marr|first1=Bernard|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref><ref name="forbes.com"/>
 
|-
 
|-
| 1974 || || Algorithm || "{{w|ALOPEX}} (an acronym from "'''''AL'''gorithms '''O'''f '''P'''attern '''EX'''traction''") is a correlation based machine learning algorithm first proposed by [[w:Evangelia Micheli-Tzanakou|Tzanakou]] and Harth in 1974."
+
| 1980 || Discovery || Neocognitron || Japanese computer scientist {{w|Kunihiko Fukushima}} introduces the neocognitron, a hierarchical multilayered convolutional neural network. This groundbreaking work lays the foundation for {{w|convolutional neural network}}s, which would become a fundamental architecture in the field of artificial neural networks. The neocognitron's innovative design would inspire further advancements and applications in image and pattern recognition.<ref>{{cite journal|last1=Fukushima|first1=Kunihiko|title=Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position|journal=Biological Cybernetics|date=1980|volume=36|pages=193–202|url=http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf|accessdate=5 June 2016|doi=10.1007/bf00344251}}</ref><ref>{{cite web|last1=Le Cun|first1=Yann|title=Deep Learning|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.297.6176&rep=rep1&type=pdf|accessdate=5 June 2016}}</ref><ref name="medium.comw"/>
 
|-
 
|-
| 1979 || || Stanford Cart || Students at Stanford University develop a cart that can navigate and avoid obstacles in a room <ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref> " Students at Stanford University invent the “Stanford Cart” which can navigate obstacles in a room on its own."<ref name="forbes.com"/>
+
| 1980 || || || The {{w|Linde–Buzo–Gray algorithm}} is introduced by Yoseph Linde, Andrés Buzo and {{w|Robert M. Gray}}.<ref>{{Cite journal | last1 = Linde | first1 = Y. | last2 = Buzo | first2 = A. | last3 = Gray | first3 = R.| title = An Algorithm for Vector Quantizer Design | doi = 10.1109/TCOM.1980.1094577 | journal = {{w|IEEE Transactions on Communications}} | volume = 28 | pages = 84–95 | year = 1980 | pmid =  | pmc = | url     = http://ieeexplore.ieee.org/xpls/abs_all.jsp?&arnumber=1094577}}</ref>
 
|-
 
|-
| 1980 || Discovery || Neocognitron || [[wikipedia:Kunihiko Fukushima|Kunihiko Fukushima]] first publishes his work on the [[wikipedia:Neocognitron|Neocognitron]], a type of [[wikipedia:artificial neural network|artificial neural network]].<ref>{{cite journal|last1=Fukushima|first1=Kunihiko|title=Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position|journal=Biological Cybernetics|date=1980|volume=36|pages=193–202|url=http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf|accessdate=5 June 2016|doi=10.1007/bf00344251}}</ref> The [[wikipedia:Neocognitron|neocognitron]] later inspires [[wikipedia:convolutional neural networks|convolutional neural networks]].<ref>{{cite web|last1=Le Cun|first1=Yann|title=Deep Learning|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.297.6176&rep=rep1&type=pdf|accessdate=5 June 2016}}</ref> " In 1980, Kunihika Fukushima proposed a hierarchical multilayered convolution neural network known as the neocognitron."<ref name="medium.comw"/>
+
| 1980 || || || The first instance of the {{w|International Conference on Machine Learning}} takes place. The conference serves as a platform for researchers, practitioners, and industry professionals to come together and present their latest research, share ideas, and discuss advancements in machine learning algorithms, methodologies, and applications.  
 
|-
 
|-
| 1981 || || Explanation Based Learning || Gerald Dejong introduces Explanation Based Learning, where a computer algorithm analyses data, creates a general rule it can follow, and discards unimportant data.<ref>{{cite web|last1=Marr|first1=Bernard|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref> "Gerald Dejong introduces the concept of Explanation Based Learning (EBL), in which a computer analyses training data and creates a general rule it can follow by discarding unimportant data."<ref name="forbes.com"/>
+
| 1981 || || Explanation Based Learning || Gerald Dejong introduces Explanation Based Learning (EBL), a concept in machine learning where a computer algorithm analyzes training data to create a general rule by discarding unimportant information. This approach allows the algorithm to focus on relevant patterns and extract valuable knowledge from the data.<ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref><ref name="forbes.com"/>
 
|-
 
|-
| 1981 || || || "There had been not to much effort until the intuition of Multi-Layer Perceptron (MLP) was suggested by Werbos[6] in 1981 with NN specific Backpropagation(BP) algorithm, albeit BP idea had been proposed before by Linnainmaa [5] in 1970 in the name "reverse mode of automatic differentiation"."<ref name="erogol.comt"/>
+
| 1981 || || || American social scientist and machine learning pioneer {{w|Paul Werbos}} proposes applying the backpropagation algorithm to the training of multilayer perceptrons (MLPs), a type of neural network that learns to solve complex problems by adjusting the weights of its connections. Although the underlying idea was proposed by Linnainmaa in 1970 as the reverse mode of automatic differentiation, Werbos's neural-network-specific formulation would later contribute to a resurgence of interest in {{w|neural network}}s, paving the way for the development of more advanced neural network architectures.<ref name="erogol.comt"/>
 
|-
 
|-
 
| 1982 || Discovery || Recurrent Neural Network || [[wikipedia:John Hopfield|John Hopfield]] popularizes [[wikipedia:Hopfield networks|Hopfield networks]], a type of [[wikipedia:recurrent neural network|recurrent neural network]] that can serve as [[wikipedia:content-addressable memory|content-addressable memory]] systems.<ref>{{cite journal|last1=Hopfield|first1=John|title=Neural networks and physical systems with emergent collective computational abilities|journal=Proceedings of the National Academy of Sciences of the United States of America|date=April 1982|volume=79|pages=2554–2558|url=http://www.pnas.org/content/79/8/2554.full.pdf|accessdate=8 June 2016|doi=10.1073/pnas.79.8.2554}}</ref><ref name="dataversity.net"/><ref name="import.ioe"/>
 
 
|-
 
|-
| 1982 || || || " Furthermore, in 1982, Japan announced it was focusing on more advanced neural networks, which incentivised American funding into the area, and thus created more research in the area."<ref name="dataversity.net"/>
+
| 1982 || || || {{w|Japan}} makes a significant announcement regarding its emphasis on the development of more sophisticated neural networks. This declaration serves as a catalyst for increased American funding in this field, subsequently leading to a surge of research endeavors in the same domain.<ref name="dataversity.net"/>
 +
|-
 +
| 1982 || || || Self-learning as a machine learning paradigm is introduced, along with a neural network capable of self-learning named Crossbar Adaptive Array (CAA).<ref>Bozinovski, S. (1982). "A self-learning system using secondary reinforcement". In Trappl, Robert (ed.). Cybernetics and Systems Research: Proceedings of the Sixth European Meeting on Cybernetics and Systems Research. North Holland. pp. 397–402.</ref>
 +
|-
 +
| 1985 || || NetTalk || Terry Sejnowski and Charles Rosenberg develop a neural network called NetTalk. The system learns to pronounce words in a way similar to how a baby learns, teaching itself the correct pronunciation of approximately 20,000 words within one week. This breakthrough showcases the potential of self-learning systems and their ability to acquire language skills.<ref>{{cite web|last1=Marr|first1=Bernard|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref><ref name="javatpoint.comu"/>
 
|-
 
|-
| 1982 || || || Self-learning as a machine learning paradigm is introduced, along with a neural network capable of self-learning named Crossbar Adaptive Array (CAA).<ref>Bozinovski, S. (1982). "A self-learning system using secondary reinforcement". In Trappl, Robert (ed.). Cybernetics and Systems Research: Proceedings of the Sixth European Meeting on Cybernetics and Systems Research. North Holland. pp. 397–402. {{ISBN|978-0-444-86488-8}}.</ref>
+
| 1985–1986 || || || Researchers in the field of neural networks introduce the concept of {{w|Multilayer Perceptron}} (MLP) along with the practical {{w|Backpropagation}} (BP) training algorithm. Although the idea of BP was proposed earlier, the specific implementation for neural networks was suggested by Werbos in 1981. These developments mark a significant acceleration in {{w|neural network}} research and lay the foundation for the neural network architectures used today.<ref name="erogol.comt"/>
 
|-
 
|-
| 1985 || || NetTalk || A program that learns to pronounce words the same way a baby does, is developed by Terry Sejnowski.<ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref> " Terry Sejnowski invents NetTalk, which learns to pronounce words the same way a baby does." "In 1985, Terry Sejnowski and Charles Rosenberg invented a neural network NETtalk, which was able to teach itself how to correctly pronounce 20,000 words in one week."<ref name="javatpoint.comu"/>
+
| 1986 || Discovery || Backpropagation || The process of {{w|backpropagation}} is described by [[wikipedia:David Rumelhart|David Rumelhart]], [[wikipedia:Geoff Hinton|Geoff Hinton]] and [[wikipedia:Ronald J. Williams|Ronald J. Williams]].<ref>{{cite journal|last1=Rumelhart|first1=David|last2=Hinton|first2=Geoffrey|last3=Williams|first3=Ronald|title=Learning representations by back-propagating errors|journal=Nature|date=9 October 1986|volume=323|pages=533–536|url=http://elderlab.yorku.ca/~elder/teaching/cosc6390psyc6225/readings/hinton%201986.pdf|accessdate=5 June 2016|doi=10.1038/323533a0}}</ref><ref name="slideshare.netr">{{cite web |title=A brief history of machine learning |url=https://www.slideshare.net/bobcolner/a-brief-history-of-machine-learning |website=slideshare.net |accessdate=24 February 2020}}</ref>
 
|-
 
|-
| 1985–1986 || || || "There had been not to much effort until the intuition of Multi-Layer Perceptron (MLP) was suggested by Werbos[6] in 1981 with NN specific Backpropagation(BP) algorithm, albeit BP idea had been proposed before by Linnainmaa [5] in 1970 in the name "reverse mode of automatic differentiation". Still BP is the key ingredient of today's NN architectures. With those new ideas, NN researches accelerated again. In 1985 - 1986 NN researchers successively presented the idea of MLP with practical BP training"<ref name="erogol.comt"/>
+
| 1986 || || || Australian computer scientist {{w|Ross Quinlan}} proposes the {{w|ID3 algorithm}} for building decision trees, today a very well-known machine learning algorithm.<ref name="erogol.comt">{{cite web |title=Brief History of Machine Learning |url=http://www.erogol.com/brief-history-machine-learning/ |website=erogol.com |accessdate=24 February 2020}}</ref>
 
|-
 
|-
| 1986 || Discovery || Backpropagation || The process of [[wikipedia:backpropagation|backpropagation]] is described by [[wikipedia:David Rumelhart|David Rumelhart]], [[wikipedia:Geoff Hinton|Geoff Hinton]] and [[wikipedia:Ronald J. Williams|Ronald J. Williams]].<ref>{{cite journal|last1=Rumelhart|first1=David|last2=Hinton|first2=Geoffrey|last3=Williams|first3=Ronald|title=Learning representations by back-propagating errors|journal=Nature|date=9 October 1986|volume=323|pages=533–536|url=http://elderlab.yorku.ca/~elder/teaching/cosc6390psyc6225/readings/hinton%201986.pdf|accessdate=5 June 2016|doi=10.1038/323533a0}}</ref><ref name="slideshare.netr">{{cite web |title=A brief history of machine learning |url=https://www.slideshare.net/bobcolner/a-brief-history-of-machine-learning |website=slideshare.net |accessdate=24 February 2020}}</ref>
+
| 1986 || || Algorithm || The Dehaene–Changeux model is developed by cognitive neuroscientists {{w|Stanislas Dehaene}} and {{w|Jean-Pierre Changeux}}.<ref>Dehaene S, Changeux JP. '''Experimental and theoretical approaches to conscious processing.''' Neuron. 2011 Apr 28;70(2):200-27.</ref> It is used to provide a predictive framework to the study of {{w|inattentional blindness}} and the solving of the {{w|Tower of London test}}.<ref>Changeux JP, Dehaene S. '''Hierarchical neuronal modeling of cognitive functions: from synaptic transmission to the Tower of London.''' Comptes Rendus de l'Académie des Sciences, Série III. 1998 Feb–Mar;321(2–3):241-7.</ref><ref>Dehaene S, Changeux JP, Nadal JP. '''Neural networks that learn temporal sequences by selection.''' Proc Natl Acad Sci U S A. 1987 May;84(9):2727-31.</ref>
 
|-
 
|-
| 1986 || || || "At the another spectrum, a very-well known ML algorithm was proposed by J. R. Quinlan [9] in 1986 that we call Decision Trees, more specifically ID3 algorithm."<ref name="erogol.comt">{{cite web |title=Brief History of Machine Learning |url=http://www.erogol.com/brief-history-machine-learning/ |website=erogol.com |accessdate=24 February 2020}}</ref>
+
| 1986 || || || Peer-reviewed scientific journal ''[[w:Machine Learning (journal)|Machine Learning]]'' is first issued. Published by Springer Nature, it is considered to be one of the leading journals in the field of machine learning. The journal publishes articles on a wide range of topics related to machine learning, including statistical learning theory, natural language processing, computer vision, data mining, reinforcement learning, and robotics.<ref>{{cite web |title=Machine Learning |url=https://www.springer.com/journal/10994 |website=springer.com |accessdate=9 March 2020}}</ref>
 +
|-
 +
| 1986 || || || {{w|European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases}}
 +
|-
 +
| 1987 || || || The {{w|Conference on Neural Information Processing Systems}} (NeurIPS) is first held. It is a prominent conference in the field of artificial intelligence and machine learning, where researchers, academics, and industry professionals gather to present and discuss the latest advancements, research findings, and developments related to neural networks, deep learning, and various aspects of information processing systems. NeurIPS would become a significant platform for showcasing breakthroughs and fostering collaborations within the AI community.
 +
|-
 +
| 1988 || || || The {{w|Knowledge Engineering and Machine Learning Group}} is founded at the Technical University of Catalonia (UPC) in Barcelona, Spain. KEMLG is a research group that focuses on the development of knowledge engineering and machine learning techniques. The group would make significant contributions to the field of artificial intelligence, and its work would be used in a wide variety of applications, including medical diagnosis, fraud detection, and natural language processing. 
 
|-
 
|-
 
| 1989 || Discovery || Reinforcement Learning || Christopher Watkins develops [[wikipedia:Q-learning|Q-learning]], which greatly improves the practicality and feasibility of [[wikipedia:reinforcement learning|reinforcement learning]].<ref>{{cite journal|last1=Watkins|first1=Christopher|title=Learning from Delayed Rewards|date=1 May 1989|url=http://www.cs.rhul.ac.uk/~chrisw/new_thesis.pdf}}</ref>
 
|-
 
|-
 
| 1989 || Commercialization || Commercialization of Machine Learning on Personal Computers || Axcelis, Inc. releases [[wikipedia:Evolver (software)|Evolver]], the first software package to commercialize the use of genetic algorithms on personal computers.<ref>{{cite news|last1=Markoff|first1=John|title=BUSINESS TECHNOLOGY; What's the Best Answer? It's Survival of the Fittest|url=http://www.nytimes.com/1990/08/29/business/business-technology-what-s-the-best-answer-it-s-survival-of-the-fittest.html|accessdate=8 June 2016|work=New York Times|date=29 August 1990}}</ref>
 
 +
|-
 +
| 1989 || || Algorithm || Chris Watkins introduces {{w|Q-learning}}, a [[w:Model-free (reinforcement learning)|model-free]] {{w|reinforcement learning}} algorithm.<ref>{{citation |type=Ph.D. thesis|last=Watkins |first=C.J.C.H. |year=1989 |title=Learning from Delayed Rewards |publisher=Cambridge University|url=http://www.cs.rhul.ac.uk/~chrisw/new_thesis.pdf}}</ref><ref>Watkins, C.J.C.H.; Dayan, P. (1992). "Q-learning". ''Machine Learning''.</ref>
 
|-
 
|-
 
| 1992 || Achievement || Machines Playing Backgammon || Gerald Tesauro develops [[wikipedia:TD-Gammon|TD-Gammon]], a computer [[wikipedia:backgammon|backgammon]] program that utilises an [[wikipedia:artificial neural network|artificial neural network]] trained using [[wikipedia:temporal-difference learning|temporal-difference learning]] (hence the 'TD' in the name). TD-Gammon is able to rival, but not consistently surpass, the abilities of top human backgammon players.<ref>{{cite journal|last1=Tesauro|first1=Gerald|title=Temporal Difference Learning and TD-Gammon|journal=Communications of the ACM|date=March 1995|volume=38|issue=3|url=http://www.bkgm.com/articles/tesauro/tdl.html}}</ref>
 
 
|-
 
|-
| 1995 || || || "One of the most important ML breakthrough was Support Vector Machines (Networks) (SVM), proposed by Vapnik and Cortes[10] in 1995 with very strong theoretical standing and empirical results. That was the time separating the ML community into two crowds as NN or SVM advocates."<ref name="erogol.comt"/>
+
| 1995 || || || A significant breakthrough in machine learning occurs with the introduction of {{w|Support Vector Machine}}s (SVM) by Vapnik and Cortes. SVMs possess a solid theoretical foundation and deliver impressive empirical results. This development would lead to a division within the machine learning community, with some advocating for {{w|neural network}}s (NN) while others support SVM as the preferred approach.<ref name="erogol.comt"/>
 
|-
 
|-
| 1995 || Discovery || Random Forest Algorithm || Tin Kam Ho publishes a paper describing [[wikipedia:Random forest|Random decision forests]].<ref>{{cite journal|last1=Ho|first1=Tin Kam|title=Random Decision Forests|journal=Proceedings of the Third International Conference on Document Analysis and Recognition|date=August 1995|volume=1|pages=278–282|doi=10.1109/ICDAR.1995.598994|url=http://ect.bell-labs.com/who/tkh/publications/papers/odt.pdf|accessdate=5 June 2016|publisher=IEEE|location=Montreal, Quebec|isbn=0-8186-7128-9}}</ref>
+
| 1995 || Discovery || Random Forest Algorithm || Tin Kam Ho publishes a paper describing random decision forests. Random decision forests are a type of ensemble learning algorithm that combines multiple decision trees to improve the accuracy of predictions. Ho's paper, titled ''Random Decision Forests'', introduces the basic idea of random decision forests. Ho shows that by randomly selecting features and thresholds, it is possible to construct a large number of decision trees that are relatively independent of each other. This independence helps to reduce the variance of the predictions, which leads to improved accuracy. Ho's paper is met with a positive reception from the machine learning community. Random decision forests have since become one of the most popular machine learning algorithms, used in a wide variety of applications, including image classification, natural language processing, and medical diagnosis.<ref>{{cite journal|last1=Ho|first1=Tin Kam|title=Random Decision Forests|journal=Proceedings of the Third International Conference on Document Analysis and Recognition|date=August 1995|volume=1|pages=278–282|doi=10.1109/ICDAR.1995.598994|url=http://ect.bell-labs.com/who/tkh/publications/papers/odt.pdf|accessdate=5 June 2016|publisher=IEEE|location=Montreal, Quebec|isbn=0-8186-7128-9}}</ref>
 
|-
 
|-
| 1995 || Discovery || Support Vector Machines || [[wikipedia:Corinna Cortes|Corinna Cortes]] and [[wikipedia:Vladimir Vapnik|Vladimir Vapnik]] publish their work on [[wikipedia:support vector machines|support vector machines]].<ref name="erogol.comt"/><ref>{{cite journal|last1=Cortes|first1=Corinna|last2=Vapnik|first2=Vladimir|title=Support-vector networks|journal=Machine Learning|date=September 1995|volume=20|issue=3|pages=273–297|doi=10.1007/BF00994018|url=http://download.springer.com/static/pdf/467/art%253A10.1007%252FBF00994018.pdf?originUrl=http%3A%2F%2Flink.springer.com%2Farticle%2F10.1007%2FBF00994018&token2=exp=1465109699~acl=%2Fstatic%2Fpdf%2F467%2Fart%25253A10.1007%25252FBF00994018.pdf%3ForiginUrl%3Dhttp%253A%252F%252Flink.springer.com%252Farticle%252F10.1007%252FBF00994018*~hmac=133f5211871b237411d6dcc05047fc16cdc99abc25ab4e74be863808ea53bfd7|accessdate=5 June 2016|publisher=Kluwer Academic Publishers|issn=0885-6125}}</ref>
+
| 1995 || Discovery || Support Vector Machines || {{w|Corinna Cortes}} and {{w|Vladimir Vapnik}} publish their work on {{w|support vector machines}} in the journal ''Machine Learning''. Their paper, titled "Support-Vector Networks", introduces SVMs as a new learning machine for two-group classification problems. SVMs are based on the idea of finding a hyperplane that separates two classes of data points with the maximum possible margin. The margin is the distance between the hyperplane and the closest data points on either side, known as support vectors. A wider margin generally makes the classifier more robust to noise in the data. SVMs would prove to be very effective for a wide variety of classification and regression problems. They are particularly well-suited for problems where the data is not linearly separable, as SVMs can be used to map the data to a higher-dimensional space where it becomes linearly separable. SVMs are also relatively easy to train and are very efficient in terms of computational resources. Today they are among the most popular machine learning algorithms, used in a wide variety of applications, including spam filtering, image classification, and fraud detection.<ref name="erogol.comt"/><ref>{{cite journal|last1=Cortes|first1=Corinna|last2=Vapnik|first2=Vladimir|title=Support-vector networks|journal=Machine Learning|date=September 1995|volume=20|issue=3|pages=273–297|doi=10.1007/BF00994018|url=http://download.springer.com/static/pdf/467/art%253A10.1007%252FBF00994018.pdf?originUrl=http%3A%2F%2Flink.springer.com%2Farticle%2F10.1007%2FBF00994018&token2=exp=1465109699~acl=%2Fstatic%2Fpdf%2F467%2Fart%25253A10.1007%25252FBF00994018.pdf%3ForiginUrl%3Dhttp%253A%252F%252Flink.springer.com%252Farticle%252F10.1007%252FBF00994018*~hmac=133f5211871b237411d6dcc05047fc16cdc99abc25ab4e74be863808ea53bfd7|accessdate=5 June 2016|publisher=Kluwer Academic Publishers|issn=0885-6125}}</ref>
 
|-
 
|-
| 1996 (October 10) || || || {{w|Orange (software)}} is released.
+
| 1996 (October 10) || || || [[w:Orange (software)|Orange]] is released by the {{w|University of Ljubljana}}. It is a visual programming language and integrated development environment (IDE) for data mining and machine learning.
 
|-
 
|-
| 1997 || || IBM Deep Blue Beats Kasparov || IBM’s Deep Blue beats the world champion at chess.<ref>{{cite web|last1=Marr|first1=Bernard|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref>
+
| 1997 || || IBM Deep Blue Beats Kasparov || {{w|Supercomputer}} {{w|Deep Blue}}, developed by {{w|IBM}}, achieves a historic victory by defeating chess grandmaster {{w|Garry Kasparov}} in a match. This landmark event demonstrates the potential of artificial intelligence to surpass human capability in complex tasks such as {{w|chess}}. It marks a pivotal moment in machine learning, highlighting the ability of AI systems to learn and evolve independently, posing new challenges and possibilities for mankind.<ref>{{cite web|last1=Marr|first1=Bernard|title=A Short History of Machine Learning - Every Manager Should Read|url=http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref><ref name="Koch"/>
 
|-
 
|-
 
| 1997 || Discovery || LSTM || [[wikipedia:Sepp Hochreiter|Sepp Hochreiter]] and [[wikipedia:Jürgen Schmidhuber|Jürgen Schmidhuber]] invent long short-term memory (LSTM) recurrent neural networks,<ref>{{cite journal|last1=Hochreiter|first1=Sepp|last2=Schmidhuber|first2=Jürgen|title=LONG SHORT-TERM MEMORY|journal=Neural Computation|date=1997|volume=9|issue=8|pages=1735–1780|url=http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf|doi=10.1162/neco.1997.9.8.1735}}</ref> greatly improving the efficiency and practicality of recurrent neural networks.
 
|-
 
|-
| 1997 || || || "Little before, another solid ML model was proposed by Freund and Schapire in 1997 prescribed with boosted ensemble of weak classifiers called Adaboost. This work also gave the Godel Prize to the authors at the time. Adaboost trains weak set of classifiers that are easy to train, by giving more importance to hard instances. This model still the basis of many different tasks like face recognition and detection."<ref name="erogol.comt"/>
+
| 1997 || || || {{w|Yoav Freund}} and {{w|Robert Schapire}} introduce AdaBoost, which would become an influential machine learning model. AdaBoost is an ensemble method that combines multiple weak classifiers to create a strong classifier. It works by iteratively training weak classifiers that are easy to train, giving more weight to hard instances at each round. The work would earn its authors the prestigious Gödel Prize. This approach has proven effective in various tasks such as face recognition and detection, and it continues to serve as a foundation for many machine learning applications.<ref name="erogol.comt"/>
 
|-
 
|-
| 1998 || || MNIST database || A team led by [[wikipedia:Yann LeCun|Yann LeCun]] releases the [[wikipedia:MNIST database|MNIST database]], a dataset comprising a mix of handwritten digits from [[wikipedia:American Census Bureau|American Census Bureau]] employees and American high school students.<ref>{{cite web|last1=LeCun|first1=Yann|last2=Cortes|first2=Corinna|last3=Burges|first3=Christopher|title=THE MNIST DATABASE of handwritten digits|url=http://yann.lecun.com/exdb/mnist/|accessdate=16 June 2016}}</ref> The MNIST database has since become a benchmark for evaluating handwriting recognition.
+
| 1998 || || MNIST database || A team led by {{w|Yann LeCun}} releases the [[wikipedia:MNIST database|MNIST database]], a dataset comprising a mix of handwritten digits from [[wikipedia:American Census Bureau|American Census Bureau]] employees and American high school students.<ref>{{cite web|last1=LeCun|first1=Yann|last2=Cortes|first2=Corinna|last3=Burges|first3=Christopher|title=THE MNIST DATABASE of handwritten digits|url=http://yann.lecun.com/exdb/mnist/|accessdate=16 June 2016}}</ref> The MNIST database has since become a benchmark for evaluating handwriting recognition.
 
|-
 
|-
| 1998 || || || "Since then, there have been many more advances in the field, such as in 1998, when research at AT&T Bell Laboratories on digit recognition resulted in good accuracy in detecting handwritten postcodes from the US Postal Service. This used back-propagation, which, as stated above, is explained in detail on the Introduction to Neural Networks."<ref name="dataversity.net"/>
+
| 1998 || || || Researchers at {{w|AT&T Bell Laboratories}} develop a neural network that can accurately recognize handwritten ZIP codes. The network is trained on a dataset of 100,000 ZIP codes, and it achieves an accuracy of 99%. The network is trained using backpropagation, a method for adjusting the weights of a neural network so that it can better predict the output for a given input. The development of this network is a major breakthrough in the field of machine learning. It shows that neural networks can be used to solve real-world problems, and it paves the way for the development of more advanced neural networks.<ref name="dataversity.net"/>
 
|-
 
|-
| 1999 || || || "Computer-aided diagnosis catches more cancers. Computers can’t cure cancer (yet), but they can help us diagnose it. The CAD Prototype Intelligent Workstation, developed at the University of Chicago, reviewed 22,000 mammograms and detected cancer 52% more accurately than radiologists did."
+
| 1999 || || || A study published in the ''Journal of the National Cancer Institute'' reports that computer-aided diagnosis (CAD) is more accurate than {{w|radiologist}}s at detecting {{w|breast cancer}} on {{w|mammogram}}s. The study, conducted by researchers at the {{w|University of Chicago}}, finds that CAD detects cancer 52% more accurately than radiologists do.
 
|-
 
|-
| 2001 || || || "Another ensemble model explored by Breiman [12] in 2001 that ensembles multiple decision trees where each of them is curated by a random subset of instances and each node is selected from a random subset of features."<ref name="erogol.comt"/>
+
| 2000 || || Algorithm || In {{w|anomaly detection}}, the {{w|local outlier factor}} (LOF) is an algorithm proposed by Markus M. Breunig, {{w|Hans-Peter Kriegel}}, Raymond T. Ng and Jörg Sander for finding anomalous data points by measuring the local deviation of a given data point with respect to its neighbours.<ref>{{Cite conference| doi = 10.1145/335191.335388| title = LOF: Identifying Density-based Local Outliers| year = 2000| last1 = Breunig | first1 = M. M.| last2 = Kriegel | first2 = H.-P. | last3 = Ng | first3 = R. T.| last4 = Sander | first4 = J.| work = Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data| series = {{w|SIGMOD}}| isbn = 1-58113-217-4| pages = 93–104| url = http://www.dbs.ifi.lmu.de/Publikationen/Papers/LOF.pdf}}</ref>
 
|-
 
|-
| 2002 || || Torch Machine Learning Library || [[wikipedia:Torch (machine learning)|Torch]], a software library for machine learning, is first released.<ref>{{cite journal|last1=Collobert|first1=Ronan|last2=Bengio|first2=Samy|last3=Mariethoz|first3=Johnny|title=Torch: a modular machine learning software library|date=30 October 2002|url=http://www.idiap.ch/ftp/reports/2002/rr02-46.pdf|accessdate=5 June 2016}}</ref>
+
| 2000 || || || {{w|LogitBoost}}, a [[w:Boosting (meta-algorithm)|boosting]] algorithm in {{w|machine learning}} and {{w|computational learning theory}}, is formulated by {{w|Jerome H. Friedman}}, {{w|Trevor Hastie}}, and {{w|Robert Tibshirani}}.<ref>{{cite journal |first1=Jerome |last1=Friedman |first2=Trevor |last2=Hastie |first3=Robert |last3=Tibshirani |title=Additive logistic regression: a statistical view of boosting |journal=Annals of Statistics |volume=28 |issue=2 |year=2000 |pages=337–407 |doi=10.1214/aos/1016218223}}</ref>
 
|-
 
|-
| 2002 (October) || || Software release || {{w|Torch (machine learning)}} is first released.
+
| 2000 || || || The ''{{w|Journal of Machine Learning Research}}'' is first published by the JMLR Foundation. It is considered to be one of the leading journals in the field of machine learning. JMLR publishes articles on a wide range of topics, including statistical learning theory, natural language processing, computer vision, data mining, reinforcement learning, and robotics.
 
|-
 
|-
| 2004 || || || "The second is the decrease in the cost of parallel computing and memory. This trend was discovered in 2004 when Google unveiled its MapReduce technology"<ref name="medium.comw">{{cite web |title=History of deep machine learning |url=https://medium.com/mindsync-ai/history-of-deep-machine-learning-1842dc3a4507 |website=medium.com |accessdate=21 February 2020}}</ref>
+
| 2001 || || || Leo Breiman introduces random forests, an ensemble model that combines multiple {{w|decision tree}}s. Each decision tree is built from a random subset of the training instances, and the split at each node is chosen from a random subset of features.<ref name="erogol.comt"/>
 
|-
 
|-
| 2004 || || || "{{w|Hierarchical temporal memory}} (HTM) is a biologically constrained theory (or model) of intelligence, originally described in the 2004 book ''{{w|On Intelligence}}'' by {{w|Jeff Hawkins}} with {{w|Sandra Blakeslee}}."
+
| 2001 || || || The {{w|iDistance}} indexing and query processing technique is first proposed by Cui Yu, Beng Chin Ooi, Kian-Lee Tan and {{w|H. V. Jagadish}}.<ref>Cui Yu, Beng Chin Ooi, Kian-Lee Tan and H. V. Jagadish [http://www.comp.nus.edu.sg/~ooibc/papers/Rcuiyu.pdf Indexing the distance: an efficient method to KNN processing], Proceedings of the 27th International Conference on Very Large Data Bases, Rome, Italy, 421-430, 2001.</ref> It is a method for indexing and querying data in high-dimensional metric spaces. A metric space is a space where the distance between two points can be measured. High-dimensional metric spaces are often used to represent data that has a large number of features, such as images or text documents. The iDistance indexing technique would prove effective for a variety of applications, including image retrieval, text mining, and data mining.
 
|-
 
|-
| 2005 || || || " The 3rd rise of NN has begun roughly in  2005 with the conjunction of many different discoveries from past and present by  recent mavens Hinton, LeCun, Bengio, Andrew Ng and other valuable older researchers. "<ref name="erogol.comt"/>
+
| 2002 (October) || || Torch Machine Learning Library || [[wikipedia:Torch (machine learning)|Torch]] is first released. It is a scientific computing library that is used for machine learning research. Torch would become a popular choice for deep learning research and development, and would be used to develop a wide variety of deep learning models.<ref>{{cite journal|last1=Collobert|first1=Ronan|last2=Bengio|first2=Samy|last3=Mariethoz|first3=Johnny|title=Torch: a modular machine learning software library|date=30 October 2002|url=http://www.idiap.ch/ftp/reports/2002/rr02-46.pdf|accessdate=5 June 2016}}</ref>
 
|-
 
|-
| 2006 || || The Netflix Prize || The [[wikipedia:Netflix Prize|Netflix Prize]] competition is launched by [[wikipedia:Netflix|Netflix]]. The aim of the competition was to use machine learning to beat Netflix's own recommendation software's accuracy in predicting a user's rating for a film given their ratings for previous films by at least 10%.<ref>{{cite web|title=The Netflix Prize Rules|url=http://www.netflixprize.com/rules|website=Netflix Prize|publisher=Netflix|accessdate=16 June 2016}}</ref> The prize was won in 2009. "In 2006, Netflix offered $1M to anyone who could beat its algorithm at predicting consumer film ratings. The BellKor team of AT&T scientists took the prize three years later, beating the second-place team by mere minutes"<ref name="cloud.withgoogle.com"/>
+
| 2002 || || Software release || Computer vision and machine learning library {{w|Dlib}} is first released by Davis King. It is a popular choice for developing facial recognition, object detection, and image processing applications.
 
|-
 
|-
| 2006 || || || "Geoffrey Hinton coins the term “deep learning” to explain new algorithms that let computers “see” and distinguish objects and text in images and videos."<ref name="forbes.com"/> " In the year 2006, computer scientist Geoffrey Hinton has given a new name to neural net research as "deep learning," and nowadays, it has become one of the most trending technologies."<ref name="javatpoint.comu"/>
+
| 2003 || || Algorithm || The concept of {{w|manifold alignment}} is first introduced by Ham, Lee, and Saul as a class of machine learning algorithms that produce projections between sets of data, given that the original data sets lie on a common [[w:Manifold learning|manifold]].<ref>{{cite journal|last=Ham|first=Ji Hun|author2=Daniel D. Lee |author3=Lawrence K. Saul |year=2003|title=Learning high dimensional correspondences from low dimensional manifolds|journal=Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003)|url=ftp://ftp.cis.upenn.edu/pub/datamining/public_html/ReadingGroup/papers/corr_icmlws03.pdf}}</ref>
|-
 
| 2006 || || || "In 2006, the Face Recognition Grand Challenge – a National Institute of Standards and Technology program – evaluated the popular face recognition algorithms of the time. 3D face scans, iris images, and high-resolution face images were tested. Their findings suggested the new algorithms were ten times more accurate than the facial recognition algorithms from 2002 and 100 times more accurate than those from 1995. Some of the algorithms were able to outperform human participants in recognizing faces and could uniquely identify identical twins."<ref name="dataversity.net"/>
 
 
|-
 
|-
| 2006 || || || ". This trend was discovered in 2004 when Google unveiled its MapReduce technology, followed by its open analogue Hadoop (2006), and together they gave the opportunity to distribute the processing of huge amounts of data between simple processors"<ref name="medium.comw"/>
+
| 2004 || || || {{w|Google}} unveils its MapReduce technology, which is a distributed programming model for processing and generating large data sets. MapReduce is based on the idea of breaking down a large data set into smaller chunks that can be processed in parallel on a cluster of computers. The development of MapReduce is a major breakthrough in the field of distributed computing. It makes it possible to process large data sets more efficiently and cost-effectively. This would lead to a decrease in the cost of parallel computing and memory, in turn making it possible to develop more powerful machine learning models.<ref name="medium.comw">{{cite web |title=History of deep machine learning |url=https://medium.com/mindsync-ai/history-of-deep-machine-learning-1842dc3a4507 |website=medium.com |accessdate=21 February 2020}}</ref>
 
|-
 
|-
| c.2006 || || || "The term deep learning was coined around 2006, and refers to deep neural networks with many layers."<ref name="subscription.packtpub.com">{{cite web |title=A brief history of the development of machine learning algorithms |url=https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781783553112/1/ch01lvl1sec9/a-brief-history-of-the-development-of-machine-learning-algorithms |website=subscription.packtpub.com |accessdate=25 February 2020}}</ref>
+
| 2004 || || || {{w|Jeff Hawkins}} and {{w|Sandra Blakeslee}} introduce the concept of {{w|Hierarchical Temporal Memory}} (HTM), which can be regarded as a theory or model of intelligence that adheres to biological limitations. This concept is extensively explained in their book ''{{w|On Intelligence}}''.
 
|-
 
|-
| 2006 || || Software release || {{w|RapidMiner}} is first released.
+
| 2005 || || || The third rise of neural networks (NN) begins with the conjunction of many different discoveries from past and present by recent mavens Geoffrey Hinton, Yoshua Bengio, Yann LeCun, Andrew Ng, and other valuable older researchers. Several factors come together to enable a new wave of progress in NN: the growing availability of large datasets, more powerful hardware such as GPUs, and improved methods for training deep networks. In the following years, NN begin to achieve state-of-the-art results in a wide variety of tasks, including image classification, {{w|natural language processing}}, and {{w|speech recognition}}.<ref name="erogol.comt"/>
 
|-
 
|-
| 2007 || || || "Around the year 2007, Long Short-Term Memory started outperforming more traditional speech recognition programs."<ref name="dataversity.net"/>
+
| 2006 || Concept development || {{w|Deep learning}} || British-Canadian cognitive psychologist and computer scientist {{w|Geoffrey Hinton}} introduces the term "{{w|deep learning}}" to describe a set of new algorithms that enable computers to analyze and recognize objects and text within images and videos. This development marks a significant advancement in the field of {{w|neural network}}s, and deep learning has since become a prominent and widely adopted technology in various industries.<ref name="forbes.com"/><ref name="javatpoint.comu"/><ref name="subscription.packtpub.com">{{cite web |title=A brief history of the development of machine learning algorithms |url=https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781783553112/1/ch01lvl1sec9/a-brief-history-of-the-development-of-machine-learning-algorithms |website=subscription.packtpub.com |accessdate=25 February 2020}}</ref>
 +
|-
 +
| 2006 || Competition || Face Recognition Grand Challenge || The Face Recognition Grand Challenge (FRGC) is held by the National Institute of Standards and Technology (NIST) to evaluate the state-of-the-art in face recognition technology. It would become a landmark event in the field of face recognition, helping to accelerate the development of new and more accurate face recognition algorithms. The FRGC uses a variety of data sets, including 3D face scans, iris images, and high-resolution face images. The results of the FRGC would show that the new algorithms are significantly more accurate than the facial recognition algorithms from 2002 and 1995. The FRGC would help to establish face recognition as a viable technology for a variety of applications. The results of the FRGC would also be used to improve the accuracy of face recognition algorithms in commercial products.<ref name="dataversity.net"/>
 
|-
 
|-
| 2007 || || || {{w|scikit-learn}} is released in June.
+
| 2006 || || Big data processing || This is a significant year in the development of big data processing, as it sees the release of Hadoop, an open-source software framework that allows for the distributed processing of large data sets across clusters of computers. Created by Doug Cutting and Mike Cafarella and maintained as an Apache Software Foundation project, Hadoop is based on the MapReduce programming model originally developed by Google, which breaks down a large data processing task into a series of smaller tasks that can be run in parallel on a cluster of computers. This makes it possible to process data sets that would be too large to process on a single computer. Hadoop would become a widely used framework for big data processing, adopted by organizations such as Facebook and Yahoo.<ref name="medium.comw"/>
 
|-
 
|-
| 2007 || || Software release || {{w|Theano (software)}} is initially released.
+
| 2006 || Software release || {{w|RapidMiner}} || {{w|RapidMiner}} is first released by Ingo Mierswa and Ralf Klinkenberg. It is a data mining and machine learning software platform designed to be easy to use while offering a wide range of features. RapidMiner would be used by a wide range of companies and organizations, including Google, Amazon, and IBM.
 
|-
 
|-
| 2009 (April 7) || || Software release || {{w|Apache Mahout}} is first released.
+
| 2007 || Scientific development || {{w|Long Short-Term Memory}} || A significant breakthrough occurs in the field of speech recognition around this time, when {{w|Long Short-Term Memory}} (LSTM), a recurrent neural network architecture introduced in 1997, begins to outperform more traditional speech recognition programs.<ref name="dataversity.net"/>
 
|-
 
|-
| 2010 || || Kaggle Competition || [[wikipedia:Kaggle|Kaggle]], a website that serves as a platform for machine learning competitions, is launched.<ref>{{cite web|title=About|url=https://www.kaggle.com/about|website=Kaggle|publisher=Kaggle Inc|accessdate=16 June 2016}}</ref>
+
| 2007 (June) || || {{w|Scikit-learn}} || {{w|scikit-learn}} is released by David Cournapeau, Gael Varoquaux, and others. It is a free and open-source machine learning library for Python. Scikit-learn would become a popular choice for machine learning practitioners because it is easy to use, well-documented, and has a wide range of features. It includes implementations of a variety of machine learning algorithms, including support vector machines, decision trees, random forests, and k-nearest neighbors.<ref>{{cite web |title=What is scikit-learn ? |url=https://njtrainingacademy.com/2017/02/10/what-is-scikit-learn/ |website=njtrainingacademy.com |accessdate=5 March 2020}}</ref>
 
|-
 
|-
| 2010 || || || "The Microsoft Kinect can track 20 human features at a rate of 30 times per second, allowing people to interact with the computer via movements and gestures."<ref name="forbes.com"/>
+
| 2007 || Software release || [[w:Theano (software)|Theano]] || [[w:Theano (software)|Theano]] is initially released. It is an open source Python library for defining, optimizing, and evaluating mathematical expressions involving multi-dimensional arrays, which makes it a common foundation for building machine learning models.<ref name="Sharing is Caring with Algorithms">{{cite web |title=Sharing is Caring with Algorithms |url=https://towardsdatascience.com/sharing-is-caring-with-algorithms-57549ca7cb75 |website=towardsdatascience.com |accessdate=8 March 2020}}</ref>
 
|-
 
|-
| 2010 (April) || || || {{w|Kaggle}} is founded.
+
| 2008 (January 11) || Software release || [[w:pandas (software)|Pandas]] || American software developer {{w|Wes McKinney}}, then working as a researcher at AQR Capital Management, releases the first version of [[w:pandas (software)|pandas]], a software library written for the Python programming language for data manipulation and analysis. The name pandas is a play on the phrase "panel data", a type of data commonly used in statistical analysis. pandas would become one of the most popular data analysis libraries in the Python ecosystem, used by a wide range of companies and organizations, including Google, Facebook, and Amazon, as well as by many academic researchers.<ref>{{cite web |title=Python’s pandas library is on its way to v.1.0.0 – first release candidate has arrived |url=https://jaxenter.com/python-pandas-1-0-0-release-candidate-166741.html |website=jaxenter.com |accessdate=9 March 2020}}</ref>
 
|-
 
|-
| 2010 (May 20) || || Software release || {{w|Accord.NET}} is initially released.
+
| 2008 || Scientific development || {{w|Isolation Forest}} || The {{w|Isolation Forest}} (iForest) algorithm is initially proposed by Fei Tony Liu, Kai Ming Ting and Zhi-Hua Zhou.<ref>{{Cite journal|last=Liu|first=Fei Tony|last2=Ting|first2=Kai Ming|last3=Zhou|first3=Zhi-Hua|date=December 2008|title=Isolation Forest|url=https://ieeexplore.ieee.org/document/4781136|journal=2008 Eighth IEEE International Conference on Data Mining|volume=|pages=413–422|via=|doi=10.1109/ICDM.2008.17|isbn=978-0-7695-3502-9}}</ref>
 
|-
 
|-
| 2010 || || || "{{w|Constructing skill trees}} (CST) is a hierarchical {{w|reinforcement learning}} algorithm which can build skill trees from a set of sample solution trajectories obtained from demonstration. CST was introduced by {{w|George Konidaris}}, {{w|Scott Kuindersma}}, {{w|Andrew Barto}} and {{w|Roderic Grupen}}."
+
| 2008 || || {{w|Encog}} || {{w|Encog}} is created as a pure-[[w:Java (programming language)|Java]]/{{w|C#}} machine learning framework to support genetic programming, NEAT/HyperNEAT, and other neural network technologies.<ref>{{cite web |title=Encog Machine Learning Framework |url=https://www.heatonresearch.com/encog/ |website=heatonresearch.com |accessdate=8 March 2020}}</ref> 
 
|-
 
|-
| 2011 || Achievement || Beating Humans in Jeopardy || Using a combination of machine learning, [[wikipedia:natural language processing|natural language processing]] and information retrieval techniques, [[wikipedia:IBM|IBM]]'s [[wikipedia:Watson (computer)|Watson]] beats two human champions in a [[wikipedia:Jeopardy!|Jeopardy!]] competition.<ref>{{cite news|last1=Markoff|first1=John|title=Computer Wins on ‘Jeopardy!’: Trivial, It’s Not|url=http://www.nytimes.com/2011/02/17/science/17jeopardy-watson.html?pagewanted=all&_r=0|accessdate=5 June 2016|work=New York Times|date=17 February 2011|page=A1}}</ref>
+
| 2009 (April 7) || Software release || {{w|Apache Mahout}} || {{w|Apache Mahout}} is first released.<ref>{{cite web |title=Apache Mahout |url=http://people.apache.org/~robinanil/mahout/ |website=people.apache.org |accessdate=9 March 2020}}</ref>
 
|-
 
|-
| 2012 || Achievement || Recognizing Cats on YouTube || The [[wikipedia:Google Brain|Google Brain]] team, led by [[wikipedia:Andrew Ng|Andrew Ng]] and [[wikipedia:Jeff Dean|Jeff Dean]], create a neural network that learns to recognize cats by watching unlabeled images taken from frames of [[wikipedia:YouTube|YouTube]] videos.<ref>{{cite journal|last1=Le|first1=Quoc|last2=Ranzato|first2=Marc’Aurelio|last3=Monga|first3=Rajat|last4=Devin|first4=Matthieu|last5=Chen|first5=Kai|last6=Corrado|first6=Greg|last7=Dean|first7=Jeff|last8=Ng|first8=Andrew|title=Building High-level Features Using Large Scale Unsupervised Learning|journal=CoRR|date=12 July 2012|arxiv=1112.6209}}</ref><ref>{{cite news|last1=Markoff|first1=John|title=How Many Computers to Identify a Cat? 16,000|url=http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html|accessdate=5 June 2016|work=New York Times|date=26 June 2012|page=B1}}</ref> " In 2012, Google created a deep neural network which learned to recognize the image of humans and cats in YouTube videos."<ref name="javatpoint.comu"/>
+
| 2010 (April) || || || [[wikipedia:Kaggle|Kaggle]], a website that serves as a platform for machine learning competitions, is launched.<ref>{{cite web|title=About|url=https://www.kaggle.com/about|website=Kaggle|publisher=Kaggle Inc|accessdate=16 June 2016}}</ref><ref>{{cite book |last1=Simon |first1=Phil |title=Too Big to Ignore: The Business Case for Big Data |url=https://books.google.com.ar/books?id=1ekYIAoEBrEC&pg=PT84&lpg=PT84&dq=2010+(April)+Kaggle+is+founded.&source=bl&ots=X1Hf-qwb-t&sig=ACfU3U3Wu3RKbmOiyAUiKJTjLxeB3wEOtQ&hl=en&sa=X&ved=2ahUKEwiP6rvr0Y3oAhXbJrkGHXsgCrgQ6AEwCnoECAsQAQ#v=onepage&q=2010%20(April)%20Kaggle%20is%20founded.&f=false}}</ref>
 
|-
 
|-
| 2012 || || || "Google’s X Lab develops a machine learning algorithm that is able to autonomously browse YouTube videos to identify the videos that contain cats."<ref name="forbes.com"/>
+
| 2010 || || || Microsoft releases the Kinect, a motion-sensing input device that can track 20 human features at a rate of 30 times per second, allowing people to interact with the computer via movements and gestures. Originally developed for the Xbox 360 gaming console, where it is used for games that require physical movement such as Dance Central and Kinect Sports, the Kinect would later be applied in other areas, including healthcare and education. In rehabilitation therapy, it can track the movements of patients and provide feedback on their progress. In education, it can be used to help students learn languages or math by tracking their movements and providing feedback on their answers.<ref name="forbes.com"/>
 
|-
 
|-
| 2012 || || || "AlexNet (2012) - AlexNet won the ImageNet competition by a large margin in 2012, which led to the use of GPUs and Convolutional Neural Networks in machine learning. They also created ReLU, which is an activation function that greatly improves efficiency of CNNs."<ref name="dataversity.net"/>
+
| 2010 (May 20) || Software release || {{w|Accord.NET}} || {{w|Accord.NET}} is initially released.<ref>{{cite web |title=Accord.NET Framework – An extension to AForge.NET |url=http://crsouza.com/2010/05/20/accord-net-framework-an-extension-to-aforge-net/ |website=crsouza.com/ |accessdate=9 March 2020}}</ref>
 
|-
 
|-
| 2012 (March 12) || || || {{w|mlpy}} is released.
+
| 2010 || || || George Konidaris, Scott Kuindersma, Andrew Barto, and Roderic Grupen introduce a hierarchical reinforcement learning algorithm called {{w|Constructing skill trees}} (CST), which can build skill trees from a set of sample solution trajectories obtained from demonstration. CST works by first segmenting the demonstration trajectories into a set of primitive skills. These skills are then combined to form a skill tree, with each node in the tree representing a different skill. The skill tree is then used to guide the agent's exploration of the environment and to help the agent learn new skills. CST would be shown to be effective in a variety of domains, including robotics, video games, and board games.
 
|-
 
|-
| 2014 || || Leap in Face Recognition || [[wikipedia:Facebook|Facebook]] researchers publish their work on [[wikipedia:DeepFace|DeepFace]], a system that uses neural networks to identify faces with 97.35% accuracy. The results are an improvement of more than 27% over previous systems and rival human performance.<ref>{{cite journal|last1=Taigman|first1=Yaniv|last2=Yang|first2=Ming|last3=Ranzato|first3=Marc’Aurelio|last4=Wolf|first4=Lior|title=DeepFace: Closing the Gap to Human-Level Performance in Face Verification|journal=Conference on Computer Vision and Pattern Recognition|date=24 June 2014|url=https://research.facebook.com/publications/deepface-closing-the-gap-to-human-level-performance-in-face-verification/|accessdate=8 June 2016}}</ref> "Facebook develops DeepFace, a software algorithm that is able to recognize or verify individuals on photos to the same level as humans can."<ref name="forbes.com"/> "DeepFace was a deep neural network created by Facebook, and they claimed that it could recognize a person with the same precision as a human can do."<ref name="javatpoint.comu"/>
+
| 2011 || Achievement || Beating Humans in Jeopardy || Using a combination of machine learning, {{w|natural language processing}} and information retrieval techniques, {{w|IBM}}'s [[wikipedia:Watson (computer)|Watson]] beats two human champions in a [[wikipedia:Jeopardy!|Jeopardy!]] competition.<ref>{{cite news|last1=Markoff|first1=John|title=Computer Wins on ‘Jeopardy!’: Trivial, It’s Not|url=http://www.nytimes.com/2011/02/17/science/17jeopardy-watson.html?pagewanted=all&_r=0|accessdate=5 June 2016|work=New York Times|date=17 February 2011|page=A1}}</ref>
 
|-
 
|-
| 2014 (May 26) || || Software release || {{w|Apache Spark}} is first released.
+
| 2012 || Achievement || Recognizing Cats on YouTube || The {{w|Google Brain}} team, led by {{w|Andrew Ng}} and {{w|Jeff Dean}}, create a neural network that learns to recognize cats by watching unlabeled images taken from frames of {{w|YouTube}} videos.<ref>{{cite journal|last1=Le|first1=Quoc|last2=Ranzato|first2=Marc’Aurelio|last3=Monga|first3=Rajat|last4=Devin|first4=Matthieu|last5=Chen|first5=Kai|last6=Corrado|first6=Greg|last7=Dean|first7=Jeff|last8=Ng|first8=Andrew|title=Building High-level Features Using Large Scale Unsupervised Learning|journal=CoRR|date=12 July 2012|arxiv=1112.6209}}</ref><ref>{{cite news|last1=Markoff|first1=John|title=How Many Computers to Identify a Cat? 16,000|url=http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html|accessdate=5 June 2016|work=New York Times|date=26 June 2012|page=B1}}</ref> " In 2012, Google created a deep neural network which learned to recognize the image of humans and cats in YouTube videos."<ref name="javatpoint.comu"/>
 
|-
 
|-
| 2014 || || Sibyl || Researchers from [[wikipedia:Google|Google]] detail their work on Sibyl,<ref>{{cite web|last1=Canini|first1=Kevin|last2=Chandra|first2=Tushar|last3=Ie|first3=Eugene|last4=McFadden|first4=Jim|last5=Goldman|first5=Ken|last6=Gunter|first6=Mike|last7=Harmsen|first7=Jeremiah|last8=LeFevre|first8=Kristen|last9=Lepikhin|first9=Dmitry|last10=Llinares|first10=Tomas Lloret|last11=Mukherjee|first11=Indraneel|last12=Pereira|first12=Fernando|last13=Redstone|first13=Josh|last14=Shaked|first14=Tal|last15=Singer|first15=Yoram|title=Sibyl: A system for large scale supervised machine learning|url=https://users.soe.ucsc.edu/~niejiazhong/slides/chandra.pdf|website=Jack Baskin School Of Engineering|publisher=UC Santa Cruz|accessdate=8 June 2016}}</ref> a proprietary platform for massively parallel machine learning used internally by Google to make predictions about user behavior and provide recommendations.<ref>{{cite news|last1=Woodie|first1=Alex|title=Inside Sibyl, Google’s Massively Parallel Machine Learning Platform|url=http://www.datanami.com/2014/07/17/inside-sibyl-googles-massively-parallel-machine-learning-platform/|accessdate=8 June 2016|work=Datanami|publisher=Tabor Communications|date=17 July 2014}}</ref>
+
| 2012 || || || Google's X Lab develops a machine learning algorithm that is able to autonomously browse YouTube videos and identify those that contain cats. The underlying neural network is trained on unlabeled frames taken from YouTube videos rather than on manually labeled examples, and learns on its own to respond to images of cats. The result is a widely cited example of machine learning discovering high-level features without supervision.<ref name="forbes.com"/>
 
|-
 
|-
| 2014 || || || "In 2014, the Chabot "Eugen Goostman" cleared the Turing Test. It was the first Chabot who convinced the 33% of human judges that it was not a machine."<ref name="javatpoint.comu"/>
+
| 2012 || || || Alex Krizhevsky and his colleagues develop the AlexNet convolutional neural network, which wins the ImageNet Large Scale Visual Recognition Challenge by a large margin and becomes one of the first widely recognized successes of deep learning in image recognition. AlexNet is trained on GPUs and uses the ReLU activation function, which greatly improves the efficiency of training. Its success would lead to a surge in the use of GPUs and convolutional neural networks (CNNs) in machine learning, making it practical to train much larger and deeper CNNs and leading to significant improvements in their performance on a variety of tasks.<ref name="dataversity.net"/>
 
|-
 
|-
| 2014 || || || "DeepMind (2014) - This company was bought by Google, and can play basic video games to the same levels as humans. In 2016, it managed to beat a professional at the game Go, which is considered to be one the world’s most difficult board games."<ref name="dataversity.net"/>
+
| 2012 || || || {{w|Special Interest Group on Knowledge Discovery and Data Mining}} 
 
|-
 
|-
| 2014 || || || "Generative Adversarial Networks (GAN)"<ref name="import.ioe"/> "GAN is a class of {{w|machine learning}} systems invented by {{w|Ian Goodfellow}} and his colleagues in 2014."<ref name="GANnips">{{cite conference|title=Generative Adversarial Networks|first1=Ian |last1=Goodfellow |first2=Jean |last2=Pouget-Abadie |first3=Mehdi |last3=Mirza |first4=Bing |last4=Xu |first5=David |last5=Warde-Farley |first6=Sherjil |last6=Ozair |first7=Aaron |last7=Courville |first8=Yoshua |last8=Bengio |conference= Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014) |pages= 2672–2680 |url= https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf |year=2014}}</ref>
+
| 2012 (March 12) || || || {{w|mlpy}}, a free and open-source Python module for machine learning, is released. It provides a wide range of machine learning algorithms, including support vector machines, decision trees, random forests, and k-nearest neighbors, along with a number of utility functions for data manipulation and visualization.<ref>{{cite web |title=mlpy |url=http://mlpy.sourceforge.net/docs/3.5/ |website=mlpy.sourceforge.net |accessdate=8 March 2020}}</ref>
 
|-
 
|-
| 2014 || || || "the Apache Spark software framework for distributed processing of unstructured and weakly structured data appeared; it was convenient for the implementation of machine learning algorithms."<ref name="medium.comw"/>
+
| 2013 || || || {{w|International Conference on Learning Representations}}
 
|-
 
|-
| 2015 (February) || || || {{w|spaCy}} is released.
+
| 2014 || || Leap in Face Recognition || [[wikipedia:Facebook|Facebook]] researchers publish their work on [[wikipedia:DeepFace|DeepFace]], a system that uses neural networks to identify faces with 97.35% accuracy. The results are an improvement of more than 27% over previous systems and rival human performance.<ref>{{cite journal|last1=Taigman|first1=Yaniv|last2=Yang|first2=Ming|last3=Ranzato|first3=Marc’Aurelio|last4=Wolf|first4=Lior|title=DeepFace: Closing the Gap to Human-Level Performance in Face Verification|journal=Conference on Computer Vision and Pattern Recognition|date=24 June 2014|url=https://research.facebook.com/publications/deepface-closing-the-gap-to-human-level-performance-in-face-verification/|accessdate=8 June 2016}}</ref> "Facebook develops DeepFace, a software algorithm that is able to recognize or verify individuals on photos to the same level as humans can."<ref name="forbes.com"/> "DeepFace was a deep neural network created by Facebook, and they claimed that it could recognize a person with the same precision as a human can do."<ref name="javatpoint.comu"/>
|-
 
| 2015 (March 27) || || Software release || {{w|Keras}} is first released.
 
|-
 
| 2015 (October 8) || || Software release || {{w|Apache SINGA}} is first released.
 
 
|-
 
|-
| 2015 || Achievement || Beating Humans in Go ||Google's [[wikipedia:AlphaGo|AlphaGo]] program becomes the first [[wikipedia:Computer Go|Computer Go]] program to beat an unhandicapped professional human player<ref>{{cite web|title=Google achieves AI 'breakthrough' by beating Go champion|url=http://www.bbc.com/news/technology-35420579|website=BBC News|publisher=BBC|accessdate=5 June 2016|date=27 January 2016}}</ref> using a combination of machine learning and tree search techniques.<ref>{{cite web|title=AlphaGo|url=https://www.deepmind.com/alpha-go.html|website=Google DeepMind|publisher=Google Inc|accessdate=5 June 2016}}</ref>
+
| 2014 (May 26) || Software release || || {{w|Apache Spark}}, originally developed by Matei Zaharia and others at the AMPLab at UC Berkeley, is first released. Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming. It would become a popular choice for big data processing, and would be used by a wide variety of companies, including Uber, Airbnb, and Netflix.<ref name="medium.comw"/><ref>{{cite web |title=Popular Big Data Engine Apache Spark 2.0 Released |url=https://adtmag.com/articles/2016/07/27/spark-2-0.aspx |website=adtmag.com |accessdate=8 March 2020}}</ref>
 
|-
 
|-
| 2015 || Software || TensorFlow Library || Google releases [[wikipedia:TensorFlow|TensorFlow]], an open source software library for machine learning.<ref>{{cite web|last1=Dean|first1=Jeff|last2=Monga|first2=Rajat|title=TensorFlow - Google’s latest machine learning system, open sourced for everyone|url=https://research.googleblog.com/2015/11/tensorflow-googles-latest-machine_9.html|website=Google Research Blog|accessdate=5 June 2016|date=9 November 2015}}</ref>
+
| 2014 || || Sibyl || Researchers from {{w|Google}} detail their work on Sibyl,<ref>{{cite web|last1=Canini|first1=Kevin|last2=Chandra|first2=Tushar|last3=Ie|first3=Eugene|last4=McFadden|first4=Jim|last5=Goldman|first5=Ken|last6=Gunter|first6=Mike|last7=Harmsen|first7=Jeremiah|last8=LeFevre|first8=Kristen|last9=Lepikhin|first9=Dmitry|last10=Llinares|first10=Tomas Lloret|last11=Mukherjee|first11=Indraneel|last12=Pereira|first12=Fernando|last13=Redstone|first13=Josh|last14=Shaked|first14=Tal|last15=Singer|first15=Yoram|title=Sibyl: A system for large scale supervised machine learning|url=https://users.soe.ucsc.edu/~niejiazhong/slides/chandra.pdf|website=Jack Baskin School Of Engineering|publisher=UC Santa Cruz|accessdate=8 June 2016}}</ref> a proprietary platform for massively parallel machine learning used internally by Google to make predictions about user behavior and provide recommendations.<ref>{{cite news|last1=Woodie|first1=Alex|title=Inside Sibyl, Google’s Massively Parallel Machine Learning Platform|url=http://www.datanami.com/2014/07/17/inside-sibyl-googles-massively-parallel-machine-learning-platform/|accessdate=8 June 2016|work=Datanami|publisher=Tabor Communications|date=17 July 2014}}</ref>
 
|-
 
|-
| 2015 || || || "Amazon launches its own machine learning platform."<ref name="forbes.com"/>
+
| 2014 || || || The chatbot named "{{w|Eugene Goostman}}" is reported to pass the {{w|Turing Test}} by convincing 33% of human judges that it is a human rather than a machine. This marks the first instance in which a chatbot achieves such a level of deception in the test.<ref name="javatpoint.comu"/> It is developed by Russian-born Vladimir Veselov, Ukrainian-born Eugene Demchenko, and Russian-born Sergey Ulasen.<ref>{{cite web |title=The Turing Test Is Not What You Think It Is {{!}} WNYC {{!}} New York Public Radio, Podcasts, Live Streaming Radio, News |url=https://www.wnyc.org/story/the-turing-test-is-not-what-you-think-it-is/ |website=WNYC |access-date=4 July 2023 |language=en}}</ref>
 
|-
 
|-
| 2015 || || || "Microsoft creates the Distributed Machine Learning Toolkit, which enables the efficient distribution of machine learning problems across multiple computers."<ref name="forbes.com"/>
+
| 2014 || || || British artificial intelligence company {{w|DeepMind}}, founded in 2010 by {{w|Demis Hassabis}}, {{w|Shane Legg}}, and {{w|Mustafa Suleyman}}, is acquired by {{w|Google}}. DeepMind is known for its work in {{w|reinforcement learning}}, a type of machine learning that allows computers to learn by trial and error, and has built systems that play basic video games at human level. It would go on to develop {{w|AlphaGo}}, which would defeat a professional human Go player in 2016.<ref name="dataversity.net"/>
 
|-
 
|-
| 2015 || || || " Over 3,000 AI and Robotics researchers, endorsed by Stephen Hawking, Elon Musk and Steve Wozniak (among many others), sign an open letter warning of the danger of autonomous weapons which select and engage targets without human intervention."<ref name="forbes.com"/>
+
| 2014 || || || {{w|Ian Goodfellow}} and his colleagues invent Generative Adversarial Networks (GANs), a class of machine learning systems that can be used to generate realistic images, text, and other data. GANs work by pitting two neural networks against each other: the generator, which creates new data, and the discriminator, which determines whether a given sample was produced by the generator or is real data. As the two networks are trained, each becomes better at its task, and the generator's output becomes increasingly difficult to distinguish from real data.<ref name="import.ioe"/><ref name="GANnips">{{cite conference|title=Generative Adversarial Networks|first1=Ian |last1=Goodfellow |first2=Jean |last2=Pouget-Abadie |first3=Mehdi |last3=Mirza |first4=Bing |last4=Xu |first5=David |last5=Warde-Farley |first6=Sherjil |last6=Ozair |first7=Aaron |last7=Courville |first8=Yoshua |last8=Bengio |conference= Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014) |pages= 2672–2680 |url= https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf |year=2014}}</ref>
 
|-
 
|-
| 2015 || || || " In 2015, the Google speech recognition program reportedly had a significant performance jump of 49 percent using a CTC-trained Long Short-Term Memory."<ref name="dataversity.net"/>
+
| 2015 (February) || || || {{w|spaCy}} is released. It is a free, open-source natural language processing (NLP) library for Python, designed for tasks such as text classification, named entity recognition, and part-of-speech tagging. spaCy would be used by a wide range of companies and organizations, including Google, Facebook, and Amazon, as well as by many academic researchers.<ref>{{cite web |title=A Little spaCy Food for Thought: Easy to use NLP Framework |url=https://towardsdatascience.com/a-little-spacy-food-for-thought-easy-to-use-nlp-framework-97cbcc81f977 |website=towardsdatascience.com |accessdate=5 March 2020}}</ref><ref>{{cite web |title=Introducing spaCy |url=https://explosion.ai/blog/introducing-spacy |website=explosion.ai |accessdate=5 March 2020}}</ref>
 
|-
 
|-
| 2015 || || || "OpenAI (2015) - This is a non-profit organisation created by Elon Musk and others, to create safe artificial intelligence that can benefit humanity."<ref name="dataversity.net"/>
+
| 2015 (March 27) || Software release || {{w|Keras}} || {{w|Keras}} is first released. It is an open source software library designed to simplify the creation of deep learning models.<ref>{{cite web |title=Keras |url=https://news.ycombinator.com/item?id=21730711 |website=news.ycombinator.com |accessdate=5 March 2020}}</ref>
 
|-
 
|-
| 2015 || || || "Amazon Machine Learning Platform (2015) - This is part of Amazon Web Services, and shows how most big companies want to get involved in machine learning. They say it drives many of their internal systems, from regularly used services such as search recommendations and Alexa, to more experimental ones like Prime Air and Amazon Go."<ref name="dataversity.net"/>
+
| 2015 (June 9) || Software release || {{w|Chainer}} || {{w|Chainer}} is released by Preferred Networks, Inc. in Japan. A deep learning framework written in Python, it would become a popular choice for deep learning research and development.<ref name=":0">{{cite web|url=https://www.theregister.co.uk/2017/04/07/intel_chainer_ai_day/|title=Big-in-Japan AI code 'Chainer' shows how Intel will gun for GPUs|date=2017-04-07|website=The Register|access-date=8 March 2020}}</ref><ref name=":1">{{Cite news|title=Deep Learning のフレームワーク Chainer を公開しました|url=https://research.preferred.jp/2015/06/deep-learning-chainer/|date=2015-06-09|access-date=8 March 2020|language=ja-JP}}</ref>
 
|-
 
|-
| 2015 || || || "ResNet (2015) - This was a major advancement in CNNs, and more information can be found on the Introduction to CNNs page."<ref name="dataversity.net"/>
+
| 2015 (October 8) || Software release || {{w|Apache SINGA}} || {{w|Apache SINGA}} is first released. It is an open-source distributed machine learning library that facilitates the training of large-scale machine learning (especially deep learning) models over a cluster of machines. The SINGA project was initiated by the DB System Group at National University of Singapore in 2014, in collaboration with the database group of Zhejiang University. The goal of the project was to support complex analytics at scale, and make database systems more intelligent and autonomic. Apache SINGA would be used by a number of organizations, including Citigroup, NetEase, and Singapore General Hospital. It would become a popular choice for distributed deep learning because it is easy to use, scalable, and efficient.<ref>{{cite web |title=Apache SINGA |url=https://singa.apache.org/ |website=singa.apache.org |accessdate=8 March 2020}}</ref>
 
|-
 
|-
| 2015 || || || "U-net (2015) - This is an CNN architecture specialised in biomedical image segmentation. It introduced an equal amount of upsampling and downsampling layers, and also skip connections. More information on what this means can be found on the Semantic Segmentation page."<ref name="dataversity.net"/>
+
| 2015 || Achievement || Beating Humans in Go || Google's {{w|AlphaGo}} program becomes the first {{w|Computer Go}} program to beat an unhandicapped professional human player<ref>{{cite web|title=Google achieves AI 'breakthrough' by beating Go champion|url=http://www.bbc.com/news/technology-35420579|website=BBC News|publisher=BBC|accessdate=5 June 2016|date=27 January 2016}}</ref> using a combination of machine learning and tree search techniques.<ref>{{cite web|title=AlphaGo|url=https://www.deepmind.com/alpha-go.html|website=Google DeepMind|publisher=Google Inc|accessdate=5 June 2016}}</ref>
 
|-
 
|-
| 2015 || || || "Machines and humans pair up to fight fraud online. When PayPal set out to fight fraud and money laundering on its site, it took a hybrid approach. Human detectives define the characteristics of criminal behavior, then a machine learning program uses those parameters to root out the bad guys on the PayPal site"<ref name="cloud.withgoogle.com"/>
+
| 2015 || Software release || || Google releases {{w|TensorFlow}}, an open source software library for machine learning.<ref>{{cite web|last1=Dean|first1=Jeff|last2=Monga|first2=Rajat|title=TensorFlow - Google’s latest machine learning system, open sourced for everyone|url=https://research.googleblog.com/2015/11/tensorflow-googles-latest-machine_9.html|website=Google Research Blog|accessdate=5 June 2016|date=9 November 2015}}</ref>
 
|-
 
|-
| 2015 (November 30) || || || {{w|Rnn (software)}} is released.
+
| 2015 || || || [[w:Amazon (company)|Amazon]] launches its own machine learning platform called Amazon Machine Learning (Amazon ML). It is a cloud-based service that allows developers to build, train, and deploy machine learning models without having to worry about the underlying infrastructure.<ref name="forbes.com"/><ref name="dataversity.net"/>
 
|-
 
|-
| 2016 (January 25) || || || {{w|Microsoft Cognitive Toolkit}} is initially released.
+
| 2015 || || || The Distributed Machine Learning Toolkit (DMTK) is first released. It is a Microsoft-developed open-source framework that enables the efficient distribution of machine learning problems across multiple computers.<ref name="forbes.com"/>
 
|-
 
|-
| 2016 || || || "Google’s artificial intelligence algorithm beats a professional player at the Chinese board game Go, which is considered the world’s most complex board game and is many times harder than chess. The AlphaGo algorithm developed by Google DeepMind managed to win five games out of five in the Go competition."<ref name="forbes.com"/> "AlphaGo beat the world's number second player Lee sedol at Go game. In 2017 it beat the number one player of this game Ke Jie."<ref name="javatpoint.comu"/>
+
| 2015 || || || Over 3,000 AI and robotics researchers sign an open letter warning of the danger of autonomous weapons which select and engage targets without human intervention. The letter is endorsed by a number of high-profile figures, including Stephen Hawking, Elon Musk, and Steve Wozniak. The letter states that autonomous weapons pose a serious threat to humanity. They could be used to kill without discrimination, and they could be difficult to control. The letter calls for a ban on the development and use of autonomous weapons. The open letter would be influential in raising awareness of the dangers of autonomous weapons. It would also lead to a number of countries considering bans on the development and use of these weapons.<ref name="forbes.com"/>
 
|-
 
|-
| 2016 || Software || FBLearner Flow || Facebook details FBLearner Flow, an internal software platform that allows Facebook software engineers to easily share, train and use machine learning algorithms.<ref>{{cite web|last1=Dunn|first1=Jeffrey|title=Introducing FBLearner Flow: Facebook's AI backbone|url=https://code.facebook.com/posts/1072626246134461/introducing-fblearner-flow-facebook-s-ai-backbone/|website=Facebook Code|publisher=Facebook|accessdate=8 June 2016|date=10 May 2016}}</ref> FBLearner Flow is used by more than 25% of Facebook's engineers, more than a million models have been trained using the service and the service makes more than 6 million predictions per second.<ref>{{cite news|last1=Shead|first1=Sam|title=There's an 'AI backbone' that over 25% of Facebook's engineers are using to develop new products|url=http://www.businessinsider.com.au/over-a-quarter-of-facebooks-employees-are-using-fblearner-flow-2016-5?r=UK&IR=T|accessdate=8 June 2016|work=Business Insider|publisher=Allure Media|date=10 May 2016}}</ref>
+
| 2015 || || || Google's speech recognition program has a 49% performance jump using CTC-trained LSTMs. This is a major milestone in the development of speech recognition technology, as it shows that CTC-trained LSTMs could be used to train speech recognition programs that were significantly more accurate than previous models. CTC-trained LSTMs would be later used in a variety of commercial speech recognition products, including Google's Voice Search and Amazon's Alexa. They have the potential to revolutionize the way we interact with computers, and they are likely to be used in a variety of applications in the years to come.<ref name="dataversity.net"/>
 
|-
 
|-
| 2016 (October) || || || {{w|PyTorch}} is first released.
+
| 2015 || Organization || {{w|OpenAI}} || {{w|OpenAI}} is founded as a non-profit research company by {{w|Elon Musk}}, {{w|Sam Altman}}, {{w|Ilya Sutskever}}, and others. The company's mission is to ensure that artificial general intelligence benefits all of humanity. <ref name="dataversity.net"/>
 
|-
 
|-
| 2017 (April 18) || || Software release || {{w|Caffe (software)}} is initially released.  
+
| 2015 || || {{w|PayPal}} || {{w|PayPal}} adopts a collaborative approach to combat fraud and money laundering on its platform by combining the efforts of humans and machines. Human detectives play a crucial role in identifying the patterns and traits associated with criminal behavior. This knowledge is then utilized by a machine learning program to effectively detect and eliminate fraudulent activity on the PayPal site. The synergy between human expertise and automated algorithms enhances PayPal's ability to identify and thwart fraudulent individuals.<ref name="cloud.withgoogle.com"/>
 
|-
 
|-
| 2017 (April 25) || || Software release || {{w|Shogun (toolbox)}} is released.
+
| 2016 (January 25) || || {{w|Microsoft Cognitive Toolkit}} || {{w|Microsoft Cognitive Toolkit}} is initially released. It is an AI solution aimed at helping users to advance in their machine learning projects.<ref name="Sharing is Caring with Algorithms"/>
 
|-
 
|-
| 2017 || || || "In 2017, the Alphabet's Jigsaw team built an intelligent system that was able to learn the online trolling. It used to read millions of comments of different websites to learn to stop online trolling."<ref name="javatpoint.comu"/> "As part of its anti-harassment efforts, Alphabet’s Jigsaw team built a system that learned to identify trolling by reading millions of website comments. The underlying algorithms could be a huge help for sites with limited resources for moderation"<ref name="cloud.withgoogle.com"/>
+
| 2016 || || AlphaGo || Google's artificial intelligence algorithm, AlphaGo, achieves a significant milestone by defeating a professional player in the complex Chinese board game Go. Considered more challenging than chess, Go is known for its intricate gameplay. AlphaGo, developed by Google {{w|DeepMind}}, emerges victorious in all five games of a Go competition against top players. It first defeats {{w|Lee Sedol}}, the world's second-ranked player, and later goes on to defeat {{w|Ke Jie}}, the game's number one player in 2017.<ref name="javatpoint.comu"/><ref name="forbes.com"/>
 
|-
 
|-
| 2019 (September 10) || || Software release || {{w|Deeplearning4j}} is initially released.
+
| 2016 || Software release || FBLearner Flow || Facebook details FBLearner Flow, an internal software platform that allows Facebook software engineers to easily share, train and use machine learning algorithms.<ref>{{cite web|last1=Dunn|first1=Jeffrey|title=Introducing FBLearner Flow: Facebook's AI backbone|url=https://code.facebook.com/posts/1072626246134461/introducing-fblearner-flow-facebook-s-ai-backbone/|website=Facebook Code|publisher=Facebook|accessdate=8 June 2016|date=10 May 2016}}</ref> FBLearner Flow is used by more than 25% of Facebook's engineers, more than a million models have been trained using the service and the service makes more than 6 million predictions per second.<ref>{{cite news|last1=Shead|first1=Sam|title=There's an 'AI backbone' that over 25% of Facebook's engineers are using to develop new products|url=http://www.businessinsider.com.au/over-a-quarter-of-facebooks-employees-are-using-fblearner-flow-2016-5?r=UK&IR=T|accessdate=8 June 2016|work=Business Insider|publisher=Allure Media|date=10 May 2016}}</ref>
 
|-
 
|-
| 2019 (November 26) || || Software release || {{w|mlpack}} is released.
+
| 2016 (October) || || {{w|PyTorch}} || {{w|PyTorch}} is first released by Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, and others. It is an open-source machine learning framework that is based on the Torch library. Torch is a scientific computing library that is used for deep learning research. PyTorch would become a popular choice for deep learning research and development. It is easy to use and it is very flexible. It is also well-supported by a large community of developers.<ref>{{cite web |title=PyTorch Releases Major Update, Now Officially Supports Windows |url=https://medium.com/syncedreview/pytorch-releases-major-update-now-officially-supports-windows-2426c9f29d2d |website=medium.com |accessdate=8 March 2020}}</ref>
 
|-
 
|-
| 2019 (December 20) || || Software release || {{w|Weka (machine learning)}} is released.  
+
| 2017 || || || Alphabet's Jigsaw team develops an intelligent system to combat online trolling. This system is designed to learn and identify trolling behavior by analyzing millions of comments from various websites. The algorithms behind the system have the potential to assist websites with limited moderation resources in detecting and addressing online harassment.<ref name="javatpoint.comu"/><ref name="cloud.withgoogle.com"/>
 
|-
 
|-
| 2020 (February 5) || || Software release || {{w|KNIME}} is released.
+
| 2017 (May 1) || || {{w|CellCognition}} || {{w|CellCognition}}<ref>{{cite web |last1= |first1= |title=CellCognition Explorer |url=https://software.cellcognition-project.org/ |website=software.cellcognition-project.org |accessdate=8 March 2020}}</ref><ref>{{cite journal |title=A deep learning and novelty detection framework for rapid phenotyping in high-content screening. |doi=10.1091/mbc.E17-05-0333 |pmid=28954863 |url=http://europepmc.org/article/PMC/5687041 |pmc=5687041}}</ref>
 
|-
 
|-
 
|}
 
|}
 +
 +
== Visual data ==
 +
 +
=== Google Trends ===
 +
 +
The image below shows {{w|Google Trends}} data for Machine learning (Field of study), from January 2004 to March 2021, when the screenshot was taken. Interest is also ranked by country and displayed on a world map.<ref>{{cite web |title=Machine learning |url=https://trends.google.com/trends/explore?date=all&q=%2Fm%2F01hyh_ |website=Google Trends |access-date=11 March 2021}}</ref>
 +
 +
[[File:Machine learning gt.png|thumb|center|700px]]
 +
 +
=== Google Ngram Viewer ===
 +
 +
The chart below shows {{w|Google Ngram Viewer}} data for Machine learning, from 1950 to 2019.<ref>{{cite web |title=Machine learning |url=https://books.google.com/ngrams/graph?content=Machine+learning&year_start=1950&year_end=2019&corpus=26&smoothing=3&case_insensitive=true |website=books.google.com |access-date=11 March 2021 |language=en}}</ref>
 +
 +
[[File:Machine learning ngram.png|thumb|center|800px]]
 +
 +
=== Wikipedia Views ===
 +
 +
The chart below shows pageviews of the English Wikipedia article {{w|Machine learning}}, on desktop from December 2007, and on mobile-web, desktop-spider, mobile-web-spider and mobile app, from July 2015; to February 2021.<ref>{{cite web |title=Machine learning |url=https://wikipediaviews.org/displayviewsformultiplemonths.php?page=Machine+learning&allmonths=allmonths&language=en&drilldown=all |website=wikipediaviews.org |access-date=11 March 2021}}</ref>
 +
 +
[[File:Machine learning wv.png|thumb|center|600px]]
  
 
==See also==
 
==See also==
Line 283: Line 331:
 
===How the timeline was built===
 
===How the timeline was built===
  
The initial version of the timeline was written by [[User:FIXME|FIXME]].
+
The initial version of the timeline was written by [[User:Issa]].
  
 
{{funding info}} is available.
 
{{funding info}} is available.
Line 294: Line 342:
  
 
===What the timeline is still missing===
 
===What the timeline is still missing===
+
 
 +
* {{w|Outline of machine learning}}
 +
* https://factored.ai/machine-learning-engineering/
 +
* https://www.dataversity.net/a-brief-history-of-machine-learning/
 +
* https://www.techtarget.com/whatis/A-Timeline-of-Machine-Learning-History
 +
* https://www.lightsondata.com/the-history-of-machine-learning/
 +
* https://www.clickworker.com/customer-blog/history-of-machine-learning/
 +
* https://analyticsindiamag.com/the-history-of-machine-learning-algorithms/
 +
* https://www.startechup.com/blog/machine-learning-history/
 +
* https://blog.bccresearch.com/brief-history-of-machine-learning
 +
* https://dataconomy.com/2022/04/27/the-history-of-machine-learning/
 +
* https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/?sh=3dfbc88515e7
 +
* https://techcrunch.com/2017/08/08/the-evolution-of-machine-learning/
 +
* https://concisesoftware.com/blog/history-of-machine-learning/
 +
* https://www.linkedin.com/pulse/rise-machine-learning-history-florian-steinig/
 +
* https://labelyourdata.com/articles/history-of-machine-learning-how-did-it-all-start
 +
* https://pandio.com/when-was-machine-learning-invented/
 +
* https://medium.com/@codetain/a-brief-history-of-machine-learning-38f20c155c42
 +
* https://medium.com/bloombench/history-of-machine-learning-7c9dc67857a5
 +
* https://theintactone.com/2021/11/27/history-of-machine-learning/
 +
* https://recro.io/blog/history-of-machine-learning/
 +
* https://www.britannica.com/technology/machine-learning
 +
* https://ifatwww.et.uni-magdeburg.de/ifac2020/media/pdfs/3439.pdf
 +
* https://people.idsia.ch/~juergen/deep-learning-history.html
 +
* https://www.researchgate.net/figure/History-of-machine-learning_fig1_366424883
 +
* https://www.bvp.com/atlas/the-evolution-of-machine-learning-infrastructure
 +
* https://www.securityinfowatch.com/cybersecurity/article/21114214/a-brief-history-of-machine-learning-in-cybersecurity
 +
* https://builtin.com/artificial-intelligence/deep-learning-history
 +
* https://www.interactions.com/blog/technology/history-machine-learning/
 +
* https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.14061
 +
* https://ourworldindata.org/brief-history-of-ai
  
 
===Timeline update strategy===
 
===Timeline update strategy===
  
 
==See also==
 
==See also==
 +
 +
* [[Timeline of artificial intelligence]]
  
 
==External links==
 
==External links==

Latest revision as of 21:34, 21 July 2023

The content on this page is forked from the English Wikipedia page entitled "Timeline of machine learning". The original page still exists at Timeline of machine learning. The original content was released under the Creative Commons Attribution/Share-Alike License (CC-BY-SA), so this page inherits this license.

This page is a timeline of machine learning. Major discoveries, achievements, milestones and other major events are included.

Big picture

Time period Development summary More details
1950s-1970s Early days The early days of machine learning are marked by the development of statistical methods and the use of simple algorithms. In the 1950s, Arthur Samuel develops a machine learning algorithm that can learn to play checkers. In the 1960s, Frank Rosenblatt develops the perceptron, a simple neural network that could learn to classify patterns. However, the early days of machine learning were also marked by a period of pessimism, known as the AI Winter. This was due to a number of factors, including the failure of some early AI projects and the difficulty of scaling up machine learning algorithms to large datasets.
1980s-1990s Resurgence The rediscovery of backpropagation causes a resurgence in machine learning research. Convolutional neural networks emerge. Support vector machines and recurrent neural networks become popular. Machine learning shifts from a knowledge-driven approach to a data-driven approach.[1]
2000s-present Modern era The modern era of machine learning begins in the 2000s, when the development of deep learning makes it possible to train neural networks on even larger datasets. This leads to a resurgence of interest in neural networks, and they are now used in a wide variety of applications, including image recognition, natural language processing, speech recognition, machine translation, medical diagnosis, financial trading, and self-driving cars.

Summary by decade

Decade Summary
<1950s Statistical methods are discovered and refined.
1950s Pioneering machine learning research is conducted using simple algorithms.
1960s The field of neural network research experiences a notable development with the discovery and utilization of multilayers.[2] Neural networks of this period are primarily shallow in structure, meaning they consist of only a few layers of interconnected neurons. These shallow neural networks have limitations in handling complex problems that require more sophisticated data representations. However, they lay the foundation for further advancements in neural network research and pave the way for the development of deeper and more powerful networks in the future.[3]
1970s The AI Winter is caused by pessimism about machine learning effectiveness. Backpropagation is developed, allowing a network to adjust its hidden layers of neurons/nodes to adapt to new situations.[2]
1980s During the mid-1980s, the focus of research in the field of machine learning shifts towards artificial neural networks (ANN). However, in the subsequent decade of the 1990s, statistical learning systems gain prominence and temporarily overshadow the popularity of ANN. A pivotal event during this period is the emergence of convolution as a significant concept in machine learning, while the rediscovery and renewed exploration of backpropagation techniques lead to a resurgence of interest and advancement in the field of machine learning research.[4][3]
1990s There is a shift away from neural networks and towards statistical learning methods. Statistical learning methods are able to achieve comparable or better performance than neural networks on a wider range of tasks. However, neural networks continue to be used for some specific tasks, such as natural language processing and image recognition.[5][6][7]
2000s Deep learning becomes feasible and neural networks see widespread commercial use.[3]
2010s Machine learning becomes integral to many widely used software services and receives great publicity.

Full timeline

Year Event Type Caption Event
1642 At the age of 19, French child prodigy Blaise Pascal creates an "arithmetic machine" for his father, a tax collector. This machine has the capability to perform addition, subtraction, multiplication, and division. Fast forward three centuries, the Internal Revenue Service (IRS) now utilizes machine learning techniques to tackle tax evasion.[8]
1679 Gottfried Wilhelm Leibniz, a German mathematician, philosopher, and sometimes poet, is credited with inventing the binary code system, which serves as the basis for contemporary computing.[8]
1763 Discovery The Underpinnings of Bayes' Theorem Thomas Bayes's work An Essay towards solving a Problem in the Doctrine of Chances is published two years after his death, having been amended and edited by a friend of Bayes, Richard Price.[9] The essay presents work which underpins Bayes' theorem.
1801 French weaver and merchant Joseph-Marie Jacquard introduces a groundbreaking innovation in data storage through the invention of a programmable weaving loom. The loom utilizes punched cards to control the movement of warp threads, enabling the creation of intricate patterns in fabric. This revolutionary technology not only allows weavers to produce complex designs more efficiently but also paves the way for future advancements in data storage. The concept of punched cards, pioneered by Jacquard, would become a fundamental principle in computer data storage systems during the 20th century. This significant development lays the foundation for the evolution of data storage technology as we know it today.[10][11]
1805 Discovery Least Squares Adrien-Marie Legendre describes the "méthode des moindres carrés", known in English as the least squares method.[12] The least squares method is used widely in data fitting, which in machine learning, refers to the process of finding a model or function that best represents or fits a given dataset.
1812 Bayes' Theorem Pierre-Simon Laplace publishes Théorie Analytique des Probabilités, in which he expands upon the work of Bayes and defines what is now known as Bayes' Theorem.[13]
1834 English polymath Charles Babbage, known as the father of the computer, envisions a machine that could be programmed using punch cards. Although the device would never be constructed, its logical framework forms the basis for all modern computers. Charles Babbage's contribution to punch-card programming is significant in the development of computer technology.[14][8]
1842 English mathematician and writer Ada Lovelace becomes the world's first computer programmer. She develops an algorithm that outlines a series of steps for solving mathematical problems on Charles Babbage's theoretical punch-card machine. Ada Lovelace's pioneering work in computer programming would be recognized years later when the US Department of Defense names a new software language "Ada" in her honor.[8]
1847 English mathematician, philosopher, and logician George Boole devises a type of algebra that allows all values to be simplified as either "true" or "false." This concept, known as Boolean logic, would play a crucial role in contemporary computing by aiding the central processing unit (CPU) in determining how to handle incoming inputs.[8][11]
1854 English physician John Snow, during a deadly cholera outbreak in London, challenges the prevailing belief that cholera spreads through "bad air." Using a map, Snow plots the locations of cholera cases and identifies the regions closest to each water pump. He makes a significant discovery by finding that most deaths occurred near a specific pump on Broad Street in the Soho district. Snow deduces that the contaminated water from that pump is responsible for the outbreak. By convincing the locals to disable the pump, the epidemic is brought under control. This event marks the birth of epidemiology and serves as an early success of the nearest-neighbor algorithm, even before its official invention, nearly a century later.[15]
1890 German-American statistician, inventor, and businessman Herman Hollerith develops a pioneering mechanical system that integrates punch cards with mechanical calculation methods. This groundbreaking system would enable the rapid computation of statistics compiled from vast amounts of data collected from millions of individuals.[11] Such advancement would contribute to the evolution of computing and provide a basis for future developments in machine learning.
1913 Discovery Markov Chains Andrey Markov first describes techniques he used to analyse a poem. The techniques later become known as Markov chains.[16]
1936 English mathematician Alan Turing proposes a theory outlining how a machine could identify and carry out a predefined set of instructions.[14] His theory of computation forms the foundation of modern computing and has direct relevance to machine learning. Turing's concept of a universal machine laid the groundwork for the development of computers capable of executing algorithms and processing data.[17]
1940 ENIAC (Electronic Numerical Integrator and Computer) is created as the first manually operated computer. This invention marks the birth of the first electronic general-purpose computer. Following this milestone, stored program computers such as EDSAC in 1949 and EDVAC in 1951 would be subsequently developed. These advancements introduce the concept of storing and executing programs electronically, paving the way for the evolution of modern computer systems.[14]
1943 American neurophysiologist Warren McCulloch and mathematician Walter Pitts publish a paper describing the functioning of neurons and their desire to create a model of it using an electrical circuit. This marks the first instance of neural networks. Building on this concept, they begin exploring the application of their idea and delve into the analysis of human neuron behavior.[2][14]
1949 Canadian psychologist Donald Hebb introduces a pioneering concept that marks the initial advancement in machine learning. Known as Hebbian Learning theory, it draws from a neuropsychological framework and aims to establish correlations among nodes within a recurrent neural network (RNN). This theory essentially captures and retains shared patterns within the network, functioning as a memory for future reference. In simpler terms, Hebbian Learning theory enables the network to identify connections and store relevant information for later use.[18]
1950 Turing's Learning Machine Alan Turing proposes a 'learning machine' that could learn and become artificially intelligent. Turing's specific proposal foreshadows genetic algorithms.[19] "Alan Turing creates the “Turing Test” to determine if a computer has real intelligence. To pass the test, a computer must be able to fool a human into believing it is also human."[7][14]
1951 First Neural Network Machine Marvin Minsky and Dean Edmonds build the SNARC, the first neural network machine able to learn.[20]
1952 Machines Playing Checkers Arthur Samuel at IBM's Poughkeepsie Laboratory becomes one of the early pioneers of machine learning. He develops some of the first machine learning programs, starting with programs that play checkers. Samuel's program, designed for an IBM computer, analyzes winning strategies by studying gameplay. Over time, the program would improve its performance by incorporating successful moves into its algorithm, thereby enhancing its gameplay abilities. Samuel's use of alpha-beta pruning in his computer program enables it to play checkers at a championship level, marking a significant milestone in the application of machine learning to gaming.[21][7][22]
1957 Discovery Perceptron Frank Rosenblatt invents the perceptron while working at the Cornell Aeronautical Laboratory. This groundbreaking invention garners significant attention and receives extensive media coverage. The perceptron is the first neural network for computers. It aims to simulate the cognitive processes of the human brain, marking a significant milestone in the field of artificial intelligence.[23][24][7]
1959 A significant advancement in neural networks occurs when Bernard Widrow and Marcian Hoff develop two models at Stanford University. The initial model, known as ADALINE, showcases the ability to recognize binary patterns and make predictions about the next bit in a sequence. The subsequent generation, called MADALINE, proves to be highly practical as it effectively eliminates echo on phone lines, providing a valuable real-world application. Remarkably, this technology continues to be utilized to this day.[8][2]
1959 The term "Machine Learning" is first coined by Arthur Samuel[14], who defines it as the “field of study that gives computers the ability to learn without being explicitly programmed”.[25]
1959 The first practical application of a neural network occurs when it is utilized to address the issue of echo removal on phone lines. This is achieved through the implementation of an adaptive filter.[14]
1962 U.S. professor Bernard Widrow and Ted Hoff introduce the ADALINE algorithm, a single-layer neural network that can be used for classification and regression tasks. The ADALINE algorithm is a significant breakthrough in the field of machine learning, but it is limited to a single layer. This is because it is difficult to train neural networks with multiple layers.[2]
1963 United States government agencies like the Defense Advanced Research Projects Agency (DARPA) fund AI research at universities such as MIT, hoping for machines that would translate Russian instantly. The Cold War is in full swing at the time, and the US government is eager to develop technologies that would give them an edge over the Soviet Union. Machine translation is seen as one such technology, and DARPA is willing to invest heavily in its development. MIT is one of the leading universities in the field of AI research at the time, and DARPA funds a number of projects at the university.[26]
1965 Soviet mathematician Alexey Ivakhnenko publishes a number of articles and books on group method of data handling (GMDH), a method for inductive inference that is used to build complex models from data. Ivakhnenko's work on GMDH is influential in the development of neural networks, as the GMDH algorithm is similar to the backpropagation algorithm, which is a widely used algorithm for training neural networks. Ivakhnenko's work on GMDH is considered to be one of the foundations of deep learning. His work would have a significant impact on the development of machine learning, and it is still used today in a variety of applications.[27]
1967 Nearest Neighbor Thomas M. Cover and Peter E. Hart make a significant contribution to the field of pattern recognition by introducing the nearest neighbor algorithm. This algorithm marks the beginning of basic pattern recognition capabilities for computers. Its initial application is in mapping routes, particularly for traveling salesmen who needed to visit multiple cities in a short tour. By leveraging the nearest neighbor algorithm, computers could identify similarities between items in large datasets and automatically recognize patterns. This breakthrough paves the way for further advancements in pattern recognition and data analysis.[28][7]
1969 Limitations of Neural Networks Marvin Minsky and Seymour Papert publish their book Perceptrons, describing some of the limitations of perceptrons and neural networks. The interpretation that the book shows that neural networks are fundamentally limited is seen as a hindrance for research into neural networks.[29][30]
1970 Automatic Differentiation (Backpropagation) Finnish mathematician and computer scientist Seppo Linnainmaa publishes the general method for automatic differentiation (AD) of discrete connected networks of nested differentiable functions.[31][32] This corresponds to the modern version of backpropagation, but is not yet named as such.[33][34][35][36][18]
1974 Algorithm Greek biomedical engineer Evangelia Micheli-Tzanakou and Harth introduce ALOPEX (ALgorithms Of Pattern EXtraction) as a correlation based machine learning algorithm, which focuses on extracting patterns from data by identifying correlations between variables or features.
1974 Backpropagation American social scientist and machine learning pioneer Paul Werbos lays the foundation for backpropagation in his dissertation, a technique that adjusts the weights of neural networks to improve prediction accuracy.[22]
1977 Algorithm The Expectation–maximization algorithm is explained and given its name in a paper by Arthur Dempster, Nan Laird, and Donald Rubin.[37]
1979 Stanford Cart Students at Stanford University develop a cart that can navigate and avoid obstacles in a room.[38] The Stanford Cart, a remote-controlled robot, successfully navigates a room filled with obstacles without human intervention, showcasing advancements in autonomous movement.[39][7]
1980 Discovery Neocognitron Japanese computer scientist Kunihiko Fukushima introduces the neocognitron, a hierarchical multilayered convolutional neural network. This groundbreaking work lays the foundation for convolutional neural networks, which would become a fundamental architecture in the field of artificial neural networks. The neocognitron's innovative design would inspire further advancements and applications in image and pattern recognition.[40][41][27]
1980 The Linde–Buzo–Gray algorithm is introduced by Yoseph Linde, Andrés Buzo and Robert M. Gray.[42]
1980 The first instance of the International Conference on Machine Learning takes place. The conference serves as a platform for researchers, practitioners, and industry professionals to come together and present their latest research, share ideas, and discuss advancements in machine learning algorithms, methodologies, and applications.
1981 Explanation Based Learning Gerald Dejong introduces Explanation Based Learning (EBL), a concept in machine learning where a computer algorithm analyzes training data to create a general rule by discarding unimportant information. This approach allows the algorithm to focus on relevant patterns and extract valuable knowledge from the data.[43][7]
1981 American social scientist and machine learning pioneer Paul Werbos publishes a paper in the Mathematics of Control, Signals, and Systems journal that introduces the backpropagation algorithm for training multilayer perceptrons (MLPs). MLPs are a type of neural network that can learn to solve complex problems by adjusting the weights of its connections. Werbos's paper is a major breakthrough in the field of machine learning. It shows that MLPs can be trained to solve problems that are previously thought to be intractable. This would lead to a resurgence of interest in neural networks, paving the way for the development of more advanced neural network architectures.[18]
1982 Discovery Recurrent Neural Network John Hopfield popularizes Hopfield networks, a type of recurrent neural network that can serve as content-addressable memory systems.[44][2][3]
1982 Japan makes a significant announcement regarding its emphasis on the development of more sophisticated neural networks. This declaration serves as a catalyst for increased American funding in this field, subsequently leading to a surge of research endeavors in the same domain.[2]
1982 Self-learning as machine learning paradigm is introduced along with a neural network capable of self-learning named Crossbar Adaptive Array (CAA).[45]
1985 NetTalk Terry Sejnowski, along with Charles Rosenberg, develops a neural network called NetTalk. This innovative system has the ability to learn the pronunciation of words similar to how a baby learns. NetTalk demonstrates impressive capabilities by teaching itself the correct pronunciation of approximately 20,000 words within just one week. This breakthrough in neural network technology showcases the potential of self-learning systems and their ability to acquire language skills.[46][14]
1985–1986 Researchers in the field of neural networks introduce the concept of Multilayer Perceptron (MLP) along with the practical Backpropagation (BP) training algorithm. Although the idea of BP was proposed earlier, the specific implementation for neural networks was suggested by Werbos in 1981. These developments mark a significant acceleration in neural network research and lay the foundation for the neural network architectures used today.[18]
1986 Discovery Backpropagation The process of backpropagation is described by David Rumelhart, Geoff Hinton and Ronald J. Williams.[47][48]
1986 Australian computer scientist Ross Quinlan proposes the ID3 algorithm, today a very well-known ML algorithm.[18]
1986 Algorithm The Dehaene–Changeux model is developed by cognitive neuroscientists Stanislas Dehaene and Jean-Pierre Changeux.[49] It is used to provide a predictive framework to the study of inattentional blindness and the solving of the Tower of London test.[50][51]
1986 Peer-reviewed scientific journal Machine Learning is first issued. Published by Springer Nature, it is considered to be one of the leading journals in the field of machine learning. The journal publishes articles on a wide range of topics related to machine learning, including statistical learning theory, natural language processing, computer vision, data mining, reinforcement learning, and robotics.[52]
1986 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
1987 The Conference on Neural Information Processing Systems (NeurIPS) is first held. It is a prominent conference in the field of artificial intelligence and machine learning, where researchers, academics, and industry professionals gather to present and discuss the latest advancements, research findings, and developments related to neural networks, deep learning, and various aspects of information processing systems. NeurIPS would become a significant platform for showcasing breakthroughs and fostering collaborations within the AI community.
1988 The Knowledge Engineering and Machine Learning Group is founded at the Technical University of Catalonia (UPC) in Barcelona, Spain. KEMLG is a research group that focuses on the development of knowledge engineering and machine learning techniques. The group would make significant contributions to the field of artificial intelligence, and its work would be used in a wide variety of applications, including medical diagnosis, fraud detection, and natural language processing.
1989 Discovery Reinforcement Learning Christopher Watkins develops Q-learning, which greatly improves the practicality and feasibility of reinforcement learning.[53]
1989 Commercialization Commercialization of Machine Learning on Personal Computers Axcelis, Inc. releases Evolver, the first software package to commercialize the use of genetic algorithms on personal computers.[54]
1989 Algorithm Chris Watkins introduces Q-learning, a model-free reinforcement learning algorithm.[55][56]
1992 Achievement Machines Playing Backgammon Gerald Tesauro develops TD-Gammon, a computer backgammon program that utilises an artificial neural network trained using temporal-difference learning (hence the 'TD' in the name). TD-Gammon is able to rival, but not consistently surpass, the abilities of top human backgammon players.[57]
1995 A significant breakthrough in machine learning occurs with the introduction of Support Vector Machines (SVM) by Vapnik and Cortes. SVMs possess a solid theoretical foundation and deliver impressive empirical results. This development would lead to a division within the machine learning community, with some advocating for neural networks (NN) while others support SVM as the preferred approach.[18]
1995 Discovery Random Forest Algorithm Tin Kam Ho publishes a paper describing random decision forests. Random decision forests are a type of ensemble learning algorithm that combines multiple decision trees to improve the accuracy of predictions. Ho's paper, titled Random Decision Forests, introduces the basic idea of random decision forests. Ho shows that by randomly selecting features and thresholds, it is possible to construct a large number of decision trees that are relatively independent of each other. This independence helps to reduce the variance of the predictions, which leads to improved accuracy. Ho's paper is met with a positive reception from the machine learning community. Random decision forests would go on to become one of the most popular machine learning algorithms, and they would be used in a wide variety of applications, including image classification, natural language processing, and medical diagnosis.[58]
1995 Discovery Support Vector Machines Corinna Cortes and Vladimir Vapnik publish their work on support vector machines in the journal Machine Learning. Their paper, titled "Support-Vector Networks", introduces SVMs as a new machine learning algorithm for classification and regression problems. SVMs are based on the idea of finding a hyperplane that separates two classes of data points with the maximum possible margin. The margin is the distance between the hyperplane and the closest data points on either side. The larger the margin, the more robust the SVM tends to be to noise in the data. SVMs would prove to be very effective for a wide variety of classification and regression problems. They are particularly well-suited for problems where the data is not linearly separable, as SVMs can be used to map the data to a higher-dimensional space where it becomes linearly separable. SVMs are also relatively easy to train and are very efficient in terms of computational resources. Today they are among the most popular machine learning algorithms, used in a wide variety of applications, including spam filtering, image classification, and fraud detection.[18][59]
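The margin described in the entry above can be made concrete with a small sketch. It only measures the margin of a hyperplane that is already given; it does not train an SVM. The hyperplane parameters `w` and `b`, the toy points, and the function name `geometric_margin` are assumptions chosen for illustration and do not come from the Cortes and Vapnik paper.

```python
import math

def geometric_margin(w, b, points):
    """Smallest distance from any point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x in points
    )

# Hypothetical toy data: two classes on either side of the line x1 + x2 = 3.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 3.0), (4.0, 2.0)]
w, b = (1.0, 1.0), -3.0  # assumed separating hyperplane, not a fitted one

margin = geometric_margin(w, b, points)
```

An SVM would search over all separating hyperplanes for the one that makes this quantity as large as possible.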
1996 (October 10) Orange is released by the University of Ljubljana. It is a visual programming language and integrated development environment (IDE) for data mining and machine learning.
1997 IBM Deep Blue Beats Kasparov Supercomputer Deep Blue, developed by IBM, achieves a historic victory by defeating chess grandmaster Garry Kasparov in a match. This landmark event demonstrates the potential of artificial intelligence to surpass human capability in complex tasks such as chess. It marks a pivotal moment in machine learning, highlighting the ability of AI systems to learn and evolve independently, posing new challenges and possibilities for mankind.[60][22]
1997 Discovery LSTM Sepp Hochreiter and Jürgen Schmidhuber invent long short-term memory (LSTM) recurrent neural networks,[61] greatly improving the efficiency and practicality of recurrent neural networks.
1997 Yoav Freund and Robert Schapire introduce AdaBoost, which would become an influential machine learning model. AdaBoost is an ensemble method that combines multiple weak classifiers to create a strong classifier. The work would later receive the prestigious Gödel Prize for its contributions. AdaBoost works by iteratively training weak classifiers, giving more importance in each round to the instances that earlier classifiers found difficult. This approach would prove effective in various tasks such as face recognition and detection, and it continues to serve as a foundation for many machine learning applications.[18]
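The reweighting idea in the entry above can be sketched as a single boosting round. This is a minimal illustration of the standard discrete AdaBoost weight update, not a full implementation: the weak classifier itself is omitted, and the function name `adaboost_reweight` and the four-example data set are hypothetical.

```python
import math

def adaboost_reweight(weights, correct):
    """One AdaBoost round: return (alpha, new_weights).

    weights: current example weights, summing to 1.
    correct: for each example, whether the weak classifier got it right.
    """
    # Weighted error of the weak classifier; assumes 0 < err < 1.
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    # Vote strength of this classifier in the final ensemble.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Shrink weights of correctly classified examples, grow the others.
    scaled = [w * math.exp(-alpha if ok else alpha)
              for w, ok in zip(weights, correct)]
    total = sum(scaled)
    return alpha, [w / total for w in scaled]

# Hypothetical round: four equally weighted examples, the last one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
alpha, new_weights = adaboost_reweight(weights, correct)
```

After the update, the misclassified example carries more weight than the others, so the next weak classifier is pushed to get it right.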
1998 MNIST database A team led by Yann LeCun releases the MNIST database, a dataset comprising a mix of handwritten digits from American Census Bureau employees and American high school students.[62] The MNIST database has since become a benchmark for evaluating handwriting recognition.
1998 Researchers at AT&T Bell Laboratories develop a neural network that can accurately recognize handwritten ZIP codes. The network is trained on a dataset of 100,000 ZIP codes, and it is able to achieve an accuracy of 99%. The network is trained using backpropagation, a method for adjusting the weights of a neural network so that it can better predict the output for a given input. The development of this network is a major breakthrough in the field of machine learning. It shows that neural networks can be used to solve real-world problems, and it paves the way for the development of more advanced neural networks.[2]
1999 A study is published in the Journal of the National Cancer Institute showing that computer-aided diagnosis (CAD) is more accurate than radiologists at detecting breast cancer on mammograms. The study, which is conducted by researchers at the University of Chicago, finds that CAD detects cancer 52% more accurately than radiologists do.
2000 Algorithm In anomaly detection, the local outlier factor (LOF) is an algorithm proposed by Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng and Jörg Sander for finding anomalous data points by measuring the local deviation of a given data point with respect to its neighbours.[63]
2000 LogitBoost, a boosting algorithm in machine learning and computational learning theory, is formulated by Jerome H. Friedman, Trevor Hastie, and Robert Tibshirani.[64]
2000 The Journal of Machine Learning Research is first published by the JMLR Foundation. It is considered to be one of the leading journals in the field of machine learning. JMLR publishes articles on a wide range of topics, including statistical learning theory, natural language processing, computer vision, data mining, reinforcement learning, and robotics.
2001 Breiman introduces an alternative ensemble model that involves the combination of multiple decision trees. In this model, each decision tree is carefully constructed by considering only a random subset of instances, while the selection of each node is based on a random subset of features.[18]
2001 The iDistance indexing and query processing technique is first proposed by Cui Yu, Beng Chin Ooi, Kian-Lee Tan and H. V. Jagadish.[65] It is a method for indexing and querying data in high-dimensional metric spaces. A metric space is a space where the distance between two points can be measured. High-dimensional metric spaces are often used to represent data that has a large number of features, such as images or text documents. The iDistance indexing technique would prove effective for a variety of applications, including image retrieval, text mining, and data mining.
2002 (October) Torch Machine Learning Library Torch is first released. It is a scientific computing library that is used for machine learning research. Torch is a popular choice for deep learning research and development. It would be used to develop a wide variety of deep learning models.[66]
2002 Software release Computer vision and machine learning library Dlib is first released by Davis King. It is a popular choice for developing facial recognition, object detection, and image processing applications.
2003 Algorithm The concept of manifold alignment is first introduced by Ham, Lee, and Saul as a class of machine learning algorithms that produce projections between sets of data, given that the original data sets lie on a common manifold.[67]
2004 Google unveils its MapReduce technology, which is a distributed programming model for processing and generating large data sets. MapReduce is based on the idea of breaking down a large data set into smaller chunks that can be processed in parallel on a cluster of computers. The development of MapReduce is a major breakthrough in the field of distributed computing. It makes it possible to process large data sets more efficiently and cost-effectively. This would lead to a decrease in the cost of parallel computing and memory, in turn making it possible to develop more powerful machine learning models.[27]
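The chunk-and-combine idea in the entry above can be sketched as a word count, a common teaching example for the MapReduce model. This is a single-process illustration of the map, shuffle and reduce phases, not Google's implementation: nothing here runs on a cluster, and the function names and sample chunks are hypothetical.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}

# Hypothetical input, already split into chunks. In a real deployment each
# chunk would be mapped on a different machine; here they run one by one.
chunks = ["the cat sat", "the dog sat", "the end"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(shuffle(mapped))
```

Because each call to `map_phase` depends only on its own chunk, the map step can be distributed without coordination, which is what makes the model suitable for large data sets.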
2004 Jeff Hawkins and Sandra Blakeslee introduce the concept of Hierarchical Temporal Memory (HTM), which can be regarded as a theory or model of intelligence that adheres to biological limitations. This concept is extensively explained in their book On Intelligence.
2005 The third rise of neural networks (NN) begins with the conjunction of many different discoveries from past and present by recent mavens Geoffrey Hinton, Yoshua Bengio, Yann LeCun, Andrew Ng, and other valuable older researchers. This is a time when a number of factors come together to enable a new wave of progress in NN. These factors include the availability of large datasets, such as the ImageNet dataset; the development of powerful computing hardware, such as GPUs; and the development of new algorithms for training NN, such as backpropagation. The combination of these factors leads to a rapid increase in the performance of NN. NN begin to achieve state-of-the-art results in a wide variety of tasks, including image classification, natural language processing, and speech recognition.[18]
2006 Concept development Deep learning British-Canadian cognitive psychologist and computer scientist Geoffrey Hinton introduces the term "deep learning" to describe a set of new algorithms that enable computers to analyze and recognize objects and text within images and videos. This development marks a significant advancement in the field of neural networks and would go on to become a prominent and widely adopted technology in various industries.[7][14][4]
2006 Competition Face Recognition Grand Challenge The Face Recognition Grand Challenge (FRGC) is held by the National Institute of Standards and Technology (NIST) to evaluate the state-of-the-art in face recognition technology. It would become a landmark event in the field of face recognition, helping to accelerate the development of new and more accurate face recognition algorithms. The FRGC uses a variety of data sets, including 3D face scans, iris images, and high-resolution face images. The results of the FRGC would show that the new algorithms are significantly more accurate than the facial recognition algorithms from 2002 and 1995. The FRGC would help to establish face recognition as a viable technology for a variety of applications. The results of the FRGC would also be used to improve the accuracy of face recognition algorithms in commercial products.[2]
2006 Big data processing This is a significant year in the development of big data processing, as it sees the release of Hadoop, an open-source software framework that allows for the distributed processing of large data sets across clusters of computers. Hadoop was developed by Doug Cutting and Mike Cafarella at the Apache Software Foundation. It is based on the MapReduce programming model, which was originally developed by Google. MapReduce is a programming model that breaks down a large data processing task into a series of smaller tasks that can be run in parallel on a cluster of computers. This makes it possible to process very large data sets that would be too large to process on a single computer. Hadoop would become a widely used framework for big data processing. It would be used by a variety of organizations, including Google, Facebook, and Yahoo.[27]
2006 Software release RapidMiner RapidMiner is first released by Ingo Mierswa and Ralf Klinkenberg. It is a data mining and machine learning software platform. RapidMiner is a powerful tool for data mining and machine learning tasks. It is easy to use and has a wide range of features. RapidMiner would be used by a wide range of companies and organizations, including Google, Amazon, and IBM.
2007 Scientific development Long Short-Term Memory A significant breakthrough occurs in the field of speech recognition with the introduction of a neural network architecture called Long Short-Term Memory (LSTM), which demonstrates superior performance compared to more traditional speech recognition programs at the time.[2]
2007 (June) Scikit-learn scikit-learn is released by David Cournapeau, Gael Varoquaux, and others. It is a free and open-source machine learning library for Python. Scikit-learn would become a popular choice for machine learning practitioners because it is easy to use, well-documented, and has a wide range of features. It includes implementations of a variety of machine learning algorithms, including support vector machines, decision trees, random forests, and k-nearest neighbors.[68]
2007 Software release Theano Theano is initially released. It is an open source Python library that allows users to easily make use of various machine learning models.[69]
2008 (January 11) Software release Pandas American software developer Wes McKinney, then working as a researcher at AQR Capital Management, releases the first version of pandas, a software library written for the Python programming language for data manipulation and analysis. pandas is fast, efficient, easy to use, and well-documented. The name pandas is a play on the phrase "panel data", a type of data commonly used in statistical analysis. pandas would become one of the most popular data analysis libraries in the Python ecosystem, used by a wide range of companies and organizations, including Google, Facebook, and Amazon, as well as by many academic researchers.[70]
2008 Scientific development Isolation Forest The Isolation Forest (iForest) algorithm is initially proposed by Fei Tony Liu, Kai Ming Ting and Zhi-Hua Zhou.[71]
2008 Encog Encog is created as a pure-Java/C# machine learning framework to support genetic programming, NEAT/HyperNEAT, and other neural network technologies.[72]
2009 (April 7) Software release Apache Mahout Apache Mahout is first released.[73]
2010 (April) Kaggle, a website that serves as a platform for machine learning competitions, is launched.[74][75]
2010 Microsoft releases the Kinect, a motion-sensing input device that can track 20 human features at a rate of 30 times per second. This allows people to interact with the computer via movements and gestures. The Kinect is originally developed for the Xbox 360 gaming console, but it would go on to be used for a variety of other applications, including healthcare and education. It can be used to play games that require physical movement, such as Dance Central and Kinect Sports. It can also be used for rehabilitation therapy, as it can track the movements of patients and provide feedback on their progress. In the education space, the Kinect can be used to help students learn languages or math, as it can track their movements and provide feedback on their answers.[7]
2010 (May 20) Software release Accord.NET Accord.NET is initially released.[76]
2010 George Konidaris, Scott Kuindersma, Andrew Barto, and Roderic Grupen introduce a hierarchical reinforcement learning algorithm called Constructing skill trees (CST), which can build skill trees from a set of sample solution trajectories obtained from demonstration. CST works by first segmenting the demonstration trajectories into a set of primitive skills. These skills are then combined to form a skill tree, with each node in the tree representing a different skill. The skill tree is then used to guide the agent's exploration of the environment, and to help the agent learn new skills. CST would be shown to be effective in a variety of domains, including robotics, video games, and board games. It is a promising approach to hierarchical reinforcement learning, and it is likely to be used in a variety of applications in the future.
2011 Achievement Beating Humans in Jeopardy Using a combination of machine learning, natural language processing and information retrieval techniques, IBM's Watson beats two human champions in a Jeopardy! competition.[77]
2012 Achievement Recognizing Cats on YouTube The Google Brain team, led by Andrew Ng and Jeff Dean, creates a neural network that learns to recognize cats by watching unlabeled images taken from frames of YouTube videos.[78][79] "In 2012, Google created a deep neural network which learned to recognize the image of humans and cats in YouTube videos."[14]
2012 Google's X Lab develops a machine learning algorithm that can identify cat videos on YouTube. The algorithm was trained on a dataset of manually labeled videos. It works by extracting features from videos and training a classifier to distinguish between videos that contain cats and videos that do not. The algorithm can be used to recommend cat videos to users or generate statistics about cat videos on YouTube. The cat-detection algorithm is a powerful example of the use of machine learning to solve real-world problems. It shows that machine learning can be used to solve problems in a variety of domains.[7]
2012 Alex Krizhevsky and his colleagues develop the AlexNet convolutional neural network, which wins the ImageNet Large Scale Visual Recognition Challenge and becomes one of the first successful applications of deep learning to image recognition. This would lead to a surge in the use of GPUs and convolutional neural networks (CNNs) in machine learning. AlexNet is among the first CNNs to be trained on GPUs at this scale, and it helps popularize the ReLU activation function. These two choices make it possible to train much larger and deeper CNNs than had been possible before, which would lead to significant improvements in the performance of CNNs on a variety of tasks.[2]
2012 Special Interest Group on Knowledge Discovery and Data Mining
2012 (March 12) mlpy is released by Richard Arp and Ralf Klinkenberg. It is a Python module for machine learning. mlpy is a free and open-source library that provides a wide range of machine learning algorithms, including support vector machines, decision trees, random forests, and k-nearest neighbors. It also includes a number of utility functions for data manipulation and visualization.[80]
2013 International Conference on Learning Representations
2014 Leap in Face Recognition Facebook researchers publish their work on DeepFace, a system that uses neural networks to identify faces with 97.35% accuracy. The results are an improvement of more than 27% over previous systems and rival human performance.[81] "Facebook develops DeepFace, a software algorithm that is able to recognize or verify individuals on photos to the same level as humans can."[7] "DeepFace was a deep neural network created by Facebook, and they claimed that it could recognize a person with the same precision as a human can do."[14]
2014 (May 26) Software release Apache Spark is first released by Matei Zaharia, Andrew Fire, and others at the AMPLab at UC Berkeley. Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming. It would become a popular choice for big data processing, and would be used by a wide variety of companies, including Uber, Airbnb, and Netflix.[27][82]
2014 Sibyl Researchers from Google detail their work on Sibyl,[83] a proprietary platform for massively parallel machine learning used internally by Google to make predictions about user behavior and provide recommendations.[84]
2014 The chatbot named "Eugene Goostman" successfully passes the Turing Test by convincing 33% of human judges that it is a human rather than a machine. This marks the first instance in which a chatbot achieves such a level of deception in the test.[14] It is developed by Russian-born Vladimir Veselov, Ukrainian-born Eugene Demchenko, and Russian-born Sergey Ulasen.[85]
2014 British artificial intelligence company DeepMind, founded in 2010 by Demis Hassabis, Shane Legg, and Mustafa Suleyman, is acquired by Google. DeepMind is known for its work in reinforcement learning, a type of machine learning that allows computers to learn by trial and error. DeepMind would develop a number of successful reinforcement learning systems, including AlphaGo, which would defeat a professional human Go player in 2016.[2]
2014 Ian Goodfellow and his colleagues invent Generative Adversarial Networks (GANs), a type of unsupervised learning algorithm that can be used to generate realistic images, text, and other data. GANs work by pitting two neural networks against each other: the generator, which creates new data, and the discriminator, which determines whether the data was generated by the generator or is real data. As the generator and discriminator are trained, they become better at their respective tasks, and GANs would be used to generate realistic images, text, and other data for a variety of applications.[3][86]
2015 (February) spaCy is released. It is a free, open-source natural language processing (NLP) library for Python. It is a powerful tool for NLP tasks such as text classification, named entity recognition, and part-of-speech tagging. It is fast, efficient, and easy to use. spaCy would be used by a wide range of companies and organizations, including Google, Facebook, and Amazon. It is also used by many academic researchers.[87][88]
2015 (March 27) Software release Keras Keras is first released. It is an open source software library designed to simplify the creation of deep learning models.[89]
2015 (June 9) Software release Chainer Chainer is released by Preferred Networks, Inc. in Japan. A deep learning framework written in Python, it would become a popular choice for deep learning research and development.[90][91]
2015 (October 8) Software release Apache SINGA Apache SINGA is first released. It is an open-source distributed machine learning library that facilitates the training of large-scale machine learning (especially deep learning) models over a cluster of machines. The SINGA project was initiated by the DB System Group at National University of Singapore in 2014, in collaboration with the database group of Zhejiang University. The goal of the project was to support complex analytics at scale, and make database systems more intelligent and autonomic. Apache SINGA would be used by a number of organizations, including Citigroup, NetEase, and Singapore General Hospital. It would become a popular choice for distributed deep learning because it is easy to use, scalable, and efficient.[92]
2015 Achievement Beating Humans in Go Google's AlphaGo program becomes the first Computer Go program to beat an unhandicapped professional human player[93] using a combination of machine learning and tree search techniques.[94]
2015 Software release Google releases TensorFlow, an open source software library for machine learning.[95]
2015 Amazon launches its own machine learning platform called Amazon Machine Learning (Amazon ML). It is a cloud-based service that allows developers to build, train, and deploy machine learning models without having to worry about the underlying infrastructure.[7][2]
2015 The Distributed Machine Learning Toolkit (DMTK) is first released. It is a Microsoft-developed open-source framework that enables the efficient distribution of machine learning problems across multiple computers. DMTK is based on the Apache Spark framework.[7]
2015 Over 3,000 AI and robotics researchers sign an open letter warning of the danger of autonomous weapons which select and engage targets without human intervention. The letter is endorsed by a number of high-profile figures, including Stephen Hawking, Elon Musk, and Steve Wozniak. The letter states that autonomous weapons pose a serious threat to humanity. They could be used to kill without discrimination, and they could be difficult to control. The letter calls for a ban on the development and use of autonomous weapons. The open letter would be influential in raising awareness of the dangers of autonomous weapons. It would also lead to a number of countries considering bans on the development and use of these weapons.[7]
2015 Google's speech recognition program has a 49% performance jump using CTC-trained LSTMs. This is a major milestone in the development of speech recognition technology, as it shows that CTC-trained LSTMs can be used to build speech recognition programs that are significantly more accurate than previous models. CTC-trained LSTMs would later be used in a variety of commercial speech recognition products, including Google's Voice Search and Amazon's Alexa.[2]
2015 Organization OpenAI OpenAI is founded as a non-profit research company by Elon Musk, Sam Altman, Ilya Sutskever, and others. The company's mission is to ensure that artificial general intelligence benefits all of humanity.[2]
2015 PayPal PayPal adopts a collaborative approach to combat fraud and money laundering on its platform by combining the efforts of humans and machines. Human detectives play a crucial role in identifying the patterns and traits associated with criminal behavior. This knowledge is then utilized by a machine learning program to effectively detect and eliminate fraudulent activity on the PayPal site. The synergy between human expertise and automated algorithms enhances PayPal's ability to identify and thwart fraudulent individuals.[8]
2016 (January 25) Microsoft Cognitive Toolkit Microsoft Cognitive Toolkit is initially released. It is an AI solution aimed at helping users to advance in their machine learning projects.[69]
2016 AlphaGo Google's artificial intelligence algorithm, AlphaGo, achieves a significant milestone by defeating a professional player in the complex Chinese board game Go. Considered more challenging than chess, Go is known for its intricate gameplay. AlphaGo, developed by Google DeepMind, emerges victorious in all five games of a Go competition against top players. It first defeats Lee Sedol, the world's second-ranked player, and later goes on to defeat Ke Jie, the game's number one player in 2017.[14][7]
2016 Software release FBLearner Flow Facebook details FBLearner Flow, an internal software platform that allows Facebook software engineers to easily share, train and use machine learning algorithms.[96] FBLearner Flow is used by more than 25% of Facebook's engineers, more than a million models have been trained using the service and the service makes more than 6 million predictions per second.[97]
2016 (October) PyTorch PyTorch is first released by Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, and others. It is an open-source machine learning framework that is based on the Torch library. Torch is a scientific computing library that is used for deep learning research. PyTorch would become a popular choice for deep learning research and development. It is easy to use and it is very flexible. It is also well-supported by a large community of developers.[98]
2017 Alphabet's Jigsaw team develops an intelligent system to combat online trolling. This system is designed to learn and identify trolling behavior by analyzing millions of comments from various websites. The algorithms behind the system have the potential to assist websites with limited moderation resources in detecting and addressing online harassment.[14][8]
2017 (May 1) CellCognition CellCognition[99][100]

Visual data

Google Trends

The image below shows Google Trends data for Machine learning (Field of study), from January 2004 to March 2021, when the screenshot was taken. Interest is also ranked by country and displayed on a world map.[101]

Machine learning gt.png

Google Ngram Viewer

The chart below shows Google Ngram Viewer data for Machine learning, from 1950 to 2019.[102]

Machine learning ngram.png

Wikipedia Views

The chart below shows pageviews of the English Wikipedia article Machine learning, on desktop from December 2007, and on mobile-web, desktop-spider, mobile-web-spider and mobile app, from July 2015; to February 2021.[103]

Machine learning wv.png

See also

Meta information on the timeline

How the timeline was built

The initial version of the timeline was written by User:Issa.

Funding information for this timeline is available.

Feedback and comments

Feedback for the timeline can be provided at the following places:

  • FIXME

What the timeline is still missing

Timeline update strategy

See also

External links

References

  1. Firican, George (31 January 2022). "The history of Machine Learning". LightsOnData. Retrieved 5 July 2023. 
  2. "A Brief History of Machine Learning". dataversity.net. Retrieved 20 February 2020.
  3. "A History of Machine Learning and Deep Learning". import.io. Retrieved 21 February 2020.
  4. "A brief history of the development of machine learning algorithms". subscription.packtpub.com. Retrieved 25 February 2020.
  5. "A BRIEF HISTORY OF MACHINE LEARNING". provalisresearch.com. Retrieved 21 February 2020. 
  6. "What is Machine Learning?". mlplatform.nl. Retrieved 25 February 2020. 
  7. "A Short History of Machine Learning". forbes.com. Retrieved 20 February 2020.
  8. "A history of machine learning". cloud.withgoogle.com. Retrieved 21 February 2020.
  9. Bayes, Thomas (1 January 1763). "An Essay towards solving a Problem in the Doctrine of Chances" (PDF). Philosophical Transactions. 53: 370–418. doi:10.1098/rstl.1763.0053. Retrieved 15 June 2016.
  10. "Jacquard Loom, 1934 - The Henry Ford". www.thehenryford.org. Retrieved 14 June 2023. 
  11. "History of Machine Learning". medium.com. Retrieved 25 February 2020.
  12. Legendre, Adrien-Marie (1805). Nouvelles méthodes pour la détermination des orbites des comètes (in French). Paris: Firmin Didot. p. viii. Retrieved 13 June 2016. 
  13. O'Connor, J J; Robertson, E F. "Pierre-Simon Laplace". School of Mathematics and Statistics, University of St Andrews, Scotland. Retrieved 15 June 2016. 
  14. "History of Machine Learning". javatpoint.com. Retrieved 21 February 2020.
  15. Domingos, Pedro (22 September 2015). The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World (1st ed.). Basic Books. 
  16. Hayes, Brian. "First Links in the Markov Chain". American Scientist. Sigma Xi, The Scientific Research Society (March–April 2013): 92. doi:10.1511/2013.101.1. Retrieved 15 June 2016. Delving into the text of Alexander Pushkin’s novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants. On January 23, 1913, he summarized his findings in an address to the Imperial Academy of Sciences in St. Petersburg. His analysis did not alter the understanding or appreciation of Pushkin’s poem, but the technique he developed—now known as a Markov chain—extended the theory of probability in a new direction. 
  17. Bernhardt, Chris (2016). "Turing's Vision: The Birth of Computer Science". The MIT Press. 
  18. "Brief History of Machine Learning". erogol.com. Retrieved 24 February 2020.
  19. Turing, Alan (October 1950). "COMPUTING MACHINERY AND INTELLIGENCE". MIND. 59 (236): 433–460. doi:10.1093/mind/LIX.236.433. Retrieved 8 June 2016. 
  20. Crevier 1993, pp. 34–35 and Russell & Norvig 2003, p. 17
  21. McCarthy, John; Feigenbaum, Ed. "Arthur Samuel: Pioneer in Machine Learning". AI Magazine (3). Association for the Advancement of Artificial Intelligence. p. 10. Retrieved 5 June 2016. 
  22. Koch, Robert (1 September 2022). "History of Machine Learning - A Journey through the Timeline". clickworker.com. Retrieved 3 July 2023.
  23. Rosenblatt, Frank (1958). "THE PERCEPTRON: A PROBABILISTIC MODEL FOR INFORMATION STORAGE AND ORGANIZATION IN THE BRAIN" (PDF). Psychological Review. 65 (6): 386–408. 
  24. Mason, Harding; Stewart, D; Gill, Brendan (6 December 1958). "Rival". The New Yorker. Retrieved 5 June 2016. 
  25. Bheemaiah, Kariappa; Esposito, Mark; Tse, Terence (3 May 2017). "What is machine learning?". The Conversation. Retrieved 3 July 2023. 
  26. "Seventy years of highs and lows in the history of machine learning". fastcompany.com. Retrieved 25 February 2020. 
  27. "History of deep machine learning". medium.com. Retrieved 21 February 2020.
  28. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  29. Cohen, Harvey. "The Perceptron". Retrieved 5 June 2016. 
  30. Colner, Robert. "A brief history of machine learning". SlideShare. Retrieved 5 June 2016. 
  31. Seppo Linnainmaa (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master's Thesis (in Finnish), Univ. Helsinki, 6-7.
  32. Seppo Linnainmaa (1976). Taylor expansion of the accumulated rounding error. BIT Numerical Mathematics, 16(2), 146-160.
  33. Griewank, Andreas (2012). Who Invented the Reverse Mode of Differentiation?. Optimization Stories, Documenta Mathematica, Extra Volume ISMP (2012), 389-400.
  34. Griewank, Andreas and Walther, A.. Principles and Techniques of Algorithmic Differentiation, Second Edition. SIAM, 2008.
  35. Jürgen Schmidhuber (2015). Deep learning in neural networks: An overview. Neural Networks 61 (2015): 85-117. ArXiv
  36. Jürgen Schmidhuber (2015). Deep Learning. Scholarpedia, 10(11):32832. Section on Backpropagation
  37. Dempster, A.P.; Laird, N.M.; Rubin, D.B. (1977). "Maximum Likelihood from Incomplete Data via the EM Algorithm". Journal of the Royal Statistical Society, Series B. 39 (1): 1–38. 
  38. "Rise of the machines". mydigitalpublication.com. Retrieved 5 July 2023. 
  39. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  40. Fukushima, Kunihiko (1980). "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position" (PDF). Biological Cybernetics. 36: 193–202. doi:10.1007/bf00344251. Retrieved 5 June 2016.
  41. Le Cun, Yann. "Deep Learning". Retrieved 5 June 2016. 
  42. Linde, Y.; Buzo, A.; Gray, R. (1980). "An Algorithm for Vector Quantizer Design". IEEE Transactions on Communications. 28: 84–95. doi:10.1109/TCOM.1980.1094577. 
  43. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  44. Hopfield, John (April 1982). "Neural networks and physical systems with emergent collective computational abilities" (PDF). Proceedings of the National Academy of Sciences of the United States of America. 79: 2554–2558. doi:10.1073/pnas.79.8.2554. Retrieved 8 June 2016. 
  45. Bozinovski, S. (1982). "A self-learning system using secondary reinforcement" . In Trappl, Robert (ed.). Cybernetics and Systems Research: Proceedings of the Sixth European Meeting on Cybernetics and Systems Research. North Holland. pp. 397–402.
  46. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  47. Rumelhart, David; Hinton, Geoffrey; Williams, Ronald (9 October 1986). "Learning representations by back-propagating errors" (PDF). Nature. 323: 533–536. doi:10.1038/323533a0. Retrieved 5 June 2016. 
  48. "A brief history of machine learning". slideshare.net. Retrieved 24 February 2020. 
  49. Dehaene S, Changeux JP. Experimental and theoretical approaches to conscious processing. Neuron. 2011 Apr 28;70(2):200-27.
  50. Changeux JP, Dehaene S. Hierarchical neuronal modeling of cognitive functions: from synaptic transmission to the Tower of London. Comptes Rendus de l'Académie des Sciences, Série III. 1998 Feb–Mar;321(2–3):241-7.
  51. Dehaene S, Changeux JP, Nadal JP. Neural networks that learn temporal sequences by selection. Proc Natl Acad Sci U S A. 1987 May;84(9):2727-31.
  52. "Machine Learning". springer.com. Retrieved 9 March 2020. 
  53. Watkins, Christopher (1 May 1989). "Learning from Delayed Rewards" (PDF).
  54. Markoff, John (29 August 1990). "BUSINESS TECHNOLOGY; What's the Best Answer? It's Survival of the Fittest". New York Times. Retrieved 8 June 2016. 
  55. Watkins, C.J.C.H. (1989), Learning from Delayed Rewards (PDF) (Ph.D. thesis), Cambridge University 
  56. Watkins, C.J.C.H.; Dayan, P. (1992). "Q-learning". Machine Learning. 8 (3–4): 279–292.
  57. Tesauro, Gerald (March 1995). "Temporal Difference Learning and TD-Gammon". Communications of the ACM. 38 (3). 
  58. Ho, Tin Kam (August 1995). "Random Decision Forests" (PDF). Proceedings of the Third International Conference on Document Analysis and Recognition. Montreal, Quebec: IEEE. 1: 278–282. ISBN 0-8186-7128-9. doi:10.1109/ICDAR.1995.598994. Retrieved 5 June 2016. 
  59. Cortes, Corinna; Vapnik, Vladimir (September 1995). "Support-vector networks" (PDF). Machine Learning. Kluwer Academic Publishers. 20 (3): 273–297. ISSN 0885-6125. doi:10.1007/BF00994018. Retrieved 5 June 2016. 
  60. Marr, Bernard. "A Short History of Machine Learning - Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  61. Hochreiter, Sepp; Schmidhuber, Jürgen (1997). "LONG SHORT-TERM MEMORY" (PDF). Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. 
  62. LeCun, Yann; Cortes, Corinna; Burges, Christopher. "THE MNIST DATABASE of handwritten digits". Retrieved 16 June 2016. 
  63. Breunig, M. M.; Kriegel, H.-P.; Ng, R. T.; Sander, J. (2000). LOF: Identifying Density-based Local Outliers (PDF). Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. SIGMOD. pp. 93–104. ISBN 1-58113-217-4. doi:10.1145/335191.335388. 
  64. Friedman, Jerome; Hastie, Trevor; Tibshirani, Robert (2000). "Additive logistic regression: a statistical view of boosting". Annals of Statistics. 28 (2): 337–407. doi:10.1214/aos/1016218223. 
  65. Cui Yu, Beng Chin Ooi, Kian-Lee Tan and H. V. Jagadish Indexing the distance: an efficient method to KNN processing, Proceedings of the 27th International Conference on Very Large Data Bases, Rome, Italy, 421-430, 2001.
  66. Collobert, Ronan; Bengio, Samy; Mariethoz, Johnny (30 October 2002). "Torch: a modular machine learning software library" (PDF). Retrieved 5 June 2016.
  67. Ham, Ji Hun; Daniel D. Lee; Lawrence K. Saul (2003). "Learning high dimensional correspondences from low dimensional manifolds" (PDF). Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003). 
  68. "What is scikit-learn ?". njtrainingacademy.com. Retrieved 5 March 2020. 
  69. "Sharing is Caring with Algorithms". towardsdatascience.com. Retrieved 8 March 2020.
  70. "Python's pandas library is on its way to v.1.0.0 – first release candidate has arrived". jaxenter.com. Retrieved 9 March 2020. 
  71. Liu, Fei Tony; Ting, Kai Ming; Zhou, Zhi-Hua (December 2008). "Isolation Forest". 2008 Eighth IEEE International Conference on Data Mining: 413–422. ISBN 978-0-7695-3502-9. doi:10.1109/ICDM.2008.17. 
  72. "Encog Machine Learning Framework". heatonresearch.com. Retrieved 8 March 2020. 
  73. "Apache Mahout". people.apache.org. Retrieved 9 March 2020. 
  74. "About". Kaggle. Kaggle Inc. Retrieved 16 June 2016. 
  75. Simon, Phil. Too Big to Ignore: The Business Case for Big Data. 
  76. "Accord.NET Framework – An extension to AForge.NET". crsouza.com. Retrieved 9 March 2020.
  77. Markoff, John (17 February 2011). "Computer Wins on 'Jeopardy!': Trivial, It's Not". New York Times. p. A1. Retrieved 5 June 2016. 
  78. Le, Quoc; Ranzato, Marc’Aurelio; Monga, Rajat; Devin, Matthieu; Chen, Kai; Corrado, Greg; Dean, Jeff; Ng, Andrew (12 July 2012). "Building High-level Features Using Large Scale Unsupervised Learning". CoRR. arXiv:1112.6209.
  79. Markoff, John (26 June 2012). "How Many Computers to Identify a Cat? 16,000". New York Times. p. B1. Retrieved 5 June 2016. 
  80. "mlpy". mlpy.sourceforge.net. Retrieved 8 March 2020. 
  81. Taigman, Yaniv; Yang, Ming; Ranzato, Marc’Aurelio; Wolf, Lior (24 June 2014). "DeepFace: Closing the Gap to Human-Level Performance in Face Verification". Conference on Computer Vision and Pattern Recognition. Retrieved 8 June 2016. 
  82. "Popular Big Data Engine Apache Spark 2.0 Released". adtmag.com. Retrieved 8 March 2020. 
  83. Canini, Kevin; Chandra, Tushar; Ie, Eugene; McFadden, Jim; Goldman, Ken; Gunter, Mike; Harmsen, Jeremiah; LeFevre, Kristen; Lepikhin, Dmitry; Llinares, Tomas Lloret; Mukherjee, Indraneel; Pereira, Fernando; Redstone, Josh; Shaked, Tal; Singer, Yoram. "Sibyl: A system for large scale supervised machine learning" (PDF). Jack Baskin School Of Engineering. UC Santa Cruz. Retrieved 8 June 2016. 
  84. Woodie, Alex (17 July 2014). "Inside Sibyl, Google's Massively Parallel Machine Learning Platform". Datanami. Tabor Communications. Retrieved 8 June 2016. 
  85. "The Turing Test Is Not What You Think It Is". WNYC. Retrieved 4 July 2023.
  86. Goodfellow, Ian; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). Generative Adversarial Networks (PDF). Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014). pp. 2672–2680. 
  87. "A Little spaCy Food for Thought: Easy to use NLP Framework". towardsdatascience.com. Retrieved 5 March 2020. 
  88. "Introducing spaCy". explosion.ai. Retrieved 5 March 2020. 
  89. "Keras". news.ycombinator.com. Retrieved 5 March 2020. 
  90. "Big-in-Japan AI code 'Chainer' shows how Intel will gun for GPUs". The Register. 2017-04-07. Retrieved 8 March 2020. 
  91. "Deep Learning のフレームワーク Chainer を公開しました" [We have released Chainer, a deep learning framework] (in Japanese). 2015-06-09. Retrieved 8 March 2020.
  92. "Apache SINGA". singa.apache.org. Retrieved 8 March 2020. 
  93. "Google achieves AI 'breakthrough' by beating Go champion". BBC News. BBC. 27 January 2016. Retrieved 5 June 2016. 
  94. "AlphaGo". Google DeepMind. Google Inc. Retrieved 5 June 2016. 
  95. Dean, Jeff; Monga, Rajat (9 November 2015). "TensorFlow - Google's latest machine learning system, open sourced for everyone". Google Research Blog. Retrieved 5 June 2016. 
  96. Dunn, Jeffrey (10 May 2016). "Introducing FBLearner Flow: Facebook's AI backbone". Facebook Code. Facebook. Retrieved 8 June 2016. 
  97. Shead, Sam (10 May 2016). "There's an 'AI backbone' that over 25% of Facebook's engineers are using to develop new products". Business Insider. Allure Media. Retrieved 8 June 2016. 
  98. "PyTorch Releases Major Update, Now Officially Supports Windows". medium.com. Retrieved 8 March 2020. 
  99. "CellCognition Explorer". software.cellcognition-project.org. Retrieved 8 March 2020. 
  100. "A deep learning and novelty detection framework for rapid phenotyping in high-content screening". PMC 5687041. PMID 28954863. doi:10.1091/mbc.E17-05-0333.
  101. "Machine learning". Google Trends. Retrieved 11 March 2021. 
  102. "Machine learning". books.google.com. Retrieved 11 March 2021. 
  103. "Machine learning". wikipediaviews.org. Retrieved 11 March 2021.