# How Claude Shannon Helped Kick-start Machine Learning

## The “father of information theory” also paved the way for AI


Among the great engineers of the 20th century, who contributed the most to our 21st-century technologies? I say: Claude Shannon.

Shannon is best known for establishing the field of information theory. In a 1948 paper, one of the greatest in the history of engineering, he came up with a way of measuring the information content of a signal and calculating the maximum rate at which information could be reliably transmitted over any sort of communication channel. The article, titled “A Mathematical Theory of Communication,” describes the basis for all modern communications, including the wireless Internet on your smartphone and even an analog voice signal on a twisted-pair telephone landline. In 1966, the IEEE gave him its highest award, the Medal of Honor, for that work.

If information theory had been Shannon’s only accomplishment, it would have been enough to secure his place in the pantheon. But he did a lot more.

A decade before, while working on his master’s thesis at MIT, he invented the logic gate. At the time, electromagnetic relays—small devices that use magnetism to open and close electrical switches—were used to build circuits that routed telephone calls or controlled complex machines. However, there was no consistent theory on how to design or analyze such circuits. The way people thought about them was in terms of the relay coils being energized or not. Shannon showed that Boolean algebra could be used to move away from the relays themselves, into a more abstract understanding of the function of a circuit. He used this algebra of logic to analyze, and then synthesize, switching circuits and to prove that the overall circuit worked as desired. In his thesis he invented the AND, OR, and NOT logic gates. Logic gates are the building blocks of all digital circuits, upon which the entire edifice of computer science is based.
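Shannon's insight can be illustrated with a minimal sketch: treat each relay contact as a Boolean variable (True = closed) and the circuit as a Boolean expression built from AND, OR, and NOT. The `lamp` circuit below is a hypothetical example, not one from Shannon's thesis.

```python
# Shannon's abstraction: a relay circuit modeled as a Boolean expression
# over switch states (True = contact closed, False = open).

def AND(a, b):
    # Two contacts in series: current flows only if both are closed.
    return a and b

def OR(a, b):
    # Two contacts in parallel: current flows if either path is closed.
    return a or b

def NOT(a):
    # A normally-closed contact: conducts exactly when its coil is NOT energized.
    return not a

# Hypothetical circuit: the lamp lights when x is closed and y is open,
# or when z is closed.
def lamp(x, y, z):
    return OR(AND(x, NOT(y)), z)
```

Analyzing or simplifying the circuit then becomes algebra on the expression rather than reasoning about coils and contacts.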

In 1950 Shannon published an article in Scientific American and also a research paper describing how to program a computer to play chess. He went into detail on how to design a program for an actual computer. He discussed how data structures would be represented in memory, estimated how many bits of memory would be needed for the program, and broke the program down into things he called subprograms. Today we would call these functions, or procedures. Some of his subprograms were to generate possible moves; some were to give heuristic appraisals of how good a position was.
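Shannon's decomposition can be sketched in modern terms: a move-generating subprogram and a heuristic-evaluation subprogram, combined by a minimax search over future positions. The toy game below (integer positions, not chess) is an assumption for illustration; Shannon's paper described the idea for chess, not this code.

```python
# Sketch of Shannon's program structure: one subprogram generates moves,
# another assigns a heuristic score, and minimax combines them.

def minimax(position, depth, maximizing, gen_moves, evaluate):
    """Score a position by searching `depth` plies ahead."""
    moves = gen_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)  # leaf: fall back to the heuristic
    scores = [minimax(m, depth - 1, not maximizing, gen_moves, evaluate)
              for m in moves]
    return max(scores) if maximizing else min(scores)

# Toy stand-in for a game: a position is an integer, each move adds 1 or 2,
# and the heuristic is the number itself (higher favors the maximizer).
def gen(n):
    return [n + 1, n + 2] if n < 4 else []

def ev(n):
    return n
```

With this toy game, `minimax(0, 2, True, gen, ev)` searches two plies: the maximizer picks the child whose minimizer-reply is least damaging.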


Shannon did all this at a time when there were fewer than 10 computers in the world. And they were all being used for numerical calculations. He began his research paper by speculating on all sorts of things that computers might be programmed to do beyond numerical calculations, including designing relay and switching circuits, designing electronic filters for communications, translating between human languages, and making logical deductions. Computers do all these things today. He gave four reasons why he had chosen to work on chess first, and an important one was that people believed that playing chess required “thinking.” Therefore, he reasoned, it would be a great test case for whether computers could be made to think.

Shannon suggested it might be possible to improve his program by analyzing the games it had already played and adjusting the terms and coefficients in its heuristic evaluations of the strengths of board positions it had encountered. There were no computers readily available to Shannon at the time, so he couldn’t test his idea. But just five years later, in 1955, Arthur Samuel, an IBM engineer who had access to computers as they were being tested before being delivered to customers, was running a checkers-playing program that used Shannon's exact method to improve its play. And in 1959 Samuel published a paper about it with “machine learning” in the title—the very first time that phrase appeared in print.
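The idea Shannon suggested, and Samuel implemented, can be sketched as adjusting the coefficients of a linear evaluation function so that the score of an earlier position moves toward the result observed later. The simple error-driven update below is an assumption for illustration; Samuel's actual scheme was considerably more elaborate.

```python
# Hedged sketch of coefficient adjustment for a linear position evaluator.
# `weights` are the tunable coefficients; `features` describe a position.

def evaluate(weights, features):
    """Linear heuristic: weighted sum of position features."""
    return sum(w * f for w, f in zip(weights, features))

def update_weights(weights, features, predicted, target, lr=0.1):
    """Nudge each coefficient in proportion to the prediction error."""
    error = target - predicted
    return [w + lr * error * f for w, f in zip(weights, features)]
```

After each game, replaying the positions and applying this update would pull the evaluator's predictions toward the outcomes actually observed, which is exactly the self-improvement loop Shannon proposed.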

So, let’s recap: information theory, logic gates, non-numerical computer programming, data structures, and, arguably, machine learning. Claude Shannon didn’t bother predicting the future—he just went ahead and invented it, and even lived long enough to see the adoption of his ideas. Since his passing 20 years ago, we have not seen anyone like him. We probably never will again.

This article appears in the February 2022 print issue as “Claude Shannon’s Greatest Hits.”

The Conversation (3)
Goran Djukanovic, 20 Mar 2022

Hats off to Shannon, one of the greatest minds in human history.

However, regarding the sentence in this article, "In his thesis he invented the AND, OR, and NOT logic gates," it should have been noted that this was not an isolated and independent invention.

It is probably a lesser-known fact that NIKOLA TESLA is the acknowledged inventor of the electronic AND logic gate circuit, a critical element of every digital computer today, and he even patented it (claim filed in 1899). A number of sources confirm it, e.g. https://www.computer.org/csdl/magazine/dt/2007/06/mdt2007060624/13rRUy0ZzTA, http://what-when-how.com/Tutorial/topic-203v31/The-History-of-Visual-Magic-in-Computers-180.html etc.

Scott Davidson, 17 Feb 2022

Thanks for this column. Shannon was one of the greatest minds of the 20th century. The professor of the logic design class I took my sophomore year at MIT gave us the paper based on his MS, no doubt the most significant master's thesis in the history of technology. I worked in a lab across the hall from his office for my bachelor's thesis. Alas, he never seemed to be there.

Besides all the accomplishments you list, he was a good juggler too, and there is some writing about that in his collected works.

Andrew Commons, 28 Jan 2022

His contribution to cryptography seems to be ignored here.

His originally classified paper, "A Mathematical Theory of Cryptography," was published in 1945.

https://www.iacr.org/museum/shannon/shannon45.pdf

The declassified version, "Communication Theory of Secrecy Systems," was published in 1949.

https://web.archive.org/web/20120120001953/http://www.alcatel-lucent.com/bstj/vol28-1949/articles/bstj28-4-656.pdf

This was a year after "A Mathematical Theory of Communication," but there is a chicken-and-egg problem: did his cryptography research drive the more generalized theory?

He was invited to be a member of the NSA Scientific Advisory Board in 1954:

https://www.nsa.gov/Portals/75/documents/news-features/declassified-documents/friedman-documents/correspondence/FOLDER_030/41788649082765.pdf

And was a documented member of the Mathematics Panel in 1956:

https://www.nsa.gov/portals/75/documents/news-features/declassified-documents/friedman-documents/panel-committee-board/FOLDER_165/41895609093452.pdf

I would assume that other contributions in this area may still be classified.