Neural Networks and Beyond
Deep learning. Machine learning. Neural networks. All buzzwords that conjure visions of the frontiers of science, the crème de la crème of technology, and, maybe for some, even a utopian (or dystopian) future for humanity. These words also served as the perfect response when my father, who was paying thousands of dollars for my education, asked skeptically what type of job cognitive science could get you. Even though much of the public does not know the exact denotation or textbook meaning of these terms, the buzz from the media gives them a connotation that evokes these emotionally charged images. Even when I first knew I wanted to study cognitive science or computer science, I thought of the field only in broad and glamorized terms: the mixing of the capabilities of the human mind with the vast power of technology and computation. So besides answering loaded career questions from family and friends, what are neural networks good for?
There is the lingering question of whether researchers are scratching the surface of emulating consciousness and general intelligence, or whether the entire concept is blown out of proportion by glamorized notions of neural networks and deep learning like my own. If the latter is true, then what is the next great expansion of artificial intelligence, and how far are we from realizing the capacity of the human mind?
Neural Networks: Application, History, and How They Work
Whether or not neural networks are capable of extrapolation and creation does not take away from how transformative a tool they are or how impactful the rapidly evolving field of deep learning is. Neural networks can model an incredible array of mathematical and statistical problems, including speech recognition, facial and image classification, natural language processing, and many other forms of unstructured data, some of which we are still discovering. Neural networks fall under the umbrella of artificial intelligence, and more specifically under machine learning, a family of algorithms that learn from past experience. They were first proposed in 1943 by Warren McCulloch and Walter Pitts as a bright and promising idea combining neuroscience and technology, but the initial spark died down because researchers lacked the processing power of modern GPUs and the sheer volume of data we have today. Tomaso Poggio, Professor of Brain and Cognitive Sciences at MIT, observed that "ideas in science are a bit like epidemics of viruses." His metaphor of periodicity applies to many ideas in technology, and neural networks are no exception. The idea rose and fell throughout the second half of the twentieth century, not truly succeeding until its most recent resurgence in the last 10 to 15 years.
Neural networks are modeled loosely on the human brain, with extra emphasis on 'loosely.' A node, or 'neuron,' is a computational unit that holds one or more weighted values and acts as a miniature function. Nodes are organized into layers, and layers are stacked to comprise a network. Networks have blossomed from a single layer of nodes into architectures with as many as 100 layers. The first layer is the input layer, and the output layer holds the resulting transformation once the data has propagated through the middle 'hidden' layers. That is what the "deep" in "deep learning" refers to: the depth of the network's hidden layers.
So what is the 'magic' that goes on within these hidden layers? It turns out that magic looks a lot more like calculus than anything else. The full depth of the transformations the data undergoes as it is fed forward through these layers is a complicated but beautiful arrangement of millions or even billions of numbers, in the form of matrices and vectors propagating through nodes. Attaining successful outputs through this process is not magical either: it requires a vast amount of initial data to train on, along with an enormous amount of computation and trial and error to fine-tune each step of the process.
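To make that arithmetic concrete, here is a minimal sketch, a toy example of our own rather than any particular production network, of data propagating forward through one hidden layer: each layer is just a matrix of weights multiplying a vector, followed by a simple nonlinearity at each node.

```python
import numpy as np

def relu(x):
    # A common nonlinearity applied at each node: negatives become zero
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
# A toy network: 4 inputs -> 5 hidden nodes -> 3 outputs
W1 = rng.normal(size=(5, 4))   # weights of the hidden layer
W2 = rng.normal(size=(3, 5))   # weights of the output layer

def forward(x):
    # Feeding forward is matrix-vector multiplication plus a nonlinearity
    hidden = relu(W1 @ x)
    return W2 @ hidden

x = rng.normal(size=4)         # one input example
y = forward(x)
print(y.shape)                 # → (3,)
```

Training is the reverse trip: calculus (the chain rule) tells us how to nudge every entry of `W1` and `W2` so the outputs creep toward the right answers.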
Drawbacks of Neural Networks
Deep learning very effectively analyzes large amounts of unstructured data and extracts representations without humans having to explicitly program specific features. Nevertheless, its versatility comes at a high cost. The training process consumes enormous time and energy and is therefore expensive both monetarily and environmentally. Additionally, deep learning only works if the right kind of data is fed in: "if you put garbage in, you'll get garbage out" holds true. The effectiveness of the machine is highly dependent on the quality of the data; if there are biases in the data source, the machine will exhibit those biases. According to a 2019 study from the Georgia Institute of Technology, self-driving cars may be better at detecting pedestrians with light skin because the models were mostly trained on examples of light-skinned pedestrians. Human bias has seeped into image recognition and many other automated decision-making systems we put our faith in.
As society continues to improve at managing data, deep learning training has the potential to become more streamlined. However, deep learning is held back by the lack of interpretability in its hidden layers. Neural networks can serve as Automated Essay Scoring (AES) systems that score with high accuracy and fairness compared to human graders. But intellectually, there's something unsatisfying about neural nets. Enough training may revise a network's settings until it can usefully classify data, but what do those settings actually mean? Human decision-making is far from logically perfect, but at least we can rationalize and explain our thought processes. Despite not knowing the workings of the human mind, we all have one (presumably), and therefore have significant empathy for and understanding of each other. Inspecting the weights of individual connections won't yield this same understanding, and in real-life applications such as healthcare and medical operations, people tend to want this clarity. The black-box design requires trust in the underlying causation, a trust that will likely limit neural networks' application when lives and other real-life consequences are at stake.
Lastly, as exciting as neural networks sound, they are far from the general intelligence that AI chases. The information a model attains through training cannot be compared to the common sense humans develop through life experience. A neural network can far surpass humans in one specific area: a person could spend their entire life mastering an arcade game, and with ease, a deep learning algorithm could steal that precious high score. But with even the smallest tweak to the rules of the game, the neural network might be back at square one, whereas the human would successfully adapt. How might we give machines this flexibility?
Giving Life to Neural Networks
"Supervision is the opium of AI researchers," one of our professors recently remarked, urging us toward unsupervised learning, in which the model discovers patterns and structure on its own. This is the beauty of studying a cutting-edge field: no sooner were we tested on a concept than we were encouraged to move beyond it. Many sources agree that artificial general intelligence, in contrast to static, statistical models, must be able to "come alive" and make predictions in a dynamic world with constantly changing sensory inputs. What better way to introduce three such directions, neuromorphic computing, biologically constrained AI, and reinforcement learning, than through parallels between computer science and the brain?
1) Neuromorphic Computing
One rising "rockstar" is neuromorphic computing. In some recent chip designs, silicon neurons receive and send spikes just as real neurons do. There is even plasticity: connections that fire frequently are reinforced over those that fire rarely. Impressively, Intel's Loihi 2 chip can use the length of the intervals between spikes to represent extra information, another property of our brains.
The chip also naps. When exposed to noise like that which our brains produce during sleep, a previously unstable network was able to "recharge" and perform better on classification tasks. The chip is also very fast compared to conventional chips like GPUs, as its interconnectedness reduces memory accesses and increases efficiency. After "smelling" only a single sample of each of 10 chemicals passing through wind tunnels, the chip was able to identify them, compared to the thousands of samples deep learning would need! Not surprisingly, the biggest immediate application of such chips is mobile robots that need to "consume minimal power" while navigating and learning about the world.
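For intuition about what a silicon neuron computes, here is the classic leaky integrate-and-fire model in a few lines of plain Python. This is a textbook simplification of the kind of dynamics chips like Loihi implement in hardware, not Intel's actual circuit; the threshold and leak values are arbitrary choices of ours.

```python
def lif_simulate(input_current, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire: the membrane potential accumulates input,
    leaks over time, and emits a spike (then resets) on crossing threshold."""
    v = 0.0
    spikes = []
    for i in input_current:
        v = leak * v + i          # leak a little, then integrate the input
        if v >= threshold:
            spikes.append(1)      # spike!
            v = 0.0               # reset after firing
        else:
            spikes.append(0)
    return spikes

# A steady input makes the neuron fire at regular intervals
print(lif_simulate([0.4] * 10))   # → [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
```

Notice that information lives in the *timing* of the spikes, not in a continuous activation value, which is exactly the property that lets spike-interval coding carry extra information.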
2) Biologically constrained AI (HTM model)
While neuromorphic chips wire the brain's structure in physically, we can also build real neurobiology into the networks themselves, an approach most famously advocated by the company Numenta and its founder Jeff Hawkins, who also founded the Redwood Center for Theoretical Neuroscience, part of Berkeley's Helen Wills Neuroscience Institute. Numenta's hierarchical temporal memory (HTM) model rivals the performance of leading deep learning models while being much more brain-inspired: its neuron model has dendrites, axons, synapses, and dendritic processing.
Rather than achieving plasticity through firing frequency like the chip, HTM does it through prediction. A neuron does not simply pass along action potentials; after receiving a signal on its dendrites, it enters a predictive state, and when input arrives, neurons in the predictive state fire first and inhibit their neighbors. Pathways whose predictions prove consistent with the input are reinforced.
Under this framework, our brains encode all data into a common format called "sparse distributed representations" (SDRs). In an SDR, nearby bit positions represent similar properties, so if we flip a bit, the description changes, but not radically. This differs from current encoders because each bit is actually interpretable.
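A toy illustration of why that robustness falls out of sparsity (the sizes here are made up for brevity; real SDRs use thousands of bits with only a few percent active): similarity between two SDRs is just the number of active bits they share, so corrupting a single bit barely moves the representation.

```python
import numpy as np

def overlap(a, b):
    # Similarity between two SDRs = number of shared active bits
    return int(np.sum(a & b))

n, active = 64, 8              # toy sizes
rng = np.random.default_rng(1)

def random_sdr():
    bits = np.zeros(n, dtype=int)
    bits[rng.choice(n, size=active, replace=False)] = 1
    return bits

cat = random_sdr()
noisy_cat = cat.copy()
noisy_cat[np.flatnonzero(cat)[0]] = 0   # corrupt one active bit

dog = random_sdr()

print(overlap(cat, noisy_cat))  # → 7 (still 7 of 8 bits shared)
print(overlap(cat, dog))        # two unrelated SDRs share few active bits
```

Because any two unrelated sparse patterns almost never collide by chance, an SDR degrades gracefully under noise instead of flipping to a different meaning.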
While not made of silicon, this model's physicality comes from its robust representation of movement. Like an animal brain, these networks have grid cells that map movements onto locations in reference frames. They mimic how our sensorimotor system works: receiving a location signal, matching it with object features, and creating representations of the object that allow for better prediction. The cortical columns thus know not only what they are looking at, but also where they are looking relative to the object. The result is an embodied mind that actively moves its sensors to build predictive models of the world.
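As a cartoon of this what-plus-where idea (a drastic simplification of our own, not HTM's actual grid-cell machinery): represent an object as a map from locations in its own reference frame to features, and let a "column" update its location from the movement signal and predict what it should sense next.

```python
# An object is a map from locations (in the object's own reference
# frame) to the features found there. The names below are invented.
coffee_mug = {(0, 0): "handle", (0, 1): "rim", (1, 0): "base"}

def move_and_predict(obj, location, movement):
    # Integrate the movement signal into a new location, then predict
    # the feature the sensor should encounter there
    new_loc = (location[0] + movement[0], location[1] + movement[1])
    return new_loc, obj.get(new_loc, "unknown")

sensor_at = (0, 0)                                      # start on the handle
sensor_at, predicted = move_and_predict(coffee_mug, sensor_at, (0, 1))
print(sensor_at, predicted)    # → (0, 1) rim
```

If the predicted feature matches what the sensor actually reports, the object hypothesis is confirmed; a mismatch is evidence the column is touching something else.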
3) Reinforcement Learning
Finally, we return to the software of the brain with an area of AI that increasingly influences how we view cognition: reinforcement learning. It is extremely powerful when combined with deep learning and has wide applications in everything from autonomous driving to finance to robotics, as well as in understanding how animals, including humans, make decisions. Like a child new to the ways of the world, a reinforcement learning agent doesn't know which states are good or what its actions imply. In some settings the agent starts with a preconceived notion of which experiences it is likely to encounter, but this is uncommon. And rather than just thinking through what it will do, it must learn by doing, deciding which action to take at each time step based on the reward and the observed environment.
One option is passive learning, where the agent follows a single "policy," or plan of action, and learns the values of the states it encounters to see how good that policy is. From a practical standpoint, this would be a pretty bad way to get through life, so instead the agent explores through active learning. Remarkably, with off-policy methods the agent can still settle on an optimal plan of action even if the actions it happened to select while exploring were poor ones.
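Here is a minimal sketch of active learning on a toy five-state corridor (the environment and constants like the learning rate are arbitrary choices of ours): the agent occasionally takes random exploratory actions while Q-learning's temporal-difference update estimates how good each state-action pair is.

```python
import random

# A toy 1-D world: states 0..4, reaching state 4 gives reward 1.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]            # move left or right

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, (1.0 if s2 == GOAL else 0.0)

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2
random.seed(0)

for episode in range(200):
    s = 0
    while s != GOAL:
        # Epsilon-greedy exploration: mostly exploit, sometimes try anything
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        # Temporal-difference update toward reward + discounted future value
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES)}
print(policy)   # states 0 through 3 learn to move right (+1) toward the goal
```

Note that the learned policy is good even though most of the agent's early actions were aimless, which is exactly the off-policy property described above.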
Finally, as expected of a powerful intelligence, the learning process itself can be made close to optimal. The key is regret minimization, which lets the agent explore methodically so that it can exploit the best rewards at the best times. (If you want to know how you can harness the power of these methods to become a better agent in your own life, I highly recommend Algorithms to Live By, by Brian Christian and Tom Griffiths.)
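One classic regret-minimizing strategy is the UCB1 rule for slot-machine-style "bandit" problems (a standard textbook algorithm, not something specific to the sources above): each arm gets an optimism bonus that shrinks the more that arm is tried, so the agent explores just enough before committing to the best one.

```python
import math, random

random.seed(0)
true_means = [0.3, 0.7]       # hidden payout rates of two slot machines
counts = [0, 0]               # how often each arm has been pulled
values = [0.0, 0.0]           # running mean reward per arm

def pull(arm):
    return 1.0 if random.random() < true_means[arm] else 0.0

for t in range(1, 1001):
    if 0 in counts:
        arm = counts.index(0)             # try every arm at least once
    else:
        # UCB1: estimated value + an exploration bonus that decays
        # as an arm accumulates pulls
        arm = max(range(2),
                  key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))
    r = pull(arm)
    counts[arm] += 1
    values[arm] += (r - values[arm]) / counts[arm]    # update running mean

print(counts)   # the better arm ends up pulled far more often
```

"Regret" here is the reward lost by not always pulling the best arm; UCB1 keeps it growing only logarithmically, which is about as methodical as exploration can get.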
The road to general intelligence is teeming with life. Reinforcement learning generalizes by getting better at adapting to unseen environments, the chip generalizes by constantly rewiring silicon circuitry, and the HTM model generalizes to a common cortical algorithm across senses and abstraction layers. Behold this dazzling dance between machine and biology, for this is how neural networks could express the sheer joy of being alive.
- Hardesty, L. (2017, April 14). Explained: Neural networks | MIT News | Massachusetts Institute of Technology. MIT News. Retrieved March 21, 2022, from https://news.mit.edu/2017/explained-neural-networks-deep-learning-0414
- Nielsen, M. (n.d.). Neural networks and deep learning. Neural networks and deep learning. Retrieved March 13, 2022, from http://neuralnetworksanddeeplearning.com/chap1.html
- Hoffman, J., Wilson, B., & Morgenstern, J. (2019). Predictive Inequity in Object Detection. arXiv:1902.11097 [cs.CV]
- Service, R. F. (2021, September 30). Learning curve. Science, 374(6563). https://www.science.org/content/article/new-brain-inspired-chips-could-soon-help-power-autonomous-robots-and-self-driving-cars
- Hole, K.J., Ahmad, S. A thousand brains: toward biologically constrained AI. SN Appl. Sci. 3, 743 (2021). https://doi.org/10.1007/s42452-021-04715-0
- UC Berkeley CS188 Intro to AI — Course Materials http://ai.berkeley.edu/lecture_slides.html
This article was written by Emily Moberly, who is a freshman undergraduate student at UC Berkeley studying Cognitive Science, and Anrui Gu, who is a junior undergraduate student at UC Berkeley studying Computer Science.
This article was edited by Jacob Marks, a junior undergraduate pre-medical student at UC Berkeley studying Cognitive Science, and Annabel Davis, a senior undergraduate student at UC Berkeley studying Cognitive Science.