Brains and Deep Nets

Are Brains Like Deep Neural Networks?

By Anil Ananthaswamy
Quanta Magazine, October 2020

Edited by Andy Ross

Deep neural networks are computing systems inspired by the neurological wiring of living brains. The deep networks best at classifying speech, music, and scents have architectures that parallel brain systems. So do deep nets that can look at a 2D scene and infer 3D structure.

Artificial neural networks use perceptrons to represent biological neurons. The networks have at least two layers of perceptrons, one for the input and one for the output. A deep neural network sandwiches hidden layers between input and output.
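
The layered structure described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the article: the function names and the hand-picked weights are assumptions, and the units use a simple threshold. The weights are chosen so that the network computes XOR, a function that a network without a hidden layer cannot represent.

```python
def perceptron(inputs, weights, bias):
    """One artificial neuron: fires (1) if the weighted sum of its inputs exceeds zero."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0

def layer(inputs, weight_rows, biases):
    """A layer is a list of perceptrons that all read the same inputs."""
    return [perceptron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, layers):
    """Pass the input values through each layer in turn."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations

# Hypothetical, hand-picked weights (a trained network would learn these).
# Hidden layer: unit 0 acts as OR, unit 1 acts as AND.
hidden = ([[1, 1], [1, 1]], [-0.5, -1.5])
# Output layer: fires when OR is on and AND is off, which is XOR.
output = ([[1, -1]], [-0.5])

results = [forward([a, b], [hidden, output])[0]
           for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(results)
```

Here the input layer is simply the list of input values, and the hidden layer sits between it and the single output unit.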

Deep nets can be trained to pick out patterns in data. An algorithm iteratively adjusts the strength of connections between perceptrons until the network associates a given input with the correct label. Once trained, a deep net can classify new input.
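
As a rough illustration of such iterative adjustment, the sketch below trains a single threshold unit with the classic perceptron learning rule. This is a deliberately simpler stand-in for the algorithms used to train deep nets, and the data points, function names, and learning rate are all assumptions made for the example.

```python
def predict(weights, bias, inputs):
    """Classify an input as 1 or 0 using the current connection strengths."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train(samples, epochs=100, rate=1.0):
    """Nudge the weights after every mistake until all labels are reproduced."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for inputs, label in samples:
            error = label - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if mistakes == 0:  # every training input now gets the correct label
            break
    return weights, bias

# Hypothetical labeled data: label 1 for points near (1, 1), 0 for points near (0, 0).
samples = [([0.1, 0.2], 0), ([0.3, 0.1], 0), ([0.9, 0.8], 1), ([0.7, 1.0], 1)]
weights, bias = train(samples)

# Once trained, the unit can classify inputs it has never seen.
new_high = predict(weights, bias, [0.8, 0.9])
new_low = predict(weights, bias, [0.2, 0.2])
```

A deep net adjusts many layers of connections at once rather than a single unit, but the loop has the same shape: predict, compare with the label, adjust, repeat.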

A convolutional neural network (CNN) is a deep net whose hidden layers apply a convolution filter to every portion of an image. As in the primate visual system, the early stages of the network capture the more basic features and the deeper stages capture the more complex ones. A CNN starts with random filter values and adjusts them during training.
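
To make "applying a filter to every portion of an image" concrete, here is a minimal sketch in plain Python. The tiny image and the hand-set edge filter are assumptions for illustration; a real CNN would learn its filter values. As in most CNN software, the filter is slid over the image without being flipped.

```python
def convolve2d(image, kernel):
    """Slide the kernel over every position where it fits and record the weighted sum."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Hypothetical 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-set filter that responds to a dark-to-bright vertical edge.
edge_filter = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, edge_filter)
for row in feature_map:
    print(row)  # each row is [0, 2, 0]: the filter responds only at the edge
```

The feature map is itself an image-like grid, so a deeper layer can apply further filters to it and respond to combinations of basic features.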

A team designed a CNN to classify sounds into speech and music. They found that networks with dedicated pathways after the input layer outdid networks that fully shared pathways. But a hybrid network did almost as well and matched up well against humans.

A CNN for auditory processing, whose intermediate layers corresponded to responses in the primary auditory cortex and whose deeper layers corresponded to higher areas in the auditory cortex, was good at predicting human brain activity.

A brain region called the fusiform face area (FFA) is specialized for the identification of faces. A team trained one deep net to recognize faces and one to recognize objects. The deep net for faces was bad at recognizing objects and vice versa. A single network trained on both tasks organized itself internally to segregate the processing of faces and objects in the later stages of the network. This is how the human visual system is organized.

A team designed a deep net to model the olfactory system of a fruit fly. They built their deep net with four layers and trained it. They found that it converged on much the same connectivity as seen in the fruit fly brain.

A team used a deep net to model the primate ventral visual stream. They used the model to design images intended to elicit high levels of activity in V4 neurons. When they showed these images to monkeys, the activity of the neural sites rose as the model predicted.

Metamers are physically distinct input signals that produce the same representation in a system. Two audio metamers have different waveforms but sound the same to a human. Using a deep-net model of the auditory system, a team designed metamers that activated different stages of the neural network in the same way the audio clips did.
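
The definition can be illustrated with a toy system. The sketch below is not the team's method, which worked on a deep-net model of hearing. It assumes a single hypothetical stage that averages neighboring samples, and shows two different signals that this stage cannot tell apart.

```python
def pooling_stage(signal):
    """Hypothetical processing stage: average each pair of neighboring samples."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

# Two physically distinct input signals (made-up values).
signal_a = [1, 3, 2, 2]
signal_b = [2, 2, 0, 4]

representation_a = pooling_stage(signal_a)
representation_b = pooling_stage(signal_b)

# The inputs differ, but the stage represents them identically,
# so they are metamers for this stage.
print(signal_a != signal_b, representation_a == representation_b)
```

Because the stage discards the difference between the two signals, anything downstream that relies only on its output must treat them as the same.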

Humans recognized the metamers that produced the same activation as the corresponding audio clips in the early stages of the neural network but did not recognize metamers with matching activations deeper in the network.

Teams aim to develop unsupervised deep nets that can reconstruct a 3D scene from 2D input and reason about causation. They begin with parameters that describe an object to be rendered on a background. A generative model creates a 3D scene from the parameters and produces a 2D image of that scene as viewed from a certain position. Using data from the model, a CNN predicts the likely parameters of a 3D scene from an unfamiliar 2D image.
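
The pipeline in the paragraph above (parameters, generative model, rendered image, learned inverse) can be sketched with a heavily simplified toy. Everything here is an assumption for illustration: the "scene" is a bright bar in a one-dimensional image, and a nearest-match lookup stands in for the CNN.

```python
def render(position, width, size=8):
    """Generative step: turn scene parameters into an image of a bright bar."""
    return [1.0 if position <= i < position + width else 0.0 for i in range(size)]

def build_training_set(size=8):
    """Use the generative model to produce (image, parameters) pairs."""
    pairs = []
    for width in range(1, 4):
        for position in range(0, size - width + 1):
            pairs.append((render(position, width, size), (position, width)))
    return pairs

def infer(image, training_set):
    """Recognition step: return the parameters of the closest rendered image.

    A nearest-match lookup is used here in place of a trained CNN.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, best_params = min(training_set, key=lambda pair: distance(pair[0], image))
    return best_params

training_set = build_training_set()

# An unfamiliar, noisy image that never appeared in the generated data.
unfamiliar = [0.1, 0.0, 0.9, 0.8, 0.2, 0.0, 0.1, 0.0]
position, width = infer(unfamiliar, training_set)
print(position, width)  # 2 2
```

The point of the design is that the recognition step never needs hand-labeled data: every training label comes from the generative model itself.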

A team tested the model by verifying its predictions about activity in the inferior temporal cortex of primates. They found that the last three layers of the network matched the last three layers of the primate face processing network.

AR Conclusion: Brains use such models.