Unit 7.0 - Artificial brains

What does a brain do? It is an organ capable of moving a body in an environment to fulfil the life mission of an entity. In other words, it needs to be able to sense the environment, process the information, and act on it. A brain is connected to a variety of sensors, for example vision, audition, and proprioception. It is also connected to a variety of actuators, for example muscles and vocal cords. The brain is the organ that processes the information from the sensors and decides what to do with the actuators. Interaction with the world thus involves a repeated sequence of sensing, processing, and acting.
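The sense, process, act cycle can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the "body" is a point on a line, the "sensor" reports the signed distance to a target, and the names `sense`, `process`, `act`, and `run_loop` are hypothetical, not part of any library.

```python
# Toy sense -> process -> act loop on a 1-D line (all names are hypothetical).

def sense(position, target):
    """Sensor: report the signed distance from the body to the target."""
    return target - position

def process(distance):
    """'Brain': decide on a step of -1, 0, or +1 toward the target."""
    if distance > 0:
        return 1
    if distance < 0:
        return -1
    return 0

def act(position, step):
    """Actuator: move the body by the chosen step."""
    return position + step

def run_loop(position, target, max_steps=100):
    """Repeat sense -> process -> act until the target is reached."""
    trajectory = [position]
    for _ in range(max_steps):
        distance = sense(position, target)
        step = process(distance)
        if step == 0:
            break  # nothing left to do: the body is on the target
        position = act(position, step)
        trajectory.append(position)
    return trajectory

print(run_loop(position=0, target=3))  # [0, 1, 2, 3]
```

A real brain runs this loop continuously and with far richer sensors and actuators, but the three stages keep the same order.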

Great - but how does a brain learn to do all that? What architecture does it need in order to do that? And how can we build an artificial brain that can do the same?

These are the questions we will try to answer in this unit.

Biological brains

Biological brains are the source of inspiration for artificial brains. They are the only example of a system that can do what we want to do with artificial brains. So, it is natural to look at them and try to understand how they work.

Biological brains are made of neurons. Neurons are the basic building blocks of the brain. They are connected to each other through synapses. The connections between neurons are not fixed, but they can change over time. This is the basis of learning in the brain.

Nobody knows the exact architecture of the brain. Many people have sketched diagrams of it, but likely none of our existing diagrams captures its full complexity. It is a very complex system, and we are still far from understanding it completely. However, we know some things. For example, we know that the brain is organized in areas, and that each area is largely specialized for a specific task. The visual cortex is specialized in processing visual information, the auditory cortex is specialized in processing auditory information, and so on.

We also know that the brain is organized in layers. The layers are connected to each other in a hierarchical way. The lower layers process the raw sensory information, while the higher layers process more abstract information.

Finally, we know that the brain is a recurrent system. This means that the information flows in loops. This is important because it allows the brain to process information over time.
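A minimal sketch of what a loop buys us, assuming a single hypothetical unit whose previous state is fed back into itself with a fixed `feedback` factor. This is not a model of real neurons; it only shows that a loop lets a past input keep influencing the present.

```python
# Toy recurrent unit: the state loops back into the next update.
# `run_recurrent` and `feedback` are hypothetical names for illustration.

def run_recurrent(inputs, feedback=0.5):
    """Feed each input into a unit whose previous state loops back in."""
    state = 0.0
    history = []
    for x in inputs:
        # New state = a fraction of the old state (the loop) + the new input
        state = feedback * state + x
        history.append(state)
    return history

history = run_recurrent([1.0, 0.0, 0.0, 1.0])
print(history)  # [1.0, 0.5, 0.25, 1.125]
```

The first input is still visible in the state three steps later, which is the sense in which recurrence supports processing over time.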

Learning in biological brains is performed through a process called synaptic plasticity. This is the ability of synapses to change their strength over time. When a synapse is strengthened, the activity of one neuron has a larger effect on the other. When a synapse is weakened, that effect becomes smaller. Hebbian learning is a learning rule that says that the synapse between two neurons that are active at the same time should be strengthened. Together, these changes are the basis of learning in the brain.
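The Hebbian rule is often written as a weight change proportional to the product of the two activities. The sketch below assumes binary activities (0 or 1) and a hypothetical learning rate of 0.1; the function name `hebbian_update` is made up for this example.

```python
# Plain Hebbian update for one synapse (hypothetical names and values).

def hebbian_update(weight, pre, post, learning_rate=0.1):
    """Return the new weight after one Hebbian step: w + lr * pre * post."""
    return weight + learning_rate * pre * post

# Activity of a presynaptic and a postsynaptic neuron over five time steps
pre_activity = [1, 0, 1, 1, 0]
post_activity = [1, 0, 0, 1, 1]

weight = 0.0
for pre, post in zip(pre_activity, post_activity):
    # The weight grows only on steps where both neurons are active
    weight = hebbian_update(weight, pre, post)

print(round(weight, 2))  # 0.2 (the neurons were co-active on two steps)
```

Note that this plain form can only strengthen a synapse. Weakening requires an extension of the rule, such as a decay term, which is not shown here.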

Artificial brains

Artificial brains are inspired by biological brains. Today the most popular artificial neural network that mimics the brain is the Transformer architecture. The Transformer is a deep neural network based on the idea of self-attention. Self-attention is a mechanism that allows the network to focus on different parts of the input sequence. This is similar to what the brain does when it processes information.
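To make self-attention concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It makes a strong simplifying assumption: queries, keys, and values are the input vectors themselves, whereas a real Transformer learns separate projections for each and uses multiple heads. The function names are hypothetical.

```python
import math

# Simplified self-attention: no learned projections, single head.

def softmax(scores):
    """Turn a list of scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Return (outputs, weights) for a sequence of equal-length vectors."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity of this position to every position, scaled by sqrt(dim)
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # Output is the attention-weighted average of all input vectors
        mixed = [sum(w * value[i] for w, value in zip(weights, vectors))
                 for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(sequence)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` shows how much one position "focuses" on every other position. The first vector attends more to itself and to the third vector than to the second, because it has no overlap with the second.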

The Transformer is the network that powers Large Language Models (LLMs) like GPT-4. These models are able to generate human-like text by processing a large amount of text data. They are able to do this because they are able to learn the structure of the text data and generate new text that is similar to the input data.

These models are able to learn “knowledge graphs” from text data. A knowledge graph is a graph that represents the relationships between entities in the world. For example, a knowledge graph can represent the relationships between people, places, and things. This is useful because it allows the model to generate text that is coherent and relevant to the input data.
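As a data structure, a knowledge graph is commonly stored as a list of (subject, relation, object) triples. The sketch below is only an explicit illustration of that structure, with hypothetical helper names. A language model does not store such a table; whatever relational knowledge it holds is spread across its weights.

```python
from collections import defaultdict

# A tiny explicit knowledge graph as (subject, relation, object) triples.
triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Marie Curie", "field", "physics"),
]

def build_graph(triples):
    """Index the triples by subject for quick lookup."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph

def neighbors(graph, entity):
    """Return the (relation, object) pairs that start at an entity."""
    return graph.get(entity, [])

graph = build_graph(triples)
print(neighbors(graph, "Marie Curie"))
# [('born_in', 'Warsaw'), ('field', 'physics')]
```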

But the Transformers in GPT-4 are only a part of the picture. First of all, they only understand text. We will need to make them multi-modal to be able to understand the world. Second, they are not able to act on the world. We will need to connect them to a body and actuators to be able to do that. In other words, they will need to be “embodied”: the sensing and the actions need to come from the robot body. Learning would be nearly impossible if we could not correlate perception and action. This requires the robot and its learning experience to be embodied.

Armed with a body and multi-modal sensors, large Transformers would now become Large World Models (LWMs). They would be able to learn a complex knowledge graph in the real world. This is what will make them closer to a real brain.

What about learning?

Learning occurs in two main modalities. First, an artificial brain needs to connect multi-modal sensory data together. This is done by performing self-supervised co-occurrence learning. This means that if two signals happen at the same time, they are treated as correlated.
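A very small sketch of co-occurrence learning, assuming two hypothetical sensory streams that have already been reduced to discrete labels, one label per time step. Simple pair counting stands in here for whatever learning rule a real system would use; no labels from a teacher are needed, which is what makes it self-supervised.

```python
from collections import Counter

# Count which signals from two modalities occur at the same time step.
# The streams and the function name are hypothetical.

def cooccurrence_counts(stream_a, stream_b):
    """Count how often each pair of signals occurs at the same time step."""
    counts = Counter()
    for a, b in zip(stream_a, stream_b):
        if a is not None and b is not None:  # skip steps with a missing signal
            counts[(a, b)] += 1
    return counts

vision = ["dog", "cat", "dog", None, "cat", "dog", "dog"]
audio = ["bark", "meow", "bark", "bark", "meow", "meow", "bark"]

counts = cooccurrence_counts(vision, audio)
print(counts.most_common())
# [(('dog', 'bark'), 3), (('cat', 'meow'), 2), (('dog', 'meow'), 1)]
```

Pairs that co-occur often ("dog" with "bark") stand out from accidental pairings ("dog" with "meow"), even though nobody told the system which pairs belong together.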

This correlation forms the basis of “concepts” in an LWM. Concepts for an LWM are like tokens (words) for an LLM. They are the building blocks of knowledge.

Second, the brain is always predicting. From a concept at time t, it wants to predict what the next concept is going to be at time t+1. This is done by an LWM Transformer model.
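The prediction objective can be illustrated without a Transformer. The sketch below deliberately swaps in a much simpler technique, a table of transition counts, to predict the concept at time t+1 from the concept at time t. The concept names and functions are hypothetical; a Transformer would condition on the whole history rather than on the last concept alone.

```python
from collections import Counter, defaultdict

# Stand-in for next-concept prediction: a transition-count table,
# NOT a Transformer. All names below are hypothetical.

def fit_transitions(sequence):
    """Count how often each concept is followed by each other concept."""
    table = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        table[current][nxt] += 1
    return table

def predict_next(table, concept):
    """Return the most frequent follower of a concept, or None if unseen."""
    followers = table.get(concept)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

experience = ["see_ball", "reach", "grasp",
              "see_ball", "reach", "grasp",
              "see_ball", "look_away"]

table = fit_transitions(experience)
print(predict_next(table, "see_ball"))  # reach
print(predict_next(table, "reach"))     # grasp
```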

The overall brain diagram for an LWM is thus given as:

This model is called “mix-match” and it uses both learning modalities to learn concepts and their evolution in space and time. Notice that the model can also be provided with generative outputs: it will of course have to produce actions for a robot, but it can also produce text or speech, and even generate images or graphics if needed. This way it should be able to communicate with humans in a natural way, and also be able to operate in the natural environment.

Notice this LWM is just one possible model. There are many other models that can be used to build an artificial brain. The important thing is that the model should be able to learn from multi-modal sensory data, and be able to act on the world.

The future will tell us soon what the best model is. But we are getting closer to building an artificial brain that can do what a biological brain can do.