Unit 3.1 - Neurons

Ok, so now we know tensors and have found out about PyTorch - how do we use them for machine learning?

Suppose I do not have a formula for what a cat is, but I have many pictures of cats.

image0

What can I do?

I can create a pseudo-equation with neural networks. This is similar to the neural network in our brain that can tell if we are seeing a cat…

REAL NEURONS

The picture on the left shows a real neuron from our brain.

It has inputs (the dendrites), an output (the axon) and an area that does some computation - or thinking (the soma, which contains the nucleus)! It uses ionic current spikes to communicate with other neurons over large distances. The communication wire is the axon. Its pulses are asynchronous and are emitted once the neuron's soma reaches a specific threshold voltage. Synapses facilitate the release of ionic currents that polarize the soma of the receiving neurons. The number of synapses between two neurons acts as the weight of the connection between them.

Many years ago, computer scientists wanted to understand how we recognize a picture of a cat, and did not know how to do this with computers.

So they thought of making artificial neurons. After all, our brain can solve this problem: it can tell whether a picture shows a cat, and it uses neurons to do this.

image1

To date we still do not know how our real brain works or how it recognizes a cat, but we have built artificial neural networks that can do it, and we know how to build and train them.

This still does not tell us how our brain performs this task… but at least we have something computers can use.

ARTIFICIAL NEURON

Even in an artificial neuron there are inputs (\(x_1\), \(x_2\),…) and outputs (\(y_k\)), and there is a region that “thinks” or computes.

\[y_k = \sigma\left( \sum_{j} W_{kj} \, x_{j} \right)\]
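The formula above can be sketched in a few lines of plain Python. This is only a minimal illustration: the helper names `sigmoid` and `neuron`, the choice of the sigmoid as \(\sigma\), and the input and weight values are assumptions made for the example, not something fixed by this unit.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, used here as one possible choice for sigma."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(weights, inputs, activation=sigmoid):
    """Compute y_k = sigma(sum_j W_kj * x_j) for a single neuron k."""
    z = sum(w * x for w, x in zip(weights, inputs))  # weighted sum of the inputs
    return activation(z)                             # apply the non-linearity


# Illustrative values only (hypothetical, chosen so the weighted sum is 0)
x = [1.0, 2.0, 3.0]      # inputs x_1, x_2, x_3
w_k = [0.5, -1.0, 0.5]   # row k of the weight matrix W
y_k = neuron(w_k, x)
print(y_k)
```

In PyTorch the same computation would typically be written with tensors instead of Python lists, but the arithmetic is the same: multiply, sum, then apply the non-linearity.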

The function we call \(\sigma\) here is the neuron's non-linearity. THIS IS A KEY COMPONENT of a neuron: it is what allows neural networks to stack layers without collapsing into a single linear function.

Today most AI algorithms use the ReLU non-linearity or one of its close variants. The sigmoid function (also commonly written \(\sigma\)) and the hyperbolic tangent, tanh, are used in introductions to neural networks because they make the math easier.
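To see why the non-linearity matters, here is a small sketch with hand-picked (hypothetical) weight matrices `W1` and `W2`. Without a non-linearity, two stacked layers give exactly the same result as one layer whose weights are the product of the two matrices. With a non-linearity in between (ReLU, \(\max(0, z)\), is assumed here), the result differs.

```python
def linear(W, x):
    """Matrix-vector product: out[k] = sum_j W[k][j] * x[j]."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def matmul(A, B):
    """Matrix product of A and B, both given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def relu(v):
    """Apply max(0, z) to every element of a vector."""
    return [max(0.0, z) for z in v]


# Hypothetical weights and input, chosen by hand for the illustration
W1 = [[1.0, -1.0], [-1.0, 1.0]]  # first layer: 2 inputs -> 2 outputs
W2 = [[1.0, 1.0]]                # second layer: 2 inputs -> 1 output
x = [2.0, 0.5]

two_linear = linear(W2, linear(W1, x))        # two layers, no non-linearity
one_linear = linear(matmul(W2, W1), x)        # one layer with merged weights
with_relu = linear(W2, relu(linear(W1, x)))   # two layers with ReLU in between

print(two_linear, one_linear, with_relu)
```

With these particular weights, the ReLU version computes the absolute difference of the two inputs, which no single linear layer can do.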
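For reference, the three non-linearities named above can be written as follows. This is a sketch using their standard textbook definitions and Python's `math` module. The sample points are arbitrary.

```python
import math


def relu(z):
    """ReLU: passes positive values through, clamps negatives to 0."""
    return max(0.0, z)


def sigmoid(z):
    """Sigmoid: squashes any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Hyperbolic tangent: squashes any real value into the range (-1, 1)."""
    return math.tanh(z)


# Evaluate each function at a few arbitrary sample points
for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  relu={relu(z):.3f}  "
          f"sigmoid={sigmoid(z):.3f}  tanh={tanh(z):.3f}")
```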

REAL VS ARTIFICIAL NEURONS

To summarize the differences:

| neuron | real | artificial |
|---|---|---|
| activations / amplitude coding | pulses | real values |
| timing | asynchronous | synchronous |
| nonlinearity | threshold | programmable function |
| wires | axons | electrical wires in a microchip |
| weights | stored in synapses | stored in computer memory |

MAGIC

Yes, magic, because many of these neurons connected together can do amazing things if trained appropriately. Given enough neurons, they can approximate any continuous function as closely as we like.

Simple functions like XOR or AND are boring… but neural networks can also approximate the function “is this a picture of a cat?”
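As a small sketch of the "boring" cases, here are AND and XOR built from a handful of artificial neurons. Everything in it is an assumption made for illustration: the weights are picked by hand rather than learned by training, ReLU is used as the non-linearity, and a constant input of 1 is appended so that one weight can act as an offset.

```python
def relu(z):
    return max(0.0, z)


def neuron(weights, inputs, activation=relu):
    """y = activation(sum_j w_j * x_j), the same formula as a single neuron."""
    return activation(sum(w * x for w, x in zip(weights, inputs)))


def and_gate(x1, x2):
    # One neuron is enough: it fires only when x1 + x2 exceeds 1
    return neuron([1.0, 1.0, -1.0], [x1, x2, 1.0])


def xor_gate(x1, x2):
    # XOR needs a hidden layer of two neurons plus an output neuron
    h1 = neuron([1.0, 1.0, 0.0], [x1, x2, 1.0])    # relu(x1 + x2)
    h2 = neuron([1.0, 1.0, -1.0], [x1, x2, 1.0])   # relu(x1 + x2 - 1)
    # Linear output neuron (identity activation): h1 - 2 * h2
    return neuron([1.0, -2.0], [h1, h2], activation=lambda z: z)


# Truth table rows: (x1, x2, AND, XOR)
table = [(a, b, and_gate(a, b), xor_gate(a, b))
         for a in (0.0, 1.0) for b in (0.0, 1.0)]
for row in table:
    print(row)
```

A single neuron handles AND, while XOR needs the extra layer and the non-linearity in between. Recognizing a cat follows the same idea with far more neurons, and with weights found by training instead of by hand.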

Artificial neurons are at the core of all the new and successful AI algorithms such as:

  • ChatGPT

  • LLMs (large language models)

  • DALL-E

  • etc…