unit 3.1 - Neurons

OK, so now we know about tensors and have found out about PyTorch. How do we use them in machine learning?

Suppose I do not have a formula for what a cat is, but I have many pictures of a cat.

What can I do?

I can create a pseudo-equation with neural networks. This is similar to the neural network in our brain that can tell if we are seeing a cat…

REAL NEURONS

This on the left is a real neuron in our brain.

It has some inputs (the dendrites), an output (the axon), and an area that does some computation, or thinking (the soma / nucleus)! It uses spikes of ionic current to communicate with other neurons over large distances. The communication wire is the axon. Its pulses are asynchronous and are emitted once the neuron's soma reaches a specific threshold voltage. Synapses facilitate the release of ionic currents that polarize the soma of the receiving neuron. The number of synapses between two neurons acts as the weight of the connection between them.

Many years ago, computer scientists wanted to know how we recognize a picture of a cat, and did not know how to do this with computers.

So they thought of making artificial neurons. After all, our brain can solve this problem (it can tell whether a picture is of a cat), and our brain uses neurons to do this.

To date we still do not know how our real brain works or how it recognizes a cat, but we have built neural networks that can do it, and we know how to build and train them.

This still does not tell us how our brain performs this task… but at least we have something computers can use.

ARTIFICIAL NEURON

Even in an artificial neuron there are inputs (\(x_1\), \(x_2\),…) and outputs (\(y_k\)), and there is a region that “thinks” or computes.

\[y_k = \sigma\left( \sum_{j} W_{kj} \, x_{j} \right)\]
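The formula above can be tried out with a few lines of plain Python. This is only a minimal sketch: the names `sigmoid` and `neuron_output`, the choice of the Sigmoid as \(\sigma\), and the numbers in `x` and `W_k` are all assumptions picked for illustration.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, one possible choice for the non-linearity (assumed here)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(weights, inputs, nonlinearity=sigmoid):
    """Compute y_k = nonlinearity(sum_j W_kj * x_j) for a single neuron k."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return nonlinearity(weighted_sum)

# Illustrative values only (not from any real model)
x = [1.0, 2.0, 3.0]        # inputs x_1, x_2, x_3
W_k = [0.5, -1.0, 0.5]     # weights W_k1, W_k2, W_k3; weighted sum is 0.5 - 2.0 + 1.5 = 0.0
y_k = neuron_output(W_k, x)
print(y_k)                 # sigmoid(0.0) = 0.5
```

Passing a different function as `nonlinearity` changes how the neuron "thinks" without touching the weighted sum.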

The function we here call \(\sigma\) is the neuron non-linearity. THIS IS A KEY COMPONENT of a neuron, and the one that allows neural networks to stack layers without collapsing into a single linear function.
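To see why the non-linearity matters, here is a small sketch in plain Python. The weight matrices `W1` and `W2`, the input `x`, and the helper names are made up for this example, and ReLU is used as one possible non-linearity. Two stacked layers without a non-linearity give exactly the same output as one layer whose weights are the product of the two matrices, while adding a non-linearity in between changes the result.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def relu(v):
    """Apply ReLU to every element of a vector."""
    return [max(0.0, z) for z in v]

# Hypothetical weights and input, chosen only for illustration
W1 = [[1.0, -1.0], [-1.0, 1.0]]
W2 = [[1.0, 1.0], [0.0, 1.0]]
x = [2.0, 1.0]

two_linear = matvec(W2, matvec(W1, x))       # two layers, no non-linearity
one_linear = matvec(matmul(W2, W1), x)       # one layer with weights W2 * W1
with_relu = matvec(W2, relu(matvec(W1, x)))  # non-linearity between the layers

print(two_linear, one_linear)  # [0.0, -1.0] [0.0, -1.0]: the stack collapsed
print(with_relu)               # [1.0, 0.0]: a different result
```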

Today most neural networks use the ReLU non-linearity or a close variant of it. The Sigmoid function (which is also commonly written \(\sigma\)) and the Hyperbolic Tangent function (Tanh) are often used in introductions to neural networks because they make the math easier.
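For reference, the three non-linearities named above can be written in a few lines of plain Python. This is a sketch using their standard textbook definitions; the function names are our own.

```python
import math

def relu(z):
    """ReLU: pass positive values through, clamp negative values to 0."""
    return max(0.0, z)

def sigmoid(z):
    """Sigmoid: squash any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Hyperbolic tangent: squash any real value into the range (-1, 1)."""
    return math.tanh(z)

# Compare the three functions on a few sample inputs
for z in (-2.0, 0.0, 2.0):
    print(z, relu(z), sigmoid(z), tanh(z))
```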

Real vs artificial neurons

To summarize the differences

neuron	real	artificial
activations / amplitude coding	pulses	real values
timing	asynchronous	synchronous
nonlinearity	threshold	programmable function
wires	axons	electrical wires in a microchip
weights	stored in synapses	stored in computer memory

MAGIC

Yes, magic, because many of these neurons can do amazing things if trained appropriately. With enough of them, they can approximate essentially any continuous function as closely as we like.

Simple functions like XOR, AND, etc. are boring… but these networks can also approximate the function “picture of a cat?”
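As a small illustration of a "boring" function, here is a sketch of XOR built from three artificial neurons in plain Python. Everything in it is an assumption made for the example: the weights are picked by hand rather than trained, the non-linearity is a simple threshold (`step`), and a constant input of 1 is appended so that one weight can play the role of a bias.

```python
def step(z):
    """Threshold non-linearity: fire (1) if the weighted sum is positive."""
    return 1 if z > 0 else 0

def neuron(weights, inputs):
    """One artificial neuron: weighted sum followed by the non-linearity."""
    return step(sum(w * x for w, x in zip(weights, inputs)))

def xor_network(x1, x2):
    """Two hidden neurons (OR and AND) feeding one output neuron."""
    inputs = [x1, x2, 1]                        # the constant 1 is the bias input
    h_or = neuron([1.0, 1.0, -0.5], inputs)     # fires if at least one input is 1
    h_and = neuron([1.0, 1.0, -1.5], inputs)    # fires only if both inputs are 1
    # Output fires when OR is on and AND is off, which is exactly XOR
    return neuron([1.0, -1.0, -0.5], [h_or, h_and, 1])

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

A single neuron of this kind cannot compute XOR; it takes the hidden layer plus the non-linearity to do it.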

Artificial neurons are at the core of all the new and successful AI algorithms such as:

ChatGPT
LLMs (large language models)
DALL-E
etc…