Neural Networks

Neural Networks Basics

Understand how artificial neural networks work, from perceptrons to deep learning.

Biological Inspiration

Artificial neural networks are loosely inspired by the biological brain. The human brain contains roughly 86 billion neurons connected by on the order of 100 trillion synapses; ANNs model this with artificial neurons connected by weighted edges.

The Perceptron

The perceptron is the simplest neural unit. It:

  1. Takes multiple inputs
  2. Multiplies each by a weight
  3. Sums them up
  4. Adds a bias
  5. Applies an activation function
  6. Outputs a result
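The six steps above can be sketched in a few lines of NumPy. The weights and bias below are hand-picked to implement an AND gate, not learned — they are illustrative values only:

```python
import numpy as np

def perceptron(x, w, b):
    # Steps 1-4: multiply inputs by weights, sum, add bias
    z = np.dot(w, x) + b
    # Steps 5-6: apply a step activation and output the result
    return 1 if z > 0 else 0

# Hand-picked weights that make the perceptron act as an AND gate
w = np.array([1.0, 1.0])
b = -1.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron(np.array(x), w, b))
```

Only the input (1, 1) pushes the weighted sum above zero, so only it outputs 1. Because a single perceptron draws one linear boundary, it cannot represent XOR — which is why the multi-layer example later in this page is needed.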

Activation Functions

  • Sigmoid: Squashes output into (0, 1) — used for binary classification outputs
  • ReLU: max(0, x) — the most common choice for hidden layers
  • Softmax: Turns a vector of scores into a multi-class probability distribution
  • Tanh: Squashes output into (-1, 1) — a zero-centered alternative to sigmoid
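The four functions above are each a one-liner in NumPy. A minimal sketch, evaluated on a few sample inputs so the output ranges are visible:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def relu(x):
    return np.maximum(0, x)

def tanh(x):
    return np.tanh(x)

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 2.0])
print("sigmoid:", sigmoid(z))   # each value in (0, 1)
print("relu:   ", relu(z))      # negatives clipped to 0
print("tanh:   ", tanh(z))      # each value in (-1, 1)
print("softmax:", softmax(z))   # non-negative, sums to 1
```

Note the max-subtraction trick in softmax: it changes nothing mathematically but prevents `np.exp` from overflowing on large scores.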

Backpropagation

Backpropagation is the algorithm used to train neural networks. It:

  1. Makes predictions (forward pass)
  2. Computes the error (loss)
  3. Calculates how each weight contributed to the error (backward pass)
  4. Updates weights to reduce error

Example

import numpy as np

# Implement a simple neural network from scratch

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)

def relu(x):
    return np.maximum(0, x)

class SimpleNeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights with small random values; a scale of 0.5
        # keeps early gradients large enough for this toy problem
        self.W1 = np.random.randn(input_size, hidden_size) * 0.5
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, output_size) * 0.5
        self.b2 = np.zeros((1, output_size))

    def forward(self, X):
        # Hidden layer
        self.z1 = np.dot(X, self.W1) + self.b1
        self.a1 = relu(self.z1)

        # Output layer
        self.z2 = np.dot(self.a1, self.W2) + self.b2
        self.a2 = sigmoid(self.z2)

        return self.a2

    def backward(self, X, y, learning_rate=0.01):
        m = X.shape[0]

        # Output layer gradient: for a sigmoid output with binary
        # cross-entropy loss, dL/dz2 simplifies to (a2 - y)
        dz2 = self.a2 - y
        dW2 = np.dot(self.a1.T, dz2) / m
        db2 = np.sum(dz2, axis=0, keepdims=True) / m

        # Hidden layer gradient
        dz1 = np.dot(dz2, self.W2.T) * (self.z1 > 0)  # ReLU derivative
        dW1 = np.dot(X.T, dz1) / m
        db1 = np.sum(dz1, axis=0, keepdims=True) / m

        # Update weights
        self.W1 -= learning_rate * dW1
        self.b1 -= learning_rate * db1
        self.W2 -= learning_rate * dW2
        self.b2 -= learning_rate * db2

# XOR problem (non-linearly separable)
X = np.array([[0,0], [0,1], [1,0], [1,1]])
y = np.array([[0], [1], [1], [0]])

np.random.seed(42)  # reproducible weight initialization
nn = SimpleNeuralNetwork(2, 4, 1)

# Train with full-batch gradient descent
for epoch in range(10000):
    output = nn.forward(X)
    nn.backward(X, y, learning_rate=0.5)

    if epoch % 2000 == 0:
        loss = np.mean((output - y) ** 2)  # MSE reported for monitoring
        print(f"Epoch {epoch}, Loss: {loss:.4f}")

# Predict
print("Predictions:", nn.forward(X).round(2).flatten())
print("Actual:     ", y.flatten())