How to Code a Neural Network from Scratch (Step-by-Step Guide)
Neural networks are at the core of modern artificial intelligence, powering applications such as image recognition, natural language processing, and recommendation systems. While many developers rely on high-level libraries, understanding how a neural network works under the hood is a major advantage for any software engineer.
In this guide, you’ll learn how to code a simple neural network from scratch, understand the math behind it, and see how training actually works.
What Is a Neural Network?
A neural network is a computational model inspired by the human brain. It consists of layers of interconnected units called neurons, where each neuron:
- Receives inputs
- Applies weights
- Adds a bias
- Passes the result through an activation function
The goal is to learn weights and biases that minimize prediction error.
Core Components of a Neural Network
Before writing code, let’s understand the building blocks.
1. Neurons and Weights
Each neuron computes:
z = (w1 * x1) + (w2 * x2) + ... + b
Where:
- x = input
- w = weight
- b = bias
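As a concrete sketch of this computation, here is a single neuron evaluated with NumPy. The input, weight, and bias values are arbitrary, chosen only for illustration:

```python
import numpy as np

# Arbitrary example values, purely for illustration
x = np.array([0.5, 0.8])   # inputs
w = np.array([0.4, 0.7])   # weights
b = 0.1                    # bias

# z = (w1 * x1) + (w2 * x2) + b
z = np.dot(w, x) + b       # (0.4*0.5) + (0.7*0.8) + 0.1 = 0.86
print(z)
```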
2. Activation Function
The activation function introduces non-linearity.
Common examples:
- Sigmoid
- ReLU
- Tanh
We’ll use Sigmoid for simplicity:
sigmoid(x) = 1 / (1 + e^(-x))
3. Loss Function
The loss function measures how wrong the prediction is.
For simple regression:
Mean Squared Error (MSE)
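MSE is just the average of the squared differences between predictions and targets. A quick sketch with made-up values:

```python
import numpy as np

# Illustrative targets and predictions
y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.8])

# Mean Squared Error: average of squared prediction errors
mse = np.mean(np.square(y_true - y_pred))
print(mse)  # ≈ 0.03
```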
4. Backpropagation
Backpropagation adjusts weights by computing gradients and minimizing loss using gradient descent.
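The update rule itself is simple: each parameter takes a small step against its gradient. A one-parameter sketch with made-up numbers:

```python
learning_rate = 0.1
w = 0.5      # current weight (illustrative value)
grad = 0.2   # dLoss/dw, as computed by backpropagation (illustrative value)

# Gradient descent step: move against the gradient
w = w - learning_rate * grad
print(w)  # 0.48
```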
Step 1: Define the Neural Network Structure
We’ll build a single hidden-layer neural network using Python and NumPy.
```python
import numpy as np
```
Step 2: Activation Function
```python
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Note: this expects x to already be a sigmoid output,
# since sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))
def sigmoid_derivative(x):
    return x * (1 - x)
```
Step 3: Initialize the Network
```python
class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Randomly initialize weights and biases
        self.weights_input_hidden = np.random.rand(input_size, hidden_size)
        self.weights_hidden_output = np.random.rand(hidden_size, output_size)
        self.bias_hidden = np.random.rand(hidden_size)
        self.bias_output = np.random.rand(output_size)
```
Step 4: Forward Propagation
Add this method inside the NeuralNetwork class:

```python
    def forward(self, X):
        self.hidden_input = np.dot(X, self.weights_input_hidden) + self.bias_hidden
        self.hidden_output = sigmoid(self.hidden_input)
        self.output_input = np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_output
        self.output = sigmoid(self.output_input)
        return self.output
```
Step 5: Backpropagation
This method also belongs inside the NeuralNetwork class. It propagates the error backwards through the network and updates the weights and biases:

```python
    def backward(self, X, y, learning_rate):
        # Error and gradient at the output layer
        output_error = y - self.output
        output_delta = output_error * sigmoid_derivative(self.output)

        # Propagate the error back to the hidden layer
        hidden_error = output_delta.dot(self.weights_hidden_output.T)
        hidden_delta = hidden_error * sigmoid_derivative(self.hidden_output)

        # Gradient descent updates
        self.weights_hidden_output += self.hidden_output.T.dot(output_delta) * learning_rate
        self.weights_input_hidden += X.T.dot(hidden_delta) * learning_rate
        self.bias_output += np.sum(output_delta, axis=0) * learning_rate
        self.bias_hidden += np.sum(hidden_delta, axis=0) * learning_rate
```
Step 6: Train the Neural Network
Finally, add the training loop to the NeuralNetwork class. Each epoch runs one forward pass followed by one backward pass:

```python
    def train(self, X, y, epochs, learning_rate):
        for epoch in range(epochs):
            self.forward(X)
            self.backward(X, y, learning_rate)
            if epoch % 1000 == 0:
                loss = np.mean(np.square(y - self.output))
                print(f"Epoch {epoch}, Loss: {loss}")
```
Step 7: Test with Sample Data
```python
# XOR inputs and expected outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

nn = NeuralNetwork(input_size=2, hidden_size=4, output_size=1)
nn.train(X, y, epochs=10000, learning_rate=0.1)

print(nn.forward(X))
```
This example trains the network to solve the XOR problem, a classic test that simple linear models cannot solve.
What You Just Built
✔ A neural network from scratch
✔ Forward propagation
✔ Backpropagation
✔ Gradient descent training
✔ Non-linear decision making
Understanding this gives you deep insight into how modern AI frameworks work internally.
How This Scales in Real Applications
In real-world systems:
- Libraries like TensorFlow and PyTorch handle gradients automatically
- GPUs accelerate matrix operations
- Networks grow deeper and wider
- Regularization prevents overfitting
But the core logic remains exactly what you coded above.
Final Thoughts
Coding a neural network from scratch is one of the best ways to understand AI beyond buzzwords. Even if you later rely on high-level frameworks, this knowledge helps you:
- Debug AI models more effectively
- Design better architectures
- Integrate AI into production systems confidently
If you’re a software engineer looking to move deeper into AI, this foundational understanding is non-negotiable.
TensorFlow Version (Simple Example)
```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(4, activation='sigmoid', input_shape=(2,)),
    Dense(1, activation='sigmoid')
])

model.compile(
    optimizer='adam',
    loss='mean_squared_error'
)

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [[0], [1], [1], [0]]

model.fit(X, y, epochs=500, verbose=0)
print(model.predict(X))
```
When to Use TensorFlow
- Production-grade systems
- Mobile / edge deployment
- Large-scale training
PyTorch Version (Simple Example)
```python
import torch
import torch.nn as nn
import torch.optim as optim

class NeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 4)
        self.fc2 = nn.Linear(4, 1)

    def forward(self, x):
        x = torch.sigmoid(self.fc1(x))
        x = torch.sigmoid(self.fc2(x))
        return x

model = NeuralNet()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.1)

X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

for _ in range(5000):
    optimizer.zero_grad()
    output = model(X)
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()

print(model(X))
```
When to Use PyTorch
- Research & experimentation
- Custom model architectures
- Full control over training loops
Frequently Asked Questions
What is the easiest way to code a neural network?
The easiest way is using libraries like TensorFlow or PyTorch, but coding one from scratch helps you understand how backpropagation, weights, and gradients actually work.
Do I need advanced math to build a neural network?
Basic linear algebra and calculus concepts are enough to start. Libraries handle most mathematical complexity in real-world applications.
Is Python required to code neural networks?
Python is the most common language due to its ecosystem, but neural networks can also be implemented in C++, Java, and JavaScript.
What is the difference between machine learning and neural networks?
Neural networks are a subset of machine learning models designed to learn complex non-linear patterns.
When should I use TensorFlow or PyTorch instead of scratch code?
Use scratch implementations for learning. Use TensorFlow or PyTorch for production systems, scalability, and performance.
Can neural networks be used in backend applications?
Yes. Neural networks are commonly integrated into APIs, microservices, and data pipelines.
Is neural network coding relevant for software engineers?
Absolutely. Understanding neural networks improves problem-solving skills, system design, and AI integration capabilities.
Related AI, Machine Learning & Neural Network Guides
Neural Networks & Deep Learning
- Neural Network Programming: A Deep Dive
- Neural Network Programming: Practical & Conceptual Guide
- Deep Learning Architectures, Algorithms & Applications
- LLMs Architecture and Training Explained
Machine Learning & Core AI Concepts
- Machine Learning Algorithms, Applications & Future
- AI vs Machine Learning: Key Differences Explained
- Artificial Intelligence: Future Innovations & Solutions
- Data Mining Concepts and Techniques
AI Engineering, Tools & Development
- AI Application Development: Technical Roadmap
- Optimizing AI Chatbot Interactions
- AI Tools and Engineering Best Practices
- Responsible AI Best Practices for Engineers