Understanding the Differences Between Convolution and Matrix Multiplication: A Guide to Mastering 2D Convolution in Python with SciPy
Matrix multiplication and convolution are both linear operations, but they differ in how they operate on the input data.
Matrix multiplication
In matrix multiplication, each entry of the result is computed by multiplying a row of the first matrix element-wise with a column of the second matrix and summing the products, i.e. taking their dot product. The result is a new matrix with dimensions that depend on the dimensions of the input matrices. Specifically, the resulting matrix has a number of rows equal to the number of rows in the first matrix and a number of columns equal to the number of columns in the second matrix (which requires the number of columns of the first matrix to equal the number of rows of the second).
Matrix multiplication in Python
import numpy as np
# define two 2x2 matrices
matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[2, 0], [1, 2]])
# perform matrix multiplication using the numpy.dot() function
result = np.dot(matrix1, matrix2)
# print the matrices and the result
print("Matrix 1:\n", matrix1)
print("\nMatrix 2:\n", matrix2)
print("\nResult:\n", result)
Mechanics of Convolution
By contrast, convolution involves sliding a small matrix, called a kernel or filter, over the input matrix and computing the element-wise product between the kernel and the overlapping sub-matrix of the input at each position. The resulting products are summed to produce a single value for each position of the kernel on the input matrix, which is stored in the output matrix. The dimensions of the output matrix depend on both the size of the kernel and the size of the input matrix.
Code for convolution in Python
import numpy as np
def conv2d(image, kernel):
    """
    Computes a 2D convolution of an image with a kernel.

    Args:
    - image: a 2D NumPy array of shape (input_height, input_width)
    - kernel: a 2D NumPy array of shape (kernel_height, kernel_width)

    Returns:
    - convolved_image: a 2D NumPy array of shape
      (input_height - kernel_height + 1, input_width - kernel_width + 1)
    """
    # Get image dimensions and kernel size
    input_height, input_width = image.shape
    kernel_height, kernel_width = kernel.shape

    # Initialize the output image
    output_height = input_height - kernel_height + 1
    output_width = input_width - kernel_width + 1
    convolved_image = np.zeros((output_height, output_width))

    # Loop over every pixel of the output image
    for i in range(output_height):
        for j in range(output_width):
            # Multiply the kernel with the overlapping image patch
            # and accumulate the products
            for ii in range(kernel_height):
                for jj in range(kernel_width):
                    convolved_image[i, j] += image[i + ii, j + jj] * kernel[ii, jj]

    return convolved_image
# Example usage
image = np.array([[1, 2, 2],
[3, 4, 5],
[6, 7, 8]])
kernel = np.array([[0, 1],
[2, 3]])
convolved_image = conv2d(image, kernel)
print(convolved_image)
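This prints:

[[20. 25.]
 [37. 43.]]

For example, the top-left value comes from overlaying the kernel on the top-left 2x2 patch of the image: 1*0 + 2*1 + 3*2 + 4*3 = 20. Keep this result in mind for the comparison with scipy below.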
Code for convolution using Scipy
To simplify things, we can use the “convolve2d” function from scipy.signal.
import numpy as np
from scipy.signal import convolve2d
# define a 2D array (or matrix) as input
input_array = np.array([[1, 2, 2],
[3, 4, 5],
[6, 7, 8]])
# define a kernel (or filter) as a 2D array
kernel = np.array([[0, 1],
[2, 3]])
# perform 2D convolution using scipy.signal.convolve2d
output_array = convolve2d(input_array, kernel, mode='valid')
# print the input, kernel, and output arrays
print("Input Array:\n", input_array)
print("\nKernel:\n", kernel)
print("\nOutput Array:\n", output_array)
The problem with the above code is that, used as is, “convolve2d” gives a totally different result than the “Code for convolution in Python” approach above. This is because “convolve2d” performs a true convolution, not the deep learning version of the operation: a true convolution indexes the input with a minus (image[i - ii, j - jj]), whereas our conv2d indexes with a plus (image[i + ii, j + jj]).
To make the scipy convolution behave the same way, we have to flip the filter both horizontally and vertically and set the mode argument to 'valid'.
output_array = convolve2d(input_array, np.fliplr(np.flipud(kernel)), mode='valid')
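As a quick sanity check, the comparison below (reusing the conv2d function and the arrays defined above) confirms that the flipped-kernel call reproduces our manual result:

print(np.array_equal(
    convolve2d(input_array, np.fliplr(np.flipud(kernel)), mode='valid'),
    conv2d(input_array, kernel)))  # prints True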
As a side note, convolution is a commutative operation, i.e. “input_array” convolved with “kernel” is the same as “kernel” convolved with “input_array”, so it does not matter in which order we pass the two inputs.
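This is easy to check with the default 'full' mode, which places no size restrictions on the two inputs:

print(np.array_equal(convolve2d(input_array, kernel),
                     convolve2d(kernel, input_array)))  # prints True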
Actually, what we perform in deep learning is cross-correlation, not true convolution. The operation itself is still a combination of element-wise multiplication and addition, but it is conventionally referred to as convolution in deep learning: since the filters’ weights are learned through training anyway, whether the kernel is flipped or not makes no practical difference.
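Incidentally, scipy exposes this flavor of the operation directly as “correlate2d”. Reusing the arrays defined above, cross-correlation reproduces our conv2d result with no manual flipping:

from scipy.signal import correlate2d
print(correlate2d(input_array, kernel, mode='valid'))
# [[20 25]
#  [37 43]]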
Summary
Convolution and matrix multiplication are distinct processes, even though the two are closely related: a convolution can be rewritten as a matrix multiplication, for example by unrolling the overlapping input patches into the rows of a matrix.
Matrix multiplication is frequently utilized in linear algebra and machine learning tasks such as linear regression and neural network computations, while convolution is more frequently employed in signal processing tasks like filtering and feature extraction from images.
Therefore, convolution and matrix multiplication are separate operations with their own mathematical properties and intended applications, despite certain similarities between them.