Understanding how images are turned into numerical structures such as matrices and tensors is a fundamental step for anyone interested in machine vision. At its core, a digital image is a grid of pixels, and each pixel stores one or more numbers. A grayscale image stores a single intensity per pixel, so it maps directly onto a matrix with one entry per pixel. A color image stores several values per pixel (typically red, green, and blue), which adds a channel dimension, and a collection of images adds a further dimension; these multidimensional arrays are known as tensors. Because the image is now purely numerical, tasks such as image recognition or object detection can be carried out with mathematical operations on these arrays.
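
To make the idea concrete, here is a minimal sketch in Python using NumPy. The library choice, the tiny 3x4 image, its pixel values, and the axis ordering (batch, height, width, channels) are all assumptions made for illustration; other libraries and conventions order the axes differently.

```python
import numpy as np

# A hypothetical 3x4 grayscale image: one intensity (0-255) per pixel.
gray = np.array(
    [
        [0, 64, 128, 255],
        [32, 96, 160, 224],
        [255, 128, 64, 0],
    ],
    dtype=np.uint8,
)

# A color image of the same size adds a channel axis: height x width x 3.
# Here the three channels are simply copies of the grayscale values.
rgb = np.stack([gray, gray, gray], axis=-1)

# A batch of images adds one more axis in front:
# batch x height x width x channels. The second image is the inverted first.
batch = np.stack([rgb, 255 - rgb], axis=0)

print(gray.shape)   # (3, 4)
print(rgb.shape)    # (3, 4, 3)
print(batch.shape)  # (2, 3, 4, 3)

# Pixel values are ordinary numbers, so arithmetic applies directly,
# for example rescaling intensities from 0-255 to the range [0, 1].
normalized = batch.astype(np.float32) / 255.0
```

The matrix `gray` has two axes, while `rgb` and `batch` have three and four, which is what makes them tensors in the sense used above. Rescaling to `[0, 1]` is shown only as one example of a numerical operation on the array, not as a required step.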