1959
The first significant experiments were conducted by neurophysiologists who showed a series of images to a cat to study how its brain responds to visual stimuli. They discovered that the cat's visual neurons reacted primarily to sharp edges and lines, suggesting that image processing begins with simple shapes.
Around the same time, the first computer-based image scanning technology was developed, enabling systems to digitize and capture images. This breakthrough laid the foundation for modern computer vision, allowing machines to process and analyze visual data in ways that mimic early stages of human perception.
1963
Computers became able to infer three-dimensional shapes from two-dimensional images. During these years, AI emerged as an academic research field, and the first attempts to apply it to human vision began to take shape.
1974
Optical Character Recognition (OCR) technology was introduced, allowing computers to recognize printed text in various fonts. At the same time, Intelligent Character Recognition (ICR) was developed to decipher handwritten text using neural networks.
These advancements paved the way for many common applications, including document processing, license plate recognition, mobile payments, and automated translation.
1982
Neuroscientist David Marr proposed that vision operates hierarchically and introduced algorithms that enabled machines to detect edges, corners, curves, and basic shapes. Around the same time, computer scientist Kunihiko Fukushima developed the Neocognitron, a pattern-recognition network of cells that incorporated convolutional layers into a neural network.
2000
Research began focusing on object recognition, and by 2001, the first real-time facial recognition applications emerged. Additionally, the standardization of visual datasets—through labeling and annotation—became widely established, providing a crucial foundation for training and evaluating computer vision models.
2010
The ImageNet dataset was released, containing millions of labeled images across thousands of object categories. This dataset became a cornerstone for the development of convolutional neural networks (CNNs) and modern deep learning models, significantly advancing the field of computer vision.
2012
A team from the University of Toronto introduced a CNN in an image recognition program called AlexNet, which dramatically reduced image classification error rates, bringing them down to just a few percentage points and marking a pivotal moment in the evolution of computer vision.
To recognize and interpret images, computer vision relies on vast amounts of data. Through repeated analysis, the system learns to identify key image features such as shapes, colors, and patterns. Technologies like deep learning and CNNs play a crucial role in this process.
Deep learning enables machine learning models to improve autonomously by analyzing large sets of visual data. CNNs break images down into pixels, label this information, and apply convolutions to make accurate predictions.
This iterative process allows machines to interpret images in a way that resembles human vision, continuously improving the accuracy of their predictions. Depending on the techniques used, the type of image, and the specific task, computer vision algorithms can perform various analyses and validations on an image.
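The convolution step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a real CNN: the kernel below is a hand-picked vertical-edge detector (in an actual CNN, the kernel weights are learned from data), and `convolve2d` is a hypothetical helper name.

```python
# Minimal sketch of the convolution a CNN applies to pixel data.
# Hand-picked edge-detector kernel; a real CNN learns its weights.

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

# A 5x5 "image": dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1]] * 5

# Vertical-edge kernel: responds where brightness changes left to right,
# and stays at zero in flat regions.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = convolve2d(image, kernel)
print(feature_map[0])  # → [3, 3, 0]: strong response at the dark-to-bright edge
```

Stacking many such filters, and learning their weights instead of fixing them by hand, is what lets a CNN build up from edges to textures to whole objects.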
Image classification: analyzing an image and assigning it a label
Object detection: identifying objects within an image
Image segmentation: dividing the image into distinct segments to facilitate recognition
Action recognition: identifying objects and their relationships across space and time
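Of the tasks above, segmentation is the easiest to illustrate in a few lines. The sketch below uses simple intensity thresholding, the most basic segmentation technique (modern systems use learned models instead); `threshold_segment` is a hypothetical helper name.

```python
# Minimal sketch of image segmentation by intensity thresholding.
# Real systems use learned models; this shows only the core idea of
# partitioning pixels into distinct segments.

def threshold_segment(image, threshold):
    """Return a mask: 1 where a pixel is brighter than the threshold, else 0."""
    return [[1 if px > threshold else 0 for px in row] for row in image]

# Grayscale values 0-255: a bright object on a dark background.
image = [[10, 12, 200, 210],
         [11, 13, 205, 220],
         [ 9, 14, 198, 215]]

mask = threshold_segment(image, 128)
print(mask)  # two segments: background (0) and object (1)
```

Each pixel ends up assigned to one of two segments, which a downstream recognizer can then process separately.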
Neurally S.r.l.
Share capital €30,000 (fully paid up)
VAT: 02160050387
LEGAL AND OPERATIONAL HEAD OFFICE
Via L.V. Beethoven 15/C
44124 | Ferrara (FE)
Via Copernico 38
20125 | Milan (MI)