What is computer vision?

Computer vision is a field of artificial intelligence that enables computer systems to extract meaningful information from images, videos and other visual input. By leveraging cameras, algorithms, and powerful computing systems, it allows machines to perceive, analyze, and interpret visual data, executing tasks rapidly and at scale. Much like human vision, computer vision can identify objects, determine their positions, recognize movements, and detect anomalies.

The story

1959

The first significant experiments were conducted by neurophysiologists who showed a series of images to a cat, attempting to correlate its brain activity with the visual stimuli. They discovered that the cat's neurons responded primarily to sharp edges and lines, suggesting that visual processing begins with simple shapes.
Around the same time, the first computer-based image scanning technology was developed, enabling systems to digitize and capture images. This breakthrough laid the foundation for modern computer vision, allowing machines to process and analyze visual data in ways that mimic early stages of human perception.

1963

Computers became capable of reconstructing three-dimensional shapes from two-dimensional images. During these years, AI emerged as an academic research field, and the first attempts to use it to replicate human vision began to take shape.

1974

Optical Character Recognition (OCR) technology was introduced, allowing computers to recognize printed text in various fonts. At the same time, Intelligent Character Recognition (ICR) was developed to decipher handwritten text using neural networks.
These advancements paved the way for many common applications, including document processing, license plate recognition, mobile payments, and automated translation.

1982

Neuroscientist David Marr proposed that vision operates hierarchically and introduced algorithms that enabled machines to detect edges, corners, curves, and basic shapes. Around the same time, computer scientist Kunihiko Fukushima developed the Neocognitron, a network of cells capable of recognizing patterns, which incorporated convolutional layers into a neural network.

2000

Research began focusing on object recognition, and by 2001, the first real-time facial recognition applications emerged. At the same time, the labeling and annotation of visual datasets became standardized practice, providing a crucial foundation for training and evaluating computer vision models.

2010

The ImageNet dataset was released, containing millions of labeled images across thousands of object categories. This dataset became a cornerstone for the development of convolutional neural networks (CNNs) and modern deep learning models, significantly advancing the field of computer vision.

2012

A team from the University of Toronto introduced a CNN in an image recognition program called AlexNet, which dramatically reduced error rates in image classification and marked a pivotal moment in the evolution of computer vision.

How does computer vision work?

To recognize and interpret images, computer vision relies on vast amounts of data. Through repeated analysis, the system learns to identify key image features such as shapes, colors, and patterns. Technologies like deep learning and CNNs play a crucial role in this process.
Deep learning enables machine learning models to improve autonomously by analyzing large sets of visual data. CNNs break images down into pixel values, label that information, and apply convolutions to make increasingly accurate predictions.
This iterative process allows machines to interpret images in a way that resembles human vision, continuously improving the accuracy of their predictions.
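
To make the convolution step more concrete, here is a minimal sketch of a tiny CNN, written in PyTorch purely as an illustration (the article does not prescribe a framework); the layer sizes and the ten output classes are assumptions made for the example.

```python
# A minimal sketch, not a production model: a tiny convolutional network
# that turns raw pixel values into class scores, illustrating the
# "pixels -> convolutions -> prediction" pipeline described above.
import torch
import torch.nn as nn

tiny_cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # learnable 3x3 filters over RGB pixels
    nn.ReLU(),
    nn.MaxPool2d(2),                             # downsample the feature maps
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 10),                 # map features to 10 class scores (illustrative)
)

image = torch.rand(1, 3, 64, 64)   # a fake 64x64 RGB image (batch of 1)
scores = tiny_cnn(image)           # forward pass: one score per class
print(scores.shape)                # torch.Size([1, 10])
```

During training, the values of these filters are adjusted over many iterations so that the class scores become progressively more accurate.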

Depending on the techniques used, the type of image, and the specific task, computer vision algorithms can perform various analyses and validations on an image. The main tasks are the following.

Image classification

Analysis of an image and assignment of a descriptive label (a minimal sketch follows this list)

Object detection

Identification and localization of the objects within an image

Image segmentation

Division of the image into distinct segments to facilitate recognition

Action recognition

Identification of objects and their relationships in space and time in order to recognize actions
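
As an illustration of the first task in the list, image classification, here is a hedged sketch that assigns a label to a single photo using a CNN pretrained on ImageNet. The choice of torchvision, the placeholder file name photo.jpg, and the weights API shown (which requires a recent torchvision release) are assumptions made for the example.

```python
# A sketch of image classification with a pretrained CNN from torchvision.
# "photo.jpg" is a placeholder path, not a file referenced by the article.
import torch
from PIL import Image
from torchvision.models import resnet18, ResNet18_Weights

weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights).eval()     # CNN pretrained on ImageNet
preprocess = weights.transforms()            # resize, crop, and normalize the pixels

image = Image.open("photo.jpg").convert("RGB")
batch = preprocess(image).unsqueeze(0)       # add a batch dimension

with torch.no_grad():
    probabilities = model(batch).softmax(dim=1)[0]

best = probabilities.argmax().item()
print(weights.meta["categories"][best], float(probabilities[best]))
```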

Where can we apply Computer Vision?

Computer vision is applied across various industries, from manufacturing to healthcare, and from transportation to entertainment. For example, in autonomous vehicles, this technology plays a crucial role in recognizing traffic signs, pedestrians, and other vehicles, ensuring safe and independent driving. Another example is the use of computer vision in security systems, where it analyzes video feeds to detect suspicious activity.

Here is how we have applied Computer Vision at Neurally

Based on the research and experience of the Neurally team, we developed a software product called AREA. Using Computer Vision, AREA detects objects in one or more sections of a live or recorded video.
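
As a purely illustrative sketch of the general idea, and not of AREA's actual implementation, the following code runs a pretrained object detector on one selected section of a recorded video. The libraries (OpenCV and torchvision), the placeholder video path, the region coordinates, and the confidence threshold are all assumptions made for the example.

```python
# Illustrative only: detect objects inside one chosen section (region of
# interest) of each frame of a recorded video. This is NOT AREA's code.
import cv2
import torch
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
detector = fasterrcnn_resnet50_fpn(weights=weights).eval()

cap = cv2.VideoCapture("video.mp4")          # placeholder path to a recorded video
x0, y0, x1, y1 = 100, 100, 500, 400          # one monitored section of the frame (assumed)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[y0:y1, x0:x1, ::-1].copy()   # crop the section and convert BGR to RGB
    tensor = torch.from_numpy(roi).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        detections = detector([tensor])[0]   # boxes, labels, and scores for the section
    for label, score in zip(detections["labels"], detections["scores"]):
        if score > 0.8:                      # keep only confident detections
            print(weights.meta["categories"][int(label)])

cap.release()
```

A production system would typically use a lighter detector and track objects between frames rather than re-detecting them from scratch; the sketch above only conveys the core loop.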