August 10, 2021

Computer Vision Applications: How to Automatically Analyze Information From Images and Videos (update: August 2021)

Author: Edwin Lisowski, CSO & Co-Founder

Reading time: 10 minutes


Computer Vision (CV) is a fast-developing branch of Machine Learning that extracts knowledge about the world from images and videos. Because sight is so important to humans and so many of our actions depend on it, computer vision applications will become crucial to future automation. Computer vision algorithms will also be deployed in visually intensive work such as X-ray luggage inspection, finding criminals with public cameras, or preventing financial fraud using face recognition. This domain will open new areas of development and help create new industries.

Below we explain some real-world Computer Vision applications.

Computer Vision Meaning

Computer vision is a field of Artificial Intelligence, today driven largely by Deep Learning, that studies the technologies and tools that allow computers to be trained to perceive and interpret visual data from the real world. This means that the computer must handle the following three tasks efficiently [3]:

  • Automatically understand what objects are in the image and where they are located.
  • Sort these objects into categories and understand how they relate to one another.
  • Understand the scene’s context.

Computer Vision applications

Object Detection – the most popular computer vision application

Object Detection is the part of Computer Vision that focuses on detecting objects in photos, such as cats, dogs, cars, bikes, or humans, by extracting features from pixels and applying deep learning to recognize patterns.

One of the main areas of Object Detection is face recognition.
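Detectors usually report each object as a bounding box, and a common way to judge a predicted box against an annotated one is Intersection over Union (IoU). The sketch below is a minimal illustration of that metric; the box format `(x_min, y_min, x_max, y_max)` and the sample coordinates are assumptions made for the example, not something taken from a specific detector.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width and height clamp to 0 when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


predicted = (10, 10, 50, 50)     # hypothetical detector output
ground_truth = (30, 30, 70, 70)  # hypothetical human annotation
print(round(iou(predicted, ground_truth), 3))
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap; benchmarks often count a detection as correct above some threshold such as 0.5.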

Image and Video Pre-Processing

Advanced computer vision algorithms based on neural networks can perform image transformations that are not available to traditional image processing algorithms. For example, we can artificially add trees to a photo or remove them without the change being noticeable.

It is possible to generate missing parts of a photo or change the sky’s appearance from that of Earth to that of Mars. The possibilities for image enhancement and transformation are vast; each task just requires a specialized model.

Scene Segmentation

Another computer vision application is scene segmentation. Traditionally, detecting an object in an image meant marking its position with a rectangle. An improvement on this technique is to outline the object itself (for example, by tinting its pixels with a distinct color), so that the image is segmented into separate objects and the result resembles stained glass.

Scene segmentation. Source: researchgate.net

This technology will be extensively used in autonomous navigation and radiology (outlining cancerous changes in tissue).
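To make the difference between a rectangle and a segmentation outline concrete, here is a small sketch that works on a per-pixel label mask. The tiny mask and the class ids (0 = background, 1 = road, 2 = car) are made up for illustration; a real mask would come from a trained segmentation model.

```python
from collections import Counter

# Hypothetical 4x6 label mask: 0 = background, 1 = road, 2 = car.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 2, 2, 0, 0],
    [1, 2, 2, 2, 2, 1],
    [1, 1, 1, 1, 1, 1],
]


def pixel_counts(mask):
    """Count how many pixels carry each class label."""
    return Counter(label for row in mask for label in row)


def bounding_box(mask, label):
    """Smallest (row_min, col_min, row_max, col_max) rectangle containing the label, or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, value in enumerate(row) if value == label]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


counts = pixel_counts(mask)
car_box = bounding_box(mask, 2)
box_area = (car_box[2] - car_box[0] + 1) * (car_box[3] - car_box[1] + 1)
print(f"car pixels: {counts[2]}, bounding-box pixels: {box_area}")
```

The mask says exactly which pixels belong to the car, while the rectangle around it also covers pixels of other classes. That extra precision is what matters when outlining tissue changes or the drivable part of a road.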

Video and Image Content Indexing

A model trained to detect objects in photos can extract their content and prepare tags automatically. Inference is now so fast that videos can be processed in real time. This computer vision application can be used in personalized advertising (for example, on screens in public spaces), where ads are chosen based on the clothes you wear and the things you carry.
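Once a detector has produced labels for each frame, indexing comes down to inverting that mapping so a tag can be looked up directly. The sketch below assumes the detector output is already available as a dictionary from timestamp to labels; both the data and the function name `build_tag_index` are hypothetical.

```python
from collections import defaultdict

# Hypothetical detector output: frame timestamp (seconds) -> detected labels.
detections = {
    0.0: ["person", "backpack"],
    0.5: ["person", "bicycle"],
    1.0: ["car"],
    1.5: ["person", "car"],
}


def build_tag_index(detections):
    """Invert per-frame labels into a mapping of tag -> sorted list of timestamps."""
    index = defaultdict(list)
    for timestamp in sorted(detections):
        for tag in set(detections[timestamp]):  # ignore duplicate labels within a frame
            index[tag].append(timestamp)
    return dict(index)


index = build_tag_index(detections)
print(index["person"])  # timestamps at which a person was detected
```

With such an index, a query like "all moments where a car appears" becomes a single dictionary lookup instead of a scan through the whole video.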

3D Scene Reconstruction

Computer vision algorithms can reconstruct 3D objects from 2D imagery taken from different angles. For example, we can build a city model from images gathered by drones, or even a model of a cave from a video recorded inside it.
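The simplest instance of recovering 3D structure from several views is triangulation in a rectified stereo pair, where depth follows from Z = f · B / d (focal length times baseline, divided by disparity). The sketch below assumes an idealized pinhole camera; the focal length, baseline, principal point and pixel coordinates are made-up values, and a full reconstruction pipeline involves far more (feature matching, camera pose estimation, dense fusion).

```python
def triangulate(u_left, v, u_right, focal_px, baseline_m, cx, cy):
    """Recover a 3D point (in meters) from one pixel matched across a rectified stereo pair."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("a point in front of both cameras must have positive disparity")
    z = focal_px * baseline_m / disparity   # depth: Z = f * B / d
    x = (u_left - cx) * z / focal_px        # back-project through the pinhole model
    y = (v - cy) * z / focal_px
    return (x, y, z)


# Hypothetical camera: 800 px focal length, 0.5 m baseline, principal point (320, 240).
point = triangulate(u_left=420, v=260, u_right=400,
                    focal_px=800, baseline_m=0.5, cx=320, cy=240)
print(point)
```

Repeating this for many matched pixels yields a point cloud, the raw material from which a city or cave model can be built.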

Social Distancing Tool – computer vision application during COVID-19

Computer vision was actively used during the pandemic, when keeping social distance became an integral part of people’s daily life. It can track people indoors or in a specific area to determine whether they are following social distancing norms. A social distancing tool tracks objects in real time: computer vision algorithms identify people in the video and automatically calculate the distance between them, detecting violations. [1]
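The distance check itself can be sketched in a few lines once people have been detected. The example assumes that each person's position has already been mapped from image pixels to ground-plane coordinates in meters (a step not shown here), and the 2.0 m threshold is an illustrative assumption, not a value from the cited source.

```python
from itertools import combinations
from math import hypot


def find_violations(positions, min_distance_m=2.0):
    """Return pairs of person ids that stand closer together than min_distance_m."""
    violations = []
    for (id_a, (xa, ya)), (id_b, (xb, yb)) in combinations(sorted(positions.items()), 2):
        if hypot(xa - xb, ya - yb) < min_distance_m:
            violations.append((id_a, id_b))
    return violations


# Hypothetical ground-plane positions in meters, one entry per detected person.
people = {"p1": (0.0, 0.0), "p2": (1.2, 0.5), "p3": (5.0, 5.0), "p4": (5.5, 6.0)}
print(find_violations(people))
```

Checking every pair is quadratic in the number of people, which is fine for a single camera view; crowded scenes would call for a spatial index.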

Deep Learning in Building Computer Vision Applications

Deep Learning (DL) takes its name from the large number of layers in neural networks. Thanks to the constant growth of computing power in recent years, we can now train increasingly complex neural networks with more layers. Such sophisticated models capture patterns hidden in data better than “shallow” neural networks.

Computer vision uses a special type of neural network called the Convolutional Neural Network (CNN). Its convolutional layers slide small 2D filters across the image and learn from correlations between neighboring pixels. During training, a CNN sees the images many times, constantly tweaking its parameters to improve the outcome.
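What a single convolutional filter computes can be shown without any framework. The sketch below implements the sliding-window operation (strictly speaking cross-correlation, which deep learning libraries call convolution) with no padding and stride 1. The vertical-edge kernel is written by hand for illustration; in a CNN such weights are learned from data.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# 4x6 grayscale image: dark left half (0), bright right half (1).
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
# Hand-written vertical-edge filter; a CNN would learn such weights during training.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = conv2d(image, kernel)
print(feature_map)  # → [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The output is zero over the flat regions and non-zero only where the window straddles the dark-to-bright boundary, which is how a filter turns local pixel correlations into a feature.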

Real-life Examples of Computer Vision Applications

Retail Shelf analysis

Automatic product detection makes it possible to spot missing and misplaced products on shelves by comparing them with the planogram. Aggregated information about shop conditions gives the opportunity to improve the quality of customer service.
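The comparison step can be sketched as follows, assuming the detector's output has already been reduced to one product name per shelf slot. The slot ids, product names and the function `compare_with_planogram` are hypothetical.

```python
def compare_with_planogram(planogram, detected):
    """Both arguments map a shelf slot id to a product name (None means an empty slot)."""
    missing, misplaced = [], []
    for slot, expected in planogram.items():
        found = detected.get(slot)
        if found is None:
            missing.append((slot, expected))
        elif found != expected:
            misplaced.append((slot, expected, found))
    return missing, misplaced


planogram = {"A1": "cola", "A2": "water", "A3": "juice", "B1": "chips"}
detected = {"A1": "cola", "A2": "juice", "A3": None, "B1": "chips"}  # hypothetical detections

missing, misplaced = compare_with_planogram(planogram, detected)
print("missing:", missing)
print("misplaced:", misplaced)
```

Aggregating these two lists over time and across stores gives the kind of shop-condition report mentioned above.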

X-ray analysis

Computer vision applications are actively used at airports. Image analysis and computer vision algorithms can automate the discovery of illicit items in luggage during customs inspection. Such a mundane task is ideal for Convolutional Neural Networks, given the huge size of the available dataset.

Automatic video tagging for real-time marketing

One more computer vision example is automatic tagging for real-time marketing. This technology will improve the advertising industry by making it more personalized. For example, after tagging a customer’s favorite brands and gaining deep insights into their preferences, we can recommend products with a higher probability of being chosen. It is a win-win situation for both customers (more relevant ads) and e-commerce businesses (higher income).

Real estate valuation

Recognizing faces in security systems

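Face recognition makes identification easier for security officers and ordinary people alike: there is no more need for additional cards or keys. It can also determine whether somebody is a wanted criminal. This is a good example of how computer vision applications can be implemented even in security systems.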

Automatic reading of personal information from identity cards

Computer vision algorithms protect against misspellings and are much faster than reading the information manually. This technique has the potential to simplify maintaining a customer database and improve the quality of data.
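One way automated reading can guard against errors is by validating what was read. Many identity documents carry a machine-readable zone (MRZ) whose fields end in a check digit defined by ICAO Doc 9303: characters are weighted 7, 3, 1 repeatedly and the sum is taken modulo 10. The sketch below assumes the card has such a zone and that the text has already been extracted by OCR; the source article does not say which documents or checks are used.

```python
def mrz_check_digit(field):
    """Check digit per ICAO Doc 9303: weights 7, 3, 1 repeating, sum modulo 10."""
    weights = (7, 3, 1)
    total = 0
    for position, char in enumerate(field):
        if char in "0123456789":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10   # A=10 ... Z=35
        elif char == "<":
            value = 0                           # filler character
        else:
            raise ValueError(f"invalid MRZ character: {char!r}")
        total += value * weights[position % 3]
    return total % 10


def is_valid(field, check_digit):
    """True when the field read by OCR is consistent with its printed check digit."""
    return mrz_check_digit(field) == int(check_digit)


# Document number from the ICAO specimen passport, followed by its check digit 6.
print(is_valid("L898902C3", "6"))   # correctly read field
print(is_valid("L8989O2C3", "6"))   # OCR confused the digit 0 with the letter O
```

A failed check tells the system to re-scan or ask for manual confirmation instead of silently storing a wrong value in the customer database.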

Industrial maintenance

Image analysis and computer vision techniques use data from cameras to visually check the condition of assets, for example, valves and pipes, and compare it with optimal conditions. This information can be transferred to a remote maintenance crew that checks anomalies. This type of computer vision application saves time, as there is no need to manually check everything, and makes work easier.
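As a deliberately simple stand-in for "compare it with optimal conditions", the sketch below measures how far a current grayscale image of an asset deviates from a reference image and flags it for the maintenance crew above a threshold. The pixel values and the threshold of 10.0 are assumptions for illustration; a production system would more likely rely on a trained model and on images aligned to the same viewpoint.

```python
def mean_abs_difference(reference, current):
    """Average per-pixel absolute difference between two equally sized grayscale images."""
    total = 0
    count = 0
    for ref_row, cur_row in zip(reference, current):
        for ref_px, cur_px in zip(ref_row, cur_row):
            total += abs(ref_px - cur_px)
            count += 1
    return total / count


def needs_inspection(reference, current, threshold=10.0):
    """Flag the asset when it deviates from the reference by more than the threshold."""
    return mean_abs_difference(reference, current) > threshold


# Hypothetical 2x3 grayscale patches of a pipe section (0 = black, 255 = white).
reference = [[100, 100, 100], [100, 100, 100]]
healthy = [[101, 99, 100], [100, 102, 100]]
corroded = [[100, 60, 55], [100, 58, 100]]

print(needs_inspection(reference, healthy))
print(needs_inspection(reference, corroded))
```

Only the flagged images need to reach the remote crew, which is where the time saving comes from.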

Self-driving cars

No discussion of computer vision applications is complete without self-driving cars. Computer vision algorithms use data gathered from sensors to drive a car safely from point A to point B. They can automate our commutes and make life a lot easier, especially for elderly and disabled people. On the one hand, this computer vision example could increase car usage (and hence traffic). On the other hand, it can prevent accidents and reduce the number of cars by automating the taxi system with self-driving cars, so that there is no need to own a private car.

Computer vision application. Source: gcn.com

Automatic Harvesting

Thanks to computer vision applications, high-tech intelligent agricultural machines such as harvesting robots are actively used in the agriculture sector. Examples include automatic picking of cucumbers in greenhouses and automatic identification of cherries in their natural environment. [2]

Computer vision in agriculture. Source: automate.org

Parking Occupancy Monitoring

Based on deep Convolutional Neural Networks (CNN), computer vision applications provide decentralized and efficient methods for visually determining parking occupancy. There are numerous datasets available for parking monitoring, including PKLot and CNRPark-EXT. [2]

Python – The Best Open-source Tool for Computer Vision Applications

Training Convolutional Neural Networks using Python has become easier thanks to a great abundance of libraries to choose from. Below we present the most popular ones:

Caffe

Caffe is a framework built specifically for deep learning. Developed at UC Berkeley, it is one of the best libraries for computer vision applications. Models are defined in configuration files rather than in code, which some may find a drawback. It is not written in Python but provides Python bindings. Caffe is known to be fast: on an Nvidia K40 GPU, for example, it can run inference on an image in about 1 ms and learn from it in about 4 ms.

Theano

Theano is one of the oldest Python libraries for operating on multi-dimensional arrays and training neural networks. It is integrated with NumPy. It also offers efficient symbolic differentiation, speeds up expression evaluation through dynamic C code generation, and can automatically diagnose many types of errors. Its active development ended in late 2017, but it is still a decent library to use for your project.

TensorFlow

TensorFlow was designed by the Google Brain team and released as an open-source library for numerical computation on tensors. It is a low-level library, mature enough to serve as the backbone of many sophisticated projects, with decent documentation and a vast community. TensorFlow’s main advantage over Theano is multi-GPU support. It has two APIs: the original low-level one and the high-level Keras.

Lasagne

Lasagne is built on top of Theano with the intention of being simple to understand and use; its functions directly process and return Theano expressions or NumPy data types. Lasagne allows defining Convolutional Neural Networks, Recurrent Neural Networks, and combinations of the two. It supports CPU and GPU thanks to Theano’s compiler. In terms of abstraction level, it sits in the middle, somewhere between low-level libraries like TensorFlow or Theano and high-level libraries like Keras.

Keras

Keras is a high-level library that uses TensorFlow, CNTK, or Theano as a back-end. It is officially supported by Google, which has integrated it into TensorFlow and backs its development. Keras positions itself as an API designed for “human beings”. It focuses on simplicity, so creating networks is fast and intuitive. Model architecture is divided into fully configurable modules such as neural layers, optimizers (Adam, RMSProp), and cost functions. It includes built-in models like ResNet50, InceptionV3, or MobileNet.

MXNet

MXNet allows using many GPUs in distributed systems. It also makes it easy to control where each piece of data is stored in the system. The library has built-in methods for fast derivative calculations. Every built-in layer has been optimized, making MXNet one of the fastest available computer vision libraries. However, getting started with modeling takes more time than with Keras.

Key Takeaways

  • Computer vision is great for object detection, scene segmentation, 3D scene reconstruction, medical imaging, and more.
  • There are many real-life examples of computer vision applications: X-ray luggage inspection, self-driving cars, real estate valuation, social distancing tools, and more.
  • Python is the best open-source tool for computer vision applications.
  • Training Convolutional Neural Networks using Python has become easier thanks to a great abundance of libraries to choose from: Caffe, Theano, Keras, Lasagne, and more.

Check our case studies to better understand real-world use cases of computer vision and deep learning. Contact us to arrange a consultation on how computer vision algorithms can be implemented in your business.


