In an advance in computer vision, engineers at the University of California, Los Angeles (UCLA) and Stanford University have demonstrated an AI system that can find and identify real-world objects it sees, mimicking the way humans learn visually.
Computer vision allows computers to read and identify visual images. The new system could be a step toward more general artificial intelligence: computers that are intuitive, learn on their own, and make decisions based on reasoning and on interactions with humans, in a more human-like way.
Current AI computer vision systems are increasingly capable and powerful, but they are not designed to learn on their own. They are task-specific, and their ability to identify or recognize an object is limited by how much training and programming they receive from a human operator. These systems cannot piece together a complete picture of an object after seeing only a few of its parts, or build a common-sense model of familiar objects the way humans do. Humans can recognize a dog even when only its paws are visible or the animal is hiding behind a chair; engineers have long aimed to build systems with such abilities, which still elude most AI.
The newly developed method, which works around these shortcomings, is detailed in the journal Proceedings of the National Academy of Sciences.
The new approach is composed of three major steps:
- The system breaks an image up into small chunks, which the team named ‘viewlets’.
- The system learns how these viewlets fit together to form the object in question.
- Then, it looks at other objects in the surrounding scene, whether or not they are relevant to identifying and describing the primary object.
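The three steps above can be sketched in toy form. This is a minimal illustration, not the researchers' actual method: the function names, the patch-counting stand-in for viewlet learning, and the threshold are all hypothetical choices made here for clarity.

```python
from collections import Counter

# Step 1 (illustrative): break a 2-D grid of pixels into non-overlapping
# patch_size x patch_size chunks, the toy analogue of "viewlets".
def split_into_viewlets(image, patch_size):
    h, w = len(image), len(image[0])
    viewlets = []
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            viewlets.append(((top, left), patch))
    return viewlets

# Step 2 (illustrative): learn which viewlets recur across many example
# images, approximated here by counting how often each patch position
# contains any non-zero pixels.
def cooccurrence_counts(images, patch_size):
    counts = Counter()
    for image in images:
        for pos, patch in split_into_viewlets(image, patch_size):
            if any(any(px for px in row) for row in patch):
                counts[pos] += 1
    return counts

# Step 3 (illustrative): positions that recur often enough are treated as
# parts of the object model; everything else is surrounding context.
def object_model(images, patch_size, threshold):
    counts = cooccurrence_counts(images, patch_size)
    return {pos for pos, c in counts.items() if c >= threshold}
```

For example, given three 4x4 images that each contain a blob of 1s in the top-left corner, `object_model(images, 2, 3)` returns `{(0, 0)}`: only the top-left viewlet position recurs in all three images, so only it enters the toy object model.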
To enable the system to learn more like a human, the researchers immersed it in an internet replica of the environment humans live in. According to principal investigator Vwani Roychowdhury, Professor of Electrical and Computer Engineering at UCLA, the internet offered two things that let the brain-inspired computer vision system learn as a human would:
- A wealth of images and videos that depict the same types of objects.
- Those objects shown from varying perspectives – up close, bird’s-eye, partially obscured – and in many types of environments.
To develop the framework, the researchers drew insights from neuroscience and cognitive psychology. Humans learn through many examples seen in many contexts, Roychowdhury said. Contextual learning is a key feature of the brain, he added, and it helps humans develop robust models of objects.
The research team tested the new system on roughly 9,000 images, each showing people and other objects. It was able to build a detailed model of the human body without the images being labelled and without any external guidance. Similar tests were run with images of bikes, cars, and airplanes. In most cases, the new AI system performed as well as conventional computer vision systems that had been developed through years of training.