MIT's computer vision could make robots more useful

Written By DNA Web Team | Updated: Sep 10, 2018, 04:55 PM IST

Robots of the future could be useful in homes and offices, thanks to MIT scientists who have developed an advanced computer vision system that enables machines to inspect random objects and accomplish specific tasks. Breakthroughs in computer vision have enabled robots to make basic distinctions between objects. However, the systems do not truly understand objects' shapes, so there is little the robots can do after a quick pick-up.

The new system, created by researchers at the Massachusetts Institute of Technology (MIT) in the US and called Dense Object Nets (DON), looks at objects as collections of points that serve as a kind of visual roadmap. This approach lets robots better understand and manipulate items and, most importantly, even allows them to pick up a specific object from a clutter of similar things. "Many approaches to manipulation can't identify specific parts of an object across the many orientations that object may encounter," said Lucas Manuelli, a PhD student at MIT.

The team sees potential applications not just in manufacturing settings, but also in homes. "Imagine giving the system an image of a tidy house, and letting it clean while you're at work, or using an image of dishes so that the system puts your plates away while you're on vacation," researchers said. None of the data was labelled by humans. Instead, the system is what the team calls "self-supervised": it does not require any human annotations.

The DON system essentially creates a series of coordinates on a given object, which serve as a kind of visual roadmap, to give the robot a better understanding of what it needs to grasp, and where. The team trained the system to look at objects as a series of points that make up a larger coordinate system.

It can then map different points together to visualise an object's 3D shape, similar to how panoramic photos are stitched together from multiple photos. After training, if a person specifies a point on an object, the robot can take a photo of that object, identify and match points, and then pick up the object at the specified point. This is different from systems like DexNet, which can grasp many different items but can't satisfy a specific request, researchers said.
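The point-matching step described above can be illustrated with a minimal Python sketch. It is not the MIT team's code. It assumes, purely for illustration, that every pixel in an image has already been given a short numeric descriptor (here a random 3-number vector per pixel), and that matching means finding the pixel in a new image whose descriptor is closest to the one at the point the person picked. The function name `match_point` and the toy data are hypothetical.

```python
import numpy as np


def match_point(ref_desc, target_desc, ref_pixel):
    """Find the pixel in target_desc whose descriptor is nearest to the
    descriptor stored at ref_pixel in ref_desc.

    ref_desc, target_desc: arrays of shape (height, width, depth), one
    descriptor vector per pixel (an assumption made for this sketch).
    ref_pixel: (row, col) of the point the person specified.
    Returns ((row, col), distance) for the best match.
    """
    row, col = ref_pixel
    query = ref_desc[row, col]  # descriptor of the chosen point
    # Euclidean distance from the query to every pixel's descriptor
    dists = np.linalg.norm(target_desc - query, axis=-1)
    best = np.unravel_index(np.argmin(dists), dists.shape)
    return (int(best[0]), int(best[1])), float(dists[best])


# Toy data: random descriptors stand in for a trained network's output.
rng = np.random.default_rng(0)
ref = rng.normal(size=(4, 5, 3))

# Simulate the object appearing two columns to the right in a new photo.
target = np.roll(ref, shift=2, axis=1)

pixel, dist = match_point(ref, target, (1, 1))
print(pixel, dist)
```

In this toy case the chosen point at row 1, column 1 is found again at row 1, column 3 of the shifted image, since its descriptor moved there unchanged. A real system would have to cope with descriptors that only approximately match across viewpoints.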

In one set of tests done on a soft caterpillar toy, a robotic arm powered by DON could grasp the toy's right ear from a range of different configurations. This showed that, among other things, the system has the ability to distinguish left from right on symmetrical objects. When tested on a bin of different baseball hats, DON could pick out a specific target hat despite all of the hats having very similar designs, and despite never having seen pictures of the hats in its training data.