Bridging the gap between human and machine vision
Suppose you look briefly from a few feet away at a person you have never met before. Step back a few paces and look again. Will you be able to recognize her face? “Yes, of course,” you probably are thinking. If this is true, it would mean that our visual system, having seen a single image of an object such as a specific face, recognizes it robustly despite changes to the object’s position and scale, for example. On the other hand, we know that state-of-the-art classifiers, such as vanilla deep networks, will fail this simple test. In order to recognize a specific face under a range of transformations, neural networks need to be trained with many examples of the face under the different conditions. In other words, they can achieve invariance through memorization, but cannot do it if only one image is available. Thus, understanding how human vision can pull off...