Stanford professor Marc Levoy, fresh off a two-year leave to work on Google Glass, recently spoke to a packed house at Stanford’s Center for Image Engineering (SCIEN) about the new era of photography that Glass, and other increasingly powerful wearable cameras, have begun to usher in. While many of the new applications have already been discussed, such as first-person video and the ability to take pictures without breaking eye contact, Levoy explained that those are only the tip of the iceberg. He believes that the combination of computational imaging and new-form-factor, camera-equipped devices will enable a set of what he described as “superhero vision” capabilities.
Rapidly increasing processor power will help fuel this new generation of photographic tools. Levoy, a pioneer in both computer graphics and computational imaging, noted that GPU power is growing by roughly 80% per year, while megapixel counts are growing by only about 20%. That means more horsepower available to process each pixel, with the surplus widening every year. Coupled with near-real-time multi-frame capture, that extra compute lets photography stretch beyond the bounds of a single image.
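A quick back-of-the-envelope check, treating the quoted rates as annual compounding, shows why this matters: the GPU cycles available per captured pixel grow by the ratio of the two trends,

(1 + 0.80) / (1 + 0.20) = 1.5

or roughly 50% more compute per pixel each year. That steadily widening gap is the headroom that makes the multi-frame techniques described below increasingly practical.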
See in the dark with HDR
While our eyes and brain are reasonably good at adapting to a variety of lighting conditions, we know from how well some animals can see “in the dark” that more is possible. Digital cameras have the luxury of leaving their shutter open long enough to gather photons even at very low light levels. By combining those long exposures with shorter ones, high-dynamic-range (HDR) scenes can be captured. It doesn’t take much imagination to see how a wearable camera could provide virtual-reality or head-up assistance to the wearer, allowing them to see into the shadows or even in largely dark rooms. Newer smartphones from Apple and others have some HDR capability built in, but it will take integration with a wearable computer to use HDR to augment our own vision.
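For the curious, here is a minimal sketch of the kind of exposure merging involved, assuming a burst of already-aligned frames with a linear sensor response (real pipelines also recover the camera’s response curve and register the frames, neither of which is shown here):

```python
import numpy as np

def merge_exposures(frames, exposure_times):
    """Merge a burst of differently exposed frames into one HDR
    radiance estimate via exposure-weighted averaging.

    frames: list of float arrays in [0, 1], aligned, linear response.
    exposure_times: matching list of shutter times in seconds.
    """
    radiance = np.zeros_like(frames[0], dtype=np.float64)
    weights = np.zeros_like(radiance)
    for img, t in zip(frames, exposure_times):
        img = img.astype(np.float64)
        # Hat weighting: trust mid-tones, discount clipped shadows/highlights.
        w = 1.0 - 2.0 * np.abs(img - 0.5)
        radiance += w * (img / t)  # dividing by exposure time estimates scene radiance
        weights += w
    return radiance / np.maximum(weights, 1e-8)
```

The long exposures contribute usable data in the shadows while the short ones preserve the highlights, which is exactly the trade the eye cannot make on its own.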
Now is probably a good time to define the different ways a wearable display can change what the viewer sees. Augmented reality adds information to the normal field of view; head-up displays are a common example. By contrast, virtual reality completely synthesizes the wearer’s world view, although combined with camera input a VR system can certainly be programmed to present augmented reality. Google Glass isn’t technically either of these, although Levoy groups it with the head-up AR category. It displays information above and to the side of the wearer’s normal field of view, so it doesn’t intrude on their attention unless they deliberately look at it. That makes Glass limited in what it can present, but also more practical and easier to get used to in the short term.
Removing objects from photographs by using bits of multiple frames isn’t new. Above you can see an example of how I removed a pony with a red blanket and some other annoyingly placed tourists from my evening shot of the always-popular Angkor Wat temple. I stacked several images (including the two smaller ones shown), then selected the Median value for each pixel to get rid of elements that appeared in only some of the frames. What’s new is that the combination of high native frame rates and increased computational power will begin to make this special effect achievable in real time, perhaps even as a form of augmented-reality vision. The computational power is needed to align consecutive frames even while the wearer is moving, and then to quickly filter out the changing elements.
Today, a powerful tool like Photoshop is required to combine images like these and remove the distractions, but in the future the same result will be achievable from real-time video on powerful GPUs, as sketched below.
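As an illustration, here is a minimal sketch of the per-pixel median technique, assuming the frames have already been registered to one another (the hard part in a wearable scenario, since the camera itself is moving):

```python
import numpy as np

def remove_transients(frames):
    """Suppress objects that appear in only some frames by taking
    the per-pixel median across a stack of aligned exposures.

    frames: list of aligned images (equal shape, uint8 or float).
    """
    stack = np.stack(frames, axis=0).astype(np.float32)  # (N, H, W, C)
    # A transient object occupies any given pixel in only a minority of
    # frames, so the median sees it as an outlier and discards it.
    return np.median(stack, axis=0)
```

A real-time version would need fast frame registration to compensate for the wearer’s head motion before the median step, which is exactly where the growing per-pixel GPU budget comes in.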
Slow motion & motion detection
Motion is often especially important to us. Our brains are wired to detect it, but not nearly as well as those of many animals. Computer software, by amplifying the variation between frames in a video clip, can make otherwise invisible motion visible.
William Freeman’s group at MIT has done some amazing work on augmenting video clips to amplify motion, making moving objects much easier to isolate and identify. The most familiar example is the group’s demo of measuring a person’s pulse with a simple camera, but they have extended the technique to making motion more visible in generalized scenes. In addition to its use for medical diagnostics, law enforcement and the military are likely candidates to benefit from an augmented ability to detect movement.
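The core idea behind this kind of motion amplification (the MIT work is known as Eulerian video magnification) can be sketched in a few lines: band-pass filter each pixel’s intensity over time, scale the filtered signal, and add it back. The sketch below skips the spatial pyramid decomposition the real method uses and operates on raw grayscale intensities, so treat it as illustrative rather than a faithful reimplementation:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def amplify_motion(frames, fps, low_hz=0.8, high_hz=2.0, alpha=20.0):
    """Crude Eulerian-style magnification: temporally band-pass each
    pixel, amplify the filtered variation, and add it back.

    frames: list of grayscale frames (H, W), float values in [0, 255].
    low_hz/high_hz: frequency band of interest, e.g. ~1 Hz for a resting pulse.
    alpha: amplification factor applied to the filtered signal.
    """
    video = np.stack(frames, axis=0).astype(np.float64)  # (T, H, W)
    nyquist = fps / 2.0
    b, a = butter(2, [low_hz / nyquist, high_hz / nyquist], btype="band")
    variation = filtfilt(b, a, video, axis=0)  # temporal filter, per pixel
    return np.clip(video + alpha * variation, 0.0, 255.0)
```

For pulse detection, the tiny periodic color change in skin falls inside that ~1 Hz band; amplifying it by a factor of alpha makes a heartbeat plainly visible in ordinary video.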