Maybe the biggest surprise of Google’s hardware event today was the launch of Clips,
a small stand-alone AI-driven camera that can capture up to three hours
of video and images and then automatically select the best moments for
you. I’m not sure how well Clips will do in the marketplace, but
technically, it’s a fascinating product.
During my conversation with Clips product lead Juston Payne, he
repeatedly stressed that Clips is not an accessory to the Pixel — or
anything else, really. “It’s an accessory to anything, I’d say. It’s a
stand-alone camera. A new type of camera and insofar as that any digital
camera has become an accessory to a computer or a phone, so too with
this,” he said. “The reason for that comes back to the fact that the
intelligence is built into the device to decide when to take these
shots, which is really important because it gives users total control
over it.”
So unlike a product like Google Home, which fully relies on being
connected to the cloud, Clips is pretty much a self-contained unit. It
takes your images (probably after you set it down in your living room
to play with your kids), runs its pre-trained machine learning
algorithms to find the best ones and then automatically generates your
clips and picks your best images for you.
This means it just works, whether you are an iOS or Android
user (though it comes with an app that lets you see the clips on the
device and share them). And the device reflects this simplicity, with
its one button (for manually starting recording) and straightforward
design.
“We care very deeply about privacy and control and it was one of the
hardest parts of the whole project,” Payne told me. “The thing is that
until really quite recently, you needed at least a desktop or you needed
literally a server farm to take imagery in, run convolutional neural
networks against them, do semantic analysis and then spit something
out.”
Only recently has silicon evolved to the point where a company like
Google can put all of this into a small device like Clips. Indeed, when
you hold Clips, it’s surprisingly small (and disappointingly, it doesn’t
feature a built-in clip, though you can put it into a little plastic
housing that features a clip). Most of the weight is probably the
battery, which should last about three hours, and the camera unit
itself, which features a pretty wide-angle view.
To run its models on the camera, Google went to Intel’s Movidius and its
extremely low-power vision processing unit (VPU). “In our collaboration
with the Clips team, it has been remarkable to see
how much intelligence Google has been able to put right into a small
device like Clips,” said Remi El-Ouazzane, vice president and general
manager of Movidius, Intel New Technology Group, in his company’s own announcement
today. “This intelligent camera truly represents the level of onboard
intelligence we dreamed of when developing our Myriad VPU technology.”
Every AI model needs to be trained, though, and to train Clips,
Google worked with video editors and an army of image raters.
“There’s not a great ML [machine learning] model that
can say: there’s a baby crawling on the floor, that probably looks
good,” explained Payne. So Google collected a lot of its own video. It
then had editors on staff look at the content and say what they liked —
and then the labelers looked at the clips and decided which ones they
liked better, which became the training material for the model.
Over time, the unit learns which people you care about and which images you are interested in.
But there’s a drawback here, too. For now, Clips is great for finding
images of people and pets (or really, cats and dogs — not pet pigs).
It’s not a device you can take on a vacation and expect it to find the
best images for you. Over time, Google plans to expand the machine
learning model on the device to include support for more situations, but
right now, it’s probably best as a device for young families.
“We’re starting with a focus and then we’ll build out from there,”
explained Payne. “Right now, it doesn’t understand the world in
general.”
Over time, Clips will understand more of the world. At $249, it’s
definitely an expensive device, though I wouldn’t be surprised if Clips
caught on and made regular appearances on baby shower registries.