Planning a family vacation? Chances are good you’ve scoured Google for information and photos of potential destinations like the Grand Canyon or New York City. Google has made it their business to make finding things as easy as possible for the masses, and we’ve proven over and over again that people are on an endless quest to find things. Plus, having some decent online pics of Rockefeller Center to present to the kids never hurts the cause on the vacation front, right?
Those picture searches lead us to Google’s latest endeavour, christened PlaNet: a machine heavy on artificial intelligence, trained to recognize details and objects within an image in order to ‘memorize’ them. PlaNet focuses on photo geolocation of objects, buildings, even animals and plants, but without all that geotagging silliness normally required to do so.
Google engineers Tobias Weyand, Ilya Kostrikov, and James Philbin recently released their paper detailing the science and math behind how PlaNet works, and how it stacks up against the human brain when it comes to identifying what a building or landmark is, or where a photo was taken, provided it’s a relatively clear photo, of course (c’mon, cut it some slack for that).
The (very) simplified breakdown of the PlaNet ‘brain’ is this: Weyand, Kostrikov and Philbin devised a grid system that had the earth covered in 26,000 boxes. The size of each box varied depending on how many geotagged photos or images existed for that particular location. For example, New York City? Millions of pictures available, so smaller squares are needed. Antarctica? Not so many visuals around, so bigger squares as a result.
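The idea of sizing cells by photo density can be sketched with a simple recursive subdivision. To be clear, this is a hypothetical illustration, not PlaNet’s actual code: the paper uses Google’s S2 cell hierarchy on the sphere, while the toy version below just splits a flat latitude/longitude box in four whenever it holds too many photos.

```python
def build_grid(photos, box, max_per_cell=10_000, min_size=0.05):
    """Recursively split `box` = (lat_min, lat_max, lon_min, lon_max)
    until each cell holds at most `max_per_cell` geotagged photos.

    `photos` is a list of (lat, lon) pairs. Dense areas (think New York)
    end up covered by many small cells; empty ones (think Antarctica)
    stay as a few big cells. Returns the list of non-empty leaf boxes.
    """
    lat_min, lat_max, lon_min, lon_max = box
    inside = [(la, lo) for la, lo in photos
              if lat_min <= la < lat_max and lon_min <= lo < lon_max]
    # Stop splitting once the cell is sparse enough or already tiny.
    if len(inside) <= max_per_cell or (lat_max - lat_min) <= min_size:
        return [box] if inside else []
    lat_mid = (lat_min + lat_max) / 2
    lon_mid = (lon_min + lon_max) / 2
    cells = []
    for sub in [(lat_min, lat_mid, lon_min, lon_mid),
                (lat_min, lat_mid, lon_mid, lon_max),
                (lat_mid, lat_max, lon_min, lon_mid),
                (lat_mid, lat_max, lon_mid, lon_max)]:
        cells += build_grid(inside, sub, max_per_cell, min_size)
    return cells
```

With a grid like this in hand, geolocation turns into a classification problem: the network’s job is to pick the right cell for a photo, rather than regress exact coordinates.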
From there, 91 million geotagged images were placed in a database, and PlaNet was given the task of learning where each image was taken in relation to the boxed grid system already in place. To test how well PlaNet had done with its homework, the team then culled 2.3 million geotagged images from Flickr and presented them to PlaNet, minus the geotags.
PlaNet correctly identified the continent on which the photo was taken 48 percent of the time, the country 28.4 percent, and the city 10.1 percent. PlaNet even had a 3.6 percent success rate with street-level views. Not too shabby for something plugged into a wall outlet.
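Scores like these are typically computed by checking whether the predicted location lands within some distance of the photo’s true spot, with a different radius for each scale (street, city, country, continent). A minimal sketch of that scoring, assuming a simple great-circle (haversine) distance and illustrative thresholds of my own choosing:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))  # mean Earth radius ~6371 km

def accuracy_at(preds, truths, threshold_km):
    """Fraction of predicted (lat, lon) points within `threshold_km`
    of the corresponding true location."""
    hits = sum(haversine_km(p, t) <= threshold_km
               for p, t in zip(preds, truths))
    return hits / len(truths)
```

Running `accuracy_at` once per radius (say, 1 km for street level, 25 km for city, and so on up to continent scale) yields exactly the kind of per-scale percentages quoted above.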
Those numbers may seem lackluster on the surface, but compared to human capabilities they’re quite impressive. PlaNet took on ten worldly human travellers in a friendly online game of Geoguessr and showed everyone who’s boss, winning 28 of 50 rounds played.
The developers and Google are being tight-lipped about possible uses or markets for PlaNet, but since the current core program is relatively reasonable in size (377 MB), don’t be surprised if you’re asking Siri to open it for you sometime down the road.