TECH PUNDIT TIM O’Reilly had just tried the new Google Photos app, and he was amazed by the depth of its artificial intelligence.
O’Reilly was standing a few feet from Google CEO and co-founder Larry Page this past May, at a small cocktail reception for the press at the annual Google I/O conference—the centerpiece of the company’s year. Google had unveiled its personal photos app earlier in the day, and O’Reilly marveled that if he typed something like “gravestone” into the search box, the app could find a photo of his uncle’s grave, taken long ago.
The app uses an increasingly powerful form of artificial intelligence called deep learning. By analyzing thousands of photos of gravestones, this AI technology can learn to identify a gravestone it has never seen before. The same goes for cats and dogs, trees and clouds, flowers and food.
O’Reilly wondered whether Google would offer its deep learning engine as a cloud service. Well, this morning, Google took the idea further than even he expected. It’s not selling access to its deep learning engine. It’s open sourcing that engine, freely sharing the underlying code with the world at large. This software is called TensorFlow, and in literally giving the technology away, Google believes it can accelerate the evolution of AI. Through open source, outsiders can help improve on Google’s technology and, yes, return these improvements back to Google.
“What we’re hoping is that the community adopts this as a good way of expressing machine learning algorithms of lots of different types, and also contributes to building and improving [TensorFlow] in lots of different and interesting ways,” says Jeff Dean, one of Google’s most important engineers and a key player in the rise of its deep learning tech.
In recent years, other companies and researchers have also made huge strides in this area of AI, including Facebook, Microsoft, and Twitter. And some have already open sourced software that’s similar to TensorFlow. This includes Torch—a system originally built by researchers in Switzerland—as well as systems like Caffe and Theano. But Google’s move is significant. That’s because Google’s AI engine is regarded by some as the world’s most advanced—and because, well, it’s Google.
“This is really interesting,” says Chris Nicholson, who runs a deep learning startup called Skymind. “Google is five to seven years ahead of the rest of the world. If they open source their tools, this can make everybody else better at machine learning.”
To be sure, Google isn’t giving away all its secrets. At the moment, the company is only open sourcing part of this AI engine. It’s sharing only some of the algorithms that run atop the engine. And it’s not sharing access to the remarkably advanced hardware infrastructure that drives this engine (that would certainly come with a price tag). But Google is giving away at least some of its most important data center software, and that’s not something it has typically done in the past.
Google became the Internet’s most dominant force in large part because of the uniquely powerful software and hardware it built inside its computer data centers—software and hardware that could help run all its online services, that could juggle traffic and data from an unprecedented number of people across the globe. And typically, it didn’t share its designs with the rest of the world until it had moved on to other designs. Even then, it merely shared research papers describing its tech. The company didn’t open source its code. That’s how it kept an advantage.
With TensorFlow, however, the company has changed tack, freely sharing some of its newest—and, indeed, most important—software. Yes, Google open sources parts of its Android mobile operating system and so many other smaller software projects. But this is different. In releasing TensorFlow, Google is open sourcing software that sits at the heart of its empire. “It’s a pretty big shift,” says Dean, who helped build so much of the company’s groundbreaking data center software, including the Google File System, MapReduce, and BigTable.
Deep learning relies on what are called neural networks: systems that mimic the web of neurons in the human brain. Typically, Google trains these neural nets using a vast array of machines equipped with GPU chips—computer processors that were originally built to render graphics for games and other highly visual applications, but have also proven quite adept at deep learning. GPUs are good at processing lots of little bits of data in parallel, and that’s what deep learning requires.
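To see why deep learning maps so naturally onto GPUs, consider that the core operation in a neural network layer is a matrix multiply: many small, independent multiply-adds, each of which a GPU can compute in parallel. The sketch below is plain Python for illustration only, not TensorFlow code or Google's implementation.

```python
# Illustration (not TensorFlow code): a neural-network layer is a matrix
# multiply, i.e. many independent multiply-adds. Each output cell can be
# computed in parallel, which is exactly the workload GPUs were built for.

def matmul(a, b):
    """Multiply an m x k matrix by a k x n matrix.

    Each output cell is an independent dot product; a GPU would compute
    all of them at once rather than looping as we do here.
    """
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]

# A batch of 3 inputs (rows) pushed through a layer with 2 units:
inputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [[0.5, -0.5], [0.25, 0.75]]
print(matmul(inputs, weights))
```

The batch dimension (three inputs at once) is the other axis of parallelism: training feeds thousands of examples through the same weights simultaneously, which is why a rack of GPUs pays off.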
But after they’ve been trained—when it’s time to put them into action—these neural nets run in different ways. They often run on traditional computer processors inside the data center, and in some cases, they can run on mobile phones. The Google Translate app is one mobile example. It can run entirely on a phone—without connecting to a data center across the ‘net—letting you translate foreign text into your native language even when you don’t have a good wireless signal. You can, say, point the app at a German street sign, and it will instantly translate into English.
TensorFlow is a way of building and running these neural networks—both at the training stage and the execution stage. It’s a set of software libraries—a bunch of code—that you can slip into any application so that it too can learn tasks like image recognition, speech recognition, and language translation.
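The two stages the article describes can be made concrete with a toy model. The sketch below is plain Python, not TensorFlow's actual API: a training stage that fits parameters to example data, and an execution stage that runs the learned model on new input.

```python
# A toy sketch (plain Python, not TensorFlow's API) of the two stages a
# deep learning library covers: training fits parameters to examples;
# execution applies the learned parameters to new input.

def train(examples, steps=2000, lr=0.05):
    """Training stage: fit y = w*x + b to (x, y) pairs via gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        for x, y in examples:
            err = (w * x + b) - y
            w -= lr * err * x  # gradient of squared error w.r.t. w
            b -= lr * err      # gradient of squared error w.r.t. b
    return w, b

def run(model, x):
    """Execution stage: apply the learned parameters to a new input."""
    w, b = model
    return w * x + b

model = train([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])  # data follows y = 2x + 1
print(run(model, 10.0))  # close to 21.0
```

The split matters in practice: training is the expensive part that wants a data center full of GPUs, while the cheap `run` step is what can ship inside a phone app, as with the offline Google Translate example above.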
In open sourcing the tool, Google will also provide some sample neural networking models and algorithms, including models for recognizing photographs, identifying handwritten numbers, and analyzing text. “We’ll give you all the algorithms you need to train those models on public data sets,” Dean says.
The rub is that Google is not yet open sourcing a version of TensorFlow that lets you train models across a vast array of machines. The initial open source version only runs on a single computer. This computer can include many GPUs, but it’s a single computer nonetheless. “Google is still keeping an advantage,” Nicholson says. “To build true enterprise applications, you need to analyze data at scale.” But at the execution stage, the open source incarnation of TensorFlow will run on phones as well as desktops and laptops, and Google indicates that the company may eventually open source a version that runs across hundreds of machines.
A Change in Philosophy
Why this apparent change in Google philosophy—this decision to open source TensorFlow after spending so many years keeping important code to itself? Part of it is that the machine learning community generally operates in this way. Deep learning originated with academics who openly shared their ideas, and many of them now work at Google—including University of Toronto professor Geoff Hinton, the godfather of deep learning.
Another reason is that Google’s earlier deep learning systems, including TensorFlow’s predecessor, DistBelief, were too entangled with the rest of its infrastructure to share. “They were not developed with open sourcing in mind. They had a lot of tendrils into existing systems at Google and it would have been hard to sever those tendrils,” Dean says. “With TensorFlow, when we started to develop it, we kind of looked at ourselves and said: ‘Hey, maybe we should open source this.'”
That said, TensorFlow is still tied, in some ways, to the internal Google infrastructure, according to Google engineer Rajat Monga. This is why Google hasn’t open sourced all of TensorFlow, he explains. As Nicholson points out, you can also bet that Google is holding code back because the company wants to maintain an advantage. But it’s telling—and rather significant—that Google has open sourced as much as it has.
Google has not handed the open source project to an independent third party, as many others have done in open sourcing major software. Google itself will manage the project at the new Tensorflow.org website. But it has shared the code under what’s called an Apache 2 license, meaning anyone is free to use the code as they please. “Our licensing terms should convince the community that this really is an open product,” Dean says.
Certainly, the move will win Google some goodwill among the world’s software developers. But more importantly, it will feed new projects. According to Dean, you can think of TensorFlow as combining the best of Torch and Caffe and Theano. Like Torch and Theano, he says, it’s good for quickly spinning up research projects, and like Caffe, it’s good for pushing those research projects into the real world.
Others may disagree. According to many in the community, DeepMind, a notable deep learning startup now owned by Google, continues to use Torch—even though it has long had access to TensorFlow and DistBelief. But at the very least, an open source TensorFlow gives the community more options. And that’s a good thing.
“A fair bit of the advancement in deep learning in the past three or four years has been helped by these kinds of libraries, which help researchers focus on their models. They don’t have to worry as much about underlying software engineering,” says Jimmy Ba, a PhD student at the University of Toronto who specializes in deep learning, studying under Geoff Hinton.
Even with TensorFlow in hand, building a deep learning app still requires some serious craft. But this too may change in the years to come. As Dean points out, a Google deep-learning open source project and a Google deep-learning cloud service aren’t mutually exclusive. Tim O’Reilly’s big idea may still happen.
But in the short term, Google is merely interested in sharing the code. As Dean says, this will help the company improve this code. But at the same time, says Monga, it will also help improve machine learning as a whole, breeding all sorts of new ideas. And, well, these too will find their way back into Google. “Any advances in machine learning,” he says, “will be advances for us as well.”
Journalists, yes even WIRED ones - please stop calling this "AI". Even in the body of the article the phrase "machine learning" is more or less correctly used, not AI.
This isn't mere pedantry. Using hyped and just plain ignorant terms confuses the picture and degrades the language. It's important here because this is science, it's not some vacuous activity like marketing or fashion. We need to identify things accurately, else we cannot discuss them meaningfully.
Given that real AI will bring an absolutely pivotal, seismic transformation of our world, we should reserve the term's use for when it matters ... because it is going to matter, more than helping catalogue a few photos.
It really depends how you look at it... A lot of well-trained patterns along with formal logic can even form the abstract forms of ideas, and this goes for the upper part of humanity as measured by the IQ test as it is. We are certainly far from this, but there is not much more to it in the way we think. AI won't be able to really resolve this until it has a real existence (or at least one that is real to it), to experience emotional IQ, existentialism... I am certain we are not ready to give it one (thanks to our past as well as our current behavior toward that "intelligence"). Animals have a certain intelligence, and it's based on trained or birth-implanted behavioral forms.
That's why it's important for humans. If the issue was computers that simulate the behaviour of snails, well, that has indeed been achieved a long time ago, and every computer running snail behaviour models is intelligent. But historically, the term means something that can mimic/achieve human abstract thinking. Even if we can't define what abstract thinking is.
I said nothing about intelligence in general. I wrote about "artificial intelligence", which is a term. Terms have historic content and AI's historic background is related to Turing and his test. If we remove the historic background AI means nothing, cause, as I wrote, even a 90's desktop could run snail behaviour models - but nobody considers it AI. Maybe it sounds restrictive but that's what terms are for. In this particular case, the term "learning machine" is broader and devoid of history, so it's more appropriate.
And with this Google will succeed in the same manner as with MapReduce. This software will grant companies and individuals very powerful AI, basically for free, but will make all of them fall neatly into lock-step behind Google. Only companies the size of Amazon or Facebook will even attempt to create something with a different basis (and even there, it will be a hard sell for developers), and TensorFlow will become the Last Universal Ancestor of most if not all future AIs. I think this is great, in as much as AI development will be greatly accelerated worldwide, but we must not forget that there will also be a price to pay.
What we need is a TRUE open source equivalent AI, a Linux of sorts in the world of machine learning. It is a shame that so many (otherwise) brilliant AI scientists have decided to sell out to this greedy monopolistic entity rather than working on independent open source AI (without holding back anything), but then again, we know that many scientists have some problems with morals not just since the atomic bomb...
By Google open sourcing TensorFlow it hopes to create a monopoly on the infinite commercial applications made available for it. TensorFlow is free for now while it is being slowly assimilated into every aspect of technology, then once it is accepted as the standard deep learning algorithm Google will charge like a Roman chariot for it. Brilliant move.
Did I miss it, or does the article entirely skip a huge question: does Google collect any of the data fed into apps using their AI? If they're essentially getting eyeballs for their engine placed on potentially millions of new devices, that would explain everything.
But are you guessing when you say that? This particular article doesn't say what you're saying. What it does say is: "TensorFlow is still tied, in some ways, to the internal Google infrastructure, according to Google engineer Rajat Monga. This is why Google hasn’t open sourced all of TensorFlow, he explains."
Actually, they used collected data to train it into the form it has now, to form the patterns. The raw, unfiltered data is the base used to form patterns. It's natural that they will use neural patterns to filter it further in the future (not to gather more of it, but to get quality data from it). No one has even mentioned a prime achievement of this that we have so far, in the form of Google's handwriting input method. Everyone is missing the point. Google is doing this so it can gain an even bigger advantage by letting the community iron out the algorithms and by collecting fresh ideas. This will only be, let's say, a first gen of artificial intelligence, one that can be considered to be at the level of a well-trained behavioral reflex. The real thing will come from harvesting large behavioral forms of data in the VR space, and will be used to shape and form real VR, at least at first. This is why Google (and the world) actually needs much better and much more of these formed patterns.
This is very cool news for those interested in machine learning, and it looks like one vendor is already offering hardware preinstalled with Google's TensorFlow software: http://exxactcorp.com/index.ph...
Deep learning is something interesting. I am happy that Google is making TensorFlow an open source platform. The best thing is that one can learn Python or C++ to create an AI-based product using the TensorFlow platform.
Meanwhile Oracle already structures unstructured data: 1. Oracle obtains statistics on queries and data from the data itself, internally. 2. Oracle gets 100% patterns from data. 3. Oracle uses synonym searching. 4. Oracle indexes data by a common dictionary. 5. Oracle killed SQL: SQL, Structured Query Language, either does not use statistics at all or uses manually assigned ones. IBM should stop using SQL. The structuring of data is the way to AI.
& you think Google didn't do that already? & they (Google) have much more gathered data. Try out the handwriting input method from Google (an Android app); even for normal use you will need a pen. By the way, this is (still) considered a holy grail in the AI world. Google can eat Oracle any time they want; look how Java got pulled from the Chrome platform right after Oracle's lawsuit over Android's use of Java without patent rights. Not a fair fight, but Oracle is not considered a very ethical player anyway...
That's one of the major goals, but the hurdle is defining a universal fitness function for software quality. If we had that, unit tests would work totally differently, and recursively improving A.I. would be a stone's throw away.