Google's Intelligence Designer (technologyreview.com)
97 points by finisterre on Dec 2, 2014 | 28 comments



These MIT Tech Review articles, alas, emphasize hype: "No one had ever demonstrated software that could learn to master such a complex task from scratch," and "But until DeepMind’s Atari demo, no one had built a system capable of learning anything nearly as complex as how to play a computer game, says Hassabis."

I think the article must have overlooked significant activity in training learning systems to play games well. The glaring omission for me was Neurogammon (1987) and, later, TD-Gammon (1992), developed by Gerry Tesauro and colleagues (http://en.wikipedia.org/wiki/TD-Gammon).

Neurogammon was, at the time, a sensation at the same conference the article coyly refers to as "a leading research conference on machine learning." The paper has almost 1000 citations. A curious omission.


Aren't these quite different tasks, though? There's a big difference between 'learning to play a specific game well' and 'learning to play arbitrary games'; such a big difference that I think they're entirely different disciplines. Correct me if I'm wrong, but the software in the research you reference was given the ruleset of the game, right? And DeepMind's software is not given that information, I think. I doubt they intentionally omitted that work; I think it's more likely they didn't consider it relevant enough.


Thanks for the correction. The "arbitrary" qualifier is not in TFA, but (as you said) that's the point of the demo, e.g.: https://www.youtube.com/watch?v=EfGD2qveGdQ Note that they're using just the video signal from the game as input.

It's really a sad comment on the state of reporting at MIT Tech Review that you learn more about the tech from a YouTube video than from the article.

(My complaint is not with the DeepMind people, it's with the article, which should put the work in context.)


> It's really a sad comment on the state of reporting at MIT Tech Review that...

I feel compelled to point out that the only connection between the "MIT" Tech Review and MIT is that the magazine licenses the name from the alumni association. It's how the alumni association funds itself, and every MIT grad gets a lifetime subscription to a version of the magazine with the alumni notes bound into the back. I doubt many of us read it. I don't know how many people other than MIT grads read it, but I would imagine vanishingly few.

A friend of mine calls it "the magazine of things that will never happen," which I think is dead on. It's a shame, because the editor, Jason Pontin, is actually a good guy, so it's surprising that the magazine continued to suck after he took it over.

There are many reasons to criticize MIT (don't I know it!) but you can't judge the institute by this magazine.


I'm going to disagree a bit here. Tech Review does tend to focus on the possibilities of technology and to highlight potentially exciting research. Almost by definition, a lot of this stuff is never going to amount to anything commercially interesting. I suppose that TR could insert more implicit or explicit disclaimers to that effect but I find it a good source for insights into what's going on in the labs.

Personally, I think that Jason has brought a lot of positive changes to a magazine that, for a long time, tended toward a technology policy wonkish orientation.

So I think it's fair to say that a lot of what it writes about "will never happen." But I'm not sure that's really avoidable if you cover cutting-edge research.


I like your comments and I am all about holding Tech Review to a high standard, but I think I am going to side with them on this. The key part is "from scratch". I'd venture that there are lots of AI projects that are similarly relevant precursors to DeepMind, none of which (including your backgammon examples) actually accomplishes the same from-scratch abilities described.


The thing about Tesauro's backgammon work that excited the community is that the system trained by playing itself (http://webdocs.cs.ualberta.ca/~sutton/book/ebook/node108.htm... -- "To apply the learning rule we need a source of backgammon games. Tesauro obtained an unending sequence of games by playing his learning backgammon player against itself.").

Also, it didn't use an elaborate set of features and heuristics adapted for backgammon, just a simple representation of the state of the board (a list of 0/1 variables encoding how many pieces of each color are on each position).
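
For anyone curious, here's a rough sketch in Python of that kind of raw encoding (my approximation of the truncated-unary scheme usually described for TD-Gammon, not Tesauro's exact input layer, which also had units for the bar, borne-off checkers, and whose turn it is):

    # Rough sketch of a TD-Gammon-style raw board encoding (approximate).
    # board: dict mapping point index 0..23 -> signed checker count
    # (+n for n checkers of one color, -n for n of the other).
    def encode_board(board):
        features = []
        for point in range(24):
            n = board.get(point, 0)
            for count in (max(n, 0), max(-n, 0)):   # one color, then the other
                # Truncated unary units for 1, 2, 3 checkers, plus a scaled
                # unit for anything beyond three.
                features += [1.0 if count >= k else 0.0 for k in (1, 2, 3)]
                features.append(max(count - 3, 0) / 2.0)
        return features                             # 24 * 2 * 4 = 192 inputs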

This is pretty close to "from scratch", and I think the article would have done well to point out what is actually new here.


They are different tasks, but I'm not seeing any really clear descriptions of what DeepMind can and cannot do. It's possible that their software is only good at a very specific kind of thing that happened to be in line with what Google wants. And for all we know, they could be at a complete loss as to how to progress.

I mean, to what extent did they restrict what it means to be an "arbitrary game"? I highly doubt their software can play Pictionary, for instance, but I haven't found anything that really explains their limitations.

Because of this, I am leaning towards the cynical view and assuming it's just hype, and not actually that incredible.


It's basically the same algorithm, or at least very similar. The main differences are that they use huge neural networks running on GPUs and feed them raw video data rather than the game board state directly.

It's not any less impressive, though; to my knowledge no one had done anything like that before. That is, beating video games with raw video data and reinforcement learning.
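
In sketch form, the "raw video as input" part looks something like this (a toy illustration assuming numpy arrays from an emulator; the real system also uses experience replay and a convolutional net on top of these stacked frames):

    import numpy as np

    # Toy illustration of using raw video as the RL state (not DeepMind's code).
    # frame: an (H, W, 3) RGB screen capture from the emulator.
    def preprocess(frame):
        gray = frame.mean(axis=2)            # collapse the color channels
        return gray[::2, ::2] / 255.0        # downsample and scale to [0, 1]

    def make_state(last_frames):
        # Stack the last few preprocessed frames so the agent can infer
        # motion (e.g. which way the ball is travelling in Breakout).
        return np.stack([preprocess(f) for f in last_frames], axis=0)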


Did they hard-code the rules of backgammon into the software, or only the board state? I think there's a sort of conceptual ladder (visual input --> game state --> game rules --> game strategy), and it's very important to specify which rungs the software started on.


Just the position of the pieces on the board. They did give it some other features to help it. I forget exactly what they were, but it was just simple stuff calculated directly from the board state.


From what I've read, DeepMind's approach is to just feed in the raw pixel data and the score. No rules, or anything like that.


Here's the original research paper if you're interested.

http://arxiv.org/abs/1312.5602

I'll just quote their introduction instead of trying to summarize the paper:

"Our goal is to create a single neural network agent that is able to successfully learn to play as many of the games as possible. The network was not provided with any game-specific information or hand-designed visual features, and was not privy to the internal state of the emulator; it learned from nothing but the video input, the reward and terminal signals, and the set of possible actions—just as a human player would. Furthermore the network architecture and all hyperparameters used for training were kept constant across the games. So far the network has outperformed all previous RL algorithms on six of the seven games we have attempted and surpassed an expert human player on three of them."


My gripe is with the post, not the paper. But you're right, the best way to figure out what's new is to go to the source.

The paper does a good job going over related work (section 3), beginning with the example I gave.


Indeed. I am also surprised that no one mentioned Tom Murphy's SIGBOVIK paper from April 1st, 2013 - "The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel ... after that it gets a little tricky" http://www.cs.cmu.edu/~tom7/mario/mario.pdf

Murphy created an agent that can play arbitrary games by inspecting the RAM and attempting to maximize the score.

See also this writeup on Ars Technica - http://arstechnica.com/gaming/2013/04/this-ai-solves-super-m...


While interesting, he uses a brute-force approach (try every possible combination of moves for some number of seconds into the future and see which one is best).
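
Conceptually it's something like this toy sketch (hypothetical emulator/score interfaces; the real playfun is far more elaborate, with save-states and a learned lexicographic objective):

    import itertools

    # Toy sketch of brute-force lookahead over button presses (nothing like
    # playfun's actual code). The emulator is assumed to expose save()/load()
    # of its state and step(button); score() reads the objective from RAM.
    BUTTONS = ["left", "right", "jump", "nothing"]

    def best_move(emulator, score, depth=4):
        start = emulator.save()
        best_score, best_button = float("-inf"), BUTTONS[0]
        for plan in itertools.product(BUTTONS, repeat=depth):
            emulator.load(start)
            for button in plan:
                emulator.step(button)
            s = score(emulator)
            if s > best_score:
                best_score, best_button = s, plan[0]
        return best_button        # play the first button of the best plan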


FYI, this is also the guy who made Elon Musk fear strong AI. Musk invested in DeepMind in the early days just to see where AI was going.


For a brief, horrifying moment, I thought this was the name of a product.

What a time to be alive.


Machine learning folks don't know the history of CS or AI, so they've reinvented neural networks as "deep learning"?

Or, industry types are looking for the next big thing, after "big data," and have rebranded neural networks as "deep learning"?

I don't mean to be too cynical, but I still don't understand whether "deep learning" represents any meaningful advance beyond the ML and EE communities discovering the benefits of a certain amount of structure, which is already well established in other lines of research.


While this complaint generally has validity, their paper [1] does IMO present an advance; it's not just handing a bunch of labeled data off to a large neural network.

IIRC (forgive me, I read the paper a few weeks ago) the solution is at its core a reinforcement learning system, with the deep net making up only the component that predicts expected reward from a (state, action) pair. With that in hand, there remains the non-trivial RL problem of balancing "exploration vs. exploitation" in learning good strategies to play the game(s). While NNs have been used in this capacity before, I believe that, as other comments have mentioned, using a deep net to learn to map a high-dimensional state-action space (e.g., the state of the game represented as the pixels of the screen at a particular time) to expected reward in real time was indeed an advance, both theoretical and technical.
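
As a rough sketch of that reinforcement-learning core (my reading of the paper, not their code; q_net here stands for any function mapping a pixel-based state to one value per action):

    import numpy as np

    def epsilon_greedy(q_net, state, n_actions, epsilon=0.1):
        # Exploration vs. exploitation: occasionally try a random action,
        # otherwise take the one the network currently thinks is best.
        if np.random.rand() < epsilon:
            return np.random.randint(n_actions)
        return int(np.argmax(q_net(state)))

    def q_target(q_net, reward, next_state, done, gamma=0.99):
        # Value the network is regressed toward for the (state, action) it
        # just took: the reward alone if the game ended, otherwise the reward
        # plus the discounted best value predicted for the next state.
        if done:
            return reward
        return reward + gamma * float(np.max(q_net(next_state)))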

And, oh yeah, I just remembered that a University of Texas research group is doing work in this area too (there was a recent paper [2] from Peter Stone and others).

(Edited for clarity)

(Edited again to suggest another paper).

[1] - http://arxiv.org/pdf/1312.5602.pdf

[2] - http://www.cs.utexas.edu/~pstone/Papers/bib2html-links/TCIAI...


Deep learning is not just neural networks, but rather the application of these in deep (i.e. many-layered) architectures, broadly speaking.

This enables hierarchical learning of increasingly complex concepts – building new concepts on top of less complex concepts from previous layers. Deep architectures are thus able to learn high-level abstractions, as in [1], for instance.
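
A deliberately minimal illustration of "many-layered" in numpy (hypothetical layer sizes; real deep nets add convolution, careful initialization, and actual training):

    import numpy as np

    # Each layer re-represents the previous layer's output, so later layers
    # can express more abstract features of the raw input.
    rng = np.random.default_rng(0)
    sizes = [784, 512, 256, 128, 10]                  # hypothetical
    weights = [rng.normal(0.0, 0.01, (m, n))
               for m, n in zip(sizes[:-1], sizes[1:])]

    def forward(x):
        for w in weights:
            x = np.maximum(x @ w, 0.0)                # simple ReLU layers
        return x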

If you have not yet done so, I would strongly urge you to read some papers on the subject from the last decade (e.g. Hinton, Bengio or LeCun), or even just skim through the Wikipedia entry [2].

[1] http://www.technologyreview.com/view/532886/how-google-trans...

[2] http://en.wikipedia.org/wiki/Deep_learning


Deep learning is a large-scale application of Restricted Boltzmann Machines, of which Hinton (among others) was a pioneer. But that was in the 80s, not in the 2000s.

http://en.wikipedia.org/wiki/Restricted_Boltzmann_machine


I don't believe the term "deep learning" is restricted to RBMs only – at least that's not the way I've seen the term used in literature (e.g. Deep Convolutional Neural Networks, various deep Autoencoders, etc.).


Convolutional networks were also developed in the 80s, as were backpropagation and autoencoders. The way I see it used, "deep" usually means many layers, indicating a difference in quantity, not in quality.

The point is, the science has been there since the 80s, and not much has changed.


Sure, but these types of deep architectures haven't really been practical until relatively recently.

Well, then we're in agreement about the meaning of the term. Deep Learning, then, would be Machine Learning using any of these deep architectures – be they Restricted Boltzmann Machines, or otherwise.


I sometimes wonder if in the 2030s, people will be complaining about how all the interesting stuff was really invented back in the 2010s.

But yes, the available computing power has been a huge limitation for much AI research.


It's just that, previously, neural networks with more than 3 layers were prohibitively expensive to train. Now that we've discovered some shortcut tricks and moved training onto GPUs, we can finally have those neural networks with lots of layers. Only the implementation details have changed.


This guy vs Shingy, AOL's "Digital Prophet".



