Mind boggling machine learning results from AlphaZero

In summary, the conversation discusses a groundbreaking achievement in the field of machine learning, where a self-learning algorithm was able to master the game of chess within 24 hours, surpassing any other existing chess program despite running on slower hardware. The algorithm also showed impressive results in learning other games such as Go and Shogi. The conversation also touches on the potential impact of this advancement in AI and the debate surrounding the possibility of AI surpassing human intelligence. However, the success of machine learning in recent years, particularly in games like Go and chess, challenges the belief that AI's impact on society will be minimal. The conversation ends with a discussion on the potential for a "Moore's law" for AI and the need to come up with a
  • #71
Nice video on the subject
 
  • #72
I love chess. But I don't see the appeal in playing a game against a machine, knowing you're going to lose. Chess is an exercise for the mind. There is no benefit knowing the inevitable outcome is failure. I do, however, see a huge significance for science. Imagine a machine that can tell you exactly how to cure an incurable disease. Or one that can compute the closest planet that will sustain life. The possibilities are mind boggling.
 
  • #73
One thing I wonder about is the capacity to abstract generalized intelligence from the physical world. One thing that defines AI is the vast training sets, which humans don't really use. But we do have vast training sets in terms of a continuous stream of experiences from birth, and somehow we are able to use that experience to rapidly learn new abstract things. Scary as it sounds, the bridge to real AI, or a demonstration that Google has it, may have to come from agents processing such streams of world experience: droids!
 
  • #74
Fooality said:
One thing I wonder about is the capacity to abstract generalized intelligence from the physical world. One thing that defines AI is the vast training sets, which humans don't really use. But we do have vast training sets in terms of a continuous stream of experiences from birth, and somehow we are able to use that experience to rapidly learn new abstract things. Scary as it sounds, the bridge to real AI, or a demonstration that Google has it, may have to come from agents processing such streams of world experience: droids!
Actually, the distinguishing feature of AlphaZero is that it had no training data at all, nor any input of human expertise. However, in current form it is not at all general intelligence. Instead, it is a general capability to self learn extreme expertise within a closed system all on its own. I agree that much of what people consider general intelligence is tied to interacting with the world and other people, especially via language. At some point this will have to be tackled, to achieve any form of true AI.
 
  • #75
PAllen said:
Actually, the distinguishing feature of AlphaZero is that it had no training data at all, nor any input of human expertise. However, in current form it is not at all general intelligence. Instead, it is a general capability to self learn extreme expertise within a closed system all on its own. I agree that much of what people consider general intelligence is tied to interacting with the world and other people, especially via language. At some point this will have to be tackled, to achieve any form of true AI.

It's really impressive. But in a sense I think it does have training data, in terms of the games (as I understand it) it plays against itself. What seems unique is our ability to abstract general information from basic experience, like if one of these were able to see how programming was like chess, and use generalized chess skills to program.

Do you notice how we do that with language? So much in computer science is metaphors for the physical world: trees, folders, firewalls, viruses, sockets... All these terms relate physical experience to abstract entities. The elements of experience we generalize have application beyond the physical world we have experienced thus far and relate also to new realms we have not experienced, including information realms.
 
  • #76
Fooality said:
But in a sense I think it does have training data, in terms of the games (as I understand it) it plays against itself.
That still means it had to develop everything itself. It didn't have even the most basic knowledge ("keeping the queen is good"). I wonder how the first games were played. Completely randomly until one side happened to be able to checkmate the other side within a few moves? A few games until it discovers that it is advisable to capture pieces of the opponent?
 
  • #77
mfb said:
That still means it had to develop everything itself. It didn't have even the most basic knowledge ("keeping the queen is good"). I wonder how the first games were played. Completely randomly until one side happened to be able to checkmate the other side within a few moves? A few games until it discovers that it is advisable to capture pieces of the opponent?

I agree, it's really compelling. That's why it's got me thinking about how far it is from being general AI. The game trees of these games the Alpha machines are playing are huge, so it must be abstracting lessons or rules already, classifying situations, generalizing in a sense. How far is it from generalizing rules that help in other games, or in general world situations, do you think?
 
  • #78
Imperfect knowledge is a big issue. See poker: All relevant probabilities for the hands are trivial even for a 1980 computer, but it took until 2017 for computers to accurately evaluate how to deal with cards others see but the program does not. StarCraft is still an open problem because the AIs cannot scout as well as humans, have trouble including the scouting results in their strategy, and so on. Well, in August Blizzard released tools that made it easier to train AIs. And of course the DeepMind team is involved. Certainly something to watch in 2018.
 
  • #79
mfb said:
That still means it had to develop everything itself. It didn't have even the most basic knowledge ("keeping the queen is good"). I wonder how the first games were played. Completely randomly until one side happened to be able to checkmate the other side within a few moves? A few games until it discovers that it is advisable to capture pieces of the opponent?

This is what I can't quite understand. Even if a human being wasn't told the queen is powerful, they would soon work that out. It does seem clumsy to have to play randomly.

I wonder, however, whether the knowledge of checkmate led initially not to random moves but to direct attacks. The position without the queen was immediately assessed as less favourable because there were fewer checkmating patterns left? Perhaps, generally, that's how it very quickly learned the value of extra material?

In the end, of course, it appears to have developed an indirect, super-strategic style - the opposite of direct attack. The game above is a good example, where checkmate or even an attack on the king never entered into the game. It was a pure strategic out-manoeuvring of Stockfish with no sense that checkmate had anything to do with the game at all.

Although, perhaps that's only its style against a highly powerful computer that makes no tactical errors. I wonder how it would play against a human?
 
  • #80
I would be surprised if it can understand the opponent - I would expect it to play purely based on the current board (and the RNG output).

It plays unconventionally - I can imagine that human opponents get lost quickly. The AI will take an opportunity to checkmate if it is there, but simply improving the material and tactical advantage more and more is a very reliable strategy as well. A checkmate can be done quickly, that is a good point - probably not too many random moves then.
 
  • #81
mfb said:
I would be surprised if it can understand the opponent - I would expect it to play purely based on the current board (and the RNG output).

It plays unconventionally - I can imagine that human opponents get lost quickly. The AI will take an opportunity to checkmate if it is there, but simply improving the material and tactical advantage more and more is a very reliable strategy as well. A checkmate can be done quickly, that is a good point - probably not too many random moves then.

It's not that it understands its opponent, but that Stockfish never gave it the opportunity to go in for some tactics. I'm not sure how Stockfish works, but a strategy for Stockfish would be to be programmed to favour ultra-sharp positions, where its calculating power might be better than Alpha's.

A human could try that strategy, although against a computer it would almost certainly backfire. But, a human could (easier said than done) - especially as White - force Alpha out of its comfort zone. I doubt that Alpha would get caught out by trying to retreat into a strategic game. I would expect Alpha to be able to scrap it out.

I also meant that human mistakes would lead Alpha to a more aggressive style that Stockfish didn't allow. In any case, against a human opponent (especially a weaker player), we might see another side to Alpha's game.
 
  • #82
Andy Resnick said:
I finally got a chance to read the arXiv report, which is fascinating- my question is, is there some way to 'peek under the hood' to see the process by which AlphaZero optimized the move probabilities based on the Monte-Carlo tree search, and if during the process of selecting and optimizing the parameters and value estimates arrived at an overall strategic process that is measurably distinct from 'human' approaches to play- could AlphaZero pass a 'Chess version' Turing test?

I don't think so. You are right in what the paper is about. There is no alpha-beta, no "clever heuristics", no evaluation function. Instead there is a deep neural network and Monte Carlo Tree Search.

What they seem to have done is transform the problem from one domain to another. Evaluation functions are subject to human whims and best guesses. On the other hand, numerical optimisation problems have been studied for well over 40 years.

The authors make statements like "AlphaGo Zero tuned the hyper-parameter of its search by Bayesian optimisation", "AlphaZero instead estimates and optimises the expected outcome", and "AlphaZero evaluates positions using non-linear function approximation." We know an awful lot about efficient numerical optimisation algorithms, and not so much about hand-coded evaluation functions.

They also say "while alpha-beta programs based on neural networks have previously been unable to compete with faster, handcrafted evaluation functions." In other words, people have tried similar approaches before and failed, and the authors believe their new approach is right.

The "breakthrough" seems to be the transformation process, or creating a "dual" problem. So have they given the algorithm? No, just a hint. All they have said is there is a "secret sauce" without providing the recipe - yet.
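
To make the "deep neural network plus Monte Carlo Tree Search" combination a bit more concrete, here is a minimal Python sketch of a PUCT-style selection rule of the kind described for AlphaGo Zero: each candidate move is scored by its average value so far (Q) plus an exploration bonus (U) that grows with the network's prior probability and shrinks with the visit count. The function name `puct_select`, the plain-list data layout and the constant `c_puct=1.5` are assumptions made for illustration only; this is not DeepMind's code and it leaves out the network, the tree expansion and the value backup.

```python
import math

def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Pick the child move with the highest Q + U score.

    priors        -- network move probabilities P(s, a) for each move
    visit_counts  -- how often each move was tried in the search, N(s, a)
    value_sums    -- summed search values for each move, W(s, a)
    c_puct        -- exploration constant (hypothetical value)
    """
    total_visits = sum(visit_counts)
    best_move, best_score = None, -math.inf
    for move, (p, n, w) in enumerate(zip(priors, visit_counts, value_sums)):
        q = w / n if n > 0 else 0.0  # mean value so far, 0 for unvisited moves
        u = c_puct * p * math.sqrt(total_visits) / (1 + n)  # exploration bonus
        if q + u > best_score:
            best_move, best_score = move, q + u
    return best_move

# A rarely visited move can win the selection through its exploration bonus.
print(puct_select([0.5, 0.3, 0.2], [10, 1, 0], [6.0, 0.2, 0.0]))
```

Once every move has been visited many times, the bonus term becomes small and the choice is dominated by the average value, which is how the search gradually concentrates on the lines the network and the simulations agree on.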
 
  • #83
Devils said:
and the authors believe their new approach is right.
With the success in Go and Chess: The approach can't be too bad...
 
  • #84
Listened to a podcast with Garry Kasparov from 2
 
  • #85
Is the key issue that the neural networks can do a better job than humans of trimming the tree of potential moves?

The way I understand traditional chess engines is that a human expert provides a framework which keeps the combinations that the computer crunches with raw force at a manageable level. This is combined with an opening book and an endgame book of solved positions (which I think is now solved up to 6-7 pieces in total on the board, at about the limit that any computer will ever do given how rapidly the number of combinations explodes). It's impressive if DeepMind's program trained itself not only on the middle game, where traditional engines rely on human-programmed heuristics to keep the calculations manageable, but also trained itself to a level equivalent to the current 100TB databases of solved endgames.
 
  • #86
BWV said:
Is the key issue that the neural networks can do a better job than humans of trimming the tree of potential moves?
It would be interesting to see how well AlphaZero performs if the number of states it can go through is limited to human-like levels. I'm not aware of such a competition.
 
  • #87
mfb said:
It would be interesting to see how well AlphaZero performs if the number of states it can go through is limited to human-like levels. I'm not aware of such a competition.
Problem is, nobody knows how many positions humans consider, because humans cannot accurately report on both conscious and unconscious thought - a milder version of a major problem with neural networks. If you believe what is reported, Capablanca (world chess champion with reasonable claim to being the greatest natural chess prodigy) answered the question “how many moves do you consider?” with “I only consider one move - the best one”. Of course tongue in cheek, but really nobody including the grandmaster knows all that goes into choosing a move.
 
  • #88
You can't get the number right, but you can get a rough estimate. "Did you consider this state?" Currently we know every chess engine considers many more board positions than humans.
 
  • #89
Fooality said:
One thing I wonder about is the capacity to abstract generalized intelligence from the physical world. One thing that defines AI is the vast training sets, which humans don't really use. But we do have vast training sets in terms of a continuous stream of experiences from birth, and somehow we are able to use that experience to rapidly learn new abstract things. Scary as it sounds, the bridge to real AI, or a demonstration that Google has it, may have to come from agents processing such streams of world experience: droids!
I keep wondering whether this technology would be applicable to solving mathematics problems, perhaps defining the game rules by some formal system. What I find hard to imagine is how to formulate the state of a partial proof in a form that can be the input of an artificial neural net.
 
  • #90
Was not clear in my OP, the question is that chess engines like Stockfish, to my understanding, rely on human-programmed heuristics to trim the decision tree in the middle game. I am guessing that this is where the NN outperforms.
 
  • #91
PAllen said:
Problem is, nobody knows how many positions humans consider, because humans cannot accurately report on both conscious and unconscious thought
To illustrate this point I remember a game when I used to play weekend chess tournaments. I was about 1800, so a decent player. I was losing to a slightly weaker opponent having blown a big advantage when, to my horror, I noticed my opponent had a checkmate in one, which, obviously, he hadn't seen.

While he was thinking more and more people crowded round our board. I was praying they would all go away but my opponent never noticed. When he finally moved, not the checkmate, a roar of laughter went up and I slumped back in my chair. Only then did my opponent notice the crowd!

So, what on Earth was he thinking? What moves was he looking at and why didn't he notice the mate in one?

Sometimes I think looking at the top players doesn't help understand human thought because they are so exceptional. Looking at what an average player does is perhaps more interesting.
 
  • #92
Hendrik Boom said:
I keep wondering whether this technology would be applicable to solving mathematics problems, perhaps defining the game rules by some formal system. What I find hard to imagine is how to formulate the state of a partial proof in a form that can be the input of an artificial neural net.

Yeah, good question. I know mathematical proofs were one of the first things they tried to unleash computers on in the 50s, and they met their first failures in making machines think. There's more to it than just formal logic it seems.

Thinking about the Bridges of Königsberg problem solved by Euler. You have this question that seems to involve all this complexity, but you discard a lot of data to get down to the simplest representation, and in that context break down the notion of travel until the negative result is obvious, is proven. And it's proven to us because in that simple form we can understand it, we don't have the cognitive power to brute force it.

How does Euler's brain know to not think about the complete path, but rather just a single node (in his newly created graph theory) to find the solution for all complete paths? It's hard to imagine NN doing this without a priori knowledge that paths are composed of all the places visited, again real physical world knowledge.
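
Euler's "look at a single node" argument can be written down in a few lines: a walk that crosses every bridge exactly once can only exist if at most two land masses have an odd number of bridges. Below is a minimal Python sketch of that parity check. The labels `A` to `D` (island, the two river banks, the eastern land mass) and the function names are assumptions chosen for illustration, and the check takes for granted that the graph is connected.

```python
from collections import Counter

def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges (bridges)."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)

def has_euler_walk(edges):
    """Euler's condition for a connected multigraph: 0 or 2 odd-degree nodes."""
    return len(odd_degree_nodes(edges)) in (0, 2)

# The seven bridges: A = island, B and C = river banks, D = eastern land mass.
KONIGSBERG = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

print(odd_degree_nodes(KONIGSBERG))  # every land mass has an odd bridge count
print(has_euler_walk(KONIGSBERG))    # so no such walk exists
```

The code never enumerates a single path, which is the point of the argument: the negative result follows from counting bridges per node rather than from searching over complete walks.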
 
  • #93
Fooality said:
Yeah, good question. I know mathematical proofs were one of the first things they tried to unleash computers on in the 50s, and they met their first failures in making machines think. There's more to it than just formal logic it seems.

Thinking about the Bridges of Königsberg problem solved by Euler. You have this question that seems to involve all this complexity, but you discard a lot of data to get down to the simplest representation, and in that context break down the notion of travel until the negative result is obvious, is proven. And it's proven to us because in that simple form we can understand it, we don't have the cognitive power to brute force it.

How does Euler's brain know to not think about the complete path, but rather just a single node (in his newly created graph theory) to find the solution for all complete paths? It's hard to imagine NN doing this without a priori knowledge that paths are composed of all the places visited, again real physical world knowledge.
I'm hoping initially to be able to automate somewhat the choice of proof tactics in a proof assistant. Not have AI-generated insight.
 
  • #94
Hendrik Boom said:
I'm hoping initially to be able to automate somewhat the choice of proof tactics in a proof assistant. Not have AI-generated insight.

Oh you're actually doing it? Cool good luck. If you can get the training data, I don't see why not.
 
  • #95
Fooality said:
Oh you're actually doing it? Cool good luck. If you can get the training data, I don't see why not.
Sorry. I don't actually have the resources to do this. So I spend my time wondering how it might be done instead.

Training data? I thought of finding some analogue of having the machine play itself. But now that you point it out, I suppose one could use human interaction with a proof assistant as training data.
 
