This impressed me, since DeepMind's approach is based on neural networks. After digging into the details, it turns out that AlphaZero ran on 4 of Google's TPUs during the chess match. Wikipedia states how fast a 2nd-generation TPU runs, but it's unclear to me whether one TPU runs at 45 TFLOPS or 180 TFLOPS. After doing the math, it doesn't seem to make much difference, because the chess ELO rating system is not linear. Google doesn't give us many details on the PC that ran Stockfish, except that they only gave it 1 GB of hash memory, which is causing a lot of complaints in the chess community because that's ridiculously low. My i5's 0.01087 TFLOPS is so-so; high-performance PCs are around 0.050 TFLOPS. There are expensive PCs with much higher TFLOPS, but considering Google only gave Stockfish 1 GB, let's use 0.060 TFLOPS. So 180 TFLOPS × 4 / 0.060 TFLOPS = 12,000. That means DeepMind's AlphaZero hardware was equivalent to about 12,000 high-end PCs.
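The back-of-the-envelope hardware comparison can be sketched in a few lines. The TFLOPS figures are the assumptions from the text (180 TFLOPS per TPU, 0.060 TFLOPS for a high-end PC), not measured numbers:

```python
# Rough estimate: how many high-end PCs equal 4 second-generation TPUs?
# Assumes 180 TFLOPS per TPU and 0.060 TFLOPS per high-end desktop,
# as discussed above.
tpu_tflops = 180.0
num_tpus = 4
pc_tflops = 0.060

equivalent_pcs = tpu_tflops * num_tpus / pc_tflops
print(equivalent_pcs)  # 12000.0
```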
Computer chess programs typically gain about 60 ELO points each time the computing power is doubled. Doubling about 13.6 times gives roughly 12,000, which means AlphaZero had a hardware advantage of about 815 ELO points. If we instead assume one TPU is only 45 TFLOPS, the ratio drops to about 3,000 and the advantage comes to roughly 695 ELO points. Not much difference.
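The 60-ELO-per-doubling rule of thumb turns a speed ratio into an ELO estimate via a base-2 logarithm. A minimal sketch, assuming that rule holds across the whole range:

```python
import math

# Estimated Elo advantage from extra computing power, assuming ~60 Elo
# per doubling (a common rule of thumb for chess engines).
ELO_PER_DOUBLING = 60

def elo_from_speedup(speedup):
    """Elo gain for a `speedup`-times faster machine."""
    return ELO_PER_DOUBLING * math.log2(speedup)

print(round(elo_from_speedup(12000)))  # 813 (180 TFLOPS per TPU)
print(round(elo_from_speedup(3000)))   # 693 (45 TFLOPS per TPU)
```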
The ELO rating system says that a player rated 400 points lower will score about 10% of the time. AlphaZero won 290 decisive games and lost 24, and 24 out of 314 is close to 10%. So AlphaZero's performance was about 400 ELO points higher than Stockfish's, yet it had roughly an 800 ELO point advantage in computing power.
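The standard Elo expected-score formula makes the 400-point claim concrete. Note this sketch counts only decisive games, as the text does, which slightly understates the weaker side's score because draws count as half a point in Elo terms:

```python
# Elo expected score for a player with the given rating deficit,
# using the standard logistic formula of the Elo system.
def expected_score(elo_deficit):
    return 1.0 / (1.0 + 10 ** (elo_deficit / 400.0))

print(expected_score(400))  # ~0.091, i.e. roughly a 10% expected score

# Stockfish's share of decisive games from the figures above:
stockfish_wins, alphazero_wins = 24, 290
print(stockfish_wins / (stockfish_wins + alphazero_wins))  # ~0.076
```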
End result: this suggests the Google DeepMind AlphaZero neural-network program is roughly 400 ELO points below Stockfish 8 in software terms. That's a huge difference; it means Stockfish 8 would win about 90% of the time on equal hardware. BTW, the latest Stockfish chess program is version 9. Also, let's not forget that Google only gave Stockfish 8 1 GB of hash memory. So who knows what the real difference would be.
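Netting out the two estimates above is simple arithmetic; this sketch just makes the sign explicit:

```python
# AlphaZero finished ~400 Elo ahead on the scoreboard, but (by the
# estimate above) had ~800 Elo worth of extra hardware.
observed_advantage = 400   # Elo, from the match result
hardware_advantage = 800   # Elo, from the TFLOPS estimate

algorithmic_difference = observed_advantage - hardware_advantage
print(algorithmic_difference)  # -400: the software itself ~400 Elo behind
```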
This all agrees with my assessment of neural networks: in the end, they will just be slower. I think a good comparison is programming languages. There are flexible languages such as Java that can run on any CPU, and there are languages that are easier to program in; in both cases the end result is that it's slower. I'm convinced that the chess-algorithm method, such as Tree Search, will eventually be the best and fastest method. Neural networking is the easy method. Google can use its army of computers, Google Brain, to train the neural network, which is essentially equivalent to writing the source code. It writes the source code for you, but that code isn't written in an optimized computer language such as C/C++; it's written in a highly interpreted form. Java has what's called bytecode: it's not assembly, or C, or Basic; it's bytes of code, each representing an instruction that isn't tied to any particular CPU. It's generic, which is why the same bytecode can run on any CPU. A neural network is even further from optimized computer code. The payoff is that anyone can do machine learning, but the downside is that it's going to be a lot slower.
I push for AI developers to work on the Tree Search method, because in the end it will make a big difference. IMO that difference could be hundreds of times in speed, which could make all the difference. Imagine a robot for sale that could be a maid in someone's house, or a therapist, a psychologist, etc. Then imagine that robot being so slow that it takes 300 times longer to figure things out, which would probably make it not so useful.