I’ve been reading more and more about Google’s Tensor Processing Units (TPUs), since most of my ML work is already done on Google Cloud GPUs, and paid public access to TPUs outside the TensorFlow Research Cloud doesn’t seem all that far off.
I would like to learn more about what to expect from working with a gen 2 TPU, or more specifically what it’s like working on the TensorFlow Research Cloud. I get the impression there’s some sort of non-disclosure agreement with researchers, presumably for commercial/competitive reasons, as I see very little discussion about using them, let alone editorials.
I know there was the benchmark paper they released for the previous gen, but it would be refreshing to at least see some current practical benchmarks on ResNet or GoogLeNet/Inception, considering they already have sample code available for lucky TPU Alpha customers. I’m also wondering why that magic 180 TFLOP/s number quoted by Google purposely omits which kind of floating-point ops (i.e. at what precision) they’re talking about.