Supercharge Your ML Research with Google's TPU Research Cloud

Architecture, access, and training on TPU pods, backed by a $376K compute grant.

Type: Talk
Category: Accelerated Computing
Level: Advanced
Duration: 30 min
Language: English
Tags: TPU, Google Cloud, TRC, JAX, ML Research

Abstract

How to access and use Google's TPU Research Cloud to run serious ML research for free. Covers TPU architecture (TensorCores, MXUs, systolic arrays, ICI/OCS torus topology) versus GPUs and competitors, training with PyTorch/XLA on MNIST, the TRC application process, and a live demo spinning up a TPU VM, with the cost math behind a ~$376,000 grant.

Outline

  1. Tensors, operations, and matrix multiplication
  2. TPU architecture (TensorCores, MXUs, systolic arrays) vs GPU and competitors
  3. Reduced precision: bfloat16 vs float32
  4. Model training with PyTorch/XLA on MNIST
  5. TPU Research Cloud: how to apply and gain access
  6. Live demo: spinning up a TPU VM on GCP
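
To make the reduced-precision item concrete, here is a minimal sketch (not taken from the talk) that emulates bfloat16 rounding with the Python standard library and compares two ways of accumulating a dot product whose inputs are bfloat16. The helper names `to_bfloat16`, `to_float32`, and `dot` are hypothetical, and the emulation ignores NaN and infinity handling.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bfloat16(x: float) -> float:
    """Round a Python float to the nearest bfloat16 value.

    bfloat16 is the top 16 bits of a float32: 1 sign bit, 8 exponent bits,
    7 mantissa bits. Sketch only: NaN and infinity are not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 16 low bits that are dropped.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (((bits + rounding_bias) >> 16) << 16) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot(a, b, round_acc):
    """Dot product with bfloat16 inputs and a caller-chosen accumulator type."""
    acc = 0.0
    for x, y in zip(a, b):
        product = to_bfloat16(x) * to_bfloat16(y)  # inputs rounded to bfloat16
        acc = round_acc(acc + product)             # accumulator rounded per step
    return acc


a = [1.0] * 1000
b = [0.01] * 1000
reference = 1000 * to_bfloat16(0.01)  # sum of the bfloat16-rounded products

print("reference            :", reference)
print("float32 accumulation :", dot(a, b, to_float32))
print("bfloat16 accumulation:", dot(a, b, to_bfloat16))
```

With a bfloat16 accumulator the running sum stops growing once each new product is smaller than half the spacing between adjacent bfloat16 values, while a float32 accumulator keeps absorbing the small terms.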

Key takeaways

  • TRC gives accepted researchers free access to a pool of 1,000+ Cloud TPUs
  • A v4-64 for 5 months is roughly $376K of compute
  • bfloat16 for matmul plus float32 accumulation is the precision sweet spot
  • You can provision a TPU VM in minutes with one gcloud command
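
The $376K figure can be reproduced with a short calculation. The sketch below is one plausible reconstruction, not the talk's own worksheet. It assumes a v4-64 slice contains 32 chips (v4 slice names count TensorCores, two per chip), an on-demand list price of $3.22 per chip-hour, and 730 hours per month. All three inputs are assumptions; check them against current Cloud TPU pricing.

```python
# All constants below are assumptions for illustration, not figures from the talk.
CHIPS_IN_V4_64 = 32             # assumed: 64 TensorCores / 2 TensorCores per chip
PRICE_PER_CHIP_HOUR_USD = 3.22  # assumed on-demand v4 list price
HOURS_PER_MONTH = 730           # assumed: 8,760 hours per year / 12
GRANT_MONTHS = 5                # from the takeaway: a v4-64 for 5 months


def grant_value_usd(chips: int, price_per_chip_hour: float,
                    hours_per_month: int, months: int) -> float:
    """Equivalent on-demand cost of running a TPU slice continuously."""
    return chips * price_per_chip_hour * hours_per_month * months


value = grant_value_usd(CHIPS_IN_V4_64, PRICE_PER_CHIP_HOUR_USD,
                        HOURS_PER_MONTH, GRANT_MONTHS)
print(f"Estimated grant value: ${value:,.0f}")
```

Under these assumptions the estimate lands at about $376K, in line with the takeaway. A different region, commitment discount, or TPU generation would change the total.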

Delivered 2 times