In this role, you'll work to shape the future of AI/ML hardware acceleration. You will have an opportunity to drive cutting-edge TPU (Tensor Processing Unit) technology that powers Google's most demanding AI/ML applications. You'll be part of a team that pushes boundaries, developing the custom silicon solutions behind future generations of Google's TPUs. You'll contribute to the innovation behind products loved by millions worldwide, and leverage your design and verification expertise to verify complex digital designs, with a specific focus on TPU architecture and its integration within AI/ML-driven systems.
In this role, you will be at the forefront of advancing ML accelerator performance and efficiency, employing a comprehensive approach that spans compiler interactions, system modeling, power architecture, and host system integration. You will prototype new hardware features, such as instruction extensions and memory layouts, by leveraging existing compiler and runtime stacks, and develop transaction-level models for early performance estimation and large-scale workload simulation.
A critical part of your work will be to optimize the accelerator design for maximum performance under strict power and thermal constraints; this includes evaluating novel power technologies and collaborating on thermal design. Furthermore, you will streamline host-accelerator interactions, minimize data transfer overheads, ensure seamless software integration across different operational modes like training and inference, and devise strategies to enhance overall ML hardware utilization.
To achieve these goals, you will collaborate closely with specialized teams, including the XLA (Accelerated Linear Algebra) compiler, Platforms performance, package, and system design teams, to transition innovations to production and maintain a unified approach to modeling and system optimization.
The ML, Systems, & Cloud AI (MSCA) organization at Google designs, implements, and manages the hardware, software, machine learning, and systems infrastructure for all Google services (Search, YouTube, etc.) and Google Cloud. Our end users are Googlers, Cloud customers, and the billions of people who use Google services around the world.
We prioritize security, efficiency, and reliability across everything we do, from developing our latest TPUs to running a global network, while shaping the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud's Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers.
The US base salary range for this full-time position is $156,000-$229,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.
Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.