Google DeepMind

Research Engineer, Gemini Safety / AGI Safety & Alignment

New York City, New York, US

Job Description

On-site

We are hiring for this role in London, Zurich, New York, Mountain View or San Francisco. Please clarify in the application questions which location(s) work best for you.

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.

Snapshot

Our team is responsible for enabling AI systems to reliably work as intended, including identifying potential risks from current and future AI systems, and conducting technical research to mitigate them. As a Research Engineer, you will design, implement, and empirically validate approaches to alignment and risk mitigation, and integrate successful approaches into our best AI systems.

About Us

Conducting research into any transformative technology comes with responsibility to build mechanisms for safe and reliable development and deployment at every step. Technical safety research at Google DeepMind investigates questions related to evaluations, reward learning, fairness, interpretability, robustness, and generalisation in machine learning systems. Proactive research in these areas is essential to the fulfilment of the long-term goal of Google DeepMind: to build safe and socially beneficial AI systems.

The Role

We are seeking Research Engineers for our Gemini Safety and AGI Safety & Alignment (ASAT) teams.

Gemini Safety

Gemini Safety is seeking Research Engineers to contribute to the following areas:

Pretraining

In this role you will investigate new techniques to improve the safety behavior of Gemini via pretraining interventions. 

You will conduct empirical studies of model behavior, analyze model performance across different scales, and experiment with synthetic datasets, data weighting, and related techniques. You should enjoy working with very large-scale datasets and have an empirical mindset.

Text output (core Gemini, reasoning models, Search AI mode, etc.)

This role is focused on post-training safety. You will be part of a very fast-paced, intense effort at the heart of Gemini to improve safety and helpfulness for the core model, and to help adapt the model to specific use cases such as reasoning or search.

Red teaming and adversarial resilience

In this role, you will build and apply automated red teaming using our most capable models to find losses and vulnerabilities in our Gen AI products, including Gemini itself, reasoning models, image and video generation, and whatever else we are building.

You may also work to improve resilience to jailbreaks and adversarial prompts across models and modalities, driving progress on a fundamentally unsolved problem with serious implications for future safety.

Image and video generation

This role is about safety for image and video generation, including Imagen, Veo, and Gemini. You will design evaluations for safety and fairness, improve the safety behavior of the relevant models in close collaboration with the core modeling teams, and design mitigations outside the model (e.g. external classifiers).

AGI Safety & Alignment (ASAT)

Our AGI Safety & Alignment team is seeking Research Engineers to contribute to the following areas.

ASAT Engineering

General-purpose engineers on ASAT implement our safety policies and improve our velocity. For example, you may:

  • Prototype, implement, maintain, and periodically run evaluations to support model testing across risk domains in the Frontier Safety Framework
  • Proactively identify and address infrastructure and engineering challenges to improve velocity, e.g. by solving model inference pain points or tackling scheduling problems for experiments

This role benefits from traditional software engineering skills, such as architecture design, the ability to learn new technologies quickly, distributed systems knowledge, and the ability to work with large codebases. It also requires an understanding of machine learning workflows, but not a deep understanding of machine learning theory. For work focused on supporting the Frontier Safety Framework, an understanding of dangerous capability evaluations and responsible capability scaling is a plus.

Applied Interpretability

The focus of this role is to put insights from model internals research into practice, both for safety in Gemini post-training and for dangerous capability evaluations in support of our Frontier Safety Framework.

Key responsibilities: 

  • Rapidly ideate on, iterate on, and productionize interpretability applications to address promising use cases
  • Make interpretability methods scalable and deploy them in the production codebase
  • Implement techniques emerging from collaboration with mechanistic interpretability researchers, in addition to other approaches such as probing and training data attribution

AGI Safety Research

In this role you will advance AGI safety & alignment research within one of our priority areas. Candidates should have expertise in the area they apply to. We are also open to candidates who could lead a new research area with clear impact on AGI safety & alignment. Areas of interest include, but are not limited to:

  • Dangerous Capability Evaluations: Designing evaluations for dangerous capabilities for use in the Frontier Safety Framework, particularly for automation of ML R&D
  • Safety cases: Producing conceptual arguments backed by empirical evidence that a specific AI system is safe
  • Alignable Systems Design: Prototyping AI systems that could plausibly support safety cases
  • Externalized Reasoning: Understanding the strengths and limitations of monitoring the “out loud” chain of thought produced by modern LLMs
  • Amplified Oversight: Supervising systems that may outperform humans
  • Interpretability: Understanding the internal representations and algorithms in trained LLMs, and using this knowledge to improve safety
  • Robustness: Expanding the distribution on which LLMs are trained to reduce out-of-distribution failures
  • Monitoring: Detecting dangerous outputs and responding to them appropriately
  • Control Evaluations: Designing and running red team evaluations that conservatively estimate risk from AI systems under the assumption that they are misaligned
  • Alignment Stress Testing: Identifying assumptions made by particular alignment plans, and red teaming them to see whether they hold

About You

  • You must have at least a year of experience working with deep learning and/or foundation models (whether from industry, academia, coursework, or personal projects).
  • Your knowledge of mathematics, statistics and machine learning concepts enables you to understand research papers in the field.
  • You are adept at building codebases that support machine learning at scale. You are familiar with ML and scientific libraries (e.g. JAX, TensorFlow, PyTorch, NumPy, Pandas), distributed computation, and large-scale system design.
  • You are keen to address risks from foundation models, and plan for your research to impact production systems on a timescale between “immediately” and “a few years”.
  • You are excited to work with strong contributors to make progress towards a shared ambitious goal.

What we offer

At Google DeepMind, we want employees and their families to live happier and healthier lives, both in and out of work, and our benefits reflect that. Some select benefits we offer: enhanced maternity, paternity, adoption, and shared parental leave; private medical and dental insurance for yourself and any dependents; and flexible working options. We strive to continually improve our working environment, and provide you with excellent facilities such as healthy food, an on-site gym, faith rooms, and terraces.

We are also open to relocating candidates to Mountain View and offer a bespoke service and immigration support to make it as easy as possible (depending on eligibility).

The US base salary range for this full-time position is $114,000 - $245,000 + bonus + equity + benefits. Your recruiter can share more about the specific salary range for your targeted location during the hiring process.
