Wayve

Staff Software Engineer, ML Platform

Sunnyvale

8h ago

Job Description

At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.

About us   

Founded in 2017, Wayve is the leading developer of Embodied AI technology.  Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems.

Our vision is to create autonomy that propels the world forward.  Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.  

At Wayve, big problems ignite us: we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future.

At Wayve, your contributions matter.  We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact.  

Make Wayve the experience that defines your career!  

The role 

We are looking for a Staff Software Engineer to drive the direction of the Wayve Machine Learning platform. The ML Platform team owns the machine learning training infrastructure and works with users to ensure that this infrastructure is reliable and efficiently utilised.

Key responsibilities:

  • You will take ownership of the training infrastructure, which is used for distributed training of large jobs. Your technical decisions will drive high quality projects that ensure availability, reliability and scalability of the system.
  • You will work across functions with machine learning research engineers to optimise models for efficient training, maximising their use of hardware resources and improving their reliability and observability.
  • You will collaborate with technical and non-technical stakeholders to understand current user needs and identify future bottlenecks.
  • You will guide and mentor mid-level engineers and promote high software engineering standards.

Example projects:

  • Training job scheduling and orchestration e.g. tooling to schedule jobs across multiple cloud providers depending on model needs and hardware availability.
  • Tooling which provides thousands of GPUs simultaneously to our driving simulator, which we use to test the driving performance of our models off road.
  • Profiling training jobs with tools such as NVIDIA Nsight, identifying bottlenecks and optimising the models to increase efficiency.

About you  

To set you up for success in this role at Wayve, we're looking for the following skills and experience.

Essential

  • Minimum of 10 years' experience in platform engineering or a similar field, with a proven track record of designing and scaling resilient systems.
  • Proficiency in Python, with the ability to mentor engineers on best practices and scalable design.
  • Extensive experience with concurrent, parallel and distributed computing, including performance tuning and optimisation for large-scale applications.
  • Comprehensive knowledge of cloud platforms, preferably Azure, including architecture design, cost optimisation, security best practices and declarative configuration (Terraform).
  • Proven experience with containerisation and orchestration technologies, including advanced knowledge of Docker and Kubernetes.
  • Leadership and mentorship experience, guiding mid-level engineers, driving technical decision-making and collaborating with cross-functional teams to align engineering initiatives with business goals.
  • Passion for building stable and scalable infrastructure that empowers users to train large models seamlessly, efficiently and at scale.

Desirable

  • Experience with ML frameworks, preferably PyTorch, with a strong understanding of their internal workings and optimisation strategies.
  • Proven ability to profile, optimise and scale ML training jobs using advanced tools such as NVIDIA Nsight or TensorRT.

We understand that everyone has a unique set of skills and experiences and that not everyone will meet all of the requirements listed above. If you're passionate about self-driving cars and think you have what it takes to make a positive impact on the world, we encourage you to apply.

For more information visit Careers at Wayve. 

To learn more about what drives us, visit Values at Wayve.


DISCLAIMER: We will not ask about marriage or pregnancy, care responsibilities or disabilities in any of our job adverts or interviews. However, we do look to capture information about care responsibilities and disabilities, among other diversity information, as part of an optional DEI Monitoring form. This helps us identify areas of improvement in our hiring process and ensure that the process is inclusive and non-discriminatory.

 

 
