Our team owns the core preventative reliability platforms, tools, and programs used by infrastructure and product teams across the company. Our load testing platform enables traffic generation and monitoring to ensure products are prepared to handle surges in usage; our chaos platform enables product and infrastructure experimentation to validate system resiliency and operational excellence; and our scale validation game day program ensures Stripe can handle our biggest customers' biggest days.
We’re looking for an experienced distributed systems engineer with outstanding technical and leadership skills, strong collaboration skills, and a huge passion for customers. You’ll help deliver the foundation of our reliability infrastructure and work with teams across the entire stack to deliver world-class reliability solutions. In this role, you’ll not only be in charge of designing, implementing, and testing your reliability infrastructure components, but you’ll also play an influential role in enabling engineering teams to make their services more reliable by identifying, creating, and deploying engineering practices, processes, and solutions.
We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.