Xero is a beautiful, easy-to-use platform that helps small businesses and their accounting and bookkeeping advisors grow and thrive.
At Xero, our purpose is to make life better for people in small business, their advisors, and communities around the world. This purpose sits at the centre of everything we do. We support our people to do the best work of their lives so that they can help small businesses succeed through better tools, information and connections. Because when they succeed they make a difference, and when millions of small businesses are making a difference, the world is a more beautiful place.
About the team
Xero’s Incident and Problem Management team are a part of the Site Reliability Engineering (SRE) organization and are responsible for the build, delivery and ongoing maintenance of robust process and tooling around Incident management.
The team is responsible for driving enduring reliability at Xero through robust, consistent and fast response to high severity incidents. They are responsible for building a world class process and ensuring that process matures as the demands of the business grows.
About the roles
We're looking to hire multiple roles at Lead Engineer & Senior Engineer level. These positions require experienced SRE professionals with a strong technical background, deep experience in SRE, a passion for building and delivering robust processes, and extensive experience of leading technical response to high severity cloud issues.
They will drive best practice across the business and contribute to the ongoing transformation of the Xero SRE culture. As expert communicators, they will lead technical discussions to identify and track actions associated with and identified during incident situations.
Across our SRE function, we're looking for those who are keen to deep dive into causes of incidents and proactively examine the potential causes of future incidents; working with engineering teams to remove the risk of that failure scenario. Ultimately building playbooks and automation to ensure quick and effective responses. In addition, provide ongoing training across the business to ensure the process is well understood and adhered to.
These roles will form the backbone of a new team, providing a Technical Duty Officer (TDO) function within the business. TDO’s are incident commanders who use SRE skillsets to drive fast mitigation and enduring resolution of impactful events.