Graphcore
Principal BMC Engineer
🌎 Bristol, UK
1 month ago

Job Description

About Graphcore

How often do you get the chance to build a technology that transforms the future of humanity?

Graphcore products have set the standard in made-for-AI compute hardware and software, gaining global attention and industry acclaim. Now we are developing the next generation of artificial intelligence compute with systems that will allow AI researchers to develop more advanced models, help scientists unlock exciting new discoveries, and power companies around the world as they put AI at the heart of their business.

Graphcore recently joined SoftBank Group – bringing large and ongoing investment from one of the world's leading backers of innovative AI companies.

Job Summary

Reporting to the Firmware Manager, the Principal BMC Engineer is responsible for designing and supporting the delivery of the Baseboard Management Controller software stack for the next generation of AI server-class systems. They will work closely with our partners, hardware vendors and customers to help build best-in-class systems for large-scale data-centre environments. The Principal BMC Engineer is expected to demonstrate technical leadership and support any future growth of the team.

About the Firmware Team

The firmware team writes the software that ensures the full and complete bootup of Graphcore Hardware and Silicon. We design and maintain interfaces to allow our Drivers software to interact with Graphcore Silicon. We present telemetry and monitoring data for use by the host system via the SMBus interface, and by data-centre operatives via the BMC.

Responsibilities and Duties

  • Leading the design and implementation of an OpenBMC solution for a server-class platform for Artificial Intelligence.
  • Working closely with hardware teams to influence hardware design and review HW architecture & schematics.
  • Working closely with customers and end users to align on expectations and deliverables for their deployment needs and environments
  • Working with our partners and security team to ensure we meet product security goals.
  • Designing solutions for errors, stats & configuration appropriate to CPU, GPU, DIMM, SSDs, NICs, IB, PSU, BMC, FPGA, CPLD etc. for enterprise readiness of Server platforms.
  • Designing and developing performance-optimised active monitoring BMC solutions using DMTF standards, including the MCTP, Redfish, SPDM and PLDM specifications.
  • Developing and reviewing code, writing and reviewing design documents, reviewing QA test plans and working closely with all team members to achieve consensus for design and testability as per product requirements.
  • Assisting in the hiring of additional team members, acting as a mentor, and promoting excellence across the team.

Candidate Profile

Essential:

  • Domain expertise in BMC Firmware development on x86 or ARM Platforms including BMC-BIOS communications, thermal management, power management, firmware update, device monitoring, firmware security, etc.
  • Excellent programming and scripting skills in C/C++, Rust, Python, Go and Bash, for both Linux user-space and system programs, with thorough code-reviewing skills.
  • Strong in Linux fundamentals, various Linux distributions and packages, Linux upgrade mechanisms, building and deploying Linux images.
  • Out-of-band and In-band System Management experience with exposure to standards including IPMI, KCS, DMTF Standards (PLDM, MCTP, Redfish, etc), PMBus, NVMe, etc.
  • Understanding of the REST architectural style, especially JSON over HTTPS with OAuth.
  • Expertise in system software and platform security for x86 or ARM based Rack/Blade server systems.
  • Excellent written and oral communication skills, a strong work ethic, a strong sense of teamwork, pride in producing quality work, and a commitment to finishing your tasks every day. You are a self-starter who loves to find creative solutions to challenging problems.

Desirable:

  • Industry experience with using Data Centre – Secure Control Modules (DC-SCMs) at either version 1 or version 2 of the OCP specification.
  • Board bring-up expertise with hands-on experience of device drivers such as I2C/I3C, SPI, PCIe, SMBus and mailbox, as well as device trees for U-Boot and the Linux kernel.
  • Ideally, contributions to industry standards and projects such as the Open Compute Project, OpenBMC, IPMI and DMTF.

Benefits

In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar!

We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.