HPC/Storage/GPU Engineer
We are seeking a highly skilled HPC / Storage / GPU Engineer to join our technology team. This role is critical in driving our technology roadmap and ensuring our infrastructure remains at the forefront of the industry. The ideal candidate will be hands-on, experienced with existing production environments, and adept at navigating and optimizing high-performance computing, storage systems, and scheduling solutions. You will also be expected to propose and drive the adoption of Infrastructure as Code (IaC) practices to make our storage solutions scalable and manageable, and to support our growing GPU needs, balancing on-premises and cloud-based resources. A developer mindset is essential to create scalable, maintainable, and efficient solutions. Our infrastructure employs a hybrid approach, combining cloud and on-premises resources, and we require skills in both.
Key Responsibilities
- Design, implement, and manage HPC systems to support trading operations.
- Propose and implement an approach for Infrastructure as Code (IaC), creating APIs to enhance scalability and manageability of our storage.
- Approach problems with a developer mindset, creating scalable and maintainable code to enhance our infrastructure.
- Optimize and maintain storage solutions, focusing on performance, reliability, and scalability.
- Develop and enforce best practices and guidelines for storage interaction to ensure security and efficiency.
- Implement and manage GPU-based solutions to accelerate computational workloads, balancing on-premises and cloud resources.
- Collaborate with cross-functional teams to drive technology enhancements and operational efficiency.
- Troubleshoot and resolve complex technical issues in a production environment.
- Stay up-to-date with the latest technologies and best practices in HPC, storage, and GPU computing.
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in HPC, storage systems, and GPU computing.
- Strong communication skills and the ability to work effectively in a team.
- A developer mindset, with a focus on creating scalable, maintainable, and efficient solutions.
- Experience in software development, particularly in scripting and automation using languages like Python.
- Knowledge of parallel file systems (e.g., GPFS), batch systems (e.g., Slurm, Grid Engine), and high-performance network interconnects.
- Strong Linux systems administration skills.
- Experience with VAST and Weka storage solutions is highly desirable.
- Solid understanding of trading infrastructure and low-latency systems.
- Excellent problem-solving skills and the ability to work in a fast-paced, dynamic environment.
- Skills in managing hybrid cloud/on-premises environments.
Preferred Qualifications
- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Familiarity with cloud computing platforms and hybrid cloud environments.
- Knowledge of automation using Python.
- Experience proposing and implementing Infrastructure as Code (IaC) practices from the ground up.
- Expertise in balancing on-premises and cloud-based GPU resources to optimize performance and cost.
Anticipated New York annual base salary range: $200,000-$300,000, plus eligibility for a discretionary bonus.
Benefits
Tower's dual offices and garden roofdecks are located in TriBeCa and SoHo, neighborhoods in downtown Manhattan. While we work hard, Tower's cubicle-free workplace, jeans-clad workforce, and well-stocked kitchens reflect the premium the firm places on quality of life. Benefits include:
- 401(k) with company matching
- 5 weeks of paid vacation per year plus 11 paid holidays
- Free breakfast, lunch, and snacks on a daily basis
- Reimbursement for health and wellness expenses
- Free events and workshops
- Donation matching program
Tower Research Capital is an equal opportunity employer.