About PhonePe Group:
PhonePe is India’s leading digital payments company with 50 crore (500 Million) registered users and 3.7 crore (37 Million) merchants covering over 99% of the postal codes across India. On the back of its leadership in digital payments, PhonePe has expanded into financial services (Insurance, Mutual Funds, Stock Broking, and Lending) as well as adjacent tech-enabled businesses such as Pincode for hyperlocal shopping and Indus App Store, which is India's first localized App Store. The PhonePe Group is a portfolio of businesses aligned with the company's vision to offer every Indian an equal opportunity to accelerate their progress by unlocking the flow of money and access to services.
Culture
At PhonePe, we take extra care to make sure you give your best at work, every day! Creating the right environment for you is just one of the things we do. We empower people and trust them to do the right thing. Here, you own your work from start to finish, right from day one. Being enthusiastic about tech is a big part of being at PhonePe. If you like building technology that impacts millions, ideating with some of the best minds in the country, and executing on your dreams with purpose and speed, join us!
About the Role
As an SRE (Big Data) Engineer with 5 to 7 years of experience, you will be responsible for ensuring the stability, scalability, and performance of PhonePe's distributed systems operating at scale. You will collaborate with development, infrastructure, and data teams to automate operations, reduce manual effort, handle incidents, and continuously improve system reliability. This role requires strong problem-solving skills, operational ownership, and a proactive approach to mentoring and driving engineering excellence.
Roles and Responsibilities
- Ensure the ongoing stability, scalability, and performance of PhonePe’s Hadoop ecosystem and associated services.
- Manage and administer Hadoop infrastructure including HDFS, HBase, Hive, Pig, Airflow, YARN, Ranger, Kafka, Pinot, and Druid.
- Automate business-as-usual (BAU) operations through scripting and tool development.
- Perform capacity planning, system tuning, and performance optimization.
- Set up, configure, and manage Nginx in high-traffic environments.
- Administer and troubleshoot Linux and Big Data systems, including networking (IP, iptables, IPsec).
- Handle on-call responsibilities, investigate incidents, perform root cause analysis, and implement mitigation strategies.
- Collaborate with infrastructure, network, database, and BI teams to ensure data availability and quality.
- Apply system updates, patches, and manage version upgrades in coordination with security teams.
- Build tools and services to improve observability, debuggability, and supportability.
- Participate in Kerberos and LDAP administration.
- Carry out capacity planning and performance tuning specifically for Hadoop clusters.
- Work with configuration management and deployment tools like Puppet, Chef, Salt, or Ansible.
Skills Required
- Minimum 1 year of Linux/Unix system administration experience.
- Over 4 years of hands-on experience in Hadoop administration.
- Minimum 1 year of experience managing infrastructure on public cloud platforms like AWS, Azure, or GCP (optional).
- Strong understanding of networking, open-source tools, and IT operations.
- Proficient in scripting and programming (Perl, Golang, or Python).
- Hands-on experience maintaining and managing Hadoop ecosystem components such as HDFS, YARN, HBase, and Kafka.
- Strong operational knowledge of systems (CPU, memory, storage, OS-level troubleshooting).
- Experience in administering and tuning relational and NoSQL databases.
- Experience in configuring and managing Nginx in production environments.
- Excellent communication and collaboration skills.
Good to Have
- Experience designing and maintaining Airflow DAGs to automate scalable and efficient workflows.
- Experience in ELK stack administration.
- Familiarity with monitoring tools like Grafana, Loki, Prometheus, and OpenTSDB.
- Exposure to security protocols and tools (Kerberos, LDAP).
- Familiarity with distributed systems like Elasticsearch or similar high-scale environments.
PhonePe Full Time Employee Benefits (Not applicable for Intern or Contract Roles)
- Insurance Benefits - Medical Insurance, Critical Illness Insurance, Accidental Insurance, Life Insurance
- Wellness Program - Employee Assistance Program, Onsite Medical Center, Emergency Support System
- Parental Support - Maternity Benefit, Paternity Benefit Program, Adoption Assistance Program, Day-care Support Program
- Mobility Benefits - Relocation benefits, Transfer Support Policy, Travel Policy
- Retirement Benefits - Employee PF Contribution, Flexible PF Contribution, Gratuity, NPS, Leave Encashment
- Other Benefits - Higher Education Assistance, Car Lease, Salary Advance Policy
Working at PhonePe is a rewarding experience! Great people, a work environment that thrives on creativity, and the opportunity to take on roles beyond a defined job description are just some of the reasons you should work with us. Read more about PhonePe on our blog.