Lead Infrastructure Engineer - SRE/DevOps, Python, Cloud & BI Platforms
JPMorganChase
Software Engineering, Other Engineering
Bengaluru, Karnataka, India · Hyderabad, Telangana, India
Join a dynamic team at the forefront of technology, where your skills drive innovation and modernize mission-critical systems. Grow your career by solving impactful challenges and advancing automation.
As a Lead Infrastructure Engineer at JPMorgan Chase within the Chief Technology Office team, you will design, build, and maintain automation frameworks and infrastructure that improve platform hygiene, uptime, and reliability. You will use your expertise to reduce manual intervention, minimize privileged access, and accelerate incident response. You will collaborate with your team to optimize applications and infrastructure, share knowledge, and contribute to a culture of continuous improvement.
Job responsibilities
- Design, develop, and maintain automation scripts, pipelines, and microservices to support platform operations, deployments, failovers, and incident remediation.
- Build self-healing and auto-remediation workflows to reduce manual toil and improve system uptime.
- Develop and maintain Infrastructure as Code (IaC) using Terraform, Ansible, or equivalent tools.
- Automate routine operational tasks such as certificate rotation, password resets, service restarts, health checks, and content migrations.
- Create and maintain CI/CD pipelines using tools like Jenkins, Spinnaker, Jules, or equivalent.
- Monitor, troubleshoot, and optimize platform performance using observability tools (Dynatrace, Splunk, Grafana, OPS Hub).
- Define and track SLIs, SLOs, and error budgets; drive continuous improvement in platform reliability.
- Participate in on-call rotations and incident response; conduct blameless post-incident reviews and implement preventive measures.
- Develop and maintain runbooks, playbooks, and disaster recovery procedures.
- Support SR/DR testing, failover automation, and resiliency validation.
- Leverage AI/ML capabilities (e.g., GitHub Copilot, LLM Suite, predictive analytics) to enhance automation development, code quality, and operational efficiency.
Required qualifications, capabilities and skills
- Formal training or certification in software engineering concepts and 5+ years of applied experience
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience)
- 6+ years of experience in SRE, DevOps, or platform automation roles
- Strong proficiency in scripting and programming languages such as Python, Bash, PowerShell, or Go
- Hands-on experience with CI/CD tools (Jenkins, Spinnaker, GitLab CI, or equivalent)
- Experience with Infrastructure as Code (Terraform, Ansible, CloudFormation)
- Strong experience with monitoring and observability tools (Dynatrace, Splunk, Grafana, Datadog)
- Experience with cloud platforms (AWS, Azure, or GCP)
- Hands-on experience supporting the infrastructure and site reliability of BI platforms such as SAP BusinessObjects, ThoughtSpot, Tableau, Qlik Sense, or IBM Cognos
- Experience with AI/ML tools and frameworks (GitHub Copilot, OpenAI APIs, TensorFlow, or equivalent)
- Hands-on experience with ticketing tools such as ServiceNow and JIRA
Preferred qualifications, capabilities and skills
- Experience with Snowflake, Databricks, or other cloud data platforms
- Knowledge of SDLC processes, Agile/Scrum methodologies, and change management
- Familiarity with secrets management, least-privilege access, and security best practices
- Experience with Autosys, Control-M, or similar job scheduling tools
Design, build, and automate infrastructure to enhance platform reliability and accelerate incident response.