Site Reliability Engineer II

JPMorganChase

Software Engineering, Other Engineering
Bengaluru, Karnataka, India
Posted on Nov 12, 2025

Play a key role in ensuring system reliability at one of the world’s largest and most iconic financial institutions.

As a Site Reliability Engineer II at JPMorgan Chase within the Corporate Technology, Finance Last Mile Reporting team, you will use technology to solve business problems and apply software engineering best practices as we strive towards excellence. In this role you will often work independently on small to medium projects, but you’ll also have the opportunity to collaborate with cross-functional teams to deepen your knowledge of JPMorgan Chase’s business and relevant technologies.

Job responsibilities

  • Design, implement, and maintain scalable, highly available, resilient systems for the banking domain
  • Establish and improve SRE best practices, including monitoring, alerting, incident response, and automation
  • Collaborate with development and business operations teams to enhance system reliability, performance, and scalability
  • Implement DevOps practices such as CI/CD pipelines, infrastructure as code, and automated deployments
  • Monitor and improve system observability using tools such as Prometheus, Grafana, the ELK stack, Dynatrace, and Control-M
  • Analyze system failures and conduct root cause analysis to prevent future incidents
  • Optimize system performance and ensure compliance with banking security and regulatory standards
  • Lead incident management and troubleshooting efforts, ensuring minimal service disruption
  • Leverage technology to solve business problems by writing high-quality, maintainable, and robust code that follows software engineering best practices
  • Recognize the toil within your role and proactively work to eliminate it through systems engineering or updates to application code
  • Understand observability patterns and strive to implement and improve service level indicators, service level objectives, monitoring, and alerting solutions for optimal transparency and analysis
  • Implement and refine error budgets and SLIs/SLOs/SLAs to improve reliability

Required qualifications, capabilities, and skills

  • Formal training or certification on site reliability engineering concepts and 2+ years applied experience
  • Ability to code in at least one programming language, such as Python or Java, and an understanding of SQL and databases such as Oracle and MySQL
  • Experience maintaining and working on cloud-based infrastructure
  • Strong experience with site reliability concepts, principles, and practices
  • Exposure to observability practices such as white-box and black-box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others
  • Good knowledge of containers or a common server OS such as Linux or Windows
  • Strong knowledge of continuous integration and continuous delivery tools such as Jenkins, GitLab, Terraform, and Ansible, and of common networking technologies
  • Ability to work in a large, collaborative team and willingness to vocalize ideas with peers and managers
  • Experience with microservice architecture and container orchestration tools such as Kubernetes and Docker Swarm
  • Experience with Databricks data engineering, data warehousing concepts, and ETL processes (job runs, data ingestion, Delta Live Tables, Spark Streaming)
  • Ability to demonstrate and apply existing and new system processes, methodologies, and skills to contribute to the development of systems

Preferred qualifications, capabilities, and skills

  • General knowledge of the financial services industry
  • Strong problem-solving and analytical skills
  • Ability to work in high-pressure environments and manage incidents effectively
  • Passion for continuous improvement and automation

Work in technology while increasing your knowledge and skillsets at one of the world's leading organizations