Lead Software Engineer
JPMorganChase
We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible.
As a Lead Software Engineer at JPMorganChase within the Consumer and Community Banking technology team, you will be responsible for the stability, reliability, and operational excellence of ATM Product applications.
Job responsibilities
-
Manage a team of software engineers and L3/RTE support engineers; set goals, coach, conduct performance reviews, and hire/onboard talent.
Establish a 24x7 follow‑the‑sun or regional on‑call model; define clear escalation paths, rotations, and coverage for peak/maintenance windows.
Foster a culture of operational excellence, accountability, and blameless learning.
Provide technical direction and prioritization; balance product delivery with platform health, resiliency, and technical debt reduction.
-
Own day‑to‑day operations for assigned ATM services: incident response, service restoration, and business communications.
Define and manage SLOs/SLIs, error budgets, and capacity plans; track availability and latency KPIs and drive corrective actions.
Maintain runbooks, playbooks, and CMDB/service catalogs; ensure they remain accurate and auditable.
-
Lead major incident bridges; coordinate triage with SRE, infra, networks, vendors, and downstream partners; ensure clear comms and rapid mitigation.
Run post‑incident reviews; manage problem records through root‑cause analysis and verified long‑term fixes.
Govern change and release processes (CAB, blackout windows, rollback plans, deployment checklists); enforce pre‑prod quality gates and change controls.
-
Ensure comprehensive monitoring, alerting, and tracing across services and device integrations; tune alerts to reduce noise and improve MTTD/MTTR.
Drive automation for common operational tasks (e.g., failovers, restarts, config/secret rotations, log analysis, data fixes) and self‑healing where feasible.
Track toil and operational debt; prioritize tech‑debt and resiliency backlogs with product/architecture for sustainable improvements.
-
Enforce secure SDLC controls in production (secrets management, certificate/TLS/mTLS hygiene, key rotations, vulnerability remediation).
Partner with cybersecurity, risk, and audit; maintain evidence for control testing, SOX/SOC obligations, and regulatory audits.
Oversee disaster recovery, backup/restore procedures, and resilience testing (failover/failback, chaos experiments) with documented RTO/RPO.
-
Coordinate with ATM hardware vendors and SDK providers for patching, firmware deployments, compatibility validation, and fleet‑wide rollout plans.
Manage change windows with branches/operations; monitor device telemetry and address systemic issues proactively.
Required qualifications, capabilities, and skills
- Formal training or certification in software engineering concepts and 5+ years of applied experience.
- Experience in software engineering/production operations for distributed systems, including 2+ years managing engineers or leading run/production support teams.
- Strong background operating Java‑based backend services; proficiency with REST/gRPC, messaging/streaming (Kafka, MQ), and relational/NoSQL databases (Oracle, PostgreSQL, DynamoDB/Cassandra).
- Demonstrated expertise in incident, problem, and change management at scale; experience running bridges and communicating with senior stakeholders.
- Hands‑on experience with observability stacks (metrics, logs, traces), alerting, and dashboarding; proven track record of improving MTTD/MTTR and service availability.
- Practical experience with CI/CD, infrastructure‑as‑code, containerization/orchestration (Docker, Kubernetes), and release governance.
- Deep understanding of secure operations: vulnerability management, secrets/cert management, least‑privilege access, and audit readiness.
- Excellent people leadership, communication, and stakeholder management skills; adept at balancing stability, resiliency, and delivery priorities.
Preferred qualifications, capabilities, and skills
- Experience with ATM/channel operations, device control and telemetry, and vendor SDK/firmware rollout management.
- Payments and card standards knowledge (ISO 8583/20022); experience with HSMs, PIN translation, and cryptographic key management.
- Experience with device/edge fleet management, over‑the‑air updates, and remote monitoring/telemetry.
- Prior experience running hybrid/global teams and follow‑the‑sun support models; familiarity with chaos/resiliency testing.