Site Reliability Engineer - VP - Pune

Citi

Software Engineering
Pune, Maharashtra, India
Posted on May 3, 2025

The SRE Observability Specialist is a hands-on expert responsible for delivering the future of observability across Services Technology. The role is part of a central SRE enablement team within Services Production and works closely with SREs, developers, and platform teams to embed telemetry, implement SLOs, and build meaningful visualizations for key production flows, particularly in the critical Payments business.

The ideal candidate will have deep technical knowledge, a collaborative mindset, and the ability to translate strategy into scalable engineering outcomes. You will also act as a bridge between Services Technology teams and central infrastructure/CTO teams, prioritising observability needs from line-of-business teams and driving improvements. A strong understanding of observability tooling, evolving AI/ML capabilities, and enterprise tooling ecosystems will be essential.

Key Responsibilities:

  • Deliver against the observability roadmap for Services Technology by building scalable, reusable telemetry solutions.

  • Create and maintain dashboards and visualizations for critical client journeys, including real-time flows across Payments.

  • Guide line-of-business teams in implementing SLIs/SLOs, golden signals, and effective alerting to support operational excellence.

  • Support integration and adoption of observability tooling across on-prem, public cloud (AWS/GCP), and containerized environments (ECS, Kubernetes).

  • Customize shared dashboards and observability components in partnership with CTI and other central Engineering functions, ensuring usability and flexibility.

  • Provide technical support and implementation guidance to SREs and developers facing integration or tooling challenges.

  • Effectively manage the observability book of work for Services Technology and drive initiatives to reduce MTTD and improve recovery outcomes.

  • Serve as a key connection point between line-of-business SREs and central infrastructure functions by gathering tooling feedback, surfacing systemic issues, and influencing platform enhancements via the Services Observability Forum.

  • Stay current with observability trends, including AI/ML-driven insights, anomaly detection, and emerging OSS practices, and assess their applicability.

  • Maintain strong knowledge of observability platform features and vendor offerings to advise teams and maximize the value of tooling investments.

Qualifications:

  • 10+ years of experience in SRE, Observability Engineering, or platform infrastructure roles focused on operational telemetry.

  • Hands-on experience in observability tools and stacks such as Grafana, Prometheus, OpenTelemetry, ELK, Splunk, and similar platforms.

  • Deep understanding of SLIs, SLOs, Error Budgets, and telemetry best practices in high-availability environments.

  • Proven ability to troubleshoot integration issues and support observability across hybrid platforms (on-prem, cloud, containers).

  • Experience building dashboards aligned to business outcomes and incident workflows, especially in critical flows like payments.

  • Familiarity with modern observability tooling ecosystems, including AI/ML capabilities, trace correlation, baselining, and alert tuning.

  • Strong interpersonal and collaboration skills — able to operate across federated engineering teams and central infrastructure groups.

  • Experience in enablement or platform teams with a track record of scaling best practices across diverse business units.

Education:

  • Bachelor’s degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.

------------------------------------------------------

Job Family Group: Technology

Job Family: Applications Support

Time Type: Full time

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View Citi’s EEO Policy Statement and the Know Your Rights poster.