Senior Manager Software Engineering - Production Support - 2281060
UnitedHealth Group
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health equity on a global scale. Join us to start Caring. Connecting. Growing together.
We are seeking a highly motivated and experienced Service Level Owner (SLO) to manage the performance, reliability, and overall health of critical software applications. As an SLO, you will be responsible for defining, measuring, and improving the service levels of our applications, ensuring they meet the needs of our business and users. You will be a key point of contact for incident resolution, problem management, and change management, working closely with development, operations, and business stakeholders to deliver exceptional service.
Primary Responsibilities:
- Service Level Management:
- Define, monitor, and report on SLOs, SLAs, and KPIs for assigned applications
- Proactively identify and address potential service level breaches
- Develop and implement strategies to improve application performance, reliability, and availability
- Incident & Problem Management:
- Lead and participate in incident management activities, including troubleshooting, root cause analysis, and resolution
- Drive problem management efforts to identify and address underlying issues that lead to incidents
- Coordinate with development, operations, and other teams to ensure timely and effective resolution of incidents and problems
- Change Management:
- Act as a gatekeeper for production changes, reviewing and approving changes to ensure they meet quality standards and do not negatively impact service levels
- Work with change management teams to ensure that changes are properly planned, tested, and implemented
- Technical Leadership:
- Provide technical guidance and support to development and operations teams
- Participate in architecture reviews to ensure that new and existing applications are designed for performance, scalability, and reliability
- Contribute to the development of technical solutions related to security, cloud migration, and other strategic initiatives
- Communication & Reporting:
- Prepare and present regular status reports to stakeholders, including weekly/monthly performance reviews and project updates
- Communicate effectively with business partners to understand their needs and priorities
- Maintain clear and concise documentation of application architecture, dependencies, and service level agreements
- Continuous Improvement:
- Identify opportunities to improve processes, tools, and technologies to enhance service delivery
- Stay up to date on industry trends and best practices in service level management
- Promote a culture of continuous learning and improvement within the team
- Compliance:
- Adhere to company policies and procedures, including security and compliance requirements
- Ensure that applications meet all relevant regulatory requirements
- Project Management:
- Build project standards and processes and implement them across the teams
- SSMO Delivery:
- Implement and enforce defined SSMO delivery processes/guidelines, including Problem Management, Incident Management, Change Management, SLA Compliance, and productivity goals
- Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies regarding flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so
Required Qualifications:
- B.E./MCA/B.Tech/M.Tech/MS degree (minimum 16 years of formal education; correspondence courses are not relevant)
- 12+ years of experience supporting production applications in a complex, distributed environment
- Technical Skills:
- Solid understanding of architecture principles, infrastructure, and dependencies
- Expertise in application log analysis and troubleshooting using tools like Splunk
- Proficient in creating and maintaining Splunk dashboards for monitoring and alerting
- Proven experience supporting cloud-based applications (AWS preferred) and streaming applications
- Proficiency in using monitoring and observability tools like Dynatrace, Splunk APM, or similar technologies to identify performance bottlenecks and proactively address issues
- Familiarity with CI/CD pipelines and tools like Jenkins and GitHub
- Knowledge of database technologies (e.g., SQL, NoSQL) and their impact on application performance
- Process & Methodology:
- Deep understanding of Service Level Objectives (SLOs), Service Level Agreements (SLAs), and Key Performance Indicators (KPIs)
- Extensive experience in incident management, problem management, change management, and root cause analysis
- Proven ability to manage and lead war rooms for high-priority incidents
- Experience implementing and adhering to ITIL or similar service management frameworks
- Soft Skills:
- Excellent analytical and problem-solving skills
- Outstanding communication (written and verbal) and presentation skills, including the ability to communicate technical concepts effectively to both technical and non-technical audiences
- Solid stakeholder management skills, with the ability to build and maintain relationships with business partners, development teams, and operations teams
- Self-motivated, proactive, and able to work independently and as part of a team
- Solid work ethic, positive attitude, and a commitment to continuous improvement
- Ability to adapt to a fast-paced and dynamic environment
- Other:
- Ability to work evening shifts (11:30 AM – 8:30 PM or 2:30 PM – 12:00 AM)
- Experience in system change verification ensuring alignment with business requirements
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone – of every race, gender, sexuality, age, location and income – deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes – an enterprise priority reflected in our mission.