Assistant Vice President, Production Services Specialist (Event and Incident Management), Application Production Services and Engineering

Bank of America

Administration
Singapore · United States · Remote
Posted on Jul 26, 2025

Job Description:

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.

Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being an inclusive workplace, attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and how we make an impact in the communities we serve.

Bank of America is committed to an in-office culture with specific requirements for office-based attendance, while allowing an appropriate level of flexibility for our teammates and businesses based on role-specific considerations.

At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!

APSRE Event and Incident Management (EIM) provides 24x7, end-to-end, detailed monitoring and support of all infrastructure and application components critical to the Bank of America Wealth Management, Credit Cards, Deposit and Loans TI. The team handles real-time technical (application/infrastructure) escalation, triage and restoration of issues impacting critical business services to customers and partners in a timely manner, while keeping partners advised of significant progress or challenges during the restoration period. This includes monitoring the network, databases, middleware, backup, storage, application-specific and operating-system-specific components, with the goal of proactively identifying and resolving performance issues before they interrupt service to the customer.

Responsibilities:

  • Ensuring standard call facilitation and call guidance for all incidents reported.
  • Maintaining accurate status and progress of incident recovery efforts.
  • Host the incident bridge when required and engage Level 2 and Level 3 teams as needed for problem resolution.
  • Ability to craft incident statements, quickly gain a deep understanding of the issue in flight, and provide an accurate high-level recap within 15 minutes on an active incident call.
  • Ensuring incident tickets are handled promptly and efficiently.
  • Ability to lead and take charge when running triage and provide high-quality updates on the Incident Bridge board (Virtual on Watch).
  • Monitoring Triage Orchestration to ensure that incidents are being managed appropriately and are making adequate progress given their impact and urgency, and alerting executive stakeholders of possible escalations from high-impact incidents.
  • Ability to coordinate with different domains and successfully manage multiple moderately to highly detailed tasks concurrently.
  • Assumes authoritative voice for communicating status of major and significant incidents.
  • Gather comprehensive business impact information from a wide variety of sources and contacts, with speed and accuracy.
  • Use of monitoring tools to proactively identify and research potential production incidents.
  • Perform Windows Server administration tasks.
  • Responding to alerts regarding potential production incidents.
  • Triaging and escalating to support partners as needed for problem resolution.
  • Perform trending and analysis using monitoring tools and reports to proactively identify and address potential issues prior to production impact.
  • Perform environment/data center traffic routing.
  • Partner with Application Support teams to coordinate support for scheduled changes requiring route-away.
  • Identify opportunities for additional monitoring and automation and partner with Monitoring Architecture and Engineering to implement.
  • Develop procedures for troubleshooting and possible resolution of issues.
  • Execute procedures reliably and escalate appropriately to solve incidents quickly.

Required Skills:

• Proven team player who can work comfortably in a multicultural environment.

• Proven ability to work independently, multitask and work effectively in a complex environment with a global team structure.

• Excellent verbal and written communication skills; strong influencer, facilitator, and collaborator.

• Must be proactive, enthusiastic, flexible and results-driven, with attention to detail.

• Knowledge of Splunk, Dynatrace, SiteScope, Tivoli Netcool/Web GUI.

• Experience with Java Virtual Machine, Unix/Linux OS, Windows Server.

• Experience in a large IT production support environment.

• Excellent understanding of, and exposure to, ITIL/ITSM.

• Ability to work in non-contiguous shifts including the potential for weekend days.