Find Your Dream Job Today

Our mission is to help high-achieving LGBTQ+ undergraduates reach their full potential.

Senior Telemetry Data Engineer

Microsoft

Data Science
Posted on Dec 18, 2024

Multiple Locations, Netherlands

Date posted: Dec 18, 2024
Job number: 1792973
Work site: Up to 100% work from home
Travel: 0-25%
Role type: Individual Contributor
Profession: Software Engineering
Discipline: Data Engineering
Employment type: Full-Time

Overview

Microsoft is on a mission to empower every person and every organization on the planet to achieve more. Our culture is centered on embracing a growth mindset, a theme of inspiring excellence, and encouraging teams and leaders to bring their best each day. In doing so, we create life-changing innovations that impact billions of lives around the world. You can help us achieve our mission.

Cloud Operations + Innovation (CO+I) is the engine that powers Microsoft’s core cloud platforms and services that millions of people use every day. With more than 95% of Fortune 500 businesses on Azure, 180 million people using Office 365, and millions using other services, all running on Microsoft’s cloud infrastructure, CO+I builds and operates the foundation upon which Microsoft’s mission to empower every person and organization comes to life.

Are you passionate about cloud computing? Do you get excited about taking a hands-on approach to transforming Microsoft’s most critical business through investigation, data analysis, and automation? If so, come and help us build the most reliable and efficient datacenter infrastructure on the planet. The CO+I Critical Environment Systems Intelligence (CESI) team is responsible for designing and delivering solutions to support global datacenter operations and to improve availability. CESI is helping to drive CO+I’s transition to a customer-centric, data-driven, observability-based, live-service culture. As a Data Engineer, you will be a key player in this transition.

As a Senior Data Engineer on the CO+I Critical Environment Systems Intelligence (CESI) team, you will partner and collaborate on the design and delivery of automated solutions to monitor, detect, and alert on data center critical environment mechanical and electrical resources. You will collaborate with other CO+I teams, contributing to and benefiting from their work, to ensure that we are constantly improving across the fleet. You will work with massive amounts of data with low-latency requirements across cutting-edge technologies, with the potential for significant impact on both internal partners and external customers.

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Qualifications

  • Subject matter expertise in supervised and unsupervised machine learning models for anomaly detection.
  • Demonstrated subject matter expertise in utilizing data lakes within Lakehouse architectures to process, aggregate, and manage real-time data streams from cloud-based services.
  • Demonstrated subject matter expertise in managing and processing large-scale data formats, with a focus on real-time serialization and deserialization to ensure low latency during data handling. This includes advanced proficiency in Kusto Query Language (KQL) and proficiency in coding with Python, GoLang, or Spark.
  • Experience with generative AI or Copilots for troubleshooting data center environments.
  • Expertise in processing data frames from networking layers and protocols, including BGP, TCP/IP, and GPRS Tunneling Protocol.
  • Proven experience in building applications using artificial intelligence (AI) techniques, including machine learning (ML) and data science, to enhance and automate various IT operations (AIOps).
  • Bachelor’s or master’s degree in computer science, data engineering, or a related field.
  • Excellent problem-solving skills and attention to detail.
  • Ability to work collaboratively with cross-functional teams.
  • Strong written and verbal communication skills.

Preferred Qualifications:

  • In-depth experience in designing and implementing telemetry systems for data center networks.
  • Familiarity with HVAC, CRAC, AHU, Chillers, and other critical environment equipment.
  • Knowledge of incident management and data center operations.
  • Certifiable knowledge in cloud computing.

About Us: We are committed to maintaining the highest standards of operational excellence in our data centers. Join us in our mission to enhance our telemetry capabilities and ensure the reliability and efficiency of our critical environments.

Background Check Requirements:

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

#COICareers

Responsibilities

As a Senior Data Engineer in DC Critical Environments, you will:

  • Drive a program that covers end-to-end monitoring and processing of critical environment (CE) infrastructure telemetry for all leased sites, to bring those sites on par with owned datacenter sites.
  • Design and implement telemetry data ingestion and data processing systems for leased sites.
  • Prototype, pilot, and deploy multi-signal anomaly detection and prevention systems leveraging machine learning and statistical analysis for DC leased sites.
  • Define and drive an operationalization plan for the telemetry pipeline for leased sites.
  • Ensure interoperability of detection methods, systems, and workflows by defining conceptual, logical, and physical data models.
  • Understand the signals coming from the EPMS and BAS systems for leased sites.
  • Ensure a high percentage of coverage and mapping of leased site signals, including thermal, power, and other environmental conditions and data.
  • Define a set of reusable primitives for mapping the logical and physical topology of data center leased sites.
  • Ensure there is a high-frequency, high-volume, low-latency pipeline, capable of streaming and micro-batching, to process DC CE telemetry from leased sites.
  • Architect a staging model to ensure the onboarding of leased site CE telemetry (thermal, power, and other environmental subjects).

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
Industry-leading healthcare
Educational resources
Discounts on products and services
Savings and investments
Maternity and paternity leave
Generous time away
Giving programs
Opportunities to network and connect

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.