Platform Engineering Manager
Sia Partners
Job description
Sia Partners is looking for a Platform Engineering Manager to support the design and delivery of next-generation AI and Generative AI platforms within Sia’s AI Factory. This role is pivotal in bridging high-level product vision with robust, cloud-native engineering execution.
As a Platform Engineering Manager, you will be responsible for building and evolving internal development and MLOps platforms that improve automation, scalability, reliability, and developer productivity. You will operate as a player-coach, combining hands-on technical leadership with people management and strategic ownership.
This is a product-focused role, working closely with product managers, data scientists, ML engineers, application engineers, and security teams to deliver platform capabilities that directly support AI, data, and software workloads.
Key Responsibilities
- Leadership of Platform / DevOps / SRE engineering teams, ensuring delivery excellence and a strong engineering culture
- Ownership of internal platform products, working closely with product, application, and data engineering teams to deliver scalable, reliable, and secure platform capabilities
- Support to Data Scientists, ML Engineers, Data Engineers, and Software Engineers by providing reliable, secure, scalable, and easy-to-use platform services
- Definition and execution of the platform engineering strategy and roadmap, aligned with business and delivery objectives
- Development and operation of internal developer platforms enabling automation, self-service, and scalability
- Cloud services: architecture and operations across AWS, Azure, and GCP, including compute, storage, networking, access management, cost monitoring, and cost optimization
- Infrastructure as Code: design, standardization, and governance using Terraform
- Containers: containerization and orchestration of applications using Docker and Kubernetes, including Kubernetes platform ownership
- CI/CD: definition and standardization of continuous integration and deployment pipelines
- Observability & reliability: monitoring, logging, alerting, and application of SRE principles to ensure availability, performance, and resilience
- Contribution to technological, architectural, and governance decisions to address the challenges of scaling AI and data platforms
- Collaboration with product, application, data, and security teams to gather requirements and deliver platform capabilities
- Planning and management of platform initiatives, including timelines, resourcing, and budget oversight
- Mentoring of engineers and fostering of knowledge sharing and continuous improvement