Asset Management - AI Systems Engineer - Associate/VP
JPMorganChase
Software Engineering, Data Science
Shanghai, China
Posted on May 27, 2026
Key Responsibilities
- Inference Platform & Optimization: Build and optimize enterprise LLM serving platforms (e.g., vLLM, TensorRT-LLM) using techniques like PagedAttention, continuous batching, and quantization (AWQ/FP8) for high throughput and low latency.
- GPU Pooling & AI Infra: Design GPU pooling, virtualization, and scheduling solutions on Kubernetes to maximize hardware utilization. Manage distributed training clusters and high-performance networking (RDMA/NCCL).
- Model Deployment & MLOps: Streamline the CI/CD pipeline for AI models. Implement automated benchmarking, zero-downtime deployment, and comprehensive observability (time to first token (TTFT), tokens per second (TPS), GPU metrics).
Qualifications
1. Education & Experience:
- Bachelor’s, Master’s, or Ph.D. in Computer Science, Computer Engineering, or a related field.
- 3+ years of experience in Backend Systems, Distributed Systems, or AI Infrastructure/MLOps, with 1-2 of those years focused specifically on LLM serving, GPU optimization, or ML Systems.
2. Core Engineering & Systems Skills:
- Expert-level proficiency in Python and strong proficiency in C++ (essential for inference engines and CUDA integration).
- Deep understanding of Linux internals, networking, and distributed systems architecture.
- Hands-on experience with container orchestration (Kubernetes, Docker) and building custom K8s operators or controllers.
3. AI Infrastructure & Optimization Skills (at least one of the following):
- Deep familiarity with LLM inference engines (vLLM, TensorRT-LLM, TGI) and an understanding of their underlying architectural designs.
- Solid understanding of GPU architecture (NVIDIA Ampere/Hopper), CUDA programming, and GPU memory management.
- Experience with distributed training frameworks (DeepSpeed, Megatron-LM, Ray) and high-performance networking (RDMA, RoCE, InfiniBand).
4. Mindset & Soft Skills:
- A "hacker" mindset with a passion for squeezing every drop of performance out of hardware.
- Ability to collaborate effectively with AI Researchers (to understand their models) and Backend Engineers (to integrate AI into business systems).
Preferred
- Contributions to open-source AI Infra projects (e.g., vLLM, Ray, PyTorch).
- Experience writing custom CUDA kernels or using Triton for operator fusion.
- Financial industry (Asset Management/Quant) experience is a plus.
- Language: Professional working proficiency in English to collaborate with global teams.
J.P. Morgan is a global leader in financial services, providing strategic advice and products to the world’s most prominent corporations, governments, wealthy individuals and institutional investors. Our "first-class business in a first-class way" approach to serving clients drives everything we do. We strive to build trusted, long-term partnerships to help our clients achieve their business objectives.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.
J.P. Morgan Asset & Wealth Management delivers industry-leading investment management and private banking solutions. Asset Management provides individuals, advisors and institutions with strategies and expertise that span the full spectrum of asset classes through our global network of investment professionals. Wealth Management helps individuals, families and foundations take a more intentional approach to their wealth or finances to better define, focus and realize their goals.
As an AI Systems Engineer, you will be the backbone of our AI initiatives. While our AI Researchers focus on model intelligence, your mission is to make our AI systems fast, scalable, cost-efficient, and highly reliable. You will design and build the underlying AI infrastructure, including GPU resource pooling, high-performance LLM inference platforms, and distributed training frameworks. You will solve hardcore engineering challenges in model deployment, memory optimization, and distributed systems to empower our asset management business with enterprise-grade AI capabilities.