Founding ML Engineer (CUDA, ROCm, C++)
A rapidly scaling startup, recently emerged from stealth mode and backed by a top-tier venture capital fund, is on a mission to democratize AI across any hardware platform.
Our client’s R&D team is building a highly efficient engine for deploying generative AI models. This entails a wide array of tasks, ranging from fine-tuning GPU kernels to optimizing system performance. The Founding ML Engineer will play a pivotal role in driving significant enhancements in GPU performance while spearheading innovative AI and machine learning initiatives.
To tackle this mission, they are seeking an expert-level engineer in kernel, compiler, or runtime optimization, with a robust background in CUDA, ROCm, or Triton kernel optimization.
This role presents an exceptional opportunity to shape the technical direction of the company and contribute to groundbreaking advancements in AI technology.