Founding ML Engineer (CUDA, ROCm, C++)

A rapidly scaling startup, recently emerged from stealth mode and backed by a top-tier venture capital fund, is on a mission to democratize AI across any hardware platform.

Salary: $100,000 USD + equity (highly negotiable for the right candidate)

Hybrid role: 2-3 days at an office in Warsaw/Gdansk

Full-time position

B2B contract or Contract of Employment, negotiable

Home office budget & relocation/travel costs included

Our client’s R&D team is building a highly efficient engine for deploying genAI models. This entails a wide array of tasks, ranging from fine-tuning GPU kernels to optimizing system performance. The Founding ML Engineer will play a pivotal role in driving significant enhancements in GPU performance while spearheading innovative AI and machine learning initiatives.

To tackle this mission, they are seeking an expert-level engineer in kernel, compiler, or runtime optimization, with a robust background in CUDA, ROCm, or Triton kernel optimization.

This role presents an exceptional opportunity to shape the technical direction of the company and contribute to groundbreaking advancements in AI technology.

Requirements

Deep understanding of, and experience with, GPU performance optimization.

Proven track record of kernel optimizations on CUDA, ROCm, or other accelerators.

Proficiency in programming languages such as C/C++ and Python.

Experience with the training and deployment of ML models.

Familiarity with distributed systems development or distributed ML workloads.

Bachelor's, Master's, or PhD degree in Computer Science, Electrical Engineering, or a related field.

Excellent command of English, with strong communication and collaboration skills.

Nice to have

An exceptional candidate will also have:

Familiarity with OSS projects like FlashAttention, mlc-llm, and vLLM.

Experience with machine learning compilers or frameworks such as TVM, MLIR, PyTorch, TensorFlow, ONNX Runtime, and TensorRT.

Responsibilities

Analyzing the bottlenecks in ML training and inference

Developing and optimizing computing kernels in CUDA, Triton or ROCm

Working on GPU optimizations to maximize performance

Hybrid/Warsaw

7 900 - 8 400 USD net/month - B2B

Apply via email: [email protected]

DevsData LLC needs the contact information and your personal data to be processed for the needs of a recruitment process. For information on how to withdraw your consent, as well as our privacy practices and commitment to protecting your privacy, check out our Privacy Policy.

What will be your next steps?

1. Quick non-technical conversation

It's all about communication! We want to see how your social and decision-making skills can contribute to efficient team performance.

2. 60- to 90-minute technical interview

During the technical interview, we want to assess the candidate's specific knowledge, skills, and abilities in relation to our client's needs.

3. Client interview

The problem-solving challenge is all about using logic and creativity to make sense of a situation and develop an intelligent solution.

4. Offer

You did it! After managing to get through all of these rigorous stages, it's finally time to recommend you directly to our clients.
