Mistral AI is hiring an expert in serving and training large language models at high speed on GPUs.
The role will involve:
- writing low-level code that takes full advantage of high-end GPUs (H100) and maxes out their capacity
- rethinking various parts of the generative model architecture to make them more suitable for efficient inference
- integrating low-level, efficient code into a high-level MLOps framework
To apply for this job, please visit www.linkedin.com.