Are you passionate about pushing the boundaries of Generative AI and multimodal learning?
We’re looking for a Senior Multimodal AI Engineer to lead the development of cutting-edge models that integrate text, vision, and audio. These models need to deliver high-performance inference across diverse deployment environments, from edge devices to server-side systems.
What You’ll Be Doing:
Designing and optimising multimodal AI models for real-time, scalable inference.
Collaborating with cross-functional teams to integrate models into advanced hardware platforms.
Applying the latest techniques in transformers, diffusion models, and model compression.
Driving innovation in Generative AI and contributing to the evolution of our AI stack.
Ensuring robust deployment and continuous improvement of models in production.
What You Bring:
5+ years of experience in multimodal model development (text, image, audio).
Strong skills in deep learning frameworks like PyTorch, TensorFlow, or JAX.
Expertise in NLP, computer vision, and speech processing.
Familiarity with model optimisation techniques (quantisation, pruning, distillation).
Experience with distributed systems, HPC, or in-memory computing platforms.
Bonus: Experience deploying models on edge devices or using frameworks like ONNX or TensorRT.
Location: This role is based in Italy. Relocation support is available to Bologna, Florence, or Milan, or you can work fully remotely from anywhere within Italy.
This is a unique opportunity to work at the intersection of AI research and high-performance computing, contributing to real-world applications in a fast-paced, innovation-driven environment.